NOTE
I am still in the process of triaging this one. I suspect that this is an issue in either MLIR or QEMU, but I need more time to confirm.
INPUT IR:
// repro.mlir
func.func private @getFlatMemRef_i32() -> memref<?xi32> {
  %c0 = arith.constant 0 : index
  %c16 = arith.constant 16 : index
  %multiplier = vector.vscale
  // UNCOMMENT THIS FOR VLS VARIANT
  // %multiplier = arith.constant 1 : index
  %vscale_times_16 = arith.muli %multiplier, %c16 : index
  %flat_mem = memref.alloc(%c16) : memref<?xi32>
  %vector_i32 = llvm.intr.stepvector : vector<[16]xi32>
  vector.transfer_write %vector_i32, %flat_mem[%c0] : vector<[16]xi32>, memref<?xi32>
  return %flat_mem : memref<?xi32>
}

func.func @main() {
  %c256 = arith.constant 256 : i32
  // Commenting this out removes the segfault
  func.call @setArmVLBits(%c256) : (i32) -> ()
  %c0 = arith.constant 0 : index
  %c0_i32 = arith.constant 0 : i32
  %acc_flat = func.call @getFlatMemRef_i32() : () -> memref<?xi32>
  %flat_vec = vector.transfer_read %acc_flat[%c0], %c0_i32 {in_bounds = [true]} : memref<?xi32>, vector<[16]xi32>
  %acc = vector.shape_cast %flat_vec : vector<[16]xi32> to vector<4x[4]xi32>
  %u0 = vector.extract %acc[0] : vector<[4]xi32> from vector<4x[4]xi32>
  vector.print %u0 : vector<[4]xi32>
  // Uncommenting this removes the segfault
  // %acc_cast = memref.cast %acc_flat : memref<?xi32> to memref<*xi32>
  // call @printMemrefI32(%acc_cast) : (memref<*xi32>) -> ()
  memref.dealloc %acc_flat : memref<?xi32>
  return
}

func.func private @printMemrefI32(%ptr : memref<*xi32>)
func.func private @setArmVLBits(%bits : i32)
TO REPRODUCE:
cd <llvm-build-dir>
# Make sure repro.mlir is available in this directory!
bin/mlir-opt repro.mlir --convert-vector-to-scf --convert-scf-to-cf --convert-vector-to-llvm='enable-arm-sve enable-arm-i8mm' --expand-strided-metadata --convert-to-llvm --finalize-memref-to-llvm --reconcile-unrealized-casts -o repro.tmp && bin/mlir-runner repro.tmp -e main -entry-point-result=void --march=aarch64 --mattr="+sve" -shared-libs=lib/libmlir_runner_utils.so,lib/libmlir_c_runner_utils.so,lib/libmlir_arm_runner_utils.so
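For repeated runs while triaging, the two-stage command above could be wrapped in a small script. This is only a sketch: the helper names (`build_commands`, `run_repro`) are hypothetical, the flags are copied verbatim from the command above, and it assumes it is started from `<llvm-build-dir>` with `repro.mlir` present.

```python
import shlex
import subprocess

# Pass pipeline copied from the mlir-opt invocation above.
MLIR_OPT_PASSES = [
    "--convert-vector-to-scf",
    "--convert-scf-to-cf",
    "--convert-vector-to-llvm=enable-arm-sve enable-arm-i8mm",
    "--expand-strided-metadata",
    "--convert-to-llvm",
    "--finalize-memref-to-llvm",
    "--reconcile-unrealized-casts",
]

# Runtime libraries copied from the mlir-runner invocation above.
SHARED_LIBS = [
    "lib/libmlir_runner_utils.so",
    "lib/libmlir_c_runner_utils.so",
    "lib/libmlir_arm_runner_utils.so",
]


def build_commands(src="repro.mlir", tmp="repro.tmp"):
    """Return (mlir-opt argv, mlir-runner argv) for the reproducer."""
    opt_cmd = ["bin/mlir-opt", src, *MLIR_OPT_PASSES, "-o", tmp]
    run_cmd = [
        "bin/mlir-runner", tmp,
        "-e", "main",
        "-entry-point-result=void",
        "--march=aarch64",
        # As written in the command above; the stack dump shows "+sve,+i8mm".
        "--mattr=+sve",
        "-shared-libs=" + ",".join(SHARED_LIBS),
    ]
    return opt_cmd, run_cmd


def run_repro(build_dir="."):
    """Lower the IR, then execute it; returns the runner's exit code."""
    opt_cmd, run_cmd = build_commands()
    subprocess.run(opt_cmd, cwd=build_dir, check=True)
    return subprocess.run(run_cmd, cwd=build_dir).returncode


if __name__ == "__main__":
    # Print the commands instead of running them, so this is safe anywhere.
    for cmd in build_commands():
        print(shlex.join(cmd))
```

Passing the arguments as a list avoids shell-quoting problems with the space inside the `--convert-vector-to-llvm` option.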
OUTPUT
( 0, 1, 2, 3, 4, 5, 6, 7 )
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0. Program arguments: bin/mlir-runner repro.tmp -e main -entry-point-result=void --march=aarch64 --mattr=+sve,+i8mm -shared-libs=lib/libmlir_runner_utils.so,lib/libmlir_c_runner_utils.so,lib/libmlir_arm_runner_utils.so
qemu: uncaught target signal 11 (Segmentation fault) - core dumped
Segmentation fault (core dumped)
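The eight values printed before the crash are consistent with the requested vector length. A quick sizing sketch, assuming the usual SVE relationship `vscale = VL / 128` (the helper names below are hypothetical and only do arithmetic on the types used in repro.mlir):

```python
SVE_GRANULE_BITS = 128  # scalable vectors are sized in multiples of 128 bits


def vscale(vl_bits: int) -> int:
    """vscale for a given SVE vector length in bits."""
    if vl_bits <= 0 or vl_bits % SVE_GRANULE_BITS != 0:
        raise ValueError("VL must be a positive multiple of 128 bits")
    return vl_bits // SVE_GRANULE_BITS


def scalable_lanes(base_lanes: int, vl_bits: int) -> int:
    """Runtime lane count of vector<[base_lanes]x...> at the given VL."""
    return base_lanes * vscale(vl_bits)


vl_bits = 256  # value passed to @setArmVLBits in the reproducer
print(vscale(vl_bits))              # 2
print(scalable_lanes(4, vl_bits))   # 8  lanes in vector<[4]xi32>
print(scalable_lanes(16, vl_bits))  # 32 lanes in vector<[16]xi32>
```

So at VL = 256 the printed `vector<[4]xi32>` row has 8 lanes, matching `( 0, 1, 2, 3, 4, 5, 6, 7 )`, and `vector<[16]xi32>` spans 32 `i32` lanes. For comparison, `memref.alloc(%c16)` in the reproducer requests 16 elements regardless of `vscale`; whether that is related to the crash is not established here.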
DISCUSSION
There are three ways to remove the segfault:
- Run natively rather than using QEMU.
  - This will require an SVE-enabled machine.
  - Note, I am using QEMU 9.0.0.
- Comment out func.call @setArmVLBits and use the default vector length, VL, instead.
  - Note, this method is used in other tests that work fine.
- Uncomment call @printMemrefI32(%acc_cast).
  - This implies that there might be some allocation/de-allocation issue.
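The second and third workarounds are plain textual edits to repro.mlir, so the variants can be generated mechanically for side-by-side runs. A rough sketch follows; `comment_out` and `uncomment` are hypothetical helpers, and the excerpt only repeats the three relevant lines of `@main`:

```python
def comment_out(source: str, needle: str) -> str:
    """Prefix each uncommented line containing `needle` with '// '."""
    out = []
    for line in source.splitlines():
        stripped = line.lstrip()
        if needle in stripped and not stripped.startswith("//"):
            indent = line[: len(line) - len(stripped)]
            line = f"{indent}// {stripped}"
        out.append(line)
    return "\n".join(out)


def uncomment(source: str, needle: str) -> str:
    """Strip the leading '//' from each commented line containing `needle`."""
    out = []
    for line in source.splitlines():
        stripped = line.lstrip()
        if needle in stripped and stripped.startswith("//"):
            indent = line[: len(line) - len(stripped)]
            line = indent + stripped[2:].lstrip()
        out.append(line)
    return "\n".join(out)


# The three lines of @main that the workarounds toggle.
EXCERPT = """\
  func.call @setArmVLBits(%c256) : (i32) -> ()
  // %acc_cast = memref.cast %acc_flat : memref<?xi32> to memref<*xi32>
  // call @printMemrefI32(%acc_cast) : (memref<*xi32>) -> ()
"""

# Variant for workaround 2: keep the default vector length.
default_vl = comment_out(EXCERPT, "func.call @setArmVLBits")

# Variant for workaround 3: print the memref before deallocating it.
with_print = uncomment(uncomment(EXCERPT, "memref.cast"), "call @printMemrefI32")

print(default_vl)
print(with_print)
```

The needle `func.call @setArmVLBits` is deliberately more specific than `@setArmVLBits`, so that the `func.func private` declaration at the bottom of the file is left untouched.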
NEXT STEPS
I need to allocate more cycles to this. For now, I am creating this issue to document the workarounds that I am going to propose for #140573.