[AArch64][SVE] MLIR's @setArmVLBits causing a seg-fault under QEMU #143670

Open
@banach-space

Description

NOTE
I am still in the process of triaging this one. I suspect that this is an issue in either MLIR or QEMU, but I need more time to confirm.

INPUT IR:

// repro.mlir

func.func private @getFlatMemRef_i32() -> memref<?xi32> {
  %c0 = arith.constant 0 : index
  %c16 = arith.constant 16 : index
  %multiplier = vector.vscale
  // UNCOMMENT THIS FOR VLS VARIANT
  // %multiplier = arith.constant 1 : index

  %vscale_times_16 = arith.muli %multiplier, %c16 : index
  %flat_mem = memref.alloc(%c16) : memref<?xi32>
  %vector_i32 = llvm.intr.stepvector : vector<[16]xi32>

  vector.transfer_write %vector_i32, %flat_mem[%c0] : vector<[16]xi32>, memref<?xi32>
  return %flat_mem : memref<?xi32>
}

func.func @main() {
  %c256 = arith.constant 256 : i32
  // Commenting this out removes the segfault
  func.call @setArmVLBits(%c256) : (i32) -> ()

  %c0 = arith.constant 0 : index
  %c0_i32 = arith.constant 0 : i32

  %acc_flat = func.call @getFlatMemRef_i32() : () -> memref<?xi32>
  %flat_vec = vector.transfer_read %acc_flat[%c0], %c0_i32 {in_bounds = [true]} : memref<?xi32>, vector<[16]xi32>
  %acc = vector.shape_cast %flat_vec : vector<[16]xi32> to vector<4x[4]xi32>

  %u0 = vector.extract %acc[0] : vector<[4]xi32> from vector<4x[4]xi32>
  vector.print %u0 : vector<[4]xi32>

  // Un-commenting this removes the segfault
  // %acc_cast = memref.cast %acc_flat : memref<?xi32> to memref<*xi32>
  // call @printMemrefI32(%acc_cast) : (memref<*xi32>) -> ()

  memref.dealloc %acc_flat : memref<?xi32>

  return
}

func.func private @printMemrefI32(%ptr : memref<*xi32>)
func.func private @setArmVLBits(%bits : i32)

TO REPRODUCE:

cd <llvm-build-dir>
# Make sure repro.mlir is available in this directory!
bin/mlir-opt repro.mlir \
  --convert-vector-to-scf \
  --convert-scf-to-cf \
  --convert-vector-to-llvm='enable-arm-sve enable-arm-i8mm' \
  --expand-strided-metadata \
  --convert-to-llvm \
  --finalize-memref-to-llvm \
  --reconcile-unrealized-casts \
  -o repro.tmp && \
bin/mlir-runner repro.tmp -e main -entry-point-result=void \
  --march=aarch64 --mattr="+sve" \
  -shared-libs=lib/libmlir_runner_utils.so,lib/libmlir_c_runner_utils.so,lib/libmlir_arm_runner_utils.so

OUTPUT

( 0, 1, 2, 3, 4, 5, 6, 7 )
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.	Program arguments: bin/mlir-runner repro.tmp -e main -entry-point-result=void --march=aarch64 --mattr=+sve,+i8mm -shared-libs=lib/libmlir_runner_utils.so,lib/libmlir_c_runner_utils.so,lib/libmlir_arm_runner_utils.so
qemu: uncaught target signal 11 (Segmentation fault) - core dumped
Segmentation fault (core dumped)

DISCUSSION

There are 3 ways to remove the seg-fault:

  • Run natively rather than using QEMU.
    • This will require an SVE-enabled machine.
    • Note, I am using QEMU 9.0.0.
  • Comment out func.call @setArmVLBits and use the default vector length (VL) instead.
    • Note, this method is used in other tests that work fine.
  • Uncomment call @printMemrefI32(%acc_cast).
    • This implies that there might be some allocation/de-allocation issue.
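
For reference while triaging, the element counts implied by the requested VL can be worked out by hand. The sketch below is only illustrative arithmetic, assuming the usual SVE relationship vscale = VL / 128; the helper names (vscale, scalable_elements) are hypothetical and not part of MLIR or the runner utils.

```python
# Illustrative arithmetic only: how many elements a scalable vector type
# covers for a given SVE vector length. Helper names are hypothetical.

SVE_GRANULE_BITS = 128  # on SVE, vscale = VL (in bits) / 128


def vscale(vl_bits: int) -> int:
    """Return vscale for a vector length given in bits."""
    if vl_bits <= 0 or vl_bits % SVE_GRANULE_BITS != 0:
        raise ValueError("SVE vector length must be a positive multiple of 128 bits")
    return vl_bits // SVE_GRANULE_BITS


def scalable_elements(min_elements: int, vl_bits: int) -> int:
    """Element count of vector<[min_elements]xT> at the given vector length."""
    return min_elements * vscale(vl_bits)


if __name__ == "__main__":
    vl = 256  # the value passed to @setArmVLBits in repro.mlir
    print("vscale:", vscale(vl))
    print("vector<[4]xi32> elements:", scalable_elements(4, vl))
    print("vector<[16]xi32> elements:", scalable_elements(16, vl))
```

With VL = 256 this gives 8 elements for vector<[4]xi32>, which matches the eight values printed before the crash, and 32 elements for vector<[16]xi32>. In repro.mlir as written, memref.alloc is passed %c16 while %vscale_times_16 is otherwise unused, so comparing these counts against the allocation size may be a useful check. This is an observation only, not a confirmed cause.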

NEXT STEPS

I need to allocate more cycles to this. For now, I am creating this issue to document the workarounds that I am going to propose for #140573.
