Pulse · iree-org/iree · GitHub

October 10, 2024 – October 17, 2024

Overview

56 Active pull requests

33 Active issues

8 Releases published by 1 person

iree candidate candidate-20241011.1043
published Oct 11, 2024
iree candidate candidate-20241012.1044
published Oct 12, 2024
iree candidate candidate-20241013.1045
published Oct 13, 2024
iree candidate candidate-20241014.1046
published Oct 14, 2024
iree candidate candidate-20241015.1047
published Oct 15, 2024
iree candidate candidate-20241016.1048
published Oct 16, 2024
iree candidate candidate-20241016.21
published Oct 16, 2024
iree candidate candidate-20241017.1049
published Oct 17, 2024

40 Pull requests merged by 18 people

[Codegen] Drop TransformStrategies
#18820 merged Oct 18, 2024
Skip ROCM/test/opt_pass_plugin on Windows while broken.
#18823 merged Oct 17, 2024
iree_gpu Python bindings (GPUPipelineOptionsAttr)
#18804 merged Oct 17, 2024
[TileSwizzle] Make the dump and variable name match. (NFC)
#18821 merged Oct 17, 2024
Warn when --iree-llvmcpu-target-cpu defaults to "generic".
#18682 merged Oct 17, 2024
[Flow] Fix FoldSplatReshapeIntoSplat pattern
#18818 merged Oct 17, 2024
Opt into free-threaded Python
#18770 merged Oct 17, 2024
Enable arithmetic optimizations as part of the stream simplification pipeline.
#18806 merged Oct 17, 2024
Fixes a range inference overflow with util.align.
#18808 merged Oct 17, 2024
Add util.assume.int folder.
#18805 merged Oct 17, 2024
Various tweaks to numeric optimizations found while looking at programs.
#18765 merged Oct 17, 2024
Run ONNX model tests as part of pkgci_test_onnx.
#18795 merged Oct 17, 2024
Integrates LLVM @ 36d936a2d057ddbd7822614edf01e39a0c21d654
#18801 merged Oct 16, 2024
Produce releases for Python 3.13.
#18799 merged Oct 16, 2024
Bump to latest stablehlo commit, including MSVC build fix.
#18797 merged Oct 16, 2024
Add region to linalg_ext.attention
#18728 merged Oct 16, 2024
[GPU] Adding support for opt pass plugins during AMDGPU executable serialization
#18347 merged Oct 16, 2024
[DispatchCreation] Extend multi-use producer fusion
#18551 merged Oct 16, 2024
[docs] Update and harmonize guides for deployment
#18762 merged Oct 16, 2024
Integrates LLVM @ a758bcdbd92efb64a3482eb95d2769d74e33f5bb
#18783 merged Oct 16, 2024
Revert tensor.cast to flow reshape conversion
#18772 merged Oct 16, 2024
[CMake] Don't update compile definitions for imported targets for MSVC
#18766 merged Oct 15, 2024
Bump torch-mlir to 45bb17e
#18782 merged Oct 15, 2024
[Codegen] Replace LICM with a version that checks trip count
#18679 merged Oct 15, 2024
[ROCM] Fix feature flags for gfx1100 and improve flag handling
#18781 merged Oct 15, 2024
[Codegen][GPU] Add tiling cleanup pattern to fuse pad without zero guard
#18748 merged Oct 15, 2024
Integrate LLVM @ 7900daaa7ba57b5f9729bbbdb54f4e0599a45cd7
#18773 merged Oct 15, 2024
Add a default lowering config setting for custom_op.
#18737 merged Oct 14, 2024
[LLVMGPU] Add configuration tests for IGEMM, fix NCHW case
#18734 merged Oct 14, 2024
[IGEMM] Generate matmuls with expanded H and W dims
#18735 merged Oct 14, 2024
[Flow] Fold flow reshape with mismatching dyn dims
#18680 merged Oct 14, 2024
[LLVMCPU] Enable tileDispatchUsingForall for multiTilingExpert
#18730 merged Oct 14, 2024
[LinalgExt] Remove default implementation for getStaticLoopRanges
#18745 merged Oct 14, 2024
[Codegen][GPU] Add pass for fallback distribution
#18726 merged Oct 13, 2024
Bump torch-mlir to ab62f35373c3944b68e564214fd04fff39dd92fc
#18763 merged Oct 11, 2024
Use integer range and divisibility analysis to propagate int assumptions into dispatch executables.
#18755 merged Oct 11, 2024
Adding iree_vm_context_fork to fork a context.
#18751 merged Oct 11, 2024
[Codegen] Remove unused arguments from attention op
#18743 merged Oct 11, 2024
[Codegen] Remove memref optimizations from OptimizeTensorInsertExtractSlices
#18732 merged Oct 11, 2024
Enable analysis based integer optimizations.
#18756 merged Oct 11, 2024

16 Pull requests opened by 11 people

[Codegen] Remove wrong usages of OptimizeVectorTransfer
#18757 opened Oct 11, 2024
[Codegen] Add pass to convert splat constants to fills
#18758 opened Oct 11, 2024
Add end-to-end tests for `iree_linalg_ext.custom_op`.
#18764 opened Oct 11, 2024
[LLVMGPU] Combine parallel and reduction padding in LLVMGPUPadAndVectorDistribute
#18771 opened Oct 14, 2024
[LLVMCPU] Enable tileDispatchUsingForall as default
#18777 opened Oct 15, 2024
[LinalgExt] Generalize attribute setting for attention decomposition
#18780 opened Oct 15, 2024
[VectorDistribution] Add scalar support for distributing multi-dim reduction (1/4)
#18784 opened Oct 15, 2024
[hip] Added hip_device_group_device to the runtime.
#18790 opened Oct 16, 2024
[GPU][WIP] Use TileAndFuse pipeline for non-intrinsic multiple batch matmuls
#18791 opened Oct 16, 2024
[GPU] Prefer TileAndFuse pipeline over SIMT pipeline
#18793 opened Oct 16, 2024
[Codegen][LLVMGPU] Use `scf.forall` for workgroup distribution in LLVMGPU
#18796 opened Oct 16, 2024
[VectorDistribution] Add layout analysis for distributing multi-dim reduction (2/4)
#18800 opened Oct 16, 2024
Add generalize matmul pass to sdxl fp16 benchmarks
#18816 opened Oct 17, 2024
Document new external ONNX model and linalg operator test suites.
#18819 opened Oct 17, 2024
[VectorDistribution] Plumb the VectorDistribute pipeline to support reduction operations (3/4)
#18822 opened Oct 17, 2024
[ROCM] OPT Pass plugin test refactoring and organization
#18824 opened Oct 18, 2024

16 Issues closed by 11 people

error: 'memref.alloca' op expected no unbounded stack allocations
#18810 closed Oct 17, 2024
crash: verifyInvariants failed
#18809 closed Oct 17, 2024
Operand `#1` does not dominate this use
#18815 closed Oct 17, 2024
error: 'func.func' op exceeded stack allocation limit of 32768 bytes for function. Got 1204288 bytes
#18776 closed Oct 17, 2024
Hit an assertion in several onnx models (likely a bug in frontend shape inference)
#18741 closed Oct 17, 2024
Windows compiler builds broken with "number of sections exceeded object file format limit"
#18785 closed Oct 16, 2024
Does the front-end support onnx-mlir? Why not use Onnx-mlir directly when importing onnx, but use Torch-mlir to convert the onnx file into the torch dialect?
#18788 closed Oct 16, 2024
Why is the size of the exported mnist_test.bin file 0 bytes?
#18787 closed Oct 16, 2024
Matmul codegen regression compared to MLPerf branch on main.
#18786 closed Oct 15, 2024
Deprecate Attention scripts and make C++ pipeline functional
#18025 closed Oct 15, 2024
Land Horizontal contraction fusion changes into IREE main
#18009 closed Oct 15, 2024
[HIP] f8e4m3fnuz matmul compile failure for RDNA3
#18769 closed Oct 14, 2024
canonicalize dropping the `lowering_config` attached to an operation.
#18697 closed Oct 14, 2024
Support different access pattern for iree_linalg_ext.attention
#18768 closed Oct 14, 2024
Add `iree_vm_context_fork` to support forking contexts while reusing resources.
#18747 closed Oct 11, 2024
Numerical inaccuracies in multi-device sharded toy-sized Llama
#18687 closed Oct 11, 2024

17 Issues opened by 12 people

Add Apple GPU runners and run Metal tests again
#18817 opened Oct 17, 2024
Add NVIDIA GPU runners and run CUDA tests again
#18814 opened Oct 17, 2024
Run Windows build/test workflows more regularly
#18813 opened Oct 17, 2024
Does IREE support native cuda buffer VM invocation or is there any chance to improve the performance of result torch tensors construction?
#18811 opened Oct 17, 2024
Does IREE currently support large language models?
#18807 opened Oct 17, 2024
Rework iree-compile pipeline for ONNX
#18803 opened Oct 17, 2024
In-place argument updates with device affinities fails
#18802 opened Oct 16, 2024
[Codegen][AMDGPU Backend] Correctness issue for conv_2d_ngchw_gfchw
#18798 opened Oct 16, 2024
Rename `iree-hip-` compiler flags to `iree-rocm-` when they apply to codegen.
#18792 opened Oct 16, 2024
[GPU]: HSA_STATUS_ERROR_MEMORY_APERTURE_VIOLATION on GPU while inference passes on CPU
#18789 opened Oct 16, 2024
HoistIntoGlobals fails to hoist constantOp
#18779 opened Oct 15, 2024
Bufferization needs tensor.extract_slice to bufferize in place
#18778 opened Oct 15, 2024
When importing onnx model, do I convert onnx to torch dialect?
#18775 opened Oct 15, 2024
CUDA out of memory due to huge memory allocation request
#18767 opened Oct 13, 2024
Some friction when using Rye/uv for package management.
#18761 opened Oct 11, 2024
[ROCM] Evaluate whether we can attach `amdgpu-no-implicitarg-ptr` to our generated functions.
#18760 opened Oct 11, 2024
Robustify upstream LICM for zero-trip count loops and other loop kinds
#18759 opened Oct 11, 2024

20 Unresolved conversations

Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.

[VectorDistribution] reduction support along LLVMGPUVectorDistribute pipeline
#18519 commented on Oct 15, 2024 • 11 new comments
[compiler][flow] Move cast, reshape and bitcast after transfer op
#18742 commented on Oct 15, 2024 • 10 new comments
[LLVMGPU] Use forall workgroup distribution in TileAndFuse pipeline
#18565 commented on Oct 17, 2024 • 5 new comments
Add conversions for 1x1 conv_2d to matmul
#18736 commented on Oct 17, 2024 • 4 new comments
[Codegen] Replace LLVMGPUVectorize with LLVMGPUTileAndFuse for supported cases
#18474 commented on Oct 15, 2024 • 2 new comments
Build and publish Python 3.13 and 3.13t wheels
#18652 commented on Oct 15, 2024 • 0 new comments
[gpu] 'func.func' op uses 401920 bytes of shared memory; exceeded the limit of 65536 bytes
#18603 commented on Oct 15, 2024 • 0 new comments
'util.initializer' op failed to inline into combined initializer
#18386 commented on Oct 15, 2024 • 0 new comments
Migrate jobs off current GCP GHA runner cluster
#18238 commented on Oct 17, 2024 • 0 new comments
Release tracker - 2024/09
#18432 commented on Oct 18, 2024 • 0 new comments
stablehlo.sort not working correctly on Metal
#18698 commented on Oct 15, 2024 • 0 new comments
[Compiler] Rename compiler ROCM target to HIP
#18477 commented on Oct 17, 2024 • 0 new comments
onnx.Pad with mode 'reflect' fails to compile
#18695 commented on Oct 15, 2024 • 0 new comments
[codegen] [gpu]: SD3 MMDiT attention dispatch fails on LinalgExtToLoops for amdgpu targets
#18629 commented on Oct 14, 2024 • 0 new comments
[compiler] strip execution context affinities in const eval
#18663 commented on Oct 16, 2024 • 0 new comments
Support i1 datatype
#18713 commented on Oct 15, 2024 • 0 new comments
[GPU] Support multiple contraction dims in MmaSchedules
#18720 commented on Oct 17, 2024 • 0 new comments
[DT] Plans for the buffer allocation in data-tiling
#17924 commented on Oct 11, 2024 • 0 new comments
DNS - check bazel deps
#18738 commented on Oct 12, 2024 • 0 new comments
[GlobalOptimization] 1x1 filter convolutions not converted to matmul
#18710 commented on Oct 11, 2024 • 0 new comments