
refactor: change to TORCH_LIBRARY #823

Merged
merged 27 commits into flashinfer-ai:main on Feb 13, 2025

Conversation

@abmfy (Contributor) commented Feb 13, 2025

This PR updates FlashInfer's C++/CUDA extensions from pybind11 modules to the TORCH_LIBRARY mechanism (torch.library), which has been recommended since PyTorch 2.5.

This work was mainly implemented in #764. We have verified that the issue in #820 was not caused by this PR, so we are reopening it.
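For context, the shape of the change is roughly the following: pybind11-style bindings are replaced by dispatcher registrations, so ops become visible as `torch.ops.<namespace>.<op>` in Python. This is a minimal sketch of the pattern using the standard PyTorch macros; the op name, schema, and `rmsnorm_cuda` implementation below are illustrative placeholders, not FlashInfer's actual API (this fragment also assumes a PyTorch installation to compile against):

```cpp
#include <torch/library.h>  // requires a PyTorch installation

// Before (pybind11-style binding):
// PYBIND11_MODULE(_kernels, m) {
//   m.def("rmsnorm", &rmsnorm_cuda);
// }

// After: declare the op schema once with the dispatcher...
TORCH_LIBRARY(flashinfer, m) {
  m.def("rmsnorm(Tensor input, Tensor weight, float eps) -> Tensor");
}

// ...and register a backend-specific implementation for it.
TORCH_LIBRARY_IMPL(flashinfer, CUDA, m) {
  m.impl("rmsnorm", &rmsnorm_cuda);  // rmsnorm_cuda defined elsewhere
}
```

From Python, the op is then reachable as `torch.ops.flashinfer.rmsnorm(...)`, and it composes with `torch.compile` and CUDA graphs in ways a raw pybind11 function does not.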

youkaichao and others added 27 commits January 29, 2025 22:23
Signed-off-by: youkaichao <[email protected]>
Many files in `include/flashinfer/attention/hopper` place `using namespace cute` directly inside `namespace flashinfer`; other files then write `using namespace flashinfer`, causing names from `cute` to leak into the global namespace.

This is a temporary fix to make compilations work.

Signed-off-by: abmfy <[email protected]>
@zhyncs (Member) commented Feb 13, 2025

Please wait a moment, the CUDA graph issue has been fixed by #822.

@yzh119 yzh119 merged commit dbb1e4e into flashinfer-ai:main Feb 13, 2025
yzh119 added a commit that referenced this pull request Feb 13, 2025
Follow-up of #823: the `CutlassSegmentGEMMSM90` API does not have the member `plan_info_vec`.
zhyncs pushed a commit that referenced this pull request Feb 13, 2025
Apply #662 again, now that #823 has been merged.

---------

Signed-off-by: youkaichao <[email protected]>
yzh119 added a commit that referenced this pull request Feb 13, 2025
Follow-up of #823: we should import `from .. import flashinfer_kernels, flashinfer_kernels_sm90` instead of `from .. import _kernels, _kernels_sm90`, otherwise JIT compilation will be used for all the code.

Also add some logic to catch "undefined symbol" errors, in case the AOT wheel compiles successfully but fails to load.