[sp] : fix the attention kernel for sp #6061

Merged: 11 commits merged into hpcaitech:main on Sep 14, 2024
Conversation

wangbluo
Contributor

📝 What does this PR do?

For cases where s_q * s_kv * element size >= 10 GB (i.e., the full attention mask would be too large to materialize), dispatch only to the FlashAttentionDaoLoader kernel and pass an empty tensor as a placeholder for the attention_mask. On this path, only the causal and padded causal mask modes are supported.
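As a rough illustration of the dispatch rule described above, here is a minimal sketch of how the size check and placeholder mask might look. The helper names (MEM_THRESHOLD, dispatch_attention_kernel, and the returned kernel tags) are illustrative assumptions, not the actual ColossalAI code touched by this PR.

```python
# Minimal sketch (hypothetical names) of size-based attention kernel dispatch.
import torch

MEM_THRESHOLD = 10 * 1024 ** 3  # 10 GB threshold from the PR description


def dispatch_attention_kernel(q: torch.Tensor, k: torch.Tensor, is_causal: bool):
    """Choose a kernel based on the size a full (s_q, s_kv) mask would need."""
    s_q, s_kv = q.size(-2), k.size(-2)
    mask_bytes = s_q * s_kv * q.element_size()

    if mask_bytes >= MEM_THRESHOLD:
        # The full mask would be too large to materialize: dispatch straight to
        # the FlashAttention (Dao) kernel and pass an empty placeholder mask.
        # Only causal / padded causal modes are supported on this path.
        assert is_causal, "only causal and padded causal modes are supported"
        attention_mask = torch.empty(0, dtype=q.dtype, device=q.device)
        return "flash_attention_dao", attention_mask

    # Otherwise a dense mask is affordable; build it explicitly.
    attention_mask = torch.ones(s_q, s_kv, dtype=torch.bool, device=q.device)
    if is_causal:
        attention_mask = torch.tril(attention_mask)
    return "dense_attention", attention_mask
```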

@wangbluo wangbluo requested a review from a team as a code owner September 13, 2024 02:36
@wangbluo wangbluo merged commit 37e3523 into hpcaitech:main Sep 14, 2024
4 checks passed
@wangbluo wangbluo deleted the sp_fix branch September 26, 2024 10:06