[Dev] Bug fix for Block Reduce Template and improve TL #183

LeiWang1999 · 2024-09-16T09:14:29Z

Fix bugs for incorrect thread binding and assignment for K Tile Size.

Improve TL Pass MergeStaticSharedMemory to handle complex memory planning (for example, Stream-K).
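For readers unfamiliar with static shared-memory merging, here is a minimal, generic sketch of the idea: buffers whose live ranges do not overlap can share the same offset inside one merged allocation. This is not the code of the MergeStaticSharedMemory pass; the `Buffer` type, the `plan_offsets` helper, and the greedy first-fit strategy are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Buffer:
    """Hypothetical description of one shared-memory buffer."""
    name: str
    size: int   # size in bytes
    first: int  # index of the first statement that uses the buffer
    last: int   # index of the last statement that uses it (inclusive)


def plan_offsets(buffers, align=16):
    """Greedy first-fit planner (illustrative only).

    Each buffer is placed at the lowest aligned offset that does not
    collide with an already placed buffer that is live at the same time.
    Returns (offsets by buffer name, total bytes of the merged allocation).
    """
    placed = []   # list of (offset, Buffer)
    offsets = {}
    # Place large buffers first; ties are broken by first use.
    for buf in sorted(buffers, key=lambda b: (-b.size, b.first)):
        # Address ranges occupied by buffers whose live range overlaps ours.
        conflicts = sorted(
            (off, off + other.size)
            for off, other in placed
            if not (other.last < buf.first or buf.last < other.first)
        )
        offset = 0
        for start, end in conflicts:
            if offset + buf.size <= start:
                break  # the gap before this conflicting range is big enough
            # Otherwise skip past the range, rounding up to the alignment.
            offset = max(offset, -(-end // align) * align)
        placed.append((offset, buf))
        offsets[buf.name] = offset
    total = max((off + b.size for off, b in placed), default=0)
    return offsets, total
```

In this sketch, a buffer that is dead before another one becomes live can reuse its bytes, so the merged allocation can be smaller than the sum of the individual sizes.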

Enhance LowerThreadAllReduce Pass to Auto Select Memory Scope based on the IR Script.


LeiWang1999 · 2024-09-16T09:17:56Z

Also enhance auto tune.

@autotune(configs=get_configs(), keys=['block_row_warps', 'block_col_warps', 'warp_rows', 'warp_cols', 'chunk', 'reduce_k', 'num_stages', 'splitk_factor'], warmup=5, rep=20)
@jit(out_idx=[2], supply_type=tl.TensorSupplyType.Normal, ref_prog=ref_program, rtol=1, skip_check=True, profiler='tvm')

The profiler option now accepts 'tvm' or 'torch', which enables accurate kernel profiling with the TVM runtime.

Also introduce tqdm to show tuning progress.
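As a rough illustration of what the warmup and rep arguments in the decorator above typically mean, the sketch below runs each candidate a few times untimed, then averages the latency over the timed repetitions and keeps the fastest configuration. It is a standalone approximation, not the autotune implementation from this PR; `benchmark`, `tune`, and `make_kernel` are hypothetical names, and wall-clock timing stands in for the real profiler.

```python
import time


def benchmark(fn, warmup=5, rep=20):
    """Return the mean latency of fn in milliseconds.

    fn is called `warmup` times without timing, then `rep` times
    under a single wall-clock measurement.
    """
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(rep):
        fn()
    return (time.perf_counter() - start) * 1e3 / rep


def tune(make_kernel, configs, warmup=5, rep=20):
    """Try every config and return (best_config, best_latency_ms).

    make_kernel is a hypothetical factory that takes the config as
    keyword arguments and returns a zero-argument callable.
    """
    best_config, best_latency = None, float("inf")
    # Wrapping `configs` in tqdm.tqdm(...) here would show a progress bar.
    for config in configs:
        kernel = make_kernel(**config)
        latency = benchmark(kernel, warmup=warmup, rep=rep)
        if latency < best_latency:
            best_config, best_latency = config, latency
    return best_config, best_latency
```

Each configuration costs warmup + rep kernel launches, so the defaults shown in the decorator (5 and 20) mean 25 launches per candidate.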

LeiWang1999 added 30 commits July 5, 2024 08:54

Refactor BatchMatMulEmitter and BatchMatMulSelector for improved readability and maintainability

d8884e6

Refactor import statements for improved readability and maintainability

fc84173

Refactor import statements for improved readability and maintainability

02f64de

disable failure email for ci

397eee6

remove email notifications.

20f6ad1

move relax pass from testing to mlc_llm

b93c394

Merge branch 'main' of https://github.com/Microsoft/BitBLAS into main

ba6a6df

Refactor scripts with se check_eual_ref_scripts_with_emitter function

257693a

Lint Fix

9bb7f49

Merge branch 'main' of https://github.com/Microsoft/BitBLAS into main

39e7614

Refactor scripts with se check_eual_ref_scripts_with_emitter function

93eb5a5

Merge branch 'main' of https://github.com/Microsoft/BitBLAS into main

72b9740

Merge branch 'main' of https://github.com/Microsoft/BitBLAS into main

5b65979

Merge branch 'main' of https://github.com/Microsoft/BitBLAS into main

d9bd479

buf fix for matrix support

99515cb

lint fix

14406ef

dispatch tensor core based on shapes

d30ec4f

update install commands

fde4029

import scripts

6a04749

Merge branch 'main' of https://github.com/Microsoft/BitBLAS into docs

9d90c40

remove shared mem hack

9ef14e9

revert change for swizzling

63f363e

bug fix

b29c66c

Merge branch 'main' of https://github.com/Microsoft/BitBLAS into docs

4643dd9

tl examples

28beb13

Enhance Swizzle

c0b476f

lint fix

2bf14a8

Merge branch 'main' of https://github.com/Microsoft/BitBLAS into tl-layout

52accbf

test fix

19aa985

lint fix

ef8f93c

LeiWang1999 added 26 commits September 3, 2024 07:32

optimize layout

4015cc4

update tl utils.

5c5880c

macro optimization

1042ffd

Merge branch 'main' of https://github.com/Microsoft/BitBLAS into tl-layout

1ecd76e

test fix

7bb21e7

gemm_ss

6a22442

Merge branch 'main' of https://github.com/Microsoft/BitBLAS into tl-layout

b9ea093

doc fix

e9b56b4

lint fix

3eb6888

Merge branch 'main' of https://github.com/Microsoft/BitBLAS into tl-layout

5322785

lint fix

6f18d15

remove debug print

187f448

remove debug print

e1fac68

vectorization init

4f25626

Merge branch 'main' of https://github.com/Microsoft/BitBLAS into tl-layout

2686030

lint fix

23a8e8b

prelude update

069ad5e

Merge branch 'main' of https://github.com/Microsoft/BitBLAS into tl-layout

23fe3f8

update tvm

9119dd3

bug fix for reduce_k with shared memory

15f4c1f

bug fix

f8518ae

bug fix

ea50147

Enhance Macro Generation

f888af1

Lift Layout to reduce load time

a0bfabf

lint fix

b1fdbcf

Merge branch 'main' of https://github.com/Microsoft/BitBLAS into tl-layout

137b6fd

LeiWang1999 added 2 commits September 16, 2024 17:17

test fix

0acc369

red fix

62de446

LeiWang1999 merged commit 916a54c into microsoft:main Sep 17, 2024
