Add CUTLASS-based row-wise scaled sparse FP8 kernel #1313
| Job | Run time |
|---|---|
| | 5s |
| | 2m 54s |
| | 10m 19s |
| | 9m 21s |
| | 8m 38s |
| | 10m 13s |
| | 15m 55s |
| | 19m 3s |
| | 6m 12s |
| | 16s |
| | 16s |
| | 26s |
| | 25s |
| | 13s |
| | 14s |
| | 15s |
| | 14s |
| | 0s |
| Total | 1h 24m 59s |