Add CUTLASS-based row-wise scaled sparse FP8 kernel #1331
| Job | Run time |
|---|---|
| | 7s |
| | 2m 59s |
| | 9m 50s |
| | 9m 42s |
| | 8m 25s |
| | 10m 8s |
| | 15m 24s |
| | 18m 9s |
| | 5m 26s |
| | 16s |
| | 10s |
| | 13s |
| | 14s |
| | 15s |
| | 12s |
| | 18s |
| | 15s |
| | 0s |
| Total | 1h 22m 3s |