
Tensor contractions #105

Draft · wardvermeulen wants to merge 206 commits from TensorContractions into master

Conversation

@wardvermeulen (Collaborator) commented Jun 7, 2023

This PR adds tensor contraction functionality to GemmKernels.jl using the GEMM-like Tensor Tensor (GETT) multiplication algorithm. The API mimics the cuTENSOR API. It is still a draft; the benchmark scripts need further refinement. Because it is a draft, I have temporarily disabled the other tests. I also had to revert the CUDA runtime to v11.8 for cuTENSOR to work.

The following function of cuTENSOR is implemented:

  • contraction!

This works with both the WMMAOp and the FPUOp of #101. As far as I can tell, cuTENSOR does not support operations other than multiplication and addition for its contractions; thanks to the FPU operator, that is possible here.

The following functions are not implemented:

  • permutation!
  • reduction!
  • elementwiseTrinary!
  • elementwiseBinary!

I think it could be interesting future work to build a permutation (i.e. transposition) kernel and a reduction kernel by reusing the GemmKernels.jl building blocks. The same goes for the elementwise functions: you could probably do something extremely similar to GETT, but the kernel would need changes, since those operations do not perform a contraction.

The contraction! functionality is tested against the TCCG benchmark suite, using cuTENSOR to verify the results. Benchmarks for the Tesla P100, GeForce RTX 2080 Ti and Tesla V100 will be added later.
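To illustrate the GETT idea underlying contraction!: a tensor contraction can be lowered to a single GEMM by permuting and reshaping the tensor modes into matrix dimensions. The following is a minimal CPU sketch in plain Julia; the mode layout is just an example, and the actual kernel fuses these steps on the GPU rather than materializing the permuted copies.

```julia
# Sketch: D[m1, m2, n] = Σ_k A[m1, k, m2] * B[k, n], lowered to one GEMM.
A = rand(Float32, 4, 5, 6)          # modes (m1, k, m2)
B = rand(Float32, 5, 7)             # modes (k, n)

# Move the contracted mode k to the end, then flatten (m1, m2) into M.
Aperm = permutedims(A, (1, 3, 2))   # modes (m1, m2, k)
Amat  = reshape(Aperm, 4 * 6, 5)    # (M, K) matrix with M = m1*m2

Dmat = Amat * B                     # a plain GEMM performs the contraction
D    = reshape(Dmat, 4, 6, 7)       # back to modes (m1, m2, n)

# Reference: the naive loop nest computes the same result.
Dref = zeros(Float32, 4, 6, 7)
for m1 in 1:4, m2 in 1:6, n in 1:7, k in 1:5
    Dref[m1, m2, n] += A[m1, k, m2] * B[k, n]
end
@assert D ≈ Dref
```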

@thomasfaingnaert (Member)

Can you rebase against master?

@maleadt (Member) commented Jun 13, 2023

See JuliaGPU/CUDA.jl#1960, both for CI and for any impact it may have on your work.

@maleadt (Member) commented Jun 15, 2023

If you want to use a Manifest, it'll have to be one generated by the oldest version of Julia you want to test, i.e., 1.6 (should be easy enough using juliaup). And for that Manifest to work with 1.9, you need to call Pkg.resolve: https://github.com/JuliaGPU/CUDA.jl/blob/3321fc8aa53fee2ac39783ad1119af665545750b/.buildkite/pipeline.yml#L22-L24
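Concretely, that workflow could look like the following (a hedged sketch assuming juliaup's `julia +<channel>` syntax; the actual CI setup is in the linked pipeline.yml):

```shell
# Generate Manifest.toml with the oldest supported Julia (1.6)...
juliaup add 1.6
julia +1.6 --project -e 'using Pkg; Pkg.instantiate()'
# ...then re-resolve it so a newer Julia (e.g. 1.9) can use the same Manifest.
julia +1.9 --project -e 'using Pkg; Pkg.resolve()'
```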

@@ -0,0 +1,56 @@
export GETT
Member review comment: What is this file for? It's included nowhere.

extent::Vector{Int}
stride::Vector{Int}
dataType::DataType
unaryOp
Member review comment: Why is this part of the tensor, and not the contraction plan?

@wardvermeulen force-pushed the TensorContractions branch 3 times, most recently from f9266e8 to 6a7af93 on October 24, 2023
@wardvermeulen force-pushed the TensorContractions branch 3 times, most recently from 582366c to cf04fa2 on November 7, 2023