End to end support for bfp16 scl2vec intrinsics #278

Open, 4 commits to be merged into base: aie-public

niwinanto
Collaborator

TODO:
v128bfp16ebs8 shuffle(v128bfp16ebs8, unsigned int);
This depends on intrinsics from bfp16 upd_ext.

@@ -13,3 +13,4 @@

include "AIEBaseRegisterBanks.td"
def AccRegBank : RegisterBank<"AccRegBank", [ACC512, ACC1024, ACC2048]>;
def GPRRegBank : RegisterBank<"GPRRegBank", [eR, eL, eE, EXPVEC64]>;
Collaborator

Please add a newline at the end of the file.

Collaborator

This should not be needed after a rebase.

Collaborator

@konstantinschwarz left a comment

Looks good, just a few nits.

clang/lib/Headers/aiebase_typedefs.h (resolved)
@@ -572,4 +572,10 @@ def int_aie2p_sqrtf : ClangBuiltin<"__builtin_aie2p_sqrtf">, AIE2PNLF;
// DIVS
def int_aie2p_divs : AIE2PDIVS;

// BFP16 MAC MUL
class AIE2PSHUFFLEBFP16
: Intrinsic<[llvm_v64i8_ty, llvm_v8i8_ty], [llvm_v64i8_ty, llvm_v8i8_ty, llvm_v64i8_ty, llvm_v8i8_ty, llvm_i32_ty],
Collaborator

This should use DefaultAttrsIntrinsic instead of Intrinsic.

class AIE2PSHUFFLEBFP16
: Intrinsic<[llvm_v64i8_ty, llvm_v8i8_ty], [llvm_v64i8_ty, llvm_v8i8_ty, llvm_v64i8_ty, llvm_v8i8_ty, llvm_i32_ty],
[IntrNoMem]>;
def int_aie2p_vshuffle_576_bfp16 : AIE2PSHUFFLEBFP16;
Collaborator

This selects the same instruction whether we come from v64bfp16ebs8 (the 576-bit size) or v64bfp16ebs16 (the 544-bit size), so I would just name it aie2p_vshuffle_bfp16.

Collaborator Author

In practice it is 576 bits for both: a 512-bit mantissa plus a 64-bit exponent. The ebs16 variant replicates its exponent to make it 64-bit. Keeping 576 in the name helps to distinguish between the v64* and v128* variants.
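For reference, the bit counts quoted in this thread (576, 544, 512 + 64) can be checked with a small sketch. It assumes 8-bit mantissas and one 8-bit shared exponent per block, which is consistent with the v64i8 and v8i8 operand types in the intrinsic definition but is not stated explicitly in the PR. All names below are hypothetical and do not come from the PR or from any AIE header.

```cpp
#include <cstddef>

// Assumptions (not confirmed by the PR): 8-bit mantissas, 8-bit shared exponents.
constexpr std::size_t kMantissaBits = 8;
constexpr std::size_t kExponentBits = 8;
// Width of the exponent operand as passed to the shuffle (v8i8 = 64 bits).
constexpr std::size_t kExponentOperandBits = 64;

// Bits needed for `elems` values that share one exponent per `blockSize` values.
constexpr std::size_t storageBits(std::size_t elems, std::size_t blockSize) {
  return elems * kMantissaBits + (elems / blockSize) * kExponentBits;
}

// Bits seen by the shuffle once the exponent field occupies the full 64-bit operand.
constexpr std::size_t shuffleOperandBits(std::size_t elems) {
  return elems * kMantissaBits + kExponentOperandBits;
}

static_assert(storageBits(64, 8) == 576, "v64bfp16ebs8: 512 + 64");
static_assert(storageBits(64, 16) == 544, "v64bfp16ebs16: 512 + 32");
static_assert(shuffleOperandBits(64) == 576, "both v64 variants reach the shuffle as 576 bits");
```

Under these assumptions the 544-bit ebs16 type carries only 32 bits of exponents, so it has to be widened to the 64-bit exponent operand before the shuffle, which matches the author's remark that both v64 variants are 576 bits at that point.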

const RegClassOrRegBank &RegClassOrBank = MRI.getRegClassOrRegBank(DstReg);
const TargetRegisterClass *DstRC =
RegClassOrBank.dyn_cast<const TargetRegisterClass *>();
if (!DstRC) {
Collaborator

Why is this needed?

Collaborator

Also, if this is needed, could you put it in its own commit along with the tests that it affects?

Collaborator Author

I don't know all the details of what happens under the hood, but the register constrainer fails because it cannot deduce the register bank or class unless I expand this. You can find other instructions that follow the same pattern.

I have separated the change into its own commit, together with the respective test updates.

Collaborator

@khallouh left a comment

Could you add an .ll test that goes from IR to assembly?

@@ -300,3 +300,6 @@ BUILTIN(__builtin_aie2p_tanh, "V16yV16g", "nc")

//division/mod
BUILTIN(__builtin_aie2p_divstep, "vUi&Ui&Ui", "nc")

// SHUFFLE
BUILTIN(__builtin_aie2p_vshuffle_576_bfp16, "vV64cV8cV64cV8ciV64c&V8c&", "nc")
Collaborator

Same as for the LLVM intrinsic: we don't need the 576 in the name.

@niwinanto niwinanto force-pushed the niwin.bfp16.scl2vecIntrinsic branch 2 times, most recently from 644abd2 to c085f62 Compare January 20, 2025 07:02
@niwinanto niwinanto force-pushed the niwin.bfp16.scl2vecIntrinsic branch from b4ffd99 to 0f04536 Compare January 20, 2025 07:53
@niwinanto niwinanto force-pushed the niwin.bfp16.scl2vecIntrinsic branch from 0f04536 to 487aede Compare January 20, 2025 08:01
4 participants