End to end support for bfp16 scl2vec intrinsics #278
base: aie-public
@@ -13,3 +13,4 @@

include "AIEBaseRegisterBanks.td"
def AccRegBank : RegisterBank<"AccRegBank", [ACC512, ACC1024, ACC2048]>;
def GPRRegBank : RegisterBank<"GPRRegBank", [eR, eL, eE, EXPVEC64]>;
Please add a newline at the end of the file.
This should not be needed after a rebase.
Looks good, just a few nits.
@@ -572,4 +572,10 @@ def int_aie2p_sqrtf : ClangBuiltin<"__builtin_aie2p_sqrtf">, AIE2PNLF;
// DIVS
def int_aie2p_divs : AIE2PDIVS;

// BFP16 MAC MUL
class AIE2PSHUFFLEBFP16
    : Intrinsic<[llvm_v64i8_ty, llvm_v8i8_ty], [llvm_v64i8_ty, llvm_v8i8_ty, llvm_v64i8_ty, llvm_v8i8_ty, llvm_i32_ty],
Should use DefaultAttrsIntrinsic instead of Intrinsic.
class AIE2PSHUFFLEBFP16
    : Intrinsic<[llvm_v64i8_ty, llvm_v8i8_ty], [llvm_v64i8_ty, llvm_v8i8_ty, llvm_v64i8_ty, llvm_v8i8_ty, llvm_i32_ty],
                [IntrNoMem]>;
def int_aie2p_vshuffle_576_bfp16 : AIE2PSHUFFLEBFP16;
This selects to the same instruction whether we come from v64bfp16ebs8 (aka 576 size) or v64bfp16ebs16 (aka 544 size), so I would just name it aie2p_vshuffle_bfp16.
In reality it is 576 bits for both: a 512-bit mantissa and a 64-bit exponent. ebs16 replicates its exponent to make it 64 bits. Keeping 576 in the name helps to distinguish between the v64* and v128* variants.
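The size arithmetic behind the 576 and 544 figures can be sketched as below. This is only an illustration: it assumes 64 elements with an 8-bit mantissa each and one 8-bit shared exponent per block, which is inferred from the v64i8/v8i8 operand types and the sizes quoted in this thread. The names `storageBits` and `replicateExponents` are hypothetical, and the exact order in which the hardware replicates ebs16 exponents is an assumption.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Assumed layout of a 64-element bfp16 vector (inferred, not authoritative):
// 8-bit mantissa per element, one shared 8-bit exponent per block.
constexpr std::size_t kElems = 64;
constexpr std::size_t kMantissaBits = 8;
constexpr std::size_t kExponentBits = 8;

// Storage size in bits for a given exponent block size (8 or 16 elements).
constexpr std::size_t storageBits(std::size_t blockSize) {
  return kElems * kMantissaBits + (kElems / blockSize) * kExponentBits;
}

static_assert(storageBits(8) == 576, "ebs8: 512 mantissa + 64 exponent bits");
static_assert(storageBits(16) == 544, "ebs16: 512 mantissa + 32 exponent bits");

// Widen the four ebs16 exponents to eight bytes by repeating each exponent
// once per 8-element sub-block. The replication order is an assumption.
std::array<std::uint8_t, 8>
replicateExponents(const std::array<std::uint8_t, 4> &exps) {
  std::array<std::uint8_t, 8> out{};
  for (std::size_t i = 0; i < out.size(); ++i)
    out[i] = exps[i / 2];
  return out;
}
```

Under these assumptions both source types end up as a 512-bit mantissa plus a 64-bit exponent by the time they reach the shuffle, which is the point made above.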
const RegClassOrRegBank &RegClassOrBank = MRI.getRegClassOrRegBank(DstReg);
const TargetRegisterClass *DstRC =
    RegClassOrBank.dyn_cast<const TargetRegisterClass *>();
if (!DstRC) {
Why is this needed?
Also, if this is needed, could you put it in its own commit along with the tests that it affects?
I don't know all the details of what happens under the hood, but the register constrainer fails because it cannot deduce the register bank or class if I don't expand this. Other instructions follow the same pattern.
I did separate the change, along with the respective test updates.
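For readers unfamiliar with the class-or-bank distinction, here is a toy model of the fallback being discussed. It does not use LLVM: `RegClass`, `RegBank`, `classForBank` and `resolveClass` are hypothetical stand-ins, and picking a class by value size is an assumption for illustration, not what the patch necessarily does. Requires C++17.

```cpp
#include <string>
#include <variant>

// Hypothetical stand-ins for a register class and a register bank.
struct RegClass { std::string name; unsigned sizeInBits; };
struct RegBank { std::string name; };

// A virtual register carries either a concrete class or only a bank.
using RegClassOrBank = std::variant<const RegClass *, const RegBank *>;

// Hypothetical lookup: pick a class from the bank and the value size.
const RegClass *classForBank(const RegBank &bank, unsigned sizeInBits) {
  static const RegClass acc512{"ACC512", 512};
  static const RegClass acc1024{"ACC1024", 1024};
  static const RegClass acc2048{"ACC2048", 2048};
  if (bank.name != "AccRegBank")
    return nullptr; // unknown bank: the caller must treat this as a failure
  if (sizeInBits <= 512)
    return &acc512;
  return sizeInBits <= 1024 ? &acc1024 : &acc2048;
}

// Return the class if one is already assigned, otherwise derive it from
// the bank. Returns nullptr when neither is available.
const RegClass *resolveClass(const RegClassOrBank &rcOrBank,
                             unsigned sizeInBits) {
  if (auto *rc = std::get_if<const RegClass *>(&rcOrBank); rc && *rc)
    return *rc;
  if (auto *rb = std::get_if<const RegBank *>(&rcOrBank); rb && *rb)
    return classForBank(**rb, sizeInBits);
  return nullptr;
}
```

The point of the sketch is only that a register with a bank but no class needs an extra step before it can be constrained; a register that already has a class skips that step.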
Could you add an .ll test from IR to assembly?
@@ -300,3 +300,6 @@ BUILTIN(__builtin_aie2p_tanh, "V16yV16g", "nc")

//division/mod
BUILTIN(__builtin_aie2p_divstep, "vUi&Ui&Ui", "nc")

// SHUFFLE
BUILTIN(__builtin_aie2p_vshuffle_576_bfp16, "vV64cV8cV64cV8ciV64c&V8c&", "nc")
Same as for the LLVM intrinsic, we don't need the 576 in the name.
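As a reading aid for the builtin's type string, the sketch below decodes the small subset of Clang's builtin type encoding used here: `v` is void, `c` is char, `i` is int, `U` is an unsigned prefix, `V<N>` introduces a vector of N elements, and a trailing `&` marks a reference. `decodeBuiltinTypes` is a hypothetical helper written for this note, not Clang's parser; any other letter is reported as `?`.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Decode a subset of Clang's builtin type-string encoding into readable
// names. The first entry is the return type, the rest are parameters.
std::vector<std::string> decodeBuiltinTypes(const std::string &sig) {
  std::vector<std::string> out;
  std::size_t i = 0;
  while (i < sig.size()) {
    std::string prefix, vec;
    if (sig[i] == 'V') { // vector: read the element count
      ++i;
      std::string n;
      while (i < sig.size() &&
             std::isdigit(static_cast<unsigned char>(sig[i])))
        n += sig[i++];
      vec = "vector<" + n + " x ";
    }
    if (i < sig.size() && sig[i] == 'U') { // unsigned prefix
      prefix = "unsigned ";
      ++i;
    }
    std::string base = "?"; // unsupported or missing base type
    if (i < sig.size()) {
      switch (sig[i]) {
      case 'v': base = "void"; break;
      case 'c': base = "char"; break;
      case 'i': base = "int"; break;
      default: break;
      }
      ++i;
    }
    std::string ty = prefix + base;
    if (!vec.empty())
      ty = vec + ty + ">";
    if (i < sig.size() && sig[i] == '&') { // reference suffix
      ty += " &";
      ++i;
    }
    out.push_back(ty);
  }
  return out;
}
```

Applied to `"vV64cV8cV64cV8ciV64c&V8c&"`, this yields a void return, four vector inputs (64, 8, 64 and 8 chars), an int, and two vector references. The two references appear to play the role of the two results of the LLVM intrinsic.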
644abd2 to c085f62
b4ffd99 to 0f04536
0f04536 to 487aede
TODO:
v128bfp16ebs8 shuffle(v128bfp16ebs8, unsigned int);
This depends on intrinsics from bfp16 upd_ext.