Question about pe_bias build for position embedding #18

Open
1311894932 opened this issue Dec 15, 2024 · 1 comment

Comments

@1311894932 commented Dec 15, 2024

Hello, thanks for your great work on FlowMDM! But I have a question about the pe_bias matrix.

In the code, the comment says: "pe_bias --> [T, T] matrix with -inf and 0's limiting where the attention during APE mode focuses (0's), i.e., inside each subsequence". And in BPE_Rotary:
```python
if pe_bias != None:
    assert (w.int() == w).all(), "w should be 0 or 1 when using multitext at training"
    pe_bias[w.squeeze() == 1] = 0  # need to zero bias out for the relative PE batch elements
```
I am completely confused. As I understand it, this matrix is added to the QK dot products when using absolute positional encoding, so I don't know what the above code is meant to do...
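For reference, here is a minimal sketch of how I understand that APE bias to be built and applied (build_pe_bias and all the shapes are my own toy example, not FlowMDM's actual code):

```python
import torch

def build_pe_bias(subseq_lengths):
    """Toy helper: block-diagonal [T, T] bias, 0 inside each subsequence
    and -inf elsewhere, so softmax(QK^T / sqrt(d) + bias) restricts
    attention to the current subsequence."""
    T = sum(subseq_lengths)
    bias = torch.full((T, T), float("-inf"))
    start = 0
    for length in subseq_lengths:
        bias[start:start + length, start:start + length] = 0.0
        start += length
    return bias

# Two subsequences of lengths 3 and 2: attention cannot cross the boundary.
pe_bias = build_pe_bias([3, 2])
q = torch.randn(5, 8)                      # [T, d] toy queries
k = torch.randn(5, 8)                      # [T, d] toy keys
scores = q @ k.t() / 8 ** 0.5 + pe_bias    # -inf entries vanish after softmax
attn = torch.softmax(scores, dim=-1)       # each row only attends within its block
```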

Any response will be appreciated!

@1311894932 (Author) commented Dec 15, 2024

Wait... maybe it's because rot_pos_emb does not need the -inf mask, since rotary PE already has its own attention horizon? So we should zero pe_bias out for those batch elements?
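Here is a toy reproduction of what I now think that line does (the batched shapes and the two-subsequence mask are my own guesses, not the repo's actual values):

```python
import torch

B, T = 4, 6
# Toy batched bias [B, T, T]: the same APE-style mask for every element
# (two subsequences of length 3; 0 inside a block, -inf across blocks).
pe_bias = torch.full((B, T, T), float("-inf"))
pe_bias[:, :3, :3] = 0.0
pe_bias[:, 3:, 3:] = 0.0

# w == 1 marks batch elements that use the relative (rotary) PE instead of APE.
w = torch.tensor([[0.0], [1.0], [0.0], [1.0]])
assert (w.int() == w).all()
pe_bias[w.squeeze() == 1] = 0   # those elements get no additive mask at all
print(pe_bias[1])               # all zeros: bias disabled for this rotary element
```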
Hope someone can answer my question, thanks.
