Zero masked arithmetic operations #2426

Merged

mazimkhan
Contributor

Introduces:

  • MaskedMaxOrZero(m, a, b): returns Max(a, b)[i], or 0 if m[i] is false.
  • MaskedAddOrZero(m, a, b): returns a[i] + b[i], or 0 if m[i] is false.
  • MaskedSubOrZero(m, a, b): returns a[i] - b[i], or 0 if m[i] is false.
  • MaskedMulOrZero(m, a, b): returns a[i] * b[i], or 0 if m[i] is false.
  • MaskedDivideOrZero(m, a, b): returns a[i] / b[i], or 0 if m[i] is false.
  • MaskedSaturatedAddOrZero(m, a, b): returns a[i] + b[i] saturated to the minimum/maximum representable value, or 0 if m[i] is false.
  • MaskedSaturatedSubOrZero(m, a, b): returns a[i] - b[i] saturated to the minimum/maximum representable value, or 0 if m[i] is false.
  • MaskedMulFixedPoint15OrZero(m, a, b): returns the result of multiplying two Q1.15 fixed-point numbers, or 0 if m[i] is false.
  • MaskedMulAddOrZero(m, a, b, c): returns a[i] * b[i] + c[i], or 0 if m[i] is false.
  • MaskedNegMulAddOrZero(m, a, b, c): returns -a[i] * b[i] + c[i], or 0 if m[i] is false.
  • MaskedWidenMulPairwiseAddOrZero(d, m, a, b): widens a and b to TFromD<D> and computes a[2*i+1]*b[2*i+1] + a[2*i+0]*b[2*i+0], or 0 if m[i] is false.

Tests are included for all operations; each test checks both the masking and the underlying operation.
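For readers unfamiliar with zero-masking, the per-lane semantics listed above can be modelled in plain scalar C++. This is only a sketch: the `Ref...` functions and the `Lanes`/`Mask` aliases are hypothetical names invented for illustration, they use `std::array` rather than Highway vectors, and they are not the implementation in this PR. Three representative ops are shown (add, saturated add on 16-bit lanes, and widening pairwise multiply-add from 16-bit to 32-bit lanes).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical scalar stand-ins for a vector and a mask (not Highway types).
template <typename T, std::size_t N>
using Lanes = std::array<T, N>;
template <std::size_t N>
using Mask = std::array<bool, N>;

// Models MaskedAddOrZero: a[i] + b[i] where m[i] is true, otherwise 0.
// Assumes the sum does not overflow T.
template <typename T, std::size_t N>
Lanes<T, N> RefMaskedAddOrZero(const Mask<N>& m, const Lanes<T, N>& a,
                               const Lanes<T, N>& b) {
  Lanes<T, N> out{};  // value-initialized, so masked-off lanes stay 0
  for (std::size_t i = 0; i < N; ++i) {
    if (m[i]) out[i] = static_cast<T>(a[i] + b[i]);
  }
  return out;
}

// Models MaskedSaturatedAddOrZero for int16_t lanes: the sum is clamped to
// [INT16_MIN, INT16_MAX] where m[i] is true, otherwise the lane is 0.
template <std::size_t N>
Lanes<int16_t, N> RefMaskedSaturatedAddOrZero(const Mask<N>& m,
                                              const Lanes<int16_t, N>& a,
                                              const Lanes<int16_t, N>& b) {
  Lanes<int16_t, N> out{};
  for (std::size_t i = 0; i < N; ++i) {
    if (!m[i]) continue;
    // Compute in 32 bits so the intermediate sum cannot overflow.
    int32_t sum = static_cast<int32_t>(a[i]) + static_cast<int32_t>(b[i]);
    const int32_t lo = std::numeric_limits<int16_t>::min();
    const int32_t hi = std::numeric_limits<int16_t>::max();
    if (sum < lo) sum = lo;
    if (sum > hi) sum = hi;
    out[i] = static_cast<int16_t>(sum);
  }
  return out;
}

// Models MaskedWidenMulPairwiseAddOrZero for int16_t inputs and int32_t
// outputs: output lane i combines input lanes 2*i and 2*i+1. The mask has one
// entry per output lane. The all-INT16_MIN overflow case is not handled here.
template <std::size_t N>
Lanes<int32_t, N> RefMaskedWidenMulPairwiseAddOrZero(
    const Mask<N>& m, const Lanes<int16_t, 2 * N>& a,
    const Lanes<int16_t, 2 * N>& b) {
  Lanes<int32_t, N> out{};
  for (std::size_t i = 0; i < N; ++i) {
    if (!m[i]) continue;
    const int32_t odd = static_cast<int32_t>(a[2 * i + 1]) * b[2 * i + 1];
    const int32_t even = static_cast<int32_t>(a[2 * i]) * b[2 * i];
    out[i] = odd + even;
  }
  return out;
}
```

A model like this can double as the expected-value generator in a test: run the vector op and the scalar model on the same inputs and mask, then compare lane by lane, which exercises both the masking and the underlying arithmetic.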

google-cla bot commented Jan 6, 2025

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

Resolved review threads (all outdated):

  • g3doc/quick_reference.md (3 threads)
  • hwy/ops/arm_sve-inl.h (4 threads)
  • hwy/tests/masked_arithmetic_test.cc (1 thread)

@wbb-ccl force-pushed the cc_up_masked_arithmetic branch 2 times, most recently from 439cc55 to 3ab06f2, on January 30, 2025 09:31

jan-wassenberg previously approved these changes Jan 30, 2025
Member

@jan-wassenberg jan-wassenberg left a comment

Nice :)

mazimkhan and others added 6 commits February 3, 2025 15:22
Remove OrZero suffixes for consistency
Rename MaskedDIvide for consistency
Rename HWY_SVE_RETV_ARGMVVZ etc. for consistency
Undef all new macros at end of file
Remove unused macro
Remove unnecessary wrapper functions
Remove MulLower, the rest of this implementation is in google#2429
Remove docs for ops not in this branch
Add missing MaskedMax op
Consolidate MulAdd tests
@jan-wassenberg
Member

This one also failed due to a shadowed local variable at masked_arithmetic_test.cc:636, sorry I didn't see that.

Member

@jan-wassenberg jan-wassenberg left a comment

Shadowed variable at masked_arithmetic_test.cc:636.

@jan-wassenberg
Member

CI looks good, hopefully will be merged soon.

@copybara-service copybara-service bot merged commit 4afbd48 into google:master Feb 5, 2025
34 of 40 checks passed
3 participants