QS8 / QU8 PReLU microkernels #7738

swamipreksha · 2025-01-30T08:24:50Z

- Implementations for various ISAs:
  - x86 AVX2
  - Scalar ISA
- Unit tests

google-cla · 2025-01-30T08:24:55Z

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

dsharlet · 2025-01-30T18:43:55Z

Thanks for the PR. Unfortunately, it looks like this is implementing the old prelu kernels. We now support prelu as a binary operator, and removed the old prelu operator: #6962, #7034

Can you please add binary operator implementations of the kernels you would like to have instead?
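As a rough illustration of what "PReLU as a binary operator" means, the sketch below treats the slope as a second input rather than as a parameter baked into the operator. It uses float for clarity instead of the quantized types in this PR. The function names are hypothetical and are not XNNPACK API. Reading the `c` suffix in `vpreluc` as "second operand is a constant broadcast to every element" is an assumption based on the directory names in this PR.

```c
#include <stddef.h>

// Hypothetical reference (not XNNPACK API): PReLU as a binary elementwise
// operation, where the slope is the second input tensor.
static void prelu_binary_ref(size_t n, const float* input, const float* slope,
                             float* output) {
  for (size_t i = 0; i < n; i++) {
    const float x = input[i];
    // Negative inputs are scaled by the matching slope element;
    // non-negative inputs pass through unchanged.
    output[i] = x < 0.0f ? x * slope[i] : x;
  }
}

// Hypothetical variant with a single slope value applied to every element
// (assumed here to correspond to the "c" kernels).
static void prelu_binary_c_ref(size_t n, const float* input, float slope,
                               float* output) {
  for (size_t i = 0; i < n; i++) {
    const float x = input[i];
    output[i] = x < 0.0f ? x * slope : x;
  }
}
```

Splitting the two variants lets the constant-slope case avoid loading a second array, which is presumably why the PR carries separate `qs8-vprelu` and `qs8-vpreluc` templates.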

src/operators/binary-elementwise-nd.c

src/xnnpack/microparams.h

src/qs8-vpreluc/avx2.c.in

src/qs8-vpreluc/scalar.c.in

src/qs8-vprelu/avx2.c.in

src/qs8-vprelu/scalar.c.in

dsharlet · 2025-02-03T06:55:52Z

src/qs8-vpreluc/avx2.c.in

+        __m256i vacc${N} = _mm256_blendv_epi8(va${N}_sub, _mm256_mullo_epi32(va${N}_sub, vslope), _mm256_cmpgt_epi32(_mm256_setzero_si256(), va${N}_sub));
+
+      $for N in range(2*SIMD_TILE):
+        __m256 vscale${N} = _mm256_blendv_ps(vnegative_multiplier, vpositive_multiplier, _mm256_castsi256_ps(_mm256_cmpgt_epi32(va${N}_sub, _mm256_setzero_si256())));


Use the same condition here too? Both for consistency, and to rely less on compiler smartness to effectively do CSE. In fact, consider computing the comparison explicitly with an intermediate?
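The suggestion can be sketched with a scalar analogue of a single lane. In the quoted diff, the accumulator is selected with `0 > va_sub` while the scale is selected with `va_sub > 0`; the sketch below computes one predicate and reuses it for both selections. Everything here is an assumption for illustration: the function name, the parameter names, and the requantization steps after the selection (float multiply, zero-point add, clamp, round) are not taken from the PR. In the AVX2 template, the equivalent would presumably be a single `_mm256_cmpgt_epi32` result stored in a named variable and passed to both blends.

```c
#include <stdint.h>

// Hypothetical scalar analogue of one lane (not the PR's actual kernel).
static int8_t prelu_lane_ref(int8_t a, int32_t a_zero_point, int32_t slope,
                             float negative_multiplier,
                             float positive_multiplier,
                             int32_t output_zero_point) {
  const int32_t a_sub = (int32_t) a - a_zero_point;

  // Compute the comparison once, as an explicit intermediate...
  const int is_negative = a_sub < 0;

  // ...and reuse it for both selections, so they cannot disagree and the
  // compiler does not need to prove the two conditions are related.
  const int32_t acc = is_negative ? a_sub * slope : a_sub;
  const float scale = is_negative ? negative_multiplier : positive_multiplier;

  // Assumed requantization: scale, add the output zero point, clamp to the
  // int8 range, then round half away from zero.
  float out = (float) acc * scale + (float) output_zero_point;
  if (out < -128.0f) out = -128.0f;
  if (out > 127.0f) out = 127.0f;
  return (int8_t) (int32_t) (out + (out >= 0.0f ? 0.5f : -0.5f));
}
```

When `a_sub` is exactly zero the accumulator is zero, so either multiplier gives the same product; the two original conditions agree in result and differ only in form.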

swamipreksha · 2025-02-05

We have addressed all your review comments. Please let us know if any other changes are required.

dsharlet · 2025-02-05

Thank you for the PR, this is great work.

Just a few remaining minor nits.

src/configs/binary-elementwise-config.c

src/qs8-vprelu/gen/qs8-vprelu-avx2-u16.c

- Implementations for various ISAs:
  - x86 AVX2
  - Scalar ISA
- Unit tests

Signed-off-by: Ravi Kumar Soni <[email protected]>
Signed-off-by: Swami, Preksha <[email protected]>

swamipreksha force-pushed the qs8_qu8_vprelu branch from 2adbcf3 to 4af1bc9 Compare January 30, 2025 09:54

swamipreksha force-pushed the qs8_qu8_vprelu branch from 4af1bc9 to cfac799 Compare February 5, 2025 06:11

swamipreksha force-pushed the qs8_qu8_vprelu branch 3 times, most recently from aa81513 to fee5c12 Compare February 6, 2025 07:57
