Fix Memory Model and Internal TRSM failure #852

AGonzales-amd · 2024-11-13T23:17:10Z

This PR fixes the checkin_misc_MEMORY_MODEL.user_managed test failure when ROCSOLVER_USE_INTERNAL_BLAS is set and internal trsm is used for rocsolver_dgetrf_strided_batched.

jmachado-amd · 2024-12-11T21:08:44Z

clients/gtest/memory_model_gtest.cpp

+#ifndef USE_INTERNAL_TRSM
    EXPECT_GT(size, 2000000);
+#else
+    // internal trsm does not use scratch memory
+    EXPECT_LT(size, 2000000);
+#endif


Hi @AGonzales-amd, those changes fix the failure, but I am afraid they do not improve the test. The real problem is that this test makes assumptions about the implementation of getrf, which is brittle: unless it is absolutely necessary, tests should not make any assumptions about internal implementations. Implementations can change and invalidate those assumptions (which is exactly what happens when getrf uses our TRSM).

A more appropriate solution would be to completely remove any calls to getrf that expect it to allocate an implementation-dependent amount of memory, and to make sure that the test itself explicitly sets everything it needs.
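As a minimal sketch of what such an implementation-independent check could look like: the names below (`FakeHandle`, `set_workspace`, `fake_factorize`) are hypothetical stand-ins, not rocBLAS or rocSOLVER API. The point is only that the test picks the workspace size itself and asserts on the value it set, never on how much scratch memory a routine happens to request.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a handle in user-managed memory mode.
// These names are illustrative only; they are not rocBLAS or rocSOLVER API.
struct FakeHandle
{
    bool user_managed = false;       // true once the caller has supplied a workspace
    std::size_t workspace_bytes = 0; // size of the caller-supplied workspace
};

// The test, not the library, decides how much memory the handle owns.
void set_workspace(FakeHandle& h, std::size_t bytes)
{
    h.user_managed = true;
    h.workspace_bytes = bytes;
}

// Hypothetical routine whose scratch requirement is an implementation detail:
// it differs depending on which internal path is taken.
bool fake_factorize(const FakeHandle& h, int n, bool internal_trsm)
{
    const std::size_t needed
        = internal_trsm ? 0 : static_cast<std::size_t>(n) * n * sizeof(double);
    return needed <= h.workspace_bytes; // false models a memory error
}

// Implementation-independent check: the handle must report exactly what the
// test configured, whichever internal path the routine takes.
bool user_managed_state_is_preserved(int n, bool internal_trsm)
{
    const std::size_t chosen = std::size_t(1) << 20; // 1 MiB, set by the test itself
    FakeHandle h;
    set_workspace(h, chosen);
    fake_factorize(h, n, internal_trsm); // result is deliberately not asserted on
    return h.user_managed && h.workspace_bytes == chosen;
}
```

With this shape, the same assertion holds whether or not the internal TRSM path is used, so no `#ifndef USE_INTERNAL_TRSM` branch is needed in the test.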

EdDAzevedo · 2024-12-12T04:50:42Z

clients/gtest/memory_model_gtest.cpp

+#else
+    // internal trsm does not use scratch memory
+    EXPECT_EQ(status, rocblas_status_success);
+#endif


Minor questions: Is there a rocsolver function that tells the user whether, say, 100 MB will be sufficient for the rocsolver dgetrf call for a given matrix size, or is this just trial and error, i.e. doubling the size until it works? Perhaps, when using rocblas managed mode, there is a way to query the maximum memory pool size used? Perhaps the user could start in rocblas managed mode and later switch to user-owned mode (knowing the maximum pool size rocblas used) for higher performance?

AGonzales-amd added 2 commits November 13, 2024 16:36

fix Memory Model failure with internal trsm

8c20e3a

temp. change default to internal blas

8737cb4

tfalders added the noOptimizations label (Disable optimized kernels for small sizes for some routines) Nov 21, 2024

revert default internal blas flag

61d90cb

AGonzales-amd marked this pull request as ready for review December 11, 2024 20:26

AGonzales-amd requested review from jzuniga-amd, tfalders, cgmb, qjojo, EdDAzevedo and jmachado-amd as code owners December 11, 2024 20:27

jmachado-amd reviewed Dec 11, 2024

EdDAzevedo reviewed Dec 12, 2024
