Parallelize the filling of the std::vector in DMatrix #41

d-consoli · 2025-02-11T10:36:59Z

Currently, DMatrix creation is a bottleneck in the library because it is not performed in parallel. This small modification can solve the issue, and I am sure it can also be done in a cleaner and more performant way (e.g. by tuning the OpenMP chunking).

wphicks · 2025-02-11T23:10:24Z

include/tl2cgen/detail/data_matrix_impl.h

-    data_ = std::vector<ElementType>{data_ptr, data_ptr + num_elem};
+    data_.reserve(num_elem);
+    #pragma omp parallel for
+    for (std::uint64_t i = 0; i < num_elem; ++i) {


Do we have benchmarks showing the impact of this change at various scales? How often is this code likely to get called inside a block that is already parallelized? You mentioned this was a bottleneck; can you give an example of a workload where it is a bottleneck?
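One way to gather the numbers asked for here is a small standalone micro-benchmark that times the original range-constructor copy against an OpenMP copy at several sizes. The sketch below is only an illustration under stated assumptions: `CopySerial`, `CopyParallel`, and `TimeSeconds` are hypothetical helper names, not part of tl2cgen, and the loop body is a plain element-wise copy because the hunk above is cut off before the body. It calls `resize` rather than `reserve`, since writing elements through an index into a vector that has only been reserved is undefined behavior.

```cpp
#include <chrono>
#include <cstdint>
#include <vector>

// Hypothetical helpers for a standalone benchmark; none of these names
// come from tl2cgen.

// Baseline: the range-constructor copy used before the change.
template <typename T>
std::vector<T> CopySerial(T const* data_ptr, std::uint64_t num_elem) {
  return std::vector<T>(data_ptr, data_ptr + num_elem);
}

// Candidate: element-wise copy split across OpenMP threads.
// Without -fopenmp the pragma is ignored and the loop runs serially.
template <typename T>
std::vector<T> CopyParallel(T const* data_ptr, std::uint64_t num_elem) {
  std::vector<T> out;
  out.resize(num_elem);  // resize (not reserve) so every index is a live element
  T* dst = out.data();
  // Signed loop index keeps the loop valid for older OpenMP implementations.
  auto const n = static_cast<std::int64_t>(num_elem);
#pragma omp parallel for schedule(static)
  for (std::int64_t i = 0; i < n; ++i) {
    dst[i] = data_ptr[i];
  }
  return out;
}

// Wall-clock time of a single call to fn, in seconds.
template <typename F>
double TimeSeconds(F&& fn) {
  auto const start = std::chrono::steady_clock::now();
  fn();
  auto const stop = std::chrono::steady_clock::now();
  return std::chrono::duration<double>(stop - start).count();
}
```

Timing both functions over a range of `num_elem` values, and again with `CopyParallel` called from inside an already parallel region, would speak to both the scaling question and the nested-parallelism question. Note that `resize` value-initializes the buffer before the copy, so the parallel version touches the memory twice; whether that cost matters is something the measurements would have to show.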

wphicks · 2025-02-11T23:11:15Z

python/tl2cgen/util.py

@@ -17,6 +17,8 @@ def py_str(string):
    """Convert C string back to Python string"""
    return string.decode("utf-8")

+def check_if_fast():
+    return True


Not sure if there's missing code here or if this was supposed to be removed before the PR was made.

d-consoli added 2 commits February 5, 2025 12:55

Parallelize DMatrix

ddfff11

Add a method to check if is fast

00cd33c

wphicks reviewed Feb 11, 2025
