A few parts of the Accumulation loop are bottlenecks at the moment:

- `make_all_cat_comids` takes ~15 minutes.
- The zone-processing `for` loop in the `MakeVectors` function takes ~80 minutes.
- The `Bastards` function itself takes ~2-3 minutes.
Generic parallelism, switching to libraries like `pyogrio`, and NumPy vectorization should give us large speedups without having to alter the code much.
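As a rough illustration of the last two ideas (the file name and column names here are hypothetical, not actual StreamCat code), a pyogrio-backed read plus a vectorized NumPy expression might look like:

```python
import geopandas as gpd

# The pyogrio engine is typically much faster than the default fiona reader
# (requires geopandas >= 0.11 with the pyogrio package installed).
gdf = gpd.read_file("zones.gpkg", engine="pyogrio")  # hypothetical input file

# Instead of a per-row Python loop ...
#   areas = [r.AreaSqKm * r.PctFull / 100 for r in gdf.itertuples()]
# ... operate on whole columns at once:
areas = gdf["AreaSqKm"].to_numpy() * gdf["PctFull"].to_numpy() / 100.0
```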
Created a Speedup branch and pushed changes to the MakeVectors process, which was the main slowdown in the accumulation process. Could continue working on the Children / Bastards functions, as well as adding numba to the swapper function, since it uses pure NumPy operations.
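The swapper function itself isn't shown in this thread, so this is only a minimal sketch of the numba idea, assuming the routine maps an array of IDs through an old-value/new-value lookup using numba-supported NumPy calls (`swap_values` and its arguments are hypothetical names):

```python
import numpy as np
from numba import njit

@njit(cache=True)
def swap_values(arr, old_vals, new_vals):
    # Hypothetical stand-in for the swapper logic: replace each element of
    # arr (assumed to appear in old_vals) with the matching new_vals entry.
    order = np.argsort(old_vals)
    idx = np.searchsorted(old_vals[order], arr)
    return new_vals[order][idx]

comids = np.array([3, 1, 2, 3], dtype=np.int64)
print(swap_values(comids, np.array([1, 2, 3]), np.array([10, 20, 30])))
# [30 10 20 30]
```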
Adding another parallel loop to the main Accumulation function in StreamCat_functions.py, where a for loop currently iterates over the columns of the table of watershed values to build the NumPy ndarray, called data, that is used to create the final parquet file. The parallel version will use a thread per column, since at this point the columns neither share data nor need to be processed in any particular order; a sketch is below.
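A minimal sketch of that per-column threading, assuming a hypothetical `compute_column` helper that does one column's accumulation work; because the columns are independent, `ThreadPoolExecutor.map` can run them concurrently while still returning results in column order:

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np
import pandas as pd


def compute_column(table: pd.DataFrame, col: str) -> np.ndarray:
    # Hypothetical stand-in for the per-column accumulation work.
    return table[col].to_numpy(dtype=np.float64)


def accumulate(table: pd.DataFrame, cols: list[str]) -> np.ndarray:
    # One thread per column: the columns share no state, so they can be
    # handed out concurrently with no locking or ordering constraints.
    with ThreadPoolExecutor(max_workers=len(cols)) as pool:
        results = list(pool.map(lambda c: compute_column(table, c), cols))
    # Stack the per-column results into the ndarray ("data") that will
    # later be written out as the final parquet file.
    return np.column_stack(results)
```

Note that `pool.map` preserves the order of `cols`, so the stacked array still lines up with the column names even though the work is scheduled concurrently.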