HDDS-12110. Optimize memory overhead for OM background tasks. #7743

devmadhuu · 2025-01-24T06:05:46Z

What changes were proposed in this pull request?

This PR reduces the number of local HashMaps and other objects that are created on every repeated pass over OM events during Recon's delta sync with OM.

Some OM background tasks rebuild the same maps and other objects on every run of their process method, even though their contents do not need to be recreated each time. Creating them once and reusing them reduces memory and GC overhead.
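A minimal sketch of the pattern described above, not the actual Recon code. The class CountingTaskSketch, its methods, and the table names are hypothetical. It contrasts building a fresh map on every batch with updating one long-lived map held in a field.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a Recon task; not the real OmTableInsightTask.
class CountingTaskSketch {
  private final Collection<String> tables;
  // Allocated once, then reused by every processReusingField() call.
  private final Map<String, Long> objectCountMap = new HashMap<>();

  CountingTaskSketch(Collection<String> tables) {
    this.tables = tables;
    for (String table : tables) {
      objectCountMap.put(table, 0L);
    }
  }

  // Before: every batch allocates and fills a new map that is discarded afterwards.
  Map<String, Long> processWithLocalMap(List<String> eventTables) {
    Map<String, Long> counts = new HashMap<>();
    for (String table : tables) {
      counts.put(table, 0L);
    }
    for (String table : eventTables) {
      counts.computeIfPresent(table, (k, v) -> v + 1);
    }
    return counts;
  }

  // After: every batch updates the long-lived map held in the field.
  void processReusingField(List<String> eventTables) {
    for (String table : eventTables) {
      objectCountMap.computeIfPresent(table, (k, v) -> v + 1);
    }
  }

  long count(String table) {
    return objectCountMap.getOrDefault(table, 0L);
  }

  public static void main(String[] args) {
    CountingTaskSketch task =
        new CountingTaskSketch(Arrays.asList("keyTable", "fileTable"));
    task.processReusingField(Arrays.asList("keyTable", "keyTable"));
    task.processReusingField(Arrays.asList("fileTable", "unknownTable"));
    System.out.println(task.count("keyTable") + " " + task.count("fileTable"));
  }
}
```

In this sketch the reused map carries running totals across batches, so it only fits tasks whose state is meant to persist between runs.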

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-12110

How was this patch tested?

The patch was tested using the existing JUnit tests.

SaketaChalamchala · 2025-01-28T02:50:57Z

Thanks for the patch @devmadhuu.
nit: Have you also considered optimizing nsSummary.setNumOfFiles(nsSummary.getNumOfFiles() + 1); and nsSummary.setSizeOfFiles(nsSummary.getSizeOfFiles() + dataSize); in NSSummaryTaskDbEventHandler.handlePutKeyEvent() and NSSummaryTaskDbEventHandler.handleDeleteDirEvent()?

LGTM otherwise

devmadhuu · 2025-01-28T05:42:30Z

Thanks @SaketaChalamchala for the review. I have optimized the code in NSSummaryTaskDbEventHandler as well. Please review.
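The thread does not show what form this optimization took. One possible reading, sketched here under that assumption, is to replace each getter-plus-setter pair with a single in-place update. NsSummarySketch and its addFile and removeFile helpers are hypothetical and are not claimed to exist on the real NSSummary class.

```java
// Hypothetical, trimmed-down stand-in for NSSummary; only the two counters
// mentioned in the review comment are modelled.
class NsSummarySketch {
  private int numOfFiles;
  private long sizeOfFiles;

  int getNumOfFiles() { return numOfFiles; }
  long getSizeOfFiles() { return sizeOfFiles; }
  void setNumOfFiles(int numOfFiles) { this.numOfFiles = numOfFiles; }
  void setSizeOfFiles(long sizeOfFiles) { this.sizeOfFiles = sizeOfFiles; }

  // Assumed helper: one call instead of two getter/setter round trips.
  void addFile(long dataSize) {
    numOfFiles++;
    sizeOfFiles += dataSize;
  }

  // Assumed helper: the inverse update for a delete event.
  void removeFile(long dataSize) {
    numOfFiles--;
    sizeOfFiles -= dataSize;
  }

  public static void main(String[] args) {
    NsSummarySketch viaSetters = new NsSummarySketch();
    // Style quoted in the review comment.
    viaSetters.setNumOfFiles(viaSetters.getNumOfFiles() + 1);
    viaSetters.setSizeOfFiles(viaSetters.getSizeOfFiles() + 1024L);

    NsSummarySketch viaHelper = new NsSummarySketch();
    // Equivalent update through the assumed helper.
    viaHelper.addFile(1024L);

    System.out.println(viaSetters.getNumOfFiles() == viaHelper.getNumOfFiles());
    System.out.println(viaSetters.getSizeOfFiles() == viaHelper.getSizeOfFiles());
  }
}
```

Both styles leave the object in the same state. The helper mainly shortens the call site, and any runtime gain would need to be measured.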

dombizita

Thanks for this improvement @devmadhuu! Please take a look at my comments. My overall question: shouldn't we initialise these in the constructor?

  private Collection<String> tables;
  private HashMap<String, Long> objectCountMap;
  private HashMap<String, Long> unReplicatedSizeMap;
  private HashMap<String, Long> replicatedSizeMap;

dombizita · 2025-01-28T13:34:19Z

hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/tasks/OmTableInsightTask.java

-    // Initialize maps to store count and size information
-    HashMap<String, Long> objectCountMap = initializeCountMap();
-    HashMap<String, Long> unReplicatedSizeMap = initializeSizeMap(false);
-    HashMap<String, Long> replicatedSizeMap = initializeSizeMap(true);
-    final Collection<String> taskTables = getTaskTables();


Aren't the map initialisations and the getTaskTables() still necessary here?

Once the tables and maps have been initialised in reprocess, they remain available for every delta OM sync. Do you have any other use case in mind?

dombizita · 2025-01-28T13:36:40Z

hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/tasks/OmTableInsightTask.java

  }

  /**
   * Initializes and returns a count map with the counts for the tables.
   *
   * @return The count map containing the counts for each table.
   */
-  private HashMap<String, Long> initializeCountMap() {
-    Collection<String> tables = getTaskTables();


How do we make sure that tables is initialised here?

Before the count map is initialised, taskTables has already been initialised in reprocess. Please explain what issue you see here.

devmadhuu · 2025-01-28T15:30:18Z

In the restart or upgrade case it is fine to initialise them in the constructor, because the Recon OM DB already exists and the tables list can be initialised. In a fresh installation of Recon, however, the tables list can only be populated after a full OM snapshot has been fetched, and initialising the maps needs the SQL tables as well as the OM tables list; otherwise it will fail. So it is safer to initialise and populate them once, from the full snapshot, in the reprocess method.
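The ordering argument above can be sketched as follows. This is an illustration under stated assumptions, not the code from the patch: LazyInitTaskSketch, its method signatures, and the explicit guard in process are all hypothetical. The constructor leaves the fields unset, reprocess fills them from the full snapshot, and process refuses to run before that has happened.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

// Hypothetical task whose state depends on data only known after a full snapshot.
class LazyInitTaskSketch {
  private Collection<String> tables;        // unknown at construction time
  private Map<String, Long> objectCountMap; // needs the table list first

  LazyInitTaskSketch() {
    // Deliberately empty: on a fresh install the table list is not available yet.
  }

  // Full-snapshot path: initialise and populate the state exactly here.
  void reprocess(Collection<String> tablesFromSnapshot) {
    tables = new ArrayList<>(tablesFromSnapshot);
    objectCountMap = new HashMap<>();
    for (String table : tables) {
      objectCountMap.put(table, 0L);
    }
  }

  // Delta path: reuses the state built by reprocess().
  void process(String eventTable) {
    if (objectCountMap == null) {
      // Assumed guard; the thread does not say whether the patch has one.
      throw new IllegalStateException("reprocess() must run before process()");
    }
    objectCountMap.computeIfPresent(eventTable, (k, v) -> v + 1);
  }

  long count(String table) {
    return objectCountMap == null ? 0L : objectCountMap.getOrDefault(table, 0L);
  }

  public static void main(String[] args) {
    LazyInitTaskSketch task = new LazyInitTaskSketch();
    task.reprocess(Arrays.asList("keyTable", "fileTable"));
    task.process("keyTable");
    System.out.println(task.count("keyTable"));
  }
}
```

An explicit guard like this turns a missed reprocess into a clear error instead of a NullPointerException deep inside the delta path, which is one way to answer the question of how initialisation is guaranteed.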

peterxcli · 2025-01-29T02:43:02Z

...op-ozone/recon/src/test/java/org/apache/hadoop/ozone/recon/tasks/TestOmTableInsightTask.java

+      Field tableField = OmTableInsightTask.class.getDeclaredField("tables");
+      tableField.setAccessible(true);
+      tableField.set(omTableInsightTask, omTableInsightTask.getTaskTables());


Would you like to add setXXX setters, tagged as only for testing, for them?
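A rough sketch of what this suggestion could look like, with hypothetical names throughout (InsightTaskSketch, setTablesForTesting). The comment on the setter stands in for whatever marker the project uses, for example an annotation such as Guava's @VisibleForTesting. The reflection variant from the test diff is shown alongside for comparison.

```java
import java.lang.reflect.Field;
import java.util.Arrays;
import java.util.Collection;

// Hypothetical class with a private field that tests need to populate.
class InsightTaskSketch {
  private Collection<String> tables;

  // Only for testing: package-private so production callers outside the
  // package cannot reach it.
  void setTablesForTesting(Collection<String> tables) {
    this.tables = tables;
  }

  int tableCount() {
    return tables == null ? 0 : tables.size();
  }

  public static void main(String[] args) throws Exception {
    // Reflection, as in the test diff: works, but the field name is a string
    // that silently breaks if the field is renamed.
    InsightTaskSketch viaReflection = new InsightTaskSketch();
    Field tableField = InsightTaskSketch.class.getDeclaredField("tables");
    tableField.setAccessible(true);
    tableField.set(viaReflection, Arrays.asList("keyTable", "fileTable"));

    // Setter: checked by the compiler and easier to read.
    InsightTaskSketch viaSetter = new InsightTaskSketch();
    viaSetter.setTablesForTesting(Arrays.asList("keyTable", "fileTable"));

    System.out.println(viaReflection.tableCount() == viaSetter.tableCount());
  }
}
```

The trade-off is that a setter widens the class's surface slightly, while reflection keeps the class untouched at the cost of a more fragile test.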

HDDS-12110. Optimize memory overhead for OM background tasks.

453790f

devmadhuu marked this pull request as ready for review January 24, 2025 06:05

devmadhuu added the recon label Jan 24, 2025

devmadhuu requested review from ArafatKhan2198 and dombizita January 24, 2025 06:06

devmadhuu marked this pull request as draft January 24, 2025 07:14

HDDS-12110. Fixed test case failures.

86fd111

devmadhuu marked this pull request as ready for review January 25, 2025 07:53

HDDS-12110. Fixed review comments.

6c1d230

dombizita reviewed Jan 28, 2025

peterxcli reviewed Jan 29, 2025
