HDDS-12110. Optimize memory overhead for OM background tasks. #7743

Open
wants to merge 3 commits into
base: master

Conversation

devmadhuu
Contributor

What changes were proposed in this pull request?

This PR reduces the number of local HashMaps and other objects created each time OM events are processed during the delta sync of Recon with OM.

Some OM background tasks recreate static maps and other objects inside the process method on every run, although they only need to be created once. Reusing them reduces memory and GC overhead.
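
As a rough illustration of the pattern described above, the sketch below keeps a count map as a field that is reset during a full rebuild and updated in place on each delta run. The class `CountingTask` and its methods are hypothetical and are not the actual Recon task code; only the general idea of reusing the map comes from the PR description.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical task, not the actual Recon class; it only illustrates the allocation pattern.
class CountingTask {
  // Allocated once and reused across runs instead of being rebuilt in process().
  private final Map<String, Long> objectCountMap = new HashMap<>();

  // Full rebuild: reset the shared map and seed every known table with a zero count.
  void reprocess(List<String> tables) {
    objectCountMap.clear();
    for (String table : tables) {
      objectCountMap.put(table, 0L);
    }
  }

  // Delta run: update the existing map in place; no new map is allocated here.
  // Events for tables that were never seeded are ignored.
  void process(List<String> eventTables) {
    for (String table : eventTables) {
      objectCountMap.computeIfPresent(table, (k, v) -> v + 1);
    }
  }

  // Returns the current count, or 0 for a table that was never seeded.
  long getCount(String table) {
    return objectCountMap.getOrDefault(table, 0L);
  }
}
```

In this sketch the per-run cost of `process` is limited to the map updates themselves, which is the kind of saving the description aims for.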

What is the link to the Apache JIRA?

https://issues.apache.org/jira/browse/HDDS-12110

How was this patch tested?

The patch was tested using the existing JUnit tests.

@devmadhuu devmadhuu marked this pull request as ready for review January 24, 2025 06:05
@devmadhuu devmadhuu marked this pull request as draft January 24, 2025 07:14
@devmadhuu devmadhuu marked this pull request as ready for review January 25, 2025 07:53
@SaketaChalamchala
Contributor

Thanks for the patch @devmadhuu.
nit: Have you also considered optimizing nsSummary.setNumOfFiles(nsSummary.getNumOfFiles() + 1); and nsSummary.setSizeOfFiles(nsSummary.getSizeOfFiles() + dataSize); in NSSummaryTaskDbEventHandler.handlePutKeyEvent() and NSSummaryTaskDbEventHandler.handleDeleteDirEvent()?

LGTM otherwise

@devmadhuu
Contributor Author

Thanks @SaketaChalamchala for the review. I have optimized the code in NSSummaryTaskDbEventHandler as well. Please review.

Contributor

@dombizita dombizita left a comment

Thanks for this improvement @devmadhuu! Please take a look at my comments. My overall question is: shouldn't we initialise these in the constructor?

  private Collection<String> tables;
  private HashMap<String, Long> objectCountMap;
  private HashMap<String, Long> unReplicatedSizeMap;
  private HashMap<String, Long> replicatedSizeMap;

Comment on lines -160 to -164
// Initialize maps to store count and size information
HashMap<String, Long> objectCountMap = initializeCountMap();
HashMap<String, Long> unReplicatedSizeMap = initializeSizeMap(false);
HashMap<String, Long> replicatedSizeMap = initializeSizeMap(true);
final Collection<String> taskTables = getTaskTables();
Contributor

Aren't the map initialisations and the getTaskTables() still necessary here?

Contributor Author

Once the tables and maps are initialised in reprocess, they remain available in every subsequent delta OM sync. Do you have any other use case in mind?

}

/**
* Initializes and returns a count map with the counts for the tables.
*
* @return The count map containing the counts for each table.
*/
private HashMap<String, Long> initializeCountMap() {
Collection<String> tables = getTaskTables();
Contributor

How do we make sure that tables is initialised here?

Contributor Author

Before initialising the count map, we already initialise taskTables in reprocess. Please explain what issue you see here.

@devmadhuu
Contributor Author

Thanks for this improvement @devmadhuu! Please take a look at my comments, overall my question is that shouldn't we initialise these in the constructor?

  private Collection<String> tables;
  private HashMap<String, Long> objectCountMap;
  private HashMap<String, Long> unReplicatedSizeMap;
  private HashMap<String, Long> replicatedSizeMap;

In the restart or upgrade case it is fine to initialise them in the constructor, because the Recon OM DB already exists and the tables list can be initialised. In a fresh installation of Recon, however, Recon first needs a full OM snapshot before it can populate the tables list, and initialising the maps requires both the SQL tables and the OM tables list; otherwise it fails. So it is safer to initialise and populate them once, with the full snapshot, in the reprocess method.
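
One way to address the "how do we make sure it is initialised" concern while still deferring initialisation to reprocess is a guard in the delta path. The sketch below is only an assumption about how that could look: `LazyInitTask`, its method signatures, and the `false` return value for "not ready yet" are all hypothetical and do not reflect the PR's code.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;

// Hypothetical stand-in for a Recon task; field names mirror the discussion,
// but the logic is illustrative only.
class LazyInitTask {
  private Collection<String> tables;            // null until the first full snapshot
  private HashMap<String, Long> objectCountMap; // null until the first full snapshot

  // Called once a full snapshot is available, so the table list can be resolved.
  void reprocess(Collection<String> snapshotTables) {
    tables = new ArrayList<>(snapshotTables);
    objectCountMap = new HashMap<>();
    for (String table : tables) {
      objectCountMap.put(table, 0L);
    }
  }

  // Delta sync: refuses to run before reprocess() has populated the state.
  boolean process(String table) {
    if (tables == null || objectCountMap == null) {
      return false; // assumption: the caller falls back to a full reprocess
    }
    objectCountMap.computeIfPresent(table, (k, v) -> v + 1);
    return true;
  }

  // Returns 0 when the state is uninitialised or the table is unknown.
  long getCount(String table) {
    return objectCountMap == null ? 0L : objectCountMap.getOrDefault(table, 0L);
  }
}
```

With a guard like this, a delta run that arrives before the first full snapshot is reported to the caller instead of failing with a NullPointerException.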

Comment on lines +165 to +167
Field tableField = OmTableInsightTask.class.getDeclaredField("tables");
tableField.setAccessible(true);
tableField.set(omTableInsightTask, omTableInsightTask.getTaskTables());
Contributor

Would you like to add setXXX setters with an onlyForTesting tag for them, instead of setting the fields through reflection?
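
A minimal sketch of what such a setter could look like follows. The `OnlyForTesting` annotation is declared locally so the example compiles without dependencies; a real project would more likely reuse an existing marker such as Guava's `@VisibleForTesting`. The class `InsightTaskSketch` and its table names are hypothetical.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.Arrays;
import java.util.Collection;

// Local stand-in marker annotation (an assumption, not part of the PR).
@Retention(RetentionPolicy.SOURCE)
@Target(ElementType.METHOD)
@interface OnlyForTesting { }

// Hypothetical task class, reduced to the one field under discussion.
class InsightTaskSketch {
  private Collection<String> tables;

  // Placeholder table names; the real list would come from the OM metadata.
  Collection<String> getTaskTables() {
    return Arrays.asList("keyTable", "fileTable");
  }

  // Lets tests seed the field directly instead of calling Field.setAccessible(true).
  @OnlyForTesting
  void setTables(Collection<String> tables) {
    this.tables = tables;
  }

  Collection<String> getTables() {
    return tables;
  }
}
```

A test would then call `task.setTables(task.getTaskTables())` in place of the three reflection lines, which keeps the test from breaking silently if the field is renamed.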

4 participants