Reduce memory hoarding in the unit test #2028

Open
wants to merge 3 commits into
base: main
from

Conversation

bbockelm
Collaborator

This PR reduces memory hoarding and memory churn by:

  1. Hoisting a regexp compilation from a local variable (in code called from the HTTP request handler) to a global.
  2. Deleting an entry from a cache once it will no longer be used.
  3. Providing the unit tests with a configuration knob to completely disable local xrootd monitoring, making their memory usage more predictable.
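
As a rough illustration of item 1, the sketch below contrasts compiling a regexp inside a function with compiling it once at package level. The pattern and the function names (`whitespaceRe`, `normalizeSlow`, `normalizeFast`) are hypothetical and are not taken from this PR's diff.

```go
package main

import (
	"fmt"
	"regexp"
)

// whitespaceRe is compiled once, at package initialization.
// The pattern is a hypothetical stand-in, not the one used in the PR.
var whitespaceRe = regexp.MustCompile(`\s+`)

// normalizeSlow recompiles the pattern on every call, allocating a new
// parse tree and program each time it runs.
func normalizeSlow(s string) string {
	re := regexp.MustCompile(`\s+`)
	return re.ReplaceAllString(s, " ")
}

// normalizeFast reuses the package-level regexp; a *regexp.Regexp is safe
// for concurrent use by multiple goroutines.
func normalizeFast(s string) string {
	return whitespaceRe.ReplaceAllString(s, " ")
}

func main() {
	fmt.Println(normalizeFast("a   b\t c"))
}
```

Both functions return the same result; only the per-call allocation differs.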

Various internal caches in the origin/cache monitoring were filling
with temporary session information, causing unpredictable results in
the memory "stress test" unit tests.

This commit:

- Allows monitoring to be turned off completely if desired; labelled
  as a hidden variable.
- Removes the user ID from the users cache as soon as it is used.
  Since there's a single use case - and no other indication the user
  ID cache entry is needed - we remove it here.
- Reduces redundant calls into the ttlcache.
Reduces churn of objects on the heap observed in memory profiles.
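
The "remove as soon as it is used" idea can be sketched with the standard library alone. This is a minimal illustration that assumes a `sync.Map` as a stand-in for the real ttlcache; the types `userID` and `sessionID`, the variable `pendingUserIDs`, and the function `consumeUserID` are all hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// userID and sessionID are hypothetical stand-ins for the monitoring types.
type userID struct {
	Host string
	Sid  int
}

type sessionID struct {
	ID uint32
}

// pendingUserIDs is a stand-in for the user ID cache (not a ttlcache).
var pendingUserIDs sync.Map

// consumeUserID looks up the entry and removes it in the same step, so a
// single-use record does not stay in memory until some later expiry.
func consumeUserID(key userID) (sessionID, bool) {
	v, ok := pendingUserIDs.LoadAndDelete(key)
	if !ok {
		return sessionID{}, false
	}
	return v.(sessionID), true
}

func main() {
	key := userID{Host: "example.org", Sid: 7}
	pendingUserIDs.Store(key, sessionID{ID: 42})

	sid, ok := consumeUserID(key)
	fmt.Println(sid.ID, ok)

	// A second lookup misses because the first one removed the entry.
	_, ok = consumeUserID(key)
	fmt.Println(ok)
}
```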
@bbockelm bbockelm linked an issue Feb 18, 2025 that may be closed by this pull request
@bbockelm bbockelm added the bug (Something isn't working), infrastructure (GitHub Actions, Release management, and CI), and internal (Internal code improvements, not user-facing) labels Feb 18, 2025
@bbockelm bbockelm added this to the v7.14 milestone Feb 18, 2025
Comment on lines +921 to +923
// Remove the user id record as this is the only use for it; otherwise, it'll sit around in the cache,
// using memory.
userids.Delete(xrdUserId)
Contributor

What about the userids.Get up at line 624?

existingRec := sessions.Get(userId).Value()
item, found := sessions.GetOrSet(userId, UserRecord{Project: project, Host: xrdUserId.Host}, ttlcache.WithTTL[UserId, UserRecord](ttlcache.DefaultTTL))
if found {
	existingRec := item.Value()
	existingRec.Project = project
	existingRec.Host = xrdUserId.Host
	sessions.Set(userId, existingRec, ttlcache.DefaultTTL)
} else {
	sessions.Set(userId, UserRecord{Project: project, Host: xrdUserId.Host}, ttlcache.DefaultTTL)
Contributor

Isn't this now redundant with the sessions.GetOrSet above?
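
To illustrate the question, here is a minimal sketch that assumes GetOrSet behaves like a load-or-store operation: on a miss it stores the supplied value before returning. It uses `sync.Map.LoadOrStore` as a stand-in rather than ttlcache itself, and `userRecord`, `sessions`, and `upsertSession` are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// userRecord is a hypothetical, trimmed-down stand-in for UserRecord.
type userRecord struct {
	Project string
	Host    string
}

// sessions is a sync.Map stand-in for the sessions cache.
var sessions sync.Map

// upsertSession stores a fresh record on a miss and updates the existing
// record on a hit. It reports whether an entry already existed.
func upsertSession(id uint32, project, host string) bool {
	fresh := userRecord{Project: project, Host: host}
	actual, loaded := sessions.LoadOrStore(id, fresh)
	if loaded {
		// Hit: update the fields on the existing record and write it back.
		rec := actual.(userRecord)
		rec.Project = project
		rec.Host = host
		sessions.Store(id, rec)
	}
	// No else branch: on a miss, LoadOrStore has already stored fresh.
	return loaded
}

func main() {
	fmt.Println(upsertSession(1, "projA", "host1"))
	fmt.Println(upsertSession(1, "projB", "host2"))
	v, _ := sessions.Load(uint32(1))
	fmt.Println(v.(userRecord).Project)
}
```

Under that assumption, a second store in the miss branch writes the same value twice.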

@brianaydemir
Contributor

I modified the test to run for 10,000 iterations, and collected one profile without this PR's changes (mem.before), and one profile with them (mem.after).

The reduction in allocations:

# go tool pprof -nodefraction 0 -alloc_space -unit megabyte -diff_base mem.before mem.after
File: director.test
Build ID: 8c833cb3bfda65d2a42fcab1e017b1fce61c1442
Type: alloc_space
Time: Feb 19, 2025 at 4:34pm (UTC)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top
Showing nodes accounting for -330.97MB, 11.25% of 2940.82MB total
Showing top 10 nodes out of 994
      flat  flat%   sum%        cum   cum%
 -103.05MB  3.50%  3.50%  -103.05MB  3.50%  regexp/syntax.(*compiler).inst
  -60.01MB  2.04%  5.54%   -60.01MB  2.04%  regexp/syntax.(*parser).newRegexp
     -46MB  1.56%  7.11%  -185.52MB  6.31%  github.com/sirupsen/logrus.(*TextFormatter).Format
  -43.01MB  1.46%  8.57%   -43.01MB  1.46%  strconv.appendQuotedWith
  -28.50MB  0.97%  9.54%   -71.51MB  2.43%  fmt.Sprintf
  -18.49MB  0.63% 10.17%   -18.49MB  0.63%  bytes.growSlice
  -16.50MB  0.56% 10.73%  -153.52MB  5.22%  github.com/pelicanplatform/pelican/metrics.GetSIDRest
  -14.91MB  0.51% 11.24%   -14.91MB  0.51%  strings.genSplit
  -14.50MB  0.49% 11.73%   -16.99MB  0.58%  regexp.(*Regexp).ReplaceAllString
      14MB  0.48% 11.25%       30MB  1.02%  vendor/golang.org/x/crypto/hkdf.Expand

The reduction in live heap:

# go tool pprof -nodefraction 0 -inuse_space -unit megabyte -diff_base mem.before mem.after
File: director.test
Build ID: 8c833cb3bfda65d2a42fcab1e017b1fce61c1442
Type: inuse_space
Time: Feb 19, 2025 at 4:34pm (UTC)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top
Showing nodes accounting for -18.55MB, 69.91% of 26.53MB total
Showing top 10 nodes out of 111
      flat  flat%   sum%        cum   cum%
   -7.50MB 28.27% 28.27%    -7.50MB 28.27%  github.com/pelicanplatform/pelican/metrics.GetSIDRest
   -4.91MB 18.53% 46.80%    -9.06MB 34.15%  github.com/jellydator/ttlcache/v3.(*Cache[go.shape.struct { Prot string; User string; Pid int; Sid int; Host string },go.shape.struct { Id uint32 }]).set
   -2.50MB  9.43% 56.23%    -2.50MB  9.43%  github.com/jellydator/ttlcache/v3.newItem[go.shape.struct { Id uint32 },go.shape.struct { AuthenticationProtocol string; User string; DN string; Role string; Org string; Groups []string; Project string; Host string }]
   -2.50MB  9.43% 65.65%    -2.50MB  9.43%  github.com/jellydator/ttlcache/v3.newItem[go.shape.struct { Prot string; User string; Pid int; Sid int; Host string },go.shape.struct { Id uint32 }]
      -2MB  7.54% 73.19%       -2MB  7.54%  container/list.(*List).insertValue (inline)
    1.50MB  5.67% 67.52%     1.50MB  5.67%  github.com/aws/aws-sdk-go/aws/endpoints.init
      -1MB  3.77% 71.29%       -1MB  3.77%  regexp.onePassCopy
   -0.64MB  2.43% 73.72%    -0.64MB  2.43%  github.com/jellydator/ttlcache/v3.(*expirationQueue[go.shape.struct { Prot string; User string; Pid int; Sid int; Host string },go.shape.struct { Id uint32 }]).Push
    0.51MB  1.92% 71.81%     0.51MB  1.92%  encoding/xml.map.init.0
    0.50MB  1.89% 69.91%     0.50MB  1.89%  sync.(*Map).LoadOrStore
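
The comment above does not say how the profiles were collected. One way to produce a heap profile that `go tool pprof` can read is the standard `runtime/pprof` package, sketched below. The function name `captureHeapProfile` is hypothetical, and the test in this PR may have used a different mechanism.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime"
	"runtime/pprof"
)

// captureHeapProfile runs a garbage collection so the in-use numbers are
// up to date, then returns the heap profile in pprof's encoded format.
func captureHeapProfile() ([]byte, error) {
	runtime.GC()
	var buf bytes.Buffer
	if err := pprof.WriteHeapProfile(&buf); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	data, err := captureHeapProfile()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// In practice the bytes would be written to a file such as mem.before
	// or mem.after and then compared with -diff_base.
	fmt.Println(len(data) > 0)
}
```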

Development

Successfully merging this pull request may close these issues.

Tidy up memory stress test
2 participants