Issue-6642 Add http.send request attribute to ignore headers for caching key #6675
Conversation
Thanks for working on this. Here we are excluding certain headers from the key calculation. Is there a reason for that level of granularity? Why not exclude certain request object params themselves from the key calculation?
I would have to explore if there is any straightforward way to delete a value by path reference in the request object.
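For illustration, deleting a value by path in a plain nested Go map might look like the sketch below. The `deletePath` helper is hypothetical; OPA's `ast.Object` exposes no delete-by-path operation, which is what makes this exploration necessary:

```go
package main

import "fmt"

// deletePath returns a copy of obj with the value at the given path removed.
// It copies each map it descends into, so the original object is untouched.
// Hypothetical helper for illustration only; not part of OPA's ast API.
func deletePath(obj map[string]interface{}, path ...string) map[string]interface{} {
	out := make(map[string]interface{}, len(obj))
	for k, v := range obj {
		out[k] = v
	}
	if len(path) == 0 {
		return out
	}
	if len(path) == 1 {
		delete(out, path[0])
		return out
	}
	if child, ok := out[path[0]].(map[string]interface{}); ok {
		out[path[0]] = deletePath(child, path[1:]...)
	}
	return out
}

func main() {
	req := map[string]interface{}{
		"method":  "get",
		"headers": map[string]interface{}{"h1": "v1", "h2": "v2"},
	}
	key := deletePath(req, "headers", "h2")
	fmt.Println(key["headers"]) // h2 removed from the derived key
	fmt.Println(req["headers"]) // original request still has h2
}
```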
@rudrakhp just checking to see if you got a chance to explore the option of excluding certain request object params.

@ashutosh-narkar was out last week, will raise an updated PR this week.

This pull request has been automatically marked as stale because it has not had any activity in the last 30 days.
@rudrakhp the changes seem fine. More testing would be helpful. Also, we need to update the docs for the built-in.
```go
func getKeyFromRequest(req ast.Object) (ast.Object, error) {
	var cacheIgnoredHeaders []string
	var allHeaders map[string]interface{}
	cacheIgnoredHeadersTerm := req.Get(ast.StringTerm("cache_ignored_headers"))
```
Nit: we can do an early exit here:

```go
if cacheIgnoredHeadersTerm == nil {
	return nil, nil
}
```
> `// new copy so headers in request object doesn't change`

Did you mean `deep copy`?
I guess "new" and "copy" are redundant; will update the comment to:

`// deep copy so changes to key do not reflect in the request object`
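The distinction matters because a shallow copy still shares nested structures with the original request. A minimal stdlib sketch of the hazard, using plain Go maps rather than the PR's `ast.Object` (both helpers here are illustrative, not OPA code):

```go
package main

import "fmt"

// shallowCopy copies only the top-level map; nested maps remain shared
// with the original, so mutating them leaks back into the request.
func shallowCopy(req map[string]map[string]string) map[string]map[string]string {
	out := make(map[string]map[string]string, len(req))
	for k, v := range req {
		out[k] = v // inner map still shared!
	}
	return out
}

// deepCopy also copies the nested maps, so the derived key can be
// mutated freely without affecting the original request object.
func deepCopy(req map[string]map[string]string) map[string]map[string]string {
	out := make(map[string]map[string]string, len(req))
	for k, v := range req {
		inner := make(map[string]string, len(v))
		for ik, iv := range v {
			inner[ik] = iv
		}
		out[k] = inner
	}
	return out
}

func main() {
	req := map[string]map[string]string{"headers": {"h1": "v1", "h2": "v2"}}
	key := shallowCopy(req)
	delete(key["headers"], "h2")
	fmt.Println(len(req["headers"])) // 1: the original request was mutated

	req2 := map[string]map[string]string{"headers": {"h1": "v1", "h2": "v2"}}
	key2 := deepCopy(req2)
	delete(key2["headers"], "h2")
	fmt.Println(len(req2["headers"])) // 2: the original request is intact
}
```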
```go
	expectedReqCount: 1,
},
{
	note: "http.send GET cache miss different headers (force_cache enabled)",
```
Not sure which scenario in the changes this test case is trying to exercise.
This is to clarify how the cache behaves when `cache_ignored_headers` has not been set. There was no existing test case covering this cache-miss scenario caused by different header values.
topdown/http_test.go (outdated)
```go
note: "http.send GET different cache_ignored_headers but still cached (force_cache enabled)",
ruleTemplate: `p = x {
	r1 = http.send({"method": "get", "url": "%URL%", "force_json_decode": true, "headers": {"h1": "v1", "h2": "v2"}, "force_cache": true, "force_cache_duration_seconds": 300, "cache_ignored_headers": ["h2"]})
	r2 = http.send({"method": "get", "url": "%URL%", "force_json_decode": true, "headers": {"h1": "v1", "h2": "v2"}, "force_cache": true, "force_cache_duration_seconds": 300, "cache_ignored_headers": ["h2", "h3"]}) # cached
```
Just for testing, we could actually have an `h3` header in the `headers` object.
Also, a test for the scenario where the values of `cache_ignored_headers` and `headers` differ and we get a cache miss would be helpful.
One more case I can think of:

```
R1: {"headers": {"h1": "v1"}}
R2: {"headers": {"h1": "v1", "h2": "v2"}, "cache_ignored_headers": ["h2"]}
```

So here R1 and R2 are equivalent, correct?
Another one:

```
R1: {"headers": {"h1": "v1"}, "cache_ignored_headers": []}
R2: {"headers": {"h1": "v1", "h2": "v2"}, "cache_ignored_headers": ["h2"]}
```

So here R1 and R2 are equivalent, correct?
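Under the proposed semantics the two requests above should indeed collapse to the same key. A minimal sketch of that key computation, using a hypothetical `cacheKey` helper over plain Go maps rather than the PR's actual `ast.Object` code:

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// cacheKey builds a canonical key from the considered headers only,
// dropping anything listed in cache_ignored_headers.
// Hypothetical helper for illustration; not OPA's implementation.
func cacheKey(headers map[string]string, ignored []string) string {
	skip := make(map[string]bool, len(ignored))
	for _, h := range ignored {
		skip[h] = true
	}
	parts := []string{}
	for k, v := range headers {
		if !skip[k] {
			parts = append(parts, k+"="+v)
		}
	}
	sort.Strings(parts) // canonical order: equal considered sets give equal keys
	return strings.Join(parts, "&")
}

func main() {
	// R1 and R2 from the comment above both reduce to the considered set {h1: v1}.
	r1 := cacheKey(map[string]string{"h1": "v1"}, []string{})
	r2 := cacheKey(map[string]string{"h1": "v1", "h2": "v2"}, []string{"h2"})
	fmt.Println(r1 == r2) // true: equivalent for caching purposes
}
```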
Done
@rudrakhp this would be a good one to get in the next release. Let us know if you need any help. Thanks.
@ashutosh-narkar I just realised that the request object was a pointer and was getting modified as we modified the key, so the downstream request would also not get these ignored headers. I have made a couple of new changes to address that, please do review. Thanks for the comments!
```diff
@@ -729,13 +772,13 @@ func newHTTPSendCache() *httpSendCache {
 }

 func valueHash(v util.T) int {
-	return v.(ast.Value).Hash()
+	return ast.StringTerm(v.(ast.Value).String()).Hash()
 }
```
@ashutosh-narkar fyi, needed changes to the `hash` and `eq` functions of the hashmap to bring parity between inter- and intra-query cache key generation. The inter-query cache already relies on the `String()` representation of `Value` for the cache key. Relying on the raw `Value` instead of the `String()` representation will always lead to a cache miss in the intra-query cache, due to a change in `Hash` when `Copy()` of the request object is performed during key generation.
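The parity issue can be illustrated outside OPA: hashing a value's canonical string form makes a value and an independently built copy hash identically, which is the property the intra-query cache needs. A stdlib-only sketch, where FNV over a JSON encoding stands in for `ast`'s `Hash` (the helper and encoding choice are assumptions for illustration):

```go
package main

import (
	"encoding/json"
	"fmt"
	"hash/fnv"
)

// stringHash hashes a value's serialized form rather than any identity
// tied to the concrete in-memory structure, so a copy hashes the same.
// Sketch only: the real change hashes ast.StringTerm(v.String()).
func stringHash(v interface{}) uint64 {
	b, _ := json.Marshal(v) // Go sorts map keys, so this is deterministic
	h := fnv.New64a()
	h.Write(b)
	return h.Sum64()
}

func main() {
	orig := map[string]string{"h1": "v1"}
	copied := map[string]string{"h1": "v1"} // a distinct copy, equal content
	fmt.Println(stringHash(orig) == stringHash(copied)) // true: copies agree
}
```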
Thanks for the explanation.

> The inter query cache already relies on the String() representation of Value for cache key.

Can you point to where this code is?
@ashutosh-narkar please refer to topdown/cache/cache.go#L240; the `String()` representation of the input key (of type `ast.Value`) is used as the cache key.
| `caching_mode` | no | `string` | Controls the format in which items are inserted into the inter-query cache. Allowed modes are `serialized` and `deserialized`. In the `serialized` mode, items will be serialized before inserting into the cache. This mode is helpful if memory conservation is preferred over higher latency during cache lookup. This is the default mode. In the `deserialized` mode, an item will be inserted in the cache without any serialization. This means when items are fetched from the cache, there won't be a need to decode them. This mode helps to make the cache lookup faster at the expense of more memory consumption. If this mode is enabled, the configured `caching.inter_query_builtin_cache.max_size_bytes` value will be ignored. This means an unlimited cache size will be assumed. |
| `raise_error` | no | `bool` | If `raise_error` is set, `http.send` will return an error that can halt policy evaluation when used in conjunction with the `strict-builtin-errors` option. Default: `true`. |
| `max_retry_attempts` | no | `number` | Number of times to retry an HTTP request when a network error is encountered. If provided, retries are performed with an exponential backoff delay. Default: `0`. |
| `cache_ignored_headers` | no | `list` | List of header keys from the `headers` parameter that should not be considered when interacting with the cache. Default is `nil`, meaning all headers will be considered. **Important:** Note that if a cache entry exists with a subset/superset of headers that are considered in this request, it will lead to a cache miss. |
> **Important:** Note that if a cache entry exists with a subset/superset of headers that are considered in this request, it will lead to a cache miss.

Can you provide an example of this please?
Added a unit test for this here.
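To make the subset/superset caveat concrete: when two requests agree on `h1` but consider different header sets, their keys differ and the second lookup misses. A sketch with a hypothetical `consideredKey` helper over plain Go maps (not the PR's actual key code):

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// consideredKey joins the headers that survive cache_ignored_headers into
// a canonical string. Hypothetical helper for illustration only.
func consideredKey(headers map[string]string, ignored []string) string {
	skip := map[string]bool{}
	for _, h := range ignored {
		skip[h] = true
	}
	parts := []string{}
	for k, v := range headers {
		if !skip[k] {
			parts = append(parts, k+"="+v)
		}
	}
	sort.Strings(parts)
	return strings.Join(parts, "&")
}

func main() {
	// R1 considers only {h1}; R2 considers the superset {h1, h2}.
	// Same h1 value, but different considered sets: different keys, cache miss.
	r1 := consideredKey(map[string]string{"h1": "v1", "h2": "v2"}, []string{"h2"})
	r2 := consideredKey(map[string]string{"h1": "v1", "h2": "v2"}, nil)
	fmt.Println(r1 == r2) // false
}
```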
topdown/http_test.go (outdated)
```go
note: "http.send GET cache miss different headers in cache key",
ruleTemplate: `p = x {
	r1 = http.send({"method": "get", "url": "%URL%", "force_json_decode": true, "headers": {"h1": "v1", "h2": "v2", "h3": "v3"}, "cache_ignored_headers": ["h2"]})
	r2 = http.send({"method": "get", "url": "%URL%", "force_json_decode": true, "headers": {"h1": "v1", "h2": "v21"}, "cache_ignored_headers": ["h2"]}) # cached
```
> `# cached`

This is a miss, correct?
The changes look good @rudrakhp. Can you please squash your commits.
Signed-off-by: Rudrakh Panigrahi <[email protected]>
LGTM
Why the changes in this PR are needed?

Need to support exclusion of certain headers from the `http.send` query cache key.

What are the changes in this PR?

Adds a new attribute `cache_ignored_headers` to the `http.send` built-in request object. It enables policy authors to define an exclusion list of headers to ignore while caching.

Notes to assist PR review:

If `cache_ignored_headers` changes but does not affect the final computed cache key, the results are still served from cache. This attribute is always excluded from the cache key.

Further comments:
Resolves #6642