Issue-6642 Add http.send request attribute to ignore headers for caching key #6675
Conversation
Thanks for working on this. Here we are excluding certain headers from the key calculation. Is there a reason for that level of granularity? Why not exclude certain request object params themselves from the key calculation?
I would have to explore if there is any straightforward way to delete a value by path reference in the request object.
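For illustration, deleting a value by path in a plain nested Go map might look like the sketch below. The `deletePath` helper is hypothetical; OPA's `ast.Object` exposes no delete-by-path operation, which is what makes this exploration necessary:

```go
package main

import "fmt"

// deletePath returns a copy of obj with the value at the given path removed.
// It copies each map it descends into, so the original object is untouched.
// Hypothetical helper for illustration only; not part of OPA's ast API.
func deletePath(obj map[string]interface{}, path ...string) map[string]interface{} {
	out := make(map[string]interface{}, len(obj))
	for k, v := range obj {
		out[k] = v
	}
	if len(path) == 0 {
		return out
	}
	if len(path) == 1 {
		delete(out, path[0])
		return out
	}
	if child, ok := out[path[0]].(map[string]interface{}); ok {
		out[path[0]] = deletePath(child, path[1:]...)
	}
	return out
}

func main() {
	req := map[string]interface{}{
		"method":  "get",
		"headers": map[string]interface{}{"h1": "v1", "h2": "v2"},
	}
	key := deletePath(req, "headers", "h2")
	fmt.Println(key["headers"]) // h2 removed from the derived key
	fmt.Println(req["headers"]) // original request still has h2
}
```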
@rudrakhp just checking to see if you got a chance to explore the option of excluding certain request object params.

@ashutosh-narkar was out last week, will raise an updated PR this week.

This pull request has been automatically marked as stale because it has not had any activity in the last 30 days.
@rudrakhp the changes seem fine. More testing would be helpful. Also, we need to update the docs for the built-in.
```go
func getKeyFromRequest(req ast.Object) (ast.Object, error) {
	var cacheIgnoredHeaders []string
	var allHeaders map[string]interface{}
	cacheIgnoredHeadersTerm := req.Get(ast.StringTerm("cache_ignored_headers"))
```
Nit: we can do an early exit here:

```go
if cacheIgnoredHeadersTerm == nil {
	return nil, nil
}
```
> `// new copy so headers in request object doesn't change`

Did you mean `deep copy`?
I guess "new" and "copy" are redundant; will update the comment to:

`// deep copy so changes to key do not reflect in the request object`
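The distinction matters because a shallow copy still shares nested structures with the original request. A minimal stdlib sketch of the hazard, using plain Go maps rather than the PR's `ast.Object` (both helpers here are illustrative, not OPA code):

```go
package main

import "fmt"

// shallowCopy copies only the top-level map; nested maps remain shared
// with the original, so mutating them leaks back into the request.
func shallowCopy(req map[string]map[string]string) map[string]map[string]string {
	out := make(map[string]map[string]string, len(req))
	for k, v := range req {
		out[k] = v // inner map still shared!
	}
	return out
}

// deepCopy also copies the nested maps, so the derived key can be
// mutated freely without affecting the original request object.
func deepCopy(req map[string]map[string]string) map[string]map[string]string {
	out := make(map[string]map[string]string, len(req))
	for k, v := range req {
		inner := make(map[string]string, len(v))
		for ik, iv := range v {
			inner[ik] = iv
		}
		out[k] = inner
	}
	return out
}

func main() {
	req := map[string]map[string]string{"headers": {"h1": "v1", "h2": "v2"}}
	key := shallowCopy(req)
	delete(key["headers"], "h2")
	fmt.Println(len(req["headers"])) // 1: the original request was mutated

	req2 := map[string]map[string]string{"headers": {"h1": "v1", "h2": "v2"}}
	key2 := deepCopy(req2)
	delete(key2["headers"], "h2")
	fmt.Println(len(req2["headers"])) // 2: the original request is intact
}
```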
```go
	expectedReqCount: 1,
},
{
	note: "http.send GET cache miss different headers (force_cache enabled)",
```
Not sure which scenario in the changes this test case is trying to exercise.
This is to clarify how the cache behaves when `cache_ignored_headers` has not been set. There was no existing test case covering this cache-miss scenario caused by different header values.
topdown/http_test.go (outdated)
```go
note: "http.send GET different cache_ignored_headers but still cached (force_cache enabled)",
ruleTemplate: `p = x {
	r1 = http.send({"method": "get", "url": "%URL%", "force_json_decode": true, "headers": {"h1": "v1", "h2": "v2"}, "force_cache": true, "force_cache_duration_seconds": 300, "cache_ignored_headers": ["h2"]})
	r2 = http.send({"method": "get", "url": "%URL%", "force_json_decode": true, "headers": {"h1": "v1", "h2": "v2"}, "force_cache": true, "force_cache_duration_seconds": 300, "cache_ignored_headers": ["h2", "h3"]}) # cached
```
Just for testing, we could actually have an `h3` header in the `headers` object.
Also, a test for the scenario where the values of `cache_ignored_headers` and `headers` differ and we get a cache miss would be helpful.
One more case I can think of:

```
R1: {"headers": {"h1": "v1"}}
R2: {"headers": {"h1": "v1", "h2": "v2"}, "cache_ignored_headers": ["h2"]}
```

So here R1 and R2 are equivalent, correct?
Another one:

```
R1: {"headers": {"h1": "v1"}, "cache_ignored_headers": []}
R2: {"headers": {"h1": "v1", "h2": "v2"}, "cache_ignored_headers": ["h2"]}
```

So here R1 and R2 are equivalent, correct?
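Under the proposed semantics the two requests above should indeed collapse to the same key. A minimal sketch of that key computation, using a hypothetical `cacheKey` helper over plain Go maps rather than the PR's actual `ast.Object` code:

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// cacheKey builds a canonical key from the considered headers only,
// dropping anything listed in cache_ignored_headers.
// Hypothetical helper for illustration; not OPA's implementation.
func cacheKey(headers map[string]string, ignored []string) string {
	skip := make(map[string]bool, len(ignored))
	for _, h := range ignored {
		skip[h] = true
	}
	parts := []string{}
	for k, v := range headers {
		if !skip[k] {
			parts = append(parts, k+"="+v)
		}
	}
	sort.Strings(parts) // canonical order: equal considered sets give equal keys
	return strings.Join(parts, "&")
}

func main() {
	// R1 and R2 from the comment above both reduce to the considered set {h1: v1}.
	r1 := cacheKey(map[string]string{"h1": "v1"}, []string{})
	r2 := cacheKey(map[string]string{"h1": "v1", "h2": "v2"}, []string{"h2"})
	fmt.Println(r1 == r2) // true: equivalent for caching purposes
}
```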
Done
@rudrakhp this would be a good one to get in the next release. Let us know if you need any help. Thanks.
@ashutosh-narkar I just realised that the request object was a pointer and was getting modified as we modified the key, so the downstream request would also not get these ignored headers. I have made a couple of new changes to address that, please do review. Thanks for the comments!
```diff
@@ -729,13 +772,13 @@ func newHTTPSendCache() *httpSendCache {
 }

 func valueHash(v util.T) int {
-	return v.(ast.Value).Hash()
+	return ast.StringTerm(v.(ast.Value).String()).Hash()
 }
```
@ashutosh-narkar fyi, needed changes to the `hash` and `eq` functions of the hashmap to bring parity between inter- and intra-query cache key generation. The inter-query cache already relies on the `String()` representation of `Value` for the cache key. Relying on the raw `Value` instead of the `String()` representation will always lead to a cache miss in the intra-query cache, due to a change in `Hash` when `Copy()` of the request object is performed during key generation.
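The parity issue can be illustrated outside OPA: hashing a value's canonical string form makes a value and an independently built copy hash identically, which is the property the intra-query cache needs. A stdlib-only sketch, where FNV over a JSON encoding stands in for `ast`'s `Hash` (the helper and encoding choice are assumptions for illustration):

```go
package main

import (
	"encoding/json"
	"fmt"
	"hash/fnv"
)

// stringHash hashes a value's serialized form rather than any identity
// tied to the concrete in-memory structure, so a copy hashes the same.
// Sketch only: the real change hashes ast.StringTerm(v.String()).
func stringHash(v interface{}) uint64 {
	b, _ := json.Marshal(v) // Go sorts map keys, so this is deterministic
	h := fnv.New64a()
	h.Write(b)
	return h.Sum64()
}

func main() {
	orig := map[string]string{"h1": "v1"}
	copied := map[string]string{"h1": "v1"} // a distinct copy, equal content
	fmt.Println(stringHash(orig) == stringHash(copied)) // true: copies agree
}
```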
Thanks for the explanation.

> The inter query cache already relies on the String() representation of Value for cache key.

Can you point to where this code is?
@ashutosh-narkar please refer to topdown/cache/cache.go#L240; the `String()` representation of the input key (of type `ast.Value`) is used as the cache key.
| `caching_mode` | no | `string` | Controls the format in which items are inserted into the inter-query cache. Allowed modes are `serialized` and `deserialized`. In the `serialized` mode, items will be serialized before inserting into the cache. This mode is helpful if memory conservation is preferred over higher latency during cache lookup. This is the default mode. In the `deserialized` mode, an item will be inserted in the cache without any serialization. This means when items are fetched from the cache, there won't be a need to decode them. This mode helps to make the cache lookup faster at the expense of more memory consumption. If this mode is enabled, the configured `caching.inter_query_builtin_cache.max_size_bytes` value will be ignored. This means an unlimited cache size will be assumed. |
| `raise_error` | no | `bool` | If `raise_error` is set, `http.send` will return an error that can halt policy evaluation when used in conjunction with the `strict-builtin-errors` option. Default: `true`. |
| `max_retry_attempts` | no | `number` | Number of times to retry an HTTP request when a network error is encountered. If provided, retries are performed with an exponential backoff delay. Default: `0`. |
| `cache_ignored_headers` | no | `list` | List of header keys from the `headers` parameter that should not be considered when interacting with the cache. Default is `nil`, meaning all headers will be considered. **Important:** Note that if a cache entry exists with a subset/superset of headers that are considered in this request, it will lead to a cache miss. |
> **Important:** Note that if a cache entry exists with a subset/superset of headers that are considered in this request, it will lead to a cache miss.

Can you provide an example of this please?
Added a unit test for this here.
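To make the subset/superset caveat concrete: when two requests agree on `h1` but consider different header sets, their keys differ and the second lookup misses. A sketch with a hypothetical `consideredKey` helper over plain Go maps (not the PR's actual key code):

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// consideredKey joins the headers that survive cache_ignored_headers into
// a canonical string. Hypothetical helper for illustration only.
func consideredKey(headers map[string]string, ignored []string) string {
	skip := map[string]bool{}
	for _, h := range ignored {
		skip[h] = true
	}
	parts := []string{}
	for k, v := range headers {
		if !skip[k] {
			parts = append(parts, k+"="+v)
		}
	}
	sort.Strings(parts)
	return strings.Join(parts, "&")
}

func main() {
	// R1 considers only {h1}; R2 considers the superset {h1, h2}.
	// Same h1 value, but different considered sets: different keys, cache miss.
	r1 := consideredKey(map[string]string{"h1": "v1", "h2": "v2"}, []string{"h2"})
	r2 := consideredKey(map[string]string{"h1": "v1", "h2": "v2"}, nil)
	fmt.Println(r1 == r2) // false
}
```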
topdown/http_test.go (outdated)
```go
note: "http.send GET cache miss different headers in cache key",
ruleTemplate: `p = x {
	r1 = http.send({"method": "get", "url": "%URL%", "force_json_decode": true, "headers": {"h1": "v1", "h2": "v2", "h3": "v3"}, "cache_ignored_headers": ["h2"]})
	r2 = http.send({"method": "get", "url": "%URL%", "force_json_decode": true, "headers": {"h1": "v1", "h2": "v21"}, "cache_ignored_headers": ["h2"]}) # cached
```
> `# cached`

This is a miss, correct?
The changes look good @rudrakhp. Can you please squash your commits.
Signed-off-by: Rudrakh Panigrahi <[email protected]>
LGTM
Why the changes in this PR are needed?

Need to support exclusion of certain headers from the `http.send` query cache key.

What are the changes in this PR?

Adds a new attribute `cache_ignored_headers` to the `http.send` built-in request object. It enables policy authors to define an exclusion list of headers to ignore while caching.

Notes to assist PR review:

If `cache_ignored_headers` changes but does not affect the final computed cache key, the results are still served from cache. This attribute is always excluded from the cache key.

Further comments:
Resolves #6642