Validation: App Layer Rate Limiting #137
Labels
Dev Reviewed: Reviewed by Tech Lead
Notify: Board trigger
PM Reviewed: Reviewed by Product Manager
QA Reviewed: Reviewed by Quality Assurance
QA: Issue requires QA collaboration
User Story - Business Need
We will use Redis to handle app-layer rate limiting.
We investigated alternatives: memcached is not really supported by the Python community, whereas Redis has async support.
User Story(ies)
As a VA Notify reliability dev
I want to limit API requests
So that one client is not capable of affecting another
Additional Info and Resources
We likely want to use:
and set that as part of our EnpState with lifespan (setup/shutdown), making sure to close it with the shutdown.
Redis keys should be rate-limit-<service_id>-<api_key_id> of an authenticated request because some clients have sub-groups using different keys.
Example taken (and tweaked) from Redis' YouTube account:
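The example from the Redis video is not reproduced here. As a rough, framework-agnostic sketch of the two suggestions above (open the client at setup, close it at shutdown, and build per-key names), something like the following could work. `FakeRedis`, `AppState`, `rate_limit_key`, and `lifespan` are hypothetical names for illustration; `AppState` only stands in for `EnpState`, and a real implementation would hold an async Redis client instead of the stand-in.

```python
import asyncio
from contextlib import asynccontextmanager
from dataclasses import dataclass
from typing import AsyncIterator, Optional


class FakeRedis:
    """Stand-in for an async Redis client (assumption: the real one has an async close)."""

    def __init__(self) -> None:
        self.closed = False

    async def aclose(self) -> None:
        self.closed = True


@dataclass
class AppState:
    """Hypothetical stand-in for EnpState; only the redis attribute matters here."""

    redis: Optional[FakeRedis] = None


def rate_limit_key(service_id: str, api_key_id: str) -> str:
    """Build the per-service, per-API-key name described in this issue."""
    return f"rate-limit-{service_id}-{api_key_id}"


@asynccontextmanager
async def lifespan(state: AppState) -> AsyncIterator[AppState]:
    state.redis = FakeRedis()  # setup: create the client once per process
    try:
        yield state
    finally:
        await state.redis.aclose()  # shutdown: always close the client


async def main() -> None:
    state = AppState()
    async with lifespan(state):
        print(rate_limit_key("service-1", "key-a"))


if __name__ == "__main__":
    asyncio.run(main())
```

Keying on both IDs keeps sub-groups of the same service from sharing one counter.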
Acceptance Criteria
This AC is based on the suggestion in the Additional Info. If the dev would like to move in a different direction please discuss with @k-macmillan so we can update the AC.
limit (value for keys). Default = 1800 (1800/60 = 30 req/s)
observation_period (how long until the key expires). Default = 60
remaining-rate-limit-<service_id>-<api_key_id>
ci/.env.local
Depends on sms/email/ POST routes and notification GET route (for any that are implemented)
QA Considerations
Multiple services and API keys should all work as expected. Could be tested by automating a load test to hammer it, or by extending the observation_period and reducing the limit. There are some mild concerns with async behaving correctly, so the preference is to hammer the endpoint(s). May be good to try both (start slow, finish with the hammer).
Potential Dependencies
Out of Scope
Determining limits/observation periods. We will fine tune that based on performance/reliability evaluations later.
There may be some race conditions due to decrby creating the key-value pair. If we find them in testing, we'll address them with transactions and a modified flow. The docs aren't clear enough.
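To make the race concern concrete, here is a minimal sketch of one possible decrement-based flow, assuming the key is seeded with a set-if-absent call and then decremented per request. It uses the key name and defaults from the Acceptance Criteria. `InMemoryRedis` is a hypothetical stand-in that mimics only the `set(..., nx=True, ex=...)` and `decrby` behavior needed here, and `allow_request` is an illustrative name, not the agreed implementation.

```python
import asyncio
import time
from typing import Dict, Optional, Tuple

LIMIT = 1800             # default from the AC: 1800 / 60 = 30 req/s
OBSERVATION_PERIOD = 60  # seconds until the key expires


class InMemoryRedis:
    """Tiny stand-in mimicking SET NX EX and DECRBY; not a real client."""

    def __init__(self) -> None:
        self._data: Dict[str, Tuple[int, Optional[float]]] = {}

    def _purge(self, key: str) -> None:
        entry = self._data.get(key)
        if entry is not None and entry[1] is not None and entry[1] <= time.monotonic():
            del self._data[key]

    async def set(self, key: str, value: int, nx: bool = False,
                  ex: Optional[int] = None) -> Optional[bool]:
        self._purge(key)
        if nx and key in self._data:
            return None  # key already present, nothing written
        expires = time.monotonic() + ex if ex is not None else None
        self._data[key] = (int(value), expires)
        return True

    async def decrby(self, key: str, amount: int = 1) -> int:
        self._purge(key)
        # A missing key is treated as 0 and is created without an expiry.
        value, expires = self._data.get(key, (0, None))
        value -= amount
        self._data[key] = (value, expires)
        return value


async def allow_request(redis: InMemoryRedis, service_id: str, api_key_id: str,
                        limit: int = LIMIT,
                        observation_period: int = OBSERVATION_PERIOD) -> bool:
    key = f"remaining-rate-limit-{service_id}-{api_key_id}"
    # Seed the window only when the key is absent.
    await redis.set(key, limit, nx=True, ex=observation_period)
    # If the key expires between these two calls, decrby recreates it
    # below zero with no expiry, which is the race noted above.
    remaining = await redis.decrby(key, 1)
    return remaining >= 0


async def demo() -> None:
    redis = InMemoryRedis()
    results = [await allow_request(redis, "svc", "key", limit=3) for _ in range(5)]
    print(results)


if __name__ == "__main__":
    asyncio.run(demo())
```

Because the seed and the decrement are separate calls, a transaction or server-side script would be one way to close the gap if testing shows the race in practice.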