QPM Limit Bypass List apparently not being honored #1181

Open
shalafi99 opened this issue Jan 11, 2025 · 15 comments

@shalafi99

Hi @ShreyasZare!

I have upgraded the DNS server on the two VPSes I use from v12.2.1.0 to v13.1.1.0 (v13 added a feature that logs clients which are being rate limited due to QPM).

Because I have set such a limit, I run a script for my own DNS resolver usage that checks whether the ISP has changed the public IPv4 address assigned to my router. If it has, the script calls the DNS server API to update the "QPM Limit Bypass List" field.
This is the API call:
https://${DnsServer}:53443/api/settings/set?token=${Token}&qpmLimitBypassList=${encodedIpv4},${encodedIpv6}
The script has been working as expected: if I intentionally break the bypass configuration by changing the entries to something else, the script corrects it.
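
For readers who want to reproduce the update step, here is a minimal sketch of how the URL above could be assembled. Only the endpoint path, the port, and the `token` and `qpmLimitBypassList` parameters come from the call shown in this post; the function names, the host name, and the addresses in the usage note are hypothetical, and the sketch does not send any request.

```python
from typing import List, Optional
from urllib.parse import quote


def build_bypass_url(server: str, token: str, addresses: List[str]) -> str:
    """Build the settings/set URL that rewrites the QPM Limit Bypass List."""
    # Percent-encode each address on its own (IPv6 contains ':'), then join
    # with a literal comma, mirroring ${encodedIpv4},${encodedIpv6}.
    encoded = ",".join(quote(addr, safe="") for addr in addresses)
    return (
        f"https://{server}:53443/api/settings/set"
        f"?token={quote(token, safe='')}&qpmLimitBypassList={encoded}"
    )


def needs_update(previous_ip: Optional[str], current_ip: str) -> bool:
    """Return True when the public address differs from the last one seen."""
    return previous_ip != current_ip
```

A caller would invoke `build_bypass_url("dns.example.net", token, [ipv4, ipv6])` only when `needs_update(...)` is true, then issue the HTTP request with whatever client it prefers.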

Since upgrading to v13, I have noticed that the subnet containing the public IPv4 address of my router keeps appearing in the logs as rate limited, like so:

[2025-01-11 00:41:51 Local] Client subnet '179.208.140.0/24' is being rate limited till the query rate limit (52 qpm for requests) falls below 50 qpm.
[2025-01-11 00:42:01 Local] Client subnet '179.208.140.0/24' is no longer being rate limited (1 qpm for requests).
[2025-01-11 00:42:41 Local] Client subnet '179.208.140.0/24' is being rate limited till the query rate limit (70 qpm for requests) falls below 50 qpm.
[2025-01-11 00:43:01 Local] Client subnet '179.208.140.0/24' is no longer being rate limited (2 qpm for requests).
[2025-01-11 00:49:31 Local] Client subnet '179.208.140.0/24' is being rate limited till the query rate limit (51 qpm for requests) falls below 50 qpm.
[2025-01-11 00:50:01 Local] Client subnet '179.208.140.0/24' is no longer being rate limited (0 qpm for requests).

I trimmed the logs. Several other subnets are being rate limited too, which is expected since no bypass is configured for them (they are not related to the public IPv4 address I'm using).
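
To see at a glance which subnets trip the limit most often, the entries can be tallied. This is a rough sketch that assumes the log lines keep exactly the wording shown above; the regular expression and the function name are illustrative and not part of the DNS server.

```python
import re
from collections import Counter

# Matches the "is being rate limited" entries as they appear in the log excerpt.
RATE_LIMITED = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\] Client subnet '(?P<subnet>[^']+)' "
    r"is being rate limited till the query rate limit "
    r"\((?P<rate>\d+) qpm for (?P<kind>\w+)\) falls below (?P<limit>\d+) qpm\.$"
)


def count_rate_limited(lines):
    """Count how many times each subnet entered the rate-limited state."""
    counts = Counter()
    for line in lines:
        match = RATE_LIMITED.match(line.strip())
        if match:  # "no longer being rate limited" lines do not match
            counts[match["subnet"]] += 1
    return counts
```

Feeding it the lines of an exported log file returns a `Counter` keyed by subnet.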

The related QPM configuration is set like this:
image

Would you have any ideas on what could be wrong or suggestions on what to check for further troubleshooting?
Thanks!

@ShreyasZare
Member

Thanks for reporting this. The bypass list is being honored, but the rate limit log entries are still written from the collected stats without checking the bypass list.

This will be fixed in the next update by adding a check for the bypass list.
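
As an illustration only, a bypass check of this kind could look like the sketch below. It is not the DNS server's code; the function name is hypothetical, and it assumes that bypass entries are either single addresses or CIDR networks.

```python
import ipaddress


def is_bypassed(client_ip, bypass_list):
    """Return True if client_ip falls inside any entry of the bypass list."""
    address = ipaddress.ip_address(client_ip)
    for entry in bypass_list:
        # A bare address is treated as a single-host network (/32 or /128).
        network = ipaddress.ip_network(entry, strict=False)
        # Membership is False when the IP versions differ.
        if address in network:
            return True
    return False
```

A logging path would call such a check first and skip the "is being rate limited" entry when it returns `True`.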

@bcookatpcsd

@shalafi99

How/where did you get those logs?

❯ docker logs technitium --tail 8
Technitium DNS Server is stopping...
Technitium DNS Server was stopped successfully.
Technitium DNS Server was started successfully.
Using config folder: /etc/dns

Note: Open http://tchntm:5380/ in web browser to access web console.

Press [CTRL + C] to stop...

Thank you in advance

@ShreyasZare
Member

@bcookatpcsd these logs are in the DNS log file, which can be viewed in the Logs section of the admin panel.

@bcookatpcsd

My logs look like this (via a browser)

Image

I had to remove ratelimiting (set to 0) as I had a dozen subnets being limited.. and I could not find logs for a count..

I couldn't get "non dns" logs..

Image

those logs look to be on a command line.. (imho)

@Hemsby

Hemsby commented Jan 17, 2025

The operational logs are here

Image

I have mine set to keep for 14days

@bcookatpcsd

🤦

I forgot about those.. ty

(I knew that..) but recalling it was a problem.. lol

@bcookatpcsd

[2025-01-16 08:35:21 Local] Client subnet '10.120.14.0/24' is being rate limited till the query rate limit (616 qpm for errors) falls below 600 qpm.

Image

I believe the default is 6000

I couldn't understand how the clients were doing 6k rq/m

600 is totally possible.. certainly within a /24

(defaults)

Image

Image

(the drop is nxdomain to 0.0.0.0)

but those top hosts were doing 10k until I found the backlog of 100 and raised it..

I think the 6k does not mean 6k.. as evidenced in the logs where it says 600

@ShreyasZare
Member

> I think the 6k does not mean 6k.. as evidenced in the logs where it says 600

@bcookatpcsd The QPM limit is different for positive requests and for requests that generate an error response. The error QPM limit defaults to 600 q/m. It's useful for authoritative DNS servers to rate limit resolvers that are causing too many error responses.

@bcookatpcsd

OH..

so the errors is what caused the lock; not the query

(assuming nxdomain is treated as a error which totaled more than 600 then causing the block?)

(if so) that completely makes sense..

@bcookatpcsd

Image

This is something like 150 qps this hour.. (550000 / 3600)

all doh clients on a lan..

(they have external doh when they leave for the evening and weekend/out of office/etc.. so they need doh when they are here for the day.. to keep them from routing out for each doh query during the day.. )
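
The arithmetic above, spelled out as a small sketch. The 550000 figure is the one quoted in this comment; the function name is illustrative, and the result is an average across all clients for the hour, not a per-subnet rate.

```python
def average_rates(total_queries, seconds):
    """Return (queries per second, queries per minute) averaged over the window."""
    qps = total_queries / seconds
    qpm = qps * 60
    return qps, qpm


# One hour of traffic as quoted above: 550000 queries in 3600 seconds.
qps, qpm = average_rates(550_000, 3600)
print(f"{qps:.1f} qps, {qpm:.0f} qpm")
```

Expressing the same load in queries per minute makes it easier to compare against the QPM limits discussed in this thread.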

@bcookatpcsd

bcookatpcsd commented Jan 17, 2025

I thought I had everything (mostly) worked out and accounted for (yesterday) when I fully implemented..

Image

I couldn't find those logs to show what those dropped queries were.. (couldn't find 'drop' in the administration screens..)

then figured it must be the qpm limits.. (I think that was 500+/qps)

(no gaps in the graphing.. nice)

anyway..

thank you for all the details.

Sorry for hijacking the thread..

Greatly appreciated.

@shalafi99
Author

> Thanks for reporting this. The bypass list is being honored but the limit logs are still being written based on the collected stats without checking for the bypass list.
>
> Will get this fixed in the next update by adding a check for bypass list.

Thanks, @ShreyasZare! The norm would be to keep this issue open until you roll out the fix, correct?

@ShreyasZare
Member

> OH..
>
> so the errors is what caused the lock; not the query
>
> (assuming nxdomain is treated as a error which totaled more than 600 then causing the block?)
>
> (if so) that completely makes sense..

Only FormatError, ServerFailure, and Refused are considered error responses, so NxDomain does not count toward the error limit.
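
Expressed as a sketch, the classification reads as follows. The three response code names are the ones listed above; the constant and function names are illustrative and not taken from the server's source.

```python
# Response codes that count toward the error QPM limit, per the list above.
ERROR_RCODES = frozenset({"FormatError", "ServerFailure", "Refused"})


def counts_toward_error_limit(rcode):
    """Return True if a response with this code counts as an error response."""
    return rcode in ERROR_RCODES
```

With this rule, `counts_toward_error_limit("NxDomain")` is false, so blocked or nonexistent names alone would not trip the error limit.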

@ShreyasZare
Member

> I couldn't find those logs to show what those dropped queries were.. (couldn't find 'drop' in the administration screens..)

Dropped responses are not logged anywhere; they are only counted in the stats. This avoids filling the log file or database during an attack.

> then figured it must be the qpm limits.. (I think that was 500+/qps)
>
> (no gaps in the graphing.. nice)

Note that these QPM limits are not enforced per client but per subnet, which defaults to /24. So two or more clients in the same subnet could together push it over the limit.
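
A small sketch of why per-subnet accounting matters: clients that each stay under the limit can exceed it once their counts are summed per /24. The function name and the sample addresses are hypothetical, and the sketch assumes IPv4 clients only (an IPv6 prefix length would differ).

```python
import ipaddress
from collections import defaultdict


def qpm_by_subnet(client_qpm, prefix_length=24):
    """Sum per-client query rates into their enclosing IPv4 subnets."""
    totals = defaultdict(int)
    for client_ip, qpm in client_qpm.items():
        # strict=False masks the host bits, e.g. 10.120.14.5/24 -> 10.120.14.0/24
        subnet = ipaddress.ip_network(f"{client_ip}/{prefix_length}", strict=False)
        totals[str(subnet)] += qpm
    return dict(totals)
```

For example, two clients at 30 and 25 qpm in the same /24 add up to 55 qpm, which is over a 50 qpm limit even though neither client exceeds it alone.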

> anyway..
>
> thank you for all the details.
>
> Sorry for hijacking the thread..
>
> Greatly appreciated.

You're welcome.

@ShreyasZare
Member

> Thanks, @ShreyasZare! The norm would be to keep this issue open until you roll out the fix, correct?

You're welcome. Yes, keep it open. When the fix is available, I will post here and close the issue so that anyone tracking this issue gets a notification.
