-
Notifications
You must be signed in to change notification settings - Fork 24
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add client statistics to the connection spec #308
base: main
Are you sure you want to change the base?
Conversation
adr/ADR-40.md
Outdated
#### Messages out | ||
Every message (`PUB` and `HPUB`) sent to the server. | ||
|
||
#### Connets |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reconnects?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I meant connects.
I feel that counting only reconnects gets messy when you have allow initial reconnect
, which starts with 0 reconnects and 0 connets. Then, after initial connection is established, it would still be 0 reconnects, but 1 connect. Hence I feel it's clearer if we count every successful connect, including the initial one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is Reconnects in the nats.go client already, so would try to model that rather than changing it at this point: https://github.com/nats-io/nats.go/blob/f0c0194ad0ca716dfa162baac7e94d900a53f2c9/nats.go#L780
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should not model the Statistics
after Go client, or any other client, as each is counting bytes in/out in their own way. That's because the feature predates ADR and parity concepts.
We agreed that we should do what Server does - count all bytes in and out. Go counts just headears + payloads.
Regarding the connect versus reconnect - maybe we could allow this to be client specific? It can make sense depending on how retry_on_connect
is implemented in given language after all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think first pass at least should try to converge on what the nats.go client does which is the practical spec already used, trying to add new behavior on top via the ADR that is not implemented is not helpful. It could be a different field for example or a revised Statistics v2 implementation. If you want to stamp out the behavior for this ADR it has to be what the Go client does.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Its not new behaviour that is not align with clients. Every client counts things diferently:
- counting just payloads as bytes in/out
- counting headers + payloads (Go does that)
- counting subjects + headers + payloads (rust)
- counting everything that goes through the socket (server & Javascript)
additionally, some languages have more fields than others.
As you can see, its impossible to make consistent ADR without introducing changes everywhere.
Because of that, we decided to count bytes as server does, which also seems to make most sense.
I agree that the connected
vs reconnected
might be a change we should avoid in some clients, depending how they handle retry_initial_connect
, so I'll update the ADR that both are ok.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wallyqs I think that Piotr's idea is the best here:
he proposed that in clients that count data differently, we could add total_bytes_in
total_bytes_out
etc.
Check the upates to this PR.
@Jarema This is what the java client currently collects:
Advanced Stats
|
Signed-off-by: Tomasz Pietrek <[email protected]>
3414ba6
to
70f439a
Compare
Signed-off-by: Tomasz Pietrek <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Signed-off-by: Tomasz Pietrek <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The description of what we should be doing doesn't seem to align with what the server is reporting.
Clients should implement statistics gathered throughout the lifetime of the client (persisted between connections): | ||
|
||
#### Bytes In | ||
Should account for everything received from the server, a record of bytes received on the socket. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Jarema - so went looking augment with what is discussed here, but the discussion on the alignment with server is incorrect - Server accounting seems to only include headers/payload for published messages - it doesn't include other protocol messages, subjects, etc. We could add a raw
metric for in/out bytes, but right now at least for JavaScript, it lines up with what the server does.
Deno.test("basics - stats", async () => {
const { ns, nc } = await _setup(connect);
const cid = nc.info?.client_id || -1;
if (cid === -1) {
fail("client_id not found");
}
await nc.flush();
console.log(nc.stats())
console.log(await ns.connz(cid, "detail"));
nc.publish("hello", "world");
await nc.flush();
console.log(nc.stats())
console.log(await ns.connz(cid, "detail"));
nc.subscribe("hello", {callback: () => {}});
nc.publish("hello", "hi");
await nc.flush();
console.log(nc.stats())
console.log(await ns.connz(cid, "detail"));
await cleanup(ns, nc);
});
{ inBytes: 0, outBytes: 0, inMsgs: 0, outMsgs: 0 }
{
server_id: "NBN7DZNKVGIGZBAVMGYTK7RHZZITVELFTCJI5BKOHORHTTO2TOS4OHTL",
now: "2024-10-15T23:32:44.636996Z",
num_connections: 1,
total: 1,
offset: 0,
limit: 1,
connections: [
{
cid: 6,
kind: "Client",
type: "nats",
ip: "127.0.0.1",
port: 59457,
start: "2024-10-15T18:32:44.633103-05:00",
last_activity: "2024-10-15T23:32:44.634232Z",
rtt: "1.129333ms",
uptime: "0s",
idle: "0s",
pending_bytes: 0,
in_msgs: 0,
out_msgs: 0,
in_bytes: 0,
out_bytes: 0,
subscriptions: 0,
lang: "nats.deno",
version: "3.0.0-6"
}
]
}
{ inBytes: 0, outBytes: 5, inMsgs: 0, outMsgs: 1 }
{
server_id: "NBN7DZNKVGIGZBAVMGYTK7RHZZITVELFTCJI5BKOHORHTTO2TOS4OHTL",
now: "2024-10-15T23:32:44.638501Z",
num_connections: 1,
total: 1,
offset: 0,
limit: 1,
connections: [
{
cid: 6,
kind: "Client",
type: "nats",
ip: "127.0.0.1",
port: 59457,
start: "2024-10-15T18:32:44.633103-05:00",
last_activity: "2024-10-15T18:32:44.638146-05:00",
rtt: "1.129333ms",
uptime: "0s",
idle: "0s",
pending_bytes: 0,
in_msgs: 1,
out_msgs: 0,
in_bytes: 5,
out_bytes: 0,
subscriptions: 0,
lang: "nats.deno",
version: "3.0.0-6"
}
]
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok. I probably made a mistake checking the server code.
Though I still agree with your point that we should count everything in and out.
We for sure can have more stats - current one and raw.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that would be the correct thing to do. The other nuance is we should reset on reconnect the current counters this way the reporting from the client will always match the current session and the server. We can add a total_in/out
which accounts for all the in/out but these are always all session - Not sure what the benefit is to say that a current client over all its reconnects has sent XX amount of data, so preserving that session data is possibly not useful.
Signed-off-by: Tomasz Pietrek [email protected]