SQLClient connection leak in getConnectionAwait(), leads to a deadlock #195
Comments
Can you provide a reproducer project? That would help.
@vietj I hoped that my description was detailed enough to avoid implementing a PoC, as I am quite busy 😅 Here is the PoC:
Code of interest: Ping with fixed race condition
Can you elaborate on how the coroutine can get cancelled in your case?
I understand the issue, and it seems to be related to the fact that Vert.x futures cannot be cancelled. I am wondering how we could solve this; e.g., we could have close called when the coroutine is cancelled and the future result implements an interface like Closeable.
Perhaps we could make Vert.x coroutines non-cancellable?
A coroutine usually gets cancelled due to a timeout. Another reason is a client disconnect during the request; e.g., in the websocket server handler, the client may stop responding to pings, the socket can get disconnected due to a network issue, or the client disconnects when using mobile data. The coroutine chain triggered by the client request then gets cancelled. Similarly, the CoroutineScope may be cancelled for another reason (e.g., reloading TLS certificates, maintenance mode).
I was also thinking along these lines. It seems like a fine workaround. It
I am not sure about the impact of this. Personally, it feels a bit risky and maybe too restrictive.
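To make the non-cancellable idea concrete, here is a minimal sketch of one way it could be applied at the call site rather than inside the library. It is not something the Vert.x bindings provide: `acquireThenUse` and `FakeConnection` are hypothetical names, and the pending handler is driven by hand so the interleaving is deterministic. The acquisition runs under `withContext(NonCancellable)`, so it always completes; cancellation is only observed once the resource is owned and can be closed.

```kotlin
import kotlinx.coroutines.CoroutineStart
import kotlinx.coroutines.NonCancellable
import kotlinx.coroutines.currentCoroutineContext
import kotlinx.coroutines.ensureActive
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.suspendCancellableCoroutine
import kotlinx.coroutines.withContext
import java.io.Closeable
import kotlin.coroutines.resume

// Hypothetical stand-in for a pooled connection; not a Vert.x type.
class FakeConnection(private val onClose: () -> Unit) : Closeable {
    override fun close() = onClose()
}

// Acquire without observing cancellation, then honour cancellation once the resource is owned.
suspend fun <T : Closeable> acquireThenUse(acquire: suspend () -> T, body: suspend (T) -> Unit) {
    val resource = withContext(NonCancellable) { acquire() }
    resource.use {
        // Throws CancellationException if the caller was cancelled meanwhile; use() still closes.
        currentCoroutineContext().ensureActive()
        body(it)
    }
}

// Returns true if the connection was closed and the body was skipped.
fun closedAfterCancelDuringAcquire(): Boolean = runBlocking {
    var closed = false
    var bodyRan = false
    var pending: ((FakeConnection) -> Unit)? = null
    // UNDISPATCHED runs the body up to its first suspension, so the handler is registered here.
    val job = launch(start = CoroutineStart.UNDISPATCHED) {
        acquireThenUse(
            acquire = {
                suspendCancellableCoroutine<FakeConnection> { cont ->
                    pending = { conn -> cont.resume(conn) }
                }
            },
            body = { bodyRan = true }
        )
    }
    job.cancel()                                       // caller is cancelled mid-acquire
    pending?.invoke(FakeConnection { closed = true })  // the handler fires afterwards
    job.join()
    closed && !bodyRan
}

fun main() {
    println("closed without running body: ${closedAfterCancelDuringAcquire()}")
}
```

The trade-off matches the concern raised in this comment: the caller cannot abandon a slow acquisition, so the acquisition itself still needs its own timeout.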
I would actually discourage using such cancellation. The more I think about it, the more I see coroutine cancellation as something similar to killing a thread without having an opportunity to cleanly close all the resources associated with the coroutines. Instead, you should try to use timeouts on the resources you are using and let them fail for you; e.g., if your HTTP server uses a WebClient, then set a timeout on the HTTP requests. It will have a similar effect and is handled more reliably.
I actually quite like the structured concurrency principles that coroutines offer. As cancellation is cooperative and throws an exception, it is possible to close resources reasonably. The problem is that with the current Future design, when the handler is invoked and the coroutine is no longer active, we have no way to signal the completion to the calling code, so the logical idea is to close the acquired resource.

suspend fun <T> awaitEvent(block: (h: Handler<T>) -> Unit): T {
  return suspendCancellableCoroutine { cont: CancellableContinuation<T> ->
    try {
      block.invoke(Handler { t ->
        if (cont.isActive) {
          cont.resume(t)
        } else if (t is java.io.Closeable) {
          t.close() // <--- close connection if coroutine is already cancelled when handler is invoked
        }
      })
    } catch (e: Exception) {
      cont.resumeWithException(e)
    }
  }
}

When I experimented with a non-cancellable coroutine in

Good point about the resource timeouts. Using a timeout at the top level of coroutine processing as well makes sure we did not forget to set any timeouts and guarantees a maximal execution time for a request (availability, scalability).
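The bridge proposed above can be exercised without Vert.x. The following is a minimal sketch under stated assumptions: `Handler` here is a hypothetical local interface that only mirrors the shape of `io.vertx.core.Handler`, and `awaitEventClosing` and `FakeConnection` are illustrative names. The handler is invoked by hand after the caller has been cancelled, which is exactly the interleaving described in the issue.

```kotlin
import kotlinx.coroutines.CoroutineStart
import kotlinx.coroutines.cancelAndJoin
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.suspendCancellableCoroutine
import java.io.Closeable
import kotlin.coroutines.resume
import kotlin.coroutines.resumeWithException

// Hypothetical stand-in mirroring the shape of io.vertx.core.Handler.
fun interface Handler<T> {
    fun handle(event: T)
}

// Hypothetical stand-in for a pooled connection; not a Vert.x type.
class FakeConnection(private val onClose: () -> Unit) : Closeable {
    override fun close() = onClose()
}

// Bridge that closes a late result instead of silently dropping it.
suspend fun <T> awaitEventClosing(block: (h: Handler<T>) -> Unit): T =
    suspendCancellableCoroutine { cont ->
        try {
            block(Handler<T> { t ->
                if (cont.isActive) {
                    cont.resume(t)
                } else if (t is Closeable) {
                    t.close() // nobody is waiting any more, so give the resource back
                }
            })
        } catch (e: Exception) {
            cont.resumeWithException(e)
        }
    }

// Returns true if a connection delivered after cancellation gets closed.
fun closedAfterCancelledAwait(): Boolean = runBlocking {
    var closed = false
    var pending: Handler<FakeConnection>? = null
    // UNDISPATCHED runs the body up to its first suspension, so the handler is registered here.
    val job = launch(start = CoroutineStart.UNDISPATCHED) {
        awaitEventClosing<FakeConnection> { h -> pending = h }.use { /* queries would go here */ }
    }
    job.cancelAndJoin()                               // caller goes away first
    pending?.handle(FakeConnection { closed = true }) // result arrives too late
    closed
}

fun main() {
    println("late connection closed: ${closedAfterCancelledAwait()}")
}
```

This sketch is single-threaded. With several threads, cancellation can still land between the `isActive` check and `resume`, so a production bridge would also need to cover that window.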
Closing as we prefer not moving forward with this.
@tsegismont so if I understand correctly, you are basically saying that the deadlock will be preserved in the code and there is nothing you will do about it? :)
No, we are saying that we will not implement cancellation, as this is not what we want to support; e.g., future Loom support would not be able to deal with cancellation.
Version
vertx-lang-kotlin:3.9.7
Context
I observed a resource leak in JDBCClient, namely in the getConnectionAwait() function. The problem occurs if the coroutine calling getConnectionAwait() is cancelled before the handler on the following method is called:
If such a situation happens (e.g., the calling REST connection is terminated or timed out), the obtained connection, passed to the handler, is not closed and returned to the pool. This causes a leak that depletes all free connections (if used with a connection pool like C3P0). The program ends up in a terminal state without a usable DB connection.
I believe that the problem lies in the
vertx-lang-kotlin/vertx-lang-kotlin-coroutines/src/main/java/io/vertx/kotlin/coroutines/VertxCoroutine.kt
Line 63 in d991b16
I solved the problem by implementing the coroutine bridging on my own, with an if-branch handling the cancelled coroutine, such as
I noticed that this method is deprecated (vert-x3/vertx-jdbc-client#196)
vertx-lang-kotlin/vertx-lang-kotlin/src/main/kotlin/io/vertx/kotlin/ext/sql/SQLClient.kt
Lines 64 to 65 in c1938f8
Also, the Vert.x 3 API is planned to be supported in Vert.x 4 as well, so this race condition can potentially affect a lot of users, making their servers unresponsive. By the way, was it also deprecated because of this potential issue?
Similar issues may be present elsewhere wherever the same pattern is used, i.e., an await method is called (coroutine) and waits for the handler; when the handler is called, the coroutine is already cancelled, so the resource obtained and returned in the handler is not properly closed.
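The pattern described above can be reproduced in isolation. What follows is a minimal sketch, not the actual Vert.x code path: `FakePool`, `FakeConnection` and `getConnectionAwaitNaive` are hypothetical stand-ins, and the pool is driven by hand so that the handler provably fires after the calling coroutine has been cancelled.

```kotlin
import kotlinx.coroutines.CoroutineStart
import kotlinx.coroutines.cancelAndJoin
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.suspendCancellableCoroutine
import java.io.Closeable
import kotlin.coroutines.resume

// Hypothetical single-threaded stand-in for a connection pool; not a Vert.x or C3P0 type.
class FakePool(size: Int) {
    var free = size
        private set
    private var pending: ((FakeConnection) -> Unit)? = null

    // Callback-style acquire: only remembers the handler, as an asynchronous API would.
    fun getConnection(handler: (FakeConnection) -> Unit) {
        pending = handler
    }

    // Simulates the pool handing a connection to the waiting handler later on.
    fun completePending() {
        val handler = pending ?: return
        pending = null
        free--
        handler(FakeConnection(this))
    }

    fun release() {
        free++
    }
}

class FakeConnection(private val pool: FakePool) : Closeable {
    override fun close() = pool.release()
}

// Naive bridge: resumes unconditionally and never looks at cancellation.
suspend fun getConnectionAwaitNaive(pool: FakePool): FakeConnection =
    suspendCancellableCoroutine { cont ->
        // resume() on an already cancelled continuation is ignored, so conn is dropped unclosed.
        pool.getConnection { conn -> cont.resume(conn) }
    }

// Returns the number of free connections left after a cancelled acquire.
fun freeAfterCancelledAcquire(): Int = runBlocking {
    val pool = FakePool(size = 1)
    // UNDISPATCHED runs the body up to its first suspension, so the handler is registered here.
    val job = launch(start = CoroutineStart.UNDISPATCHED) {
        getConnectionAwaitNaive(pool).use { /* queries would go here */ }
    }
    job.cancelAndJoin()    // caller is cancelled, e.g. the request timed out
    pool.completePending() // the handler fires only now
    pool.free
}

fun main() {
    println("free connections: ${freeAfterCancelledAcquire()} of 1")
}
```

In this sketch the single connection is checked out but never handed to anyone who could close it, so the pool stays at zero free connections; repeated often enough, every later acquire would wait forever.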
Do you have a reproducer?
Not yet; I can create one if the description is not enough.
Steps to reproduce