Automatically run multi domains and parallel tasks instead of using domain manager #791
Comments
On thinking about this some more, I believe that Eio users creating new domains with the domain manager, or creating domain pools with the executor pool and using them in different parts of the app, is usually the wrong level of abstraction. IMHO, we should provide an API that makes it clear that Eio programs are expected to run in a single domain, and that Eio will replicate the program across multiple domains (so of course it must be domain-safe). And this API should not expose any decisions about the number of domains, because we already know what the number should be. E.g.:

    (* val run_multi : (Eio_unix.Stdenv.base -> unit) -> unit *)
    let run_multi domain =
      Eio_main.run (fun env ->
        let fiber () = domain env in
        let new_domain _ () =
          Eio.Domain_manager.run (Eio.Stdenv.domain_mgr env) fiber
        in
        let fibers =
          fiber :: List.init (Domain.recommended_domain_count () - 1) new_domain
        in
        Eio.Fiber.all fibers)

    let job_queue ~sw env () = ... (* Could be a periodic sync job or whatever *)

    let () =
      run_multi (fun env ->
        Eio.Switch.run (fun sw ->
          Eio.Fiber.fork ~sw (job_queue ~sw env);
          Dream.run env
          @@ Dream.logger
          @@ Dream.router [
            ...
          ]))

This way, apps running with Of course the key issue is that the callback to Implementation of run_multi that removes domain manager access
|
OK I have a slightly more sophisticated POC here: https://github.com/yawaramin/dream/blob/eio-par/example/w-dream-html/par.mli#L20

Sorry it's in a slightly haphazard place because I'm experimenting with Dream's Eio port. So we have

But the key new addition here is a parallelized task runner that is available in each domain:

All domains except domain 0 are treated as worker domains. Domain 0 is not given any of the tasks in order to keep it available for I/O. If there is only a single domain available, we are just falling back to regular

Here's an example showing float array summation: https://github.com/yawaramin/dream/blob/eio-par/example/w-dream-html/par.ml#L93

This is giving a slice of the array to each domain to sum, getting back a promise of an array of the per-domain sums, then finally summing up the array into a single float promise. It's called in the request handler: https://github.com/yawaramin/dream/blob/eio-par/example/w-dream-html/html.ml#L24
|
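The slice-per-domain summation described above can be sketched with only the OCaml 5 standard library. This is not the code from the linked POC: `parallel_sum` and `sum_slice` are hypothetical names, the sketch blocks on `Domain.join` instead of returning a promise, and it does not reserve domain 0 for I/O.

```ocaml
(* Sum the elements arr.(lo) .. arr.(hi - 1). *)
let sum_slice arr lo hi =
  let total = ref 0.0 in
  for i = lo to hi - 1 do
    total := !total +. arr.(i)
  done;
  !total

(* Hypothetical helper: split [arr] into [domains] contiguous slices,
   sum each slice on its own domain, then add up the per-domain sums. *)
let parallel_sum ~domains arr =
  let n = Array.length arr in
  let domains = max 1 domains in
  (* Slice k covers indices from [bounds k] up to, but not including,
     [bounds (k + 1)]; together the slices cover the whole array. *)
  let bounds k = k * n / domains in
  let workers =
    List.init domains (fun k ->
        let lo = bounds k and hi = bounds (k + 1) in
        Domain.spawn (fun () -> sum_slice arr lo hi))
  in
  (* Join every worker and reduce the partial sums to a single float. *)
  List.fold_left (fun acc d -> acc +. Domain.join d) 0.0 workers
```

Spawning a fresh domain per slice keeps the sketch short; a long-running server would more plausibly reuse a fixed set of worker domains, as the POC describes.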
A few thoughts on this:
|
Thanks @talex5. I have some follow-up questions.
Isn't And even if it's not a good default for some cases, why should a simplified multicore scheduler function like
My proposal wouldn't prevent that design–we could check that we are on domain 0 and only run the accept loop there. But I am a bit surprised to hear you say this, because 'Running multiple accept loops' seems to be exactly what Line 384 in fdd2593
Am I missing something?
With my suggestion we can easily share anything that is defined before

    let () =
      ...shared stuff...
    in
    Par.run @@ fun env ->
    ...access shared stuff...
|
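To make the sharing pattern above concrete, here is a minimal stdlib-only sketch. It assumes nothing about `Par.run` itself: `run_replicated`, `program`, and `requests_served` are hypothetical names, and `Domain.spawn` stands in for whatever the runner does internally. The point is only that a value created before the domains start is visible to all of them, and therefore has to be domain-safe.

```ocaml
(* Shared state is created once, before any domain starts. It must be
   domain-safe, so this sketch uses [Atomic] rather than a plain [ref]. *)
let requests_served = Atomic.make 0

(* Hypothetical stand-in for the per-domain program. *)
let program () =
  for _ = 1 to 1_000 do
    Atomic.incr requests_served
  done

(* Run [f] once on the current domain and once on each of [extra] new ones. *)
let run_replicated ~extra f =
  let others = List.init extra (fun _ -> Domain.spawn f) in
  f ();
  List.iter Domain.join others

let () = run_replicated ~extra:3 program
```

With a plain `int ref` in place of the `Atomic.t`, concurrent increments could be lost, which is why the replicated program has to be written with shared values in mind.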
OCaml requires all domains to synchronise on every minor GC. If any domain is slow, it will delay all of them. The more domains you have, the more likely this is (and if you exceed the number of cores then at least one domain will always spin waiting for the others to be ready, preventing the remaining domain from running until the OS decides to preempt the spinning one). See https://roscidus.com/blog/blog/2024/07/22/performance-2/ for examples.
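One way to avoid exceeding the number of cores is to clamp any requested domain count, sketched below. `clamp_domains` is a hypothetical helper, not an Eio function, and `Domain.recommended_domain_count` is only a heuristic: it does not know about other processes competing for the same cores.

```ocaml
(* Hypothetical helper: keep a requested domain count within
   1 .. Domain.recommended_domain_count (). *)
let clamp_domains requested =
  let limit = Domain.recommended_domain_count () in
  max 1 (min requested limit)
```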
Yes; what I mean is: passing a domain manager to |
This issue is related to the forum discussion https://discuss.ocaml.org/t/multiple-domains-design-question-in-eio-etc/15861 and the cohttp issue mirage/ocaml-cohttp#1101
The concern is that multiple parts of an app, including cohttp-eio, may each be using the raw domain manager to create their own pools of domains. Instead, they should all coordinate and use the same executor pool.