Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow for :foreigncall to transition to GC safe automatically #49933

Merged
merged 4 commits into from
Feb 15, 2025

Conversation

vchuravy
Copy link
Member

@vchuravy vchuravy commented May 23, 2023

This has been bouncing around as a idea for a while.
One of the challenges around time-to-safepoint has been Julia code
that is calling libraries.

Since foreign code will not include safepoints we see increased latency
when one thread is running a foreign-call and another wants to trigger GC.

The open design question here is:

  • Do we expose this as an option the user must "opt-in", e.g. by using a
    keyword arg to @ccall or a specific calling-convetion.
  • Or do we turn this on for all ccall, except for Julia runtime calls.

There is relativly little code outside the Julia runtime that needs to be "GC unsafe",
exception are programs that directly use the Julia C-API. Incidentially jl_adopt_thread
and @cfunction/@ccallable do the right thing and transition to "GC unsafe", regardless
of what state the thread currently is in.

I still need to figure out how to reliably detect Julia runtime calls, but I think we can
switch all other calls to "GC safe". We should also consider optimizations that mark large
regions of code without Julia runtime interactions as "GC safe" in particular numeric
for-loops.

Closes #57057

@gbaraldi
Copy link
Member

Since we already have the trampoline stuff, what about adding a safepoint into every call into the julia runtime? Similar to the safepoint in JIT function calls?

@Taaitaaiger
Copy link
Contributor

How would this affect packages that use CxxWrap or jlrs to provide access to C++ and Rust libraries, which call back into into Julia from foreign functions?

@vchuravy vchuravy mentioned this pull request Oct 3, 2023
@fingolfin
Copy link
Member

but I think we can
switch all other calls to "GC safe"

All in Julia itself? Perhaps.

All in all Julia packages? Definitely not! That would be a breaking change.

@kpamnany
Copy link
Contributor

kpamnany commented Oct 4, 2023

All in all Julia packages? Definitely not! That would be a breaking change.

Can you point at a for-instance @fingolfin?

@gbaraldi
Copy link
Member

gbaraldi commented Oct 4, 2023

The not common but existing case is for julia code to call c code that calls julia runtime functions.

@fingolfin
Copy link
Member

@kpamnany any package using a JLL linking against libjulia would be a first candidate, that includes all packages using CxxWrap (I count 29 in the general registry). For myself, GAP.jl.

I am not saying this is an impossible change, but only if there is a well thought out migration strategy that is coordinated with all stakeholders. But just saying "it probably won't affect that many packages, let's just do it", without a full analysis and without involving the community, repeats the "main" debacle. Let's not go there.

Overall, I prefer "code that is not as fast as it could be but is correct" over "code that is perhaps a bit faster, or not, but will occasionally crash or produce garbage...

A more viable alternative would be an opt-in solution -- but then of course many packages will miss this opportunity. A pity, but it wouldn't leave us worse than we are right now, new packages could benefit from it.

(If we had something like Rust editions, then for "new" packages this could become the default, I guess...?)

src/ccall.cpp Outdated Show resolved Hide resolved
@vchuravy
Copy link
Member Author

@eval function put(msg)
    cmsg = Base.cconvert(Cstring, msg)
    ptr = Base.unsafe_convert(Cstring, cmsg)
    $(Expr(:foreigncall, QuoteNode(:puts), Cint, Core.svec(Cstring), 0, true, QuoteNode(:ccall), :ptr, :cmsg))
end

Now is:

L28:                                              ; preds = %L15
; └
;  @ /home/vchuravy/src/julia2/gc_safe.jl:4 within `put`
  %ptls_field18 = getelementptr inbounds {}**, {}*** %tls_pgcstack, i64 2
  %ptls_load19 = load {}**, {}*** %ptls_field18, align 8
  %12 = bitcast {}** %ptls_load19 to i8*
  %gc_state = getelementptr inbounds i8, i8* %12, i64 25
  %13 = load atomic i8, i8* %gc_state monotonic, align 1
  store atomic i8 2, i8* %gc_state release, align 8
  %14 = call i32 bitcast (void ()* @jlplt_puts_689_got.jit to i32 (i64)*)(i64 %string_ptr) #10
  store atomic i8 %13, i8* %gc_state release, align 8
  %15 = getelementptr inbounds {}*, {}** %ptls_load19, i64 2
  %16 = bitcast {}** %15 to i64**
  %safepoint = load i64*, i64** %16, align 8
  fence syncscope("singlethread") seq_cst
  %17 = load volatile i64, i64* %safepoint, align 8
  fence syncscope("singlethread") seq_cst
  %frame.prev14 = load {}*, {}** %frame.prev, align 8
  %18 = bitcast {}*** %tls_pgcstack to {}**
  store {}* %frame.prev14, {}** %18, align 8
  ret i32 %14

The safepoint emission at that point needs to be cleaner.

@vtjnash that's probably closer to what it needs to be?

@kpamnany
Copy link
Contributor

kpamnany commented Nov 2, 2023

Note for when this lands: we should redo JuliaLang/MbedTLS.jl#265.

maleadt added a commit to JuliaGPU/CUDA.jl that referenced this pull request Feb 14, 2024
This allows the GC to run while potentially blocking in a CUDA library.
To make this safe, callbacks into Julia should again transition to
GC-unsafe mode.

It should be reimplemented when JuliaLang/julia#49933 lands.

Co-authored-by: Tim Besard <[email protected]>
@vchuravy vchuravy marked this pull request as ready for review January 23, 2025 16:33
@vchuravy
Copy link
Member Author

vchuravy commented Jan 23, 2025

@gbaraldi I rebased this and confirmed that:

@eval function put(msg)
    cmsg = Base.cconvert(Cstring, msg)
    ptr = Base.unsafe_convert(Cstring, cmsg)
    $(Expr(:foreigncall, QuoteNode(:puts), Cint, Core.svec(Cstring), 0, true, QuoteNode(:ccall), :ptr, :cmsg))
end
@code_llvm put("1")
  %gc_state = getelementptr inbounds i8, ptr %ptls_load14, i64 25
  %4 = load atomic i8, ptr %gc_state monotonic, align 1
  store atomic i8 2, ptr %gc_state release, align 8
  %5 = call i32 @jlplt_puts_1286_got.jit(i64 %bitcast_coercion2) #14
  store atomic i8 %4, ptr %gc_state release, align 8

@gbaraldi
Copy link
Member

gbaraldi commented Jan 23, 2025

This is missing some safepoints no?

STATIC_INLINE int8_t jl_gc_state_set(jl_ptls_t ptls, int8_t state,
                                     int8_t old_state)
{
    assert(old_state != JL_GC_PARALLEL_COLLECTOR_THREAD);
    assert(old_state != JL_GC_CONCURRENT_COLLECTOR_THREAD);
    jl_atomic_store_release(&ptls->gc_state, state);
    if (state == JL_GC_STATE_UNSAFE || old_state == JL_GC_STATE_UNSAFE)
        jl_gc_safepoint_(ptls);
    return old_state;
}

@vchuravy
Copy link
Member Author

I only posted an excerpt:

  %ptls_field13 = getelementptr inbounds i8, ptr %tls_pgcstack, i64 16
  %ptls_load14 = load ptr, ptr %ptls_field13, align 8
  %gc_state = getelementptr inbounds i8, ptr %ptls_load14, i64 25
  %4 = load atomic i8, ptr %gc_state monotonic, align 1
  store atomic i8 2, ptr %gc_state release, align 8
  %5 = call i32 @jlplt_puts_1286_got.jit(i64 %bitcast_coercion2) #14
  store atomic i8 %4, ptr %gc_state release, align 8
  %6 = getelementptr inbounds i8, ptr %ptls_load14, i64 16
  %safepoint = load ptr, ptr %6, align 8
  fence syncscope("singlethread") seq_cst
  %7 = load volatile i64, ptr %safepoint, align 8
  fence syncscope("singlethread") seq_cst

@gbaraldi
Copy link
Member

And yeah I was gonna say, the conditionality of the safepoint is probably a bit dumb, even in gc_state_set

base/meta.jl Show resolved Hide resolved
@vchuravy
Copy link
Member Author

@gbaraldi I would prefer @ccall gc_safe=true, instead of having to use the @assume_effects macro on top of it.

@vchuravy vchuravy requested a review from vtjnash February 6, 2025 09:39
base/c.jl Outdated Show resolved Hide resolved
base/c.jl Outdated Show resolved Hide resolved
src/ccall.cpp Outdated Show resolved Hide resolved
@vchuravy vchuravy added the backport 1.12 Change should be backported to release-1.12 label Feb 6, 2025
@@ -404,9 +434,16 @@ Example using an external library:

The string literal could also be used directly before the function
name, if desired `"libglib-2.0".g_uri_escape_string(...`

It's possible to declare the ccall as `gc_safe` by using the `gc_safe = true` option:
@ccall gc_safe=true strlen(s::Cstring)::Csize_t
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚲 If gc_safe=false is the default, perhaps marking a ccall as gc_safe could just be spelled like:

@ccall :gc_safe strlen(s::Cstring)::Csize_t

(similar to how the @atomic and @spawn and @assume_effects macros take Symbols to control behaviour)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We hope to be able to flip the default.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How would that work (flipping the default)? What's the transition strategy?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There isn't a firm strategy yet, right now anything using cfunction/@ccallable will transition correctly (this was done for foreign thread adoption) so the big issue is anything that uses the runtime. Already we have a trampoline in place and it may be feasible to add a transition check to that trampoline.

This would make calls into the runtime safe as well, but we wouldn't want to double transition for calls from Julia to the runtime, so we would need to manually mark those as gc_safe=false.

If the trampoline strategy works we should be able to flip the switch without any correctness issues, but it may be inefficient due to frequent transition and double transitions.

@KristofferC KristofferC mentioned this pull request Feb 6, 2025
32 tasks
@vchuravy vchuravy force-pushed the vc/ccall_safe_gc branch 2 times, most recently from ce14078 to 41ef0c4 Compare February 13, 2025 16:35
@vtjnash vtjnash added the needs news A NEWS entry is required for this change label Feb 13, 2025
@vchuravy vchuravy force-pushed the vc/ccall_safe_gc branch 2 times, most recently from 2ebb73e to 557119f Compare February 14, 2025 08:56
@vchuravy vchuravy added merge me PR is reviewed. Merge when all tests are passing and removed needs news A NEWS entry is required for this change labels Feb 14, 2025
@giordano
Copy link
Contributor

llvmpasses and windows tests are failing

@KristofferC KristofferC mentioned this pull request Feb 14, 2025
31 tasks
@giordano giordano merged commit 85458a0 into master Feb 15, 2025
7 checks passed
@giordano giordano deleted the vc/ccall_safe_gc branch February 15, 2025 17:51
@giordano giordano removed the merge me PR is reviewed. Merge when all tests are passing label Feb 15, 2025
@@ -4,6 +4,8 @@ Julia v1.12 Release Notes
New language features
---------------------

* The `@ccall` macro can now take a `gc_safe` argument, that if set to true allows the runtime to run garbage collection concurrently to the `ccall`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should probably have been in HISTORY since you want it backported

KristofferC pushed a commit that referenced this pull request Feb 15, 2025
This has been bouncing around as a idea for a while.
One of the challenges around time-to-safepoint has been Julia code
that is calling libraries.

Since foreign code will not include safepoints we see increased latency
when one thread is running a foreign-call and another wants to trigger
GC.

The open design question here is:
- Do we expose this as an option the user must "opt-in", e.g. by using a
  keyword arg to `@ccall` or a specific calling-convetion.
- Or do we turn this on for all ccall, except for Julia runtime calls.

There is relativly little code outside the Julia runtime that needs to
be "GC unsafe",
exception are programs that directly use the Julia C-API. Incidentially
`jl_adopt_thread`
and `@cfunction`/`@ccallable` do the right thing and transition to "GC
unsafe", regardless
of what state the thread currently is in.

I still need to figure out how to reliably detect Julia runtime calls,
but I think we can
switch all other calls to "GC safe". We should also consider
optimizations that mark large
regions of code without Julia runtime interactions as "GC safe" in
particular numeric
for-loops.

Closes #57057

---------

Co-authored-by: Gabriel Baraldi <[email protected]>
(cherry picked from commit 85458a0)
KristofferC added a commit that referenced this pull request Feb 17, 2025
Backported PRs:
- [x] #57346 <!-- lowering: Only try to define the method once -->
- [x] #57341 <!-- bpart: When backdating replace the entire bpart chain
-->
- [x] #57381 <!-- staticdata: Set min validation world to require world
-->
- [x] #57357 <!-- Only implicitly `using` Base, not Core -->
- [x] #57383 <!-- staticdata: Fix typo in recursive edge revalidation
-->
- [x] #57385 <!-- bpart: Move kind enum into its intended place -->
- [x] #57275 <!-- Compiler: fix unsoundness of getfield_tfunc on Tuple
Types -->
- [x] #57378 <!-- print admonition for auto-import only once per module
-->
- [x] #57392 <!-- [LateLowerGCFrame] fix PlaceGCFrameReset for
returns_twice -->
- [x] #57388 <!-- Bump JuliaSyntax to v1.0.2 -->
- [x] #57266 <!-- 🤖 [master] Bump the Statistics stdlib from d49c2bf to
77bd570 -->
- [x] #57395 <!-- lowering: fix has_fcall computation -->
- [x] #57204 <!-- Clarify mathematical definition of `gcd` -->
- [x] #56794 <!-- Make `Pairs` public -->
- [x] #57407 <!-- staticdata: corrected implementation of
jl_collect_new_roots -->
- [x] #57405 <!-- bpart: Also partition the export flag -->
- [x] #57420 <!-- Compiler: Fix check for IRShow definedness -->
- [x] #55875 <!-- fix `(-Inf)^-1` inconsistency -->
- [x] #57317 <!-- internals: add _defaultctor function for defining
ctors -->
- [x] #57406 <!-- bpart: Ignore guard bindings for ambiguity purposes
-->
- [x] #49933 <!-- Allow for :foreigncall to transition to GC safe
automatically -->
@KristofferC KristofferC removed the backport 1.12 Change should be backported to release-1.12 label Feb 17, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Assertion failure during undefined variable error reporting
9 participants