Skip to content
This repository has been archived by the owner on Nov 5, 2018. It is now read-only.

Questions/Improvement suggestions for scoped thread implementation #29

Closed
oliver-giersch opened this issue Jul 25, 2018 · 8 comments
Closed

Comments

@oliver-giersch
Copy link
Contributor

oliver-giersch commented Jul 25, 2018

As mentioned in #27 I had a few questions about the scoped thread code, which could possibly also be suggestions for improvements, but maybe there are some subtleties involved that I don't fully understand.

  1. The Scope struct uses a somewhat strange custom linked list for storing the boxed destructors/deferred closures. I am not sure why this is, I've tried replacing it with a simple Vec as well as std::collections::LinkedList and both solutions worked equally well, although I did not do any in-depth testing and panicking destructors might have unexpected side effects

  2. The linked list is wrapped in a RefCell like so: dtors: RefCell<Option<DtorChain<'a, ()>>>.
    When I tried replacing the DtorChain with a regular Vec or LinkedList I also removed the RefCell and changed all relevant &Scope and &self references to mutable ones and there were no aliasing problems, so I am not sure why the RefCell wrapping exists.

  3. Scope implements the Drop trait. Since drop_all() is called manually before the end of the scope function, implementing Drop should not be necessary. Also, the signature for drop_all could be changed to fn drop_all(self).

  4. The implementation uses a fairly arcane procedure to allocate a Box with std::mem::ManuallyDrop and std::mem::unitialized and casts a raw pointer into an usize in order to smuggle it across thread boundaries and into the two ScopedJoinHandles.
    I thought there must be a better and more readable/understandable way to express this, but haven't figured one out yet.
    The stdlib uses a Arc<UnsafeCell<Option<T>> which is smuggled across thread boundaries.
    Not sure if using an Arc might have a negative impact on performance, but using Option<T> instead of mem::uninitialized() would be preferable I believe.

  5. This is purely my own opinion but I believe the scoped thread module could benefit from a few rearrangements here and there, keeping structs and their impls together (ScopedJoinHandle is all over the place for instance) and maybe putting some things into submodules.

@jeehoonkang
Copy link
Collaborator

First of all, thank you so much for your detailed and thoughtful comments! I think we can discuss several aspects of thread and improve it. Let's start!

Regarding 1 and 2: I think we want methods to use &Scope and &self rather than &mut Scope and &mut self, because it's easier to use for users. For example, inside a scope, users may want to store &'a scope inside multiple structs, etc. We use a strange custom linked list and RefCell for this reason. In order to improve the code quality, maybe we can implement a container that works with shared references, and replace DtorChain with the container.

Regarding 3: suppose code inside a scope() panics. Then Scope will be dropped, and we need to release resources accordingly (i.e. joining threads and executing deferred functions). That's the reason why we need to have drop for Scope, I think. (Actually, the comment I wrote is wrong. I'm sending a PR on this...)

Regarding 4, Rc is fine because scope and ScopedJoinHandle resides only in a single thread, but if we're going to allow Send/Sync for them, definitely Arc will be needed. Also, I prefer mem::uninitialized() to Option, because it's the way we will normally do in C/C++... Yes it's unsafe, but as far as we will document and explain it, I think it's okay.

Regarding 5, I'd like to hear your opinion in more details! Could you send us a PR?

@ghost
Copy link

ghost commented Jul 26, 2018

Also, I prefer mem::uninitialized() to Option, because it's the way we will normally do in C/C++... Yes it's unsafe, but as far as we will document and explain it, I think it's okay.

While mem::uninitialized() is fine for concrete types, be wary of generic types. In fact, that's exactly what we do in scoped threads:

Box::<ManuallyDrop<T>>::new(unsafe { mem::uninitialized() })

The problem with generic types is that they can be ! and calling mem::uninitialized::<!>() is UB. So we should fix our code. :)

I've been bitten by this case of UB in crossbeam-channel before: crossbeam-rs/crossbeam-channel#76
The code was initializing some data of type T (which could be !) using mem::uninitialized and causing a SIGSEGV, even though the value wasn't really used!

  1. This is purely my own opinion but I believe the scoped thread module could benefit from a few rearrangements here and there, keeping structs and their impls together (ScopedJoinHandle is all over the place for instance) and maybe putting some things into submodules.

I'm also interested in how you'd rearrange the modules. This is our current module structure:
https://docs.rs/crossbeam-utils/0.5.0/crossbeam_utils/thread/index.html

@jeehoonkang
Copy link
Collaborator

@stjepang wow, I didn't know that... Yes, in that case, we should definitely remove calls to mem::uninitialized. By the way, do you have a reference that says mem::uninitialized::<!>() is UB? Google doesn't give me useful results...

@ghost
Copy link

ghost commented Jul 26, 2018

@oliver-giersch
Copy link
Contributor Author

@jeehoonkang and @stjepang First off, thanks to both of you for your clarifying comments!

Regarding 1) and 2) the argument regarding flexibility through interior mutability makes sense to me, but it should be possible to simple use dtors: RefCell<Vec<Box<dyn FnBox<()> + 'scope>> I think.
Would require some restructuring of the fn drop_all(&mut self) and Drop implementation.
Regarding 3) and the panic issue I get the explanation, I hadn't thought about that.

Regarding 4) I have fiddled around a little and think I might have found a solution does not require any arcane stuff (note that I am not 100% sure yet if this is totally sound, but compiles and apparently works):

unsafe impl<T> Send for UnsafeBox<T> where T: Send {}
unsafe impl<T> Sync for UnsafeBox<T> where T: Send {}

#[derive(Debug)]
pub struct UnsafeBox<T> {
    pointer: std::ptr::NonNull<Option<T>>
}

//not sure why, but Clone and Copy don't seem to be derivable
impl<T> Clone for UnsafeBox<T> {
    #[inline]
    fn clone(&self) -> Self {
        Self {
            pointer: self.pointer
        }
    }
}

impl<T> Copy for UnsafeBox<T> {}

impl<T> UnsafeBox<T> where T: Send {
    #[inline]
    pub fn new() -> Self {
        let boxed = Box::new(None);
        Self {
            pointer: unsafe { NonNull::new_unchecked(Box::into_raw(boxed)) }
        }
    }

    #[inline]
    pub unsafe fn unbox(self) -> Option<T> {
        let boxed = Box::from_raw(self.pointer.as_ptr());
        *boxed
    }

    #[inline]
    pub unsafe fn as_mut(&mut self) -> &mut Option<T> {
        self.pointer.as_mut()
    }
}

This is just an unsafe wrapper around a raw pointer from box that implements Send and Sync in order to smuggle it across the thread boundary of a scoped thread. This is obviously totally unsound in any other context (or rather most) but scoped threads.

The UnsafeBox can be copied into the thread closure, written into by the thread through *result.as_mut() = Some(f()); and consumed in the join() call by calling unbox.
This should also avoid the UB issue mentioned by @stjepang (which I did not know about either).

Regarding 5) I will post my ideas later, though I should note that this is pretty much a question of personal preference that I prefer to split code into a lot of submodules in order to keep files focused and readable (especially with doc comments I find .rs files become bloated very quickly).

@oliver-giersch
Copy link
Contributor Author

UnsafeBox is a terrible name to be honest. Something along the lines of ScopedThreadResult or ScopedThreadPacket (more in line with the stdlib) would be better, and unbox() should probably just be called into_inner()

@ghost
Copy link

ghost commented Sep 29, 2018

@oliver-giersch Can we close this issue? Is there anything here that we haven't resolved yet?

@oliver-giersch
Copy link
Contributor Author

As far as I'm concerned thus can be closed.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Development

No branches or pull requests

2 participants