Skip to content

Latest commit

 

History

History
57 lines (33 loc) · 7.29 KB

01 Share By Communicating.md

File metadata and controls

57 lines (33 loc) · 7.29 KB

Concurrent programming is a large topic and there is space only for some Go-specific highlights here. Concurrent programming in many environments is made difficult by the subtleties required to implement correct access to shared variables. Go encourages a different approach in which shared values are passed around on channels and, in fact, never actively shared by separate threads of execution. Only one goroutine has access to the value at any given time. Data races cannot occur, by design. To encourage this way of thinking we have reduced it to a slogan: Do not communicate by sharing memory; instead, share memory by communicating.

Slogan Explanation: "Do not communicate by sharing memory; instead, share memory by communicating."

The slogan captures the essence of Go's concurrency model, which is heavily channel-oriented. In more traditional models like that of languages such as Java, C++, or Python (using threads), you would often have shared variables that multiple threads can access. You would use locks, semaphores, or other synchronization mechanisms to make sure that only one thread can access a particular piece of data at a time.

This often leads to complex, hard-to-reason code and the potential for bugs like deadlocks and race conditions. The more you share, the more you have to lock, and the harder your code is to understand.

Example/Analogy

Imagine a group of people (threads) trying to read and update a physical notice board (shared memory). If everyone tries to update the notice board at the same time, there would be chaos (data races). The traditional approach would be to place a guard (mutex/lock) at the notice board, and only one person can talk to the guard and update the board at a given time. However, this can lead to people waiting in line (deadlocks, contention) and complicates the rules for interaction.

In Go's model, instead of having a shared notice board, each person (goroutine) has a mailbox (channel). If you want to update something, you send a message (data) to that person's mailbox. Only that person can access their mailbox, making sure that only one entity is dealing with the data at a given time. No locks or guards are needed, and yet everyone gets to communicate effectively without chaos.

This means data is never actively "shared" between threads of execution; instead, you "communicate" data from one thread to another safely via channels. This design avoids many of the common pitfalls associated with shared memory models.

This approach can be taken too far. Reference counts may be best done by putting a mutex around an integer variable, for instance. But as a high-level approach, using channels to control access makes it easier to write clear, correct programs.

Reference Count Approach

The text also mentions that sometimes this approach can be taken "too far." In some cases, simpler synchronization mechanisms might be more appropriate.

A "reference count" is a simple integer variable that keeps track of how many references exist to a particular resource. When a new reference to a resource is made, this count is incremented. When a reference is destroyed or goes out of scope, the count is decremented. When the count reaches zero, the resource can safely be freed.

In some languages, this might be done by wrapping an integer in a mutex to ensure that increment and decrement operations are atomic, i.e., they appear to be instantaneous and aren't interrupted by other operations.

In Go, you could technically create a goroutine whose sole job is to manage this count, and use channels to send "increment" or "decrement" commands to it. However, for such a simple task, using a mutex around an integer variable is generally much simpler and more efficient.

Therefore, while Go's concurrency model offers a robust and straightforward way to deal with many complex issues, it's not always the best tool for every job. Sometimes a simpler, more traditional approach may be more appropriate.

One way to think about this model is to consider a typical single-threaded program running on one CPU. It has no need for synchronization primitives. Now run another such instance; it too needs no synchronization. Now let those two communicate; if the communication is the synchronizer, there's still no need for other synchronization. Unix pipelines, for example, fit this model perfectly. Although Go's approach to concurrency originates in Hoare's Communicating Sequential Processes (CSP), it can also be seen as a type-safe generalization of Unix pipes.

This pessage outlines the benefits of thinking about concurrency in terms of isolated units of execution that communicate with one another rather than sharing memory. This approach aligns with the philosophy of Communicating Sequential Processes (CSP) and, as the text notes, can be seen as a type-safe generalization of Unix pipes.

Single-Threaded Program Running on One CPU

Consider a single-threaded program running on one CPU. It has complete control over its own memory and does not have to worry about synchronization with other threads since there aren't any. Everything it does is sequential: it reads data, performs computations, writes data, and so on.

Another Instance Running Independently

Now consider running another instance of this program on the same system. This second instance also has no need for synchronization primitives within itself, because, like the first one, it operates sequentially and independently. Both programs can perform their tasks without knowing or caring about the existence of the other.

Communication as the Synchronizer

Now, what if these two independent programs need to communicate? They can do so by sending messages to each other. This message-passing becomes the synchronizing mechanism. Because each program is internally sequential and only communicates with the other via messages, there's no need for additional synchronization primitives like mutexes or semaphores.

Unix Pipelines

In Unix, programs can be connected using pipes. Each program reads from its standard input and writes to its standard output. When you connect these using pipes, the output of one program becomes the input of another. This is an example of communicating sequential processes, albeit less type-safe. Each program in the pipeline operates independently and is synchronized only by the data it reads from its input and writes to its output.

# Unix pipeline example
program1 | program2 | program3

Here, program1, program2, and program3 are like isolated instances. They don't need internal synchronization, and they communicate via pipes, which synchronize them.

Go's Approach (CSP and Type-Safety)

Go's concurrency model builds upon this philosophy. It offers goroutines (which are like lightweight threads) and channels for type-safe communication between them. This way, each goroutine can operate like an isolated, single-threaded program that communicates with other goroutines via channels. This is type-safe, meaning that the language checks that the types of data being sent and received match up, which adds a layer of safety not present in Unix pipes.

In summary, the philosophy is: Instead of having to lock shared resources to achieve synchronization, isolate units of computation and have them communicate in a synchronized manner. This often leads to simpler, more manageable, and less error-prone code.