Ch. 4 — Concurrency

Goroutines: Lightweight Concurrency Built Into the Language

Goroutines are the feature that makes Go unique for concurrent systems programming. A goroutine is a function that executes concurrently with the rest of the program, managed by the Go runtime (not directly by the operating system).

The key difference from OS threads is cost:

  • OS thread: 1-8 MB of stack, expensive to create and destroy
  • Goroutine: ~2-4 KB initial stack (grows dynamically), creation in microseconds

A typical Go web server can handle hundreds of thousands of concurrent connections with goroutines, something impractical with one OS thread per connection.

Go's Scheduler: M:N Threading

The Go runtime implements an M:N scheduling model: it multiplexes M goroutines over N OS threads. The key parameters are:

  • GOMAXPROCS: number of OS threads that can execute Go code in parallel (default = number of CPUs)
  • Scheduling: Go's scheduler is cooperative with preemption support; goroutines can be rescheduled at function calls, I/O, and channel operations, and since Go 1.14 the runtime can also preempt long-running loops asynchronously
import "runtime"

fmt.Println("CPUs:", runtime.NumCPU())
fmt.Println("GOMAXPROCS:", runtime.GOMAXPROCS(0))  // 0 = read current value
fmt.Println("Active goroutines:", runtime.NumGoroutine())

Launching Goroutines

The go keyword turns any function call into a goroutine:

// Normal function
func greet(name string) {
    fmt.Printf("Hello, %s!\n", name)
}

func main() {
    // Execute in goroutine — returns immediately
    go greet("Ana")
    go greet("Luis")

    // Anonymous function in goroutine
    go func() {
        fmt.Println("Anonymous goroutine")
    }()

    // IMPORTANT! main() does not wait for goroutines
    // If main() exits, all goroutines die too
    time.Sleep(100 * time.Millisecond)  // simple wait (do not use in production)
}

`sync.WaitGroup`: Waiting for Multiple Goroutines

WaitGroup is the standard mechanism for waiting for a group of goroutines to finish:

import "sync"

func processItems(items []string) {
    var wg sync.WaitGroup

    for _, item := range items {
        wg.Add(1)  // increment before launching the goroutine

        go func(i string) {
            defer wg.Done()  // decrement when done
            fmt.Println("Processed:", i)
        }(item)  // pass item as parameter — crucial!
    }

    wg.Wait()  // block until counter reaches 0
    fmt.Println("All items processed")
}

Classic Error: Variable Capture in Closures

// INCORRECT before Go 1.22: all goroutines capture the same variable i
for i := 0; i < 5; i++ {
    go func() {
        fmt.Println(i)  // i could be any value, even 5
    }()
}

// CORRECT: pass i as a parameter
for i := 0; i < 5; i++ {
    go func(id int) {
        fmt.Println(id)  // each goroutine has its own copy of id
    }(i)
}

// Also correct in Go 1.22+: the loop creates a new i variable each iteration
for i := range 5 {
    go func() {
        fmt.Println(i)  // Go 1.22: i is a new variable each iteration
    }()
}

Data Races: The Danger of Concurrency Without Synchronization

A data race occurs when two goroutines access the same memory concurrently and at least one of them writes, without any synchronization:

// INCORRECT: data race
counter := 0
var wg sync.WaitGroup

for i := 0; i < 1000; i++ {
    wg.Add(1)
    go func() {
        defer wg.Done()
        counter++  // read-modify-write is not atomic
    }()
}

wg.Wait()
fmt.Println(counter)  // could be < 1000 — incorrect result

Detect with the Race Detector

go run -race main.go
# ==================
# WARNING: DATA RACE
# Write at 0x00c0000b4010 by goroutine 7:
#   main.main.func1()
# ...

`sync.Mutex`: Mutual Exclusion

A Mutex (Mutual Exclusion) guarantees that only one goroutine can execute a critical section at a time:

type Cache struct {
    mu   sync.Mutex
    data map[string]string
}

func NewCache() *Cache {
    return &Cache{data: make(map[string]string)}
}

func (c *Cache) Set(key, value string) {
    c.mu.Lock()
    defer c.mu.Unlock()  // always use defer to guarantee Unlock
    c.data[key] = value
}

func (c *Cache) Get(key string) (string, bool) {
    c.mu.Lock()
    defer c.mu.Unlock()
    val, ok := c.data[key]
    return val, ok
}

`sync.RWMutex`: Multiple Readers, Single Writer

When reads are frequent and writes are rare, RWMutex is more efficient:

type RWCache struct {
    mu   sync.RWMutex
    data map[string]string
}

func (c *RWCache) Set(key, value string) {
    c.mu.Lock()          // blocks readers AND writers
    defer c.mu.Unlock()
    c.data[key] = value
}

func (c *RWCache) Get(key string) (string, bool) {
    c.mu.RLock()         // allows concurrent readers, only blocks writers
    defer c.mu.RUnlock()
    val, ok := c.data[key]
    return val, ok
}

`sync.Once`: Guaranteed Single Initialization

Once guarantees that a function executes exactly once, even when called from multiple goroutines:

type Service struct {
    config *Config
}

var (
    service *Service
    once    sync.Once
)

func GetService() *Service {
    once.Do(func() {
        // This function runs exactly once
        service = &Service{
            config: loadConfigFromFile(),
        }
    })
    return service
}

`sync.Map`: Concurrent Map

For maps with frequent concurrent access, Go offers sync.Map:

var sm sync.Map

// Store, Load, LoadOrStore, Delete, Range
sm.Store("key", "value")

if val, ok := sm.Load("key"); ok {
    fmt.Println(val.(string))
}

sm.Range(func(k, v any) bool {
    fmt.Printf("%v: %v\n", k, v)
    return true  // continue iteration
})

sync.Map is optimized for two cases: when the map is written once and read many times, or when multiple goroutines read and write disjoint keys.

`sync/atomic`: Low-Level Atomic Operations

For simple counters, atomic is more efficient than a Mutex:

import "sync/atomic"

var counter int64  // the function-based API supports int32, int64, uint32, uint64, uintptr, and unsafe.Pointer

// Atomic operations guarantee atomicity without a Mutex
atomic.AddInt64(&counter, 1)
atomic.AddInt64(&counter, -1)
val := atomic.LoadInt64(&counter)
atomic.StoreInt64(&counter, 0)
ok := atomic.CompareAndSwapInt64(&counter, 0, 1)  // CAS

The Complete Pattern with Goroutines and Anonymous Functions

func processWithConcurrency(tasks []Task) []Result {
    results := make([]Result, len(tasks))
    var wg sync.WaitGroup
    var mu sync.Mutex

    for i, task := range tasks {
        wg.Add(1)
        go func(idx int, t Task) {
            defer wg.Done()

            result, err := processTask(t)
            if err != nil {
                log.Printf("error in task %d: %v", idx, err)
                return
            }

            mu.Lock()
            results[idx] = result  // each idx is distinct, so this write would be safe even without the mutex; the lock shows the general shared-state pattern
            mu.Unlock()
        }(i, task)
    }

    wg.Wait()
    return results
}

With goroutines mastered, in the next lesson we will learn channels — the communication mechanism between goroutines that makes concurrency in Go safe and elegant.

A goroutine costs ~2-4 KB, an OS thread costs ~1-8 MB
Goroutines are extremely lightweight compared to operating system threads. Go can run hundreds of thousands or even millions of goroutines simultaneously because the Go runtime manages its own scheduler and multiplexes goroutines over a fixed number of OS threads (GOMAXPROCS).
Detect data races with go run -race
Go's race detector detects unsynchronized concurrent memory accesses. Always use it during development and testing (go run -race main.go, or go test -race ./...). Data races are silent bugs that only manifest under specific load conditions and can be extremely difficult to reproduce.