← Mandelbrot Set | Series Overview


The Problem: The Simplest Unsolved Math Problem

The Collatz conjecture (3n+1 problem) is deceptively simple:

  1. Start with any positive integer n
  2. If n is even: divide by 2
  3. If n is odd: multiply by 3 and add 1
  4. Repeat until you reach 1

The conjecture: Every positive integer eventually reaches 1.

Example (n=12):

12 → 6 → 3 → 10 → 5 → 16 → 8 → 4 → 2 → 1

The mystery: The conjecture has been verified for every starting value up to 2^68, but never proven. It’s one of mathematics’ most famous unsolved problems.

The parallel challenge: Each starting number is independent (easy to parallelize), but storing results is shared state (hard to get right).

The Sequential Implementation

package main

import (
    "fmt"
)

func collatzSequence(n uint64) []uint64 {
    seq := []uint64{n}

    for n != 1 {
        if n%2 == 0 {
            n = n / 2
        } else {
            n = 3*n + 1
        }
        seq = append(seq, n)
    }

    return seq
}

func collatzLength(n uint64) int {
    length := 1

    for n != 1 {
        if n%2 == 0 {
            n = n / 2
        } else {
            n = 3*n + 1
        }
        length++
    }

    return length
}

func main() {
    for i := uint64(1); i <= 10; i++ {
        length := collatzLength(i)
        fmt.Printf("%2d: length %d\n", i, length)
    }
}

Output:

 1: length 1
 2: length 2
 3: length 8
 4: length 3
 5: length 6
 6: length 9
 7: length 17
 8: length 4
 9: length 20
10: length 7

Simple! Now let’s find interesting patterns.

Problem 1: Find Longest Sequence

Which starting number up to N produces the longest sequence?

type result struct {
    number uint64
    length int
}

func findLongestSequential(limit uint64) result {
    longest := result{1, 1}

    for n := uint64(1); n <= limit; n++ {
        length := collatzLength(n)
        if length > longest.length {
            longest = result{n, length}
        }
    }

    return longest
}

func main() {
    r := findLongestSequential(1000000)
    fmt.Printf("Longest: %d (length %d)\n", r.number, r.length)
}

Output:

Longest: 837799 (length 525)

Time: ~1.2s for 1 million numbers.

Let’s parallelize!

Naive Parallel: Worker Pool

import "sync"

func findLongestParallel(limit uint64, workers int) result {
    jobs := make(chan uint64, 1000)
    results := make(chan result, 1000)

    var wg sync.WaitGroup

    // Start workers
    for w := 0; w < workers; w++ {
        wg.Add(1)
        go func() {
            defer wg.Done()

            localLongest := result{0, 0}

            for n := range jobs {
                length := collatzLength(n)
                if length > localLongest.length {
                    localLongest = result{n, length}
                }
            }

            results <- localLongest
        }()
    }

    // Send jobs
    go func() {
        for n := uint64(1); n <= limit; n++ {
            jobs <- n
        }
        close(jobs)
    }()

    // Wait for workers
    go func() {
        wg.Wait()
        close(results)
    }()

    // Find global longest
    longest := result{0, 0}
    for r := range results {
        if r.length > longest.length {
            longest = r
        }
    }

    return longest
}

Benchmark (8 workers, 1M numbers):

Sequential: 1.2s
Parallel:   0.18s
Speedup:    6.7x

Excellent speedup! Each number is independent, no shared state needed.
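One refinement worth knowing (a sketch, not part of the original code): sending one number per channel operation is fine here, but when per-item work is very fast the channel itself can become the bottleneck. Sending ranges instead of individual numbers amortizes that overhead:

```go
package main

import (
	"fmt"
	"sync"
)

type result struct {
	number uint64
	length int
}

func collatzLength(n uint64) int {
	length := 1
	for n != 1 {
		if n%2 == 0 {
			n /= 2
		} else {
			n = 3*n + 1
		}
		length++
	}
	return length
}

// findLongestBatched sends [start, end] ranges instead of single
// numbers, so each channel operation covers batchSize items.
func findLongestBatched(limit uint64, workers int, batchSize uint64) result {
	type batch struct{ start, end uint64 }
	jobs := make(chan batch, 100)
	results := make(chan result, workers)

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			local := result{0, 0}
			for b := range jobs {
				for n := b.start; n <= b.end; n++ {
					if l := collatzLength(n); l > local.length {
						local = result{n, l}
					}
				}
			}
			results <- local
		}()
	}

	go func() {
		for n := uint64(1); n <= limit; n += batchSize {
			end := n + batchSize - 1
			if end > limit {
				end = limit
			}
			jobs <- batch{n, end}
		}
		close(jobs)
	}()

	wg.Wait()
	close(results)

	longest := result{0, 0}
	for r := range results {
		if r.length > longest.length {
			longest = r
		}
	}
	return longest
}

func main() {
	fmt.Println(findLongestBatched(100000, 8, 256))
}
```

The batch size is a tuning knob: large enough to amortize channel traffic, small enough to keep workers load-balanced.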

Problem 2: Search for Counterexamples

What if we want to find a number that doesn’t reach 1? (None found yet, but the search continues.)

func searchCounterexample(start, limit uint64, workers int) {
    jobs := make(chan uint64, 1000)
    var wg sync.WaitGroup

    // found carries the counterexample (buffered so the finder never
    // blocks); closing done tells all other goroutines to stop.
    // Workers must not receive from found directly: that would consume
    // the value before the final report below.
    found := make(chan uint64, 1)
    done := make(chan struct{})
    var once sync.Once

    for w := 0; w < workers; w++ {
        wg.Add(1)
        go func() {
            defer wg.Done()

            for n := range jobs {
                // Check if another worker already found one
                select {
                case <-done:
                    return
                default:
                }

                // Run Collatz, bounding iterations so even a diverging
                // sequence terminates
                current := n
                for i := 0; i < 10000000 && current != 1; i++ {
                    if current%2 == 0 {
                        current = current / 2
                    } else {
                        current = 3*current + 1
                    }
                }

                if current != 1 {
                    // Only the first finder reports and signals shutdown
                    once.Do(func() {
                        found <- n
                        close(done)
                    })
                    return
                }
            }
        }()
    }

    // Send jobs, abandoning the range once a counterexample appears
    go func() {
        defer close(jobs)
        for n := start; n <= limit; n++ {
            select {
            case jobs <- n:
            case <-done:
                return
            }
        }
    }()

    wg.Wait()

    select {
    case n := <-found:
        fmt.Printf("Counterexample found: %d\n", n)
    default:
        fmt.Println("No counterexample found in range")
    }
}

This allows early termination when a counterexample is found. (Strictly speaking, exceeding the iteration cap only flags a candidate for closer inspection; no genuine counterexample exists in any verified range.)

Problem 3: Caching Results

Many numbers share subsequences:

  • 10 → 5 → 16 → 8 → 4 → 2 → 1
  • 20 → 10 → … (shares tail with 10)

If we cache lengths, we can avoid recomputation:

type cache struct {
    mu    sync.RWMutex
    store map[uint64]int
}

func newCache() *cache {
    return &cache{
        store: make(map[uint64]int),
    }
}

func (c *cache) get(n uint64) (int, bool) {
    c.mu.RLock()
    defer c.mu.RUnlock()
    length, ok := c.store[n]
    return length, ok
}

func (c *cache) set(n uint64, length int) {
    c.mu.Lock()
    defer c.mu.Unlock()
    c.store[n] = length
}

func collatzLengthCached(n uint64, c *cache) int {
    // Check cache
    if length, ok := c.get(n); ok {
        return length
    }

    original := n
    length := 0

    for n != 1 {
        // Check cache mid-sequence
        if cached, ok := c.get(n); ok {
            length += cached
            c.set(original, length)
            return length
        }

        if n%2 == 0 {
            n = n / 2
        } else {
            n = 3*n + 1
        }
        length++
    }

    length++ // Count the final 1
    c.set(original, length)
    return length
}

Benchmark (1M numbers):

Without cache: 1.2s
With cache:    0.8s
Speedup:       1.5x

Modest improvement. But there’s a problem…
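One reason the gain is modest: only the starting number gets cached, and every intermediate value is thrown away. A variant that memoizes the whole path (a single-threaded sketch, with no overflow guard) hits the cache far more often:

```go
package main

import "fmt"

// collatzLengthMemo caches the length for every value visited on the
// way to 1, not just the starting number. Single-threaded sketch:
// plain map, no locking.
func collatzLengthMemo(n uint64, memo map[uint64]int) int {
	if n == 1 {
		return 1
	}
	if l, ok := memo[n]; ok {
		return l
	}

	// Walk down recording the path, then fill the cache on the way back.
	var path []uint64
	m := n
	for m != 1 {
		if l, ok := memo[m]; ok {
			// Known suffix: assign lengths to the recorded prefix.
			for i := len(path) - 1; i >= 0; i-- {
				l++
				memo[path[i]] = l
			}
			return memo[n]
		}
		path = append(path, m)
		if m%2 == 0 {
			m /= 2
		} else {
			m = 3*m + 1
		}
	}

	// Reached 1 without a cache hit: lengths count back from 1.
	l := 1
	for i := len(path) - 1; i >= 0; i-- {
		l++
		memo[path[i]] = l
	}
	return memo[n]
}

func main() {
	memo := make(map[uint64]int)
	for n := uint64(1); n <= 10; n++ {
		fmt.Printf("%2d: length %d\n", n, collatzLengthMemo(n, memo))
	}
}
```

The contention problem discussed next applies to this variant just as much: it writes even more cache entries per query.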

The Cache Contention Problem

With parallel workers accessing the cache:

func findLongestCached(limit uint64, workers int) result {
    c := newCache()
    jobs := make(chan uint64, 1000)
    results := make(chan result, 1000)

    var wg sync.WaitGroup

    for w := 0; w < workers; w++ {
        wg.Add(1)
        go func() {
            defer wg.Done()

            localLongest := result{0, 0}

            for n := range jobs {
                length := collatzLengthCached(n, c)
                if length > localLongest.length {
                    localLongest = result{n, length}
                }
            }

            results <- localLongest
        }()
    }

    go func() {
        for n := uint64(1); n <= limit; n++ {
            jobs <- n
        }
        close(jobs)
    }()

    go func() {
        wg.Wait()
        close(results)
    }()

    longest := result{0, 0}
    for r := range results {
        if r.length > longest.length {
            longest = r
        }
    }

    return longest
}

Benchmark (8 workers):

Sequential with cache:  0.8s
Parallel with cache:    0.5s
Parallel without cache: 0.18s

The cache makes it slower! The mutex contention overwhelms the benefit.
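If some sharing is still wanted, a middle ground between one global mutex and fully private caches is lock striping: keys map to independently locked shards, so writers rarely collide. A sketch (the type name and the stripe count of 64 are our choices, not from the original code):

```go
package main

import (
	"fmt"
	"sync"
)

const stripes = 64

// stripedCache splits the key space across independently locked
// shards, so concurrent writers usually touch different locks.
type stripedCache struct {
	shards [stripes]struct {
		mu    sync.RWMutex
		store map[uint64]int
	}
}

func newStripedCache() *stripedCache {
	c := &stripedCache{}
	for i := range c.shards {
		c.shards[i].store = make(map[uint64]int)
	}
	return c
}

func (c *stripedCache) get(n uint64) (int, bool) {
	s := &c.shards[n%stripes]
	s.mu.RLock()
	defer s.mu.RUnlock()
	l, ok := s.store[n]
	return l, ok
}

func (c *stripedCache) set(n uint64, length int) {
	s := &c.shards[n%stripes]
	s.mu.Lock()
	defer s.mu.Unlock()
	s.store[n] = length
}

func main() {
	c := newStripedCache()
	c.set(27, 112)
	if l, ok := c.get(27); ok {
		fmt.Println("cached:", l)
	}
}
```

Go’s sync.Map is another option for read-mostly workloads, but write-heavy memoization like this tends to favor striping or, better still, the per-worker caches shown next.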

Solution: Per-Worker Caches

func findLongestPerWorkerCache(limit uint64, workers int) result {
    jobs := make(chan uint64, 1000)
    results := make(chan result, 1000)

    var wg sync.WaitGroup

    for w := 0; w < workers; w++ {
        wg.Add(1)
        go func() {
            defer wg.Done()

            localCache := newCache()
            localLongest := result{0, 0}

            for n := range jobs {
                length := collatzLengthCached(n, localCache)
                if length > localLongest.length {
                    localLongest = result{n, length}
                }
            }

            results <- localLongest
        }()
    }

    go func() {
        for n := uint64(1); n <= limit; n++ {
            jobs <- n
        }
        close(jobs)
    }()

    go func() {
        wg.Wait()
        close(results)
    }()

    longest := result{0, 0}
    for r := range results {
        if r.length > longest.length {
            longest = r
        }
    }

    return longest
}

Benchmark:

Parallel no cache:           0.18s
Parallel shared cache:       0.50s
Parallel per-worker cache:   0.14s

Even faster! Each worker has its own cache, so there is no contention. The tradeoff: workers cannot see each other’s entries, so some lengths get computed more than once.

Advanced: Sequence Fingerprinting

Store full sequences to find patterns:

type sequenceDB struct {
    mu        sync.RWMutex
    sequences map[uint64][]uint64
}

func (db *sequenceDB) store(n uint64, seq []uint64) {
    db.mu.Lock()
    defer db.mu.Unlock()
    db.sequences[n] = seq
}

func (db *sequenceDB) get(n uint64) ([]uint64, bool) {
    db.mu.RLock()
    defer db.mu.RUnlock()
    seq, ok := db.sequences[n]
    return seq, ok
}

func exploreSequences(limit uint64, workers int) *sequenceDB {
    db := &sequenceDB{
        sequences: make(map[uint64][]uint64),
    }

    jobs := make(chan uint64, 1000)
    var wg sync.WaitGroup

    for w := 0; w < workers; w++ {
        wg.Add(1)
        go func() {
            defer wg.Done()

            for n := range jobs {
                seq := collatzSequence(n)
                db.store(n, seq)
            }
        }()
    }

    go func() {
        for n := uint64(1); n <= limit; n++ {
            jobs <- n
        }
        close(jobs)
    }()

    wg.Wait()
    return db
}

Problem: Storing millions of sequences uses massive memory.

Solution: Store only “interesting” sequences (longest, highest peak, etc.).
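One way to make that selective storage concrete is to keep only record-setting sequences, so memory grows with the number of records rather than with the size of the range. A sketch (recordKeeper is an illustrative name):

```go
package main

import (
	"fmt"
	"sync"
)

func collatzSequence(n uint64) []uint64 {
	seq := []uint64{n}
	for n != 1 {
		if n%2 == 0 {
			n /= 2
		} else {
			n = 3*n + 1
		}
		seq = append(seq, n)
	}
	return seq
}

// recordKeeper stores a sequence only if it sets a new length record.
type recordKeeper struct {
	mu      sync.Mutex
	best    int
	records map[uint64][]uint64
}

func newRecordKeeper() *recordKeeper {
	return &recordKeeper{records: make(map[uint64][]uint64)}
}

// offer keeps seq only if it beats the current record; it reports
// whether the sequence was stored.
func (rk *recordKeeper) offer(n uint64, seq []uint64) bool {
	rk.mu.Lock()
	defer rk.mu.Unlock()
	if len(seq) <= rk.best {
		return false
	}
	rk.best = len(seq)
	rk.records[n] = seq
	return true
}

func main() {
	rk := newRecordKeeper()
	for n := uint64(1); n <= 30; n++ {
		rk.offer(n, collatzSequence(n))
	}
	fmt.Printf("record holders stored: %d (best length %d)\n",
		len(rk.records), rk.best)
}
```

The single mutex here is fine because offers that lose the length check are cheap; only new records pay for storage.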

Finding the Highest Peak

Some sequences reach very large values before converging:

func collatzPeak(n uint64) uint64 {
    peak := n

    for n != 1 {
        if n%2 == 0 {
            n = n / 2
        } else {
            n = 3*n + 1
        }

        if n > peak {
            peak = n
        }
    }

    return peak
}

func findHighestPeak(limit uint64, workers int) result {
    jobs := make(chan uint64, 1000)
    results := make(chan result, 1000)

    var wg sync.WaitGroup

    for w := 0; w < workers; w++ {
        wg.Add(1)
        go func() {
            defer wg.Done()

            // Reuse result, with the peak stored in the length field
            // (int is 64-bit on mainstream platforms, so these peaks fit)
            localHighest := result{0, 0}

            for n := range jobs {
                peak := collatzPeak(n)
                if peak > uint64(localHighest.length) {
                    localHighest = result{n, int(peak)}
                }
            }

            results <- localHighest
        }()
    }

    go func() {
        for n := uint64(1); n <= limit; n++ {
            jobs <- n
        }
        close(jobs)
    }()

    go func() {
        wg.Wait()
        close(results)
    }()

    highest := result{0, 0}
    for r := range results {
        if r.length > highest.length {
            highest = r
        }
    }

    return highest
}

func main() {
    r := findHighestPeak(100000, 8)
    fmt.Printf("Highest peak: %d → %d\n", r.number, r.length)
}

Output:

Highest peak: 77671 → 1570824736

Starting from 77,671, the sequence climbs past 1.5 billion before eventually descending to 1!

Distributed Exploration: Sharding

For very large ranges, shard across multiple machines:

type shard struct {
    start, end uint64
}

func distributeWork(total uint64, shards int) []shard {
    perShard := total / uint64(shards)
    result := make([]shard, shards)

    for i := 0; i < shards; i++ {
        result[i] = shard{
            start: uint64(i)*perShard + 1,
            end:   uint64(i+1) * perShard,
        }
    }

    // Last shard takes remainder
    result[shards-1].end = total

    return result
}

func exploreShardParallel(s shard, workers int) result {
    // NOTE: findLongestParallel always scans from 1, so this reuse
    // only sizes the work correctly; a real shard search must cover
    // the range [s.start, s.end] itself.
    return findLongestParallel(s.end-s.start+1, workers)
}
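As written, exploreShardParallel reuses findLongestParallel, which always counts from 1, so it never actually searches the shard’s own range. A range-aware variant closes that gap (a sketch; findLongestInRange is our name, not from the original code):

```go
package main

import (
	"fmt"
	"sync"
)

type result struct {
	number uint64
	length int
}

func collatzLength(n uint64) int {
	length := 1
	for n != 1 {
		if n%2 == 0 {
			n /= 2
		} else {
			n = 3*n + 1
		}
		length++
	}
	return length
}

// findLongestInRange is findLongestParallel generalized to an
// arbitrary range: workers draw numbers from [start, end], which is
// what per-shard search actually requires.
func findLongestInRange(start, end uint64, workers int) result {
	jobs := make(chan uint64, 1000)
	results := make(chan result, workers)

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			local := result{0, 0}
			for n := range jobs {
				if l := collatzLength(n); l > local.length {
					local = result{n, l}
				}
			}
			results <- local
		}()
	}

	go func() {
		for n := start; n <= end; n++ {
			jobs <- n
		}
		close(jobs)
	}()

	wg.Wait()
	close(results)

	longest := result{0, 0}
	for r := range results {
		if r.length > longest.length {
			longest = r
		}
	}
	return longest
}

func main() {
	fmt.Println(findLongestInRange(50001, 100000, 8))
}
```

With this in place, each shard would call findLongestInRange(s.start, s.end, workers) and the global reduction stays the same.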

func main() {
    shards := distributeWork(100000000, 10)

    var wg sync.WaitGroup
    results := make(chan result, len(shards))

    for _, s := range shards {
        wg.Add(1)
        go func(s shard) {
            defer wg.Done()
            r := exploreShardParallel(s, 8)
            results <- r
        }(s)
    }

    wg.Wait()
    close(results)

    // Find global longest
    longest := result{0, 0}
    for r := range results {
        if r.length > longest.length {
            longest = r
        }
    }

    fmt.Printf("Global longest: %d (length %d)\n",
        longest.number, longest.length)
}

This pattern scales to cluster computing.

Overflow Detection

Large numbers can overflow uint64:

import (
    "fmt"
    "math"
)

func collatzSafe(n uint64) (uint64, error) {
    if n%2 == 0 {
        return n / 2, nil
    }

    // Check for overflow before computing 3n+1
    if n > (math.MaxUint64-1)/3 {
        return 0, fmt.Errorf("overflow: %d too large", n)
    }

    return 3*n + 1, nil
}

func collatzLengthSafe(n uint64, maxIter int) (int, error) {
    length := 1

    for i := 0; i < maxIter && n != 1; i++ {
        next, err := collatzSafe(n)
        if err != nil {
            return 0, err
        }
        n = next
        length++
    }

    if n != 1 {
        return 0, fmt.Errorf("did not converge after %d iterations", maxIter)
    }

    return length, nil
}

For arbitrary precision, use math/big:

import "math/big"

func collatzBig(n *big.Int) *big.Int {
    result := new(big.Int)

    if n.Bit(0) == 0 { // Even
        return result.Rsh(n, 1) // Divide by 2
    }

    // 3n+1
    result.Mul(n, big.NewInt(3))
    return result.Add(result, big.NewInt(1))
}

This allows exploring numbers beyond uint64 range.

Benchmarking Cache Strategies

func BenchmarkCacheStrategies(b *testing.B) {
    strategies := []struct {
        name string
        fn   func(uint64, int) result
    }{
        {"No Cache", findLongestParallel},
        {"Shared Cache", findLongestCached},
        {"Per-Worker Cache", findLongestPerWorkerCache},
    }

    limit := uint64(1000000)

    for _, s := range strategies {
        b.Run(s.name, func(b *testing.B) {
            for i := 0; i < b.N; i++ {
                s.fn(limit, 8)
            }
        })
    }
}

Results:

No Cache:           180ms/op
Shared Cache:       500ms/op
Per-Worker Cache:   140ms/op

Progress Tracking

For long computations:

func findLongestWithProgress(limit uint64, workers int) result {
    jobs := make(chan uint64, 1000)
    results := make(chan result, 1000)
    progress := make(chan uint64, 1000)

    var wg sync.WaitGroup

    // Progress reporter
    go func() {
        completed := uint64(0)
        ticker := time.NewTicker(1 * time.Second)
        defer ticker.Stop()

        for {
            select {
            case _, ok := <-progress:
                if !ok {
                    return
                }
                completed++
            case <-ticker.C:
                pct := float64(completed) / float64(limit) * 100
                fmt.Printf("\rProgress: %.2f%% (%d/%d)",
                    pct, completed, limit)
            }
        }
    }()

    // Workers
    for w := 0; w < workers; w++ {
        wg.Add(1)
        go func() {
            defer wg.Done()

            localLongest := result{0, 0}

            for n := range jobs {
                length := collatzLength(n)
                if length > localLongest.length {
                    localLongest = result{n, length}
                }
                progress <- 1
            }

            results <- localLongest
        }()
    }

    // Send jobs
    go func() {
        for n := uint64(1); n <= limit; n++ {
            jobs <- n
        }
        close(jobs)
    }()

    go func() {
        wg.Wait()
        close(results)
        close(progress)
    }()

    longest := result{0, 0}
    for r := range results {
        if r.length > longest.length {
            longest = r
        }
    }

    fmt.Println()
    return longest
}

Testing

func TestCollatzConverges(t *testing.T) {
    // Test known sequences
    tests := []struct {
        n      uint64
        length int
    }{
        {1, 1},
        {2, 2},
        {3, 8},
        {4, 3},
        {5, 6},
        {27, 112},
    }

    for _, tt := range tests {
        length := collatzLength(tt.n)
        if length != tt.length {
            t.Errorf("collatzLength(%d) = %d, want %d",
                tt.n, length, tt.length)
        }
    }
}

func TestParallelCorrectness(t *testing.T) {
    seq := findLongestSequential(10000)
    par := findLongestParallel(10000, 8)

    if seq != par {
        t.Errorf("Sequential and parallel disagree: %v vs %v", seq, par)
    }
}

Real-World Applications

  1. Mathematical Research: Exploring conjectures, finding patterns
  2. Cryptographic Analysis: Iterative number-theoretic functions
  3. Optimization Problems: Heuristic search spaces
  4. Algorithm Testing: Stress-testing parallel frameworks
  5. Educational: Teaching concurrency and mathematical thinking

Common Pitfalls

  1. Shared cache contention: Kills performance
  2. Overflow: Large numbers exceed uint64
  3. Infinite loops: No bound on iterations
  4. Memory explosion: Storing all sequences
  5. Load imbalance: Some numbers take much longer

Best Practices

  1. Per-worker caches: Avoid shared state contention
  2. Overflow checking: Detect or use arbitrary precision
  3. Bounded iterations: Prevent infinite loops
  4. Selective storage: Only save interesting results
  5. Progress tracking: For long-running explorations

Conclusion

The Collatz explorer teaches us about shared state in parallel computing:

  • Independence is easy: Each starting number is separate
  • Shared caches are hard: Contention destroys speedup
  • Per-worker state wins: Eliminate contention entirely
  • Caching helps… sometimes: Depends on access patterns
  • Overflow matters: Mathematical functions can exceed bounds

Key insights:

  • Embarrassingly parallel: Each number is independent
  • Cache contention is real: Shared maps under write load perform poorly
  • Local state scales: Per-worker caches have no contention
  • Load varies: Some numbers take 10x longer than others

The Collatz problem remains unsolved, but exploring it with Go teaches valuable lessons about:

  • Parallel algorithm design
  • Cache architecture
  • Shared state management
  • Mathematical computing

Whether you’re searching for counterexamples, finding longest sequences, or analyzing patterns, the techniques apply broadly: minimize shared state, maximize parallelism.


← Mandelbrot Set | Series Overview