I have been fascinated by the Collatz conjecture for years. It’s a math problem that is so simple to understand, yet no mathematician has managed to solve it. Since the problem was first proposed by Lothar Collatz in 1937, many mathematicians have gone crazy trying to solve it. Here’s how it works.

The *Collatz function* takes one positive whole number, denoted by n, and turns it into another. If n is even, the result is n/2. If n is odd, the result is 3n+1. For example, if n is 3 then the result is 10. If n is 4 then the result is 2. If n is 5 then the result is 16. You get the idea.
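In code, the rule above might look like the following minimal sketch. The name `collatz_step` and the choice of a 64-bit unsigned integer are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>

// One application of the Collatz function to a positive whole number n.
// (collatz_step is a hypothetical name chosen for this sketch.)
// Caveat: 3n+1 can overflow for very large n; this sketch does not guard against that.
std::uint64_t collatz_step(std::uint64_t n) {
    assert(n >= 1);           // the function is only defined for positive n
    if (n % 2 == 0) {
        return n / 2;         // even: halve
    }
    return 3 * n + 1;         // odd: triple and add one
}
```

Calling `collatz_step(3)` gives 10, `collatz_step(4)` gives 2, and `collatz_step(5)` gives 16, matching the examples above.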

A *Collatz sequence* is formed by starting with a number and repeatedly applying the Collatz function to extend the sequence, stopping if it reaches 1. For example, the Collatz sequence for n=3 is 3-10-5-16-8-4-2-1. The Collatz sequence for n=5 is 5-16-8-4-2-1. The Collatz sequence for n=2 is just 2-1. Here is a picture of the sequences for n=7 and n=19.
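Building a sequence this way could be sketched as below. The function name `collatz_sequence` is an assumption for illustration. Note that the loop only finishes if the sequence actually reaches 1, which is fine for small starting values like the ones here.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Returns the Collatz sequence starting at n and ending at the first 1.
// (collatz_sequence is a hypothetical name chosen for this sketch.)
std::vector<std::uint64_t> collatz_sequence(std::uint64_t n) {
    assert(n >= 1);
    std::vector<std::uint64_t> seq{n};
    while (n != 1) {
        // Apply the Collatz function: halve if even, 3n+1 if odd.
        n = (n % 2 == 0) ? n / 2 : 3 * n + 1;
        seq.push_back(n);
    }
    return seq;
}
```

For example, `collatz_sequence(3)` returns the eight numbers 3, 10, 5, 16, 8, 4, 2, 1.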

After looking at some of these sequences, you may start to notice a pattern. No matter how high they go, they always seem to come back down to 1. Does every Collatz sequence always come back down to 1? Perhaps some of them get into a loop and keep going round forever, or perhaps some of them just keep going up and up towards infinity. The Collatz conjecture says that neither of these ever happens: every Collatz sequence eventually comes back down to 1. The challenge is to prove it.

Another way to picture the problem is using *total stopping times*. The total stopping time of a number is the number of steps it takes for the number’s sequence to reach 1. For example, the sequence for n=3 shown above takes 7 steps, so the total stopping time of 3 is 7. The Collatz conjecture states that every positive whole number has a finite total stopping time. Despite being such a simple problem, and being open for almost 75 years, nobody has managed to prove or disprove the Collatz conjecture.
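Counting the steps is a small change to generating the sequence. Here is a minimal sketch, with `total_stopping_time` as an assumed name. As before, the loop would never return for a number whose sequence does not reach 1, if such a number exists.

```cpp
#include <cassert>
#include <cstdint>

// Number of applications of the Collatz function needed to get from n to 1.
// (total_stopping_time is a hypothetical name chosen for this sketch.)
std::uint32_t total_stopping_time(std::uint64_t n) {
    assert(n >= 1);
    std::uint32_t steps = 0;
    while (n != 1) {
        n = (n % 2 == 0) ? n / 2 : 3 * n + 1;
        ++steps;
    }
    return steps;
}
```

With this definition, 1 has a total stopping time of 0, 2 has 1, 5 has 5, and 3 has 7.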

There are a couple of really tantalizing patterns in the total stopping times. Let me show you just one. Have a look at the first 1000 total stopping times.

Notice any patterns? It looks like some of the points are bunching up into short horizontal lines. Furthermore, these short horizontal lines seem to line up in large sweeping curves. To investigate these patterns further, I wrote a short C++ program to draw the same chart, except for the first ONE BILLION stopping times. This time, the *x*-axis is logarithmic.

Whoa, now that’s a pattern! Look at all those nice straight lines. Check out a closeup of the bottom-right part of that image.

What is causing all those straight lines? Why are they all the same length? The most interesting question to me is, why is there always a long “dash” followed by a short “dot”? If we can explain the structure of these regular patterns, can we construct an exact probability distribution for the total stopping time of any number? Could this distribution be the key to finally proving or disproving the Collatz conjecture? If you want to do a bit of work on the Collatz conjecture, answering these questions might be a good place to start.
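For anyone who wants to experiment, here is a rough sketch of one way to compute a whole table of total stopping times at once. It is not the program used for the charts above; the name `stopping_times_up_to` and the reuse-earlier-results trick are assumptions made for illustration. Each sequence is followed only until it drops below its starting number, at which point the rest of the count is already in the table.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Total stopping times for every number from 1 to limit.
// times[k] holds the total stopping time of k; index 0 is unused.
// (stopping_times_up_to is a hypothetical name chosen for this sketch.)
// Caveat: 3n+1 can overflow 64 bits for extreme inputs; not handled here.
std::vector<std::uint32_t> stopping_times_up_to(std::size_t limit) {
    std::vector<std::uint32_t> times(limit + 1, 0);
    for (std::size_t start = 2; start <= limit; ++start) {
        std::uint64_t n = start;
        std::uint32_t steps = 0;
        // Walk the sequence until it falls below the starting number,
        // where the remaining step count has already been computed.
        while (n >= start) {
            n = (n % 2 == 0) ? n / 2 : 3 * n + 1;
            ++steps;
        }
        times[start] = steps + times[n];
    }
    return times;
}
```

Even a small table shows neighbours sharing a value: 12 and 13 both take 9 steps, and 14 and 15 both take 17. Runs like these are what appear as horizontal dashes in the charts.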