Generalized Collatz Functions: Cycle Lengths and Statistics

Consider the function T(n) defined on the positive integers as follows. If n is even, T(n) = n/2. If n is odd, T(n) = 3n + 1. The Collatz Conjecture states that for any positive integer n, the sequence n, T(n), T(T(n)), ... will eventually reach 1. We consider several generalizations of this function, focusing on functions which replace "3n + 1" with "3n + b" for odd b. We show that for all odd b < 400, and all positive integers n ≤ 10^6, iterating this function always results in a finite cycle of values. Furthermore, we empirically observe several interesting patterns in the lengths of these cycles for several classes of values of b.


INTRODUCTION - THE 3n + 1 PROBLEM
The 3n + 1 problem concerns the following experiment: Pick any positive integer n. If n is even, divide it by 2. If n is odd, multiply it by 3 and add 1. Iterate this process by applying the same procedure to the result. Repeat this iteration process many times. The question is: what will happen in the long run?
The Collatz Conjecture states that no matter which positive integer one starts with, if one does this procedure enough times, eventually the sequence of numbers generated by it will include 1. That is, given a positive integer n, if we define the function T(n) by T(n) = n/2 if n is even and T(n) = 3n + 1 if n is odd, and we write T^(k)(n) for the result of applying T to n a total of k times, then the Collatz conjecture states that for every positive integer n, there is some k such that T^(k)(n) = 1. Simply by choosing different values of n and applying T several times, one can be convinced that the conjecture seems to be true. Despite several decades of effort and several hundred published papers on the subject (see Lagarias, 2003, 2011), the conjecture remains unsolved. Some of the best evidence in favor of the conjecture is the strong body of computational work. The current record for empirical verification of the Collatz conjecture is due to Oliveira e Silva, who has verified the conjecture for all n < 5.76 × 10^18 (Oliveira e Silva, 2010).
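The experiment described above is easy to reproduce. The following is a minimal Python sketch, not the code used for the computations in this paper; the function names and the safety cap max_steps are our own illustrative choices.

```python
def collatz_step(n: int) -> int:
    """Apply the original map T once: n/2 for even n, 3n + 1 for odd n."""
    return n // 2 if n % 2 == 0 else 3 * n + 1


def steps_to_one(n: int, max_steps: int = 100_000) -> int:
    """Return the least k with T^(k)(n) = 1.

    max_steps is an illustrative safety cap (an assumption of this sketch),
    since the conjecture itself does not guarantee termination.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    k = 0
    while n != 1:
        if k >= max_steps:
            raise RuntimeError("1 was not reached within max_steps")
        n = collatz_step(n)
        k += 1
    return k


# 6 -> 3 -> 10 -> 5 -> 16 -> 8 -> 4 -> 2 -> 1 takes 8 steps.
print(steps_to_one(6))  # 8
```

Trying a handful of starting values this way gives a feel for how irregular the number of steps is, even for neighboring n.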
Beyond the primary question of whether iterating T(n) eventually leads to 1, however, are many other questions about this function. How many iterations of T(n) are needed to reach 1 for various n? How large might the intermediate values get? Does T(n) behave qualitatively differently on certain classes of input (primes, odd numbers, etc.)?
Many of these questions, too, remain unsolved or only partially solved. This work is an attempt to gain new insight into these questions by generalizing T(n) to a wider class of functions, and by studying their respective behavior. Consider the conjecture again. It tells us that if a number is odd, we should multiply it by 3 and add 1. Why 3? Why 1? Why should we check for divisibility by 2 and then divide? Might other numbers work? And, crucially for our work, does varying these numbers qualitatively affect the outcome of iterating these new versions of T(n)?

GENERALIZED FUNCTIONS
The idea of generalizing the function T(n) dates back at least to Hasse (1975), who suggested a generalization largely similar to the one below, and then gave some probabilistic arguments about the behavior of these generalized functions. In particular, Hasse suggested that for a fixed m > d ≥ 2, one could define a map in terms of a function f satisfying f(r) ≡ mr (mod d). In our generalization, we change only the constant being added in the fraction's numerator, in order to allow ourselves more freedom in varying the original question. In particular, we define a function T(n; m, a, b) with parameters m, a, and b; in the case m = 2 used throughout this paper, T(n; 2, a, b) = n/2 if n is even and T(n; 2, a, b) = an + b if n is odd. Some experimentation quickly reveals that some values of m, a, and b give rise to uninteresting functions. For example, T(n; 2, 3, 2) triples any odd number and adds 2, giving another odd number. This repeats at every step, so the iterates become arbitrarily large; see Theorem 2.1. Other functions may have some starting values for which they go to infinity, but others for which they converge. Still others might send all numbers into one of many possible "cycles".
We note also that many other generalizations of this function are possible, but we believe that our version, based on that of Hasse, is the most general which "preserves the spirit" of the original problem. This generalization of Hasse has been studied by Allouche (1979), Heppner (1978), Möller (1978), Metzger (1999), and Lagarias (1990).
2.1. Cycles and divergent sequences. The original Collatz function seems (empirically) to continue iterating an input until the values reach the cycle 4 → 2 → 1 → 4. One of the first interesting things we found was that some values of m, a, and b define functions which have more than one cycle. Additionally, some of these cycles are quite long. For example, consider the effect of modifying the original T(n) only slightly, by changing the rule for odd n to "multiply by 3 and add 5". We represent this function as T(n; 2, 3, 5). Under this function, the integers 19 → 62 → 31 → 98 → 49 → 152 → 76 → 38 → 19 form a cycle. In fact, this function has six distinct cycles which include at least one integer less than 10^5.
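The cycle through 19 can be recovered mechanically. The sketch below is a hedged illustration rather than the authors' code: it assumes that for general m the map divides by m when m divides n and otherwise returns an + b (the text only pins down the case m = 2), and the names T, find_cycle, and max_steps are hypothetical.

```python
def T(n: int, m: int, a: int, b: int) -> int:
    """Generalized step: n/m when m divides n, otherwise a*n + b.

    The behavior for m != 2 is an assumption of this sketch.
    """
    return n // m if n % m == 0 else a * n + b


def find_cycle(n: int, m: int, a: int, b: int, max_steps: int = 100_000):
    """Iterate from n until a value repeats; return the cycle as a list.

    Returns None if no value repeats within max_steps iterations,
    which is what a divergent orbit would look like.
    """
    first_seen = {}  # value -> index in orbit at which it first appeared
    orbit = []
    while n not in first_seen:
        if len(orbit) >= max_steps:
            return None
        first_seen[n] = len(orbit)
        orbit.append(n)
        n = T(n, m, a, b)
    # Everything from the first occurrence of n onward is the cycle.
    return orbit[first_seen[n]:]


print(find_cycle(19, 2, 3, 5))  # [19, 62, 31, 98, 49, 152, 76, 38]
print(find_cycle(1, 2, 3, 1))   # [1, 4, 2]
```

Storing every visited value costs memory proportional to the orbit length, which is acceptable for small experiments of this kind.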

An interesting class of functions. After some initial experimentation with various values of a, b, and m, we noticed that while the behavior of the functions varied widely, the functions with a = 3 and m = 2 behaved in a somewhat uniform manner. Specifically, if we fixed those values and let b vary through odd values, we could find no starting value n which did not end up in some cycle (see Theorem 4.1). However, when b was even, we saw very different behavior. In fact, we have the following:
Theorem 2.1. Let b be a positive even integer and n a positive odd integer. Then T^(k)(n; 2, 3, b) → ∞ as k → ∞.
For this reason we shall restrict our attention to functions T(n; 2, 3, b) with odd b. Beyond this observation, though, no clear pattern was present. Some of these functions (like the standard function T(n)) seemed to send all integers to the same cycle. Others had many different cycles. It was to a deeper understanding of the structure and behavior of cycles that we next turned our efforts.
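The divergence for even b can be seen directly: an odd number times 3 plus an even number is again odd, so the halving branch is never taken and each iterate is more than three times the previous one. The sketch below illustrates this numerically. The closed form 3^k (n + b/2) - b/2 is the standard solution of the recurrence x -> 3x + b and is our addition, not a formula from the text; the function names are hypothetical.

```python
def orbit(n: int, b: int, count: int) -> list:
    """Return n followed by its first `count` iterates under T(.; 2, 3, b)."""
    values = [n]
    for _ in range(count):
        n = n // 2 if n % 2 == 0 else 3 * n + b
        values.append(n)
    return values


def closed_form(n: int, b: int, k: int) -> int:
    """k-th iterate when every step is x -> 3x + b (even b, odd n)."""
    return 3**k * (n + b // 2) - b // 2


values = orbit(1, 2, 6)
print(values)  # [1, 5, 17, 53, 161, 485, 1457]
# Every iterate is odd, so the sequence never halves and grows without bound.
print(all(v % 2 == 1 for v in values))  # True
```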

CYCLE STATISTICS
3.1. Some definitions. Because we shall study functions of a particular type in the rest of this work, it is useful to assign new notation to them. Let T(n; 2, 3, b) be denoted by C_b(n); that is, C_b(n) = n/2 if n is even and C_b(n) = 3n + b if n is odd. We shall refer to these C_b(n) collectively as Collatz functions. The Collatz function C_1(n), then, corresponds to the original function described in the introduction. Recall also that we restrict our attention to the case where b is odd.
In order to describe properties of these Collatz functions, we shall need a few definitions, beginning with the cycle number (Definition 3.1). We note first the crucial fact that for no function C_b is the cycle number known. One way to state the original Collatz conjecture is to say that the cycle number of C_1 is 1, but of course this has not been proved.
In the following, when we refer to the cycle number, we always mean the number of distinct cycles we have found empirically: for b < 100, this is the number of cycles observed by testing all inputs up to 10^6; for 100 < b < 400, we tested inputs up to 10^5. The cycle numbers for some of the C_b are given in the table of data on cycle numbers for small b. Note that, like C_1, the function C_3 also seems to have a cycle number of 1. The function C_5, however, behaves very differently. It has cycle number 6. There is once again a cycle containing 1, namely 1 → 8 → 4 → 2 → 1. Using 15 as our starting value, however, gives 15 → 50 → 25 → 80 → 40 → 20 → 10 → 5 → 20; the numbers 20, 10, and 5 form a cycle distinct from the first. Others are much longer. One cycle of C_5 begins with 187, then passes through 43 other integers before returning to 187. These differences suggest that it will be useful to have the following:
Definition 3.2. The cycle length of any cycle is the number of distinct integers contained within the cycle.
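An empirical cycle number and the corresponding cycle lengths can be tabulated with a few lines of code. The following is a minimal sketch under our own assumptions: the names C, cycle_from, and cycles_found are hypothetical, the step cap is arbitrary, and the search bound used in the example is far smaller than the bounds used for the data reported here.

```python
def C(n: int, b: int) -> int:
    """The Collatz function C_b: n/2 for even n, 3n + b for odd n."""
    return n // 2 if n % 2 == 0 else 3 * n + b


def cycle_from(start: int, b: int, max_steps: int = 1_000_000) -> list:
    """Return the cycle of C_b that the orbit of `start` eventually enters."""
    seen = set()
    n = start
    while n not in seen:
        if len(seen) >= max_steps:
            raise RuntimeError("no repeat found within max_steps")
        seen.add(n)
        n = C(n, b)
    # n is now a point on the cycle; walk around it exactly once.
    cycle = [n]
    x = C(n, b)
    while x != n:
        cycle.append(x)
        x = C(x, b)
    return cycle


def cycles_found(b: int, limit: int) -> dict:
    """Map each observed cycle minimum to its cycle length, for starts 1..limit."""
    found = {}
    for start in range(1, limit + 1):
        cycle = cycle_from(start, b)
        found[min(cycle)] = len(cycle)
    return found


print(cycles_found(1, 100))        # {1: 3}
print(len(cycle_from(187, 5)))     # 44
print(sorted(cycle_from(15, 5)))   # [5, 10, 20]
```

The number of keys in the dictionary returned by cycles_found is the empirical cycle number for the chosen search bound; a larger bound can only add cycles, never remove them.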
Given that some cycles are quite long, we shall also need a way to represent them without writing out every integer in the cycle. For this, we shall use the cycle minimum (Definition 3.3), together with a measure of how many integers a cycle attracts, the cycle gravity (Definition 3.4). The motivation for the term "gravity" is simple: loosely, the greater the gravity, the more integers are "pulled in" to the cycle. We should note, however, that it is not clear that this is well-defined. Let P_{b,r}(n) be the number of positive integers not greater than n which end up in a cycle with cycle minimum r after iteration by C_b. The cycle gravity is lim_{n→∞} P_{b,r}(n)/n, if the limit exists.
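For a finite bound n, the ratio P_{b,r}(n)/n can be computed directly and used as an estimate of the gravity. The sketch below is one possible way to do this, not the authors' implementation; the cache `known` and the function names are our own assumptions, and no step cap is included, so it relies on every orbit in the tested range entering a cycle.

```python
from collections import Counter


def cycle_minimum(n: int, b: int, known: dict) -> int:
    """Return the minimum of the cycle that n falls into under C_b.

    `known` caches value -> cycle minimum so later starts can stop early.
    """
    path = []
    on_path = set()
    while n not in known and n not in on_path:
        path.append(n)
        on_path.add(n)
        n = n // 2 if n % 2 == 0 else 3 * n + b
    if n in known:
        r = known[n]
    else:
        # The orbit closed on itself: the cycle is the tail of the path from n.
        r = min(path[path.index(n):])
    for x in path:
        known[x] = r
    return r


def estimate_gravity(b: int, limit: int) -> dict:
    """Estimate each cycle's gravity as P_{b,r}(limit) / limit."""
    known = {}
    counts = Counter(cycle_minimum(n, b, known) for n in range(1, limit + 1))
    return {r: count / limit for r, count in counts.items()}


print(estimate_gravity(1, 100))  # {1: 1.0}
```

Because each start is assigned to exactly one cycle, the estimated gravities for a fixed b and bound always sum to 1.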

COMPUTATIONS AND OBSERVATIONS
4.1. General experiments. For every odd value of b ≤ 99, we tested all inputs n ≤ 10^6. For each of these, we first found all cycles which are entered by any of the n. The first important discovery is that every value of n does indeed enter a cycle. In fact, we verified the following generalization of the Collatz Conjecture.
Theorem 4.1. For all odd b ≤ 99 and all n ≤ 10^6, the iterates of C_b(n) eventually converge to some cycle.
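A check of this kind can be sketched with Floyd's tortoise-and-hare cycle detection, which needs only constant memory per starting value. This is our choice of technique for illustration; the text does not say how the original verification was implemented, and the names enters_cycle, failures, and max_steps are hypothetical.

```python
def enters_cycle(n: int, b: int, max_steps: int = 1_000_000) -> bool:
    """Return True if the orbit of n under C_b is seen to repeat.

    Uses Floyd's cycle detection: one pointer advances one step at a time,
    the other two steps, and they can only meet inside a cycle.
    A False result only means no repeat was found within max_steps.
    """
    def step(x: int) -> int:
        return x // 2 if x % 2 == 0 else 3 * x + b

    slow, fast = n, step(n)
    for _ in range(max_steps):
        if slow == fast:
            return True
        slow = step(slow)
        fast = step(step(fast))
    return False


def failures(b_max: int, n_max: int) -> list:
    """List the pairs (b, n), b odd, whose orbits were not seen to cycle."""
    return [
        (b, n)
        for b in range(1, b_max + 1, 2)
        for n in range(1, n_max + 1)
        if not enters_cycle(n, b)
    ]


# A small-scale analogue of the verification, with much smaller bounds.
print(failures(15, 200))  # []
```

An empty list is consistent with Theorem 4.1 on this reduced range; reproducing the full range would mainly require more running time.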
For each cycle, we recorded several pieces of data. We first recorded the cycle minimum and the cycle length. Metzger (1999) observed that it is often useful to measure a cycle not by its total number of entries, but by the number of odd entries, which correspond to its number of multiplications. We also computed this value, but we found it not to be more useful than our definition, and we do not report that value here (but see the accompanying tables online for complete data). Finally, we recorded each cycle's gravity, in this case estimated using the proportion of integers n ≤ 10^6 which enter that cycle. Part of this data (for b ≤ 13) is given in Table 4.1.
We attempted to find some relationship between a cycle's length and its gravity. No pattern is apparent, though we encourage readers to look for one. (All data for b < 400 are available in the accompanying tables online.) These observations seem to point experimentally to a more general property; namely, that if n > 1 is an integer, and C_b has a cycle of length k, then C_{nb} will also have a cycle of length k. In fact, this is the case, as we shall prove in the next section.

RESULTS
Before we state our results, it will be useful to have the following definition:
Definition 5.1. A cycle {x_1, x_2, ..., x_k} of a function C_b is said to be primitive if the greatest common divisor of all the x_i is 1.
Using this, it is easy to prove Theorem 5.2. In fact, we observe more. Usually these two cycles have cycle minima greater than any other cycle given by C_b. We found this to be true for all b except those which are multiples of 29. We cannot prove, nor are we convinced, that C_{5b} will always have exactly two cycles of length 44. We note that it is easy to see that such functions always have at least one cycle; that the function C_{3^n} has a cycle with cycle minimum 3^n follows from Theorem 5.2.

CONCLUSIONS
In one sense, the generalization of the Collatz function to the functions C_b leads to very similar behavior to that seen in the original function: namely, iterating any value leads to a cycle (rather than diverging). This suggests at least that the original function is not "special", but rather that its behavior follows from more general principles. However, the fact that the number of cycles differs as b changes does give evidence that there are underlying irregularities behind the more systematic behavior. This, together with the striking patterns in behavior for values of b which are multiples of 5, 13, and 29, leads us to believe that there is considerable benefit to studying these generalized functions.

NOTE
We would like to thank one of our reviewers, who pointed out that many of our results are not as novel as we thought and, in particular, directed our attention to prior work of Lagarias (1990) that our work extends. We were unaware of this work at the time of our own studies. To our knowledge, our data on cycle gravity is still new.
Work. The class of functions T(n; 2, m, b) was studied by Crandall (1978), who proved several interesting results. In particular, he conjectured that aside from (m, b) = (3, 1), every function T(n; 2, m, b) has at least one cycle that does not reach 1; he proved this for all b ≥ 3, and for the pairs (m, b

Definition 3.1. The cycle number of a Collatz function C_b is the number of distinct cycles created by iterating C_b.
Data on cycle numbers of C_b for small b.
Definition 3.3. For a given Collatz function C_b and a given cycle, the cycle minimum is the least integer in the cycle. Since no integer can be in more than one cycle of a given C_b, a cycle minimum uniquely determines a cycle of a given C_b. For this reason we shall often use the value of the cycle minimum to describe the cycle itself.
Definition 3.4. For a given Collatz function C_b and a given cycle, the cycle gravity is the proportion of integers which end up in the cycle (if such a proportion exists).

Theorem 5.2. Let m be the cycle minimum of a primitive cycle of length k under the function C_b. Then for any odd integer n > 1, nm will be the cycle minimum of an (imprimitive) cycle of length k under the function C_{nb}.
Proof. The idea of the proof is that the function C_{nb} preserves the primitive cycle in C_b, and merely scales it up. Since n is odd, nx has the same parity as x for every integer x. If x is odd, then C_{nb}(nx) = 3nx + nb = n(3x + b) = nC_b(x); if x is even, then C_{nb}(nx) = nx/2 = nC_b(x). Therefore, after further iteration, we see that C_{nb}^(k)(nm) = nC_b^(k)(m) = nm, since m was the minimum of a cycle of length k. The scaled cycle consists of n times the entries of the original cycle, so it has the same length k and its minimum is nm. □
From this theorem, we can deduce three corollaries which parallel the three theorems above:
Corollary 5.3. If b is an odd multiple of 5, C_b will have at least two cycles with cycle length 44 and two cycles with cycle length 8.
Corollary 5.4. If b is an odd multiple of 13, C_b will have at least seven cycles of length 13.
Corollary 5.5. If b is an odd multiple of 29, C_b will have at least two cycles of length 106.
Finally, we note that of all b < 400, the only C_b with cycle number 1 are b = 1, 3, 9, 27, 81, 243. From this, we conjecture the following:
Conjecture 5.6. The function C_b will have precisely one cycle if and only if b is a power of 3.
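The scaling behind Theorem 5.2 and its corollaries can be checked on a concrete case. The sketch below, with hypothetical helper names, scales the length-8 cycle of C_5 through 19 by the odd factor 3 and compares it with the cycle of C_15 through 57; it also tests primitivity in the sense of Definition 5.1. It assumes the starting value passed to cycle_through already lies on a cycle.

```python
from functools import reduce
from math import gcd


def C(n: int, b: int) -> int:
    """The Collatz function C_b: n/2 for even n, 3n + b for odd n."""
    return n // 2 if n % 2 == 0 else 3 * n + b


def cycle_through(x: int, b: int, max_len: int = 1_000_000) -> list:
    """List the cycle of C_b containing x, assuming x lies on a cycle."""
    cycle = [x]
    y = C(x, b)
    while y != x:
        if len(cycle) >= max_len:
            raise RuntimeError("x does not appear to lie on a cycle")
        cycle.append(y)
        y = C(y, b)
    return cycle


def is_primitive(cycle: list) -> bool:
    """A cycle is primitive if the gcd of all its entries is 1."""
    return reduce(gcd, cycle) == 1


base = cycle_through(19, 5)        # primitive cycle of C_5, length 8
scaled = cycle_through(3 * 19, 15)  # its image under C_15

print(scaled)                           # [57, 186, 93, 294, 147, 456, 228, 114]
print(scaled == [3 * x for x in base])  # True
print(is_primitive(base), is_primitive(scaled))  # True False
```

The same comparison with an even scale factor fails, since n·b would then be even and the scaled starting value would no longer behave like its unscaled counterpart.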