And yes, there really is no algorithmic limit to how deep you can go, but of course your browser's available
memory will run out eventually. The beauty of the algorithm is that we don't need to know in advance how many
denominators will be needed, nor do we have to recalculate \(\pi\) to new precision each time. It
proceeds step by step, producing the current denominator and preparing for the next one.
In fact, we can think of it
not as an algorithm working on numbers, but as the number itself becoming a dynamic object, which is how
Bill Gosper described it in his construction of continued fraction arithmetic.
What you will find here
is really his idea of egestion used to obtain however many terms for certain irrational numbers.
Let's briefly review what continued fractions are. Take the famous approximation \(\pi\approx 355/113\),
and separate the integer part:
$$ \frac{355}{113} = 3 + \frac{16}{113}. $$
The fraction is less than one, so its reciprocal is greater than one; one might hope it is simply 1 over
some large integer, so let's rewrite it further:
$$ \frac{355}{113} = 3 + \frac{16}{113} = 3 + \cfrac{1}{\cfrac{113}{16}}
= 3 + \cfrac{1}{7+\cfrac{1}{16}} = 3 + \cfrac{1}{7+\cfrac{1}{15+\cfrac{1}{1}}}. $$
The last form is there to show the inherent ambiguity, and also to make it resemble the fraction at the top.
But either way, the process always stops for rational numbers, once the denominator reaches an integer
(16 or 1 in the above). And conversely, when there are only finitely many nested fractions, we can tediously
add them up into one big fraction, which immediately means that an irrational number cannot have a finite expansion.
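For a rational number, this split-and-invert procedure can be carried out mechanically. Here is a minimal sketch in Python (exact arithmetic via `fractions`; the helper name `cf_of_rational` is just for illustration):

```python
from fractions import Fraction

def cf_of_rational(x: Fraction) -> list[int]:
    """Regular continued fraction of a rational number:
    split off the integer part, invert the remainder, repeat."""
    terms = []
    while True:
        a = x.numerator // x.denominator   # integer part (floor)
        terms.append(a)
        x -= a                             # remainder, now in [0, 1)
        if x == 0:                         # rationals always terminate here
            return terms
        x = 1 / x                          # invert and continue

print(cf_of_rational(Fraction(355, 113)))  # [3, 7, 16]
```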
So let's take an irrational number, and see what happens. For example \(\sqrt{11}\). We know it must lie between
3 and 4, so we can extract the integer part and write \(\sqrt{11} = 3 + (\sqrt{11}-3)\), which seems rather empty until we
invert it, and remove the root from the (final) denominator like so:
$$ \sqrt{11} = 3 + \sqrt{11}-3 = 3 + \cfrac{1}{\,\cfrac{1}{\sqrt{11}-3}\,} =
3 + \cfrac{1}{\,\cfrac{3+\sqrt{11}}{2}\,}.$$
Using the bound \(3<\sqrt{11}< 4\) again, we extract another 3, and rinse and repeat...
$$ \sqrt{11} = 3 + \cfrac{1}{\,3+\cfrac{\sqrt{11}-3}{2}\,} =
3 + \cfrac{1}{\,3+\cfrac{1}{3+\sqrt{11}}\,} =
3 + {\color{green}
\boxed{\color{black}\cfrac{1}{3+\cfrac{1}{6+{\color{green}\boxed{\color{black}\sqrt{11}-3}}}}}
},$$
at which point we notice that we have already dealt with \(\sqrt{11}-3\) above, and so we can just substitute
the whole double fraction into itself (the green frame). And, informally, we arrive at the infinite fraction
$$ \sqrt{11} =
3 + \cfrac{1}{\,3+\cfrac{1}{\,6+\cfrac{1}{\,3+\cfrac{1}{\,6+\cfrac{1}{\,3+\cfrac{1}{\,\raisebox{-0.5em}{\(6+\ddots\)}\,}\,}\,}\,}\,}\,}, $$
with alternating 3 and 6.
Notably, the green frame refers to itself (that is, \(\sqrt{11}-3\)), but instead of philosophical panic
over circularity, we just rewrite it as a proper equation
$$ x = \cfrac{1}{\,3+\cfrac{1}{\,6+x\,}}.$$
This is just the quadratic \(x^2+6x-2=0\) in disguise. Its solutions are \(-3\pm\sqrt{11}\) and all is well, because
our green frame is one of them.
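We can also let a computer confirm the fixed point numerically; a quick sketch iterating the right-hand side (the starting value is arbitrary):

```python
import math

# iterate x -> 1/(3 + 1/(6 + x)); the fixed point should be sqrt(11) - 3
x = 0.0
for _ in range(20):
    x = 1 / (3 + 1 / (6 + x))

print(x, math.sqrt(11) - 3)   # both approximately 0.3166247903554
```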
This short detour illustrates a theorem that all quadratic irrationals have eventually periodic continued fractions
(and vice versa), so they won't be very interesting when it comes to infinitely scrollable expansions. To learn more,
check out the usual place or, better yet,
a textbook. I'll just add that in a very good sense, one could say that we know all the “digits”
of \(\sqrt{11}\). The integer part is \(3\), followed by infinitely alternating \(3\) and \(6\), which could be written as
\([3;(3,6)]\), by analogy with periodic decimals like \(1/7 = 0.(142857)\).
The crucial difference is that, unlike ordinary digits, the denominators are not limited to \(0\) through \(9\),
but can be any positive integers. This makes the data structure trickier than for floats, and the arithmetic is more
complex. But it has to be stressed that it can be done, i.e., addition and multiplication can be carried out using
the continued fraction expansions directly, as shown by Gosper. Those complications are the price to pay
for a finite and exact representation of all rationals, and a periodic representation of quadratic irrationals.
At this point, an obvious pragmatic question arises: What if I need the value itself? The huge fraction tower
might look pretty, but what if someone insists on the decimal form? At the very least we'd like to have a single
fraction, like \(\frac{355}{113}\), which can be the basis for further arithmetical or numerical computation.
And it is scary to consider that we would have to start at the very bottom of the fraction, and proceed by
meticulously adding and inverting, all the while grasping just the tail, arriving at the sought-for value
only when the final, or rather initial, leading integer is reached.
Fortunately, there is a straightforward way of proceeding from top to bottom, which yields better and better
approximations with each step. Say a number \(x\) has the general fraction
$$ x = a_0 + \cfrac{1}{\,a_1+\cfrac{1}{\,a_2+\cfrac{1}{\,\raisebox{-0.5em}{\(a_3+\ddots\)}\,}\,}\,}.$$
We start by taking its zeroth approximation to be \(a_0\) as represented by \(a_0/1\), or
$$ x_0 = \frac{p_0}{q_0},\quad\text{where}\quad p_0 = a_0,\; q_0 = 1,$$
then generate the sequence
$$ \left\{\begin{aligned}
p_{n+1} &= a_{n+1} p_{n} + p_{n-1}\\
q_{n+1} &= a_{n+1} q_{n} + q_{n-1}
\end{aligned}\right., \quad\text{where}\quad p_{-1} = 1,\; q_{-1} = 0. \tag{R}$$
Our number \(x\) is then the limit of \(x_n = \frac{p_n}{q_n}\),
provided the fraction converges. For \(\sqrt{11}\) in particular
we get
$$ x_0 = \frac{3}{1},\quad
x_1 = \frac{10}{3},\quad %= 3.33333,\quad
x_2 = \frac{63}{19},\quad %= 3.31579
x_3 = \frac{199}{60},\quad %= 3.31667
x_4 = \frac{1257}{379} \approx 3.316623, $$
the last one with an error of about \(2.1\times 10^{-6}\). We'll come back to the errors later,
and go back to finding fractions for now.
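Before moving on, here is the recurrence \((\text{R})\) as a short Python sketch, fed with the denominators \([3;(3,6)]\) of \(\sqrt{11}\); it reproduces the approximants listed above (the function name `convergents` is mine, not part of any library):

```python
from fractions import Fraction

def convergents(terms):
    """Yield the approximants p_n/q_n of a regular continued fraction,
    using the recurrence (R) with p_-1 = 1, q_-1 = 0."""
    p_prev, q_prev = 1, 0
    p, q = terms[0], 1
    yield Fraction(p, q)
    for a in terms[1:]:
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        yield Fraction(p, q)

# [3; 3, 6, 3, 6] gives 3, 10/3, 63/19, 199/60, 1257/379
for x in convergents([3, 3, 6, 3, 6]):
    print(x, float(x))
```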
II
What of the higher, possibly transcendental, irrational numbers? Let's look at another zoomer, this time of
Euler's number \(e\).
Though not periodic, this is definitely very regular. Starting from the second denominator,
every third one is just the next even number; all other denominators are 1. In other words, what repeats
is the triple \([1,2k,1]\) as \(k\) goes over all positive integers. Much easier to remember
(if that's your thing) than \(2.718281828459045\ldots\), isn't it? But how do we go about deriving it?
It's not like with \(\sqrt{11}\), where we had easy bounds thanks to \( 9 < 11 < 16\), and algebraic formulae
to help us tidy up the denominators. And it would feel like cheating to use the decimal approximation, because
it too needs to be iteratively computed based on the true value (and properties) of \(e\).
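Before deriving it, though, we can at least spot-check the claimed pattern numerically: build the denominators \(2, 1, 2, 1, 1, 4, 1, 1, 6, \ldots\) and run them through the recurrence \((\text{R})\). A rough sketch:

```python
import math
from fractions import Fraction

# denominators of e: 2, then the repeating triples [1, 2k, 1] for k = 1, 2, ...
terms = [2]
for k in range(1, 8):
    terms += [1, 2 * k, 1]

p_prev, q_prev = 1, 0            # p_-1 and q_-1 from (R)
p, q = terms[0], 1
for a in terms[1:]:
    p, p_prev = a * p + p_prev, p
    q, q_prev = a * q + q_prev, q

print(Fraction(p, q), float(Fraction(p, q)) - math.e)   # agrees with e to double precision
```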
This is where analytic theory comes to help, and for now it will have to remain an external result that
infinite series can be converted into infinite (continued) fractions. It's a result of Euler, who showed how to do it
directly, with a further improvement by Gauss for a whole class of hypergeometric series.
We will thus shamelessly make use of the fact that, e.g.,
$$ \tanh(z) = \cfrac{z}{\,1+\cfrac{z^2}{\,3+\cfrac{z^2}{\,5+\cfrac{z^2}{\,\raisebox{-0.5em}{\(7+\ddots\)}\,}\,}\,}\,}.$$
This is now a generalized continued fraction, because the numerators are no longer \(1\)
(in which case it is called regular).
Evaluating at \(z=\frac12\) gives
$$ \tanh(\tfrac12) = \frac{e^{1/2} - e^{-1/2}}{e^{1/2}+e^{-1/2}} =
\frac{e-1}{e+1} = \cfrac{1}{\,2+\cfrac{1}{\,6+\cfrac{1}{\,10+\cfrac{1}{\,\raisebox{-0.5em}{\(14+\ddots\)}\,}\,}\,}\,} = g.$$
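As a quick numeric sanity check of that regular fraction, we can evaluate it bottom-up at a finite depth and compare with \((e-1)/(e+1)\); a rough sketch:

```python
import math

# evaluate 1/(2 + 1/(6 + 1/(10 + ...))) bottom-up, truncated after 15 levels
value = 0.0
for a in reversed(range(2, 2 + 4 * 15, 4)):   # denominators 2, 6, 10, ..., 58
    value = 1 / (a + value)

target = (math.e - 1) / (math.e + 1)           # = tanh(1/2)
print(value, target)                           # both approximately 0.46211715726
```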
Now comes the cumbersome algebra part: solving the last equality for \(e\), we obtain
$$ e = 1 + \frac{2}{\frac{1}{g}-1} = 1 + \cfrac{2}{\,1+\cfrac{1}{\,6+\cfrac{1}{\,\raisebox{-0.5em}{\(10+\ddots\)}\,}\,}\,},$$
and to get rid of the numerator \(2\), we will have to iteratively make use of the identity
$$ \cfrac{2}{2a+1+\cfrac{1}{b}} = \cfrac{1}{a+\cfrac{1}{1+\cfrac{1}{1+\cfrac{2}{b-1}}}}.$$
Notice the two additional levels with denominators \(1\); this is what leads to the repeated pattern in the regular fraction.
You can try it yourself to work out the first several levels (\(a=0\) initially, then \(2\), then \(4\), etc.).
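Or let a computer do the spot-check of the identity on a few exact values (the sample values of \(a\) and \(b\) below are arbitrary):

```python
from fractions import Fraction

# spot-check 2/(2a+1+1/b) = 1/(a + 1/(1 + 1/(1 + 2/(b-1)))) on sample values
for a in (0, 2, 4):
    for b in (Fraction(3, 2), Fraction(79, 10), Fraction(6)):
        lhs = 2 / (2 * a + 1 + 1 / b)
        rhs = 1 / (a + 1 / (1 + 1 / (1 + 2 / (b - 1))))
        assert lhs == rhs
print("identity holds on the sampled values")
```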
This isn't the only derivation, as we could start with the usual factorial series for \(e\), recover a different generalized fraction,
and then tidy it up. What matters is that a possible path emerges: obtain a generalized fraction, for which an analytic
formula is known, and find a procedure for conversion to a regular fraction. Many basic transcendental functions have simple Taylor
and continued fraction expansions (with explicitly known coefficients),
and many transcendental constants are defined in terms of such functions. Indeed we have more formulae
than necessary to choose from, especially for \(\pi\)!
So let's recall that \(\tan(\frac{\pi}{4})=1\), and make use of the expansion of the inverse
\(\arctan(z)\) in the form
$$ \arctan(z) = \cfrac{z}{\,1+\cfrac{(1z)^2}{\,3+\cfrac{(2z)^2}{\,5+\cfrac{(3z)^2}{\,\raisebox{-0.5em}{\(7+\ddots\)}\,}\,}\,}\,}
\quad\Longrightarrow\quad
\pi = \cfrac{4}{\,1+\cfrac{1}{\,3+\cfrac{4}{\,5+\cfrac{9}{\,\raisebox{-0.5em}{\(7+\ddots\)}\,}\,}\,}\,}. \tag{S}
$$
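Numerically, \((\text{S})\) does converge to \(\pi\); here is a rough bottom-up sketch at a fixed truncation depth:

```python
import math

# pi = 4/(1 + 1/(3 + 4/(5 + 9/(7 + ...)))): numerators k^2 over denominators 2k+1
depth = 40
value = 0.0
for k in reversed(range(1, depth + 1)):        # k = depth, ..., 2, 1
    value = k * k / (2 * k + 1 + value)
value = 4 / (1 + value)

print(value, math.pi)                          # prints two values agreeing to many digits
```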
Unfortunately, upon applying various simplifications, no obvious pattern emerges. We begin to suspect that, despite
the beautiful simple generalized fraction for \(\pi\), the regular one will have to be determined dynamically,
as we go. This brings us to the promised numbers-as-routines, achieved by employing matrices.
III
First, we need a generalized version of the top-down calculation of rational approximations, both for
numerical results and because the tidying-up process shown above really works from the top,
transforming various rational expressions. What happens along the way is that the current integer denominator is
extracted, leaving the rest as a new fraction smaller than \(1\), which is then inverted before the next extraction.
At each step we need to be certain of the integer part; in other words, we need precise bounds on the value.
The miracle is that increasing the order of approximation does not interfere with the extraction! Here is why.
The iterative propagation of the generalized fraction is the following:
$$ \left\{\begin{aligned}
p_n &= a_n p_{n-1} + b_n p_{n-2}\\
q_n &= a_n q_{n-1} + b_n q_{n-2}
\end{aligned}\right.,\quad\text{for}\quad
a_0 + \cfrac{b_1}{\,a_1+\cfrac{b_2}{\,a_2+\cfrac{b_3}{\,\raisebox{-0.5em}{\(a_3+\ddots\)}\,}\,}\,},$$
and we will come back to the initial conditions in a moment. Because it's a three-term recurrence, the state
of the approximation at a given \(n\) consists of four numbers. Let's collect them in a single matrix, with numerators on top,
so that the recurrence becomes a simple matrix product:
$$
X_n = \begin{bmatrix} p_n & p_{n-1}\\ q_n & q_{n-1} \end{bmatrix} \quad\Longrightarrow\quad
X_n = X_{n-1} M_n\quad\text{with}\quad
M_n = \begin{bmatrix} a_n & 1\\ b_n & 0 \end{bmatrix}.$$
This way, the columns of \(X_n\) give the approximate fractions \(x_n=\frac{p_n}{q_n}\) and
\(x_{n-1}=\frac{p_{n-1}}{q_{n-1}}\). With each iteration, the old left column
becomes the new right column, while a new left column is constructed with the help of
\(a_n\) and \(b_n\).
For the initial conditions, we want the initial fraction to be \(a_0\), so we take
$$ X_0 = \begin{bmatrix} a_0 & 1 \\ 1 & 0 \end{bmatrix}, $$
or \(b_0=1\), \(p_{-1}=1\), \(q_{-1}=0\) by convention, as for the regular fraction.
But notice that now \(X_0\) looks exactly like \(M_0\), so we can extend the recurrence
for \(n=0\), i.e., \(X_0 = X_{-1}M_0\), with \(X_{-1}\) being simply the identity matrix. Does that correspond
to meaningful numbers? Yes, if we allow the projective
space to intrude! The “fraction”
\(\frac{1}{0}\) is the point at infinity, and \(\frac{0}{1}=0\), so the columns of the identity matrix
mean that our initial approximation is in the interval \((0,\infty)\). What else could one wish for with
an unknown positive real number?
The last statement requires proof, but it is indeed the case, owing to the positivity of \(a_n\) and \(b_n\), that
the true value \(x\) always lies between the consecutive approximants \(x_{n-1}\) and \(x_n\). It is this
bracketing that allows us to determine the integer parts and add another fraction to the tower. The first
few steps for \(\pi\) specifically look as follows.
$$ \begin{aligned}
X_{0} &= \begin{bmatrix} 0 & 1\\1 & 0\end{bmatrix} &&\Rightarrow 0 < \pi < \infty, &
X_{1} &= \begin{bmatrix} 4 & 0\\1 & 1\end{bmatrix} &&\Rightarrow 4 > \pi > 0, & \\
X_{2} &= \begin{bmatrix} 12 & 4\\4 & 1\end{bmatrix} &&\Rightarrow 3 < \pi < 4, &
X_{3} &= \begin{bmatrix} 76 & 12\\24 & 4\end{bmatrix} &&\Rightarrow 3\tfrac16 > \pi > 3, & \\
X_{4} &= \begin{bmatrix} 640 & 76\\204 & 24\end{bmatrix} &&\Rightarrow 3\tfrac16 > \pi > 3\tfrac{7}{51}, &
X_{5} &= \begin{bmatrix} 6976 & 640\\2220 & 204\end{bmatrix} &&\Rightarrow 3\tfrac{7}{51} < \pi < 3\tfrac{79}{555}. &
\end{aligned}$$
So at \(X_3\) we learn the integer part, but the lower bound (\(3\)) does not tell us what the next denominator
is. At \(X_4\) we can almost move a step further:
$$ 3 + \cfrac16 \,>\, \pi \,>\, 3 + \cfrac{1}{7+\cfrac{2}{7}} = 3\tfrac{7}{51},$$
and the first denominator is determined at \(X_5\):
$$ 3\tfrac{79}{555} = 3 + \cfrac{1}{7+\cfrac{2}{79}} = 3 + \cfrac{1}{7+\cfrac{1}{39+\cfrac{1}{2}}}
\,>\, \pi \,>\, 3 + \cfrac{1}{7+\cfrac{1}{3+\cfrac{1}{2}}} = 3 + \cfrac{1}{7+\cfrac{2}{7}} = 3\tfrac{7}{51},$$
with the promise for the next one to be between \(3\) and \(39\) (it will be \(15\)). Note also that
to determine each denominator, more than one iteration of \(X_n\) might be required — the regular fraction
is in a sense more efficient than the (particular) generalized one.
The whole process now becomes evident: the columns of \(X_n\), considered as fractions, have to be reduced simultaneously,
so long as their continued fractions agree. And this too happens to be easily expressed with matrices.
Subtracting an integer \(m\) from the fraction \(\frac{p}{q}\) and inverting the result gives \(\frac{q}{p-mq}\), or
$$ \begin{bmatrix}p\\q\end{bmatrix} \quad\longrightarrow\quad \begin{bmatrix}q\\p-mq\end{bmatrix}
= \begin{bmatrix} 0 & 1\\1 & -m\end{bmatrix}\begin{bmatrix}p\\q\end{bmatrix},$$
which extends nicely to the whole matrix \(X_n\), because we want to apply the same operation to both fractions
(columns) with the same integer part \(m\).
In other words, we have now reached the egestion operation:
$$ X_n \quad\text{becomes}\quad J(m) X_n, \quad
\text{where}\;\; J(m) = \begin{bmatrix} 0 & 1\\1 & -m\end{bmatrix}, $$
by which the current approximation spits out an integer \(m\), and inverts the remaining fraction.
Crucially, this means that what is left is no longer an approximation of the original number,
but of the tail in its expansion. So — and this is the last piece of the puzzle —
how can we continue improving the approximation? Just as before, because multiplication of matrices is
associative, meaning \((AB)C = A(BC)\), and the two required operations take place on different
sides of \(X_n\): each time we increase the order, we multiply by \(M_n\) on the right, while each time
we require egestion, we multiply by \(J(m)\) on the left. At any given time our dynamic number is thus
$$ \ldots J(m_3) J(m_2) J(m_1) J(m_0) X_0 M_1 M_2 M_3 M_4 \ldots,$$
and it doesn't matter what the order of operations was (where we insert parentheses),
so \(X_0 M_1 M_2 M_3 M_4\) could be parsed as one block (the approximation) with all egestions
\(J(m_k)\) applied afterwards.
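Here is the whole loop as a minimal Python sketch, specialized to the fraction \((\text{S})\): it ingests \(M_n\) on the right until both columns agree on their integer part, then egests that integer with \(J(m)\) on the left. (This sketch egests as soon as the two floors agree, so it may spit out a term one ingestion earlier than the worked matrix trace below, but the resulting stream of regular terms is the same; the function name `pi_cf_terms` is mine.)

```python
def pi_cf_terms(count):
    """Stream `count` regular continued-fraction terms of pi from the
    generalized fraction (S): pi = 4/(1 + 1/(3 + 4/(5 + 9/(7 + ...))))."""
    # state matrix X = [[p, pp], [q, qp]]; start from the identity,
    # whose columns 1/0 and 0/1 encode the initial bracket (0, infinity)
    p, pp, q, qp = 1, 0, 0, 1
    n = 0                                        # index of the next M_n to ingest
    out = []
    while len(out) < count:
        if q != 0 and qp != 0 and p // q == pp // qp:
            m = p // q                           # both columns agree on the integer part
            out.append(m)
            p, q = q, p - m * q                  # left-multiply by J(m): egest m
            pp, qp = qp, pp - m * qp
        else:
            if n == 0:
                a, b = 0, 1                      # M_0: the leading a_0 = 0
            elif n == 1:
                a, b = 1, 4                      # M_1: pi = 4/(1 + ...)
            else:
                a, b = 2 * n - 1, (n - 1) ** 2   # M_n read off from (S)
            p, pp = a * p + b * pp, p            # right-multiply by M_n: ingest a term
            q, qp = a * q + b * qp, q
            n += 1
    return out

print(pi_cf_terms(10))   # [3, 7, 15, 1, 292, 1, 1, 1, 2, 1]
```

Python's integers are arbitrary precision, so the entries of \(X\) can grow freely; in a browser one would reach for `BigInt` for the same reason.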
It might be educational to go back to the numeric example from before, where the fourth stage, \(X_4\),
gave us
$$ 3 + \frac{1}{r_1} > \pi > 3 + \frac{1}{s_1},$$
but we couldn't go further, because the tails \(r_1\) and \(s_1\) had different integer parts:
6 and 7, respectively. We could only egest the \(3\) and invert, which would bring the tails to the top
thus:
$$ r_1 < \frac{1}{\pi - 3} < s_1. \tag{T1}$$
The next stage, \(X_5\), gave us more information, namely that
$$ 3 + \cfrac{1}{7+\cfrac{1}{r_2}} > \pi > 3 + \cfrac{1}{7+\cfrac{1}{s_2}},$$
which is the same as
$$ r_2 > \frac{1}{\frac{1}{\pi-3}-7} > s_2. \tag{T2}$$
Hopefully it is now clear that the jump from \((\text{T1})\) to \((\text{T2})\) is equivalent to
an iteration step from \(X_4\) to \(X_5\), which refines the bounding tails into the form \(7 + 1/r_2\)
and \(7 + 1/s_2\), followed by egesting the shared denominator \(7\). In general several steps might be required,
because both tails must first agree on that leading integer.
Finally, let's go through the algorithm directly on the matrices, starting at \(X_2\) to skip the boring part.
The coefficient matrix can be read from \((\text{S})\) to be
$$ M_n = \begin{bmatrix} 2n-1 & 1 \\ (n-1)^2 & 0 \end{bmatrix},\quad\text{for}\quad n \geq2.$$
We will write
\(x\) for the number described by the current matrix \(X_n\), and use
the upper index \(X^{(i)}\) to denote the number of egestions.
$$\begin{aligned}
X_{2} &= \begin{bmatrix} 12 & 4\\4 & 1\end{bmatrix} \Rightarrow 3 < x < 4,
&&\text{not enough info, multiply by } M_3\\
X_2 M_3 = X_3 &= \begin{bmatrix} 76 & 12\\24 & 4\end{bmatrix} \Rightarrow 3\tfrac16 > x > 3,
&&\text{not enough info, multiply by } M_4\\
X_3 M_4 = X_4 &= \begin{bmatrix} 640 & 76\\204 & 24\end{bmatrix} \Rightarrow 3\tfrac16 > x > 3\tfrac{7}{51},
&&\text{egest 3, multiply by } J(3)\\
J(3)X_4 = X^{(1)}_4 &= \begin{bmatrix} 204 & 24\\28 & 4\end{bmatrix} \Rightarrow 7\tfrac27 > x > 6,
&&\text{not enough info, multiply by }M_5\\
X^{(1)}_4 M_5 = X^{(1)}_5 &= \begin{bmatrix} 2220 & 204\\316 & 28\end{bmatrix} \Rightarrow 7\tfrac{2}{79} < x < 7\tfrac27,
&&\text{egest 7, multiply by } J(7)\\
J(7) X^{(1)}_5 = X^{(2)}_5 &= \begin{bmatrix} 316 & 28\\8 & 8\end{bmatrix} \Rightarrow 39\tfrac12 > x > 3\tfrac12,
&&\text{not enough info, multiply by }M_6\\
X^{(2)}_5 M_6 = X^{(2)}_6 &= \begin{bmatrix} 4176 & 316\\ 288 & 8\end{bmatrix} \Rightarrow 14\tfrac12 < x < 39\tfrac12,
&&\text{not enough info, multiply by }M_7\\
X^{(2)}_6 M_7 = X^{(2)}_7 &= \begin{bmatrix} 65664 & 4176 \\ 4032 & 288\end{bmatrix} \Rightarrow 16\tfrac27 > x > 14\tfrac12,
&&\text{not enough info, multiply by }M_8\ldots\\
\end{aligned}$$
And we have to keep going until \(M_{11}\), where not one but two denominators can be extracted in a row:
\(15\) and \(1\), which brings us to the famous fraction \(\frac{355}{113}\).
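Feeding those egested terms back into the recurrence \((\text{R})\) indeed recovers the familiar convergents; a small sketch:

```python
from fractions import Fraction

terms = [3, 7, 15, 1]                  # the regular terms egested so far
p_prev, q_prev = 1, 0                  # p_-1 and q_-1 from (R)
p, q = terms[0], 1
for a in terms[1:]:
    p, p_prev = a * p + p_prev, p
    q, q_prev = a * q + q_prev, q
    print(Fraction(p, q))              # 22/7, then 333/106, then 355/113
```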
The following table shows the first 20 stages,
with the left column of \(X_n\) given as rational
and decimal values, because it's the “latest” and hence most precise.
A few words about the error. The bracketing property means that \(\pi\), or whatever number, is between
\(\frac{p_{n-1}}{q_{n-1}}\) and \(\frac{p_n}{q_n}\), so we are off by at most the difference. In the case
of regular fractions this can further be simplified, using recurrence \((\text{R})\). Subtracting
the two fractions gives \(\frac{(-1)^{n-1}}{q_nq_{n-1}}\), where the sign means that the successive approximants
lie on opposite sides of \(\pi\) (being greater when \(n\) is odd).
This is why \(\frac{355}{113}\) works so great: the next fraction has a huge denominator
(\(33102\)), which means \(\pi\) is at most \({(113\times 33102)^{-1}\approx 2.7\times 10^{-7}}\) away from either of them.
Not so nice for the generalized fractions, which is evident from how huge they are — and this is after reducing them!
It is only for regular fractions that the recurrence produces coprime numbers, which goes hand in hand with the fact
that they are the best approximations for a given denominator. No wonder, then, that it takes \(19\) generalized terms
to obtain \(8\) regular ones. Still, the precision improves with
each iteration, regardless of whether a regular term could be egested, and this is how the position of the last correct
digit is determined in the table above.
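A quick numeric check of that bound for \(\frac{355}{113}\), using the next convergent \(\frac{103993}{33102}\) of \(\pi\):

```python
import math
from fractions import Fraction

x3 = Fraction(355, 113)
x4 = Fraction(103993, 33102)                     # the next convergent of pi
bound = Fraction(1, 113 * 33102)                 # |x4 - x3| = 1/(q_3 q_4)

print(abs(x4 - x3) == bound)                     # True
print(abs(math.pi - float(x3)) < float(bound))   # True: pi sits inside the bracket
```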
IV
There are continued fractions where the generalized form converges (much) faster than the regular one, or can be
made to converge faster by some acceleration technique. The usual example of a speedup for \(\pi\) is to compare
the one we used with the simplest expansion from the
Leibniz series
for \(\arctan\), whose convergence, despite its beauty, is terrible.
But let's leave \(\pi\) for now, and instead look at another staple irrational number: