cubbi.com: Fibonacci numbers in different programming languages

cubbi.com: Fibonacci numbers in many programming languages

serious stuff
  www.cubbi.com
personal
programming
  fibonacci numbers
    forth examples
    postscript examples
    muf examples
    joy examples
    j examples
    scheme examples
    hope examples
    ocaml examples
    haskell examples
    prolog examples
    c++ examples
    java examples
    assembly language examples
    fortran examples
    c examples
    sh examples
    awk examples
    perl examples
    tcl examples
  asmix
  hacker test
  resume
science
martial arts
fun stuff
  www.cubbi.org

Fibonacci numbers are defined as follows

F(0) = 0
F(1) = 1
F(n) = F(n-1) + F(n-2)
for negative n: F(-n) = (-1)ⁿ⁺¹F(n)

Note that examples and benchmarks made before september 2009 on this page use the older, less general, definition, with F(0) = 1.
September 2009 UPDATE:
UPDATE IN PROGRESS: Five years since I wrote this page, I've decided to rewrite all examples to be more useful, the subroutines will be as general as possible (no more 2x2 matrix multiplication), all code will be commented, each example will now take its arguments from command line (where possible)
April 2014 UPDATE:
This effort was abandoned somewhere in 2010.

ALGORITHMS AND BENCHMARKS SUMMARY

1. EXPONENTIAL COMPLEXITY
1A - naive binary recursion execution time: T(n)=t_o*φⁿ	asm java C	t_o=2.8 ns t_o=3.0 ns t_o=3.1 ns
2. LINEAR COMPLEXITY
1B - cached binary recursion / memoization	pending 2009/2010 update
2A - cached linear recursion / infinite lazy-evaluated list	pending 2009/2010 update
2B - linear recursion with accumulator	pending 2009/2010 update
2C - imperative loop with mutable variables	pending 2009/2010 update
3. LOGARITHMIC COMPLEXITY
3A - matrix multiplication	pending 2009/2010 update
3B - fast recursion	pending 2009/2010 update
3C - Binet's formula with rounding	pending 2009/2010 update

EXAMPLES

Languages	Algorithms
Languages	1A	1B	2A	2B	2C	3A	3B	3C
concatenative languages (program is a composition of functions applied to stack of arguments)
Forth (1960)	OK	OK	OK	OK	OK	OK	OK	OK
PostScript (1982)	OK	OK	OK	OK	OK	OK	OK	OK
MUF (1989)	OK	OK	OK	OLD	OLD	OLD	OLD	OLD
Joy (1999)	OK		OLD	OLD	OLD	OLD	OLD	OLD
possible todo: Factor
Vector languages (program is a composition of functions applied to a multidimentional array argument)
J (1990)	OK	OK	OK	OK	OK	OK	OK	OK
possible todo: APL, K, Q, Nial
functional languages (program is an application of functions)
Scheme (1975)	OK	OK	OK	OK	OK	OK	OK	OK
Hope (1978)	OK	OK	OK	OK	N/A	OK	OK	OK
OCaml (1989)	OK	OK	OK	OK	OK	OK	OK	OK
Haskell (1990)	OK	OK	OK	OK	N/A	OK	OK	OK
possible todo: Clojure, Pure
logic and concurrent languages (program is a collection of logic statements)
Prolog (1971)	OK	OK	OK	OK	OK	OK	OK	OK
possible todo: KL1, Erlang
object languages (program is a list of data objects and their relationships)
C++ (1986)	OK	OK	OK	OK	OK	OK	OK	OK
Java (1995)	OK	OK	OK	OK	OK	OK	OK	OK
possible todo: C#, Python
imperative languages (program is a sequence of commands to the CPU)
Assembly (1952)	OK		OLD	OLD	OLD	TBD	OLD	TBD
Fortran (1954)	OK		OLD		OLD	OLD	OLD	OLD
C (1972)	OK	OK	OLD	OK	OK	OLD	OLD	OLD
sh (1972)	OK		OLD	OLD	OLD	OLD	OLD	OLD
awk (1978)	OK		OLD	OLD	OLD	OLD	OLD	OLD
perl (1987)	OK		OLD	OLD	OLD	OLD	OLD	OLD
Tcl (1990)	OK		OLD	OLD	OLD	OLD	OLD	OLD

ALGORITHMS AND BENCHMARKS IN DETAIL

Please remember that these benchmarks do not show strength or weakness of any programming language. All they show is how efficient specific language translators are at implementing the featured algorithms. It is currently (as of 2009) in the transitional stage from OLD to NEW benchmarks using modern hardware and better written examples.

Benchmarked on Intel Pentium Core i7 2.66GHz running Linux 2.6.31 with 12GB RAM, in x86_64 mode
OLD benchmarks: Intel Pentium 4 2.20GHz running Linux 2.5.75 with 640M RAM
NEW RAW BENCHMARK DATA available if you want to see them, as well as OLD RAW BENCHMARK DATA.

ALGORITHM 1A: NAIVE BINARY RECURSION
If n equals 0, return 0 If n equals 1, return 1 Recursively calculate F(n-1) Recursively calculate F(n-2) Return the sum of the results from steps 2 and 3 Complexity (fixed precision): O(exp n) speed, O(exp n) space The number of additions grows exponentially with n, and each recursive call requires memorization of state (since neither call is tail-recursive), which makes space complexity also exponential. This algorithm becomes uselessly slow for F(n) with n>50 and all measured results easily fit in 64-bit integer data type. With the duration of the integer arithmetic operations independent of n, the time to execute these programs is almost perfectly t₀φⁿ, which is why the plots of the logarithm of execution time vs. n are parallel straight lines. This algorithm was implemented directly in all languages featured on this web page, since all of them allow recursion. Only Fortran did not allow recursion in the past, but it is supported since the 1990 standard. Curiously, Java outperforms C++ on this test, but it's one of the very few artifical tests where it does so
Language	Translator [*]	T₀	T(46), seconds
Assembly	GNU as 2.19.1 [c]	2.8 ns	11.63	log(Execution time, seconds) vs. Fibonacci number calculated
Java	Sun Java SE 1.6.0_16-b01 / HotSpot 64-Bit VM 14.2-b01 [b]	3.0 ns	12.34
C	GNU gcc 4.4.1 [c]	3.1 ns	12.57
C++	GNU g++ 4.4.1 [c]	3.1 ns	12.61
OCaml	INRIA OCaml 3.11.1 [c]	3.4 ns	13.98
Fortran	GNU gfortran 4.4.1 [i]	4.3 ns	17.62
Forth	GNU gforth 0.7.0 [i]	15.3 ns	61.91
Scheme	MIT Scheme 7.7.90 20090107 [b]	22.6 ns	98.04
Haskell	GHC 6.8.2 [c]	31.3 ns	129.1
Joy	Joy1 by Manfred von Thun, 07-May-03 [i]	199 ns	813.5
PostScript	GPL Ghostscript 8.70 [i]	284 ns	1147
awk	GNU Awk 3.1.7 [i]	417 ns	1729
MUF	FuzzBall MUCK 6.09 [b]	716 ns	2939
perl	perl 5.8.8 [i]	755 ns	3101
J	Jsoftware J64 6.02 [i]	1.08 μs	4386
Hope	Hope by Ross Paterson, 08-Dec-03 [i]	1.52 μs	6035
Prolog	SWI Prolog 5.6.64 [b]	2.17 μs	8900
Tcl	Tcl 8.5.7 [i]	3.37 μs	13700
sh	bash 4.0.33(2) [i]	101 μs	415000

[*] - [c] for Compiler, [b] for Bytecode intepreter, [i] for Interpreter
Execution time is user time as reported by time(1) except for MUF, where an internal time function was called from the program. T(46) was measured directly only when it was less than 2000 seconds, otherwise it was extrapolated from execution times for smaller inputs. In all cases I tried to pick the fastest one of the language translators that are readily available and offer unlimited precision integer arithmetics (it becomes important in further tests).

ALGORITHM 1B: CACHED BINARY RECURSION / MEMOIZATION
If n equals 0, return 0 If n equals 1, return 1 Read F(n-1) from cache or recursively calculate if not cached yet Read F(n-2) from cache or recursively calculate if not cached yet Return the sum of the results from steps 2 and 3 and save it in the cache Complexity (unlimited precision): O(n²) speed, O(exp n) space The common optimization technique applied to binary recursions is memoization: for any given set of arguments, the recursive function is calculated only once, and the result is stored in memory. Further calls to the function with the same arguments return the previously calculated result. Since no function result is calculated twice, the number of additions required to calculate F(n) is reduced to n. The number of cache lookups is instead exponential but, assuming it is implemented as a suitable random-access data structure (array or hash table), the time of each cache lookup is O(1). Space requirements for this algorithm are still exponential because of binary recursion. In the programming languages that have built-in support for memoization (such as J or Factor), algorithm 1A can be turned into 1B with just one keyword. For other languages, I manually implement a suitably generic data structure.
Language	Translator [*]	t₀	T(1,000,000), seconds
				sqrt(Execution time, seconds) vs. Fibonacci number calculated

ALGORITHM 2A: CACHED LINEAR RECURSION
When previously calculated Fibonacci numbers are stored in a data structure, there is no need for binary recursion, the entire structure can be filled with only n iterations. infinite lazy evaluated list. Construct an infinite list F with F[0] equal 0 and F[1] equal 1 Define F[i] as the sum of F[n-1] and F[n-2]. Return the nth element of the list In the languages that support lazy evaluation, either implicitly (like Hope or Haskell) or explicitly (like OCaml or Scheme), it is possible to construct the data structure containing all Fibonacci numbers and simply return the nth element of the structure. random-access container. Create a container F with the elements F[0] equal 0 and F[1] equal 1 Recursively (or iteratively) calculate the elements F[2..n] according to the formula F[i]=F[i-1]+F[i-2] Return F[n] In the languages that only support eager evaluation, the entire data structure must be filled with an explicit loop (or a tail-recursive function). If the container permits random access with O(1) complexity, the total execution complexity is O(n), assuming fixed precision. For large precision numbers, the complexity of addition grows with n as well. Complexity (fixed precision): O(n) speed, O(n) space Complexity (unlimited precision): O(n²) speed, O(n²) space.
Language	Translator [*]	t₀	T(1,000,000), seconds
				sqrt(Execution time, seconds) vs. Fibonacci number calculated

ALGORITHM 2B: LINEAR RECURSION WITH ACCUMULATOR
Define a recursive function λ(n₁,n₂,n) such as λ(n₁,n₂,n) returns 0, if n=0 and 1 if n=1 λ(n₁,n₂,n) return the result of λ(n₂,n₁+n₂,n-1), if n>1 Return the result of λ(0,1,n) Complexity (fixed precision): O(n) speed, O(1) space Complexity (unlimited precision): O(n²) speed, O(n) space These algorithms execute n additions to calculate F(n), and store the intermediate results in the recursive function's additional arguments, which serve as accumulators. The time to execute is, therefore n * time to execute one addition and one recursive call. The addition of large numbers if O(n), and recursive calls are typically constant time and never greater than O(n). The time to execute the algorithm is T(n)=t₀n². which is a straight line in the plot of sqrt(T) vs. n. This recursive functions is easily written as tail recursion, which means it does not contribute to space complexity, but space is needed to store the accumulated sum, which grows as O(n). In the languages that do not support tail recursion, such as C/C++, space grows as O(n²), and the algorithm quickly becomes useless. Such languages are more suited for 2C (iteration) anyway.
Language	Translator [*]	t₀	T(1,000,000), seconds
Haskell	GHC 6.8.2 [c]	12.4 fs	12.42	sqrt(Execution time, seconds) vs. Fibonacci number calculated
Prolog	SWI Prolog 5.7.15 [b] with GMP 4.3.1	21.3 fs	22.22
OCaml	INRIA OCaml 3.11.1 [c]	32.2 fs	32.14
Java	Sun Java SE 1.6.0_16-b01 / HotSpot 64-Bit VM 14.2-b01 [b]	34.1 fs	31.27
Scheme	MIT Scheme 7.7.90 20090107 [b]	48.8 fs	47.88
Forth	GNU gforth 0.7.0 [i]	-	-
C++	GNU g++ 4.4.1 [c] with GMP 4.3.1	-	-
J	Jsoftware J64 6.02 [i]	-	-

[*] - [c] for Compiler, [b] for Bytecode intepreter, [i] for Interpreter
Execution time is user time as reported by time(1), the test programs were modified to avoid conversion of the entire result to string, since that operation may be slow for large numbers.
J reports stack overflow when trying to calculate numbers over ~3000, and I could not find a way to increase available stack size in jconsole
C++ linked with GMP consumes memory at a very large pace, exceeding 20Gb when calculating 500,000th number
Forth (Gforth linked with GMP) also consumes memory at a very large pace, running into tens of gigabytes about the 500,000th number

ALGORITHM 2C: IMPERATIVE LOOP WITH MUTABLE VARIABLES
Assign value 0 to variable n₁ and value 1 to variable n₂ Assign the value of n₁+n₂ to variable n₂ and assign the old value of n₂ to n₁ Repeat step 2 n times. Return the value of variable n₁ Complexity (fixed precision): O(n) speed, O(1) space Complexity (unlimited precision): O(n²) speed, O(n) space These algorithms execute n additions to calculate F(n), and store the intermediate results in mutable variables. Time to execute is, therefore n * time to execute one addition and two variable read/store operations. The addition of large numbers if O(n), and store/retrieve is typically constant time and never greater than O(n). The time to execute the algorithm is T(n)=t₀n². which is a straight line in the plot of sqrt(T) vs. n.
Language	Translator [*]	t₀	T(1,000,000), seconds
Forth	GNU gforth 0.7.0 [i] with GMP 4.3.1	11.1 fs	11.13	sqrt(Execution time, seconds) vs. Fibonacci number calculated
C++	GNU g++ 4.4.1 [c] with GMP 4.3.1	23.2 fs	22.62
Java	Sun Java SE 1.6.0_16-b01 / HotSpot 64-Bit VM 14.2-b01 [b]	31.6 fs	29.52
OCaml	INRIA OCaml 3.11.1 [c]	35.9 fs	34.52
Scheme	MIT Scheme 7.7.90 20090107 [b]	50.8 fs	48.46
J	Jsoftware J64 6.02 [i]	413 fs	403.6
Prolog	SWI Prolog 5.7.15 [b] with GMP 4.3.1	73.8 ns	-
Haskell	GHC 6.8.2 [c]	-	-
Hope	Hope by Ross Paterson, 08-Dec-03 [i]	-	-

ALGORITHM 3A: MATRIX MULTIPLICATION
This is the most mathematically elegant way to calculate a Fibonacci number. Complexity (fixed precision): O(log n) speed, O(1) space Complexity (unlimited precision): O(n log² n log log n) speed, O(n) space Raising a matrix to the power of n requires only O(log n) matrix multiplications, if n is a natural number and the repeated squaring algorithm is used. Space requirements for repeated squaring include the product accumulator and the last calculated square, which makes it O(1) (in fixed-precision case) unless non-tail-recursive implementation is used. In case of unlimited precision, execution time of each 2x2 matrix multiplication is proportionate to the execution time of one integer multiplication, which has the complexity of O(n log n log log n) and space requirements to store one 2x2 matrix grow as O(n). The repeated squaring algorithm for calculation of xⁿ: start with product accumulator a=1 and last square accumulator s=x if n is odd, increase the accumulated product, a=a*s calculate the next square of s, s=s² divide n by 2, n = n div 2 repeat until n=0, at which time the final value of a is the answer In languages providing bit operations, this algorithm can examine the bits in the binary representation of n, proceeding from least-significant to most-significat bit, instead of repeating division by 2. The example in J makes good use of this approach.
Language	Translator [*]	t₀	T(1,000,000), seconds
				sqrt(Execution time, seconds) vs. Fibonacci number calculated

ALGORITHM 3B: FAST RECURSION
This algorithm can be derived either from from the fact that, from definition, F(n+m)=F(n)F(m) + F(n-1)F(m-1), or by rewriting the matrix equation using the fact that A^n+m=AⁿA^m: This algorithm is faster than than 3A because it does not repeat calculations of the same numbers (top right and bottom left corners of the matrix in 3A), but it cannot be easily implemented as a tail-recursive function, resulting in O(log n) space complexity. I implement it this way: Define a recursive function λ(n) such as λ(n) returns a pair of numbers (0,0) if n equals 0 λ(n) returns a pair of numbers (1,0) if n equals 1 recursively calculate λ(n div 2) (integer division, (n-1)/2 for odd n) to obtain a pair of numbers (k₂,k₁) for even n, return a pair of numbers (2k₁+k₂)k₂, k₁²+k₂²) for odd n, return a pair of numbers ((k₁+k₂)² , (2k₁+k₂)k₂) The end result is the first element of the result of λ(n) Complexity (fixed precision): O(log n) speed, O(log n) space Complexity (unlimited precision): O(n log² n log log n) speed, O(n log n) space
Language	Translator [*]	t₀	T(1,000,000), seconds
				sqrt(Execution time, seconds) vs. Fibonacci number calculated

ALGORITHM 3C: BINET'S FORMULA WITH ROUNDING
Eigendecomposition of the matrix equation from 1A gives a formula, published by Abraham de Moivre in 1730, Euler in 1765, and by Jacques Philippe Marie Binet in 1843 Where φ is the golden ratio, (1+sqrt(5))/2. Because (1-phiⁿ) / sqrt(5) is less than 1/2 for all positive n, this formula can be simplified to nearest integer to φⁿ/sqrt(5) or, using the floor function, Complexity (fixed precision): O(log n) speed, O(1) space Complexity (unlimited precision): O(n log² n log log n) speed, O(n) space Evaluation of this formula, just as evaluation of the formula in 3A, is thus reduced to taking an integer power of a number, only this number is irrational and must be represented as a floating-point value with sufficient precision to produce the correct fibonacci number. Not many languages support arbitrary precision floating point values, I use GNU MP library where possible. The computational and space complexity of this algorithm are the same as 1A, and are the same as the complexity of the repeated squaring algorithm with one addition: time to calculate the square root of five with sufficient precision. Where not provided by the compiler/library, I iteratively calculate inverse square root via y(n+1) = 0.5y(n)[3 - a*y(n)^2] because this recurrent expression has no division.
Language	Translator [*]	t₀	T(1,000,000), seconds
				sqrt(Execution time, seconds) vs. Fibonacci number calculated

LINKS
Wolfram's MathWorld entry
OEIS entry