Bootstrapping (compilers)
From Wikipedia, the free encyclopedia
Bootstrapping is a term used in computer science to describe the techniques involved in writing a compiler (or assembler) in the target programming language which it is intended to compile. This technique is also called self-hosting.
One may then wonder how the chicken and egg problem of creating the compiler was solved: if one needs a compiler for language X to obtain a compiler for language X (which is written in language X), how did the first compiler get written? Possible methods include:
- Implementing an interpreter or compiler for language X in language Y. Niklaus Wirth reported that he wrote the first Pascal compiler in Fortran. Language Y could also be hand coded machine code or assembly language.
- Another interpreter or compiler for X has already been written in another language Y; this is how Scheme is often bootstrapped.
- Earlier versions of the compiler were written in a subset of X for which there existed some other compiler; this is how some supersets of Java are bootstrapped.
- The compiler for X is cross compiled from another architecture where there exists a compiler for X; this is how compilers for C are usually ported to other platforms.
- Writing the compiler in X; then hand-compiling it from source (most likely in a non-optimized way) and running that on the code to get an optimized compiler. Donald Knuth used this for his WEB literate programming system.
Methods for distributing compilers in source code include providing a portable bytecode version of the compiler, so as to bootstrap the process of compiling the compiler with itself.
The first language to provide such a bootstrap was NELIAC. The first commercial language to do so was PL/I. Today, a large proportion of programming languages are bootstrapped, including BASIC, C, Pascal, Factor, Haskell, Modula-2, Oberon, OCaml, Common Lisp, Scheme and more.