Self-interpreter
A
self-interpreter is a
programming language interpreter written in the language it interprets. An example would be a
BASIC interpreter written in BASIC. For many years the construction of a self-interpreter (or self-compiler) was considered an important step in the acceptance of a language; writing an interpreter is a difficult task that requires a rich language, so if a language can interpret itself this is a good indication that it is fairly powerful.
A definition of a computer language is usally made in relation to an abstract machine, often writen in Backus-Naur form, or even graphically.
The interpreter is a program who's job is to read the language source, typically in text form, and build a running program from it. A language is considered "powerful enough" if, in theory, any program can be written using it, a feature known as being Turing complete. For this reason, any reasonably powerful language should be able to write its own interpreter.
This may sound paradoxical, how can one write a language in that language if it does't yet exist? However there is little mystery here, languages are almost always "mocked up" in a general language-development system like yacc or lex, and then later converted to a final form. In these cases the early mockups can be used to develop the source code to the language.
Once the system is bootstrapped new versions of the language can be developed in the language itself.
In the past this bootstrapping operation was considered a milestone for almost every language. Today this it is no longer considered to be quite as important. Since development systems today often support many languages, it is more important to work well and provide reasonable performance in this system than it is to satisfy the computer scientists among the user base.
A full definition of a self-interpreter includes:
- the language must be Turing-complete
- the behavior of programs when interpreted by a self-interpreter must be the same as their behavior when interpreted by any other interpreter, i.e. the two must produce identical observable output for any legitimate input, though not necessarily at the same speed
- the self-interpreter must not use language constructs which are designed to facilitate recognition and interpretation of these and other language constructs (self-interpretation), e.g. LISP's eval.
The second requirement looks like a tautology saying that a self-interpreter is an interpreter. Nevertheless this requirement rules out languages whose pragmatics are not sufficient to permit a self-interpreter.
Not every Turing-complete language can have a self-interpreter in this sense.
This property forms a distinction between languages: some languages are environmentally complete, while others are not. Thus, Turing-completeness comes from the semantics (computer language semantics) of the language and environmental completeness from its pragmatics (computer language pragmatics).
This definition was made by Daniel B. Cristofani and
Oleg Mazonka in "A Very Short Self-Interpreter".