Bytecode
From Wikipedia, the free encyclopedia
Bytecode is a term used for various forms of instruction sets designed for efficient execution by a software interpreter and suitable for further compilation into machine code. Because the instructions are processed by software, they may be arbitrarily complex, but they are nonetheless often akin to traditional hardware instructions; stack machines are common, for instance. Different parts of a program may be stored in separate files, similar to object modules, but are loaded dynamically during execution.
The name bytecode stems from instruction sets that have one-byte opcodes followed by optional parameters. Programming language implementations may emit an intermediate representation such as bytecode to ease interpretation, or to reduce hardware and operating system dependence by allowing the same code to run on different platforms. Bytecode may be executed directly on a virtual machine (that is, an interpreter), or it may be further compiled into machine code for better performance.
Unlike human-readable source code, bytecodes are compact numeric codes, constants, and references (normally numeric addresses) that encode the result of parsing and semantic analysis of things like the type, scope, and nesting depth of program objects. They therefore allow much better performance than direct interpretation of source code.
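The one-byte-opcode layout described above can be illustrated with a minimal sketch. The instruction set here (PUSH, ADD, MUL, HALT and their numeric values) is hypothetical and invented purely for this example; it does not correspond to any real virtual machine.

```python
# Hypothetical instruction set, invented for illustration only.
PUSH, ADD, MUL, HALT = 0x01, 0x02, 0x03, 0xFF

# Number of one-byte operands that follow each opcode.
OPERAND_COUNT = {PUSH: 1, ADD: 0, MUL: 0, HALT: 0}
NAMES = {PUSH: "PUSH", ADD: "ADD", MUL: "MUL", HALT: "HALT"}


def disassemble(code):
    """Decode a byte string into a list of readable instruction strings."""
    lines = []
    pc = 0  # offset of the next opcode
    while pc < len(code):
        opcode = code[pc]
        n = OPERAND_COUNT[opcode]
        operands = code[pc + 1 : pc + 1 + n]
        lines.append(" ".join([NAMES[opcode], *map(str, operands)]))
        pc += 1 + n  # skip the opcode and its operands
    return lines


# The expression (2 + 3) * 4, encoded for a stack machine in nine bytes.
program = bytes([PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT])

for line in disassemble(program):
    print(line)
```

Each instruction occupies one byte for the opcode plus one byte per operand, so the whole expression fits in nine bytes and can be decoded without any text parsing.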
Execution
A bytecode program is normally executed by decoding and executing its instructions one at a time. This kind of bytecode interpreter is very portable. Some systems, called dynamic translators or "just-in-time" (JIT) compilers, translate bytecode into machine language as necessary at runtime. This ties the virtual machine to a particular machine architecture, but the bytecode itself remains portable. For example, Java and Smalltalk code is typically stored in bytecode format and then JIT-compiled to machine code before execution. Compiling the bytecode to native machine code introduces a delay before the program runs, but it improves execution speed considerably compared to interpretation, normally by several times.
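Executing instructions one at a time can be sketched as a simple fetch-decode-execute loop over a stack machine. The opcodes and their numeric values below are hypothetical, chosen only for illustration, and the loop omits the bounds and stack-depth checks a real interpreter would need.

```python
# Hypothetical opcodes, invented for illustration only.
PUSH, ADD, MUL, HALT = 0x01, 0x02, 0x03, 0xFF


def run(code):
    """Interpret a byte string of instructions and return the top of the stack."""
    stack = []
    pc = 0  # program counter: offset of the next byte to read
    while True:
        opcode = code[pc]  # fetch
        pc += 1
        if opcode == PUSH:  # decode and execute
            stack.append(code[pc])  # the operand is the byte after the opcode
            pc += 1
        elif opcode == ADD:
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
        elif opcode == MUL:
            b = stack.pop()
            a = stack.pop()
            stack.append(a * b)
        elif opcode == HALT:
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode {opcode:#04x} at offset {pc - 1}")


# (2 + 3) * 4
result = run(bytes([PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT]))
print(result)
```

Because the loop is ordinary portable code, the same interpreter runs wherever its host language runs. A JIT compiler would instead emit native instructions for each opcode sequence, which is faster but specific to one processor architecture.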
Because bytecode performs better than direct interpretation of source code, many language implementations today execute a program in two phases: they first compile the source code into bytecode and then pass the bytecode to a virtual machine. There are therefore virtual machines for Java, Python, PHP[1], Forth, and Tcl. The current reference implementations of Perl and Ruby instead work by walking an abstract syntax tree representation derived from the source code.
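The two-phase model can be observed directly in Python, one of the languages listed above. The sketch below assumes the CPython implementation and uses only the built-in `compile` and `exec` functions and the standard `dis` module. The exact instructions produced differ between interpreter versions, so none are shown here.

```python
import dis

source = "x = 2\nresult = (x + 3) * 4"

# Phase 1: compile the source text into a code object that holds bytecode.
code_obj = compile(source, "<example>", "exec")
raw_bytecode = code_obj.co_code  # the instruction stream as a bytes object

# List the decoded instruction names (version dependent).
opnames = [instr.opname for instr in dis.get_instructions(code_obj)]
print(opnames)

# Phase 2: hand the compiled code object to the virtual machine.
namespace = {}
exec(code_obj, namespace)
print(namespace["result"])
```

The code object can be executed repeatedly without parsing the source again, which is one reason this split pays off.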
Examples
- O-code of the BCPL programming language
- p-code of the UCSD Pascal implementation of the Pascal programming language
- Bytecodes of many implementations of the Smalltalk programming language
- Java bytecode, which is executed by the Java virtual machine
- Emacs is a text editor with a majority of its functionality implemented by its specific dialect of Lisp. These features are compiled into bytecode. This architecture allows users to customize the editor with a high level language, which after compilation into bytecode yields reasonable performance.
- EiffelStudio for the Eiffel programming language
- Managed code such as Microsoft .NET Common Intermediate Language, executed by the .NET Common Language Runtime (CLR)
- Byte Code Engineering Library
- Scheme 48 implementation of Scheme, using a bytecode interpreter
- CLISP implementation of Common Lisp compiles only to bytecode
- CMUCL and Scieneer Common Lisp implementations of Common Lisp can compile either to bytecode or to native code; bytecode is much more compact
- Embeddable Common Lisp implementation of Common Lisp can compile to bytecode or C code
- Icon programming language
- OCaml programming language optionally compiles to a compact bytecode form
- Parrot virtual machine
- LLVM, a modular bytecode compiler and virtual machine
- YARV and Rubinius for Ruby
- Infocom used the Z-machine to make its software applications more portable.
- C to Java Virtual Machine compilers
- SWEET16
- The SPIN interpreter built into the Parallax Propeller Microcontroller
- Adobe Flash objects
- BANCStar, originally bytecode for an interface-building tool but used as a language in its own right.
- Ericsson implementation of Erlang uses BEAM bytecodes
Notes
- ^ PHP opcodes, however, are generated each time the program is launched, and they are always interpreted rather than just-in-time compiled.