Header file
From Wikipedia, the free encyclopedia
In computer programming, particularly in the C and C++ programming languages, a header file or include file is a file, usually in the form of source code, that a compiler automatically includes when processing another source file. Typically, programmers specify the inclusion of header files via compiler directives at the beginning (or head) of the other source file.
A header file commonly contains forward declarations of classes, subroutines, variables, and other identifiers. Programmers who wish to declare standardized identifiers in more than one source file can place such identifiers in a single header file, which other code can then include whenever the header contents are required.
The C standard library and C++ standard library traditionally declare their standard functions in header files.
Contents |
[edit] Motivation
In most modern computer programming languages, programmers can break up programs into smaller components (such as classes and subroutines) and distribute those components among many translation units (typically in the form of physical source files), which the system can compile separately. Once a subroutine needs to be used somewhere other than in the translation unit where it is defined, the concept of forward declarations or function prototypes must be introduced. For example, a function defined in this way in one source file:
int add(int a, int b) { return a + b; }
may be declared (with a function prototype) and then referred to in a second source file as follows:
int add(int, int); int triple(int x) { return add(x, add(x, x)); }
However, this simplistic approach requires that the programmer maintain the function declaration for add
in two places — in the file containing its implementation and in the file using the functionality. If the definition of the function ever changes, the programmer must remember to update all the prototypes scattered across the program, as well. This is necessary, because both C and C++ implementations need not diagnose all violations of what in C++ is called the one definition rule (ODR). Actually most of them rely on a linker to do this. The linker, however, is typically very limited in its knowledge of types used in a program. This leads to some ODR violations that can not be caught by the language implementation. As a result, it is the programmer's responsibility to keep all declarations that cross translation unit boundaries coherent. Manually searching for all declarations of the same external entity and verifying that they are compatible is a tedious task. (Note that C does not define the term one definition rule — it is C++ specific. If declarations of the same entity in many C source files are different, however, the application will not function correctly and this results in undefined behavior, regardless of the naming of the rule being violated.)
To understand an ODR violation, consider the following (correct) example:
/* File print-heading.c */ #include <stdio.h> void print_heading(void) { printf("standard heading\n"); }
/* File main.c */ void print_heading(void); int main(void) { print_heading(); return 0; }
The translation unit represented by the main.c
source file references the print_heading()
function defined in another translation unit (print-heading.c
). According to the rules of C99, programmers must declare an external function before the first use. To meet this requirement, the main.c
file declares the function in the first line. This version of the program works correctly.
Later, the programmer who maintains the print-heading.c
source file may decide to make the function more flexible and support custom headings. A possible implementation might look like this:
/* File print-heading.c */ #include <stdio.h> void print_heading(const char *heading) { printf("%s\n", heading); }
If the programmer forgets to update the declaration in main.c
, dire results can occur. The print_heading()
function expects an argument and accesses its value, however, the main()
function does not supply any value. Executing this program leads to undefined behavior: the application may print garbage, terminate unexpectedly or perform unexpectedly according to the platform that it is being executed on.
Why does this code compile and link successfully? This happens because the compiler relies on a declaration in main.c
when compiling the main.c
translation unit. And that declaration matches the form of the function. Later, when the linker combines the compiled main.c
and print-heading.c
translation units (in most implementations they are represented as main.o
or main.obj
files), it probably could detect the inconsistency — but not in C. C implementations reference functions by names at object and binary file levels, this does not include the return value or the argument list. The linker encounters a reference to print_heading()
in main.o
and finds a suitable function in print-heading.o
. At this stage, all information about function argument types is lost.
How do we therefore manage multiple declarations successfully? Header files provide the solution. A module's header file declares each function, object, and data type that forms part of the public interface of the module — for example, in this case the header file would include only the declaration of add
. Each source file that refers to add
uses the #include
directive to bring in the header file:
/* File add.h */ #ifndef ADD_H #define ADD_H int add(int, int); #endif /* ADD_H */
(Aside: note that this header file uses an include guard.)
/* File triple.c */ #include "add.h" int triple(int x) { return add(x, add(x, x)); }
This reduces the maintenance burden: when a definition changes, only a single copy of the declaration must be updated (the one in the header file). The header file may also be included in the source file that contains the corresponding definitions, giving the compiler an opportunity to check the declaration and the definition for consistency.
/* File add.c */ #include "add.h" int add(int a, int b) { return a + b; }
Typically, programmers use header files to specify only interfaces, and usually provide as a minimum a small amount of documentation explaining how to use the components declared in the file. As in this example, the implementations of subroutines remain in a separate source file, which continues to be compiled separately. (One common exception in C and C++ is inline functions, which are often included in header files because most implementations cannot properly expand inline functions without seeing their definitions at compile time.)
[edit] Alternatives
Header files do not provide a monopoly solution to the problem of accessing identifiers declared in different files. They have the disadvantage that programmers may still find it necessary to make changes in two places (a source file and a header file) whenever a definition changes. Some newer languages (such as Java) dispense with header files and instead use a naming scheme that allows the compiler to locate the source files associated with interfaces and class implementations (but in doing so, restrict file-naming freedom). In such languages, the ODR problem is typically solved by two techniques: first, the compiler puts full information about types into compiled code and this information remains accessible even at run-time when the program executes. Second, Java and other modern languages have the ability to verify the number and types of arguments at method call. This does not come for free, however; it results in space and execution-time overhead that is not acceptable for some time-critical applications.
COBOL and RPG IV have a form of include files called copybooks. Programmers "include" these into the source of the program in a similar way to header files, but they also allow replacing certain text in them with other text. The COBOL keyword for inclusion is COPY, and replacement is done with a REPLACING ... BY ... clause.