Comeau C++ 4.x
Template Instantiation

  1. Template Instantiation

    The C++ language includes the concept of templates. A template is a description of a class or function that is a model for a family of related classes or functions. For example, one can write a template for a Stack class, and then use a stack of integers, a stack of floats, and a stack of some user-defined type. In the source, these might be written Stack<int>, Stack<float>, and Stack<X>. From a single source description of the template for a stack, the compiler can create instantiations of the template for each of the types required.
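    As a minimal sketch of the Stack example above: the class body, the member names, and the type X below are illustrative assumptions, since the text names Stack<int>, Stack<float>, and Stack<X> without showing a definition.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical Stack template: one source description for a family of classes.
template <class T>
class Stack {
public:
    void push(const T& value) { items_.push_back(value); }

    // Removes and returns the top element; the stack must not be empty.
    T pop() {
        assert(!items_.empty());
        T top = items_.back();
        items_.pop_back();
        return top;
    }

    bool empty() const { return items_.empty(); }
    std::size_t size() const { return items_.size(); }

private:
    std::vector<T> items_;
};

// A user-defined type, standing in for the X of Stack<X>.
struct X {
    int id;
};

// Each distinct use below makes the compiler create a separate
// instantiation of the single Stack template.
inline int stack_demo() {
    Stack<int> ints;
    Stack<float> floats;
    Stack<X> xs;

    ints.push(1);
    ints.push(2);
    floats.push(1.5f);
    xs.push(X{7});

    // Top of ints is 2, top of xs has id 7.
    return ints.pop() + xs.pop().id;
}
```

    Calling stack_demo() from main exercises all three instantiations; the three classes share no object code, only the source description.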

    The instantiation of a class template is always done as soon as it is needed in a compilation. However, the instantiations of template functions, member functions of template classes, and static data members of template classes (hereafter referred to as template entities) are not necessarily done immediately, for several reasons:

    1. One would like to end up with only one copy of each instantiated entity across all the object files that make up a program. (This of course applies to entities with external linkage.)

    2. The language allows one to write a specialization of a template entity, i.e., a specific version to be used in place of a version generated from the template for a specific data type. (One could, for example, write a version of Stack<int>, or of just Stack<int>::push, that replaces the template-generated version; often, such a specialization provides a more efficient representation for a particular data type.) Since the compiler cannot know, when compiling a reference to a template entity, if a specialization for that entity will be provided in another compilation, it cannot do the instantiation automatically in any source file that references it. (The modern C++ language requires that a specialization be declared in every compilation in which it is used, but for compatibility with existing code and older compilers the Comeau front end does not require that in some modes. See the command-line option --no_distinct_template_signatures.)

    3. The language also dictates that template functions that are not referenced should not be compiled; in fact, such functions might contain semantic errors that would prevent them from being compiled. Therefore, a reference to a template class should not automatically instantiate all the member functions of that class.

    (It should be noted that certain template entities are always instantiated when used, e.g., inline functions.)
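    Two of the points above can be seen in a small standard C++ sketch. The Describer template and its members are hypothetical names chosen for illustration. The first member is replaced by a specialization for int; the second would not compile for int, which is harmless as long as it is never referenced for that type.

```cpp
#include <cassert>
#include <string>

template <class T>
struct Describer {
    // Generic version, generated from the template when needed.
    std::string name() const { return "generic"; }

    // Valid only for types T that have a member function size().
    // For T = int this body would contain a semantic error, but it is
    // never compiled unless Describer<int>::length is referenced.
    int length(const T& value) const {
        return static_cast<int>(value.size());
    }
};

// Specialization of a single member function: this definition is used
// in place of the template-generated Describer<int>::name.
template <>
inline std::string Describer<int>::name() const { return "int-specific"; }
```

    Using Describer<int>::name() picks up the specialization, while Describer<std::string> uses the generic members. Adding a call to Describer<int>::length would turn the latent error into a compile-time diagnostic.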

    From these requirements, one can see that if the compiler is responsible for doing all the instantiations automatically, it can only do so on a program-wide basis. That is, the compiler cannot make decisions about instantiation of template entities until it has seen all the source files that make up a complete program.

    The Comeau C++ front end provides an instantiation mechanism that does automatic instantiation at link time (on most platforms). For cases where the programmer wants more explicit control over instantiation, the front end also provides instantiation modes and instantiation pragmas, which can be used to exert fine-grained control over the instantiation process.

  2. Automatic Instantiation

    The goal of an automatic instantiation mode is to provide painless instantiation. The programmer should be able to compile source files to object code, then link them and run the resulting program, and never have to worry about how the necessary instantiations get done.

    In practice, this is hard for a compiler to do, and different compilers use different automatic instantiation schemes with different strengths and weaknesses:

    1. AT&T/USL/Novell's cfront product (on which Comeau C++ 3.0 is based) saves information about each file it compiles in a special directory called ptrepository. It instantiates nothing during normal compilations. At link time, it looks for entities that are referenced but not defined, and whose mangled names indicate that they are template entities. For each such entity, it consults the ptrepository information to find the file containing the source for the entity, and it does a compilation of the source to generate an object file containing object code for that entity. This object code for instantiated objects is then combined with the "normal" object code in the link step.

      The programmer using cfront must follow a particular coding convention: all templates must be declared in ".h" files, and for each such file there must be a corresponding ".C" file containing the associated source definitions. The compiler is never told about the ".C" files explicitly; one does not, for example, compile them in the normal way. The link step looks for them when and if it needs them, and does so by taking the ".h" file name and replacing its suffix.

      This scheme has the disadvantage that it does a separate compilation for each instantiated function (or, at best, one compilation for all the member functions of one class). Even though the function itself is often quite small, it must be compiled along with the declarations for the types on which the instantiation is based, and those declarations can easily run into many thousands of lines. For large systems, these compilations can take a very long time. The link step tries to be smart about recompiling instantiations only when necessary, but because it keeps no fine-grained dependency information, it is often forced to "recompile the world" for a minor change in a ".h" file. In addition, cfront has no way of ensuring that preprocessing symbols are set correctly when it does these instantiation compilations, if preprocessing symbols are set other than on the command line.

    2. Borland's C++ compiler instantiates everything referenced in a compilation, then uses a special linker to remove duplicate definitions of instantiated functions.

      The programmer using Borland's compiler must make sure that every compilation sees all the source code it needs to instantiate all the template entities referenced in that compilation. That is, one cannot refer to a template entity in a source file if a definition for that entity is not included by that source file. In practice, this means that either all the definition code is put directly in the ".h" files, or that each ".h" file includes an associated ".C" (actually, ".CPP") file.

      This scheme is straightforward, and works well for small programs. For large systems, however, it tends to produce very large object files, because each object file must contain object code (and symbolic debugging information) for each template entity it references.

    Comeau's approach is a little different. It requires that for each instantiation required, there is some (normal, top-level, explicitly-compiled) source file that contains both the definition of the template entity and of any types required for the particular instantiation. This requirement can be met in various ways:

    1. The Borland convention: each ".h" file that declares a template entity also contains either the definition of the entity or includes another file containing the definition.

    2. Implicit inclusion: when the compiler sees a template declaration in a ".h" file and discovers a need to instantiate that entity, it is given permission to go off looking for an associated definition file having the same base name and a different suffix, and it implicitly includes that file at the end of the compilation. This method allows most programs written using the cfront convention to be compiled with Comeau's approach. See the section on implicit inclusion.

    3. The ad hoc approach: the programmer makes sure that the files that define template entities also have the definitions of all the available types, and adds code or pragmas in those files to request instantiation of the entities there.
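    For the ad hoc approach, one way to "add code" that requests an instantiation is a standard C++ explicit instantiation, placed in a source file that sees both the template definition and the types involved. The sketch below uses hypothetical names (Counter, twice) and standard syntax only; it does not show the Comeau pragmas, which are described in a later section.

```cpp
#include <cassert>

// Template declarations and definitions, as they might appear in the
// source file chosen to hold the instantiations.
template <class T>
class Counter {
public:
    Counter() : count_(0) {}
    void add(const T&);
    int count() const;

private:
    int count_;
};

template <class T>
void Counter<T>::add(const T&) { ++count_; }

template <class T>
int Counter<T>::count() const { return count_; }

template <class T>
T twice(T value) { return value + value; }

// Explicit instantiation definitions: ask for object code for these
// entities in this translation unit.
template class Counter<int>;            // every member of Counter<int>
template double twice<double>(double);  // one function template instance
```

    Other source files can then refer to Counter<int> and twice<double> with only the declarations visible, relying on this file to supply the definitions at link time.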

    The Comeau automatic instantiation method works as follows:

    1. The first time the source files of a program are compiled, no template entities are instantiated. However, the generated object files contain information about things that could have been instantiated in each compilation. For any source file that makes use of a template instantiation, an associated ".ii" file is created if one does not already exist (e.g., the compilation of abc.C would result in the creation of abc.ii).

    2. When the object files are linked together, a program called the prelinker is run. It examines the object files, looking for references and definitions of template entities, and for the added information about entities that could be instantiated.

    3. If the prelinker finds a reference to a template entity for which there is no definition anywhere in the set of object files, it looks for a file that indicates that it could instantiate that template entity. When it finds such a file, it assigns the instantiation to it. The set of instantiations assigned to a given file is recorded in the associated ".ii" file.

    4. The prelinker then executes the compiler again to recompile each file for which the ".ii" file was changed.

    5. When the compiler compiles a file, it reads the ".ii" file for that file and obeys the instantiation requests therein. It produces a new object file containing the requested template entities (and all the other things that were already in the object file).

    6. The prelinker repeats steps 3-5 until there are no more instantiations to be adjusted.

    7. The object files are linked together.
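    The assignment loop in steps 3 to 6 can be pictured with a toy model. This is only a sketch under stated assumptions: the ObjectFile structure, the prelink function, and the "needs" table (which stands in for the new references that appear when a file is recompiled) are all hypothetical, and the real prelinker reads object files and ".ii" files rather than in-memory sets.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy stand-in for one object file (hypothetical structure).
struct ObjectFile {
    std::string name;
    std::set<std::string> defined;          // entities with object code here
    std::set<std::string> referenced;       // entities this file uses
    std::set<std::string> can_instantiate;  // entities it could instantiate
    // References that appear once a given entity is instantiated here.
    std::map<std::string, std::set<std::string>> needs;
};

// Returns the ".ii"-style assignments: file name -> entities assigned to it.
std::map<std::string, std::set<std::string>> prelink(std::vector<ObjectFile>& files) {
    std::map<std::string, std::set<std::string>> assignments;
    bool changed = true;
    while (changed) {  // step 6: repeat until nothing is adjusted
        changed = false;
        std::set<std::string> defined, referenced;
        for (const ObjectFile& f : files) {
            defined.insert(f.defined.begin(), f.defined.end());
            referenced.insert(f.referenced.begin(), f.referenced.end());
        }
        for (const std::string& entity : referenced) {
            if (defined.count(entity)) continue;  // already satisfied
            for (ObjectFile& f : files) {
                if (!f.can_instantiate.count(entity)) continue;
                assignments[f.name].insert(entity);  // step 3: assign
                f.defined.insert(entity);            // steps 4-5: "recompile"
                auto it = f.needs.find(entity);
                if (it != f.needs.end())
                    f.referenced.insert(it->second.begin(), it->second.end());
                changed = true;
                break;  // one file per entity
            }
        }
    }
    return assignments;
}

// Two files reference A<int>::f(); only a.o can instantiate it, and doing
// so introduces a reference to A<int>::g().
std::vector<ObjectFile> sample_program() {
    ObjectFile a;
    a.name = "a.o";
    a.referenced = {"A<int>::f()"};
    a.can_instantiate = {"A<int>::f()", "A<int>::g()"};
    a.needs["A<int>::f()"] = {"A<int>::g()"};
    ObjectFile b;
    b.name = "b.o";
    b.referenced = {"A<int>::f()"};
    return {a, b};
}
```

    In this model a specialization simply shows up in some file's defined set, so the entity is never assigned, which mirrors how the prelinker treats a specialization as an ordinary definition. References that no file can instantiate are left unresolved for the linker to report.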

    Once the program has been linked correctly, the ".ii" files contain a complete set of instantiation assignments. From then on, whenever source files are recompiled, the compiler will consult the ".ii" files and do the indicated instantiations as it does the normal compilations. That means that, except in cases where the set of required instantiations changes, the prelink step from then on will find that all the necessary instantiations are present in the object files and no instantiation assignment adjustments need be done. That's true even if the entire program is recompiled.

    If the programmer provides a specialization of a template entity somewhere in the program, the specialization will be seen as a definition by the prelinker. Since that definition satisfies whatever references there might be to that entity, the prelinker will see no need to request an instantiation of the entity. If the programmer adds a specialization to a program that has previously been compiled, the prelinker will notice that too and remove the assignment of the instantiation from the proper ".ii" file.

    The ".ii" files should not, in general, require any manual intervention. One exception: if a definition is changed in such a way that some instantiation no longer compiles (it gets errors), and at the same time a specialization is added in another file, and the first file is being recompiled before the specialization file and is getting errors, the ".ii" file for the file getting the errors must be deleted manually to allow the prelinker to regenerate it.

    If the prelinker changes an instantiation assignment, it will issue a message like

    C++ prelinker: A<int>::f() assigned to file test.o
    C++ prelinker: executing: /Comeau/bin/como -c test.c

    The automatic instantiation scheme can coexist with partial explicit control of instantiation by the programmer through the use of pragmas or command-line specification of the instantiation mode. See the following sections.

    The automatic instantiation mode can be configured out under a custom porting arrangement. It can be turned off by the command-line option -T. If automatic instantiation is turned off, the extra information about template entities that could be instantiated in a file is not put into the object file.

  3. Instantiation Modes

    Normally, when a file is compiled, no template entities are instantiated (except those assigned to the file by automatic instantiation). The overall instantiation mode can, however, be changed by a command line option:

    In the case where the como script is given a single file to compile and link, e.g.,

    como t.c

    the compiler knows that all instantiations will have to be done in the single source file. Therefore, it uses the -tused mode and suppresses automatic instantiation.

  4. Instantiation #pragma Directives

    Instantiation pragmas can be used to control the instantiation of specific template entities or sets of template entities. There are three instantiation pragmas:

    1. The instantiate pragma causes a specified entity to be instantiated.

    2. The do_not_instantiate pragma suppresses the instantiation of a specified entity. It is typically used to suppress the instantiation of an entity for which a specific definition will be supplied.

    3. The can_instantiate pragma indicates that a specified entity can be instantiated in the current compilation, but need not be; it is used in conjunction with automatic instantiation, to indicate potential sites for instantiation if the template entity turns out to be required.

    The argument to the instantiation pragma may be:

    A pragma in which the argument is a template class name (e.g., A<int> or class A<int>) is equivalent to repeating the pragma for each member function and static data member declared in the class. When instantiating an entire class, a given member function or static data member may be excluded using the do_not_instantiate pragma. For example,

    #pragma instantiate A<int>
    #pragma do_not_instantiate A<int>::f

    The template definition of a template entity must be present in the compilation for an instantiation to occur. If an instantiation is explicitly requested by use of the instantiate pragma and no template definition is available or a specific definition is provided, an error is issued.

    template <class T> void f1(T);  // No body provided
    template <class T> void g1(T);  // No body provided
    void f1(int) {}  // Specific definition
    int main()
    {
       int     i = 0;
       double  d = 0.0;
       f1(i);
       f1(d);
       g1(i);
       g1(d);
       return 0;
    }
    #pragma instantiate void f1(int) // error - specific def.
    #pragma instantiate void g1(int)  // error - no body provided

    f1(double) and g1(double) will not be instantiated (because no bodies were supplied) but no errors will be produced during the compilation (if no bodies are supplied at link time, a linker error will be produced).

    A member function name (e.g., A<int>::f) can only be used as a pragma argument if it refers to a single user defined member function (i.e., not an overloaded function). Compiler-generated functions are not considered, so a name may refer to a user defined constructor even if a compiler-generated copy constructor of the same name exists. Overloaded member functions can be instantiated by providing the complete member function declaration, as in

    #pragma instantiate char* A<int>::f(int, char*)

    The argument to an instantiation pragma may not be a compiler-generated function, an inline function, or a pure virtual function.
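    The need for a complete declaration when a member function is overloaded has a close analogue in standard C++ explicit instantiation, sketched below. This is not the Comeau pragma syntax; the bodies given to the two overloads of A<T>::f are hypothetical and exist only to make the example self-contained.

```cpp
#include <cassert>

template <class T>
struct A {
    int f(T);             // first overload
    char* f(int, char*);  // second overload
};

template <class T>
int A<T>::f(T) { return 1; }

template <class T>
char* A<T>::f(int offset, char* text) { return text + offset; }

// The name A<int>::f alone does not identify one function, so the full
// declaration is written out to select the second overload.
template char* A<int>::f(int, char*);
```

    Only the two-argument overload is explicitly instantiated here; the one-argument overload is still instantiated implicitly wherever it is used.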

  5. Implicit Inclusion

    When implicit inclusion is enabled, the front end is given permission to assume that if it needs a definition to instantiate a template entity declared in a ".h" file it can implicitly include the corresponding ".C" file to get the source code for the definition. For example, if a template entity ABC::f is declared in file xyz.h, and an instantiation of ABC::f is required in a compilation but no definition of ABC::f appears in the source code processed by the compilation, the compiler will look to see if a file xyz.C exists, and if so it will process it as if it were included at the end of the main source file.

    To find the template definition file for a given template entity the front end needs to know the full path name of the file in which the template was declared and whether the file was included using the system include syntax (e.g., #include <file.h>). This information is not available for preprocessed source containing #line directives. Consequently, the front end will not attempt implicit inclusion for source code containing #line directives.

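    The definition file is found by replacing the suffix of the declaring file's name with each suffix from a list of definition-file suffixes. By default, the list is ".c", ".C", ".cpp", ".CPP", ".cxx", ".CXX", and ".cc". This can be modified under a custom porting arrangement.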
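    The name lookup can be sketched as a small helper that derives candidate definition-file names from the name of the declaring file, using the default suffix list. The function name is hypothetical, and two details are assumptions not stated in the text: that candidates are tried in list order, and that they are sought in the same directory as the declaring file.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: candidate definition files for a template declared
// in header_path, built from the default suffix list.
std::vector<std::string> definition_file_candidates(const std::string& header_path) {
    static const char* const suffixes[] = {
        ".c", ".C", ".cpp", ".CPP", ".cxx", ".CXX", ".cc"};

    // Strip the suffix only if the last dot is in the final path component.
    std::string base = header_path;
    const std::string::size_type slash = header_path.find_last_of("/\\");
    const std::string::size_type dot = header_path.rfind('.');
    if (dot != std::string::npos &&
        (slash == std::string::npos || dot > slash)) {
        base = header_path.substr(0, dot);
    }

    std::vector<std::string> candidates;
    for (const char* suffix : suffixes) {
        candidates.push_back(base + suffix);
    }
    return candidates;
}
```

    For a declaration in include/xyz.h this yields include/xyz.c, include/xyz.C, and so on; a real front end would then check which of these files exists before including it.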

    Implicit inclusion works well alongside automatic instantiation, but the two are independent. They can be enabled or disabled independently, and implicit inclusion is still useful when automatic instantiation is not done.

    The implicit inclusion mode can be configured out under a custom porting arrangement. It can be turned on by the command-line option -B.

    Implicit inclusions are only performed during the normal compilation of a file (i.e., not when doing only preprocessing). A common means of investigating certain kinds of problems is to produce a preprocessed source file that can be inspected. When using implicit inclusion, it is sometimes desirable for the preprocessed source file to include any implicitly included files. This may be done using the --no_preproc_only command-line option, which causes the preprocessed output to be generated as part of a normal compilation. When implicit inclusion is being used, the implicitly included files will appear as part of the preprocessed output in the precise location at which they were included in the compilation.

    © 1997-2013 Comeau Computing, EDG. All rights reserved.

    Comeau Computing
    91-34 120th Street
    Richmond Hill, NY 11418-3214
