Comeau C++ 4.0
Pre-Release
User-Documentation
Dialects and Modes
C Language Accepted
  1. C Dialect Accepted

    The front end accepts the ANSI C language as defined by X3.159--1989.

    The special comments recognized by the UNIX lint program - /*ARGSUSED*/, /*VARARGS*/ (with or without a count of non-varying arguments), and /*NOTREACHED*/ - are also recognized by the front end.
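
    As a rough illustration of where these comments conventionally go, the sketch below marks three functions the way lint expects. The function names are hypothetical, and the placement shown (directly before a definition, or at the unreachable point) follows common lint usage rather than anything specified in this document.

```c
#include <stdarg.h>
#include <stdlib.h>

/*ARGSUSED*/
int scale_value(int value, int flags)
{
    /* flags is intentionally unused; the cast also quiets compilers
       that do not read lint comments. */
    (void)flags;
    return value * 2;
}

/*VARARGS1*/
int sum_ints(int count, ...)
{
    /* Only the first argument (count) is fixed; the rest vary. */
    va_list args;
    int total = 0;
    int i;

    va_start(args, count);
    for (i = 0; i < count; i++)
        total += va_arg(args, int);
    va_end(args);
    return total;
}

int checked_divide(int numerator, int denominator)
{
    if (denominator == 0) {
        abort();
        /*NOTREACHED*/
    }
    return numerator / denominator;
}
```

    In each case the keyword is the first identifier in its comment, which is the form the front end recognizes.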

  2. ANSI C Extensions

    The following extensions are accepted (these are flagged if the -A or -a option is specified):

    1. A translation unit (input file) can contain no declarations.

    2. Comment text can appear at the ends of preprocessing directives.

    3. __ALIGNOF__ is similar to sizeof, but returns the alignment requirement value for a type, or 1 if there is no alignment requirement. It may be followed by a type or expression in parentheses:

      • __ALIGNOF__(type)

      • __ALIGNOF__(expression)

      The expression in the second form is not evaluated.

    4. __INTADDR__(expression) scans the enclosed expression as a constant expression, and converts it to an integer constant (it is used in the offsetof macro).

    5. Bit fields may have base types that are enums or integral types besides int and unsigned int. This matches A.6.5.8 in the ANSI Common Extensions appendix.

    6. In a custom porting arrangement, the address of a bit field may be taken if the bit field has the same size and alignment as one of the integral types. A warning is issued.

    7. The last member of a struct may have an incomplete array type. It may not be the only member of the struct (otherwise, the struct would have zero size). (Not allowed in C++.)

    8. A file-scope array may have an incomplete struct, union, or enum type as its element type. The type must be completed before the array is subscripted (if it is), and by the end of the compilation if the array is not extern. In C++, an incomplete class is also allowed.

    9. Static functions may be declared in function and block scopes. Their declarations are moved to the file scope.

    10. enum tags may be incomplete: one may define the tag name and resolve it (by specifying the brace-enclosed list) later.

    11. The values of enumeration constants may be given by expressions that evaluate to unsigned quantities that fit in the unsigned int range but not in the int range. A warning is issued for suspicious cases.
      	 /* When ints are 32 bits: */
      	 enum a {w = -2147483648};  /* No warning */
      	 enum b {x = 0x80000000};   /* No warning */
      	 enum c {y = 0x80000001};   /* No warning */
      	 enum d {z = 2147483649};   /* Warning */ 

    12. An extra comma is allowed at the end of an enum list. A remark is issued except in pcc mode.

    13. The final semicolon preceding the closing } of a struct or union specifier may be omitted. A warning is issued except in pcc mode.

    14. A label definition may be immediately followed by a right brace. (Normally, a statement must follow a label definition.) A warning is issued.

    15. An empty declaration (a semicolon with nothing before it) is allowed. A remark is issued.

    16. An initializer expression that is a single value and is used to initialize an entire static array, struct, or union need not be enclosed in braces. ANSI C requires the braces.

    17. In an initializer, a pointer constant value may be cast to an integral type if the integral type is big enough to contain it.

    18. The address of a variable with register storage class may be taken. A warning is issued.

    19. In an integral constant expression, an integer constant may be cast to a pointer type and then back to an integral type.

    20. In duplicate size and sign specifiers (e.g., short short or unsigned unsigned) the redundancy is ignored. A warning is issued.

    21. long float is accepted as a synonym for double.

    22. If long long is supported (it is by default under most ports, even if the "usual" C compiler does not support it):

      • the long long and unsigned long long types are accepted;

      • integer constants suffixed by LL are given the type long long, and those suffixed by ULL are given the type unsigned long long (any of the suffix letters may be written in lower case);

      • the specifier %lld (etc.) is recognized in printf and scanf formatting strings; and

      • the long long types are accommodated in the usual arithmetic conversions.

    23. The following is only allowed under a custom porting arrangement: pointers to incomplete arrays may be used in pointer addition, subtraction, and subscripting:
      	 int (*p)[];
      	 ...
      	 q = p[0]; 

      A warning is issued if the value added or subtracted is anything other than a constant zero. Since the type pointed to by the pointer has zero size, the value added to or subtracted from the pointer is multiplied by zero and therefore has no effect on the result. This could be arranged as a custom port.

    24. Benign redeclarations of typedef names are allowed. That is, a typedef name may be redeclared in the same scope as the same type. A warning is issued.

    25. Dollar signs can be accepted in identifiers through use of a command line option or by setting a configuration parameter (arranged as a custom port). The default is to not allow dollar signs in identifiers.

    26. Numbers are scanned according to the syntax for numbers rather than the pp-number syntax. Thus, 0x123e+1 is scanned as three tokens instead of one invalid token. (If the -A or -a option is specified, of course, the pp-number syntax is used.)

    27. Assignment and pointer difference are allowed between pointers to types that are interchangeable but not identical, for example, unsigned char * and char *. This includes pointers to same-sized integral types (e.g., typically, int * and long *). A warning is issued except in pcc mode. Assignment of a string constant to a pointer to any kind of character is allowed without a warning.

    28. Assignment of pointer types is allowed in cases where the destination type has added type qualifiers that are not at the top level (e.g., int ** to const int **). Comparisons and pointer difference of such pairs of pointer types are also allowed. A warning is issued.

    29. In operations on pointers, a pointer to void is always implicitly converted to another type if necessary, and a null pointer constant is always implicitly converted to a null pointer of the right type if necessary. In ANSI C, some operators allow such things, and others (generally, where it does not make sense) do not allow them.

    30. Pointers to different function types may be assigned or compared for equality (==) or inequality (!=) without an explicit type cast. A warning is issued. This extension is not allowed in C++ mode.

    31. A pointer to void may be implicitly converted to or from a pointer to a function type.

    32. By default (this can be changed via a custom porting arrangement) the #assert preprocessing extensions of AT&T System V release 4 are allowed. These allow definition and testing of predicate names. Such names are in a name space distinct from all other names, including macro names. A predicate name is given a definition by a preprocessing directive of the form:

      #assert name
      #assert name(token-sequence)

      which defines the predicate name. In the first form, the predicate is not given a value. In the second form, it is given the value token-sequence.

      Such a predicate can be tested in a #if expression, as follows:

      #name(token-sequence)

      which has the value 1 if a #assert of that name with that token-sequence has appeared, and 0 otherwise. A given predicate may be given more than one value at a given time.

      A predicate may be deleted by a preprocessing directive of the form:

      #unassert name
      #unassert name(token-sequence)

      The first form removes all definitions of the indicated predicate name; the second form removes just the indicated definition, leaving any others there may be.

    33. asm statements and declarations are accepted. This is disabled in strict ANSI C mode (-A or -a and -m options) since it conflicts with the ANSI C standard for something like:

      asm("xyz");

      which ANSI C interprets as a call of an implicitly-defined function asm and which by default the front end interprets as an asm statement.

    34. Under a custom porting arrangement, asm functions are accepted, and __asm is recognized as a synonym for asm. An asm function body is represented in the IL by an uninterpreted null-terminated string containing the text that appears in the source (including the text of source comments when custom configured as such). An asm function must be declared with no storage class, with a prototyped parameter list, and with no omitted parameters:
      asm void f(int,int) {
      	...
      } 

      If asm functions are recognized and the C-generating back end is used, it is required that ANSI-C be generated, not K&R C, because the asm function must be put out with a prototyped parameter list.

    35. By default (this can be changed under a custom porting arrangement), an extension is supported to allow constructs similar to C++ anonymous unions, including the following:

      • not only anonymous unions but also anonymous structs are allowed - that is, their members are promoted to the scope of the containing struct and looked up like ordinary members;

      • they can be introduced into the containing struct by a typedef name - they needn't be declared directly, as with true anonymous unions; and

      • a tag may be declared (C mode only).

      Among the restrictions: the extension only applies to constructs within structs.

    36. Under most ports, an extension is supported to allow restrict as a type qualifier for object pointer types and function parameter arrays that decay to pointers. Its presence is recorded in the IL so that back ends can perform optimizations that would otherwise be prevented because of possible aliasing. This extension follows the NCEG proposal for incorporating restrict into C (see X3J11.1 Technical Report 2).

    37. Under a custom porting arrangement, the expression &... is accepted in the body of a function in which an ellipsis appears in the parameter list. It is needed to support some versions of macro va_start in stdarg.h.

    38. Under a custom porting arrangement, an ellipsis may appear by itself in the parameter list of a function declaration - e.g., f(...). A diagnostic is issued in strict ANSI C mode.

    39. External entities declared in other scopes are visible. A warning is issued.
      void f1(void) { extern void f(); }
      void f2() { f(); /* Using out of scope declaration */ } 

    40. Under a custom porting arrangement, but only when not compiling in strict ANSI C mode, end-of-line comments (using // as delimiter) are supported.

    41. In the following areas considered "undefined behavior" by the ANSI C standard, the front end does the following:

      • Adjacent wide and normal string literals are not concatenated unless wchar_t is defined as a character type (in which case wide and normal strings are the same).

      • In character and string escapes, if the character following the \ has no special meaning, the value of the escape is the character itself. Thus "\s" == "s". A warning is issued.
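
    Several of the extensions in the list above (the incomplete array as the last struct member in item 7, the extra comma at the end of an enum list in item 12, and long long in item 22) were later adopted by C99, so they can be tried with any C99 compiler. The following is a minimal sketch under that assumption; struct packet, packet_new, and widen_product are hypothetical names chosen for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Item 12: an extra comma after the last enumerator. */
enum color { RED, GREEN, BLUE, };

/* Item 7: the last member has an incomplete array type.
   It is not the only member, so the struct has nonzero size. */
struct packet {
    size_t length;
    unsigned char payload[];
};

/* Allocate a packet with room for `length` payload bytes and copy them in.
   Returns NULL if the allocation fails; the caller frees the result. */
struct packet *packet_new(const unsigned char *data, size_t length)
{
    struct packet *p = malloc(sizeof *p + length);

    if (p == NULL)
        return NULL;
    p->length = length;
    memcpy(p->payload, data, length);
    return p;
}

/* Item 22: long long takes part in the usual arithmetic conversions,
   so the multiplication below is done in long long, not int. */
long long widen_product(int a, int b)
{
    return (long long)a * b;
}
```

    Sizing the allocation as sizeof *p plus the payload length is the usual idiom for such a struct, since sizeof does not count the trailing array.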

  3. K&R/pcc mode

    When pcc mode is specified, the front end accepts the traditional C language defined by The C Programming Language, first edition, by Kernighan and Ritchie (K&R), Prentice-Hall, 1978. In addition, it provides almost complete compatibility with the Reiser cpp and Johnson pcc widely used as part of UNIX systems; since there is no documentation of the exact behavior of those programs, complete compatibility cannot be guaranteed.

    In general, when compiling in pcc mode, the front end attempts to interpret a source program that is valid to pcc in the same way that pcc would. However, ANSI features that do not conflict with this behavior are not disabled.

    In some cases where pcc allows a highly questionable construct, the front end accepts the construct but gives a warning where pcc would be silent (for example, 0x, a degenerate hexadecimal number, is accepted as zero).

    The known cases where the front end is not compatible with pcc are the following:

    1. Token pasting is not done outside of macro expansions (i.e., in the primary source line) when two tokens are separated only by a comment. That is, a/**/b is not considered to be ab. The pcc behavior in that case can be obtained by preprocessing to a text file and then compiling that file.

      The textual output from preprocessing is also equivalent but not identical: the blank lines and white space will not be exactly the same.

    2. pcc will consider the result of a ?: operator to be an lvalue if the first operand is constant and the second and third operands are compatible lvalues. This front end will not.

    3. pcc mis-parses the third operand of a ?: operator in a way that some programs exploit:

      i ? j : k += l

      is parsed by pcc as

      i ? j : (k += l)

      which is not correct, since the precedence of += is lower than the precedence of ?:. This front end will generate an error for that case.

    4. lint recognizes the keywords for its special comments anywhere in a comment, regardless of whether or not they are preceded by other text in the comment. The front end only recognizes the keywords when they are the first identifier following an optional initial series of blanks and/or horizontal tabs. lint also recognizes only a single digit of the VARARGS count; the front end will accumulate as many digits as appear.
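
    For code that relies on the pcc parse described in item 3, one portable fix is to write the parentheses explicitly. The sketch below is an assumption about how such code might be updated; pick_or_accumulate is a hypothetical name.

```c
/* Returns j when i is nonzero; otherwise adds l to *k and returns the
   updated *k. The parentheses spell out the grouping that pcc applied
   implicitly, so the expression is valid under the ANSI grammar too. */
int pick_or_accumulate(int i, int j, int *k, int l)
{
    return i ? j : (*k += l);
}
```

    Without the parentheses, the ANSI grammar groups the expression as (i ? j : *k) += l, and the result of ?: is not an lvalue, which is why the front end reports an error.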

    The differences in pcc mode relative to the default ANSI mode are as follows:

    1. The keywords signed, const, and volatile are disabled, to avoid problems with items declared with those names in old-style code. Those keywords were ANSI C inventions. The other non-K&R keywords (enum and void) are judged to have existed already in code and are not disabled.

    2. Declarations of the form

      typedef some-type void;

      are ignored.

    3. Assignment is allowed between pointers and integers, and between incompatible pointer types, without an explicit cast. A warning is issued.

    4. A field selection of the form p->field is allowed even if p does not point to a struct or union that contains field. p must be a pointer or an integer. Likewise, x.field is allowed even if x is not a struct or union that contains field. x must be an lvalue. For both cases, all definitions of field as a field must have the same offset within their struct or union.

    5. By default (this can be changed in a custom porting arrangement) overflows detected while folding signed integer operations on constants will cause warnings rather than errors. Usually this should be set to match the desired target machine behavior on integer operations in C.

    6. By default (this can be changed in a custom porting arrangement) integral types with the same representation (size, signedness, and alignment) will be considered identical and may be used interchangeably. For example, this means that int and long will be interchangeable if they have the same size.

    7. A warning will be issued for an & applied to an array. The type of such an operation is "address of array element" rather than "address of array".

    8. For the shift operators << and >>, the usual arithmetic conversions are done, the right operand is converted to int, and the result type is the type of the left operand. In ANSI C, the integral promotions are done on the two operands, and the result type is the type of the left operand. The effect of this difference is that, in pcc mode, a long shift count will force the shift to be done as long.

    9. When preprocessing output is generated, the line-identifying directives will have the pcc form instead of the ANSI form.

    10. String literals will not be shared. Identical string literals will cause multiple copies of the string to be allocated.

    11. sizeof may be applied to bit fields; the size is that of the underlying type (e.g., unsigned int).

    12. lvalues cast to a type of the same size remain lvalues, except when they involve a floating-point conversion.

    13. When a function parameter list begins with a typedef identifier, the parameter list is considered prototyped only if the typedef identifier is followed by something other than a comma or right parenthesis:
      typedef int t;
      int f(t) {}   /* Old-style list */
      int g(t x) {} /* Prototyped list, parameter x of type t */ 

      That is, function parameters are allowed to have the same names as typedefs. In the normal ANSI mode, of course, any parameter list that begins with a typedef identifier is considered prototyped, so the first example above would give an error.

    14. The names of functions and of external variables are always entered at the file scope.

    15. A function declared static, used, and never defined is treated as if its storage class were extern.

    16. A file-scope array that has an unspecified storage class and remains incomplete at the end of the compilation will be treated as if its storage class is extern (in ANSI mode, the number of elements is changed to 1, and the storage class remains unspecified).

    17. The empty declaration

      struct x;

      will not hide an outer-scope declaration of the same tag.

    18. In a declaration of a member of a struct or union, no diagnostic is issued for omitting the declarator list; nevertheless, such a declaration has no effect on the layout. For example,
      struct s {char a; int; char b[2];} v;
      /* sizeof(v) is 3 */ 

    19. enums are always given type int. Under a custom porting arrangement, smaller integral types will be used if possible.

    20. No warning is generated for a storage specifier appearing in other than the first position in a list of specifiers (as in int static).

    21. short, long, and unsigned are treated as "adjectives" in type specifiers, and they may be used to modify a typedef type.

    22. A "plain" char is considered to be the same as either signed char or unsigned char, depending on the installation default and command-line options. In ANSI C, "plain" char is a third type distinct from both signed char and unsigned char.

    23. Free-standing tag declarations are allowed in the parameter declaration list for a function with old-style parameters.

    24. float function parameters are promoted to double function parameters.

    25. float functions are promoted to double functions.

    26. Declaration specifiers are allowed to be completely omitted in declarations (ANSI C allows this only for function declarations). Thus

      i;

      declares i as an int variable. A warning is issued.

    27. The = preceding an initializer may be omitted. A warning is issued. See K&R first edition, Appendix A, section 17 (anachronisms). This is accepted only under a custom porting arrangement.

    28. All float operations are done as double.

    29. __STDC__ is left undefined.

    30. Extra spaces to prevent pasting of adjacent confusable tokens are not generated in textual preprocessing output.

    31. The first directory searched for include files is the directory containing the file containing the #include instead of the directory containing the primary source file.

    32. Trigraphs are not recognized.

    33. Comments are deleted entirely (instead of being replaced by one space) in preprocessing output.

    34. 0x is accepted as a hexadecimal 0, with a warning.

    35. 1E+ is accepted as a floating-point constant with an exponent of 0, with a warning.

    36. The compound assignment operators may be written as two tokens (e.g., += may be written + =).

    37. The compound assignment operators may be written in their old-fashioned reversed forms (e.g., -= may be written =-). A warning is issued. This is described in K&R first edition, Appendix A, section 17 (anachronisms). This is accepted only under a custom porting arrangement.

    38. The digits 8 and 9 are allowed in octal constants.

    39. A warning rather than an error is issued for integer constants that are larger than can be accommodated in an unsigned long. The value is truncated to an acceptable number of low-order bits.

    40. The types of large integer constants are determined according to the K&R rules (they won't be unsigned in some cases where ANSI C would define them that way). Integer constants with apparent values larger than LONG_MAX are typed as long and are also marked as "non-arithmetic", which suppresses some warnings when using them.

    41. The escape \a (alert) is not recognized in character and string constants.

    42. Macro expansion is done differently. Arguments to macros are not macro-expanded before being inserted into the expansion of the macro. Any macro invocations in the argument text are expanded when the macro expansion is rescanned. With this method, macro recursion is possible and is checked for.

    43. Token pasting inside macro expansions is done differently. End-of-token markers are not maintained, so tokens that abut after macro substitution may be parsed as a single token.

    44. Macro parameter names inside character and string constants are recognized and substituted for.

    45. Macro invocations having too many arguments are flagged with a warning rather than an error. The extra arguments are ignored.

    46. Macro invocations having too few arguments are flagged with a warning rather than an error. A null string is used as the value of the missing parameters.

    47. Extra #elses (after the first has appeared in an #if block) are ignored, with a warning.

    48. Expressions in a switch statement are cast to int; this differs from the ANSI C definition in that a long expression is (possibly) truncated.

    49. The promotion rules for integers are different: unsigned char and unsigned short are promoted to unsigned int.

    50. An identifier in a function is allowed to have the same name as a parameter of the function. A warning is issued.
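
    The promotion difference in item 49 can silently change results. The sketch below contrasts the ANSI rule with the pcc-style rule, the latter emulated with explicit casts so that the code behaves the same in any mode. It assumes int is wider than unsigned char, as on typical targets; the function names are hypothetical.

```c
#include <limits.h>   /* UINT_MAX, handy when checking the wrapped result */

/* ANSI rule: both operands are promoted to (signed) int,
   so 0 - 1 yields -1. */
int ansi_difference(unsigned char a, unsigned char b)
{
    return a - b;
}

/* pcc-style rule, emulated with casts: the operands become unsigned int,
   so 0 - 1 wraps around to UINT_MAX. */
unsigned int pcc_style_difference(unsigned char a, unsigned char b)
{
    return (unsigned int)a - (unsigned int)b;
}
```

    A comparison such as "difference < 0" is therefore true under the ANSI rule and never true under the pcc rule, which is the kind of change to look for when moving old code between modes.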

  4. Extensions Accepted in SVR4 Compatibility Mode

    The following extensions are accepted in SVR4 C compatibility mode:

    1. Macro invocations having too many arguments are flagged with a warning rather than an error. The extra arguments are ignored.

    2. Macro invocations having too few arguments are flagged with a warning rather than an error. A null string is used as the value of the missing parameters.

    3. The sequence /**/ in a macro definition is treated as equivalent to the token-pasting operator ##.

    4. lvalues cast to a type of the same size remain lvalues, except when they involve a floating-point conversion.

    5. Assignment is allowed between pointers and integers, and between incompatible pointer types, without an explicit cast. A warning is issued.

    6. A field selection of the form p->field is allowed even if p does not point to a struct or union that contains field. p must be a pointer. Likewise, x.field is allowed even if x is not a struct or union that contains field. x must be an lvalue. For both cases, all definitions of field as a field must have the same offset within their struct or union.

    7. In an integral constant expression, an integer constant may be cast to a pointer type and then back to an integral type.

    8. Incompatible external object declarations are allowed if the object types share the same underlying representation.

    9. Certain incompatible function declarations are allowed. A warning is issued.
      typedef unsigned int size_t; 
      extern size_t strlen(const char *); 
      extern int strlen();  /* Warning */ 
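
    For comparison with item 3, the sketch below uses the ANSI token-pasting operator ##, which works in every mode; an SVR4-era header might have written the macro body with the empty-comment sequence between the two parameters instead. PASTE, item_count, and get_item_count are hypothetical names.

```c
/* ANSI spelling of token pasting: the two arguments are joined
   into a single identifier. */
#define PASTE(prefix, suffix) prefix##suffix

/* Expands to: static int item_count = 3; */
static int PASTE(item, _count) = 3;

int get_item_count(void)
{
    return item_count;
}
```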

  5. Extensions Accepted in Microsoft Mode (C and C++)

    When Microsoft mode is used, the following extensions are allowed. These extensions are accepted in both C and C++.

    1. The calling convention keywords __cdecl, __fastcall, and __stdcall are recognized (as are alternative versions _cdecl, _fastcall, and _stdcall). Syntactically, these are treated like pointer declarators and are recognized where pointer declarators are allowed.
      int __cdecl f();
      char * __cdecl g();
      void (__cdecl *fp)(); 

      Calling conventions are allowed on object declarations, but are ignored.

    2. The __inline keyword is recognized (as is the alternate form _inline).

    3. The __declspec storage class is recognized. The dllimport, dllexport, naked, and thread modifiers are recognized as extended declaration modifiers.
      __declspec(dllimport) int f();
      __declspec(dllexport) void g(){}
      __declspec(naked) void h(){}
      __declspec(thread) int i; 
      

    4. The __unaligned keyword is recognized and accepted as a type qualifier.

    5. The __based keyword is recognized and information about based pointer declarations is passed on to the back end. Only the form __based(variable) is accepted; the other (16-bit mode) forms are not.

    6. The Structured Exception Handling statements are recognized:

      __try { statements; } __finally { statements; }
      __try { statements; } __except( expression ) { statements; }
      __leave;

    7. #pragma pack is recognized, including the enhanced syntax that permits the user to manage a stack of packing alignment values by means of special keywords push and pop. In addition, packing may be controlled from the command line (see --pack_alignment=n, which corresponds to /Zpn). (These features are enabled in most ports.)

    8. Two-stage bit-field allocation is done under a custom porting arrangement.

    9. Extended support of anonymous unions, including "anonymous structs" and (in C++) "anonymous classes", is provided. This is supported by default; it can be changed under a custom porting arrangement.

    10. End-of-line comments (using // as delimiter) are supported in C mode.

    11. A nonconstant may appear in the initializer list of an aggregate variable with automatic storage class, even in C mode.

    12. If the final field of a struct is of incomplete array type, it can be initialized by one or more values in an aggregate initializer list:
      struct S { 
      	int i, j; 
      	int a[]; 
      } x = { 0,0,0,0 };
      /* Initializes x.a[0] and x.a[1] to 0. */ 

      (Also, such fields may be declared with [0] instead of [].) This is allowed in C++ mode only if the class is an aggregate.

    13. An enumeration type may be declared and then used before its definition is provided:

      enum E x;
      E f(E e) { return e; }
      enum E { a,b,c }; 

    14. Lvalue casts are permitted. In C or C++ modes, casting an lvalue of integral type to the same integral type yields an lvalue. In C mode, lvalue casts involving integral types of unequal size are also allowed.

    15. asm statements may be of the form:

      asm assembly-instruction
      asm { assembly-instruction-list }

      __asm and _asm are recognized as synonyms for asm.

    16. In C mode a warning is issued instead of an error when a function is redeclared with an incompatible type.

    17. Type specifiers __int16, __int32, and __int64 are accepted, as long as there are corresponding integer kinds in the target environment, and __int8 is treated as equivalent to char if target chars are defined to have exactly 8 bits.

    18. When the 16-bit-mode extensions are enabled, near and far are supported.
      int far *p;       // p is a "far" pointer
      int far x;        // x is allocated in the far data segment
      int far f(float); // f is called using a far call 

      _near, __near, _far, and __far are also accepted.

    19. The sequence "//" formed during a macro expansion is treated as the beginning of a comment.
      #define COMMENT /##/
      COMMENT This is ignored 

    20. Functions returning values of class type in C++ mode are considered to return lvalues.

    21. In C mode struct/union tags declared in a prototype scope are placed in the surrounding file or block scope instead (i.e., the C++ rule is used):
      int f(struct S *);
      int i = f((struct S *)0);  /* Okay, same S. */ 

    22. In C mode implicit conversions are allowed between pointers to different types, with a warning.
      int f(int *);
      int i = f((float *)0);  /* Okay, with a warning. */ 

    23. The second and third operands of the "?" operator can be a scalar and a void operand. A warning is issued.
      void f();
      void g() {
      	int i = 0;
      	i ? i : f(); // Okay, with a warning.
      } 

    24. Disambiguation between a parameter declaration and an argument expression is resolved differently. By default, the declaration
      int xyz(int()); 

      declares a function named xyz that takes a parameter of type "function taking no arguments and returning an int." In Microsoft mode, this is interpreted as a declaration of an object that is initialized with the value int() (which evaluates to zero).

    25. Extra #else directives (i.e., after the first) encountered in a skipped section of an #if are ignored, with a warning.

    26. A storage class (static or extern) is accepted on a friend function declaration.

    27. In C mode the enumerator list may be empty in an enum declaration (i.e., the C++ rule is used).

    28. In C mode an ellipsis is permitted (but ignored) in an old-style parameter list declaration, e.g.,
      void f(x, ...) int x; { }  /* Okay, with a warning. */

    29. In C mode _alloca is predeclared to accept an argument of type size_t and return a void * pointer.

    30. In a functional-notation type conversion, a multi-token type such as "unsigned int(x)" is accepted.
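
    The push and pop keywords of #pragma pack described in item 7 can be sketched as follows. This assumes a compiler that implements the pragma the way Microsoft's does (GCC and Clang also accept this form); the struct and function names are hypothetical.

```c
#include <stddef.h>

/* Save the current packing value, then pack to 1-byte boundaries. */
#pragma pack(push, 1)
struct packed_record {
    char tag;
    int  value;   /* placed immediately after tag, with no padding */
};
/* Restore the packing value that was saved by push. */
#pragma pack(pop)

struct natural_record {
    char tag;
    int  value;   /* typically preceded by padding so that it is aligned */
};

size_t packed_value_offset(void)
{
    return offsetof(struct packed_record, value);
}

size_t natural_value_offset(void)
{
    return offsetof(struct natural_record, value);
}
```

    Pairing every push with a pop keeps a packing change local to the declarations that need it, which matters most in headers included by other code.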

© 1997-2013 Comeau Computing, EDG. All rights reserved.

Comeau Computing
91-34 120th Street
Richmond Hill, NY 11418-3214

http://www.comeaucomputing.com