bisonc++input - Organization of bisonc++’s grammar
  file(s)
Bisonc++ derives from bison++(1), originally derived
    from bison(1). Like these programs bisonc++ generates a parser
    for an LALR(1) grammar. Bisonc++ generates C++ code: an
    expandable C++ class.
Refer to bisonc++(1) for a general overview. This manual
    page covers the structure and organization of bisonc++’s
    grammar file(s).
Bisonc++’s grammar file has the following generic
    outline:
    directives (see the next section)
    %%
    grammar rules
        
Grammar rules have the following generic form:
    nonterminal:
        production-rules
    ;
        
Production rules consist of zero or more sequences of terminal
    tokens, nonterminal tokens and/or action blocks. When multiple production
    rules are used they must be separated from each other by vertical bars.
    Action blocks are C++ compound statements.
This manual page contains the following sections:
  - o
- DESCRIPTION: this section;
- o
- DIRECTIVES: bisonc++’s grammar-specification
      directives;
- o
- POLYMORPHIC SEMANTIC VALUES: how to use polymorphic semantic values
      in parsers generated by bisonc++;
- o
- DOLLAR NOTATIONS: available $-shorthand notations with single,
      union, and polymorphic semantic value types;
- o
- RESTRICTIONS ON TOKEN NAMES: name restrictions for user-defined
      symbols;
- o
- OBSOLETE SYMBOLS: symbols available to bison(1), but not to
      bisonc++;
- o
- USING SYMBOLIC TOKENS IN CLASSES OTHER THAN THE PARSER CLASS: how
      to refer to tokens defined in the grammar;
- o
- EXAMPLE: an example of using bisonc++;
- o
- SEE ALSO: references to other programs and documentation;
- o
- AUTHOR: at the end of this man-page.
    
  
Starting with version 6.02.00 bisonc++ reserved identifiers
    no longer end in two underscore characters, but in one. This modification
    was necessary because according to the C++ standard identifiers
    having two or more consecutive underscore characters are reserved by the
    language. In practice this could require some minor modifications of
    existing source files using bisonc++’s facilities, most likely
    limited to changing Tokens__ into Tokens_ and changing
    Meta__ into Meta_.
The complete list of affected names is:
  - Enums:
DebugMode_, ErrorRecovery_, Return_, Tag_, Tokens_
  - Enum values:
PARSE_ABORT_, PARSE_ACCEPT_, UNEXPECTED_TOKEN_,
  sizeofTag_
  - Type / namespace
    designators:
Meta_, PI_, STYPE_
  - Member functions:
clearin_, errorRecovery_, errorVerbose_, executeAction_,
  lex_, lookup_, nextCycle_, nextToken_, popToken_, pop_, print_, pushToken_,
  push_, recovery_, redoToken_, reduce_, savedToken_, shift_, stackSize_,
  startRecovery_, state_, token_, top_, vs_
  - Protected data
    members:
d_acceptedTokens_, d_actionCases_, d_debug_, d_nErrors_,
  d_requiredTokens_, d_val_, idOfTag_, s_nErrors_
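The renaming described above can be applied mechanically to existing sources. The following is a minimal sketch, not a tool shipped with bisonc++: the function name renameReserved is hypothetical, and it only handles the two names the text singles out (Tokens__ and Meta__). Other names from the list would have to be added to the pattern by hand.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not part of bisonc++): rewrites Tokens__ and Meta__
// to their single-underscore forms in a piece of source text.
std::string renameReserved(std::string const &source)
{
    // \b keeps the pattern from matching inside longer identifiers such as
    // MyTokens__ or Tokens__x.
    static std::regex const oldName{R"(\b(Tokens|Meta)__\b)"};

    // $1 is the captured name; the trailing _ is the single underscore.
    return std::regex_replace(source, oldName, "$1_");
}
```

Calling renameReserved on each line of a source file leaves all other text, including longer identifiers that merely contain these names, unchanged.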
Quite a few directives can be specified in the initial section of
    the grammar specification file. If command-line options for directives are
    available, then their specifications take precedence over the corresponding
    directives in the grammar file. Once class header or implementation header
    files exist, directives affecting those files are ignored.
Directives accepting a `filename’ do not accept path names,
    i.e., they cannot contain directory separators (/); directives
    accepting a ’pathname’ may contain directory separators. A
    ’pathname’ using blank characters should be surrounded by
    double quotes.
Some directives may generate errors. This happens when their
    specifications conflict with the contents of files bisonc++ cannot
    modify (e.g., a parser class header file exists but doesn’t define a
    namespace, while in a later run a %namespace directive was
    provided).
To resolve such errors the offending directive could be omitted,
    the existing file could be removed, or the existing file could be
    hand-edited according to the directive’s specification.
  - o
- %baseclass-header filename
- Filename defines the name of the file to contain the
      parser’s base class. This class defines, e.g., the parser’s
      symbolic tokens. Defaults to the name of the parser class plus the suffix
      base.h. This directive is overruled by the
      --baseclass-header (-b) command-line option.
- It is an error if this directive is used and an already existing parser
      class header file does not contain #include
      "filename".
- o
- %baseclass-preinclude pathname
- Pathname defines the path to the file preincluded by the
      parser’s base-class header. See the description of the
      --baseclass-preinclude option for details about this directive. By
      default, bisonc++ surrounds pathname by double quotes.
      However, when pathname itself is surrounded by pointed brackets
      #include <pathname> is included.
- o
- %class-header filename
- Filename defines the name of the file to contain the parser class.
      Defaults to the name of the parser class plus the suffix .h. This
      directive is overruled by the --class-header (-c)
      command-line option.
- It is an error if this directive is used and an already existing
      implementation header file does not contain #include
      "filename".
- o
- %class-name parser-class-name
- Declares the name of the parser class. It defines the name of the
      C++ class that is generated. If no %class-name is specified
      the default class name Parser is used.
- It is an error if this directive is used and an already existing
      parser-class header file does not define class
      `className’ and/or if an already existing implementation
      header file does not define members of the class
      `className’.
- o
- %debug
- Add debugging code to the generated parse function and its support
      functions, which can show (on the standard output stream) the steps
      performed by the parsing function while it parses input streams. When this
      directive is specified then the parsing steps are shown by default. The
      setDebug members can be used to suppress outputting these parsing
      steps. #ifdef DEBUG macros are not used. Existing debugging code
      can be removed by rerunning bisonc++ without specifying the
      debug option or directive.
- o
- %default-actions(d)(off|quiet|warn|std)
- By default, bisonc++ adds a $$ = $1 action block to rules
      not having final action blocks, but not to empty production rules. This
      default behavior can also explicitly be configured using the
      default-actions std option or directive.
- Bisonc++ also supports alternate ways of handling rules not having
      final action blocks. When off is specified, bisonc++ does
      not add $$ = $1 action blocks; when polymorphic semantic values are
      used, then specifying
- - warn adds specialized action blocks, using the semantic types of
      the first elements of the production rules, while issuing a warning;
- - quiet adds these action blocks without issuing warnings.
- When either warn or quiet is specified the types of $$ and
      $1 must match. When bisonc++ detects a type mismatch it issues
      an error.
- o
- %error-verbose
- This directive can be specified to dump the parser’s state stack to
      the standard output stream when the parser encounters a syntactic error.
      The stack dump shows on separate lines a stack index followed by the state
      stored at the indicated stack element. The first stack element is the
      stack’s top element.
- o
- %expect number
- This directive specifies the exact number of shift/reduce and
      reduce/reduce conflicts for which no warnings are to be generated. Details
      of the conflicts are reported in the verbose output file (e.g.,
      grammar.output). If the number of actually encountered conflicts
      deviates from `number’, then this directive is ignored.
- o
- %filenames filename
- Filename is a generic filename that is used for all header files
      generated by bisonc++. Options defining specific filenames are also
      available (which then, in turn, overrule the name specified by this
      directive). This directive is overruled by the --filenames
      (-f) command-line option.
- o
- %flex
- When provided, the scanner member returning the matched text is called as
      d_scanner.YYText(), and the scanner member returning the next
      lexical token is called as d_scanner.yylex(). This directive is
      only interpreted if the %scanner directive is also provided.
- o
- %implementation-header filename
- Filename defines the name of the file to contain the implementation
      header. It defaults to the name of the generated parser class plus the
      suffix .ih.
- The implementation header should contain all directives and declarations
      that are only used by the parser’s member functions. It is
      the only header file that is included by the source file containing
      parse’s implementation. User defined implementation of other
      class members may use the same convention, thus concentrating all
      directives and declarations that are required for the compilation of other
      source files belonging to the parser class in one header file.
- o
- %include pathname
- This directive is used to switch to pathname while processing a
      grammar specification. Unless pathname defines an absolute
      file-path, pathname is searched relative to the location of
      bisonc++’s main grammar specification file (i.e., the
      grammar file that was specified as bisonc++’s command-line
      option). This directive can be used to split long grammar specification
      files into shorter, meaningful units. After processing pathname,
      processing continues beyond the %include pathname directive.
- o
- %left terminal ...
- Defines the names of symbolic terminal tokens that must be treated as
      left-associative. I.e., in case of a shift/reduce conflict, a reduction is
      preferred over a shift. Sequences of %left, %nonassoc,
      %right and %token directives may be used to define the
      precedence of operators. In expressions, the first used directive defines
      the tokens having the lowest precedence, the last used defines the tokens
      having the highest priority. See also %token below.
- o
- %locationstruct struct-definition
- Defines the organization of the location-struct data type LTYPE_.
      This struct should be specified analogously to the way the parser’s
      stacktype is defined using %union (see below). By default (if neither
      %locationstruct nor %ltype is specified) the standard location struct
      (see the next directive) is used.
- o
- %lsp-needed
- This directive results in bisonc++ generating a parser using the
      standard location stack. This stack’s default type is:
    
 struct LTYPE_
 {
     int timestamp;
     int first_line;
     int first_column;
     int last_line;
     int last_column;
     char *text;
 };
 
 Bisonc++ does not provide the elements of the LTYPE_
      struct with values. Action blocks of production rules may refer to the
      location stack element associated with a production element using @
      variables, like @1.timestamp, @3.text, @5. The rule’s
      location struct itself may be referred to as either d_loc_ or
      @@.
- o
- %ltype typename
- Specifies a user-defined token location type. If %ltype is used,
      typename should be the name of an alternate (predefined) type
      (e.g., size_t). It should not be used if a %locationstruct
      specification is defined (see above). Within the parser class, this type
      is available as the type `LTYPE_’. All text on the line
      following %ltype is used for the typename specification. It
      should therefore not contain comments or any other characters that are not
      part of the actual type definition.
- o
- %namespace namespace
- Define all of the code generated by bisonc++ in the namespace
      namespace. By default no namespace is defined. If this directive is
      used the implementation header is provided with a commented out using
      namespace declaration for the specified namespace. In addition, the
      parser and parser base class header files also use the specified namespace
      to define their include guard directives.
- It is an error if this directive is used and an already existing
      parser-class header file and/or implementation header file does not define
      namespace identifier.
- o
- %negative-dollar-indices
- Do not generate warnings when zero- or negative dollar-indices are used in
      the grammar’s action blocks. Zero or negative dollar-indices are
      commonly used to implement inherited attributes, and should normally be
      avoided. When used, they can be specified like $-1, or like
      $<type>-1, where type is empty; an STYPE_ tag;
      or a field-name. However, note that in combination with the
      %polymorphic directive (see below) only the $-i format can
      be used.
- o
- %no-lines
- By default #line preprocessor directives are inserted just before
      action statements in the file containing the parser’s parse
      function. These directives are suppressed by the %no-lines
      directive.
- o
- %nonassoc terminal ...
- Defines the names of symbolic terminal tokens that should be treated as
      non-associative. I.e., in case of a shift/reduce conflict, a reduction is
      preferred over a shift. Sequences of %left, %nonassoc, %right and
      %token directives may be used to define the precedence of
      operators. In expressions, the first used directive defines the tokens
      having the lowest precedence, the last used defines the tokens having the
      highest priority. See also %token below.
- o
- %parsefun-source filename
- Filename defines the name of the file to contain the parser member
      function parse. Defaults to parse.cc. This directive is
      overruled by the --parsefun-source (-p) command-line
    option.
- o
- %polymorphic polymorphic-specification(s)
- Bison’s traditional way of handling multiple semantic values is to
      use a %union specification (see below). Although %union is
      supported by bisonc++, a polymorphic semantic value class is
      preferred due to its improved type safety.
- The %polymorphic directive defines a polymorphic semantic value
      class and can be used instead of a %union specification. Refer to
      section POLYMORPHIC SEMANTIC VALUES below or to
      bisonc++’s user manual for a detailed description of the
      specification, characteristics, and use of polymorphic semantic
    values.
- o
- %prec token
- Defines the precedence of a production rule. By default, production rules
      have priorities that are equal to the priorities of their first terminal
      tokens, or they receive the maximum possible priority if they don’t
      contain terminal tokens. To change a production rule’s default
      priority the %prec directive is used, which assigns the
      directive’s token’s priority to the production rule’s
      priority. A well known application of %prec is:
    
 expression:
     '-' expression %prec UMINUS
     {
         ...
     }
 
 Here, the default priority and precedence of the `-’ token as
      the subtraction operator is overruled by the precedence and priority of
      the UMINUS token, which is commonly defined as
 %right UMINUS
 
 (see below) following, e.g., the ’*’ and
      ’/’ operators.
- Refer to bisonc++’s user manual for a more elaborate
      coverage of the %prec directive.
- o
- %print-tokens
- The print-tokens directive provides an implementation of the Parser
      class’s print_ function displaying the current token value
      and the text matched by the lexical scanner as received by the generated
      parse function.
- o
- %prompt
- When adding debugging code (using the debug option or directive)
      the debug information is displayed continuously while the parser processes
      its input. When using the prompt directive the generated parser
      displays a prompt (a question mark) at each step of the parsing process.
      Caveat: when using this option the parser’s input cannot be
      provided at the parser’s standard input stream.
- o
- %required-tokens number
- Following a syntactic error, require at least number successfully
      processed tokens before another syntactic error can be reported. By
      default number is zero.
- o
- %right terminal ...
- Defines the names of symbolic terminal tokens that should be treated as
      right-associative. I.e., in case of a shift/reduce conflict, a shift is
      preferred over a reduction. Sequences of %left, %nonassoc, %right
      and %token directives may be used to define the precedence of
      operators. In expressions, the first used directive defines the tokens
      having the lowest precedence, the last used defines the tokens having the
      highest priority. See also %token below.
- o
- %scanner pathname
- Use pathname as the path name to the file pre-included in the
      parser’s class header. See the description of the --scanner
      option for details about this directive. Similar to the convention adopted
      for this argument, pathname by default is surrounded by double
      quotes. However, when the argument is surrounded by pointed brackets
      #include <pathname> is included. This directive results in
      the definition of a composed Scanner d_scanner data member
      in the generated parser, and in the definition of an int lex()
      member, returning d_scanner.lex().
- By specifying the %flex directive the function
      d_scanner.yylex() is called. Any other function to call can be
      specified using the --scanner-token-function option (or
      %scanner-token-function directive).
- It is an error if this directive is used and an already existing parser
      class header file does not include `pathname’.
- o
- %scanner-class-name scannerClassName
- Defines the name of the scanner class, declared by the pathname
      header file that is specified at the scanner option or directive.
      By default the class name Scanner is used.
- It is an error if this directive is used and either the scanner
      directive was not provided, or the parser class interface in an already
      existing parser class header file does not declare a scanner class
      d_scanner object.
- o
- %scanner-matched-text-function function-call
- The scanner function returning the text that was matched by the lexical
      scanner after its token function (see below) has returned. A complete
      function call expression should be provided (including a scanner object,
      if used). Example:
    
 %scanner-matched-text-function myScanner.matchedText()
 
 By specifying the %flex directive the function
      d_scanner.YYText() is called.
- If the function call contains white space, the function call
      should be surrounded by double quotes.
- o
- %scanner-token-function function-call
- The scanner function returning the next token, called from the generated
      parser’s lex function. A complete function call expression
      should be provided (including a scanner object, if used). Example:
    
 %scanner-token-function d_scanner.lex()
 
 If the function call contains white space, the function call
      should be surrounded by double quotes.
- It is an error if this directive is used and the scanner token function is
      not called from the code in an already existing implementation
    header.
- o
- %stack-expansion size
- Defines the number of elements to be
      added to the generated parser’s semantic value stack when it must
      be enlarged. By default 10 elements are added to the stack. This
      option/directive is interpreted only once, and only if size at
      least equals the default stack expansion size of 10.
- o
- %start nonterminal
- The nonterminal nonterminal should be used as the grammar’s
      start-symbol. If omitted, the first grammatical rule is used as the
      grammar’s starting rule. All syntactically correct sentences must
      be derivable from this starting rule.
- o
- %stype typename
- The type of the semantic value of nonterminal tokens. By default it is
      int. %stype, %union, and %polymorphic are mutually
      exclusive directives.
- Within the parser class, the semantic value type is available as the type
      `STYPE_’. All text on the line following %stype is
      used for the typename specification. It should therefore not
      contain comments or any other characters that are not part of the actual
      type definition.
- o
- %tag-mismatches on|off
- This directive is only interpreted when polymorphic semantic values are
      used. When on is specified (which is used by default) the
      parse member of the generated parser dynamically checks that the
      tag that is used when calling a semantic value’s get member
      matches the actual tag of the semantic value.
- If a mismatch is observed, then the parsing function aborts after
      displaying a fatal error message. If this happens, and if the
      option/directive debug was specified when bisonc++ created
      the parser’s parsing function, then the program can be rerun,
      specifying parser.setDebug(Parser::ACTIONCASES) before calling the
      parsing function. As a result the case-entry numbers of the switch,
      defined in the parser’s executeAction member, are inserted
      into the standard output stream. The action case number reported just
      before the program displays the fatal error message tells you in which of
      the grammar’s action blocks the error was encountered.
- o
- %target-directory pathname
- Pathname defines the directory where generated files should be
      written. By default this is the directory where bisonc++ is called.
      This directive is overruled by the --target-directory command-line
      option.
- o
- %thread-safe
- Only used with polymorphic semantic values, and then only required when
      the parser is used in multiple threads: it ensures that each
      thread’s polymorphic code only accesses its own parser’s
      error counting variable.
- o
- %token terminal ...
- Defines the names of symbolic terminal tokens. Sequences of %left,
      %nonassoc, %right and %token directives may be used to define
      the precedence of operators. In expressions, the first used directive
      defines the tokens having the lowest precedence, the last used defines the
      tokens having the highest priority.
- NOTE: Symbolic tokens are defined as enum-values in the
      parser’s base class. The names of symbolic tokens may not be equal
      to the names of the members and types defined by bisonc++ itself
      (see the next sections). This requirement is not enforced by
      bisonc++, but compilation errors may result if this requirement is
      violated.
- o
- %token-class classname
- Classname defines the name of the Tokens class that is
      defined when the %token-path directive or option (see below) is
      specified. If token-path isn’t specified then this directive
      is ignored. By default the class name Tokens is used.
- o
- %token-namespace namespace
- If token-path is specified (see below) then namespace
      defines the namespace of the Tokens class. By default no namespace
      is used.
- o
- %token-path pathname
- Pathname defines the path name of the file to contain the struct
      Tokens defining the enumeration Tokens_ containing the symbolic
      tokens of the generated grammar. If this option is specified the
      ParserBase class is derived from it, thus making the tokens
      available to the generated parser class. The name of the struct
      Tokens can be altered using the token-class directive or
      option. By default (if token-path is not specified) the tokens are
      defined as the enum Tokens_ in the ParserBase class. If
      pathname doesn’t exist it is created by bisonc++. If
      the file pathname already exists it is rewritten at each new run of
      bisonc++.
- o
- %type <type> nonterminal ...
- In combination with %polymorphic or %union: associate the
      semantic value of a nonterminal symbol with a polymorphic semantic value
      tag or union field defined by these directives.
- o
- %union union-definition
- Acts identically to the identically named bison and bison++
      declaration. Bisonc++ generates a union, named STYPE_, as
      its semantic type.
- o
- %weak-tags
- This directive is ignored unless the %polymorphic directive was
      specified. It results in the declaration of enum Tag_ rather
      than enum class Tag_. When in doubt, don’t use this
      directive.
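The %left and %right entries above state which action wins a shift/reduce conflict: a reduction for left-associative tokens, a shift for right-associative tokens. What that means for the grouping of a chain such as 8 - 3 - 2 can be shown without a generated parser. The sketch below is an illustration only, not bisonc++ output; the function names reduceFirst and shiftFirst are hypothetical, and both assume a non-empty operand list.

```cpp
#include <cstddef>
#include <vector>

// Groups a op b op c the way a parser preferring reductions (%left) does:
// ((a op b) op c). The operand list must not be empty.
template <typename Op>
double reduceFirst(std::vector<double> const &operands, Op op)
{
    double result = operands.front();
    for (std::size_t idx = 1; idx != operands.size(); ++idx)
        result = op(result, operands[idx]);
    return result;
}

// Groups the same chain the way a parser preferring shifts (%right) does:
// (a op (b op c)). The operand list must not be empty.
template <typename Op>
double shiftFirst(std::vector<double> const &operands, Op op)
{
    double result = operands.back();
    for (std::size_t idx = operands.size() - 1; idx-- != 0; )
        result = op(operands[idx], result);
    return result;
}
```

With subtraction as the operator and the operands 8, 3 and 2, reduceFirst computes (8 - 3) - 2 and shiftFirst computes 8 - (3 - 2), which is why arithmetic operators like '-' are normally declared using %left.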
    
  
Like bison(1), bisonc++ by default uses int
    semantic values, and also supports the %stype and %union
    directives for using single-type or traditional C-type unions as
    semantic values. These types of semantic values are covered in
    bisonc++’s manual.
In addition, the %polymorphic directive can be specified to
    generate a parser using `polymorphic’ semantic values. In this case
    semantic values are specified as pairs, consisting of tags (which are
    C++ identifiers), and C++ (pointer or value) type names. Tags
    and type names are separated by colons. Multiple tag and type name
    combinations are separated by semicolons, and an optional semicolon ends the
    final tag/type pair.
Here is an example, defining three semantic values: an int,
    a std::string and a std::vector<double>:
    %polymorphic INT: int; STRING: std::string; 
                 VECT: std::vector<double>
        
The identifier to the left of the colon is called the tag-identifier (or
  simply tag), and the type name to the right of the colon is called the
  type-name. Starting with bisonc++ version 4.12.00 the types no
  longer have to provide default constructors.
When polymorphic type-names refer to types that have not yet been
    declared by the parser’s base class header, then these types must be
    (directly or indirectly) declared in a header file whose location is
    specified using the %baseclass-preinclude directive.
%type directives are used to associate (non-)terminals with
    semantic value types. E.g., after:
    %polymorphic INT: int; TEXT: std::string
    %type <INT> expr
        
the expr nonterminal returns int semantic values. In a rule like:
    expr:
        expr '+' expr
        {
            // Action block: C++ statements here.
        }
        
symbols $$, $1, and $3 represent int values, and can be
  used that way in the C++ action block.
Definitions and declarations
The %polymorphic directive adds the following definitions
    and declarations to the generated base class header and parser source file
    (if the %namespace directive was used then all declared/defined
    elements are placed inside the namespace that is specified by the
    %namespace directive):
  - o
- All semantic value type identifiers are collected in a strongly typed
      `Tag_’ enumeration. E.g.,
    
 enum class Tag_
 {
 INT,
 STRING,
 VECT
 };
 
 
- o
- An anonymous enum defining the symbolic constant sizeofTag_
      equal to the number of tags in the Tag_ enumeration.
- o
- The namespace Meta_ contains almost all of the code implementing
      polymorphic values.
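A practical consequence of the strongly typed Tag_ enumeration is that tags must be written with their Tag_:: qualification and do not convert implicitly to integral values. The sketch below uses stand-in declarations for the three-tag example shown earlier. The helper indexOf is hypothetical, and the declarations are written out by hand here rather than copied from generated code.

```cpp
#include <type_traits>

// Stand-in for the enumeration generated for
//     %polymorphic INT: int; STRING: std::string; VECT: std::vector<double>
enum class Tag_
{
    INT,
    STRING,
    VECT
};

// Stand-in for the anonymous enum holding the number of tags.
enum
{
    sizeofTag_ = 3
};

// Hypothetical helper: a scoped enumeration needs an explicit cast before it
// can be used as an index.
constexpr int indexOf(Tag_ tag)
{
    return static_cast<int>(tag);
}

static_assert(!std::is_convertible<Tag_, int>::value,
              "enum class tags do not convert implicitly to int");
static_assert(indexOf(Tag_::VECT) + 1 == static_cast<int>(sizeofTag_),
              "the last tag's index is sizeofTag_ - 1");
```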
    
  
The namespace Meta_ contains, among other classes, the class
    SType. The parser’s semantic value type STYPE_ is equal
    to Meta_::SType.
STYPE_ equals Meta_::SType
Meta_::SType provides the standard user interface for using
    polymorphic semantic data types. It declares the following public
  interface:
  - o
- Constructors: Default, copy and move constructors. No data can be
      retrieved from SType objects that were constructed by
      SType’s default constructor, but they can accept values of
      defined polymorphic types, which may then be retrieved from those
    objects.
- o
- Operators: The standard overloaded assignment operators (copy and move
      assignment operators) are available.
- In addition the members
    
 SType &operator=(Type const &value)
and
 SType &operator=(Type &&tmp)
 
 are defined for each of the polymorphic semantic value types. Up to version
      6.03.00 these members were defined as member templates, which sometimes
      resulted in awkward compilation errors: with member templates
      Type must exactly match one of the defined polymorphic semantic
      types, since Type is used to determine the appropriate
      Meta_::Tag_ value. As a consequence, if, e.g., a polymorphic type
      %polymorphic INT: int is defined, then an assignment like $$
      = true fails, since the inferred type is bool and no
      matching polymorphic type is available. Now that the assignment operators
      are defined as plain member functions this problem is no longer
      encountered, because the compiler may apply standard type conversions.
      Note that ambiguities may still be encountered. If, e.g.,
      polymorphic types are defined for int and char and an
      expression like $$ = 30U is used, the compiler cannot tell whether
      $$ refers to the int or to the char semantic value. A
      standard (static) cast, or explicitly calling the assign member
      (see the next item), solves these kinds of ambiguities.
- When operator=(Type const &value) is used, the left-hand side
      SType object receives a copy of value; when
      operator=(Type &&tmp) is used, tmp is
      move-assigned to the left-hand side SType object;
- o
- void assign<tag>(Args &&...args)
- The tag
      template argument must be a Tag_ value. This member function
      constructs a semantic value of the type matching tag from the
      arguments that are passed to this member (zero arguments are OK if the
      type associated with tag supports default construction). The
      constructed value (not a copy of this value) is then stored in the
      STYPE_ object for which assign has been called.
- As a Meta_::Tag_ value must be specified when using assign
      the compiler can use the explicit tag to convert assign’s
      arguments to an SType object of the type matching the specified
      tag.
- The member assign can be used to store a specific polymorphic
      semantic value in an STYPE_ object. It differs from the set of
      operator=(Type) members in that assign accepts multiple
      arguments to construct the requested SType value from, whereas the
      operator= members only accept single arguments of defined
      polymorphic types.
- To initialize an STYPE_ object with a default STYPE_ value,
      direct assignment can be used (e.g., d_lval_ = STYPE_{}). To assign
      a semantic value to a production rule using assign the _$$
      notation must be used, as $$ is interpreted as the polymorphic
      value type that is associated with the production rule:
    
_$$.assign<Tag_::CHAR>(30U);
 
 
- o
- DataType &get<tag>(), and DataType const
      &get<tag>() const
- These members return references to the
      object’s semantic values. The tag must be a Tag_
      value: its specification tells the compiler which semantic value type it
      must use.
- When the option/directive tag-mismatches on was specified then
      get, when called from the generated parse function, performs
      a run-time check to confirm that the specified tag corresponds to
      the object’s actual Tag_ value. If a mismatch is observed, then
      the parsing function aborts with a fatal error message. When shorthand
      notations (like $$ and $1) are used in production
      rules’ action blocks, then bisonc++ can determine the
      correct tag, preventing the run-time check from failing.
- But once a fatal error is encountered, it can be difficult to
      determine which action block generated the error. If this happens, then
      consider regenerating the parser specifying the --debug option, and
      calling parser.setDebug(Parser::ACTIONCASES) before calling the
      parser’s parse function.
- Following this the case-entry numbers of the switch which is
      defined in the parser’s executeAction member are inserted
      into the standard output stream just before the matching statements are
      executed. The action case number that’s reported just before the
      program reports the fatal error tells you in which of the grammar’s
      action blocks the error was encountered.
- o
- Tag_ tag() const
- The tag matching the semantic value’s
      polymorphic type is returned. The returned value is a valid Tag_
      value when the SType object’s valid member returns
      true.
- By default, or after assigning a plain (default) STYPE_ object to
      an STYPE_ object (e.g., using a statement like $$ =
      STYPE_{}), valid returns false, and the tag
      member returns Meta_::sizeofTag_.
- o
- bool valid() const
- The value true is returned if the object contains a semantic value.
      Otherwise false is returned. Note that default STYPE_ values
      can be assigned to STYPE_ objects, but they do not represent valid
      semantic values. See also the previous description of the tag
      member.
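
The interplay of assign, get, tag and valid can be illustrated with a small stand-alone model. This is only a sketch: the class SemValue, the enum Tag and the helper TypeOf are hypothetical names, std::variant is used as storage, and a tag mismatch throws an exception instead of ending the parse with a fatal error. It is not the code that bisonc++ generates.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>
#include <variant>

// Hypothetical tags; sizeofTag plays the role of Meta_::sizeofTag_.
enum class Tag { INT, TEXT, sizeofTag };

// Maps a tag to its data type (illustrative helper, not part of bisonc++).
template <Tag tag> struct TypeOf;
template <> struct TypeOf<Tag::INT>  { using type = int; };
template <> struct TypeOf<Tag::TEXT> { using type = std::string; };

class SemValue              // stand-in for STYPE_, not bisonc++'s own class
{
    std::variant<std::monostate, int, std::string> d_data;
    Tag d_tag = Tag::sizeofTag;

  public:
    // Construct the requested value in place from any number of arguments.
    template <Tag tag, typename... Args>
    void assign(Args &&...args)
    {
        d_data.emplace<typename TypeOf<tag>::type>(std::forward<Args>(args)...);
        d_tag = tag;
    }

    // Return a reference to the stored value; a tag mismatch is reported
    // by an exception in this sketch.
    template <Tag tag>
    typename TypeOf<tag>::type &get()
    {
        if (d_tag != tag)
            throw std::logic_error("tag mismatch");
        return std::get<typename TypeOf<tag>::type>(d_data);
    }

    Tag tag() const    { return d_tag; }
    bool valid() const { return d_tag != Tag::sizeofTag; }
};
```

With this model, a default-constructed SemValue is not valid and reports Tag::sizeofTag, while value.assign<Tag::TEXT>(3, 'x') builds the string from two constructor arguments, which a single-argument operator= could not do.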
    
  
Inside action blocks dollar-notations can be used to retrieve and
    assign values from/to the elements of production rules. Type directives are
    used to associate dollar-notations with semantic types.
When %stype is specified (and with the default int
    semantic value type) the following dollar-notations are available:
  - o
- $$ = 
- A value is assigned to the rule’s nonterminal’s semantic
      value. The right-hand side (rhs) of the assignment expression must be an
      expression of a type that can be assigned to the STYPE_ type.
- o
- $$(expr)
- Same as the previous dollar-notation: expr’s value is
      assigned to the rule’s nonterminal’s semantic value.
- o
- _$$
- This refers to the semantic value of the rule’s nonterminal.
- o
- $$
- Same as the previous item: this refers to the semantic value of the
      rule’s nonterminal.
- o
- $$.
- If STYPE_ is a class-type then this dollar-notation is shorthand
      for the member selector operator, applied to the rule’s
      nonterminal’s semantic value.
- o
- $$->
- If STYPE_ is a class-type then this dollar-notation is shorthand
      for the pointer to member operator, applied to the rule’s
      nonterminal’s semantic value.
- o
- _$1
- This refers to the current production rule’s first
      component’s semantic value.
- o
- $1
- Same as the previous dollar-notation: this refers to the current
      production rule’s first component’s semantic value.
- o
- $1.
- If STYPE_ is a class-type then this dollar-notation is shorthand
      for the member selector operator, applied to the current production
      rule’s first component’s semantic value.
- o
- $1->
- If STYPE_ is a class-type then this dollar-notation is shorthand
      for the pointer to member operator, applied to the current production
      rule’s first component’s semantic value.
- o
- _$-1
- This refers to the semantic value of a component in a production rule,
      listed immediately before the current rule’s nonterminal ($-2
      refers to a component used two elements before the current nonterminal,
      etc.).
- o
- $-1
- Same as the previous item: this refers to the semantic value of a
      component in a production rule, listed immediately before the current
      rule’s nonterminal.
- o
- $-1.
- If STYPE_ is a class-type then this dollar-notation is shorthand
      for the member selector operator, applied to the semantic value of some
      production rule element, 1 element before the current rule’s
      nonterminal.
- o
- $-1->
- If STYPE_ is a class-type then this dollar-notation is shorthand
      for the pointer to member operator, applied to the semantic value of some
      production rule element, 1 element before the current rule’s
      nonterminal.
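
How such notations relate to the parser's value stack can be sketched with a miniature stack. The struct MiniStack and the function reduceAdd are hypothetical, and the mapping of $i to an offset i - n from the stack's top (for a rule with n elements) is an assumption made for this illustration, modeled on the vs_ member of the generated base class.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

using STYPE_ = int;                     // the default semantic value type

// Hypothetical miniature of the base class's state/value stack.
struct MiniStack
{
    std::vector<std::pair<std::size_t, STYPE_>> d_stack;   // state, semval

    void push(STYPE_ value) { d_stack.emplace_back(0, value); }

    // Value at offset idx (idx <= 0) relative to the top of the stack.
    STYPE_ &vs(int idx) { return (&d_stack.back() + idx)->second; }
};

// Models the action of
//     expression: expression '+' expression { $$ = $1 + $3; }
// Assumption: with n = 3 elements, $i is reached as vs(i - n).
STYPE_ reduceAdd(MiniStack &stack)
{
    int const n = 3;
    STYPE_ result = stack.vs(1 - n) + stack.vs(3 - n);  // $$ = $1 + $3
    stack.d_stack.resize(stack.d_stack.size() - n);     // pop the n elements
    stack.push(result);                                 // $$ replaces them
    return result;
}
```

Pushing 2, a placeholder for '+', and 5, and then calling reduceAdd leaves a single element holding 7 on the stack.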
    
  
When %union is specified these dollar-notations are
    available:
  - o
- $$ = 
- A value is assigned to the rule’s nonterminal’s semantic
      value. If the rule’s nonterminal was associated with one of the
      union’s field types, then the matching union field receives the
      value of the assignment expression’s right-hand side. If no
      association was defined then the variable representing the
      nonterminal’s semantic value is a plain union (i.e., STYPE_)
      variable.
- o
- $$(expr)
- Expr’s value is assigned to the rule’s
      nonterminal’s plain union (i.e., STYPE_) type. Any
      association that may have been defined between the nonterminal and a union
      field is ignored.
- o
- _$$
- This refers to the rule’s nonterminal’s plain union (i.e.,
      STYPE_) type. Any association that may have been defined between
      the nonterminal and a union field is ignored.
- o
- $$
- This refers to the rule’s nonterminal’s semantic value. If
      it was associated with one of the union’s types, then $$
      refers to the associated union field. If no association was defined then
      $$ represents a plain union (i.e., STYPE_) type of
    variable.
- o
- $$.
- If the rule’s nonterminal’s semantic value was associated
      with one of the union’s types, then $$. is shorthand for the
      member selector operator, applied to the associated union field type. If
      no association was defined then $$. is shorthand for the field
      selector operator, applied to the nonterminal’s semantic
      value’s plain union (i.e., STYPE_) type.
- o
- $$->
- If the rule’s nonterminal’s semantic value was associated
      with one of the union’s types, then $$-> is shorthand for
      the pointer to member operator, applied to the associated union field
      type. If no association was defined then an error message is issued, as
      the pointer to member operator is not defined for plain union types.
- o
- _$1
- This refers to the current production rule’s first
      component’s plain union (STYPE_) value.
- o
- $1
- This shorthand refers to the semantic value of the production
      rule’s first element. If it was associated with one of the
      union’s types, then $1 refers to the associated union field.
      If no association was defined then $1 represents a plain union
      (i.e., STYPE_) type of variable.
- o
- $1.
- If the production rule’s first component’s semantic value
      was associated with one of the union’s types, then $1. is
      shorthand for the member selector operator, applied to the associated
      union field type. If no association was defined then $1. is
      shorthand for the field selector operator, applied to the first
      component’s semantic value’s plain union (i.e.,
      STYPE_) type.
- o
- $1->
- If the production rule’s first component’s semantic value
      was associated with one of the union’s types, then $1->
      is shorthand for the pointer to member operator, applied to the associated
      union field type. If no association was defined then an error message is
      issued, as the pointer to member operator is not defined for plain union
      types.
- o
- _$-1
- This refers to the plain union (STYPE_) value of a component in a
      production rule, listed immediately before the current rule’s
      nonterminal ($-2 refers to a component used two elements before the
      current nonterminal, etc.).
- o
- $-1
- Same: this refers to the plain union (STYPE_) value of a component
      in a production rule, listed immediately before the current rule’s
      nonterminal ($-2 refers to a component used two elements before the
      current nonterminal, etc.).
- o
- $-1.
- This is shorthand for the field selector operator applied to the plain
      union (STYPE_) value of some production rule element, 1 element
      before the current rule’s nonterminal.
- o
- $-1->
- This shorthand refers to the pointer to member operator applied to the
      plain union (STYPE_) value of some production rule element, 1
      element before the current rule’s nonterminal. Its use results in
      an error message, as the pointer to member operator is not defined for
      plain union types.
- o
- $<field>-1
- This refers to the field union field of a component in a production
      rule, listed immediately before the current rule’s nonterminal.
      Note that the validity of the specified field for that particular
      component cannot be verified by bisonc++.
- o
- $<field>-1.
- This refers to the member selector operator of the field union
      field of a component in a production rule, listed immediately before the
      current rule’s nonterminal. Note that the validity of the specified
      field for that particular component cannot be verified by
    bisonc++.
- o
- $<field>-1->
- This refers to the pointer to member operator
      of the field union field of a component in a production rule,
      listed immediately before the current rule’s nonterminal. Note that
      the validity of the specified field for that particular component cannot
      be verified by bisonc++.
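
The difference between a typed notation and the plain union can be sketched in ordinary C++. The union below and its fields i and text are hypothetical, as are the functions typedAssign and textSize; the associations mentioned in the comments are assumed to have been made with type directives.

```cpp
#include <cstddef>
#include <string>

// Hypothetical union, as it might be declared by a %union directive.
union STYPE_
{
    int          i;
    std::string *text;
};

// Models `$$ = $1 + 1` when both the rule's nonterminal and its first
// element are associated with field i: each notation resolves to the field.
void typedAssign(STYPE_ &lhs, STYPE_ const &first)
{
    lhs.i = first.i + 1;
}

// Models `$1->size()` for an element associated with field text: the arrow
// is applied to the field, which is a pointer. Applied to the plain union
// itself (as with _$1) the arrow would not compile; there the field has to
// be named explicitly, as in `_$1.text->size()`.
std::size_t textSize(STYPE_ const &first)
{
    return first.text->size();
}
```

This also shows why bisonc++ reports an error for -> when no association exists: a union object is not a pointer, so there is nothing for the arrow to dereference.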
    
  
When %polymorphic is specified these dollar-notations can
    be used:
  - o
- $$ = 
- A semantic value is assigned to the rule’s nonterminal’s
      semantic value. The right-hand side (rhs) of the assignment expression
      must be an expression of the type that is associated with $$. This
      assignment operation assumes that the type of the rhs-expression equals
      $$’s semantic value type. If the types don’t match the
      compiler issues a compilation error when compiling parse.cc.
      Casting the rhs to the correct value type is possible, but in that case
      the function call operator (see the next item) is preferred, as it does
      not require casting. If no semantic value type was associated with $$ then
      the assignment $$ = STYPE_{} can be used.
- o
- $$(expr)
- A value is assigned to the rule’s nonterminal’s semantic
      value. Expr must be of a type that can be statically cast to
      $$’s semantic value type. The required static_cast is
      generated by bisonc++ and doesn’t have to be specified for
      expr.
- o
- _$$
- This refers to the rule’s nonterminal’s semantic value,
      disregarding any polymorphic type that might have been associated with the
      rule’s nonterminal.
- o
- $$
- If no polymorphic type was associated with the rule’s nonterminal
      then this is shorthand for a reference to the rule’s plain
      STYPE_ value. If a polymorphic value type was associated with the
      rule’s nonterminal then this shorthand represents a reference to a
      value of that particular type.
- o
- $$.
- If no polymorphic type was associated with the rule’s nonterminal
      then this is shorthand for the member selector operator, applied to a
      reference to the rule’s nonterminal’s STYPE_ value.
      If a polymorphic value type was associated with the rule’s
      nonterminal then this shorthand represents the member selector operator,
      applied to a reference of that particular type.
- o
- $$->
- If no polymorphic type was associated with the rule’s nonterminal
      then this is shorthand for the pointer to member operator, applied to a
      reference to the rule’s nonterminal’s STYPE_ value.
      If a polymorphic value type was associated with the rule’s
      nonterminal then this shorthand represents the pointer to member operator,
      applied to a reference of that particular type.
- o
- _$1
- This refers to the current production rule’s first
      component’s generic STYPE_ value.
- o
- $1
- This shorthand refers to the semantic value of the production
      rule’s first element. If it was associated with a polymorphic type,
      then $1 refers to a value of that particular type. If no
      association was defined then $1 represents a generic STYPE_
      value.
- o
- $1.
- If the production rule’s first component’s semantic value
      was associated with a polymorphic type, then $1. is shorthand for
      the member selector operator, applied to the value of the associated
      polymorphic type. If no association was defined then $1. is
      shorthand for the member selector operator, applied to the first
      component’s generic STYPE_ value.
- o
- $1->
- If the production rule’s first component’s semantic value
      was associated with a polymorphic type, then $1-> is shorthand
      for the pointer to member operator, applied to the value of the associated
      polymorphic type. If no association was defined then $1-> is
      shorthand for the pointer to member operator, applied to the first
      component’s generic STYPE_ value.
- o
- _$-1
- This refers to the generic (STYPE_) value of a component in a
      production rule, listed immediately before the current rule’s
      nonterminal ($-2 refers to a component used two elements before the
      current nonterminal, etc.).
- o
- $-1
- Same: this refers to the generic (STYPE_) value of a component in a
      production rule, listed immediately before the current rule’s
      nonterminal ($-2 refers to a component used two elements before the
      current nonterminal, etc.).
- o
- $-1.
- This is shorthand for the member selector operator applied to the
      generic STYPE_ value of some production rule element, 1 element
      before the current rule’s nonterminal.
- o
- $-1->
- This is shorthand for the pointer to member operator applied to the
      generic STYPE_ value of some production rule element, 1 element
      before the current rule’s nonterminal.
- o
- $<tag>-1
- This shorthand represents a reference to the semantic value of the
      polymorphic type associated with tag of some production rule
      element, 1 element before the current rule’s nonterminal.
- If, when using the generated parser’s class parse function,
      the polymorphic type of that element turns out not to match the type that
      is associated with tag then a run-time fatal error results.
- If that happens, and the debug option/directive had been specified
      when bisonc++ was run, then rerun the program after specifying
      parser.setDebug(Parser::ACTIONCASES) to locate the parse
      function’s action block where the fatal error was encountered.
- o
- $<tag>-1.
- This shorthand represents the member selector operator, applied to the
      semantic value of the polymorphic type associated with tag of some
      production rule element, 1 element before the current rule’s
      nonterminal.
- If, when using the generated parser’s class parse function,
      the polymorphic type of that element turns out not to match the type that
      is associated with tag then a run-time fatal error results. The
      procedure suggested at the previous ($<tag>-1) item for
      solving such errors can be applied here as well.
- o
- $<tag>-1->
- This shorthand represents the pointer to member operator, applied
      to the semantic value of the polymorphic type associated with tag
      of some production rule element, 1 element before the current
      rule’s nonterminal.
- If, when using the generated parser’s class parse function,
      the polymorphic type of that element turns out not to match the type that
      is associated with tag then a run-time fatal error results. The
      procedure suggested at the previous ($<tag>-1) item for
      solving such errors can be applied here as well.
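
Why the function call notation $$(expr) can succeed where $$ = expr fails to compile can be shown with a type having an explicit constructor. The struct Label and the function callNotation are hypothetical; callNotation merely imitates the static_cast described for $$(expr) and is not code produced by bisonc++.

```cpp
#include <string>

// Hypothetical polymorphic value type with an explicit constructor.
struct Label
{
    std::string d_text;

    explicit Label(char const *text)
    :
        d_text(text)
    {}
};

// Imitates `$$(expr)`: expr is statically cast to the destination's type
// and the result is then assigned.
template <typename SemType, typename Expr>
void callNotation(SemType &dest, Expr const &expr)
{
    dest = static_cast<SemType>(expr);
}

// By contrast, `dest = expr` (the `$$ = expr` form) does not compile for a
// Label and a string literal: the constructor is explicit, so no implicit
// conversion is available.
```

After Label label("old"), the call callNotation(label, "new") compiles and stores "new", whereas the plain assignment label = "new" is rejected by the compiler.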
    
  
To avoid collisions with names defined by the parser’s
    (base) class, the following identifiers should not be used as token
  names:
  - o
- Identifiers ending in an underscore;
- o
- Any of the following identifiers: ABORT, ACCEPT, ERROR, clearin,
      debug, or setDebug.
    
  
All DECLARATIONS and DEFINE symbols not listed above
    but defined in bison++ are obsolete with bisonc++. In
    particular, there is no %header{ ... %} section anymore. Also,
    all DEFINE symbols related to member functions are now obsolete.
    There is no need for these symbols anymore as they can simply be declared in
    the class header file and defined elsewhere.
The tokens defined in the grammar files processed by
    bisonc++ must usually also be available to the lexical scanner,
    returning those tokens when certain regular expressions are matched. E.g., a
    NUMBER token may be used in the grammar and the lexical scanner may
    be expected to return that token when the input matches the [0-9]+
    regular expression. To avoid circular dependencies among classes the tokens
    can be written to a separate file using the token-path directive or
    option. The location and name of this file are specified by the
    token-path specification, and the file is generated from scratch at every run
    of bisonc++. By default the grammar’s symbolic tokens are made
    available in the class Tokens, and classes may refer to its tokens
    using the Tokens class scope (e.g., Tokens::NUMBER).
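
As a minimal sketch of using the Tokens class scope outside of the parser: the struct below copies the shape of the generated tokens file shown in the EXAMPLE section, and classify is a hypothetical hand-written function standing in for a scanner that would normally be generated by flexc++(1).

```cpp
#include <cctype>
#include <string>

// Same shape as the generated tokens file of the EXAMPLE section.
struct Tokens
{
    enum Tokens_
    {
        NUMBER = 257,
        EOLN,
        UNARY,
    };
};

// Hypothetical classifier: returns the token matching the given text.
int classify(std::string const &matched)
{
    if (matched == "\n")
        return Tokens::EOLN;

    bool digits = !matched.empty();
    for (unsigned char ch: matched)         // all characters must be digits
        digits = digits && std::isdigit(ch);

    if (digits)
        return Tokens::NUMBER;

    // Any other single character is its own token; 0 for empty input.
    return matched.empty() ? 0 : matched[0];
}
```

Since classify only depends on Tokens, it can be compiled without including any of the parser's headers.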
Before bisonc++ version 6.04.00 tokens were made available
    by including the file parserbase.h, using a simple #define
    suggesting that the tokens were in fact defined by the parser class itself.
    Using this scheme lexical scanner specifications returned, e.g.,
    Parser::NUMBER when [0-9]+ was matched. Unless the
    token-path directive or option is used this approach is still
    available, but its use is deprecated.
Using a fairly traditional example, we construct a simple
    calculator below. The basic operators as well as parentheses can be used to
    specify expressions, and each expression should be terminated by a newline.
    The program terminates when a q is entered. Empty lines result in a
    mere prompt.
First an associated grammar is constructed. When a syntactic error
    is encountered all tokens are skipped until the next newline and a simple
    message is printed using the default error function. It is assumed
    that no semantic errors occur (in particular, no divisions by zero). The
    grammar is decorated with actions performed when the corresponding
    grammatical production rule is recognized. The grammar itself is rather
    standard and straightforward, but note the first part of the specification
    file, containing various other directives, among which the %scanner
    directive, resulting in a composed d_scanner object as well as an
    implementation of the member function int lex, and the
    %token-path directive, defining the class Tokens in the file
    ../tokens/tokens.h. In this example, the Scanner class is
    generated by flexc++(1). The details of constructing a class using
    flexc++ are beyond the scope of this man-page, but
    flexc++’s specification file is shown below.
Here is bisonc++’s input file:
%filenames parser
%scanner    ../scanner/scanner.h
%token-path ../tokens/tokens.h
                                // lowest precedence
%token  NUMBER                  // integral numbers
        EOLN                    // newline
%left   '+' '-'
%left   '*' '/'
%right  UNARY
                                // highest precedence 
%%
expressions:
    expressions  evaluate
|
    prompt
;
evaluate:
    alternative prompt
;
prompt:
    {
        prompt();
    }
;
alternative:
    expression EOLN
    {
        cout << $1 << endl;
    }
|
    'q' done
|
    EOLN
|
    error EOLN
;
done:
    {
        cout << "Done.\n";
        ACCEPT();
    }
;
expression:
    expression '+' expression
    {
        $$ = $1 + $3;
    }
|
    expression '-' expression
    {
        $$ = $1 - $3;
    }
|
    expression '*' expression
    {
        $$ = $1 * $3;
    }
|
    expression '/' expression
    {
        $$ = $1 / $3;
    }
|
    '-' expression      %prec UNARY
    {
        $$ = -$2;
    }
|
    '+' expression      %prec UNARY
    {
        $$ = $2;
    }
|
    '(' expression ')'
    {
        $$ = $2;
    }
|
    NUMBER
    {
        $$ = stoul(d_scanner.matched());
    }
;
Bisonc++ processes this file, generating the following
    files:
  - o
- The parser’s base class, which should not be modified by the
      programmer:
- 
    // Generated by Bisonc++ V6.04.02 on Thu, 18 Mar 2021 13:30:50 +0100
// hdr/includes
#ifndef ParserBase_h_included
#define ParserBase_h_included
#include <exception>
#include <vector>
#include <iostream>
// $insert preincludes
#include "../tokens/tokens.h"
// hdr/baseclass
namespace // anonymous
{
 struct PI_;
}
// $insert parserbase
class ParserBase: public Tokens
{
 public:
 enum DebugMode_
 {
 OFF           = 0,
 ON            = 1 << 0,
 ACTIONCASES   = 1 << 1
 };
// $insert tokens
// $insert STYPE
typedef int STYPE_;
 private:
 // state  semval
 typedef std::pair<size_t, STYPE_> StatePair;
 // token   semval
 typedef std::pair<int,    STYPE_> TokenPair;
 int d_stackIdx = -1;
 std::vector<StatePair> d_stateStack;
 StatePair  *d_vsp = 0;       // points to the topmost value stack
 size_t      d_state = 0;
 TokenPair   d_next;
 int         d_token;
 bool        d_terminalToken = false;
 bool        d_recovery = false;
 protected:
 enum Return_
 {
 PARSE_ACCEPT_ = 0,   // values used as parse()’s return values
 PARSE_ABORT_  = 1
 };
 enum ErrorRecovery_
 {
 UNEXPECTED_TOKEN_,
 };
 bool        d_actionCases_ = false;    // set by options/directives
 bool        d_debug_ = true;
 size_t      d_requiredTokens_;
 size_t      d_nErrors_;                // initialized by clearin()
 size_t      d_acceptedTokens_;
 STYPE_     d_val_;
 ParserBase();
 void ABORT() const;
 void ACCEPT() const;
 void ERROR() const;
 STYPE_ &vs_(int idx);             // value stack element idx
 int  lookup_() const;
 int  savedToken_() const;
 int  token_() const;
 size_t stackSize_() const;
 size_t state_() const;
 size_t top_() const;
 void clearin_();
 void errorVerbose_();
 void lex_(int token);
 void popToken_();
 void pop_(size_t count = 1);
 void pushToken_(int token);
 void push_(size_t nextState);
 void redoToken_();
 bool recovery_() const;
 void reduce_(int rule);
 void shift_(int state);
 void startRecovery_();
 public:
 void setDebug(bool mode);
 void setDebug(DebugMode_ mode);
}; 
// hdr/abort
inline void ParserBase::ABORT() const
{
 throw PARSE_ABORT_;
}
// hdr/accept
inline void ParserBase::ACCEPT() const
{
 throw PARSE_ACCEPT_;
}
// hdr/error
inline void ParserBase::ERROR() const
{
 throw UNEXPECTED_TOKEN_;
}
// hdr/savedtoken
inline int ParserBase::savedToken_() const
{
 return d_next.first;
}
// hdr/opbitand
inline ParserBase::DebugMode_ operator&(ParserBase::DebugMode_ lhs,
 ParserBase::DebugMode_ rhs)
{
 return static_cast<ParserBase::DebugMode_>(
 static_cast<int>(lhs) & rhs);
}
// hdr/opbitor
inline ParserBase::DebugMode_ operator|(ParserBase::DebugMode_ lhs,
 ParserBase::DebugMode_ rhs)
{
 return static_cast<ParserBase::DebugMode_>(static_cast<int>(lhs) | rhs);
};
// hdr/recovery
inline bool ParserBase::recovery_() const
{
 return d_recovery;
}
// hdr/stacksize
inline size_t ParserBase::stackSize_() const
{
 return d_stackIdx + 1;
}
// hdr/state
inline size_t ParserBase::state_() const
{
 return d_state;
}
// hdr/token
inline int ParserBase::token_() const
{
 return d_token;
}
// hdr/vs
inline ParserBase::STYPE_ &ParserBase::vs_(int idx) 
{
 return (d_vsp + idx)->second;
}
#endif
- o
- The parser class parser.h itself. In the grammar specification
      various member functions are used (e.g., done and prompt).
      These functions are so small that they can very well be implemented
      inline. Note that done calls ACCEPT to terminate further
      parsing. ACCEPT and related members (e.g., ABORT) can be
      called from any member called by parse. As a consequence, action
      blocks could contain mere function calls, rather than several statements,
      thus minimizing the need to rerun bisonc++ when an action is
      modified.
- Once bisonc++ has created parser.h additionally required
      members can be added to it (bisonc++ itself won’t modify
      parser.h anymore once it is created), resulting in the following
      final version:
- 
    // Generated by Bisonc++ V5.00.00 on Sun, 03 Apr 2016 17:49:17 +0200
#ifndef Parser_h_included
#define Parser_h_included
// $insert baseclass
#include "parserbase.h"
// $insert scanner.h
#include "../scanner/scanner.h"
#undef Parser
class Parser: public ParserBase
{
 // $insert scannerobject
 Scanner d_scanner;
 
 public:
 int parse();
 private:
 void error();                   // called on (syntax) errors
 int lex();                      // returns the next token from the
 // lexical scanner.
 void print();                   // use, e.g., d_token, d_loc
 void prompt();
 void done();
 // support functions for parse():
 void executeAction_(int ruleNr);
 void errorRecovery_();
 void nextCycle_();
 void nextToken_();
 void print_();
 void exceptionHandler(std::exception const &exc);
};
inline void Parser::prompt()
{
 std::cout << "? " << std::flush;
}
inline void Parser::done()
{
 std::cout << "Done\n";
 ACCEPT();
}
#endif
- o
- The file ../tokens/tokens.h is generated because of the
      %token-path directive. To avoid circular dependencies the tokens
      are made available in a separate file, allowing classes used by the parser
      to use the grammar’s tokens as well. Here is the file specifying
      the grammar’s tokens:
- 
    #ifndef INCLUDED_TOKENS_
#define INCLUDED_TOKENS_
struct Tokens
{
 // Symbolic tokens:
 enum Tokens_
 {
 NUMBER = 257,
 EOLN,
 UNARY,
 };
};
#endif
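
The way ACCEPT and ABORT end parsing from within nested calls can be sketched as follows. The enum mirrors the Return_ values of the base class; accept, abortParse, done and parseSketch are hypothetical stand-ins, and the assumption that the parsing function catches the thrown value and returns it is made for this illustration only.

```cpp
// Mirrors the Return_ values of the generated base class.
enum Return_
{
    PARSE_ACCEPT_ = 0,
    PARSE_ABORT_  = 1
};

// Hypothetical stand-ins for ACCEPT() and ABORT(): both throw a Return_.
inline void accept()     { throw PARSE_ACCEPT_; }
inline void abortParse() { throw PARSE_ABORT_; }

// Hypothetical member called from an action block, comparable to done().
inline void done(bool ok)
{
    if (ok)
        accept();
    else
        abortParse();
}

// Assumed shape of the parsing function: the thrown value is caught and
// becomes the function's return value.
int parseSketch(bool ok)
{
    int result = PARSE_ABORT_;
    try
    {
        while (true)            // stands in for the parsing loop
            done(ok);           // an action ends parsing from a nested call
    }
    catch (Return_ ret)
    {
        result = ret;
    }
    return result;
}
```

Because the value is thrown rather than returned, it does not matter how deeply the call is nested below the action block.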
For the program no additional members had to be defined in the
    class Parser. The member function parse is defined by
    bisonc++ in the source file parse.cc, and it includes
    parser.ih.
As cout is used in the grammar’s actions, a using
    namespace std or comparable directive is required. It is specified in
    parser.ih. Here is the implementation header declaring the standard
    namespace:
// Generated by Bisonc++ V5.00.00 on Sun, 03 Apr 2016 17:51:26 +0200
    // Include this file in the sources of the class Parser.
// $insert class.h
#include "parser.h"
inline void Parser::error()
{
    std::cerr << "Syntax error\n";
}
// $insert lex
inline int Parser::lex()
{
    return d_scanner.lex();
}
inline void Parser::print()         
{
    print_();           // displays tokens if --print was specified
}
inline void Parser::exceptionHandler(std::exception const &exc)         
{
    throw;              // re-implement to handle exceptions thrown by actions
}
    // Add here includes that are only required for the compilation 
    // of Parser’s sources.
    // UN-comment the next using-declaration if you want to use
    // int Parser’s sources symbols from the namespace std without
    // specifying std::
using namespace std;
In the current context the member function parse’s
    implementation is not very relevant (it should not be modified by the
    programmer anyway). It is not shown here, but is available as
    calculator/parser/parse.cc in the distribution’s demos/
    directory after building the calculator using the build script
    provided there.
The lexical scanner is generated by flexc++(1) from the
    following specification file, using the command flexc++ lexer:
// see also regression/calculator/scanner
%interactive
%filenames scanner
%%
[ \t]+                          // skip white space
\n                              return Tokens::EOLN;
[0-9]+                          return Tokens::NUMBER;
.                               return matched()[0];
%%
Finally, here is the program’s main function:
#include "parser/parser.h"
int main()
{
    Parser calculator;
    return calculator.parse();
}
bison(1), bison++(1), bisonc++(1),
    bisonc++api(3), bison.info (using texinfo), flexc++(1),
    https://fbb-git.gitlab.io/bisoncpp/
Lakos, J. (2001) Large Scale C++ Software Design, Addison
    Wesley.
  
  Aho, A.V., Sethi, R., Ullman, J.D. (1986) Compilers, Addison
  Wesley.
Frank B. Brokken (f.b.brokken@rug.nl).