World Library  
Flag as Inappropriate
Email this Article

GNU bison

Article Id: WHEBN0000053189
Reproduction Date:

Title: GNU bison  
Author: World Heritage Encyclopedia
Language: English
Subject: Berkeley Yacc, Yacc, GNU Compiler Collection, Outline of software engineering, Simple LR parser
Collection: Compiling Tools, Cross-Platform Software, Gnu Project Software, Parser Generators
Publisher: World Heritage Encyclopedia

GNU bison

GNU Bison
Developer(s) Robert Corbett, The GNU Project
Stable release 3.0.2 (December 5, 2013 (2013-12-05)[1])
Operating system Cross-platform
Type Parser generator
License GPL (free software)

GNU bison, commonly known as Bison, is a parser generator that is part of the GNU Project. Bison reads a specification of a context-free language, warns about any parsing ambiguities, and generates a parser (either in C, C++, or Java) which reads sequences of tokens and decides whether the sequence conforms to the syntax specified by the grammar. Bison by default generates LALR parsers but can also create GLR parsers.[2]

In POSIX mode, Bison is compatible with yacc, but also has several improvements over this earlier program. flex, an automatic lexical analyser, is often used with Bison, to tokenise input data and provide Bison with tokens.

Bison was originally written by Robert Corbett in 1988. Later, in 1990, Robert Corbett wrote another parser generator named Berkeley Yacc. Bison was made Yacc-compatible by Richard Stallman.

Bison is free software and is available under the GNU General Public License, with an exception (discussed below) allowing its generated code to be used without triggering the copyleft requirements of the licence.


  • A complete reentrant parser example 1
  • Reentrancy 2
  • Using Bison from other languages 3
  • Licence and distribution of generated code 4
    • A GPL-compatible licence is not required 4.1
    • Distribution of packages using Bison 4.2
  • Where is it used? 5
  • See also 6
  • References 7
  • Further reading 8
  • External links 9

A complete reentrant parser example

The following example shows how to use Bison and flex to write a simple calculator program (only addition and multiplication) and a program for creating an abstract syntax tree. The next two files provide definition and implementation of the syntax tree functions.

 * Expression.h
 * Definition of the structure used to build the syntax tree.
#ifndef __EXPRESSION_H__
#define __EXPRESSION_H__

 * @brief The operation type
typedef enum tagEOperationType
} EOperationType;

 * @brief The expression structure
typedef struct tagSExpression
    EOperationType type;///< type of operation

    int value;///< valid only when type is eVALUE
    struct tagSExpression *left; ///< left side of the tree
    struct tagSExpression *right;///< right side of the tree
} SExpression;

 * @brief It creates an identifier
 * @param value The number value
 * @return The expression or NULL in case of no memory
SExpression *createNumber(int value);

 * @brief It creates an operation
 * @param type The operation type
 * @param left The left operand
 * @param right The right operand
 * @return The expression or NULL in case of no memory
SExpression *createOperation(EOperationType type, SExpression *left, SExpression *right);

 * @brief Deletes a expression
 * @param b The expression
void deleteExpression(SExpression *b);

#endif // __EXPRESSION_H__
 * Expression.c
 * Implementation of functions used to build the syntax tree.

#include "Expression.h"


 * @brief Allocates space for expression
 * @return The expression or NULL if not enough memory
static SExpression *allocateExpression()
    SExpression *b = (SExpression *)malloc(sizeof(SExpression));

    if (b == NULL)
        return NULL;

    b->type = eVALUE;
    b->value = 0;

    b->left = NULL;
    b->right = NULL;

    return b;

SExpression *createNumber(int value)
    SExpression *b = allocateExpression();

    if (b == NULL)
        return NULL;

    b->type = eVALUE;
    b->value = value;

    return b;

SExpression *createOperation(EOperationType type, SExpression *left, SExpression *right)
    SExpression *b = allocateExpression();

    if (b == NULL)
        return NULL;

    b->type = type;
    b->left = left;
    b->right = right;

    return b;

void deleteExpression(SExpression *b)
    if (b == NULL)



The tokens needed by the Bison parser will be generated using flex.

 * Lexer.l file
 * To generate the lexical analyzer run: "flex Lexer.l"
#include "Expression.h"
#include "Parser.h"


%option outfile="Lexer.c" header-file="Lexer.h"
%option warn nodefault
%option reentrant noyywrap never-interactive nounistd
%option bison-bridge
LPAREN      "("
RPAREN      ")"
PLUS        "+"
NUMBER      [0-9]+
WS          [ \r\n\t]*
{WS}            { /* Skip blanks. */ }
{NUMBER}        { sscanf(yytext, "%d", &yylval->value); return TOKEN_NUMBER; }
{MULTIPLY}      { return TOKEN_MULTIPLY; }
{PLUS}          { return TOKEN_PLUS; }
{LPAREN}        { return TOKEN_LPAREN; }
{RPAREN}        { return TOKEN_RPAREN; }
.               {  }
int yyerror(const char *msg) {
    fprintf(stderr,"Error:%s\n",msg); return 0;

Since the tokens are provided by flex we must provide the means to communicate between the parser and the lexer.[3] The data type used for communication, YYSTYPE, is set using Bison's %union declaration.

Since in this sample we use the reentrant version of both flex and yacc we are forced to provide parameters for the yylex function, when called from yyparse.[3] This is done through Bison's %lex-param and %parse-param declarations.[4]

 * Parser.y file
 * To generate the parser run: "bison Parser.y"
#include "Expression.h"
#include "Parser.h"
#include "Lexer.h"

int yyerror(SExpression **expression, yyscan_t scanner, const char *msg) {
    // Add error handling routine as needed

%code requires {

typedef void* yyscan_t;


%output  "Parser.c"
%defines "Parser.h"
%define api.pure
%lex-param   { yyscan_t scanner }
%parse-param { SExpression **expression }
%parse-param { yyscan_t scanner }

%union {
    int value;
    SExpression *expression;
%left '+' TOKEN_PLUS

%type  expr
    : expr { *expression = $1; }
    : expr[L] TOKEN_PLUS expr[R] { $$ = createOperation( ePLUS, $L, $R ); }
    | expr[L] TOKEN_MULTIPLY expr[R] { $$ = createOperation( eMULTIPLY, $L, $R ); }
    | TOKEN_LPAREN expr[E] TOKEN_RPAREN { $$ = $E; }
    | TOKEN_NUMBER { $$ = createNumber($1); }

The code needed to obtain the syntax tree using the parser generated by Bison and the scanner generated by flex is the following.

 * main.c file

#include "Expression.h"
#include "Parser.h"
#include "Lexer.h"
int yyparse(SExpression **expression, yyscan_t scanner);
SExpression *getAST(const char *expr)
    SExpression *expression;
    yyscan_t scanner;
    YY_BUFFER_STATE state;
    if (yylex_init(&scanner)) {
        // couldn't initialize
        return NULL;
    state = yy_scan_string(expr, scanner);
    if (yyparse(&expression, scanner)) {
        // error parsing
        return NULL;
    yy_delete_buffer(state, scanner);
    return expression;
int evaluate(SExpression *e)
    switch (e->type) {
        case eVALUE:
            return e->value;
        case eMULTIPLY:
            return evaluate(e->left) * evaluate(e->right);
        case ePLUS:
            return evaluate(e->left) + evaluate(e->right);
            // shouldn't be here
            return 0;
int main(void)
    SExpression *e = NULL;
    char test[]=" 4 + 2*10 + 3*( 5 + 1 )";
    int result = 0;
    e = getAST(test);
    result = evaluate(e);
    printf("Result of '%s' is %d\n", test, result);
    return 0;

A simple makefile to build the project is the following.

# Makefile

FILES   = Lexer.c Parser.c Expression.c main.c
CC      = g++
CFLAGS  = -g -ansi

test:           $(FILES)
                $(CC) $(CFLAGS) $(FILES) -o test

Lexer.c:        Lexer.l 
                flex Lexer.l

Parser.c:       Parser.y Lexer.c
                bison Parser.y

                rm -f *.o *~ Lexer.c Lexer.h Parser.c Parser.h test


Reentrancy is a feature which has been added to Bison and does not exist in Yacc.

Normally, Bison generates a parser which is not reentrant. In order to achieve reentrancy the declaration %define api.pure must be used. More details on Bison reentrancy can be found in the Bison manual.[5]

Using Bison from other languages

Bison can only generate code for C, C++ and Java.[6] For using the Bison generated parser from other languages a language binding tool such as SWIG can be used.

Licence and distribution of generated code

Because Bison generates source code that in turn gets added to the source code of other software projects, it raises some simple but interesting copyright questions.

A GPL-compatible licence is not required

The code generated by Bison includes significant amounts of code from the Bison project itself. The Bison package is distributed under the terms of the GNU General Public License (GPL) but an exception has been added so that the GPL does not apply to output.[7][8]

Earlier releases of Bison stipulated that parts of its output were also licensed under the GPL, due to the inclusion of the yyparse() function from the original source code in the output.

Distribution of packages using Bison

Free software projects that use Bison may have a choice of whether to distribute the source code which their project feeds into Bison, or the resulting C code outputted by Bison. Both are sufficient for a recipient to be able to compile the project's source code. However, distributing only the input carries the minor inconvenience that the recipients must have a compatible copy of Bison installed so that they can generate the necessary C code when compiling the project. And distributing only the outputted C code creates the problem of making it very difficult for the recipients to modify the parser since this code was written neither by a human nor for humans - its purpose is to be fed directly into a C compiler.

These problems can be avoided by distributing both the input files and the generated code. Most people will compile using the generated code, no different from any other software package, but anyone who wants to modify the parser component can modify the input files first and re-generate the generated files before compiling. Projects distributing both usually don't have the generated files in their revision control systems. The files are only generated when making a release.

Some licences, such as the GPL, require that the source code be in "the preferred form of the work for making modifications to it". GPL'd projects using Bison must thus distribute the files which are the input for Bison. Of course, they can also include the generated files.

Where is it used?

Because Bison was written as a replacement for Yacc, and is largely compatible, the code from a lot of projects using Bison could equally be fed into Yacc, so it's difficult to say that a project's source code "uses" Bison. In many cases, the "use" of Bison could be trivially replaced by the equivalent use of Yacc.

Bison does have features not found in Yacc, so some projects can be truly said to "use" Bison, since Yacc wouldn't suffice.

The following list is of projects which are known to "use" Bison in the looser sense, that they use free software development tools and distribute code which is intended to be fed into Bison or a Bison-compatible package.

  • The Ruby programming language (YARV)
  • The PHP programming language (Zend Parser)
  • GCC started out using Bison, but switched to a hand-written recursive-descent parser for C++ in 2004 (version 3.4),[9] and for C and Objective-C in 2006 (version 4.1)[10]
  • The Go programming language (GC)
  • Bash shell uses a yacc grammar for parsing the command input.
  • LilyPond
  • PostgreSQL[11]

See also

  • Berkeley Yacc (byacc) - another free software Yacc replacement sharing the same author as GNU Bison


  1. ^ Demaille, Akim (2013-12-05). "bison-3.0.2 released". info-gnu. Retrieved 2013-12-24.
  2. ^  
  3. ^ a b GNU Bison Manual: C Scanners with Bison Parsers
  4. ^ GNU Bison Manual: Calling Conventions for Pure Parsers
  5. ^ GNU Bison Manual: A Pure (Reentrant) Parser
  6. ^ GNU Bison Manual: Bison Declaration Summary
  7. ^ GNU Bison Manual: Conditions for Using Bison
  8. ^ A source code file, parse-gram.c, which includes the exception
  9. ^ GCC 3.4 Release Series Changes, New Features, and Fixes
  10. ^ GCC 4.1 Release Series Changes, New Features, and Fixes
  11. ^

Further reading


External links

  • Website in the GNU Project
    • Manual
  • Project home at Savannah
  • Entry in the Free Software Directory
  • Internals of C parsers generated by GNU Bison
  • How to download and install Bison (GNU Parser Generator) on Linux
  • Win32 binaries by GnuWin32 (version 2.4.1)
This article was sourced from Creative Commons Attribution-ShareAlike License; additional terms may apply. World Heritage Encyclopedia content is assembled from numerous content providers, Open Access Publishing, and in compliance with The Fair Access to Science and Technology Research Act (FASTR), Wikimedia Foundation, Inc., Public Library of Science, The Encyclopedia of Life, Open Book Publishers (OBP), PubMed, U.S. National Library of Medicine, National Center for Biotechnology Information, U.S. National Library of Medicine, National Institutes of Health (NIH), U.S. Department of Health & Human Services, and, which sources content from all federal, state, local, tribal, and territorial government publication portals (.gov, .mil, .edu). Funding for and content contributors is made possible from the U.S. Congress, E-Government Act of 2002.
Crowd sourced content that is contributed to World Heritage Encyclopedia is peer reviewed and edited by our editorial staff to ensure quality scholarly research articles.
By using this site, you agree to the Terms of Use and Privacy Policy. World Heritage Encyclopedia™ is a registered trademark of the World Public Library Association, a non-profit organization.

Copyright © World Library Foundation. All rights reserved. eBooks from Project Gutenberg are sponsored by the World Library Foundation,
a 501c(4) Member's Support Non-Profit Organization, and is NOT affiliated with any governmental agency or department.