ocamllex tutorial

This is a tutorial on how to use ocamllex, which is distributed with the OCaml language.

About this document

Large parts of this document are borrowed from the flex manual.

The license terms of this document are NOT related to ocamlyacc; they apply ONLY to this document.

Please mail all comments and suggestions to me.

The companion tutorial for ocamlyacc is available at ocamlyacc tutorial.

The old tutorial and source of the examples used in this document can be found here.


    ocamllex is a tool for generating scanners: programs which recognize lexical patterns in text.

    Some simple examples

    First, some simple examples to get the flavor of how one uses ocamllex. The following ocamllex input specifies a scanner which, whenever it encounters the string "current_directory", replaces it with the current directory: ...

    Format of the input file

    The ocamllex input file consists of four sections: header, definitions, rules, and trailer: ...
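
    As a rough illustration of the four sections, here is a minimal sketch of a complete input file that counts newlines. The names `lines`, `newline`, `count` and `count_lines` are made up for this sketch and do not come from the tutorial.

```ocaml
{
  (* Header: plain OCaml code, copied to the top of the generated file. *)
  let lines = ref 0
}

(* Definitions: names for regular expressions used in the rules. *)
let newline = '\n'

(* Rules: one entry point, here called count (a hypothetical name). *)
rule count = parse
  | newline  { incr lines; count lexbuf }   (* one more line seen *)
  | _        { count lexbuf }               (* skip any other character *)
  | eof      { !lines }                     (* end of input: return the total *)

{
  (* Trailer: plain OCaml code, copied to the end of the generated file. *)
  let count_lines s =
    lines := 0;
    count (Lexing.from_string s)
}
```

    With this input, `count_lines "a\nb\n"` should evaluate to 2. The header and trailer are both optional.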


    The patterns in the input are written using regular expressions in the style of lex, with a more Caml-like syntax.

    How the input is matched

    When the generated scanner is run, it analyzes its input looking for strings which match any of its patterns. If it finds more than one match, it takes the one matching the most text (the "longest match" principle). If it finds two or more matches of the same length, the rule listed first in the ocamllex input file is chosen (the "first match" principle).
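
    The two principles can be seen in a small sketch where a keyword rule and an identifier rule overlap. The token type and the names `letter`, `token` and `first_token` are assumptions made for this illustration only.

```ocaml
{
  (* Hypothetical token type, for illustration only. *)
  type token = IF | IDENT of string | EOF
}

let letter = ['a'-'z' 'A'-'Z' '_']

rule token = parse
  | [' ' '\t' '\n']+  { token lexbuf }   (* skip whitespace *)
  | "if"              { IF }             (* listed first, so it wins ties *)
  | letter+ as id     { IDENT id }       (* wins whenever it matches more text *)
  | eof               { EOF }

{
  (* Return the first token found in the string s. *)
  let first_token s = token (Lexing.from_string s)
}
```

    On the input "if" both the keyword rule and the identifier rule match two characters, so the rule listed first is chosen and the result is `IF`. On the input "iffy" the identifier rule matches four characters against two for the keyword rule, so the longest match gives `IDENT "iffy"`. This sketch has no rule for other characters such as digits, so the scanner raises a `Failure` exception on them.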


    Each pattern in a rule has a corresponding action, which can be an arbitrary OCaml expression.

    The generated scanner

    When invoked as `ocamllex lex.mll`, ocamllex writes its output to the file lex.ml. The generated file contains the scanning functions, the tables they use for matching tokens, and a number of auxiliary routines.
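
    To make this concrete, here is a minimal sketch of a lex.mll whose trailer calls the generated scanning function. The entry point name `token` and the helper `words` are assumptions for this example. Each entry point becomes an OCaml function that takes a `Lexing.lexbuf` as its last argument.

```ocaml
(* lex.mll: a hypothetical input file for this sketch. *)
rule token = parse
  | ['a'-'z']+ as w  { Some w }         (* a lowercase word *)
  | _                { token lexbuf }   (* skip any other character *)
  | eof              { None }           (* end of input *)

{
  (* In the generated lex.ml, token has type Lexing.lexbuf -> string option.
     This helper calls it repeatedly and collects the words it returns. *)
  let words s =
    let lexbuf = Lexing.from_string s in
    let rec loop acc =
      match token lexbuf with
      | Some w -> loop (w :: acc)
      | None -> List.rev acc
    in
    loop []
}
```

    Running `ocamllex lex.mll` and then compiling the resulting lex.ml with `ocamlc` gives a module in which `words "foo 42 bar"` should evaluate to `["foo"; "bar"]`. To scan a file or standard input instead of a string, the lexer buffer can be created with `Lexing.from_channel`.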

    Start conditions

    ocamllex provides a mechanism for conditionally activating rules. When you want to activate another rule, just call its entry point function.
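
    A common use is skipping comments. The sketch below assumes C-style comment delimiters and two entry points named `token` and `comment`. These names, and the helper `words`, are chosen only for illustration.

```ocaml
(* Collect lowercase words, ignoring anything inside C-style comments. *)
rule token = parse
  | "/*"             { comment lexbuf; token lexbuf }  (* activate the comment rule, then resume *)
  | ['a'-'z']+ as w  { w :: token lexbuf }
  | _                { token lexbuf }                  (* skip any other character *)
  | eof              { [] }

and comment = parse
  | "*/"  { () }                               (* end of comment: return to the caller *)
  | _     { comment lexbuf }                   (* ignore the comment body *)
  | eof   { failwith "unterminated comment" }

{
  let words s = token (Lexing.from_string s)
}
```

    While `comment` is running, only its patterns are tried, so words inside the comment are not reported: `words "a /* b c */ d"` should evaluate to `["a"; "d"]`. The recursion in `token` is not tail-recursive, which keeps the sketch short but would not suit very long inputs.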

    Interfacing with ocamlyacc

    One of the main uses of ocamllex is as a companion to the ocamlyacc parser-generator. ocamlyacc parsers call one of the scanning functions to find the next input token.


    ocamllex has the following options: ...

    Usage tips

    The number of state transitions generated by ocamllex is limited to 32767. If you use too many transitions, for example by defining too many keywords, ocamllex generates the following error message ...


    This chapter includes examples in complete form. Some are revised from the code fragments of the previous chapters.

