The patterns in the input are written using regular expressions in the style of lex, with a more Caml-like syntax. These are:
'c': match the character ‘c’. The character constant is the same syntax as Objective Caml character._: (underscore) match any character.eof: match an end-of-file ."foo": the literal string “foo”. The syntax is the same syntax as Objective Caml string constants.['x' 'y' 'z']: character set; in this case, the pattern matches either an ‘x’, a ‘y’, or a ‘z’ .['a' 'b' 'j'-'o' 'Z']: character set with a range in it; ranges of characters ‘c1’ - ‘c2’ (all characters between c1 and c2, inclusive);
in this case, the pattern matches an ‘a’, a ‘b’, any letter from ‘j’ through ‘o’, or a ‘Z’.[^ 'A'-'Z']:
a negated character set, i.e., any character but those in the class. In this case, any character EXCEPT an uppercase letter.[^ 'A'-'Z' '\n']: any character EXCEPT an uppercase letter or a newliner*: zero or more r’s, where r is any regular expressionr+: one or more r’s, where r is any regular expressionr?: zero or one r’s, where r is any regular expression (that is, “an optional r”)ident: the expansion of the “ident”
defined by an earlier let ident =  regexp definition.(r): match an r; parentheses are used to override precedence (see below)rs: the regular expression r followed by the regular expression s; called “concatenation”r|s: either an r or an sr#s: match the difference of the two specified character sets.r as ident: bind the string matched by r to identifier identThe regular expressions listed above are grouped according to precedence, from highest precedence at the top to lowest at the bottom; ‘*’ and ‘+’ have highest precedence, followed by ‘?’, ‘concatenation’, ‘|’, and then ‘as’. For example,
"foo" | "bar"*is the same as
("foo")|("bar"*)since the ‘*’ operator has higher precedence than than alternation (‘|’). This pattern therefore matches either the string “foo” or zero-or-more of the string “bar”.
To match zero-or-more “foo”’s-or-“bar”’s:
("foo"|"bar")*A negated character set such as the example “[^ ‘A’-‘Z’]” above will match a newline unless “\n” (or an equivalent escape sequence) is one of the characters explicitly present in the negated character set (e.g., “[^ ‘A’-‘Z’ ‘\n’]“). This is unlike how many other regular expression tools treat negated character set, but unfortunately the inconsistency is historically entrenched. Matching newlines means that a pattern like [^“]* can match the entire input unless there’s another quote in the input.