This article describes the expression list stack and gives a couple of useful examples that exploit this feature.

It is often useful to do some context-dependent processing in the lexical analyzer to isolate or distinguish certain tokens. The expression list stack can be used for exactly this task. Legacy lex tools use what are called start states to do something similar, but that method is not nearly as intuitive or as powerful as the expression list stack. Basically, you can group tokens into an expression list, and push, pop, or goto (which is a pop followed by a push) lists as tokens are recognized. The best way to illustrate this is with an example. We will use C++-like comments as our first example.

C++ Comments

C++ accepts two types of comments: anything enclosed in /*...*/, and anything between a // and the end of the line. The typical parser doesn't care about comments, and it would be nice to have the lexer weed them out. This is easy to handle with expression lists.

    %expression Main
        /\*     %ignore, %push MultiLineComment;
        //      %ignore, %push SingleLineComment;

    %expression MultiLineComment
        .+      %ignore;
        \n      %ignore;
        \*/     %ignore, %pop;

    %expression SingleLineComment
        .+      %ignore;
        \n      %ignore, %pop;

To add nested /*...*/ comments, all we need to do is add this to the MultiLineComment expression list:

        /\*     %ignore, %push MultiLineComment;

Each nested /* pushes another copy of MultiLineComment onto the stack and each */ pops one, so the lexer returns to Main only when the outermost comment is closed. This is a very elegant way to handle the problem, particularly when nested comments are required.

Embedded SQL

Another useful example is parsing embedded SQL. This is a perfect application for expression lists, and it illustrates their benefits quite well. The preprocessor wants to ignore everything until an EXEC SQL statement is reached, and then interpret the data as SQL statements. When a semicolon is reached, the preprocessor should start ignoring tokens again until the next EXEC SQL is reached. With the expression list stack, this is trivial.

    %expression Main
        [^ \t\n]+       %ignore;
        [ \t\n]+        %ignore;
        EXEC[ \t]+SQL   %ignore, %push SqlList;

    %expression SqlList
        //
        // SQL tokens go here
        //
        ;               %ignore, %pop;

In the Main list, the first token recognizes anything but white space, the second recognizes white space, and the last is the EXEC SQL delimiter. The SqlList expression list describes the SQL lexical analyzer. An added benefit of this solution is that you can develop your SQL rule file assuming that you are only parsing SQL statements (not embedded SQL), and then just add the initial Main expression list from the above example to parse the embedded version.
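
To make the push/pop mechanics concrete, here is a minimal Python sketch that imitates an expression list stack. It is not the tool's implementation. The function `tokenize`, the tables `COMMENT_LISTS` and `SQL_LISTS`, and the token names `WORD` and `SQL_WORD` are all hypothetical names introduced for illustration. The sketch assumes longest-match semantics with earlier rules winning ties, and it uses a single-character "any" rule in the comment list (rather than `.+`) so that a longest match cannot swallow the closing `*/`.

```python
import re

def tokenize(text, lists, start="Main"):
    """Scan text with a stack of expression lists; return the kept (name, lexeme) pairs."""
    stack = [start]          # the list on top of the stack is the active one
    pos = 0
    kept = []
    while pos < len(text):
        best = None
        for pattern, name, action, target in lists[stack[-1]]:
            m = re.compile(pattern).match(text, pos)
            # Assumed policy: longest match wins, earlier rules win ties.
            if m and m.end() > pos and (best is None or m.end() > best[0].end()):
                best = (m, name, action, target)
        if best is None:
            raise ValueError(f"no rule in {stack[-1]!r} matches at offset {pos}")
        m, name, action, target = best
        if name is not None:             # name=None plays the role of %ignore
            kept.append((name, m.group()))
        if action == "push":
            stack.append(target)
        elif action == "pop":
            stack.pop()
        elif action == "goto":           # a pop followed by a push
            stack.pop()
            stack.append(target)
        pos = m.end()
    return kept

# Each rule is (regex, token name or None to ignore, stack action, target list).
COMMENT_LISTS = {
    "Main": [
        (r"/\*", None, "push", "MultiLineComment"),
        (r"//", None, "push", "SingleLineComment"),
        (r"\s+", None, None, None),
        (r"[^\s/]+", "WORD", None, None),              # hypothetical stand-in token
    ],
    "MultiLineComment": [
        (r"/\*", None, "push", "MultiLineComment"),    # nested comments
        (r"\*/", None, "pop", None),
        (r".|\n", None, None, None),                   # one character at a time
    ],
    "SingleLineComment": [
        (r"[^\n]+", None, None, None),
        (r"\n", None, "pop", None),
    ],
}

SQL_LISTS = {
    "Main": [
        (r"EXEC[ \t]+SQL", None, "push", "SqlList"),
        (r"[^ \t\n]+", None, None, None),
        (r"[ \t\n]+", None, None, None),
    ],
    "SqlList": [
        (r";", None, "pop", None),
        (r"\s+", None, None, None),
        (r"[^\s;]+", "SQL_WORD", None, None),          # stand-in for real SQL tokens
    ],
}

print(tokenize("a /* x /* y */ z */ b // c\nd", COMMENT_LISTS))
print(tokenize("int x; EXEC SQL SELECT a FROM t; y = 1;", SQL_LISTS))
```

In the first call, only the words outside the comments survive, including the one after the nested comment closes. In the second, only the text between EXEC SQL and the semicolon is tokenized. In both cases the host-language text never needs rules of its own beyond "ignore", which is the point of keeping the lists separate.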