โŒ About FreshRSS

Normal view

There are new articles available, click to refresh the page.
Before yesterdayNews from the Ada programming language world

Reentrant scanner and parser with Aflex and Ayacc

14 May 2023 at 19:02
[Aflex and Ayacc](Ada/Aflex-Ayacc-code.jpg)
    1. What's new in Aflex 1.6

- Support the flex options `%option output`, `%option nooutput`, `%option yywrap`, `%option noinput`,

 `%option noyywrap`, `%option unput`, `%option nounput`, `%option bufsize=NNN` to better control the
 generated `_IO` package.
- Aflex(https://github.com/Ada-France/aflex) templates provide more control for tuning the code generation and
 they are embedded with [Advanced Resource Embedder](https://gitlab.com/stcarrez/resource-embedder)
- Support to define Ada code block in the scanner that is inserted in the generated scanner - New option -P to generate a private Ada package for DFA and IO - New directive `%option reentrant` and `%yyvar` to generate a recursive scanner - New directive `%yydecl` to allow passing parameters to `YYLex`
 or change the default function name

Example of `%option` directives to tell Aflex(https://github.com/Ada-France/aflex) to avoid generating several function or procedures and customize the buffer size.

```Ada %option nounput %option noinput %option nooutput %option noyywrap %option bufsize=1024 ```

The tool supports some code block injection at various places in the generated scanner. The code block has the following syntax where `<block-name>` is the name of the code block:

```Ada %<block-name> {

 -- Put Ada code here

} ```

The `%yytype` code block can contain type declaration, function and procedure declarations. It is injected within the `YYLex` function in the declaration part. The `%yyinit` code block can contain statements that are executed at beginning of the `YYLex` function. The `%yyaction` code block can contain statements that are executed before running any action. The `%yywrap` code block can contain statements which are executed when the end of current file is reached to start parsing a next input.

    1. What's new in Ayacc 1.4

- Support the Bison `%define variable value` option to configure the parser generator - Support the Bison `%code name { ... }` directive to insert code verbatim into the output parser - Recognize some Bison variables `api.pure`, `api.private`, `parse.error`, `parse.stacksize`,

 `parse.name`, `parse.params`, `parse.yyclearin`, `parse.yyerrok`, `parse.error`
- New option `-S skeleton` to allow using an external skeleton file for the parser generator - Ayacc(https://github.com/Ada-France/ayacc) templates provide more control for tuning the code generation and
 they are embedded with [Advanced Resource Embedder](https://gitlab.com/stcarrez/resource-embedder)
- New option `-P` to generate a private Ada package for the tokens package - Improvement to allow passing parameters to `YYParse` for the grammar rules - New `%lex` directive to control the call of `YYLex` function - Fix #6: ayacc gets stuck creating an infinitely large file after encountering a comment in an action

The generator supports two code block injections, the first one `decl` is injected in the `YYParse` procedure declaration and the `init` is injected as first statements to be executed only once when the procedure is called. The syntax is borrowed from the Bison parser:

```Ada %code decl {

  -- Put Ada declarations

} %code init {

  -- Put Ada statements

} ```

Some other Bison like improvements have been introduced to control the generation of the parser code.

``` %define parse.error true %define parse.stacksize 256 %define parse.yyclearin false %define parse.yyerrok false %define parse.name MyParser ```

    1. How to use

The easiest way to use Ayacc(https://github.com/Ada-France/ayacc) and Aflex(https://github.com/Ada-France/aflex) is to use Alire(https://github.com/alire-project/alire), get the sources, build them and install them. You can do this as follows:

``` alr get aflex cd aflex_1.6.0_b3c21d99 alr build alr install alr get ayacc cd ayacc_1.4.0_c06f997f alr build alr install ```

  • UPDATE*: the `alr install` command is available only with Alire(https://github.com/alire-project/alire) 2.0.

Using these tools is done in two steps:

1. a first step to call `aflex` or `ayacc` command with the scanner file or grammar file, 2. a second step to call `gnatchop` to split the generated file in separate Ada files

For example, with a `calc_lex.l` scanner file, you would use:

``` aflex calc_lex.l gnatchop -w calc_lex.ada ```

And with a `calc.y` grammar file:

``` ayacc calc.y gnatchop -w calc.ada ```

To know more about how to write a scanner file or grammar file, have a look at Aflex 1.5 and Ayacc 1.3.0(https://blog.vacs.fr/vacs/blogs/post.html?post=2021/12/18/Aflex-1.5-and-Ayacc-1.3.0) which explains more into details some of these aspects.

    1. Highlight on reentrancy

By default Aflex(https://github.com/Ada-France/aflex) and Ayacc(https://github.com/Ada-France/ayacc) generate a scanner and a parser which use global variables declared in a generated Ada package. These global variables contain some state about the scanner such as the current file being scanned. The Ayacc(https://github.com/Ada-France/ayacc) parser generates on its side two global variables `YYLVal` and `YYVal`.

Using global variables creates some strong restrictions when using the generated scanner and parser: we can scan and parse only one file at a time. It cannot be used in a multi-thread environment unless the scan and parse is protected from concurrent access. We cannot use easily some grammars that need to recurse and parse another file such as an included file.

      1. Reentrant scanner

The reentrant scanner is created by using the `-R` option or the `%option reentrant` directive. The scanner will then need a specific declaration with a context parameter that will hold the scanner state and variables. The context parameter has its type generated in the `Lexer_IO` package. The `%yydecl` directive in the scanner file must be used to declare the `YYLex` function with its parameters. By default the name of the context variable is `Context` but you can decide to customize and change it to another name by using the `%yyvar` directive.

``` %option reentrant %yyvar Context %yydecl function YYLex (Context : in out Lexer_IO.Context_Type) return Token ```

When the `reentrant` option is activated, Aflex(https://github.com/Ada-France/aflex) will generate a first `Context_Type` limited type in the `Lexer_DFA` package and another one in the `Lexer_IO` package. The generator can probably be improved in the future to provide a single package with a single type declaration. The `Lexer_DFA` package contains the internal data structures for the scanner to maintain its state and the `Lexer_IO` package holds the input file as well as the `YYLVal` and `YYVal` values.

      1. Reentrant parser

On its side, Ayacc(https://github.com/Ada-France/ayacc) uses the `YYLVal` and `YYVal` variables. By default, it generates them in the `_tokens` package that contains the list of parser symbols. It must not generate them and it must now use the scanner `Context_Type` to hold them as well as the scanner internal state. The setup requires several steps:

1. The reentrant parser is activated by using the `%define api.pure`

  directive similar to the [bison %define](https://www.gnu.org/software/bison/manual/html_node/_0025define-Summary.html).

2. The `%lex` directive must be used to define how the `YYLex` function must be called since it now has some

  context parameter.

3. The scanner context variable must be declared somewhere, either as parameter to the `YYParse`

  procedure or as a local variable to `YYParse`.  This is done using the new `%code decl` directive
  and allows to customize the local declaration part of the `YYParse` generated procedure.

4. We must give visibility of the `YYLVal` and `YYVal` values defined in the scanner context variable.

  Again, we can do this within the `%code decl` directive.

A simple reentrant parser could be defined by using:

```Ada %define api.pure true %lex YYLex (Scanner) %code decl {

     Scanner : Lexer_IO.Context_Type;
     YYLVal  : YYSType renames Scanner.YYLVal;
     YYVal   : YYSType renames Scanner.YYVal;

} ```

However, this simple form is not really useful as you may need to open the file and setup the scanner to read from it. It is probably better to pass the scanner context as parameter to the `YYParse` procedure. For this, we can use the `%define parse.params` directive to control the procedure parameters. The reentrant parser is declared as follows:

```Ada %lex YYLex (Scanner) %define api.pure true %define parse.params "Scanner : in out Lexer_IO.Context_Type" %code decl {

     YYLVal : YYSType renames Scanner.YYLVal;
     YYVal  : YYSType renames Scanner.YYVal;

} ```

To use the reentrant parser and scanner, we only need to declare the scanner context, open the file by using the `Lexer_IO.Open_Input` procedure and call the `YYParse` procedure as follows:

```Ada

 Scanner : Lexer_IO.Context_Type;
 ...
   Lexer_IO.Open_Input (Scanner, "file-to-scan");
   YYParse (Scanner);

```

      1. Grammar examples:

To have a more complete example of a reentrant parser, you may have a look at the following files:

โŒ
โŒ