Semantic code

Instead of terminating a rule with a semicolon, you can append a block of D code:

A()
{
    "asdf"
    {
        // D code
    }
}

Within these code blocks, you can use the member variables of the SyntaxTree structure, as well as the names of the non-terminals used in the rule. Each non-terminal name is a delegate that points to the code block of the rule that was reduced to that instance of the non-terminal. Using them is straightforward, as you will see in a moment.

Let's expand the arithmetic example from above to actually calculate the value of an expression. We will do so by using a single synthesized attribute, value, for all non-terminals. That is, we declare an output parameter in each non-terminal, which is computed bottom-up when traversing the tree.

The simplest case is therefore a number, whose value we know immediately:

Atom(out int value)
{
    regexp("-?[0-9]+")
    {
        value = atoi(_ST_match);
    }
}

Here we use the _ST_match member to access the string that has been matched by the regular expression.

To use atoi we would have to import std.string first. In this case, the generated D source file already imports it, so we don't have to do it manually. But we'll see how to add our own imports and global declarations in a later section.

Now that Atom can give us values, we can use them in non-leaf nodes. As mentioned above, we can call non-terminal symbols just as if they were normal functions. Therefore, the semantic code is straightforward:

MulExpr(out int value)
{
    Atom "*" MulExpr
    {
        MulExpr(value);
        int tmp;
        Atom(tmp);
        value *= tmp;
    }
}

First we call MulExpr to compute the value of the right side of the multiplication and store it in the output parameter. Then we call Atom with a temporary variable and multiply both values to produce the result. Note that we don't have to traverse the syntax tree ourselves. APaGeD takes care of this for us.
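To make the delegate mechanism more concrete, here is a hand-written model in plain, present-day D. It is not the code APaGeD generates, only a sketch of the idea: every name in it (Eval, atom, mul, evaluate) is an assumption made for illustration. Closures stand in for the code blocks, and the parameters Atom and MulExpr play the role of the non-terminal names that are visible inside a block.

```d
import std.stdio : writeln;

// Hypothetical stand-in for the delegate bound to a non-terminal name.
alias Eval = void delegate(out int value);

// Leaf: plays the role of a reduced Atom whose code block knows its value.
Eval atom(int n)
{
    return (out int value) { value = n; };
}

// Inner node: plays the role of the code block of the rule Atom "*" MulExpr.
// The parameters stand for the names Atom and MulExpr inside that block.
Eval mul(Eval Atom, Eval MulExpr)
{
    return (out int value) {
        MulExpr(value); // right-hand side goes straight into the output
        int tmp;
        Atom(tmp);      // left-hand side goes into a temporary
        value *= tmp;
    };
}

// Starts the bottom-up computation at the root of the hand-built tree.
int evaluate(Eval root)
{
    int result;
    root(result);
    return result;
}

void main()
{
    auto tree = mul(atom(6), atom(7)); // hand-built tree for 6 * 7
    writeln(evaluate(tree));           // prints 42
}
```

The body of the closure in mul is the same as the semantic code of the MulExpr rule. The only difference is that in a grammar file the tree is built by the parser, not by hand.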

The code for the Expr non-terminal looks similar:

Expr(out int value)
{
    MulExpr "+" Expr
    {
        int val;
        Expr(val);
        MulExpr(value);
        value += val;
    }

    MulExpr "-" Expr
    {
        int val;
        Expr(val);
        MulExpr(value);
        value -= val;
    }
}

See the arithmetic.apd example for the full version.

If a non-terminal symbol appears more than once in a rule, you need to specify aliases for all instances but one to make the semantic calls unique. This is done by appending =<alias> to the non-terminal:

MulExpr "*" MulExpr=MulExpr2
{
    MulExpr(value); // calls the first instance
    int tmp;
    MulExpr2(tmp);  // calls the second instance
    value *= tmp;
}

It is now clear how to use inherited attributes, that is, values that are computed top-down: we simply add parameters to our non-terminals without the out modifier. They are available in semantic code blocks just like normal function parameters. Of course, you can also use inout parameters or any other valid D parameter declaration.
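The following plain D sketch illustrates how an inherited and a synthesized attribute can travel in opposite directions. It is a hand-written model under stated assumptions, not APaGeD output: the rule shape Digits Digit, the delegate type Eval and the helpers digit, digits and evaluate are all hypothetical. The parameter base is passed down unchanged (inherited), while value is computed on the way back up.

```d
import std.stdio : writeln;

// Hypothetical delegate type: base flows down, value flows up.
alias Eval = void delegate(int base, out int value);

// Leaf for a single digit character.
Eval digit(char c)
{
    return (int base, out int value) {
        value = c - '0';
        assert(value < base, "digit out of range for this base");
    };
}

// Inner node modelling a hypothetical rule Digits Digit:
// the parent hands the same base to both children.
Eval digits(Eval Digits, Eval Digit)
{
    return (int base, out int value) {
        Digits(base, value); // value of the leading digits
        int d;
        Digit(base, d);      // value of the last digit
        value = value * base + d;
    };
}

// The caller at the top decides the base; the tree only passes it along.
int evaluate(Eval root, int base)
{
    int result;
    root(base, result);
    return result;
}

void main()
{
    // Hand-built tree for the digit string "101"
    auto tree = digits(digits(digit('1'), digit('0')), digit('1'));
    writeln(evaluate(tree, 2));  // prints 5
    writeln(evaluate(tree, 10)); // prints 101
}
```

The same tree yields different results depending only on the value handed in at the root, which is the defining property of an inherited attribute.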

You can also call non-terminals multiple times within the same semantic code block, effectively performing multi-pass semantic analysis. APaGeD itself does this, for example, to implement forward references for non-terminals.
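As a rough illustration of the multi-pass idea, the plain D sketch below calls the same child twice: once to collect declarations and once to resolve uses, so that a use may appear before its declaration. This is only a model under assumptions and does not show how APaGeD implements forward references: Pass, Visit, declaration, use, sequence and check are hypothetical names.

```d
import std.stdio : writeln;

// Hypothetical pass selector; not part of APaGeD.
enum Pass { collect, resolve }

// Hypothetical delegate type for a non-terminal that is visited once per pass.
alias Visit = void delegate(Pass pass, ref bool[string] declared, ref string[] errors);

// Records its name during the collect pass and does nothing otherwise.
Visit declaration(string name)
{
    return (Pass pass, ref bool[string] declared, ref string[] errors) {
        if (pass == Pass.collect)
            declared[name] = true;
    };
}

// Checks its name during the resolve pass, when all declarations are known.
Visit use(string name)
{
    return (Pass pass, ref bool[string] declared, ref string[] errors) {
        if (pass == Pass.resolve && name !in declared)
            errors ~= "undefined symbol: " ~ name;
    };
}

// Forwards every call to its children in source order.
Visit sequence(Visit[] items)
{
    return (Pass pass, ref bool[string] declared, ref string[] errors) {
        foreach (item; items)
            item(pass, declared, errors);
    };
}

// Models a code block that calls the same non-terminal twice.
string[] check(Visit root)
{
    bool[string] declared;
    string[] errors;
    root(Pass.collect, declared, errors); // first call: gather declarations
    root(Pass.resolve, declared, errors); // second call: check uses
    return errors;
}

void main()
{
    // B is used before it is declared; C is never declared.
    auto tree = sequence([use("B"), declaration("A"), declaration("B"), use("C")]);
    writeln(check(tree)); // prints ["undefined symbol: C"]
}
```

Because the first pass sees the whole input before the second pass starts, the forward reference to B is accepted and only C is reported.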

See the binfloat.apd example and src/parser.apd for details.

The declaration of a non-terminal may be followed by attributes. At the moment, the only attribute available is no_ast, which prevents AST nodes from being generated for that non-terminal and all of its children.

A() no_ast
{
    "asdf";
}

Note that semantic code will not be executed if a non-terminal has no AST node. Not generating AST nodes improves parsing speed and simplifies the AST.