So far, left recursion detector assumed that left recursion occurs only
when the recursive rule is at the very left-hand side of rule's
expression:
start = start
This didn't catch cases like this:
start = "a"? start
In general, if a rule reference can be reached without consuming any
input, it can lead to left recursion. This commit fixes the detector to
consider that.
Fixes#190.
Parsers can now be generated with support for tracing using the --trace
CLI option or a boolean |trace| option to |PEG.buildParser|. This makes
them trace their progress, which can be useful for debugging. Parsers
generated with tracing support are called "tracing parsers".
When a tracing parser executes, by default it traces the rules it enters
and exits by writing messages to the console. For example, a parser
built from this grammar:
start = a / b
a = "a"
b = "b"
will write this to the console when parsing input "b":
1:1 rule.enter start
1:1 rule.enter a
1:1 rule.fail a
1:1 rule.enter b
1:2 rule.match b
1:2 rule.match start
You can customize tracing by passing a custom *tracer* to parser's
|parse| method using the |tracer| option:
parser.parse(input, { trace: tracer });
This will replace the built-in default tracer (which writes to the
console) by the tracer you supplied.
The tracer must be an object with a |trace| method. This method is
called each time a tracing event happens. It takes one argument which is
an object describing the tracing event.
Currently, three events are supported:
* rule.enter -- triggered when a rule is entered
* rule.match -- triggered when a rule matches successfully
* rule.fail -- triggered when a rule fails to match
These events are triggered in nested pairs -- for each rule.enter event
there is a matching rule.match or rule.fail event.
The event object passed as an argument to |trace| contains these
properties:
* type -- event type
* rule -- name of the rule the event is related to
* offset -- parse position at the time of the event
* line -- line at the time of the event
* column -- column at the time of the event
* result -- rule's match result (only for rule.match event)
The whole tracing API is somewhat experimental (which is why it isn't
documented properly yet) and I expect it will evolve over time as
experience is gained.
The default tracer is also somewhat bare-bones. I hope that PEG.js user
community will develop more sophisticated tracers over time and I'll be
able to integrate their best ideas into the default tracer.
Action and predicate code can now see variables defined in expressions
"above" them.
Based on a pull request by Bryon Vandiver (@asterick):
https://github.com/pegjs/pegjs/pull/180Fixes#316.
The generated parser API specs are mostly extracted from
generated-parser.spec.js, which got renamed to
generated-parser-behavior.spec.js to better reflect its purpose.
Unit specs are unit tests of internal stuff. API specs are tests of the
user-visible APIs and behavior.
I think it makes sense to make this distinction because then the public
API line is more clearly visible e.g. when using the specs as
documentation.
The TEXT instruction now replaces position at the top of the stack with
the input from that position until the current position. This is simpler
and cleaner semantics than the previous one, where TEXT also popped an
additional value from the stack and kept the position there.
Implement the following bytecode instructions:
* PUSH_UNDEFINED
* PUSH_NULL
* PUSH_FAILED
* PUSH_EMPTY_ARRAY
These instructions push simple JavaSccript values to the stack directly,
without going through constants. This makes the bytecode slightly
shorter and the bytecode generator somewhat simpler.
Also note that PUSH_EMPTY_ARRAY allows us to avoid a hack which protects
the [] constant from modification.
The action/predicate code didn't have access to the parser object. This
was mostly a side effect actions/predicates being implemented as nested
functions, in which |this| is a reference to the global object (an ugly
JavaScript quirk). The initializer, being implemented differently, had
access to the parser object via |this|, but this was not documented.
Because having access to the parser object can be useful, this commits
introduces a new |parser| variable which holds a reference to it, is
visible in action/predicate/initializer code, and is properly
documented.
See also:
https://groups.google.com/forum/#!topic/pegjs/Na7YWnz6Bmg
This is mostly done for consistency with the JavaScript example grammar,
from which the |Identifier| rule is taken from. See the previous commit
for details.
Instead of matching segments between blocks character by character,
match them as a whole. Also align the style with other similar rules
(e.g. the comment ones).
Before this commit, line continuations in character classes contributed
an empty string to the list of characters and character ranges matched
by a class. While this didn't lead to a buggy behavior with the current
code generator, the AST was wrong and the problem could have caused bugs
later.
This commit fixes the problem.