When called inside an action, the |text| function returns the text
matched by the action's expression. It can also be called inside an
initializer or a predicate, where it returns an empty string.
The |text| function will be useful mainly in cases where one needs a
structured representation of the input and simultaneously the raw text.
Until now, the only way to get the raw text in these cases was to
painfully build it from the structured representation.
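A minimal sketch of the idea in plain JavaScript, not the code PEG.js
generates. The |runAction| helper and the hard-coded offsets are
hypothetical; the point is only that an action can return a structured
value and still get at the raw matched text through a |text| closure.

```javascript
// Hypothetical helper, not part of PEG.js: runs `action` for the span
// input[start, end) and hands it a text() closure over that span.
function runAction(input, start, end, action) {
  function text() {
    return input.substring(start, end);
  }
  return action(text);
}

// An action that wants both a structured value and the raw text.
// Offsets 4..11 cover "12 + 30" in the input below.
const result = runAction("x = 12 + 30", 4, 11, function (text) {
  return { left: 12, right: 30, raw: text() };
});

console.log(result.raw); // prints "12 + 30"
```

Without such a function, the action would have to reassemble the raw
text from the |left| and |right| pieces, including any whitespace.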
Fixes GH-131.
Implement a new syntax to extract matched strings from expressions. For
example, instead of:
identifier = first:[a-zA-Z_] rest:[a-zA-Z0-9_]* { return first + rest.join(""); }
you can now just write:
identifier = $([a-zA-Z_] [a-zA-Z0-9_]*)
This is useful mostly for "lexical" rules at the bottom of many
grammars.
Note that structured match results are still built for the expressions
prefixed by "$"; they are just ignored. I plan to optimize this later
(sometime after the code generator rewrite).
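The difference between the two forms can be sketched with a hand-written
matcher. This is an illustration under assumptions, not the generated
parser code: |matchIdentifier| is a hypothetical function, and slicing
the input is just one plausible way to obtain the matched string.

```javascript
// Hand-written matcher for: [a-zA-Z_] [a-zA-Z0-9_]*
// Returns the structured result plus the end position, or null on failure.
function matchIdentifier(input, pos) {
  if (!/[a-zA-Z_]/.test(input.charAt(pos))) {
    return null;
  }
  const first = input.charAt(pos);
  const rest = [];
  let end = pos + 1;
  while (end < input.length && /[a-zA-Z0-9_]/.test(input.charAt(end))) {
    rest.push(input.charAt(end));
    end++;
  }
  return { value: [first, rest], end: end };
}

const input = "foo_1 = 2";
const m = matchIdentifier(input, 0);

// What the explicit action does: rebuild the text from the pieces.
const joined = m.value[0] + m.value[1].join("");

// What "$" amounts to: ignore the pieces and take the matched span.
const sliced = input.substring(0, m.end);

console.log(joined === sliced); // prints true
```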
Getting rid of the |trackLineAndColumn| option simplifies the code
generator (by unifying two paths in the code).
The |line| and |column| functions currently always compute all the
position info from scratch, which is horribly inefficient. This will be
improved in later commit(s).
This will make it possible to compute position data lazily and get rid
of the |trackLineAndColumn| option without affecting performance of
generated parsers that don't use position data.
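A rough sketch of what "from scratch" and "lazily" can mean here. The
names |computePosition| and |makePositionFunctions| are hypothetical,
and the sketch assumes "\n" is the only line terminator; it is not the
code PEG.js generates.

```javascript
// Compute 1-based line/column for an offset by scanning the input from
// the start (the "from scratch" approach). Assumes "\n" line endings.
function computePosition(input, offset) {
  let line = 1;
  let column = 1;
  for (let i = 0; i < offset; i++) {
    if (input.charAt(i) === "\n") {
      line++;
      column = 1;
    } else {
      column++;
    }
  }
  return { line: line, column: column };
}

// Lazy wrapper: nothing is computed until line() or column() is called,
// and the result is cached while the offset stays the same.
function makePositionFunctions(input, getOffset) {
  let cachedOffset = -1;
  let cached = null;
  function details() {
    const offset = getOffset();
    if (offset !== cachedOffset) {
      cached = computePosition(input, offset);
      cachedOffset = offset;
    }
    return cached;
  }
  return {
    line: function () { return details().line; },
    column: function () { return details().column; }
  };
}

// Offset 4 points at "d", the second character on the second line.
const pos = makePositionFunctions("ab\ncd\nef", function () { return 4; });
console.log(pos.line(), pos.column()); // prints 2 2
```

A parser whose actions never call |line| or |column| pays nothing for
position tracking in this scheme, which is why a separate option to
switch tracking on becomes unnecessary.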
Before this commit, |PEG.buildParser| always returned a parser object.
The only way to get its source code was to call the |toSource| method on
it. While this method worked for parsers produced by |PEG.buildParser|
directly, it didn't work for parsers instantiated by executing their
source code. In other words, it was unreliable.
This commit removes the |toSource| method on generated parsers and
introduces a new |output| option to |PEG.buildParser|. It allows callers
to specify whether they want to get back the parser object
(|options.output === "parser"|) or its source code (|options.output ===
"source"|). This is a much better and more reliable API.
Before this commit, generated parsers were able to start parsing from
any rule. This was nice, but it made rule code inlining impossible.
Since this commit, the list of allowed start rules has to be specified
explicitly using the |allowedStartRules| option of the |PEG.buildParser|
method (or the --allowed-start-rule option on the command-line). These
rules will be excluded from inlining when it's implemented.
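A minimal sketch of how such a check could behave, assuming a parser
built from a table of rule functions. |makeParser|, the error message
and the fallback to the first allowed rule are all assumptions made for
illustration, not PEG.js behavior confirmed by this commit.

```javascript
// Hypothetical stand-in for a generated parser. Only the names
// allowedStartRules and startRule come from the commit message.
function makeParser(ruleFunctions, allowedStartRules) {
  return {
    parse: function (input, options) {
      options = options || {};
      // Assumption: fall back to the first allowed rule by default.
      const startRule = options.startRule !== undefined
        ? options.startRule
        : allowedStartRules[0];
      if (allowedStartRules.indexOf(startRule) === -1) {
        throw new Error("Can't start parsing from rule \"" + startRule + "\".");
      }
      return ruleFunctions[startRule](input);
    }
  };
}

const rules = {
  start: function (input) { return "start:" + input; },
  digits: function (input) { return "digits:" + input; },
  helper: function (input) { return "helper:" + input; }
};
const toyParser = makeParser(rules, ["start", "digits"]);

console.log(toyParser.parse("1", { startRule: "digits" })); // prints "digits:1"
```

Here |helper| is not an allowed start rule, so a generator would be free
to inline it into its callers.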
This commit replaces the |startRule| parameter of the |parse| method in
generated parsers with a more generic |options| parameter -- an options
object. This options object can be used to pass custom options to the
parser because it is visible as the |options| variable inside parser
code.
The start rule can now be specified as the |startRule| option. This
means you have to replace all calls like:
parser.parse("input", "myStartRule");
with
parser.parse("input", { startRule: "myStartRule" });
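For code that still has to accept old-style calls during a migration, a
small helper along these lines may be enough. |normalizeParseOptions| is
a hypothetical name and is not part of PEG.js.

```javascript
// Hypothetical migration helper: accepts either the old string argument
// or the new options object and always returns an options object.
function normalizeParseOptions(startRuleOrOptions) {
  if (typeof startRuleOrOptions === "string") {
    return { startRule: startRuleOrOptions };
  }
  return startRuleOrOptions || {};
}

console.log(normalizeParseOptions("myStartRule").startRule); // prints "myStartRule"
```

A call site would then read
|parser.parse("input", normalizeParseOptions(arg))|.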
Closes GH-37.
In other words, the Ruby dependency was swapped for a Node dependency.
The build script was also modified to always regenerate the parser (in
case of the "parser" task) or rebuild the library (in case of the
"build" task) even if the source files were not modified. Not doing this
led to problems when the generating code changed but the files didn't
(which happened often during development).
The source code is now in the src directory. The library needs to be
built using "rake", which creates the lib/peg.js file by combining the
source files.
This also speeds up the benchmark suite execution by 7.83 % on V8.
Detailed results (benchmark suite totals):
---------------------------------
 Test #   Before       After
---------------------------------
      1   26.17 kB/s   28.16 kB/s
      2   26.05 kB/s   28.16 kB/s
      3   25.99 kB/s   28.10 kB/s
      4   26.13 kB/s   28.11 kB/s
      5   26.14 kB/s   28.07 kB/s
---------------------------------
Average   26.10 kB/s   28.14 kB/s
---------------------------------
Mozilla/5.0 (X11; U; Linux i686; en-US) AppleWebKit/533.2 (KHTML, like Gecko) Chrome/5.0.342.7 Safari/533.2