Before this commit, variables for saving match results and parse
positions in generated parsers were not used efficiently. Each rule
basically used its own variable(s) for storing the data, with names
generated sequentially during code emitting. There was no reuse of
variables and a lot of unnecessary assignments between them.
It is easy to see that both match results and parse positions can
actually be stored on a stack that grows as the parser walks deeper in
the grammar tree and shrinks as it returns. Moreover, if one creates a
new stack for each rule the parser enters, its maximum depth can be
computed statically from the grammar. This allows us to implement the
stack not as an array, but as a set of numbered variables in each
function that handles parsing of a grammar rule, avoiding potentially
slow array accesses.
This commit implements the idea from the previous paragraph, using
separate stack for match results and for parse positions. As a result,
defined variables are reused and unnecessary copying avoided.
Speed implications
------------------
This change speeds up the benchmark suite execution by 2.14%.
Detailed results (benchmark suite totals as reported by "jake benchmark"
on Node.js 0.4.8):
-----------------------------------
Test # Before After
-----------------------------------
1 129.01 kB/s 131.98 kB/s
2 129.39 kB/s 130.13 kB/s
3 128.63 kB/s 132.57 kB/s
4 127.53 kB/s 129.82 kB/s
5 127.98 kB/s 131.80 kB/s
-----------------------------------
Average 128.51 kB/s 131.26 kB/s
-----------------------------------
Size implications
-----------------
This change makes a sample of generated parsers 8.60% smaller:
Before:
$ wc -c src/parser.js examples/*.js
110867 src/parser.js
13886 examples/arithmetics.js
450125 examples/css.js
632390 examples/javascript.js
61365 examples/json.js
1268633 total
After:
$ wc -c src/parser.js examples/*.js
99597 src/parser.js
13077 examples/arithmetics.js
399893 examples/css.js
592044 examples/javascript.js
54797 examples/json.js
1159408 total
The change does not change the benchmark suite execution speed
statistically significantly.
Detailed results (benchmark suite totals as reported by "jake benchmark"
on Node.js 0.4.8):
-----------------------------------
Test # Before After
-----------------------------------
1 128.20 kB/s 128.03 kB/s
2 130.36 kB/s 128.73 kB/s
3 126.53 kB/s 129.72 kB/s
4 127.46 kB/s 127.48 kB/s
5 127.63 kB/s 128.53 kB/s
-----------------------------------
Average 128.04 kB/s 125.50 kB/s
-----------------------------------
Closes GH-25.
Calling the parsing function could have been done without the ugly table
using |eval|, but this seemed to degrade performance significantly (by
about 3 %). This is probably because engines optimize badly in presence
of |eval|.
The method used in this patch does not change the benchmark suite
execution speed statistically significantly on V8.
Detailed results (benchmark suite totals):
---------------------------------
Test # Before After
---------------------------------
1 38.24 kB/s 38.28 kB/s
2 38.35 kB/s 38.15 kB/s
3 38.43 kB/s 38.40 kB/s
4 38.53 kB/s 38.20 kB/s
5 38.25 kB/s 38.39 kB/s
---------------------------------
Average 38.36 kB/s 38.39 kB/s
---------------------------------
Mozilla/5.0 (X11; U; Linux i686; en-US) AppleWebKit/534.16 (KHTML, like Gecko) Chrome/10.0.648.151 Safari/534.1
Originally I wanted to be very explicit with accesses to global object,
but since all this file is about extending it, the |global.| qualifier
seems more like noise.
This patch prevents portability problems. In particular, it fixes a
problem where "SyntaxError: Invalid range in character class." error
appeared when using command-line version on Widnows (see GH-13).
There are now three vendor directories. The goal is to have test- and
benchmark-specific stuff is its own directories and not in the main one.
vendor
test/vendor
benchmark/vendor
The source code is now in the src directory. The library needs to be
built using "rake", which creates the lib/peg.js file by combining the
source files.
1. |PEG.Compiler| -> |PEG.compiler|
2. |PEG.grammarParser| -> |PEG.parser|
This brings us closer to the desired structure of the PEG object, which
is:
+-PEG
|- parser
+- compiler
|- checks
|- passes
+- emitter
These are the only things (together with the |PEG.buildParser| function
and exceptions) that I want to be publicly accessible -- as extension
points and also for easy testing of PEG.js's components.
Before this change, the start rule was the one named "start" and there
was an option to override that. This is now impossible.
The goal of this change is to contain all information for the parser
generation in the grammar itself.
In the future, some override directive for the start rule (like Bison's
"%start") may be added to the grammar.
Little change in the source grammar now does not change variables in all
the generated code. This is helpful especially when one has the
generated grammar stored in a VCS (this is true e.g. for our
metagrammar).
Labeled expressions lead to more maintainable code and also will allow
certain optimizations (we can ignore results of expressions not passed
to the actions).
This does not speed up the benchmark suite execution statistically
significantly on V8.
Detailed results (benchmark suite totals):
---------------------------------
Test # Before After
---------------------------------
1 28.43 kB/s 28.46 kB/s
2 28.38 kB/s 28.56 kB/s
3 28.22 kB/s 28.58 kB/s
4 28.76 kB/s 28.55 kB/s
5 28.57 kB/s 28.48 kB/s
---------------------------------
Average 28.47 kB/s 28.53 kB/s
---------------------------------
Mozilla/5.0 (X11; U; Linux i686; en-US) AppleWebKit/533.4 (KHTML, like Gecko) Chrome/5.0.375.55 Safari/533.4
I'll introduce labelled expressions shortly and I want to use ":" as a
label-expression separator. This change avoids conflict between the two
meanings of ":". (What would e.g. "foo: 'bar'" mean? Rule "foo"
matching string "bar", or string "bar" labelled "foo"?)
The mistakes weren't caught because the first one introduces a syntax
error, causing the whole test suite not to load. Unfortunately, QUnit
didn't complain so I missed this.
The real commit these changes belong to is
33a1a7c1e9.
The action now computes the number of passed parameters during the code
generation and the parameters are declared directly as $1, $2, etc. in the
generated function.
This does not speed up the benchmark suite execution statistically significantly
on V8.
Detailed results (benchmark suite totals):
---------------------------------
Test # Before After
---------------------------------
1 28.68 kB/s 29.08 kB/s
2 28.77 kB/s 28.72 kB/s
3 28.89 kB/s 28.78 kB/s
4 28.84 kB/s 28.57 kB/s
5 28.86 kB/s 28.84 kB/s
---------------------------------
Average 28.81 kB/s 28.80 kB/s
---------------------------------
Mozilla/5.0 (X11; U; Linux i686; en-US) AppleWebKit/533.2 (KHTML, like Gecko) Chrome/5.0.342.9 Safari/533.2
This does not speed up the benchmark suite execution statistically significantly
on V8.
Detailed results (benchmark suite totals):
---------------------------------
Test # Before After
---------------------------------
1 28.84 kB/s 28.75 kB/s
2 28.76 kB/s 28.69 kB/s
3 28.72 kB/s 28.69 kB/s
4 28.84 kB/s 28.93 kB/s
5 28.82 kB/s 28.70 kB/s
---------------------------------
Average 28.80 kB/s 28.75 kB/s
---------------------------------
Mozilla/5.0 (X11; U; Linux i686; en-US) AppleWebKit/533.2 (KHTML, like Gecko)
Chrome/5.0.342.9 Safari/533.2
This does not speed up the benchmark suite execution statistically significantly
on V8.
Detailed results (benchmark suite totals):
---------------------------------
Test # Before After
---------------------------------
1 28.72 kB/s 28.84 kB/s
2 28.84 kB/s 28.76 kB/s
3 28.83 kB/s 28.72 kB/s
4 28.81 kB/s 28.84 kB/s
5 28.76 kB/s 28.82 kB/s
---------------------------------
Average 28.79 kB/s 28.80 kB/s
---------------------------------
Mozilla/5.0 (X11; U; Linux i686; en-US) AppleWebKit/533.2 (KHTML, like Gecko) Chrome/5.0.342.9 Safari/533.2