746 Commits (4b154e177f5881b746dd3d035274f396678d5779)
 

Author SHA1 Message Date
David Majda 4b154e177f Update character categories in grammars to Unicode 8.0.0 9 years ago
David Majda 703a352985 Change few testcase descriptions
Reaction to changes in 130cbcfaa3.
9 years ago
David Majda d7d7e87874 Make infinite loop and left recursion detectors work with named rules
Add missing |named| case to the visitor in lib/compiler/asts.js, which
makes the infinite loop and left recursion detectors work correctly with
named rules.

The missing case caused |make parser| to fail with:

  140:34: Infinite loop detected.
  make: *** [parser] Error 1
9 years ago
David Majda 130cbcfaa3 Rename asts.matchesEmpty to alwaysAdvancesOnSuccess and negate it
This makes it more clear that the function isn't about the input the
expression *matched* but about the input it *consumed* when it matched.

Based on a comment by @Mingun:

  https://github.com/pegjs/pegjs/pull/307#issuecomment-89512575
9 years ago
David Majda 317059760a Fix incorrect pass name in a spec description 9 years ago
David Majda 373f48c10f Fix small error in two testcases
Pointed out by @Mingun:

  6ce97457bf (commitcomment-10548605)
9 years ago
David Majda 6f2c75f7d8 Label specs: Improve structure and descriptions 9 years ago
David Majda 8487c9a0ff Label specs: Add missing test case 9 years ago
David Majda f4385da177 Label specs: Unify formatting with other code 9 years ago
David Majda ddff5feea3 Label specs: Simplify and regularize block-scoped label specs
There is no need to test 3 labels from the outside scope; 1 is enough.
9 years ago
David Majda 122d7b0737 README.md: Mention there is no backtracking for *, +, and ?
Based on a pull request by Jak Wings (@jakwings):

  https://github.com/pegjs/pegjs/pull/333
9 years ago
David Majda 5e6b5da4e9 Merge pull request #347 from mbaumgartl/errorstack
Add stack trace in engines based on V8
9 years ago
Marco Baumgartl 940a66fb38 Add stack trace in engines based on V8. Fixes #331 10 years ago
David Majda cb640cd0b2 Merge pull request #345 from arlolra/split4
Even more split outs from #339
10 years ago
Arlo Breault 12c169e7b5 Convert PEG.js code to strict mode
* Issue #323
10 years ago
Arlo Breault 1a32ae7bd0 Make PEG global explicit in helpers 10 years ago
Arlo Breault f4d2357609 Make jshint aware of node globals 10 years ago
Arlo Breault 7a94f97b46 Convert benchmark files to modules 10 years ago
David Majda 9815e49477 Merge pull request #343 from arlolra/split2
More split outs from #339
10 years ago
Arlo Breault 45e39c3ac8 Make generated parsers use strict mode
* Issue #324

 * JSHint complains about two possible strict violations, but they are
   valid uses of `this`, so we suppress the warnings.
10 years ago
Arlo Breault 7285ccfd4e Remove block around initialize code
* In strict mode code, functions can only be declared at top level or
   immediately within another function.  This means functions defined in
   the initializer would throw.
10 years ago
Arlo Breault b079a056a2 Suppress linting newcap
* When the "use strict"; directive is set, constructors called without
   `new` will set the execution context to undefined. JSHint tries to be
   clever and forces on newcap. Suppress this behaviour, especially
   because newcap has gone the way of the dodo.
10 years ago
David Majda e0be643b7c Merge pull request #341 from arlolra/split
Split up the work from #339
10 years ago
Arlo Breault c71f723b3f Run `make hint` before testing 10 years ago
Arlo Breault 16756d9010 Run CI on newer version of node / iojs 10 years ago
Arlo Breault 7695e5e3c5 Fix complaints from `make hint` 10 years ago
David Majda f2200e48af Optimize location info computation
Before this commit, position details (line and column) weren't computed
efficiently from the current parse position. There was a cache but it
held only one item and it was rarely hit in practice. This resulted in
frequent rescanning of the whole input when the |location| function was
used in various places in a grammar.

This commit extends the cache to remember position details for any
position they were ever computed for. In case of a cache miss, the cache
is searched for a value corresponding to the nearest lower position,
which is then used to compute position info for the desired position
(which is then cached). The whole input never needs to be rescanned.

No items are ever evicted from the cache. I think this is fine as the
max number of entries is the length of the input. If this becomes a
problem I can introduce some eviction logic later.
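
The caching scheme described above can be sketched roughly as follows.
This is a minimal illustration, not the code PEG.js generates: the names
|createPositionTracker| and |computePosDetails| are assumptions made for
this example, and only "\n" is treated as a line break.

```javascript
// Hypothetical sketch of a position details cache that is never evicted.
// The names here are illustrative and not taken from PEG.js itself.
function createPositionTracker(input) {
  // Sparse cache indexed by offset. Offset 0 is seeded so that a lookup
  // for the nearest lower cached position always terminates.
  var cache = [{ line: 1, column: 1 }];

  return function computePosDetails(pos) {
    if (cache[pos]) {
      return cache[pos];
    }

    // Cache miss: find the nearest lower position that is cached.
    var p = pos - 1;
    while (!cache[p]) {
      p--;
    }

    // Scan forward from that position only, not from the start of input.
    var line   = cache[p].line,
        column = cache[p].column;

    while (p < pos) {
      if (input.charAt(p) === "\n") {
        line++;
        column = 1;
      } else {
        column++;
      }
      p++;
    }

    // Remember the result so later lookups can start from here.
    cache[pos] = { line: line, column: column };
    return cache[pos];
  };
}

var computePosDetails = createPositionTracker("ab\ncd");
console.log(computePosDetails(4)); // { line: 2, column: 2 }
```

A later call such as |computePosDetails(5)| would start scanning from the
cached entry for offset 4 rather than from the beginning of the input.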

The performance impact of this change is significant. As the benchmark
suite doesn't contain any grammar with |location| calls I just used a
little ad-hoc benchmark script which measured time to parse the grammar
of PEG.js itself (which contains |location| calls):

  var fs     = require("fs"),
      parser = require("./lib/parser");

  var grammar = fs.readFileSync("./src/parser.pegjs", "utf-8"),
      startTime, endTime;

  startTime = (new Date()).getTime();
  parser.parse(grammar);
  endTime = (new Date()).getTime();

  console.log(endTime - startTime);

The measured time went from ~293 ms to ~54 ms on my machine.

Fixes #337.
10 years ago
David Majda 29bb921994 Rename |peg$cache| to |peg$resultsCache|
This change will make the results cache clearly distinguishable from the
position details cache (which I'll add in a minute).
10 years ago
David Majda eaca5f0acf Add location information to |GrammarError|
This means all errors thrown by |PEG.buildParser| now have associated
location information.
10 years ago
David Majda 89146915ce Add location information to AST nodes
This will allow adding location information to |GrammarError| exceptions
thrown in various passes.
10 years ago
David Majda d1fe86683b Improve location info in tracing events
Replace |line|, |column|, and |offset| properties of tracing events with
the |location| property. It contains an object similar to the one
returned by the |location| function available in action code:

  {
    start: { offset: 23, line: 5, column: 6 },
    end:   { offset: 25, line: 5, column: 8 }
  }

For the |rule.match| event, |start| refers to the position at the
beginning of the matched input and |end| refers to the position after
the end of the matched input.

For |rule.enter| and |rule.fail| events, both |start| and |end| refer to
the current position at the time the rule was entered.
10 years ago
David Majda 065f4e1b75 Improve location info in syntax errors
Replace |line|, |column|, and |offset| properties of |SyntaxError| with
the |location| property. It contains an object similar to the one
returned by the |location| function available in action code:

  {
    start: { offset: 23, line: 5, column: 6 },
    end:   { offset: 25, line: 5, column: 8 }
  }

For syntax errors produced in the middle of the input, |start| refers to
the first unparsed character and |end| refers to the character after it
(meaning the span is 1 character). This corresponds to the portion of
the input in the |found| property.

For syntax errors produced at the end of the input, both |start| and
|end| refer to a character past the end of the input (meaning the span
is 0 characters).

For syntax errors produced by calling |expected| or |error| functions in
action code the location info is the same as the |location| function
would return.
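
As an illustration of consuming the |location| property, here is a small
sketch. |formatLocation| is a hypothetical helper, not part of PEG.js,
and the |error| object below is a hand-written stand-in that reuses the
sample values shown above instead of coming from a real parser.

```javascript
// Hypothetical helper (not part of PEG.js): turns a location object with
// |start| and |end| into a short human-readable description.
function formatLocation(location) {
  var start = location.start,
      end   = location.end,
      span  = end.offset - start.offset;

  return "line " + start.line + ", column " + start.column +
         " (" + span + " character" + (span === 1 ? "" : "s") + ")";
}

// Stand-in for an error caught from a generated parser; the values are
// the sample ones from the message above.
var error = {
  location: {
    start: { offset: 23, line: 5, column: 6 },
    end:   { offset: 25, line: 5, column: 8 }
  }
};

console.log(formatLocation(error.location));
// line 5, column 6 (2 characters)
```

With a real parser, the same helper would presumably be called inside a
|catch| block around |parser.parse|, on the |location| of the caught
|SyntaxError|.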
10 years ago
David Majda b1ad2a1f61 Rename |reportedPos| to |savedPos|
Perform the following renames:

  * |reportedPos| -> |savedPos| (abstract machine variable)
  * |peg$reportedPos| -> |peg$savedPos| (variable in generated code)
  * |REPORT_SAVED_POS| -> |LOAD_SAVED_POS| (instruction)
  * |REPORT_CURR_POS| -> |UPDATE_SAVED_POS| (instruction)

The idea is that the name |reportedPos| is no longer accurate after the
|location| change (see the previous commit) because now both
|reportedPos| and |currPos| are reported to user code. Renaming to
|savedPos| resolves this inaccuracy.

There is probably a better name for the concept than the quite generic
|savedPos|, but it doesn't come to me.
10 years ago
David Majda 4f7145e360 Improve location info available in action code
Replace |line|, |column|, and |offset| functions with the |location|
function. It returns an object like this:

  {
    start: { offset: 23, line: 5, column: 6 },
    end:   { offset: 25, line: 5, column: 8 }
  }

In actions, |start| refers to the position at the beginning of action's
expression and |end| refers to the position after the end of action's
expression. This allows one to easily add location info e.g. to AST
nodes created in actions.

In predicates, both |start| and |end| refer to the current position.

Fixes #246.
10 years ago
David Majda e75f21dc8f Don't indent empty lines when creating browser version
This prevents having lines with nothing but 4 spaces in the output.
10 years ago
David Majda 889563a0ae Add missing ";" 10 years ago
David Majda 3473c6cb64 Remove extra whitespace 10 years ago
David Majda fb320c4c59 Fix small errors in Jasmine matcher messages 10 years ago
David Majda d7fc0b5c3b Implement infinite loop detection
Fixes #26.
10 years ago
David Majda 95ce20ed92 Extract the |matchesEmpty| visitor from the |reportLeftRecursion| pass
Besides the left recursion detector, the visitor will also be used by
the infinite loop detector.

Note the newly created |asts.matchesEmpty| function re-creates the
visitor each time it is called, which makes it slower than necessary.
This could have been worked around in various ways but I chose to defer
that optimization because real-world performance impact is small.
10 years ago
David Majda 03a391e874 s/appliedRules/visitedRules/
The rules are not really *applied* by the |reportLeftRecursion| pass,
they are just *visited*.
10 years ago
David Majda 25ed2b7ee2 Improve comment describing the |reportLeftRecursion| pass 10 years ago
David Majda 6ce97457bf Fix left recursion detection
So far, the left recursion detector assumed that left recursion occurs
only when the recursive rule is at the very left-hand side of the rule's
expression:

  start = start

This didn't catch cases like this:

  start = "a"? start

In general, if a rule reference can be reached without consuming any
input, it can lead to left recursion. This commit fixes the detector to
consider that.
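
The idea can be sketched on a toy AST as shown below. This is only an
illustration under stated assumptions: the node shapes ("literal",
"optional", "sequence", "rule_ref") and the function names are made up
for this example and are not the ones used by the |reportLeftRecursion|
pass.

```javascript
// Toy AST, assumed for this sketch only: |rules| maps a rule name to its
// expression node.

// Returns true if the node consumes input whenever it matches.
function alwaysAdvances(node, rules) {
  switch (node.type) {
    case "literal":  return node.value.length > 0;
    case "optional": return false; // may match without consuming anything
    case "sequence":
      return node.elements.some(function(el) {
        return alwaysAdvances(el, rules);
      });
    case "rule_ref": return alwaysAdvances(rules[node.name], rules);
    default: throw new Error("Unknown node type: " + node.type);
  }
}

// Follows every rule reference reachable without consuming input and
// throws when one of them leads back to a rule already being visited.
function checkLeftRecursion(node, rules, visited) {
  switch (node.type) {
    case "literal":
      return;
    case "optional":
      checkLeftRecursion(node.expression, rules, visited);
      return;
    case "sequence":
      for (var i = 0; i < node.elements.length; i++) {
        checkLeftRecursion(node.elements[i], rules, visited);
        // Later elements are only reached after input was consumed.
        if (alwaysAdvances(node.elements[i], rules)) { break; }
      }
      return;
    case "rule_ref":
      if (visited.indexOf(node.name) !== -1) {
        throw new Error(
          "Left recursion detected for rule \"" + node.name + "\"."
        );
      }
      checkLeftRecursion(rules[node.name], rules, visited.concat(node.name));
      return;
    default:
      throw new Error("Unknown node type: " + node.type);
  }
}

function reportLeftRecursion(rules) {
  Object.keys(rules).forEach(function(name) {
    checkLeftRecursion(rules[name], rules, [name]);
  });
}

// start = "a"? start
var grammar = {
  start: {
    type: "sequence",
    elements: [
      { type: "optional", expression: { type: "literal", value: "a" } },
      { type: "rule_ref", name: "start" }
    ]
  }
};

try {
  reportLeftRecursion(grammar);
} catch (e) {
  console.log(e.message); // Left recursion detected for rule "start".
}
```

In this sketch each element is checked before it is asked whether it
always advances, so a cycle is reported before the second function could
recurse through it.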

Fixes #190.
10 years ago
David Majda da57118a43 Implement basic support for tracing
Parsers can now be generated with support for tracing using the --trace
CLI option or a boolean |trace| option to |PEG.buildParser|. This makes
them trace their progress, which can be useful for debugging. Parsers
generated with tracing support are called "tracing parsers".

When a tracing parser executes, by default it traces the rules it enters
and exits by writing messages to the console. For example, a parser
built from this grammar:

  start = a / b
  a = "a"
  b = "b"

will write this to the console when parsing input "b":

  1:1 rule.enter start
  1:1 rule.enter   a
  1:1 rule.fail    a
  1:1 rule.enter   b
  1:2 rule.match   b
  1:2 rule.match start

You can customize tracing by passing a custom *tracer* to parser's
|parse| method using the |tracer| option:

  parser.parse(input, { tracer: tracer });

This will replace the built-in default tracer (which writes to the
console) by the tracer you supplied.

The tracer must be an object with a |trace| method. This method is
called each time a tracing event happens. It takes one argument which is
an object describing the tracing event.

Currently, three events are supported:

  * rule.enter -- triggered when a rule is entered
  * rule.match -- triggered when a rule matches successfully
  * rule.fail  -- triggered when a rule fails to match

These events are triggered in nested pairs -- for each rule.enter event
there is a matching rule.match or rule.fail event.

The event object passed as an argument to |trace| contains these
properties:

  * type   -- event type
  * rule   -- name of the rule the event is related to
  * offset -- parse position at the time of the event
  * line   -- line at the time of the event
  * column -- column at the time of the event
  * result -- rule's match result (only for rule.match event)
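
A custom tracer along these lines might look like the sketch below. The
|createCollectingTracer| name is hypothetical, and the events are
simulated by hand using the properties listed above, so that the example
runs without a generated parser.

```javascript
// Hypothetical tracer that collects formatted lines instead of writing
// to the console. Only the |trace| method is required by the parser.
function createCollectingTracer() {
  var depth = 0,
      lines = [];

  return {
    lines: lines,

    trace: function(event) {
      // rule.match and rule.fail close the most recent rule.enter.
      if (event.type !== "rule.enter") {
        depth--;
      }

      lines.push(
        event.line + ":" + event.column + " " +
        event.type + " " +
        new Array(depth + 1).join("  ") + event.rule
      );

      if (event.type === "rule.enter") {
        depth++;
      }
    }
  };
}

// Simulated events for the grammar above and input "b". A tracing
// parser would produce these itself; the offsets are assumed values.
var tracer = createCollectingTracer();

[
  { type: "rule.enter", rule: "start", offset: 0, line: 1, column: 1 },
  { type: "rule.enter", rule: "a",     offset: 0, line: 1, column: 1 },
  { type: "rule.fail",  rule: "a",     offset: 0, line: 1, column: 1 },
  { type: "rule.enter", rule: "b",     offset: 0, line: 1, column: 1 },
  { type: "rule.match", rule: "b",     offset: 1, line: 1, column: 2, result: "b" },
  { type: "rule.match", rule: "start", offset: 1, line: 1, column: 2, result: "b" }
].forEach(function(event) { tracer.trace(event); });

console.log(tracer.lines.join("\n"));
```

With a real tracing parser, such an object would be handed to |parse|
through the |tracer| option described above.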

The whole tracing API is somewhat experimental (which is why it isn't
documented properly yet) and I expect it will evolve over time as
experience is gained.

The default tracer is also somewhat bare-bones. I hope that the PEG.js
user community will develop more sophisticated tracers over time and
I'll be able to integrate their best ideas into the default tracer.
10 years ago
David Majda 675561f085 Rename and generalize |generateCache{Header,Footer}|
Rename |generateCache{Header,Footer}| to |generateRule{Header,Footer}|
and change their responsibility to generate the overall header/footer of
a rule function (when optimizing for speed) or the |peg$parseRule|
function (when optimizing for size). This creates a natural place to
generate tracing code (coming soon).
10 years ago
David Majda fb5f6c6ee9 Make labels behave like block-scoped variables
Action and predicate code can now see variables defined in expressions
"above" them.

Based on a pull request by Bryon Vandiver (@asterick):

  https://github.com/pegjs/pegjs/pull/180

Fixes #316.
10 years ago
David Majda 73795a65cc Behavior specs cleanup: Add group specs
While groups don't create separate nodes on the AST level, they exist
as a concept on the user level, so they should be specified.
10 years ago
David Majda e306b58443 Behavior specs cleanup: Improve error reporting specs 10 years ago
David Majda e9d038547d Behavior specs cleanup: Improve semantic predicate specs
Note that use of |text| inside semantic predicate code is no longer
tested or officially supported.
10 years ago
David Majda 3d9600b81b Behavior specs cleanup: Improve action specs 10 years ago