pegjs/spec
David Majda 2f2152204a Refine error handling further
Before this commit, the |expected| and |error| functions didn't halt
parsing immediately; they triggered a regular match failure. After they
were called, the parser could backtrack and try other branches, and only
if no other branch succeeded did it throw an exception, with information
possibly based on the parameters passed to the |expected| or |error|
function (this depended on the positions where failures in other
branches had occurred).

While nice in theory, this solution didn't work well in practice. There
were at least two problems:

  1. An action's expression could easily trigger a match failure later
     in the input than the action itself. As a result, the
     action-triggered failure was shadowed by the expression-triggered
     one.

     Consider the following example:

       integer = digits:[0-9]+ {
         var result = parseInt(digits.join(""), 10);

         if (result % 2 === 0) {
           error("The number must be an odd integer.");
           return;
         }

         return result;
       }

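     Given input "2", the |[0-9]+| expression would record a match
     failure at position 1 (an unsuccessful attempt to parse yet another
     digit after "2"). However, the failure triggered by the |error|
     call would occur at position 0. Because the reported error is
     based on the failure furthest into the input, the message passed
     to |error| was lost.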

     This problem could have been solved by silencing match failures in
     action expressions, but that would lead to severe performance
     problems (yes, I tried and measured). Other possible solutions are
     hacks which I didn't want to introduce into PEG.js.

  2. Triggering a match failure in action code could have led to
     unexpected backtracking.

     Consider the following example:

       class = "[" (charRange / char)* "]"

       charRange = begin:char "-" end:char {
         if (begin.charCodeAt(0) > end.charCodeAt(0)) {
           error("Invalid character range: " + begin + "-" + end + ".");
         }

         // ...
       }

       char = [a-zA-Z0-9_\-]

     Given input "[b-a]", the |charRange| rule would fail, but the
     parser would try the |char| rule and succeed repeatedly, resulting
     in "b-a" being parsed as a sequence of three |char|'s, which it is
     not.

     This problem could have been solved by using negative predicates,
     but that would complicate the grammar and still wouldn't get rid of
     unintuitive behavior.
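
The shadowing described in problem 1 can be modelled with a small sketch.
This is a toy model of furthest-failure tracking under the old semantics,
not PEG.js internals; |createTracker|, |fail|, |report| and
|parseOddInteger| are hypothetical names introduced only for illustration.

```javascript
// Toy model of furthest-failure tracking (hypothetical names, not PEG.js code).
function createTracker() {
  var maxFailPos = -1;
  var maxFailMessages = [];

  return {
    // Record a failure; only failures at the furthest position survive.
    fail: function (pos, message) {
      if (pos < maxFailPos) { return; } // shadowed by a later failure
      if (pos > maxFailPos) { maxFailPos = pos; maxFailMessages = []; }
      maxFailMessages.push(message);
    },
    report: function () {
      return { pos: maxFailPos, messages: maxFailMessages };
    }
  };
}

// Model of: integer = digits:[0-9]+ { ...error(...) for even numbers... }
function parseOddInteger(input, tracker) {
  var pos = 0;
  while (pos < input.length && /[0-9]/.test(input.charAt(pos))) { pos++; }
  // The attempt to match one more digit always fails where the digits end.
  tracker.fail(pos, "Expected [0-9]");
  if (pos === 0) { return null; }

  var result = parseInt(input.slice(0, pos), 10);
  if (result % 2 === 0) {
    // Old semantics: the action's failure is recorded at the rule's start.
    tracker.fail(0, "The number must be an odd integer.");
    return null;
  }
  return result;
}

var tracker = createTracker();
var value = parseOddInteger("2", tracker);
var report = tracker.report();
console.log(value, report);
```

In this model the failure recorded at position 1 wins, so the report for
input "2" carries only the |[0-9]| expectation and the custom message is
dropped.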

Given these problems, I decided to change the semantics of the
|expected| and |error| functions. They no longer interact with the
regular match failure mechanism; instead, they cause an immediate parse
failure by throwing an exception. I think this is more intuitive
behavior with fewer harmful side effects.

The disadvantage of the new approach is that one can't backtrack from an
action-triggered error. I don't see this as a big deal as I think this
will be rarely needed and one can always use a semantic predicate as a
workaround.
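
The difference between the two behaviors can be sketched with a
hand-written model of the |class| example. This is only an illustration
under simplifying assumptions, not generated parser code: |FAILED|,
|isChar|, |parseClass|, |throwingCheck| and |predicateCheck| are
hypothetical helpers. The throwing check stands in for the new |error|
semantics, and the other check stands in for a semantic predicate, which
produces a regular match failure.

```javascript
// Toy model (hypothetical helpers, not PEG.js internals). FAILED marks a
// regular match failure, which lets the parser try the next alternative.
var FAILED = {};

function isChar(ch) { return /^[a-zA-Z0-9_\-]$/.test(ch); }

// Model of: class = "[" (charRange / char)* "]"
// check(begin, end) decides what happens to a reversed range such as "b-a".
function parseClass(input, check) {
  var pos = 0, parts = [];
  if (input.charAt(pos) !== "[") { return FAILED; }
  pos++;
  while (true) {
    // First alternative: charRange = begin:char "-" end:char
    if (isChar(input.charAt(pos)) && input.charAt(pos + 1) === "-" &&
        isChar(input.charAt(pos + 2))) {
      var range = check(input.charAt(pos), input.charAt(pos + 2));
      if (range !== FAILED) { parts.push(range); pos += 3; continue; }
    }
    // Second alternative: char
    if (isChar(input.charAt(pos))) { parts.push(input.charAt(pos)); pos++; continue; }
    break;
  }
  if (input.charAt(pos) !== "]") { return FAILED; }
  pos++;
  return pos === input.length ? parts : FAILED;
}

// New semantics of error(): throw, which aborts the whole parse.
function throwingCheck(begin, end) {
  if (begin.charCodeAt(0) > end.charCodeAt(0)) {
    throw new Error("Invalid character range: " + begin + "-" + end + ".");
  }
  return begin + "-" + end;
}

// Predicate style: report a regular failure, so the parser backtracks.
function predicateCheck(begin, end) {
  return begin.charCodeAt(0) > end.charCodeAt(0) ? FAILED : begin + "-" + end;
}

var thrownMessage = null;
try { parseClass("[b-a]", throwingCheck); } catch (e) { thrownMessage = e.message; }
var backtracked = parseClass("[b-a]", predicateCheck);
var valid = parseClass("[a-b]", throwingCheck);
console.log(thrownMessage);
console.log(backtracked);
console.log(valid);
```

With the throwing check, "[b-a]" aborts with the custom message. With the
predicate-style check, the model falls back to the |char| alternative and
reads "b-a" as three separate characters, so backtracking becomes
something the grammar author opts into rather than a side effect of
reporting an error.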

Speed impact
------------
Before:     993.84 kB/s
After:      998.05 kB/s
Difference: 0.42%

Size impact
-----------
Before:     1019968 b
After:      975434 b
Difference: -4.37%

(Measured by /tools/impact with Node.js v0.6.18 on x86_64 GNU/Linux.)
2013-12-06 21:43:27 +01:00
compiler/passes Refine error handling further 2013-12-06 21:43:27 +01:00
vendor/jasmine Upgrade jasmine and jasmine-node 2013-08-22 09:07:19 +02:00
generated-parser.spec.js Refine error handling further 2013-12-06 21:43:27 +01:00
helpers.js Fix too eager proxy rules removal 2013-01-06 10:17:10 +01:00
index.html Code generator rewrite 2013-01-01 16:38:09 +01:00
parser.spec.js Remove the |startRule| property from the AST 2013-01-06 10:21:48 +01:00
README Git repo npmization: Make the repo a npm package 2012-11-10 14:21:14 +01:00

PEG.js Spec Suite
=================

This is the PEG.js spec suite. It ensures PEG.js works correctly. All specs
should always pass on all supported platforms.

Running in a browser
--------------------

  1. Make sure you have Node.js and all the development dependencies specified
     in package.json installed.

  2. Run the following command in the PEG.js root directory (one level up from
     this one):

       make browser

  3. Start a web server and make it serve the PEG.js root directory.

  4. Point your browser to a URL corresponding to the index.html file.

  5. Watch the specs pass (or fail).

If you have Python installed, you can fulfill steps 3 and 4 by running the
following command in the PEG.js root directory

  python -m SimpleHTTPServer

and loading http://localhost:8000/spec/index.html in your browser.

Running from the command line
-----------------------------

  1. Make sure you have Node.js and all the development dependencies specified
     in package.json installed.

  2. Run the following command in the PEG.js root directory (one level up from
     this one):

       make spec

  3. Watch the specs pass (or fail).