63 Commits (8bd4d390a9f43b730cd4403b4c767847e1e8cd4e)

Author SHA1 Message Date
Futago-za Ryuu 35f3c5267a Merge pull request #490 from chearon/css-num-fix
CSS example: parse decimal form of nums correctly
7 years ago
David Berneda 962db9d090 Update arithmetics.pegjs
Allow spacing before digits like: "     2 * (3 + 4)"
8 years ago
Caleb Hearon 770ca6e723 CSS example: parse decimal form of nums correctly
Before, 0.02 could get parsed as 0 and 0.02 when looking for nums+
8 years ago
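
A rough analogy for the ordering problem described above, assuming the number rule tried an integer alternative before the decimal one. JavaScript regex alternation is ordered much like PEG choice, so it is used here as a stand-in. The patterns and the `tokenize` helper are illustrative assumptions, not the rules from the CSS example grammar.

```javascript
// Analogy only: regex alternation picks the first alternative that matches,
// similar to ordered choice in a PEG. These patterns are hypothetical.
const integerFirst = /\d+|\d*\.\d+/g; // integer alternative tried first
const decimalFirst = /\d*\.\d+|\d+/g; // decimal alternative tried first

// Collect all successive matches; match() returns null when nothing matches.
function tokenize(pattern, input) {
  return input.match(pattern) || [];
}

console.log(tokenize(integerFirst, "0.02")); // [ '0', '.02' ]
console.log(tokenize(decimalFirst, "0.02")); // [ '0.02' ]
```

Putting the more specific alternative first lets the whole decimal be consumed as one number.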
David Majda ff3cc7930e CSS Example: Move null filtering from extractList to buildList
This makes extractList identical to the same function in other grammars
and makes it so that nulls are dealt with in only one function (until
now, they were dealt with both in extractList and buildList).

The refactoring should be safe as extractList isn't by itself used in
contexts where it can be passed a list containing nulls.
8 years ago
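
A minimal sketch of the split described above, in which `extractList` only picks elements and `buildList` alone filters out nulls. The argument shapes (`tail` as an array of sequence matches, `index` as the position of interest) are assumptions made for illustration and may differ from the helpers in the example grammars.

```javascript
// Picks the element at `index` from every sequence match; no null handling here.
function extractList(list, index) {
  return list.map(function(element) { return element[index]; });
}

// Combines head and tail into one list and drops nulls in a single place.
function buildList(head, tail, index) {
  return [head]
    .concat(extractList(tail, index))
    .filter(function(element) { return element !== null; });
}

// Hypothetical match results: separators at position 0, items at position 1.
const sampleHead = "a";
const sampleTail = [[",", "b"], [",", null], [",", "c"]];

console.log(buildList(sampleHead, sampleTail, 1)); // [ 'a', 'b', 'c' ]
```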
David Majda 647d488147 JSON example: Fix link to RFC 4234. 8 years ago
David Majda 2baeace235 JSON example: Expand some one-line rules to multiple lines
Blocks of one-line rules with aligned "=" signs should be used only in
cases where the rules are symmetric and we want to emphasize that.

Follow-up to ff7193776e.
8 years ago
David Majda 400a3cfa3c Avoid aligning object keys
The only exceptions left are objects representing a mapping with simple
keys and values -- essentially tables written as object literals.

See #443.
8 years ago
David Majda 6294bb5b13 Use only "//" comments
See #443.
8 years ago
David Majda 5ad1bc2add CSS example: Switch from first/rest to head/tail
Follow-up to e510ecc3d0.
8 years ago
David Majda 9c04c94c85 Escape vertical tab as "\v", not "\x0B"
See #441.
8 years ago
David Majda aa1a2e74cf Replace suitable for loops with Array methods (in /examples)
See #441.
8 years ago
Ali Tavakoli d914c7b150 JavaScript example: Use LogicalExpression nodes for "&&" and "||"
The buildLogicalExpression function was defined, but not used;
specifically, the Logical(AND|OR)Expression(NoIn)? rules were
constructing BinaryExpression nodes, but are now LogicalExpression
nodes as per the ESTree spec (es5.md).
8 years ago
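
A sketch of what building `LogicalExpression` nodes could look like. The node fields follow the ESTree spec, but the body of `buildLogicalExpression` and the `[operator, operand]` shape of the tail entries are assumptions, not the code from the grammar.

```javascript
// Folds a head operand and a list of [operator, operand] pairs into
// left-nested ESTree LogicalExpression nodes.
function buildLogicalExpression(head, tail) {
  return tail.reduce(function(result, element) {
    return {
      type: "LogicalExpression", // not "BinaryExpression" for && and ||
      operator: element[0],
      left: result,
      right: element[1]
    };
  }, head);
}

function identifier(name) {
  return { type: "Identifier", name: name };
}

// a && b && c
const logicalAst = buildLogicalExpression(identifier("a"), [
  ["&&", identifier("b")],
  ["&&", identifier("c")]
]);

console.log(logicalAst.type);      // LogicalExpression
console.log(logicalAst.left.type); // LogicalExpression
console.log(logicalAst.right.name); // c
```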
David Majda bf08c6cbc3 JavaScript example: Declare built AST as compatible with ESTree
ESTree is now the canonical JavaScript AST format. Mozilla SpiderMonkey
Parser API, which we declared compatibility with before, is obsolete.
8 years ago
David Majda 567655e72f JavaScript example: Add "type" to ObjectExpression properties
Nodes representing ObjectExpression properties were missing the "type"
property (set to "Property") so let's add it.
8 years ago
Ali Tavakoli 5f9bc6ed4d JavaScript example: Add "kind" to VariableDeclaration nodes
The JavaScript example grammar's VariableDeclaration nodes were missing
the "kind" member (which is always set to "var", according to the
ESTree spec).
8 years ago
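
For illustration, an ES5 ESTree `VariableDeclaration` node with the "kind" member in place. The helper name `buildVariableDeclaration` is hypothetical; only the node shape is taken from the ESTree spec.

```javascript
// Hypothetical helper; in ES5 the only possible kind is "var".
function buildVariableDeclaration(declarators) {
  return {
    type: "VariableDeclaration",
    declarations: declarators,
    kind: "var"
  };
}

// Node for: var a;
const declarationNode = buildVariableDeclaration([
  {
    type: "VariableDeclarator",
    id: { type: "Identifier", name: "a" },
    init: null // no initialiser
  }
]);

console.log(declarationNode.kind); // var
```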
David Majda f07ab7f32e examples/json.pegjs: Fix the "unescaped" rule
The "unescaped" rule was created by mechanically translating the original
RFC 7159 rule:

  unescaped = %x20-21 / %x23-5B / %x5D-10FFFF

into:

  unescaped = [\x20-\x21\x23-\x5B\x5D-\u10FFFF]

However, this mechanical translation was incorrect as PEG.js grammars
don't have 6-digit Unicode escape sequences. Sequence "\u10FFFF" was
interpreted as "\u10FF" followed by two "F" characters.

This commit rewrites the "unescaped" rule into a form which, while not
being a mechanical translation of the original rule, matches the same
characters in the whole Unicode range. It also matches the textual
description of string representation in RFC 7159:

  All Unicode characters may be placed within the quotation marks,
  except for the characters that must be escaped: quotation mark,
  reverse solidus, and the control characters (U+0000 through U+001F).

Fixes #417.
9 years ago
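
The misreading of "\u10FFFF" can be reproduced with a plain JavaScript regular expression, which (without the "u" flag) also reads exactly four hex digits after "\u". This is an analogy to the grammar's character class, and the negated class below is only one plausible form of the rewritten rule, assumed from the RFC text quoted above.

```javascript
// "\u10FFFF" is read as "\u10FF" followed by two literal "F" characters,
// so the last range effectively ends at U+10FF.
const mistranslated = /^[\x20-\x21\x23-\x5B\x5D-\u10FFFF]$/;

// Assumed rewrite: everything except control characters (U+0000 to U+001F),
// quotation mark (U+0022) and reverse solidus (U+005C).
const rewritten = /^[^\x00-\x1F\x22\x5C]$/;

const euro = "\u20AC"; // U+20AC lies above U+10FF

console.log(mistranslated.test(euro)); // false
console.log(rewritten.test(euro));     // true
console.log(rewritten.test('"'));      // false
console.log(rewritten.test("\n"));     // false
```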
David Majda e510ecc3d0 Switch from first/rest to head/tail in example grammars
In the past year I worked on various grammars where first/rest or
head/tail were used as labels for parts of lists. I found I associate
head/tail with a list immediately, while in case of first/rest I have to
"parse" grammar rules for a while before understanding their structure.

Moreover, I tend to assume that rest is a list of the same things as
first, but I make no such assumption in the case of head/tail. This
assumption was in conflict with the grammar structure.

I'm not sure how much these observations are applicable to others, but I
decided to act on them and switch from first/rest to head/tail.
9 years ago
David Majda 10d7a6aded Simplify the arithmetics example grammar
The arithmetics example grammar is the first thing everyone sees in the
online editor at the PEG.js website, but it begins with a complicated
|combine| function in the initializer. Without understanding it, it is
impossible to understand code in the actions. This may be a barrier to
learning how PEG.js works.

This commit removes the |combine| function and gets rid of the whole
initializer, removing the learning obstacle and streamlining action
code. The only cost is a slight code duplication.
9 years ago
David Majda 9e8cb04c81 examples/arithmetics.pegjs: Remove trailing whitespace 9 years ago
David Majda cca41e6618 Quote |class| object literal key
It broke IE 8.
9 years ago
David Majda 4b154e177f Update character categories in grammars to Unicode 8.0.0 9 years ago
chunpu e6efe09ac3 Update example arithmetics.pegjs, make it work 11 years ago
David Majda c13cc88262 JavaScript example: Remove reserved word detection
Reserved word detection as it was implemented in the JavaScript example
grammar had two big downsides:

  1. It required changes in ordering of choices in some rules in order
     not to trigger the detection prematurely. One of the changes was
     already implemented (in the |Statement| rule, see the diff), but
     apparently more were needed (the grammar didn't parse inputs like
     |true| or |function f() {}|). And I'm not 100% sure that would be
     the end of it (maybe deeper structural changes would be needed).

  2. It made error messages confusing. Consider the following example:

       var a = @;

     Instead of reporting:

       Expected ... but "@" found.

     the generated parser reported:

       Reserved word "var" can't be used as an identifier.

     This was because the parser parsed the statement first as
     |VariableStatement| and when this failed, it tried to parse it as
     |ExpressionStatement|, triggering the reserved word detection.

Because of these, I decided to remove reserved word detection from the
JavaScript example grammar.
11 years ago
David Majda b271d66442 JavaScript example: Fix automatic semicolon insertion
Fix parsing of inputs like this:

  foo() // comment
  bar()
11 years ago
David Majda c70c8551b4 JavaScript example: Fix parsing of statements
Fixes a problem where statements starting with a reserved word produced
errors like this:

  Reserved word "return" can't be used as an identifier.

The problem was in a wrong ordering of choices in the |Statement| rule
together with aggressive reserved word detection in the |Identifier|
rule.
11 years ago
David Majda 2005345976 Complete rewrite of the CSS example grammar
This is a complete rewrite of the CSS example grammar. It is now based
on CSS 2.1 *including the errata* and the generated parser builds a
nicer syntax tree. There is also a number of cleanups, formatting
changes, naming changes, and bug fixes.

Besides this, the rewrite reflects how I write grammars today (as opposed
to a few years ago) and what style I would recommend to others.
11 years ago
David Majda 18f92c5647 Complete rewrite of the JavaScript example grammar
This is a complete rewrite of the JavaScript example grammar. It is now
based on ECMA-262, 5.1 Edition and the generated parser builds a syntax
tree compatible with Mozilla SpiderMonkey Parser API. There is also a
number of cleanups, formatting changes, naming changes, and bug fixes.

Besides this, the rewrite reflects how I write grammars today (as opposed
to a few years ago) and what style I would recommend to others.
11 years ago
David Majda fba70833dd Complete rewrite of the JSON example grammar
This is a complete rewrite of the JSON example grammar. It is now based
on RFC 7159 instead of an informal description at the JSON website.

Besides this, the rewrite reflects how I write grammars today (as opposed
to a few years ago) and what style I would recommend to others.
11 years ago
David Majda f5443d2bf1 Complete rewrite of the arithmetics example grammar
This is a complete rewrite of the arithmetics example grammar. It now
allows whitespace between tokens, supports "-" and "/" operators, and
gets the operator associativity right. Also, rule names now match the usual
conventions (term, factor,...).

Besides this, the rewrite reflects how I write grammars today (as opposed
to a few years ago) and what style I would recommend to others.
11 years ago
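
One way to get left associativity in an action is to fold the matched tail onto the head from left to right. The `evaluate` function and the `[operator, operand]` shape of the tail entries are assumptions for illustration, not the action code from the arithmetics grammar.

```javascript
// Applies each [operator, operand] pair to the running result, left to right,
// so "10 - 2 - 3" is computed as (10 - 2) - 3.
function evaluate(head, tail) {
  return tail.reduce(function(result, element) {
    const operator = element[0];
    const operand = element[1];

    if (operator === "+") { return result + operand; }
    if (operator === "-") { return result - operand; }
    throw new Error("Unknown operator: " + operator);
  }, head);
}

console.log(evaluate(10, [["-", 2], ["-", 3]])); // 5
```

A right-associative reading would instead give 10 - (2 - 3) = 11.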
David Majda da9a32a5f3 Example grammars: Improve |parseInt| invocations
Instead of |parseInt("0x" + digits)| do |parseInt(digits, 16)|, which is
a bit cleaner.
11 years ago
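
Both forms give the same result for hexadecimal digits; the explicit radix avoids string concatenation and states the intent directly. The `digits` value is an arbitrary sample.

```javascript
const digits = "1F"; // sample hexadecimal digits as matched by a grammar rule

const viaPrefix = parseInt("0x" + digits); // relies on the "0x" prefix
const viaRadix = parseInt(digits, 16);     // states the radix explicitly

console.log(viaPrefix); // 31
console.log(viaRadix);  // 31
```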
David Majda 68c6452d8a CSS example grammar: Simplify |integer| and |float| rules
It's not necessary to parse |parts| in the |integer| and |float| rules
into an integer/float value. Everywhere these rules are used, the result
is converted back into a string anyway.
11 years ago
David Majda 86769a6c5c Error handling: Make |?| return |null| on unsuccessful match
Before this commit, the |?| operator returned an empty string upon
unsuccessful match. This commit changes the returned value to |null|. It
also updates the PEG.js grammar and the example grammars, which used the
value returned by |?| quite often.

Returning |null| is possible because it no longer indicates a match
failure.

I expect that this change will simplify many real-world grammars, as an
empty string is almost never desirable as a return value (except some
lexer-level rules) and it is often translated into |null| or some other
value in action code.

Implements part of #198.
11 years ago
David Majda 57e806383c Error handling: Use a special value (not |null|) to indicate failure
Using a special value to indicate match failure instead of |null| allows
actions to return |null| as a regular value. This simplifies e.g. the
JSON parser.

Note the special value is internal and intentionally undocumented. This
means that there is currently no official way to trigger a match failure
from an action. This is a temporary state which will be fixed soon.

The negative performance impact (see below) is probably caused by
changing a lot of comparisons against |null| (which likely check the value
against a fixed constant representing |null| in the interpreter) to
comparisons against the special value (which likely check the value
against another value in the interpreter).

Implements part of #198.

Speed impact
------------
Before:     1146.82 kB/s
After:      1031.25 kB/s
Difference: -10.08%

Size impact
-----------
Before:     950817 b
After:      973269 b
Difference: 2.36%

(Measured by /tools/impact with Node.js v0.6.18 on x86_64 GNU/Linux.)
11 years ago
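
The idea of a dedicated failure value can be sketched as follows. The name `FAILED` and the toy `parseNullLiteral` function are hypothetical; the actual special value is internal and undocumented, as the message says.

```javascript
// Hypothetical sentinel: a unique object compared by identity, so it can
// never be confused with a value an action returns (including null).
const FAILED = {};

// Toy rule matching the literal "null" at position `pos`.
// Returns the parsed value (JavaScript null) on success, FAILED otherwise.
function parseNullLiteral(input, pos) {
  return input.startsWith("null", pos) ? null : FAILED;
}

function describe(result) {
  return result === FAILED ? "no match" : "matched: " + String(result);
}

console.log(describe(parseNullLiteral("null", 0))); // matched: null
console.log(describe(parseNullLiteral("true", 0))); // no match
```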
David Majda f3d392bd1c JavaScript example: Fix handling of elided elements in array literals
JavaScript allows one to skip (elide) elements in array literals. It
also allows a trailing comma, which doesn't imply an element elision.

For example, an array literal:

  [,,,]

contains three elided elements (one before each comma) and a trailing
comma.

Example JavaScript parser handled elided elements incorrectly and just
threw them away. This commit fixes this behavior and inserts |null| in
the AST for each elided element. This is in line with how SpiderMonkey's
JavaScript parser (the |Reflect.parse| API), Esprima and Acorn behave.

Based on a patch by @fpirsch:

  https://github.com/dmajda/pegjs/pull/177
11 years ago
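
The language semantics behind this fix can be checked directly in JavaScript, and the resulting AST shape is shown next to it. The `elidedAst` object is a hand-written illustration of an ESTree-style `ArrayExpression`, not output of the example parser.

```javascript
// Three commas: three elided elements, the last comma is a trailing comma.
const threeHoles = [,,,];
// One elision in the middle.
const middleHole = [1,,2];
// A trailing comma alone does not add an element.
const trailingComma = [1, 2,];

console.log(threeHoles.length);    // 3
console.log(middleHole.length);    // 3
console.log(trailingComma.length); // 2

// Assumed AST for "[,,,]": one null per elided element.
const elidedAst = {
  type: "ArrayExpression",
  elements: [null, null, null]
};

console.log(elidedAst.elements.length === threeHoles.length); // true
```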
David Majda d71bca46a1 JavaScript example: Improve array literal rules
Makes the |ArrayLiteral| and |ElementList| rules more in line with the
ECMAScript grammar.

Based on a patch by @fpirsch:

  https://github.com/dmajda/pegjs/pull/177
11 years ago
David Majda fe18c6ffd3 Fix |null| handling in the JSON parser
We couldn't return |null| in the |value| rule of the JSON example
parser because that would mean parse failure. So until now, we just
returned |"null"| (a string).

This was obviously stupid, so this commit changes the |value| rule to
return a special object instead that is converted to |null| later.

Based on patches by Patrick Logan (GH-91) and Jakub Vrána (GH-191).
11 years ago
fpirsch d7e853b87c Fix automatic semi-colon insertion
Fix automatic semi-colon insertion in var statements without
initialisers.

  var i
  i = 1;

is valid but was not accepted by the parser, whereas

  var i = 2
  i = 3;

is valid and was accepted by the parser, as it should be.

With this fix, both are accepted.
12 years ago
David Majda c54483bb17 Text nodes: Use text nodes in examples/javascript.pegjs 12 years ago
David Majda faaf9b6be1 Text nodes: Use text nodes in examples/css.pegjs 12 years ago
David Majda d0dfe46550 Text nodes: Use text nodes in examples/json.pegjs 12 years ago
David Majda 9ec6b6aa57 Text nodes: Use text nodes in examples/arithmetics.pegjs 12 years ago
fpirsch fa05142292 Update examples/javascript.pegjs
Changed "arguments" to "args" in several places to avoid shadowing "arguments", which is not allowed by Google Closure Compiler.
12 years ago
David Majda 70e4166bb2 Fix typo in a comment in JavaScript example grammar 13 years ago
David Majda 8ae3eea7c4 Fix typo in JavaScript example grammar
Fixes GH-62.
13 years ago
Jason Davies d386d3a351 Fix typo in comment. 13 years ago
David Majda 5cf66d824c Fix typo in JavaScript example grammar 13 years ago
David Majda 5f810f803b Make example grammars compatible with Rhino 15 years ago
David Majda ee8c121676 Use labeled expressions and variables instead of $1, $2, etc.
Labeled expressions lead to more maintainable code and also will allow
certain optimizations (we can ignore results of expressions not passed
to the actions).

This does not produce a statistically significant speedup of the
benchmark suite on V8.

Detailed results (benchmark suite totals):

---------------------------------
 Test #     Before       After
---------------------------------
      1   28.43 kB/s   28.46 kB/s
      2   28.38 kB/s   28.56 kB/s
      3   28.22 kB/s   28.58 kB/s
      4   28.76 kB/s   28.55 kB/s
      5   28.57 kB/s   28.48 kB/s
---------------------------------
Average   28.47 kB/s   28.53 kB/s
---------------------------------

Mozilla/5.0 (X11; U; Linux i686; en-US) AppleWebKit/533.4 (KHTML, like Gecko) Chrome/5.0.375.55 Safari/533.4
15 years ago
David Majda 409ddf2ae8 Formatted all grammars more consistently and transparently
This is a purely cosmetic change; no functionality was affected
(hopefully).
15 years ago
David Majda 698564a3c2 Replace ":" after a rule name with "="
I'll introduce labelled expressions shortly and I want to use ":" as a
label-expression separator. This change avoids conflict between the two
meanings of ":". (What would e.g. "foo: 'bar'" mean?  Rule "foo"
matching string "bar", or string "bar" labelled "foo"?)
15 years ago