79 Commits (15cb8ff968c3c172ad4161416a5e47d8a092b3a6)

Author SHA1 Message Date
Futago-za Ryuu 616749377b Fix IE11 Support (#583)
- Revert ES6 changes to arithmetics.pegjs
- Use Array#forEach instead of for..of
- Don't use native Array#find & Array#findIndex
- Added util/arrays.js (find & findIndex)
- Use Function instead of eval
6 years ago
Futago-za Ryuu e0e9fbcd30 Unicode 11 6 years ago
Futago-za Ryuu e64118f3b7 Update src/parser.pegjs
- use value plucking
- remove helpers not needed now
- types in OPS_* are now returned by *Operator
- RESERVED_WORDS is now a `Object<Identifier,true>`
- use ES2015+ JavaScript
- cleanup source code
6 years ago
Futago-za Ryuu 460f0cc5bc Implement value plucking
Resolves #235, #427, #545
6 years ago
Futago-za Ryuu 26969475f7 IdentifierName > Identifier 6 years ago
Futago-za Ryuu b81396a904 Regenerate parser 7 years ago
Futago-za Ryuu fe6f09238a Relay parser opts from peg.generate (#553) 7 years ago
Futago-za Ryuu 5476eca59f Moved AST and visitor classes 7 years ago
Futago-za Ryuu b0a5db1ab9 Expose ast classes used by parser 7 years ago
Mingun 0dab14d652 Add ability to extract comments from the grammar (#511)
All comments stored in the `comments` property of the `grammar` node.
Comments extracted only if the `extractComments` options set to `true` when you generate parser.
This property is object with mapping start offset of comment to comment object, that looks like:

```js
{
  text: 'text in the comment, just after // or /* and before */',
  multiline: true|false,// true for /**/ comments, false for // comments
  location: location()
}
```
7 years ago
Mingun 9b90fa1d81 Move all codegeneration from `generateBytecode` pass to `generateJs` pass (#459)
* Split 'consts' collection by content types into:

  - literals: for literal expressions, like `"a"`
  - classes: for character class expressions, like `[a]`
  - expectations: for constants describing expected values when parse failed
  - functions: for constants with function code

* Move any JavaScript code generation from 'generateBytecode' to 'generateJs'.
* Rename opcode 'MATCH_REGEXP' to 'MATCH_CLASS' (name reflects purpose, not implementation).
* Replace 'PUSH' opcode with 'PUSH_EMPTY_STRING' opcode because it is only used with empty strings
7 years ago
Futago-za Ryuu 27ec5ed9b1 Ensure we are nearly always in fast mode on V8
See: https://stackoverflow.com/a/24989927/1518408
7 years ago
Futago-za Ryuu 93cc6c5b26 Parser calls AST node creator now
Before this commit, the PEG.js parser always created the AST using a plain JavaSctript object, but allthough simple and effective for the job, this method sacrificies performance slightly.

From now on the parser shall call a Node creator. This should help with performance, as well as in the future move some AST helpers into the new AST functions.
7 years ago
Mingun 4cc9185a78 Improve error when reserved word used as label (#552)
Before this commit error looks like (for input `start = break:'a'`)

> Expected "!", "$", "&", "(", "*", "+", ".", "/", "/*", "//", ";", "?", character class, code block, comment, end of line, identifier, literal, or whitespace but ":" found.

After this error looks like

> Label can't be a reserved word "break".
7 years ago
felix cb3c5f4473 Improve error message for unbalanced brace. (#534)
Currently, an open brace without a corresponding brace will emit this confusing error message:

> Expected "!", "$", "&", "(", "*", "+", ".", "/", "/*", "//", ";", "?", character class, code block, comment, end of line, identifier, literal, or whitespace but "{" found.

This change adds an error case to the grammar to make it clear what the problem is.
7 years ago
Mingun c98fee1629 Add location information to group AST node 7 years ago
David Majda 400a3cfa3c Avoid aligning object keys
The only exception left are objects representing a mapping with simple
keys and values -- essentially tables written as object literals.

See #443.
8 years ago
David Majda 12112310f2 Use only double quotes for strings
See #443
8 years ago
David Majda 6294bb5b13 Use only "//" comments
See #443.
8 years ago
David Majda 6fa8ad63f9 Replace some functions with arrow functions
Because arrow functions work rather differently than normal functions (a
bad design mistake if you ask me), I decided to be conservative with the
conversion.

I converted:

  * event handlers
  * callbacks
  * arguments to Array.prototype.map & co.
  * small standalone lambda functions

I didn't convert:

  * functions assigned to object literal properties (the new shorthand
    syntax would be better here)
  * functions passed to "describe", "it", etc. in specs (because Jasmine
    relies on dynamic "this")

See #442.
8 years ago
David Majda bdf91b5941 Replace "var" with "let" & "const"
This is purely a mechanical change, not taking advantage of block scope
of "let" and "const". Minimizing variable scope will come in the next
commit.

In general, "var" is converted into "let" and "const" is used only for
immutable variables of permanent character (generally spelled in
ALL_CAPS). Using it for any immutable variable regardless on its
permanence would feel confusing.

Any code which is not transpiled and needs to run in ES6 environment
(examples, code in grammars embedded in specs, ...) is kept unchanged.
This is also true for code generated by PEG.js.

See #442.
8 years ago
David Majda 9c04c94c85 Escape vertical tab as "\v", not "\x0B"
See #441.
8 years ago
David Majda 3e8bcbea73 Replace suitable for loops with Array methods (in /src)
See #441.
8 years ago
David Majda e7d03825e0 AST: Remove the "rawText" property from "class" nodes
It isn't used anymore.
9 years ago
David Majda 2fd77b96fc Revert "Use literal raw text in error messages"
I no longer think that using raw literal texts in error messages is the
right thing to do. The main reason is that it couples error messages
with details of the grammar such as use of single or double quotes in
literals. A better solution is coming in the next commit.

This reverts commit 69a0f769fc.
9 years ago
David Majda 0c39f1cf86 Fix labels leaking to outer scope
Labels in expressions like "(a:"a")" or "(a:"a" b:"b" c:"c")" were
visible to the outside despite being wrapped in parens. This commit
makes them invisible, as they should be.

Note this required introduction of a new "group" AST node, whose purpose
is purely to provide label scope isolation. This was necessary because
"label" and "sequence" nodes don't (and can't!) provide this isolation
themselves.

Part of a fix of #396.
9 years ago
David Majda a4a66a2e5b Switch from first/rest to head/tail in the PEG.js grammar
In the past year I worked on various grammars where first/rest or
head/tail were used as labels for parts of lists. I found I associate
head/tail with a list immediately, while in case of first/rest I have to
"parse" grammar rules for a while before understanding their structure.

Moreover, I tend to assume that rest is a list of the same thigs as
first, but I don't have such assumption in case of head/tail. This
assumption was in conflict with the grammar structure.

I'm not sure how much these observations are applicable to others, but I
decided to act on them and switch from first/rest to head/tail.
9 years ago
David Majda 69a0f769fc Use literal raw text in error messages
Fixes #127.
9 years ago
David Majda 4b154e177f Update character categories in grammars to Unicode 8.0.0 9 years ago
David Majda 89146915ce Add location information to AST nodes
This will allow to add location information to |GrammarError| exceptions
thrown in various passes.
10 years ago
David Majda 7e3b4ec4f8 PEG.js grammar: Remove reserved word detection
This is mostly done for consistency with the JavaScript example grammar,
from which the |Identifier| rule is taken from. See the previous commit
for details.
11 years ago
David Majda e78ffbba9c PEG.js grammar: Improve the |Code| rule a bit
Instead of matching segments between blocks character by character,
match them as a whole. Also align the style with other similar rules
(e.g. the comment ones).
11 years ago
David Majda 64eb5faf54 PEG.js grammar: Fix line continuation handling in character classes
Before this commit, line continuations in character classes contributed
an empty string to the list of characters and character ranges matched
by a class. While this didn't lead to a buggy behavior with the current
code generator, the AST was wrong and the problem could have caused bugs
later.

This commit fixes the problem.
11 years ago
David Majda 0678bd8a0c PEG.js grammar: Add missing semicolon
Fixes the following JSHint error:

  lib/parser.js: line 108, col 54, Missing semicolon.
11 years ago
David Majda cf294ef236 PEG.js grammar: Add limitations
The limitations are inherited from the JavaScript example grammar.
11 years ago
David Majda 0459ab6b37 PEG.js grammar: Formatting & comments 11 years ago
David Majda 6f2510e49e PEG.js grammar: Make rules with operators more generic 11 years ago
David Majda 45c29a886f PEG.js grammar: Extract the |SemanticPredicateExpression| rule
Semantic predicates are kind of |PrimaryExpression|, not kind of
|PrefixedExpression|. Therefore I extracted a rule for them and
referenced it from the |PrimaryExpression|.
11 years ago
David Majda da18f6a729 PEG.js grammar: Extract the |RuleReferenceExpression| rule
This makes the |Primary| rule a bit more tidy.
11 years ago
David Majda 8e6f98e45c PEG.js grammar: Extract the |ActionExpression| rule
Having it separated from the |SequenceExpression| rule is cleaner and
more logical.
11 years ago
David Majda 5c6f4dd38b PEG.js grammar: Append |Expression| to expression rule names
Makes the rule names a bit longer but also clearer.
11 years ago
David Majda 27c2d26203 PEG.js grammar: More JavaScript-like initializer/rule separation
Initializer and rules are now separated in a similar way as JavaScript
statements -- either by a semicolon or a line terminator, possibly with
whitespace and comments mixed in.

One consequence is that the grammars like this are now illegal:

  foo = "a" bar = "b"

A semicolon needs to be inserted between the rules:

  foo = "a";bar = "b"

I consider this a good change as the now-illegal syntax was somewhat
confusing.
11 years ago
David Majda 4ce7593f5f PEG.js grammar: Extract the |AnyMatcher| rule
This makes the |Primary| rule a bit more tidy. Also, matching the |.|
character really belongs to the lexical part of the grammar, next to
literals and character classes.
11 years ago
David Majda c0df01b092 PEG.js grammar: Improve code block handling
* Rename the |Action| rule to |CodeBlock| (it better describes what
    the rule matches).

  * Implement the rule in a simpler way and move it after more basic
    lexical elements.
11 years ago
David Majda 13f72bb19d PEG.js grammar: More JavaScript-like rules for identifiers
This change has two side effects:

  * Label names can no longer be JavaScript reserved words.

  * |$| is allowed again in label names. However, because of the
    preference rules, names starting with it will be usually parsed as a
    text operator followed by another identifier (denoting a rule
    reference or label name).
11 years ago
David Majda 0d6b91cb20 PEG.js grammar: More JavaScript-like rules for strings/literals/classes 11 years ago
David Majda bcb5271649 PEG.js grammar: More JavaScript-like rules for skipped elements 11 years ago
David Majda b463808b3f PEG.js grammar: Replace several smaller comments by a big initial one 11 years ago
David Majda a5a0609505 PEG.js grammar: Inline trivial character rules 11 years ago
David Majda ae89f5e469 PEG.js grammar: Change whitespace handling
Before this commit, whitespace was handled at the lexical level by
making tokens consume any whitespace coming after them. This was
accomplished by appending |__| to every token rule.

This commit changes whitespace handling to be more explicit. Tokens no
longer consume whitespace coming after them and syntactic rules have to
cope with it. While this slightly complicates the syntactic grammar, I
think it's a cleaner way. Moreover, it is what JavaScript example
grammar does.

One small side-effect of thich change is that the grammar is now
stand-alone (it doesn't require utils.js anymore).
11 years ago