209 commits

Author SHA1 Message Date
Futago-za Ryuu 616749377b Fix IE11 Support (#583)
- Revert ES6 changes to arithmetics.pegjs
- Use Array#forEach instead of for..of
- Don't use native Array#find & Array#findIndex
- Added util/arrays.js (find & findIndex)
- Use Function instead of eval
2018-09-25 17:59:38 +01:00
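The helpers named in this commit are not shown in the log. As a rough sketch of what ES5-only replacements for `Array#find` and `Array#findIndex` can look like, and of building code with `Function` rather than `eval`, consider the following. The function names and signatures are assumptions for illustration, not the contents of util/arrays.js.

```javascript
// Hypothetical ES5-only helpers in the spirit of util/arrays.js.

// Return the index of the first element accepted by `predicate`, or -1.
function findIndex(array, predicate) {
  for (var i = 0; i < array.length; i++) {
    if (predicate(array[i], i, array)) {
      return i;
    }
  }
  return -1;
}

// Return the first element accepted by `predicate`, or undefined.
function find(array, predicate) {
  var index = findIndex(array, predicate);
  return index === -1 ? undefined : array[index];
}

// Function constructor instead of eval: the generated code sees only
// its parameters and globals, not the caller's local variables.
var add = new Function("a", "b", "return a + b;");
```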
Futago-za Ryuu e0e9fbcd30 Unicode 11 2018-09-18 07:09:11 +01:00
Futago-za Ryuu e64118f3b7 Update src/parser.pegjs
- use value plucking
- remove helpers not needed now
- types in OPS_* are now returned by *Operator
- RESERVED_WORDS is now a `Object<Identifier,true>`
- use ES2015+ JavaScript
- cleanup source code
2018-09-18 06:57:55 +01:00
Futago-za Ryuu 460f0cc5bc Implement value plucking
Resolves #235, #427, #545
2018-09-17 11:32:34 +01:00
Futago-za Ryuu 26969475f7 IdentifierName > Identifier 2018-09-15 06:41:57 +01:00
Futago-za Ryuu 8b43c8419f Use input/output from the config file 2018-09-14 03:43:09 +01:00
Futago-za Ryuu b81396a904 Regenerate parser 2018-04-03 03:17:30 +01:00
Futago-za Ryuu c2c823196f Re-generate parser 2018-03-23 04:12:19 +00:00
Futago-za Ryuu fe6f09238a Relay parser opts from peg.generate (#553) 2018-01-28 23:42:16 +00:00
Futago-za Ryuu 5476eca59f Moved AST and visitor classes 2018-01-28 02:00:28 +00:00
Futago-za Ryuu b0a5db1ab9 Expose ast classes used by parser 2018-01-26 08:24:51 +00:00
Mingun 0dab14d652 Add ability to extract comments from the grammar (#511)
All comments are stored in the `comments` property of the `grammar` node.
Comments are extracted only if the `extractComments` option is set to `true` when you generate the parser.
This property is an object that maps each comment's start offset to a comment object, which looks like:

```js
{
  text: 'text in the comment, just after // or /* and before */',
  multiline: true|false,// true for /**/ comments, false for // comments
  location: location()
}
```
2018-01-24 18:10:45 +00:00
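To make the offset-to-comment mapping concrete, here is a minimal sketch that walks such an object in source order. The sample `comments` value and the `listComments` helper are hypothetical; only the `text`, `multiline`, and `location` fields come from the commit message above.

```javascript
// Hypothetical sample shaped like the `comments` property described above:
// keys are start offsets, values are comment objects.
const comments = {
  0: { text: " line comment", multiline: false, location: null },   // location: stand-in for location()
  30: { text: " block comment ", multiline: true, location: null },
};

// List the comments ordered by their start offset.
function listComments(commentsByOffset) {
  return Object.keys(commentsByOffset)
    .map(Number)
    .sort((a, b) => a - b)
    .map(offset => ({
      offset,
      multiline: commentsByOffset[offset].multiline,
      text: commentsByOffset[offset].text.trim(),
    }));
}
```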
Futago-za Ryuu b6bc0d905e Use .js files with -c option on CLI
This commit adds support for passing '.js' files to the '-c', '--config' or '--extra-options-file' options on the CLI, allowing the developer to do some extra work before the parser is generated (if they wish), or to set options dynamically based on the environment.
2018-01-19 08:55:04 +00:00
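A '.js' config could compute its options at load time, for example from environment variables. The sketch below is an assumption about how such a file might be organised: the `buildOptions` helper, the file paths, and the choice of option names are illustrative, and exporting the result via `module.exports` is assumed rather than confirmed by this log.

```javascript
// Hypothetical helper that derives CLI options from the environment.
function buildOptions(env) {
  const production = env.NODE_ENV === "production";
  return {
    input: "src/grammar.pegjs",       // illustrative path
    output: production ? "dist/parser.js" : "build/parser.js",
    trace: !production,               // assumed option: tracing only outside production
  };
}

// In an actual config file this would presumably be exported, e.g.:
//   module.exports = buildOptions(process.env);
```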
Mingun 9b90fa1d81 Move all codegeneration from generateBytecode pass to generateJs pass (#459)
* Split 'consts' collection by content types into:

  - literals: for literal expressions, like `"a"`
  - classes: for character class expressions, like `[a]`
  - expectations: for constants describing expected values when parse failed
  - functions: for constants with function code

* Move any JavaScript code generation from 'generateBytecode' to 'generateJs'.
* Rename opcode 'MATCH_REGEXP' to 'MATCH_CLASS' (name reflects purpose, not implementation).
* Replace 'PUSH' opcode with 'PUSH_EMPTY_STRING' opcode because it is only used with empty strings.
2018-01-17 16:57:49 +00:00
Futago-za Ryuu 27ec5ed9b1 Ensure we are nearly always in fast mode on V8
See: https://stackoverflow.com/a/24989927/1518408
2018-01-16 02:57:12 +00:00
Futago-za Ryuu 93cc6c5b26 Parser calls AST node creator now
Before this commit, the PEG.js parser always created the AST using plain JavaScript objects. Although simple and effective for the job, this method sacrifices performance slightly.

From now on the parser calls a Node creator. This should help with performance, and in the future it will allow some AST helpers to be moved into the new AST functions.
2018-01-16 02:46:41 +00:00
Mingun 4cc9185a78 Improve error when reserved word used as label (#552)
Before this commit, the error looked like this (for input `start = break:'a'`):

> Expected "!", "$", "&", "(", "*", "+", ".", "/", "/*", "//", ";", "?", character class, code block, comment, end of line, identifier, literal, or whitespace but ":" found.

After this commit, the error looks like this:

> Label can't be a reserved word "break".
2018-01-07 14:51:06 +00:00
Futago-za Ryuu db70215c4a Added 'header' option (#491) 2017-12-18 00:54:47 +00:00
felix cb3c5f4473 Improve error message for unbalanced brace. (#534)
Currently, an open brace without a corresponding closing brace will emit this confusing error message:

> Expected "!", "$", "&", "(", "*", "+", ".", "/", "/*", "//", ";", "?", character class, code block, comment, end of line, identifier, literal, or whitespace but "{" found.

This change adds an error case to the grammar to make it clear what the problem is.
2017-09-15 21:36:19 +01:00
Mingun c98fee1629 Add location information to group AST node 2017-06-24 23:34:50 +05:00
David Majda 400a3cfa3c Avoid aligning object keys
The only exception left are objects representing a mapping with simple
keys and values -- essentially tables written as object literals.

See #443.
2016-09-22 07:55:30 +02:00
David Majda 12112310f2 Use only double quotes for strings
See #443
2016-09-21 15:06:56 +02:00
David Majda 6294bb5b13 Use only "//" comments
See #443.
2016-09-20 15:07:39 +02:00
David Majda 6fa8ad63f9 Replace some functions with arrow functions
Because arrow functions work rather differently than normal functions (a
bad design mistake if you ask me), I decided to be conservative with the
conversion.

I converted:

  * event handlers
  * callbacks
  * arguments to Array.prototype.map & co.
  * small standalone lambda functions

I didn't convert:

  * functions assigned to object literal properties (the new shorthand
    syntax would be better here)
  * functions passed to "describe", "it", etc. in specs (because Jasmine
    relies on dynamic "this")

See #442.
2016-09-12 16:07:43 +02:00
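The difference the message refers to is mainly how `this` is bound. A minimal sketch, using made-up names (`spec`, `regular`, `outer`), of why functions that rely on a dynamic `this` were left unconverted:

```javascript
const spec = { name: "spec" };

// Regular function: `this` is decided by the caller.
function regular() {
  return this.name;
}

// Arrow function: `this` is taken from the enclosing function and
// cannot be rebound, so the object passed to call() is ignored.
function outer() {
  const arrow = () => this.name;
  return arrow.call({ name: "ignored" });
}

const fromRegular = regular.call(spec);
const fromArrow = outer.call(spec);
```

Both calls yield the name of `spec`, but for different reasons: `regular` receives `spec` as its own `this`, while the arrow inside `outer` inherits it and disregards the object given to `call`. A test framework that supplies context through `this` needs the first behavior.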
David Majda bdf91b5941 Replace "var" with "let" & "const"
This is purely a mechanical change, not taking advantage of block scope
of "let" and "const". Minimizing variable scope will come in the next
commit.

In general, "var" is converted into "let" and "const" is used only for
immutable variables of permanent character (generally spelled in
ALL_CAPS). Using it for any immutable variable regardless of its
permanence would feel confusing.

Any code which is not transpiled and needs to run in ES6 environment
(examples, code in grammars embedded in specs, ...) is kept unchanged.
This is also true for code generated by PEG.js.

See #442.
2016-09-09 10:44:00 +02:00
David Majda 9c04c94c85 Escape vertical tab as "\v", not "\x0B"
See #441.
2016-09-01 15:03:47 +02:00
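As an illustration of the escaping choice, here is a small sketch of a string-literal escaper that emits `\v` for the vertical tab character (U+000B). The `escapeForLiteral` name and the set of characters handled are assumptions, not the project's actual escaping code.

```javascript
// Hypothetical escaper for text placed inside a double-quoted literal.
function escapeForLiteral(text) {
  return text
    .replace(/\\/g, "\\\\")  // backslash first, so later escapes are not doubled
    .replace(/"/g, "\\\"")   // closing-quote character
    .replace(/\n/g, "\\n")   // line feed
    .replace(/\v/g, "\\v");  // vertical tab as \v rather than \x0B
}
```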
David Majda 3e8bcbea73 Replace suitable for loops with Array methods (in /src)
See #441.
2016-09-01 12:55:26 +02:00
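The kind of rewrite this describes can be sketched as follows. The `rules` data is invented for illustration and is not taken from /src.

```javascript
// Illustrative data only.
const rules = [{ name: "start" }, { name: "digit" }];

// Before: an index-based loop that collects one value per element.
const namesLoop = [];
for (let i = 0; i < rules.length; i++) {
  namesLoop.push(rules[i].name);
}

// After: the same result expressed with an Array method.
const namesMap = rules.map(rule => rule.name);
```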
David Majda e7d03825e0 AST: Remove the "rawText" property from "class" nodes
It isn't used anymore.
2016-06-18 04:52:49 +02:00
David Majda 2fd77b96fc Revert "Use literal raw text in error messages"
I no longer think that using raw literal texts in error messages is the
right thing to do. The main reason is that it couples error messages
with details of the grammar such as use of single or double quotes in
literals. A better solution is coming in the next commit.

This reverts commit 69a0f769fc.
2016-05-09 15:07:44 +02:00
David Majda 0c39f1cf86 Fix labels leaking to outer scope
Labels in expressions like "(a:"a")" or "(a:"a" b:"b" c:"c")" were
visible to the outside despite being wrapped in parens. This commit
makes them invisible, as they should be.

Note this required introduction of a new "group" AST node, whose purpose
is purely to provide label scope isolation. This was necessary because
"label" and "sequence" nodes don't (and can't!) provide this isolation
themselves.

Part of a fix of #396.
2016-03-11 16:42:03 +01:00
David Majda a4a66a2e5b Switch from first/rest to head/tail in the PEG.js grammar
In the past year I worked on various grammars where first/rest or
head/tail were used as labels for parts of lists. I found I associate
head/tail with a list immediately, while in case of first/rest I have to
"parse" grammar rules for a while before understanding their structure.

Moreover, I tend to assume that rest is a list of the same things as
first, but I don't have such an assumption in case of head/tail. This
assumption was in conflict with the grammar structure.

I'm not sure how much these observations are applicable to others, but I
decided to act on them and switch from first/rest to head/tail.
2015-10-09 17:23:36 +02:00
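The mismatch described above shows up when a list is assembled in an action. For a hypothetical rule such as `head:Item tail:("," Item)*`, each element of `tail` is a pair (separator, item) rather than a bare item, so the two labels hold different things. The `buildList` helper below is an assumed illustration, not code from the grammar.

```javascript
// Combine a head element with the items found at `index` in each tail entry.
function buildList(head, tail, index) {
  return [head].concat(tail.map(element => element[index]));
}

// Each tail entry is [separator, item], so the item sits at index 1.
const items = buildList("a", [[",", "b"], [",", "c"]], 1);
```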
David Majda 69a0f769fc Use literal raw text in error messages
Fixes #127.
2015-09-18 10:56:05 -07:00
David Majda 4b154e177f Update character categories in grammars to Unicode 8.0.0 2015-08-06 16:42:26 +02:00
David Majda 89146915ce Add location information to AST nodes
This will allow adding location information to |GrammarError| exceptions
thrown in various passes.
2015-04-06 17:34:37 +02:00
David Majda 7e3b4ec4f8 PEG.js grammar: Remove reserved word detection
This is mostly done for consistency with the JavaScript example grammar,
from which the |Identifier| rule is taken. See the previous commit
for details.
2014-04-13 20:32:39 +02:00
David Majda e78ffbba9c PEG.js grammar: Improve the |Code| rule a bit
Instead of matching segments between blocks character by character,
match them as a whole. Also align the style with other similar rules
(e.g. the comment ones).
2014-04-06 15:02:51 +02:00
David Majda 64eb5faf54 PEG.js grammar: Fix line continuation handling in character classes
Before this commit, line continuations in character classes contributed
an empty string to the list of characters and character ranges matched
by a class. While this didn't lead to buggy behavior with the current
code generator, the AST was wrong and the problem could have caused bugs
later.

This commit fixes the problem.
2014-04-06 14:53:57 +02:00
David Majda 0678bd8a0c PEG.js grammar: Add missing semicolon
Fixes the following JSHint error:

  lib/parser.js: line 108, col 54, Missing semicolon.
2014-04-06 14:40:01 +02:00
David Majda cf294ef236 PEG.js grammar: Add limitations
The limitations are inherited from the JavaScript example grammar.
2014-04-04 11:25:21 +02:00
David Majda 0459ab6b37 PEG.js grammar: Formatting & comments 2014-04-04 11:25:21 +02:00
David Majda 6f2510e49e PEG.js grammar: Make rules with operators more generic 2014-04-04 11:25:21 +02:00
David Majda 45c29a886f PEG.js grammar: Extract the |SemanticPredicateExpression| rule
Semantic predicates are a kind of |PrimaryExpression|, not a kind of
|PrefixedExpression|. Therefore I extracted a rule for them and
referenced it from the |PrimaryExpression|.
2014-04-04 11:25:21 +02:00
David Majda da18f6a729 PEG.js grammar: Extract the |RuleReferenceExpression| rule
This makes the |Primary| rule a bit more tidy.
2014-04-04 11:25:21 +02:00
David Majda 8e6f98e45c PEG.js grammar: Extract the |ActionExpression| rule
Having it separated from the |SequenceExpression| rule is cleaner and
more logical.
2014-04-04 11:25:21 +02:00
David Majda 5c6f4dd38b PEG.js grammar: Append |Expression| to expression rule names
Makes the rule names a bit longer but also clearer.
2014-04-04 11:25:21 +02:00
David Majda 27c2d26203 PEG.js grammar: More JavaScript-like initializer/rule separation
Initializer and rules are now separated in a similar way as JavaScript
statements -- either by a semicolon or a line terminator, possibly with
whitespace and comments mixed in.

One consequence is that grammars like this are now illegal:

  foo = "a" bar = "b"

A semicolon needs to be inserted between the rules:

  foo = "a";bar = "b"

I consider this a good change as the now-illegal syntax was somewhat
confusing.
2014-04-04 11:25:21 +02:00
David Majda 4ce7593f5f PEG.js grammar: Extract the |AnyMatcher| rule
This makes the |Primary| rule a bit more tidy. Also, matching the |.|
character really belongs to the lexical part of the grammar, next to
literals and character classes.
2014-04-04 11:25:21 +02:00
David Majda c0df01b092 PEG.js grammar: Improve code block handling
  * Rename the |Action| rule to |CodeBlock| (it better describes what
    the rule matches).

  * Implement the rule in a simpler way and move it after more basic
    lexical elements.
2014-04-04 11:25:21 +02:00
David Majda 13f72bb19d PEG.js grammar: More JavaScript-like rules for identifiers
This change has two side effects:

  * Label names can no longer be JavaScript reserved words.

  * |$| is allowed again in label names. However, because of the
    preference rules, names starting with it will usually be parsed as a
    text operator followed by another identifier (denoting a rule
    reference or label name).
2014-04-04 11:25:21 +02:00
David Majda 0d6b91cb20 PEG.js grammar: More JavaScript-like rules for strings/literals/classes 2014-04-04 11:25:20 +02:00