860 commits

Author SHA1 Message Date
David Majda f457c41dd4 Declare the |j| variable before use in lib/utils/arrays.js
Until now it was inadvertently global.
2014-05-17 08:00:21 +02:00
David Majda 24394e3f91 Fix comment alignment in lib/compiler/passes/generate-javascript.js 2014-05-17 07:54:16 +02:00
David Majda 2b06476c69 Regenerate lib/parser.js after bytecode changes 2014-05-16 13:34:53 +02:00
David Majda dad1207c46 Improve semantics of the TEXT bytecode instruction
The TEXT instruction now replaces position at the top of the stack with
the input from that position until the current position. This is simpler
and cleaner semantics than the previous one, where TEXT also popped an
additional value from the stack and kept the position there.
2014-05-16 13:30:03 +02:00
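The new TEXT semantics described in the commit above can be illustrated with a toy stack machine. This is only a sketch under assumptions: `execText` and the `input`, `currPos` and `stack` fields are hypothetical names chosen for illustration, not the code PEG.js actually generates.

```javascript
// Toy model of the new TEXT semantics: the saved position at the top of
// the stack is replaced by the input text from that position up to the
// current position. Nothing else on the stack is touched.
function execText(state) {
  const startPos = state.stack.pop();
  state.stack.push(state.input.substring(startPos, state.currPos));
  return state;
}

// Hypothetical state: position 0 was saved earlier, parsing is now at 5.
const state = { input: "hello world", currPos: 5, stack: ["below", 0] };
execText(state);
console.log(state.stack); // [ 'below', 'hello' ]
```

Under the previous semantics the instruction would also have consumed the value below the position, which is what the commit message describes as less clean.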
David Majda a815a8b902 Implement additional PUSH_* bytecode instructions
Implement the following bytecode instructions:

  * PUSH_UNDEFINED
  * PUSH_NULL
  * PUSH_FAILED
  * PUSH_EMPTY_ARRAY

These instructions push simple JavaScript values to the stack directly,
without going through constants. This makes the bytecode slightly
shorter and the bytecode generator somewhat simpler.

Also note that PUSH_EMPTY_ARRAY allows us to avoid a hack which protects
the [] constant from modification.
2014-05-16 13:28:29 +02:00
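A minimal sketch of how such PUSH_* instructions might be interpreted. The opcode numbers, the `FAILED` sentinel and the `run` function are assumptions made for illustration; the real PEG.js opcodes and generated code may differ.

```javascript
// Hypothetical opcode values and failure sentinel (not PEG.js's own).
const FAILED = {};
const op = {
  PUSH_UNDEFINED: 0,
  PUSH_NULL: 1,
  PUSH_FAILED: 2,
  PUSH_EMPTY_ARRAY: 3
};

// Interpret a flat list of opcodes and return the resulting stack.
function run(bytecode) {
  const stack = [];
  for (const instr of bytecode) {
    switch (instr) {
      case op.PUSH_UNDEFINED:   stack.push(undefined); break;
      case op.PUSH_NULL:        stack.push(null);      break;
      case op.PUSH_FAILED:      stack.push(FAILED);    break;
      case op.PUSH_EMPTY_ARRAY: stack.push([]);        break; // fresh array every time
      default: throw new Error("Unknown instruction: " + instr);
    }
  }
  return stack;
}

const result = run([op.PUSH_EMPTY_ARRAY, op.PUSH_EMPTY_ARRAY, op.PUSH_NULL]);
result[0].push("x"); // mutating one pushed array leaves the other untouched
```

Because each PUSH_EMPTY_ARRAY creates a new array, no shared `[]` constant exists that would need protecting from modification, which is presumably the hack the commit refers to.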
David Majda c6f0818d49 Use sentence case consistently in {spec,benchmark}/README.md headers 2014-05-10 16:40:39 +02:00
David Majda 4d456402be Small cleanup of benchmark/index.js
Update coding style to match the rest of PEG.js.
2014-05-10 16:22:41 +02:00
David Majda 811a5c0f01 Small cleanup of benchmark/runner.js
Update coding style to match the rest of PEG.js.
2014-05-10 16:20:12 +02:00
David Majda b901a5c37a Rewrite benchmark/README.md
More clarity, better grammar (hopefully).
2014-05-09 15:09:54 +02:00
David Majda f102814998 Rewrite spec/README.md
More clarity, better grammar (hopefully).
2014-05-09 15:09:53 +02:00
David Majda fc1d54d049 Convert benchmark/README to Markdown
It will look nicer on GitHub.
2014-05-09 14:41:00 +02:00
David Majda 24e1644c58 Convert spec/README to Markdown
It will look nicer on GitHub.
2014-05-09 14:37:05 +02:00
David Majda 85c8f386c1 Formatting 2014-05-09 13:40:50 +02:00
David Majda f03ba4bf4f generate-javascript.js: s/generateJavaScript/generateJavascript/
This brings the variable name in sync with the pass name in lib/compiler.js.
2014-05-08 20:28:32 +02:00
David Majda 57f7fae684 Fix a bug in |stringEscape|
The |stringEscape| function both in lib/compiler/javascript.js and in
generated parsers didn't escape characters in the U+0100..U+017F and
U+1000..U+107F ranges.
2014-05-08 20:28:32 +02:00
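To illustrate the kind of escaping involved, here is a rough sketch of a string-escaping helper that handles code points such as U+0100 and U+1000. `stringEscapeSketch` is a hypothetical name and this is not the PEG.js implementation; it only shows the general technique of covering every range with a correctly padded escape.

```javascript
// Illustrative only: escapes backslashes, double quotes, and every
// character outside printable ASCII, padding the hex code as needed.
function stringEscapeSketch(s) {
  function hex(ch) {
    return ch.charCodeAt(0).toString(16).toUpperCase();
  }

  return s
    .replace(/\\/g, "\\\\")                                    // backslash first
    .replace(/"/g, '\\"')                                      // closing quote
    .replace(/[\x00-\x0F]/g, ch => "\\x0" + hex(ch))           // 1 hex digit
    .replace(/[\x10-\x1F\x7F-\xFF]/g, ch => "\\x" + hex(ch))   // 2 hex digits
    .replace(/[\u0100-\u0FFF]/g, ch => "\\u0" + hex(ch))       // 3 hex digits
    .replace(/[\u1000-\uFFFF]/g, ch => "\\u" + hex(ch));       // 4 hex digits
}
```

A gap between adjacent ranges in such a chain lets characters through unescaped, which is the class of bug the commit above fixes.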
David Majda 88e5f136e1 Utility functions cleanup: Cleanup lib/compiler/javascript.js 2014-05-08 20:28:32 +02:00
David Majda c1e1502d43 Utility functions cleanup: Cleanup lib/compiler/visitor.js 2014-05-08 20:28:32 +02:00
David Majda bfaad70899 Utility functions cleanup: Cleanup lib/compiler/asts.js 2014-05-08 20:28:32 +02:00
David Majda 05f97f444d Utility functions cleanup: Cleanup lib/utils/classes.js 2014-05-08 20:28:32 +02:00
David Majda 1582304f16 Utility functions cleanup: Cleanup lib/utils/objects.js 2014-05-08 20:28:32 +02:00
David Majda 50b2054fbf Utility functions cleanup: Cleanup lib/utils/arrays.js 2014-05-08 20:28:31 +02:00
David Majda 5adad3ae12 Utility functions cleanup: Split lib/utils.js
Split lib/utils.js into multiple files. Some of the functions were
generic; these were moved into files in lib/utils. Other functions were
specific to the compiler; these were moved to files in lib/compiler.

This commit only moves functions around -- there is no renaming and
cleanup performed. Both will come later.
2014-05-08 20:28:31 +02:00
David Majda ff8e877fce Change module exporting style
Modules now generally store the exported object in a named variable or
function first and only assign |module.exports| at the very end. This
differs from the style used until now, where most modules started with a
|module.exports| assignment.

I think the explicit name helps readability and understandability.
2014-05-04 14:11:44 +02:00
David Majda 11aab6374f s/head/first/ & s/tail/rest/ in a testcase
Makes the testcase in sync with example grammars.
2014-04-27 13:44:25 +02:00
David Majda d9354c4632 Standardize on 3 spaces before // comments 2014-04-27 13:42:05 +02:00
David Majda f3a83788aa Inline functions extracted just because of JSHint
Rather than extracting functions just because JSHint complained about
defining functions inside a loop, let's inline them and silence the
warning.
2014-04-27 13:31:49 +02:00
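As a rough illustration of the trade-off in the commit above: the commit message does not say how the warning was silenced, so the sketch below assumes JSHint's `loopfunc` option, and the data is made up.

```javascript
/* jshint loopfunc: true */

// The callback is defined inside the loop body. JSHint normally warns
// about this, but here it is safe: the callback runs immediately, so it
// always sees the current value of i.
var groups = [["a", "b"], ["c"]];
var lengths = [];
var i;

for (i = 0; i < groups.length; i++) {
  groups[i].forEach(function(name) {
    lengths.push(name.length + i);
  });
}
```

The alternative is to hoist the callback into a named function outside the loop, which satisfies JSHint but separates the logic from the place where it is used.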
David Majda 46ac1bf171 Wrap initializer code in generated parsers into |{...}|
Initializer code is usually indented and this indentation is carried
over to generated code. This resulted in a piece of indented code in the
middle of the parser.

This commit wraps initializer code in |{...}|, which makes indentation
in generated parsers look a bit more natural.
2014-04-27 13:17:59 +02:00
David Majda 5fd41d444b Merge pull request #252 from chunpu/patch-1
Update example arithmetics.pegjs, make it work
2014-04-21 10:38:26 +02:00
chunpu e6efe09ac3 Update example arithmetics.pegjs, make it work 2014-04-21 16:23:42 +08:00
David Majda 5a02bca34d Clarify initializer documentation
Make it clear that there is only one initializer in the whole grammar.
The previous formulation could have been understood to mean that there
can be an initializer for every rule in the grammar.

Fixes #82.
2014-04-20 13:40:24 +02:00
David Majda 39084496ca Expose the parser object in action/predicate code
The action/predicate code didn't have access to the parser object. This
was mostly a side effect of actions/predicates being implemented as nested
functions, in which |this| is a reference to the global object (an ugly
JavaScript quirk). The initializer, being implemented differently, had
access to the parser object via |this|, but this was not documented.

Because having access to the parser object can be useful, this commit
introduces a new |parser| variable which holds a reference to it, is
visible in action/predicate/initializer code, and is properly
documented.

See also:

  https://groups.google.com/forum/#!topic/pegjs/Na7YWnz6Bmg
2014-04-19 21:00:40 +02:00
David Majda c7521fb868 Mark |parse| and |SyntaxError| as internal identifiers
The |parse| function and the |SyntaxError| exception were meant as
internal, so let's mark them as such.
2014-04-19 20:57:03 +02:00
David Majda 7e3b4ec4f8 PEG.js grammar: Remove reserved word detection
This is mostly done for consistency with the JavaScript example grammar,
from which the |Identifier| rule is taken. See the previous commit for
details.
2014-04-13 20:32:39 +02:00
David Majda c13cc88262 JavaScript example: Remove reserved word detection
Reserved word detection as it was implemented in the JavaScript example
grammar had two big downsides:

  1. It required changes in ordering of choices in some rules in order
     not to trigger the detection prematurely. One of the changes was
     already implemented (in the |Statement| rule, see the diff), but
     apparently more were needed (the grammar didn't parse inputs like
     |true| or |function f() {}|). And I'm not 100% sure that would be
     the end of it (maybe deeper structural changes would be needed).

  2. It made error messages confusing. Consider the following example:

       var a = @;

     Instead of reporting:

       Expected ... but "@" found.

     the generated parser reported:

       Reserved word "var" can't be used as an identifier.

     This was because the parser parsed the statement first as
     |VariableStatement| and when this failed, it tried to parse it as
     |ExpressionStatement|, triggering the reserved word detection.

Because of these, I decided to remove reserved word detection from the
JavaScript example grammar.
2014-04-13 20:29:00 +02:00
David Majda b271d66442 JavaScript example: Fix automatic semicolon insertion
Fix parsing of inputs like this:

  foo() // comment
  bar()
2014-04-06 15:27:52 +02:00
David Majda c70c8551b4 JavaScript example: Fix parsing of statements
Fixes a problem where statements starting with a reserved word produced
errors like this:

  Reserved word "return" can't be used as an identifier.

The problem was in a wrong ordering of choices in the |Statement| rule
together with aggressive reserved word detection in the |Identifier|
rule.
2014-04-06 15:20:01 +02:00
David Majda e78ffbba9c PEG.js grammar: Improve the |Code| rule a bit
Instead of matching segments between blocks character by character,
match them as a whole. Also align the style with other similar rules
(e.g. the comment ones).
2014-04-06 15:02:51 +02:00
David Majda 64eb5faf54 PEG.js grammar: Fix line continuation handling in character classes
Before this commit, line continuations in character classes contributed
an empty string to the list of characters and character ranges matched
by a class. While this didn't lead to buggy behavior with the current
code generator, the AST was wrong and the problem could have caused bugs
later.

This commit fixes the problem.
2014-04-06 14:53:57 +02:00
David Majda 0678bd8a0c PEG.js grammar: Add missing semicolon
Fixes the following JSHint error:

  lib/parser.js: line 108, col 54, Missing semicolon.
2014-04-06 14:40:01 +02:00
David Majda 421b8d6e51 Clean up parser specs 2014-04-04 20:40:09 +02:00
David Majda cf294ef236 PEG.js grammar: Add limitations
The limitations are inherited from the JavaScript example grammar.
2014-04-04 11:25:21 +02:00
David Majda 0459ab6b37 PEG.js grammar: Formatting & comments 2014-04-04 11:25:21 +02:00
David Majda 6f2510e49e PEG.js grammar: Make rules with operators more generic 2014-04-04 11:25:21 +02:00
David Majda 45c29a886f PEG.js grammar: Extract the |SemanticPredicateExpression| rule
Semantic predicates are a kind of |PrimaryExpression|, not a kind of
|PrefixedExpression|. Therefore I extracted a rule for them and
referenced it from the |PrimaryExpression|.
2014-04-04 11:25:21 +02:00
David Majda da18f6a729 PEG.js grammar: Extract the |RuleReferenceExpression| rule
This makes the |Primary| rule a bit more tidy.
2014-04-04 11:25:21 +02:00
David Majda 8e6f98e45c PEG.js grammar: Extract the |ActionExpression| rule
Having it separated from the |SequenceExpression| rule is cleaner and
more logical.
2014-04-04 11:25:21 +02:00
David Majda 5c6f4dd38b PEG.js grammar: Append |Expression| to expression rule names
Makes the rule names a bit longer but also clearer.
2014-04-04 11:25:21 +02:00
David Majda 27c2d26203 PEG.js grammar: More JavaScript-like initializer/rule separation
Initializer and rules are now separated in a similar way as JavaScript
statements -- either by a semicolon or a line terminator, possibly with
whitespace and comments mixed in.

One consequence is that grammars like this are now illegal:

  foo = "a" bar = "b"

A semicolon needs to be inserted between the rules:

  foo = "a";bar = "b"

I consider this a good change as the now-illegal syntax was somewhat
confusing.
2014-04-04 11:25:21 +02:00
David Majda 4ce7593f5f PEG.js grammar: Extract the |AnyMatcher| rule
This makes the |Primary| rule a bit more tidy. Also, matching the |.|
character really belongs to the lexical part of the grammar, next to
literals and character classes.
2014-04-04 11:25:21 +02:00
David Majda c0df01b092 PEG.js grammar: Improve code block handling
  * Rename the |Action| rule to |CodeBlock| (it better describes what
    the rule matches).

  * Implement the rule in a simpler way and move it after more basic
    lexical elements.
2014-04-04 11:25:21 +02:00