Labels in expressions like "(a:"a")" or "(a:"a" b:"b" c:"c")" were
visible to the outside despite being wrapped in parens. This commit
makes them invisible, as they should be.
Note this required introduction of a new "group" AST node, whose purpose
is purely to provide label scope isolation. This was necessary because
"label" and "sequence" nodes don't (and can't!) provide this isolation
themselves.
Part of a fix of #396.
Semantic predicate and action specs which verified label scope didn't
exercise labels in subexpressions. This commit adds cases exercising
them, including few commented-out cases which reveal #396 (these will be
uncommented when the bug gets fixed).
Note that added specs exercise all relevant expression types. This is
needed because code that makes subexpression labels invisible to the
outside is separate for each expression type so one generic test
wouldn't generate enough coverage.
Part of a fix of #396.
So far, semantic predicate and action specs which verified scope of
labels from containing or outer sequences exercised only cases where
label variables were defined. This commit adds also some negative cases.
The idea comes from @Mingun.
Semantic predicate and action specs which verified scope of labels from
outer sequences exercised all relevant expression types. This is not
needed as the behavior is common for all expression types and no extra
code needs to be written to make it work for each of them.
This commit changes the specs to verify scope of labels from outer
sequences using only one expression type.
Semantic predicate specs which verified scope of labels from containing
sequences used 3 elements where 1 is enough.
This commit removes the redundant elements.
Semantic predicate and action specs which verified label scope used
repetitive "it" blocks. Rewrite them to use just one "it" block and a
list of testcases. This makes them more concise.
Semantic predicate specs which verified label scope also checked parser
results. This is not necessary because for the purpose of these specs it
is enough to verify that label variables have correct values, which is
done in predicate code already.
This commit removes parser result checks from these specs.
The "unescaped" rule was created by mechanically translating original
RFC 7159 rule:
unescaped = %x20-21 / %x23-5B / %x5D-10FFFF
into:
unescaped = [\x20-\x21\x23-\x5B\x5D-\u10FFFF]
However, this mechanical translation was incorrect as PEG.js grammars
don't have 6-digit Unicode escape sequences. Sequence "\u10FFFF" was
interpreted as "\u10FF" followed by two "F" characters.
This commit rewrites the "unescaped" rule into a form which, while not
being a mechanical translation of the original rule, matches the same
characters in the whole Unicode range. It also macthes textual
description of string representation in RFC 7159:
All Unicode characters may be placed within the quotation marks,
except for the characters that must be escaped: quotation mark,
reverse solidus, and the control characters (U+0000 through U+001F).
Fixes#417.
Running bin/pegjs with one argument which was an extension-less file
name caused the file to be overwritten. This was because internal
extension rewriting logic didn't handle this case corectly.
This commit changes the logic from regexp-based to path.extname-based,
fixing the problem. The new code generates file names like this:
Input file name Output file name
------------------------------------
grammar.ext grammar.js
grammar.ext1.ext2 grammar.ext1.js
grammar. grammar.js
grammar grammar.js
Fixes#405.
Before this commit, generated parsers considered the following character
sequences as newlines:
Sequence Description
------------------------------
"\n" Unix
"\r" Old Mac
"\r\n" Windows
"\u2028" line separator
"\u2029" paragraph separator
This commit limits the sequences only to "\n" and "\r\n". The reason is
that nobody uses Unicode newlines or "\r" in practice.
A positive side effect of the change is that newline-handling code
became simpler (and likely faster).
Instead of setting ESLint environment to "node" globally, set it on
per-directory basis using separate .eslintrc.json files:
Directory Environment
-----------------------
bin node
lib commonjs
spec jasmine
It was impossible to use this approach for the "benchmark" directory
which contains a mix of files used in various environments. For
benchmark/run, the environment is set inline. For the other files, as
well as spec/helpers.js, the globals are declared manually (it is
impossible to express how these files are used just by a list of
environments).
Fixes#408.
Fix the following errors:
31:9 error "parser" is defined but never used no-unused-vars
406:14 error "expected" is defined but never used no-unused-vars
1304:15 error "s1" is defined but never used no-unused-vars
1386:15 error "s1" is defined but never used no-unused-vars
1442:15 error "s1" is defined but never used no-unused-vars
Fix the following errors:
59:35 error "options" is defined but never used no-unused-vars
91:11 error "plugin" is defined but never used no-unused-vars
102:35 error "options" is defined but never used no-unused-vars
128:35 error "options" is defined but never used no-unused-vars
Note that ESLint revealed a real problem where the test supposedly
verifying receiving options by a plugin didn't actually verify anything.
Implement the swap and change various directives in the source code. The
"make hint" target becomes "make lint".
The change leads to quite some errors being reported by ESLint. These
will be fixed in subsequent commits.
Note the configuration enables just the recommended rules. Later I plan
to enable more rules to enforce the coding standard. The configuration
also sets the environment to "node", which is far from ideal as the
codebase contains a mix of CommonJS, Node.js and browser code. I hope to
clean this up at some point.
It turns out that OS X doesn't support long options for uname and it
doesn't support -o/--operating-system at all. Let's tweak uname's
options into something POSIX-compatible which still gives reasonable
results.
The new "uname -mrs" call results in the following:
OS uname -mrs
-----------------------------------------------
OS X Mavericks Darwin 15.2.0 x86_64
Ubuntu 14.04 Linux 3.13.0-32-generic x86_64
Without this, shell's printf is unreliable. For example, on OSX with
cs_CZ.UTF-8 locale it complained about number formatting:
tools/impact: line 51: printf: .0300: invalid number
Do not test under io.js (which is no longer a thing), test under Node.js
4.0.x and 5.0.x.
Zero minor versions are intentional, I wan't to make sure that PEG.js
doesn't depend on any features added later in the 4.x and 5.x cycle.
The expectation deduplication algorithm called |Array.prototype.splice|
to eliminate each individual duplication, which was slow. This caused
problems with grammar/input combinations that generated a lot of
expecations (see #377 for an example).
This commit replaces the algorithm with much faster one, eliminating the
problem.
In the past year I worked on various grammars where first/rest or
head/tail were used as labels for parts of lists. I found I associate
head/tail with a list immediately, while in case of first/rest I have to
"parse" grammar rules for a while before understanding their structure.
Moreover, I tend to assume that rest is a list of the same thigs as
first, but I don't have such assumption in case of head/tail. This
assumption was in conflict with the grammar structure.
I'm not sure how much these observations are applicable to others, but I
decided to act on them and switch from first/rest to head/tail.
In the past year I worked on various grammars where first/rest or
head/tail were used as labels for parts of lists. I found I associate
head/tail with a list immediately, while in case of first/rest I have to
"parse" grammar rules for a while before understanding their structure.
Moreover, I tend to assume that rest is a list of the same thigs as
first, but I don't have such assumption in case of head/tail. This
assumption was in conflict with the grammar structure.
I'm not sure how much these observations are applicable to others, but I
decided to act on them and switch from first/rest to head/tail.
The arithmetics example grammar is the first thing everyone sees in the
online editor at the PEG.js website, but it begins with a complicated
|combine| function in the initializer. Without understanding it it is
impossible to understand code in the actions. This may be a barrier to
learning how PEG.js works.
This commit removes the |combine| function and gets rid of the whole
initializer, removing the learning obstacle and streamlining action
code. The only cost is a slight code duplication.