pegjs

Commit Graph

Author	SHA1	Message	Date
David Majda	de1704f007	Replace \|util.{puts,error}\| by \|console.{log,error}\| The \|util.puts\| and \|util.error\| functions are deprecated in Node.js 0.12.x. Based on a pull request by Jan Stránský (@burningtree): https://github.com/pegjs/pegjs/pull/334	9 years ago
Arlo Breault	12c169e7b5	Convert PEG.js code to strict mode * Issues #323	10 years ago
David Majda	065f4e1b75	Improve location info in syntax errors Replace \|line\|, \|column\|, and \|offset\| properties of \|SyntaxError\| with the \|location\| property. It contains an object similar to the one returned by the \|location\| function available in action code: { start: { offset: 23, line: 5, column: 6 }, end: { offset: 25, line: 5, column: 8 } } For syntax errors produced in the middle of the input, \|start\| refers to the first unparsed character and \|end\| refers to the character behind it (meaning the span is 1 character). This corresponds to the portion of the input in the \|found\| property. For syntax errors produced at the end of the input, both \|start\| and \|end\| refer to a character past the end of the input (meaning the span is 0 characters). For syntax errors produced by calling \|expected\| or \|error\| functions in action code the location info is the same as the \|location\| function would return.	10 years ago
David Majda	da57118a43	Implement basic support for tracing Parsers can now be generated with support for tracing using the --trace CLI option or a boolean \|trace\| option to \|PEG.buildParser\|. This makes them trace their progress, which can be useful for debugging. Parsers generated with tracing support are called "tracing parsers". When a tracing parser executes, by default it traces the rules it enters and exits by writing messages to the console. For example, a parser built from this grammar: start = a / b a = "a" b = "b" will write this to the console when parsing input "b": 1:1 rule.enter start 1:1 rule.enter a 1:1 rule.fail a 1:1 rule.enter b 1:2 rule.match b 1:2 rule.match start You can customize tracing by passing a custom tracer to parser's \|parse\| method using the \|tracer\| option: parser.parse(input, { tracer: tracer }); This will replace the built-in default tracer (which writes to the console) by the tracer you supplied. The tracer must be an object with a \|trace\| method. This method is called each time a tracing event happens. It takes one argument which is an object describing the tracing event. Currently, three events are supported: * rule.enter -- triggered when a rule is entered * rule.match -- triggered when a rule matches successfully * rule.fail -- triggered when a rule fails to match These events are triggered in nested pairs -- for each rule.enter event there is a matching rule.match or rule.fail event. The event object passed as an argument to \|trace\| contains these properties: * type -- event type * rule -- name of the rule the event is related to * offset -- parse position at the time of the event * line -- line at the time of the event * column -- column at the time of the event * result -- rule's match result (only for rule.match event) The whole tracing API is somewhat experimental (which is why it isn't documented properly yet) and I expect it will evolve over time as experience is gained.
The default tracer is also somewhat bare-bones. I hope that PEG.js user community will develop more sophisticated tracers over time and I'll be able to integrate their best ideas into the default tracer.	10 years ago
David Majda	95fd64ec15	.jshintrc: Add the "forin" option & fix fallout Also added a few missing \|hasOwnProperty\| calls that JSHint didn't detect because it only checks whether there is an \|if\| statement wrapping the loop body.	11 years ago
David Majda	f22d7aabb5	Fix JSHint errors in bin/pegjs Fixes the following JSHint errors: bin/pegjs: line 66, col 14, 'extraOptions' used out of scope. bin/pegjs: line 70, col 19, 'extraOptions' used out of scope. bin/pegjs: line 71, col 20, 'extraOptions' used out of scope. bin/pegjs: line 80, col 10, Wrap the /regexp/ literal in parens to disambiguate the slash operator. bin/pegjs: line 128, col 43, Missing semicolon. bin/pegjs: line 128, col 45, Don't make functions within a loop. bin/pegjs: line 150, col 13, Redefinition of 'module'. bin/pegjs: line 217, col 34, Expected '===' and instead saw '=='. bin/pegjs: line 243, col 44, 'source' used out of scope. bin/pegjs: line 243, col 61, 'source' used out of scope.	11 years ago
David Majda	851681d663	Implement the --extra-options and --extra-options-file options These are mainly useful to pass additional options to plugins.	12 years ago
David Majda	d013016717	bin/pegjs: Fix help wrapping All help text should be wrapped at column 80.	12 years ago
David Majda	2dc39bb779	bin/pegjs: Output just the parser source if --export-var is empty This will make embedding generated parsers into other files easier. Based on a patch by Glen Huang: https://github.com/dmajda/pegjs/pull/143	12 years ago
David Majda	e1af175af8	Plugin API: Implement the --plugin option Implements part of GH-106.	12 years ago
David Majda	fe1ca481ab	Code generator rewrite This is a complete rewrite of the PEG.js code generator. Its goals are: 1. Allow optimizing the generated parser code for code size as well as for parsing speed. 2. Prepare ground for future optimizations and big features (like incremental parsing). 3. Replace the old template-based code-generation system with something more lightweight and flexible. 4. General code cleanup (structure, style, variable names, ...). New Architecture ---------------- The new code generator consists of two steps: * Bytecode generator -- produces bytecode for an abstract virtual machine * JavaScript generator -- produces JavaScript code based on the bytecode The abstract virtual machine is stack-based. Originally I wanted to make it register-based, but it turned out that all the code related to it would be more complex and the bytecode itself would be longer (because of explicit register specifications in instructions). The only downsides of the stack-based approach seem to be a few small inefficiencies (see e.g. the \|NIP\| instruction), which seem to be insignificant. The new generator allows optimizing for parsing speed or code size (you can choose using the \|optimize\| option of the \|PEG.buildParser\| method or the --optimize/-o option on the command-line). When optimizing for size, the JavaScript generator emits the bytecode together with its constant table and a generic bytecode interpreter. Because the interpreter is small and the bytecode and constant table grow only slowly with size of the grammar, the resulting parser is also small. When optimizing for speed, the JavaScript generator just compiles the bytecode into JavaScript. The generated code is relatively efficient, so the resulting parser is fast. Internal Identifiers -------------------- As a small bonus, all internal identifiers visible to user code in the initializer, actions and predicates are prefixed by \|peg$\|.
This lowers the chance that identifiers in user code will conflict with the ones from PEG.js. It also makes using any internals in user code ugly, which is a good thing. This solves GH-92. Performance ----------- The new code generator improved parsing speed and parser code size significantly. The generated parsers are now: * 39% faster when optimizing for speed * 69% smaller when optimizing for size (without minification) * 31% smaller when optimizing for size (with minification) (Parsing speed was measured using the \|benchmark/run\| script. Code size was measured by generating parsers for examples in the \|examples\| directory and adding up the file sizes. Minification was done by \|uglify --ascii\| in version 1.3.4.) Final Note ---------- This is just a beginning! The new code generator lays a foundation upon which many optimizations and improvements can (and will) be made. Stay tuned :-)	12 years ago
David Majda	3333cdd18d	Position tracking: Kill the \|trackLineAndColumn\| option Getting rid of the \|trackLineAndColumn\| option simplifies the code generator (by unifying two paths in the code). The \|line\| and \|column\| functions currently always compute all the position info from scratch, which is horribly inefficient. This will be improved in later commit(s).	12 years ago
David Majda	05a6bad989	Kill the \|toSource\| method, introduce the \|output\| option Before this commit, \|PEG.buildParser\| always returned a parser object. The only way to get its source code was to call the \|toSource\| method on it. While this method worked for parsers produced by \|PEG.buildParser\| directly, it didn't work for parsers instantiated by executing their source code. In other words, it was unreliable. This commit removes the \|toSource\| method on generated parsers and introduces a new \|output\| option to \|PEG.buildParser\|. It allows callers to specify whether they want to get back the parser object (\|options.output === "parser"\|) or its source code (\|options.output === "source"\|). This is a much better and more reliable API.	12 years ago
David Majda	208cc33930	Allowed start rules must be specified explicitly Before this commit, generated parsers were able to start parsing from any rule. This was nice, but it made rule code inlining impossible. Since this commit, the list of allowed start rules has to be specified explicitly using the \|allowedStartRules\| option of the \|PEG.buildParser\| method (or the --allowed-start-rule option on the command-line). These rules will be excluded from inlining when it's implemented.	12 years ago
David Majda	8f71c07cec	Implement the "--cache" command-line option	13 years ago
David Majda	58cc5b739d	Implement "--track-line-and-column" command-line option	13 years ago
David Majda	a0898388fb	/bin/pegjs: Avoid calling \|process.openStdin\| While \|process.openStdin\| is not officially deprecated, it's no longer documented and just using \|process.stdin\| and resuming it seems to be the official way.	13 years ago
David Majda	de256105eb	/bin/pegjs: Don't close standard output Avoids "Error: process.stdout cannot be closed" error when invoked without file arguments.	13 years ago
David Majda	fb5028eb90	Use \|util\| module instead of \|sys\| \|sys\| emits a warning in Node.js 0.6.x.	13 years ago
David Majda	c90e7f369b	Fix regexp for detecting command-line options in /bin/pegjs Closes GH-51.	13 years ago
David Majda	dcf904c392	bin/pegjs: Default parser variable name is "module.exports" The previous default name was "exports.parser". This meant that to use the generated parser in Node.js, you had to use code like this: var parser = require("./my-cool-parser").parser; parser.parse(...); Now you can shorten it a bit: var parser = require("./my-cool-parser"); parser.parse(...); The shorter version makes sense since no other objects except the parser are exported from the module.	14 years ago
David Majda	d5caaa7877	Nicer messages in command-line mode on read/write errors	14 years ago
David Majda	957b96c1b5	Add check for missing parameter of the -e/--export-var option.	14 years ago
David Majda	d0c074e2f8	Small style fixes	14 years ago
David Majda	814ce7d9db	Switch command-line mode backend from Rhino to Node	14 years ago
David Majda	4d68812b65	Fix usage description	14 years ago
David Majda	977d1d20c7	Fix wrong version reported by "bin/pegjs --version" DRY: Now the version is stored only in the VERSION file.	14 years ago
David Majda	a12a24fca1	Make parsers generated by /bin/pegjs CommonJS modules by default	14 years ago
David Majda	e59f3ba338	Split the source code into several files, introduce build system The source code is now in the src directory. The library needs to be built using "rake", which creates the lib/peg.js file by combining the source files.	14 years ago
David Majda	917cf1cf2a	Start rule of the grammar is now implicitly its first rule Before this change, the start rule was the one named "start" and there was an option to override that. This is now impossible. The goal of this change is to contain all information for the parser generation in the grammar itself. In the future, some override directive for the start rule (like Bison's "%start") may be added to the grammar.	15 years ago
David Majda	81eced29b2	Whitespace fixes	15 years ago
David Majda	08635b658b	Make bin/pegjs work when called via a symlink Similar issue exists on Windows too (they have symlinks since Vista), but I could not find how to dereference symlinks from batch files, so I did not fix it. I guess this does not matter much given how little the symlinks are used in the Windows world. Closes #1.	15 years ago
David Majda	e63f64a3d5	Make the generated parsers standalone (no runtime is required). This also speeds up the benchmark suite execution by 7.83 % on V8. Detailed results (benchmark suite totals): --------------------------------- Test # Before After --------------------------------- 1 26.17 kB/s 28.16 kB/s 2 26.05 kB/s 28.16 kB/s 3 25.99 kB/s 28.10 kB/s 4 26.13 kB/s 28.11 kB/s 5 26.14 kB/s 28.07 kB/s --------------------------------- Average 26.10 kB/s 28.14 kB/s --------------------------------- Mozilla/5.0 (X11; U; Linux i686; en-US) AppleWebKit/533.2 (KHTML, like Gecko) Chrome/5.0.342.7 Safari/533.2	15 years ago
David Majda	d3104742d9	Fixed --start vs. --start-rule inconsistency between help and actual option processing code.	15 years ago
David Majda	a43d1b33e3	Bootstrapped the grammar parser, yay! I should have done this long ago.	15 years ago
David Majda	0a5788b50e	Fixed typo in help: "parserVar" -> "parser_var".	15 years ago
David Majda	c3dd696a3e	Initial commit.	15 years ago

37 Commits (69a0f769fc1e3cd751affce198a8248cda2859c2)