Implement basic support for tracing

Parsers can now be generated with support for tracing using the --trace CLI
option or a boolean |trace| option to |PEG.buildParser|. This makes them trace
their progress, which can be useful for debugging. Parsers generated with
tracing support are called "tracing parsers".

When a tracing parser executes, by default it traces the rules it enters and
exits by writing messages to the console. For example, a parser built from
this grammar:

    start = a / b
    a = "a"
    b = "b"

will write this to the console when parsing input "b":

    1:1 rule.enter start
    1:1 rule.enter a
    1:1 rule.fail a
    1:1 rule.enter b
    1:2 rule.match b
    1:2 rule.match start

You can customize tracing by passing a custom *tracer* to the parser's |parse|
method using the |tracer| option:

    parser.parse(input, { tracer: tracer });

This will replace the built-in default tracer (which writes to the console)
with the tracer you supplied.

The tracer must be an object with a |trace| method. This method is called each
time a tracing event happens. It takes one argument, which is an object
describing the tracing event.

Currently, three events are supported:

  * rule.enter -- triggered when a rule is entered
  * rule.match -- triggered when a rule matches successfully
  * rule.fail  -- triggered when a rule fails to match

These events are triggered in nested pairs -- for each rule.enter event there
is a matching rule.match or rule.fail event.

The event object passed as an argument to |trace| contains these properties:

  * type   -- event type
  * rule   -- name of the rule the event is related to
  * offset -- parse position at the time of the event
  * line   -- line at the time of the event
  * column -- column at the time of the event
  * result -- rule's match result (only for rule.match event)

The whole tracing API is somewhat experimental (which is why it isn't
documented properly yet) and I expect it will evolve over time as experience
is gained.

The default tracer is also somewhat bare-bones. I hope that the PEG.js user
community will develop more sophisticated tracers over time and I'll be able
to integrate their best ideas into the default tracer.
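
As a rough illustration of the tracer interface described above, here is a
minimal sketch of a custom tracer that indents events by nesting depth. The
names |createIndentingTracer| and |lines| are hypothetical and not part of
PEG.js. To stay self-contained, the sketch replays the six events from the
example above by hand instead of running a real tracing parser.

```javascript
// Hypothetical helper (not part of PEG.js): builds an object with the
// |trace| method that the tracing API expects.
function createIndentingTracer() {
  const lines = [];
  let depth = 0;

  return {
    lines,
    trace(event) {
      // rule.match and rule.fail close the rule opened by rule.enter.
      if (event.type !== "rule.enter") { depth--; }

      lines.push(
        "  ".repeat(depth)
          + event.line + ":" + event.column + " "
          + event.type + " " + event.rule
      );

      if (event.type === "rule.enter") { depth++; }
    }
  };
}

// Hand-written replay of the events from the commit message example
// (grammar: start = a / b, input: "b").
const tracer = createIndentingTracer();
[
  { type: "rule.enter", rule: "start", offset: 0, line: 1, column: 1 },
  { type: "rule.enter", rule: "a",     offset: 0, line: 1, column: 1 },
  { type: "rule.fail",  rule: "a",     offset: 0, line: 1, column: 1 },
  { type: "rule.enter", rule: "b",     offset: 0, line: 1, column: 1 },
  { type: "rule.match", rule: "b",     offset: 1, line: 1, column: 2, result: "b" },
  { type: "rule.match", rule: "start", offset: 1, line: 1, column: 2, result: "b" }
].forEach((event) => tracer.trace(event));

console.log(tracer.lines.join("\n"));
```

With a real tracing parser, an object like this would be handed to |parse| as
described above, and the parser would call |trace| itself.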
Code generator rewrite

This is a complete rewrite of the PEG.js code generator. Its goals are:

  1. Allow optimizing the generated parser code for code size as well as for
     parsing speed.

  2. Prepare ground for future optimizations and big features (like
     incremental parsing).

  3. Replace the old template-based code-generation system with something
     more lightweight and flexible.

  4. General code cleanup (structure, style, variable names, ...).

New Architecture
----------------

The new code generator consists of two steps:

  * Bytecode generator -- produces bytecode for an abstract virtual machine

  * JavaScript generator -- produces JavaScript code based on the bytecode

The abstract virtual machine is stack-based. Originally I wanted to make it
register-based, but it turned out that all the code related to it would be
more complex and the bytecode itself would be longer (because of explicit
register specifications in instructions). The only downsides of the
stack-based approach seem to be a few small inefficiencies (see e.g. the |NIP|
instruction), which seem to be insignificant.

The new generator allows optimizing for parsing speed or code size (you can
choose using the |optimize| option of the |PEG.buildParser| method or the
--optimize/-o option on the command-line).

When optimizing for size, the JavaScript generator emits the bytecode together
with its constant table and a generic bytecode interpreter. Because the
interpreter is small and the bytecode and constant table grow only slowly with
the size of the grammar, the resulting parser is also small.

When optimizing for speed, the JavaScript generator just compiles the bytecode
into JavaScript. The generated code is relatively efficient, so the resulting
parser is fast.

Internal Identifiers
--------------------

As a small bonus, all internal identifiers visible to user code in the
initializer, actions and predicates are prefixed by |peg$|. This lowers the
chance that identifiers in user code will conflict with the ones from PEG.js.
It also makes using any internals in user code ugly, which is a good thing.
This solves GH-92.

Performance
-----------

The new code generator improved parsing speed and parser code size
significantly. The generated parsers are now:

  * 39% faster when optimizing for speed

  * 69% smaller when optimizing for size (without minification)

  * 31% smaller when optimizing for size (with minification)

(Parsing speed was measured using the |benchmark/run| script. Code size was
measured by generating parsers for examples in the |examples| directory and
adding up the file sizes. Minification was done by |uglify --ascii| in version
1.3.4.)

Final Note
----------

This is just a beginning! The new code generator lays a foundation upon which
many optimizations and improvements can (and will) be made.

Stay tuned :-)
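
To make the size-optimized design a little more concrete, here is a toy sketch
of the general idea: a constant table, a flat bytecode array, and a small
generic stack-based interpreter. Everything in it is invented for
illustration. The opcodes (|MATCH|, |JUMP_IF_OK|, |POP|), the |FAILED| marker
and the |interpret| function are assumptions and are not PEG.js's actual
instruction set or generated code.

```javascript
// Toy illustration only: NOT the real PEG.js bytecode or interpreter.
const FAILED = {};                                // hypothetical failure marker
const OP = { MATCH: 0, JUMP_IF_OK: 1, POP: 2 };   // hypothetical opcodes

// Generic interpreter: the same function runs any bytecode/constant table.
function interpret(bytecode, consts, input) {
  const stack = [];
  let pos = 0;   // current parse position in the input
  let ip = 0;    // instruction pointer into the bytecode

  while (ip < bytecode.length) {
    switch (bytecode[ip]) {
      case OP.MATCH: {
        // Operand: index into the constant table.
        const literal = consts[bytecode[ip + 1]];
        if (input.startsWith(literal, pos)) {
          stack.push(literal);
          pos += literal.length;
        } else {
          stack.push(FAILED);
        }
        ip += 2;
        break;
      }

      case OP.JUMP_IF_OK:
        // Operand: jump target, taken if the top of the stack is a match.
        ip = stack[stack.length - 1] !== FAILED ? bytecode[ip + 1] : ip + 2;
        break;

      case OP.POP:
        stack.pop();
        ip += 1;
        break;

      default:
        throw new Error("Unknown opcode: " + bytecode[ip]);
    }
  }

  const result = stack.pop();
  return result === FAILED ? null : result;
}

// Bytecode for a rule equivalent to: start = "a" / "b"
const consts = ["a", "b"];
const program = [
  OP.MATCH, 0,        // 0: try "a"
  OP.JUMP_IF_OK, 7,   // 2: on success, skip the second alternative
  OP.POP,             // 4: drop the failure marker
  OP.MATCH, 1         // 5: try "b"
];                    // 7: end of program

console.log(interpret(program, consts, "b"));
```

In this sketch a larger grammar only adds entries to |consts| and |program|
while |interpret| stays the same, which is the property the size optimization
described above relies on. The toy does not check that the whole input was
consumed.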
Fix ESLint errors in bin/pegjs

Fix the following errors:

   12:3  error  Unexpected console statement       no-console
   16:3  error  Unexpected console statement       no-console
   17:3  error  Unexpected console statement       no-console
   18:3  error  Unexpected console statement       no-console
   19:3  error  Unexpected console statement       no-console
   20:3  error  Unexpected console statement       no-console
   21:3  error  Unexpected console statement       no-console
   22:3  error  Unexpected console statement       no-console
   23:3  error  Unexpected console statement       no-console
   24:3  error  Unexpected console statement       no-console
   25:3  error  Unexpected console statement       no-console
   26:3  error  Unexpected console statement       no-console
   27:3  error  Unexpected console statement       no-console
   28:3  error  Unexpected console statement       no-console
   29:3  error  Unexpected console statement       no-console
   30:3  error  Unexpected console statement       no-console
   31:3  error  Unexpected console statement       no-console
   32:3  error  Unexpected console statement       no-console
   33:3  error  Unexpected console statement       no-console
   34:3  error  Unexpected console statement       no-console
   35:3  error  Unexpected console statement       no-console
   36:3  error  Unexpected console statement       no-console
   37:3  error  Unexpected console statement       no-console
   38:3  error  Unexpected console statement       no-console
   39:3  error  Unexpected console statement       no-console
   40:3  error  Unexpected console statement       no-console
   41:3  error  Unexpected console statement       no-console
   42:3  error  Unexpected console statement       no-console
   43:3  error  Unexpected console statement       no-console
   44:3  error  Unexpected console statement       no-console
   56:3  error  Unexpected console statement       no-console
  232:9  error  "inputStream" is already defined   no-redeclare
  240:9  error  "outputStream" is already defined  no-redeclare
#!/usr/bin/env node

"use strict";

let fs = require("fs");
let path = require("path");
let peg = require("../lib/peg");

/* Helpers */

function printVersion() {
  console.log("PEG.js " + peg.VERSION);
}

function printHelp() {
  console.log("Usage: pegjs [options] [--] [<input_file>]");
  console.log("");
  console.log("Options:");
  console.log("      --allowed-start-rules <rules>  comma-separated list of rules the generated");
  console.log("                                     parser will be allowed to start parsing");
  console.log("                                     from (default: the first rule in the");
  console.log("                                     grammar)");
  console.log("      --cache                        make generated parser cache results");
  console.log("  -d, --dependency <dependency>      use specified dependency (can be specified");
  console.log("                                     multiple times)");
  console.log("  -e, --export-var <variable>        name of a global variable into which the");
  console.log("                                     parser object is assigned to when no module");
  console.log("                                     loader is detected");
  console.log("      --extra-options <options>      additional options (in JSON format) to pass");
  console.log("                                     to peg.generate");
  console.log("      --extra-options-file <file>    file with additional options (in JSON");
  console.log("                                     format) to pass to peg.generate");
  console.log("      --format <format>              format of the generated parser: amd,");
  console.log("                                     commonjs, globals, umd (default: commonjs)");
  console.log("  -h, --help                         print help and exit");
  console.log("  -O, --optimize <goal>              select optimization for speed or size");
  console.log("                                     (default: speed)");
  console.log("  -o, --output <file>                output file");
  console.log("      --plugin <plugin>              use a specified plugin (can be specified");
  console.log("                                     multiple times)");
  console.log("      --trace                        enable tracing in generated parser");
  console.log("  -v, --version                      print version information and exit");
}

function exitSuccess() {
  process.exit(0);
}

function exitFailure() {
  process.exit(1);
}

function abort(message) {
  console.error(message);
  exitFailure();
}

function addExtraOptions(options, json) {
  let extraOptions;

  try {
    extraOptions = JSON.parse(json);
  } catch (e) {
    if (!(e instanceof SyntaxError)) { throw e; }

    abort("Error parsing JSON: " + e.message);
  }
  if (typeof extraOptions !== "object") {
    abort("The JSON with extra options has to represent an object.");
  }

  for (let key in extraOptions) {
    if (extraOptions.hasOwnProperty(key)) {
      options[key] = extraOptions[key];
    }
  }
}

/*
 * Extracted into a function just to silence JSHint complaining about creating
 * functions in a loop.
 */
function trim(s) {
  return s.trim();
}

/* Arguments */

let args = process.argv.slice(2); // Trim "node" and the script path.

function isOption(arg) {
  return (/^-.+/).test(arg);
}

function nextArg() {
  args.shift();
}

/* Files */

function readStream(inputStream, callback) {
  let input = "";
  inputStream.on("data", (data) => { input += data; });
  inputStream.on("end", () => { callback(input); });
}

/* Main */

let inputFile = null;
let outputFile = null;

let options = {
  cache: false,
  dependencies: {},
  exportVar: null,
  format: "commonjs",
  optimize: "speed",
  output: "source",
  plugins: [],
  trace: false
};

// The loop is labeled so that "--" can stop option processing entirely.
optionLoop:
while (args.length > 0 && isOption(args[0])) {
  let json, id, mod;

  switch (args[0]) {
    case "--allowed-start-rules":
      nextArg();
      if (args.length === 0) {
        abort("Missing parameter of the --allowed-start-rules option.");
      }
      options.allowedStartRules = args[0]
        .split(",")
        .map(trim);
      break;

    case "--cache":
      options.cache = true;
      break;

    case "-d":
    case "--dependency":
      nextArg();
      if (args.length === 0) {
        abort("Missing parameter of the -d/--dependency option.");
      }
      // Accept either "variable:module" or a bare module name.
      if (args[0].indexOf(":") !== -1) {
        let parts = args[0].split(":");
        options.dependencies[parts[0]] = parts[1];
      } else {
        options.dependencies[args[0]] = args[0];
      }
      break;

    case "-e":
    case "--export-var":
      nextArg();
      if (args.length === 0) {
        abort("Missing parameter of the -e/--export-var option.");
      }
      options.exportVar = args[0];
      break;

    case "--extra-options":
      nextArg();
      if (args.length === 0) {
        abort("Missing parameter of the --extra-options option.");
      }
      addExtraOptions(options, args[0]);
      break;

    case "--extra-options-file":
      nextArg();
      if (args.length === 0) {
        abort("Missing parameter of the --extra-options-file option.");
      }
      try {
        json = fs.readFileSync(args[0]);
      } catch(e) {
        abort("Can't read from file \"" + args[0] + "\".");
      }
      addExtraOptions(options, json);
      break;

    case "--format":
      nextArg();
      if (args.length === 0) {
        abort("Missing parameter of the --format option.");
      }
      if (args[0] !== "amd" && args[0] !== "commonjs" && args[0] !== "globals" && args[0] !== "umd") {
        abort("Module format must be one of \"amd\", \"commonjs\", \"globals\", and \"umd\".");
      }
      options.format = args[0];
      break;

    case "-h":
    case "--help":
      printHelp();
      exitSuccess();
      break;

    case "-O":
    case "--optimize":
      nextArg();
      if (args.length === 0) {
        abort("Missing parameter of the -O/--optimize option.");
      }
      if (args[0] !== "speed" && args[0] !== "size") {
        abort("Optimization goal must be either \"speed\" or \"size\".");
      }
      options.optimize = args[0];
      break;

    case "-o":
    case "--output":
      nextArg();
      if (args.length === 0) {
        abort("Missing parameter of the -o/--output option.");
      }
      outputFile = args[0];
      break;

    case "--plugin":
      nextArg();
      if (args.length === 0) {
        abort("Missing parameter of the --plugin option.");
      }
      // Resolve relative paths against the current directory; leave
      // package names for require() to look up.
      id = /^(\.\/|\.\.\/)/.test(args[0]) ? path.resolve(args[0]) : args[0];
      try {
        mod = require(id);
      } catch (e) {
        if (e.code !== "MODULE_NOT_FOUND") { throw e; }

        abort("Can't load module \"" + id + "\".");
      }
      options.plugins.push(mod);
      break;

    case "--trace":
      options.trace = true;
      break;

    case "-v":
    case "--version":
      printVersion();
      exitSuccess();
      break;

    case "--":
      // Consume "--" once and leave everything after it as positional
      // arguments, so the input file is not shifted away.
      nextArg();
      break optionLoop;

    default:
      abort("Unknown option: " + args[0] + ".");
  }
  nextArg();
}

if (Object.keys(options.dependencies).length > 0) {
  if (options.format !== "amd" && options.format !== "commonjs" && options.format !== "umd") {
    abort("Can't use the -d/--dependency option with the \"" + options.format + "\" module format.");
  }
}

if (options.exportVar !== null) {
  if (options.format !== "globals" && options.format !== "umd") {
    abort("Can't use the -e/--export-var option with the \"" + options.format + "\" module format.");
  }
}

let inputStream;
let outputStream;

switch (args.length) {
  case 0:
    inputFile = "-";
    break;

  case 1:
    inputFile = args[0];
    break;

  default:
    abort("Too many arguments.");
}

if (outputFile === null) {
  if (inputFile === "-") {
    outputFile = "-";
  } else {
    outputFile = inputFile.substr(0, inputFile.length - path.extname(inputFile).length) + ".js";
  }
}

if (inputFile === "-") {
  process.stdin.resume();
  inputStream = process.stdin;
} else {
  inputStream = fs.createReadStream(inputFile);
}

// Report read errors for both stdin and regular input files.
inputStream.on("error", () => {
  abort("Can't read from file \"" + inputFile + "\".");
});

if (outputFile === "-") {
  outputStream = process.stdout;
} else {
  outputStream = fs.createWriteStream(outputFile);
  outputStream.on("error", () => {
    abort("Can't write to file \"" + outputFile + "\".");
  });
}

readStream(inputStream, (input) => {
  let source;

  try {
    source = peg.generate(input, options);
  } catch (e) {
    if (e.location !== undefined) {
      abort(e.location.start.line + ":" + e.location.start.column + ": " + e.message);
    } else {
      abort(e.message);
    }
  }

  outputStream.write(source);
  if (outputStream !== process.stdout) {
    outputStream.end();
  }
});
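
The way the script above splits a -d/--dependency argument and derives the
default output file name can be tried in isolation. The sketch below lifts
those two expressions into standalone helpers. The names |parseDependency| and
|defaultOutputFile| are hypothetical and do not appear in bin/pegjs, and the
file names used are made up for the example.

```javascript
const path = require("path");

// Hypothetical helper mirroring the -d/--dependency handling:
// "variable:module" maps a variable to a module, a bare name maps to itself.
function parseDependency(arg) {
  if (arg.indexOf(":") !== -1) {
    const parts = arg.split(":");
    return { variable: parts[0], module: parts[1] };
  }
  return { variable: arg, module: arg };
}

// Hypothetical helper mirroring the default output name: "-" (stdin) maps to
// "-" (stdout), anything else has its extension replaced by ".js".
function defaultOutputFile(inputFile) {
  if (inputFile === "-") { return "-"; }

  const extension = path.extname(inputFile);
  return inputFile.substr(0, inputFile.length - extension.length) + ".js";
}

console.log(parseDependency("_:lodash"));
console.log(defaultOutputFile("arithmetics.pegjs"));
```

Keeping such logic in small pure functions would also make it testable
without spawning the CLI, although the script itself inlines both steps.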