From fb348fbaa1fe59435d451dd28390d7ea51b0e604 Mon Sep 17 00:00:00 2001 From: Bartek Kryza Date: Mon, 19 Jun 2023 23:23:50 +0200 Subject: [PATCH] Updated Doxygen docs --- .gitignore | 2 + CONTRIBUTING.md | 16 +- Doxyfile | 12 +- Makefile | 6 +- docs/README.md | 13 +- docs/architecture.md | 2 + docs/comment_decorators.md | 16 +- docs/common_options.md | 33 +- docs/diagram_templates.md | 2 +- docs/doxygen/github.min.css | 7 + docs/doxygen/header.html | 59 +- docs/doxygen/highlight.js | 2575 +++++++++++++++++++++++++++++ docs/doxygen/layout-clang-uml.xml | 245 +++ docs/generator_types.md | 9 + docs/img/github-mark.svg | 1 + docs/include_diagrams.md | 4 +- docs/jinja_templates.md | 8 +- docs/license.md | 15 + docs/quick_start.md | 12 +- docs/sequence_diagrams.md | 3 + docs/troubleshooting.md | 33 +- 21 files changed, 2981 insertions(+), 92 deletions(-) create mode 100644 docs/architecture.md create mode 100644 docs/doxygen/github.min.css create mode 100644 docs/doxygen/highlight.js create mode 100644 docs/doxygen/layout-clang-uml.xml create mode 100644 docs/img/github-mark.svg create mode 100644 docs/license.md diff --git a/.gitignore b/.gitignore index 276a9b9b..74c9721d 100644 --- a/.gitignore +++ b/.gitignore @@ -25,6 +25,8 @@ docs/diagrams docs/doxygen/html docs/doxygen/xml docs/doxygen/latex +docs/contributing.md +docs/changelog.md coverage*.info diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index 427a280d..bf78bf08 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -1,8 +1,10 @@ -# Contributing to `clang-uml` +# Contributing to clang-uml Thanks for taking interest in `clang-uml`! -> Please, make sure you're ok with [Code of conduct](./CODE_OF_CONDUCT.md) and [LICENSE](./LICENSE.md) +> Please, make sure you're ok with +> [Code of conduct](./CODE_OF_CONDUCT.md) +> and [LICENSE](./LICENSE.md) ## If you found a bug @@ -18,12 +20,12 @@ Thanks for taking interest in `clang-uml`! the C++ code which triggers the issue, and in `tests/t00050/test_case.h` write the test checks. The test case must be also added manually to `tests/test_cases.cc`: ```cpp - ... + // ... #include "t00047/test_case.h" #include "t00048/test_case.h" #include "t00049/test_case.h" - #include "t00050/test_case.h" - ... + #include "t00050/test_case.h" // <<< + // ... ``` Finally, create an issue with a link to your branch with the new test case. @@ -82,8 +84,8 @@ Thanks for taking interest in `clang-uml`! the feature to ensure we're on the same page as to its purpose and possible implementation * Next, implement the feature, please try to adapt to the overall code style: * 80-character line width - * snakes over camels - * use `make format` before submitting PR to ensure consistent formatting + * snakes not camels + * use `make format` before submitting PR to ensure consistent formatting (requires Docker) * use `make tidy` to check if your code doesn't introduce any `clang-tidy` warnings * Add test case (or multiple test cases), which cover the new feature * Finally, create a pull request! diff --git a/Doxyfile b/Doxyfile index fe6bc7c8..e513d548 100644 --- a/Doxyfile +++ b/Doxyfile @@ -44,7 +44,7 @@ PROJECT_NUMBER = 0.3.7 # for a project that appears at the top of each page and should give viewer a # quick idea about the purpose of the project. Keep the description short. -PROJECT_BRIEF = C++ UML diagram generator based on Clang +PROJECT_BRIEF = C++ to UML diagram generator based on Clang # With the PROJECT_LOGO tag one can specify a logo or an icon that is included # in the documentation. The maximum height of the logo should not exceed 55 @@ -781,7 +781,7 @@ FILE_VERSION_FILTER = # DoxygenLayout.xml, doxygen will parse it automatically even if the LAYOUT_FILE # tag is left empty. -LAYOUT_FILE = +LAYOUT_FILE = docs/doxygen/layout-clang-uml.xml # The CITE_BIB_FILES tag can be used to specify one or more bib files containing # the reference definitions. This must be a list of .bib files. The .bib @@ -1013,7 +1013,9 @@ EXAMPLE_RECURSIVE = NO # that contain images that are to be included in the documentation (see the # \image command). -IMAGE_PATH = docs/diagrams +IMAGE_PATH = docs/img +IMAGE_PATH += docs/diagrams +IMAGE_PATH += docs/test_cases # The INPUT_FILTER tag can be used to specify a program that doxygen should # invoke to filter for each input file. Doxygen will invoke the filter program @@ -1305,9 +1307,9 @@ HTML_EXTRA_FILES = thirdparty/doxygen-awesome-css/doxygen-awesome-interact HTML_EXTRA_FILES += thirdparty/doxygen-awesome-css/doxygen-awesome-paragraph-link.js HTML_EXTRA_FILES += thirdparty/doxygen-awesome-css/doxygen-awesome-fragment-copy-button.js HTML_EXTRA_FILES += thirdparty/doxygen-awesome-css/doxygen-awesome-tabs.js -HTML_EXTRA_FILES += CONTRIBUTING.md HTML_EXTRA_FILES += docs/doxygen/highlight.min.js -HTML_EXTRA_FILES += docs/doxygen/solarized-light.min.css +HTML_EXTRA_FILES += docs/doxygen/github.min.css +HTML_EXTRA_FILES += docs/img/github-mark.svg # The HTML_COLORSTYLE_HUE tag controls the color of the HTML output. Doxygen # will adjust the colors in the style sheet and background images according to diff --git a/Makefile b/Makefile index e7888c1d..0864a5cd 100644 --- a/Makefile +++ b/Makefile @@ -139,8 +139,10 @@ docs: make -C docs toc .PHONY: doxygen -doxygen: - doxygen +doxygen: docs + cp CONTRIBUTING.md docs/contributing.md + cp CHANGELOG.md docs/changelog.md + ../doxygen/_build/bin/doxygen .PHONY: fedora/% fedora/%: diff --git a/docs/README.md b/docs/README.md index 58675ace..81f0e489 100644 --- a/docs/README.md +++ b/docs/README.md @@ -9,15 +9,20 @@ legacy code. The configuration file or files for `clang-uml` define the types and contents of each generated diagram. The diagrams can be generated in [PlantUML](https://plantuml.com) and JSON formats. +Example sequence diagram generated using `clang-uml` from [this code](https://github.com/bkryza/clang-uml/blob/master/tests/t20029/t20029.cc): +![Sample sequence diagram](test_cases/t20029_sequence.svg) + `clang-uml` currently supports C++ up to version 17 with partial support for C++ 20. To see what `clang-uml` can do, checkout the diagrams generated for unit test cases [here](./test_cases.md) or examples in [clang-uml-examples](https://github.com/bkryza/clang-uml-examples) repository. +These pages provide both user and developer documentation. + * [Quick start](./quick_start.md) * [Installation](./installation.md) -* Generating diagrams +* **Generating diagrams** * [Common options](./common_options.md) * [Generator types](./generator_types.md) * [Class diagrams](./class_diagrams.md) @@ -33,6 +38,8 @@ test cases [here](./test_cases.md) or examples in * [Doxygen integration](./doxygen_integration.md) * [Test cases documentation](./test_cases.md) * [Troubleshooting](./troubleshooting.md) -* Development +* [Changelog](./changelog.md) +* [License](./license.md) +* **Development** * [Architecture](./architecture.md) - * [Contributing](../CONTRIBUTING.md) + * [Contributing](./contributing.md) diff --git a/docs/architecture.md b/docs/architecture.md new file mode 100644 index 00000000..a2dc5cd6 --- /dev/null +++ b/docs/architecture.md @@ -0,0 +1,2 @@ +# Architecture + diff --git a/docs/comment_decorators.md b/docs/comment_decorators.md index 783c984e..059ff692 100644 --- a/docs/comment_decorators.md +++ b/docs/comment_decorators.md @@ -2,10 +2,10 @@ -* [`note`](#note) -* [`skip` and `skiprelationship`](#skip-and-skiprelationship) -* [`composition`, `association` and `aggregation`](#composition-association-and-aggregation) -* [`style`](#style) +* ['note'](#note) +* ['skip' and 'skiprelationship'](#skip-and-skiprelationship) +* ['composition', 'association' and 'aggregation'](#composition-association-and-aggregation) +* ['style'](#style) @@ -27,7 +27,7 @@ The optional `:` suffix will apply this decorator only to a specif Currently, the following decorators are supported. -## `note` +## 'note' This decorator allows to specify directly in the code comments that should be included in the generated diagrams. @@ -91,7 +91,7 @@ generates the following class diagram: ![note](./test_cases/t00028_class.svg) -## `skip` and `skiprelationship` +## 'skip' and 'skiprelationship' This decorator allows to skip the specific classes or methods from the diagrams, for instance the following code: ```cpp @@ -145,7 +145,7 @@ generates the following diagram: ![skip](./test_cases/t00029_class.svg) -## `composition`, `association` and `aggregation` +## 'composition', 'association' and 'aggregation' These decorators allow to specify explicitly the type of relationship within a class diagram that should be generated for a given class member. For instance the following code: @@ -190,7 +190,7 @@ generates the following diagram: ![skip](./test_cases/t00030_class.svg) -## `style` +## 'style' This decorator allows to specify in the code specific styles for diagram elements, for instance: ```cpp diff --git a/docs/common_options.md b/docs/common_options.md index e1e8f0a2..fe54e8d3 100644 --- a/docs/common_options.md +++ b/docs/common_options.md @@ -7,9 +7,9 @@ * [PlantUML custom directives](#plantuml-custom-directives) * [Adding debug information in the generated diagrams](#adding-debug-information-in-the-generated-diagrams) * [Resolving include path and compiler flags issues](#resolving-include-path-and-compiler-flags-issues) - * [Use `--query-driver` command line option](#use---query-driver-command-line-option) + * [Use '--query-driver' command line option](#use---query-driver-command-line-option) * [Manually add and remove compile flags from the compilation database](#manually-add-and-remove-compile-flags-from-the-compilation-database) - * [Using `CMAKE_CXX_IMPLICIT_INCLUDE_DIRECTORIES`](#using-cmake_cxx_implicit_include_directories) + * [Using 'CMAKE_CXX_IMPLICIT_INCLUDE_DIRECTORIES'](#using-cmake_cxx_implicit_include_directories) @@ -18,14 +18,17 @@ By default, `clang-uml` will look for file `.clang-uml` in the projects director from it. The file must be specified in YAML and it's overall structure is as follows: ```yaml - +# common options for all diagrams +... diagrams: - : - type: [class|sequence|package|include] - - : - type: [class|sequence|package|include] - + first_diagram_name>: + type: class|sequence|package|include + # diagram specific options + ... + second_diagram_name: + type: class|sequence|package|include + # diagram specific options + ... ... ``` @@ -116,7 +119,7 @@ These errors can be overcome, by ensuring that the Clang parser has the correct include paths to analyse your code base on the given platform. `clang-uml` provides several mechanisms to resolve this issue: -### Use `--query-driver` command line option +### Use '--query-driver' command line option > This option is not available on Windows. @@ -133,7 +136,7 @@ system, when generating diagrams for an embedded project and providing `arm-none-eabi-gcc` as driver: ```bash -$ clang-uml --query-driver arm-none-eabi-gcc +clang-uml --query-driver arm-none-eabi-gcc ``` the following options are appended to each command line after `argv[0]` of the @@ -148,7 +151,7 @@ already as `argv[0]` in your `compile_commands.json`, you can simply invoke `clang-uml` as: ```bash -$ clang-uml --query-driver . +clang-uml --query-driver . ``` however please make sure that the `compile_commands.json` contain a command, @@ -173,11 +176,11 @@ remove_compile_flags: These options can be also passed on the command line, for instance: ```bash -$ clang-uml --add-compile-flag -I/opt/my_toolchain/include \ - --remove-compile-flag -I/usr/include ... +clang-uml --add-compile-flag -I/opt/my_toolchain/include \ + --remove-compile-flag -I/usr/include ... ``` -### Using `CMAKE_CXX_IMPLICIT_INCLUDE_DIRECTORIES` +### Using 'CMAKE_CXX_IMPLICIT_INCLUDE_DIRECTORIES' Yet another option, for CMake based projects, is to use the following CMake option: ```cmake diff --git a/docs/diagram_templates.md b/docs/diagram_templates.md index 35a55d10..ebb4ac72 100644 --- a/docs/diagram_templates.md +++ b/docs/diagram_templates.md @@ -88,7 +88,7 @@ clang-uml --show-template parents_hierarchy_tmpl users configuration file defines another template with a name which already exists as built-in template it will override the predefined templates. -Currently the following templates are built-in: +Currently, the following templates are built-in: * `parents_hierarchy_tmpl` - generate inheritance hierarchy diagram including parents of a specified class * `subclass_hierarchy_tmpl` - generate inheritance hierarchy diagram including diff --git a/docs/doxygen/github.min.css b/docs/doxygen/github.min.css new file mode 100644 index 00000000..188aaaab --- /dev/null +++ b/docs/doxygen/github.min.css @@ -0,0 +1,7 @@ +/*! + Theme: Github + Author: Defman21 + License: ~ MIT (or more permissive) [via base16-schemes-source] + Maintainer: @highlightjs/core-team + Version: 2021.09.0 +*/pre code.hljs{display:block;overflow-x:auto;padding:1em}code.hljs{padding:3px 5px}.hljs{color:#333;background:#fff}.hljs ::selection,.hljs::selection{background-color:#c8c8fa;color:#333}.hljs-comment{color:#969896}.hljs-tag{color:#e8e8e8}.hljs-operator,.hljs-punctuation,.hljs-subst{color:#333}.hljs-operator{opacity:.7}.hljs-bullet,.hljs-deletion,.hljs-name,.hljs-selector-tag,.hljs-template-variable,.hljs-variable{color:#ed6a43}.hljs-attr,.hljs-link,.hljs-literal,.hljs-number,.hljs-symbol,.hljs-variable.constant_{color:#0086b3}.hljs-class .hljs-title,.hljs-title,.hljs-title.class_{color:#795da3}.hljs-strong{font-weight:700;color:#795da3}.hljs-addition,.hljs-built_in,.hljs-code,.hljs-doctag,.hljs-keyword.hljs-atrule,.hljs-quote,.hljs-regexp,.hljs-string,.hljs-title.class_.inherited__{color:#183691}.hljs-attribute,.hljs-function .hljs-title,.hljs-section,.hljs-title.function_,.ruby .hljs-property{color:#795da3}.diff .hljs-meta,.hljs-keyword,.hljs-template-tag,.hljs-type{color:#a71d5d}.hljs-emphasis{color:#a71d5d;font-style:italic}.hljs-meta,.hljs-meta .hljs-keyword,.hljs-meta .hljs-string{color:#333}.hljs-meta .hljs-keyword,.hljs-meta-keyword{font-weight:700} \ No newline at end of file diff --git a/docs/doxygen/header.html b/docs/doxygen/header.html index 79cd32a6..a33472da 100644 --- a/docs/doxygen/header.html +++ b/docs/doxygen/header.html @@ -9,7 +9,7 @@ - + @@ -22,31 +22,35 @@ - - - + + + $treeview $search $mathjax - + @@ -59,35 +63,44 @@ $extrastylesheet - -
- +
- + - + + + + - + diff --git a/docs/doxygen/highlight.js b/docs/doxygen/highlight.js new file mode 100644 index 00000000..dd0992d4 --- /dev/null +++ b/docs/doxygen/highlight.js @@ -0,0 +1,2575 @@ +/*! + Highlight.js v11.7.0 (git: 82688fad18) + (c) 2006-2022 undefined and other contributors + License: BSD-3-Clause + */ +var hljs = (function () { + 'use strict'; + + var deepFreezeEs6 = {exports: {}}; + + function deepFreeze(obj) { + if (obj instanceof Map) { + obj.clear = obj.delete = obj.set = function () { + throw new Error('map is read-only'); + }; + } else if (obj instanceof Set) { + obj.add = obj.clear = obj.delete = function () { + throw new Error('set is read-only'); + }; + } + + // Freeze self + Object.freeze(obj); + + Object.getOwnPropertyNames(obj).forEach(function (name) { + var prop = obj[name]; + + // Freeze prop if it is an object + if (typeof prop == 'object' && !Object.isFrozen(prop)) { + deepFreeze(prop); + } + }); + + return obj; + } + + deepFreezeEs6.exports = deepFreeze; + deepFreezeEs6.exports.default = deepFreeze; + + /** @typedef {import('highlight.js').CallbackResponse} CallbackResponse */ + /** @typedef {import('highlight.js').CompiledMode} CompiledMode */ + /** @implements CallbackResponse */ + + class Response { + /** + * @param {CompiledMode} mode + */ + constructor(mode) { + // eslint-disable-next-line no-undefined + if (mode.data === undefined) mode.data = {}; + + this.data = mode.data; + this.isMatchIgnored = false; + } + + ignoreMatch() { + this.isMatchIgnored = true; + } + } + + /** + * @param {string} value + * @returns {string} + */ + function escapeHTML(value) { + return value + .replace(/&/g, '&') + .replace(//g, '>') + .replace(/"/g, '"') + .replace(/'/g, '''); + } + + /** + * performs a shallow merge of multiple objects into one + * + * @template T + * @param {T} original + * @param {Record[]} objects + * @returns {T} a single new object + */ + function inherit$1(original, ...objects) { + /** @type Record */ + const result = Object.create(null); + + for (const key in original) { + result[key] = original[key]; + } + objects.forEach(function(obj) { + for (const key in obj) { + result[key] = obj[key]; + } + }); + return /** @type {T} */ (result); + } + + /** + * @typedef {object} Renderer + * @property {(text: string) => void} addText + * @property {(node: Node) => void} openNode + * @property {(node: Node) => void} closeNode + * @property {() => string} value + */ + + /** @typedef {{scope?: string, language?: string, sublanguage?: boolean}} Node */ + /** @typedef {{walk: (r: Renderer) => void}} Tree */ + /** */ + + const SPAN_CLOSE = ''; + + /** + * Determines if a node needs to be wrapped in + * + * @param {Node} node */ + const emitsWrappingTags = (node) => { + // rarely we can have a sublanguage where language is undefined + // TODO: track down why + return !!node.scope || (node.sublanguage && node.language); + }; + + /** + * + * @param {string} name + * @param {{prefix:string}} options + */ + const scopeToCSSClass = (name, { prefix }) => { + if (name.includes(".")) { + const pieces = name.split("."); + return [ + `${prefix}${pieces.shift()}`, + ...(pieces.map((x, i) => `${x}${"_".repeat(i + 1)}`)) + ].join(" "); + } + return `${prefix}${name}`; + }; + + /** @type {Renderer} */ + class HTMLRenderer { + /** + * Creates a new HTMLRenderer + * + * @param {Tree} parseTree - the parse tree (must support `walk` API) + * @param {{classPrefix: string}} options + */ + constructor(parseTree, options) { + this.buffer = ""; + this.classPrefix = options.classPrefix; + parseTree.walk(this); + } + + /** + * Adds texts to the output stream + * + * @param {string} text */ + addText(text) { + this.buffer += escapeHTML(text); + } + + /** + * Adds a node open to the output stream (if needed) + * + * @param {Node} node */ + openNode(node) { + if (!emitsWrappingTags(node)) return; + + let className = ""; + if (node.sublanguage) { + className = `language-${node.language}`; + } else { + className = scopeToCSSClass(node.scope, { prefix: this.classPrefix }); + } + this.span(className); + } + + /** + * Adds a node close to the output stream (if needed) + * + * @param {Node} node */ + closeNode(node) { + if (!emitsWrappingTags(node)) return; + + this.buffer += SPAN_CLOSE; + } + + /** + * returns the accumulated buffer + */ + value() { + return this.buffer; + } + + // helpers + + /** + * Builds a span element + * + * @param {string} className */ + span(className) { + this.buffer += ``; + } + } + + /** @typedef {{scope?: string, language?: string, sublanguage?: boolean, children: Node[]} | string} Node */ + /** @typedef {{scope?: string, language?: string, sublanguage?: boolean, children: Node[]} } DataNode */ + /** @typedef {import('highlight.js').Emitter} Emitter */ + /** */ + + /** @returns {DataNode} */ + const newNode = (opts = {}) => { + /** @type DataNode */ + const result = { children: [] }; + Object.assign(result, opts); + return result; + }; + + class TokenTree { + constructor() { + /** @type DataNode */ + this.rootNode = newNode(); + this.stack = [this.rootNode]; + } + + get top() { + return this.stack[this.stack.length - 1]; + } + + get root() { return this.rootNode; } + + /** @param {Node} node */ + add(node) { + this.top.children.push(node); + } + + /** @param {string} scope */ + openNode(scope) { + /** @type Node */ + const node = newNode({ scope }); + this.add(node); + this.stack.push(node); + } + + closeNode() { + if (this.stack.length > 1) { + return this.stack.pop(); + } + // eslint-disable-next-line no-undefined + return undefined; + } + + closeAllNodes() { + while (this.closeNode()); + } + + toJSON() { + return JSON.stringify(this.rootNode, null, 4); + } + + /** + * @typedef { import("./html_renderer").Renderer } Renderer + * @param {Renderer} builder + */ + walk(builder) { + // this does not + return this.constructor._walk(builder, this.rootNode); + // this works + // return TokenTree._walk(builder, this.rootNode); + } + + /** + * @param {Renderer} builder + * @param {Node} node + */ + static _walk(builder, node) { + if (typeof node === "string") { + builder.addText(node); + } else if (node.children) { + builder.openNode(node); + node.children.forEach((child) => this._walk(builder, child)); + builder.closeNode(node); + } + return builder; + } + + /** + * @param {Node} node + */ + static _collapse(node) { + if (typeof node === "string") return; + if (!node.children) return; + + if (node.children.every(el => typeof el === "string")) { + // node.text = node.children.join(""); + // delete node.children; + node.children = [node.children.join("")]; + } else { + node.children.forEach((child) => { + TokenTree._collapse(child); + }); + } + } + } + + /** + Currently this is all private API, but this is the minimal API necessary + that an Emitter must implement to fully support the parser. + + Minimal interface: + + - addKeyword(text, scope) + - addText(text) + - addSublanguage(emitter, subLanguageName) + - finalize() + - openNode(scope) + - closeNode() + - closeAllNodes() + - toHTML() + + */ + + /** + * @implements {Emitter} + */ + class TokenTreeEmitter extends TokenTree { + /** + * @param {*} options + */ + constructor(options) { + super(); + this.options = options; + } + + /** + * @param {string} text + * @param {string} scope + */ + addKeyword(text, scope) { + if (text === "") { return; } + + this.openNode(scope); + this.addText(text); + this.closeNode(); + } + + /** + * @param {string} text + */ + addText(text) { + if (text === "") { return; } + + this.add(text); + } + + /** + * @param {Emitter & {root: DataNode}} emitter + * @param {string} name + */ + addSublanguage(emitter, name) { + /** @type DataNode */ + const node = emitter.root; + node.sublanguage = true; + node.language = name; + this.add(node); + } + + toHTML() { + const renderer = new HTMLRenderer(this, this.options); + return renderer.value(); + } + + finalize() { + return true; + } + } + + /** + * @param {string} value + * @returns {RegExp} + * */ + + /** + * @param {RegExp | string } re + * @returns {string} + */ + function source(re) { + if (!re) return null; + if (typeof re === "string") return re; + + return re.source; + } + + /** + * @param {RegExp | string } re + * @returns {string} + */ + function lookahead(re) { + return concat('(?=', re, ')'); + } + + /** + * @param {RegExp | string } re + * @returns {string} + */ + function anyNumberOfTimes(re) { + return concat('(?:', re, ')*'); + } + + /** + * @param {RegExp | string } re + * @returns {string} + */ + function optional(re) { + return concat('(?:', re, ')?'); + } + + /** + * @param {...(RegExp | string) } args + * @returns {string} + */ + function concat(...args) { + const joined = args.map((x) => source(x)).join(""); + return joined; + } + + /** + * @param { Array } args + * @returns {object} + */ + function stripOptionsFromArgs(args) { + const opts = args[args.length - 1]; + + if (typeof opts === 'object' && opts.constructor === Object) { + args.splice(args.length - 1, 1); + return opts; + } else { + return {}; + } + } + + /** @typedef { {capture?: boolean} } RegexEitherOptions */ + + /** + * Any of the passed expresssions may match + * + * Creates a huge this | this | that | that match + * @param {(RegExp | string)[] | [...(RegExp | string)[], RegexEitherOptions]} args + * @returns {string} + */ + function either(...args) { + /** @type { object & {capture?: boolean} } */ + const opts = stripOptionsFromArgs(args); + const joined = '(' + + (opts.capture ? "" : "?:") + + args.map((x) => source(x)).join("|") + ")"; + return joined; + } + + /** + * @param {RegExp | string} re + * @returns {number} + */ + function countMatchGroups(re) { + return (new RegExp(re.toString() + '|')).exec('').length - 1; + } + + /** + * Does lexeme start with a regular expression match at the beginning + * @param {RegExp} re + * @param {string} lexeme + */ + function startsWith(re, lexeme) { + const match = re && re.exec(lexeme); + return match && match.index === 0; + } + + // BACKREF_RE matches an open parenthesis or backreference. To avoid + // an incorrect parse, it additionally matches the following: + // - [...] elements, where the meaning of parentheses and escapes change + // - other escape sequences, so we do not misparse escape sequences as + // interesting elements + // - non-matching or lookahead parentheses, which do not capture. These + // follow the '(' with a '?'. + const BACKREF_RE = /\[(?:[^\\\]]|\\.)*\]|\(\??|\\([1-9][0-9]*)|\\./; + + // **INTERNAL** Not intended for outside usage + // join logically computes regexps.join(separator), but fixes the + // backreferences so they continue to match. + // it also places each individual regular expression into it's own + // match group, keeping track of the sequencing of those match groups + // is currently an exercise for the caller. :-) + /** + * @param {(string | RegExp)[]} regexps + * @param {{joinWith: string}} opts + * @returns {string} + */ + function _rewriteBackreferences(regexps, { joinWith }) { + let numCaptures = 0; + + return regexps.map((regex) => { + numCaptures += 1; + const offset = numCaptures; + let re = source(regex); + let out = ''; + + while (re.length > 0) { + const match = BACKREF_RE.exec(re); + if (!match) { + out += re; + break; + } + out += re.substring(0, match.index); + re = re.substring(match.index + match[0].length); + if (match[0][0] === '\\' && match[1]) { + // Adjust the backreference. + out += '\\' + String(Number(match[1]) + offset); + } else { + out += match[0]; + if (match[0] === '(') { + numCaptures++; + } + } + } + return out; + }).map(re => `(${re})`).join(joinWith); + } + + /** @typedef {import('highlight.js').Mode} Mode */ + /** @typedef {import('highlight.js').ModeCallback} ModeCallback */ + + // Common regexps + const MATCH_NOTHING_RE = /\b\B/; + const IDENT_RE = '[a-zA-Z]\\w*'; + const UNDERSCORE_IDENT_RE = '[a-zA-Z_]\\w*'; + const NUMBER_RE = '\\b\\d+(\\.\\d+)?'; + const C_NUMBER_RE = '(-?)(\\b0[xX][a-fA-F0-9]+|(\\b\\d+(\\.\\d*)?|\\.\\d+)([eE][-+]?\\d+)?)'; // 0x..., 0..., decimal, float + const BINARY_NUMBER_RE = '\\b(0b[01]+)'; // 0b... + const RE_STARTERS_RE = '!|!=|!==|%|%=|&|&&|&=|\\*|\\*=|\\+|\\+=|,|-|-=|/=|/|:|;|<<|<<=|<=|<|===|==|=|>>>=|>>=|>=|>>>|>>|>|\\?|\\[|\\{|\\(|\\^|\\^=|\\||\\|=|\\|\\||~'; + + /** + * @param { Partial & {binary?: string | RegExp} } opts + */ + const SHEBANG = (opts = {}) => { + const beginShebang = /^#![ ]*\//; + if (opts.binary) { + opts.begin = concat( + beginShebang, + /.*\b/, + opts.binary, + /\b.*/); + } + return inherit$1({ + scope: 'meta', + begin: beginShebang, + end: /$/, + relevance: 0, + /** @type {ModeCallback} */ + "on:begin": (m, resp) => { + if (m.index !== 0) resp.ignoreMatch(); + } + }, opts); + }; + + // Common modes + const BACKSLASH_ESCAPE = { + begin: '\\\\[\\s\\S]', relevance: 0 + }; + const APOS_STRING_MODE = { + scope: 'string', + begin: '\'', + end: '\'', + illegal: '\\n', + contains: [BACKSLASH_ESCAPE] + }; + const QUOTE_STRING_MODE = { + scope: 'string', + begin: '"', + end: '"', + illegal: '\\n', + contains: [BACKSLASH_ESCAPE] + }; + const PHRASAL_WORDS_MODE = { + begin: /\b(a|an|the|are|I'm|isn't|don't|doesn't|won't|but|just|should|pretty|simply|enough|gonna|going|wtf|so|such|will|you|your|they|like|more)\b/ + }; + /** + * Creates a comment mode + * + * @param {string | RegExp} begin + * @param {string | RegExp} end + * @param {Mode | {}} [modeOptions] + * @returns {Partial} + */ + const COMMENT = function(begin, end, modeOptions = {}) { + const mode = inherit$1( + { + scope: 'comment', + begin, + end, + contains: [] + }, + modeOptions + ); + mode.contains.push({ + scope: 'doctag', + // hack to avoid the space from being included. the space is necessary to + // match here to prevent the plain text rule below from gobbling up doctags + begin: '[ ]*(?=(TODO|FIXME|NOTE|BUG|OPTIMIZE|HACK|XXX):)', + end: /(TODO|FIXME|NOTE|BUG|OPTIMIZE|HACK|XXX):/, + excludeBegin: true, + relevance: 0 + }); + const ENGLISH_WORD = either( + // list of common 1 and 2 letter words in English + "I", + "a", + "is", + "so", + "us", + "to", + "at", + "if", + "in", + "it", + "on", + // note: this is not an exhaustive list of contractions, just popular ones + /[A-Za-z]+['](d|ve|re|ll|t|s|n)/, // contractions - can't we'd they're let's, etc + /[A-Za-z]+[-][a-z]+/, // `no-way`, etc. + /[A-Za-z][a-z]{2,}/ // allow capitalized words at beginning of sentences + ); + // looking like plain text, more likely to be a comment + mode.contains.push( + { + // TODO: how to include ", (, ) without breaking grammars that use these for + // comment delimiters? + // begin: /[ ]+([()"]?([A-Za-z'-]{3,}|is|a|I|so|us|[tT][oO]|at|if|in|it|on)[.]?[()":]?([.][ ]|[ ]|\))){3}/ + // --- + + // this tries to find sequences of 3 english words in a row (without any + // "programming" type syntax) this gives us a strong signal that we've + // TRULY found a comment - vs perhaps scanning with the wrong language. + // It's possible to find something that LOOKS like the start of the + // comment - but then if there is no readable text - good chance it is a + // false match and not a comment. + // + // for a visual example please see: + // https://github.com/highlightjs/highlight.js/issues/2827 + + begin: concat( + /[ ]+/, // necessary to prevent us gobbling up doctags like /* @author Bob Mcgill */ + '(', + ENGLISH_WORD, + /[.]?[:]?([.][ ]|[ ])/, + '){3}') // look for 3 words in a row + } + ); + return mode; + }; + const C_LINE_COMMENT_MODE = COMMENT('//', '$'); + const C_BLOCK_COMMENT_MODE = COMMENT('/\\*', '\\*/'); + const HASH_COMMENT_MODE = COMMENT('#', '$'); + const NUMBER_MODE = { + scope: 'number', + begin: NUMBER_RE, + relevance: 0 + }; + const C_NUMBER_MODE = { + scope: 'number', + begin: C_NUMBER_RE, + relevance: 0 + }; + const BINARY_NUMBER_MODE = { + scope: 'number', + begin: BINARY_NUMBER_RE, + relevance: 0 + }; + const REGEXP_MODE = { + // this outer rule makes sure we actually have a WHOLE regex and not simply + // an expression such as: + // + // 3 / something + // + // (which will then blow up when regex's `illegal` sees the newline) + begin: /(?=\/[^/\n]*\/)/, + contains: [{ + scope: 'regexp', + begin: /\//, + end: /\/[gimuy]*/, + illegal: /\n/, + contains: [ + BACKSLASH_ESCAPE, + { + begin: /\[/, + end: /\]/, + relevance: 0, + contains: [BACKSLASH_ESCAPE] + } + ] + }] + }; + const TITLE_MODE = { + scope: 'title', + begin: IDENT_RE, + relevance: 0 + }; + const UNDERSCORE_TITLE_MODE = { + scope: 'title', + begin: UNDERSCORE_IDENT_RE, + relevance: 0 + }; + const METHOD_GUARD = { + // excludes method names from keyword processing + begin: '\\.\\s*' + UNDERSCORE_IDENT_RE, + relevance: 0 + }; + + /** + * Adds end same as begin mechanics to a mode + * + * Your mode must include at least a single () match group as that first match + * group is what is used for comparison + * @param {Partial} mode + */ + const END_SAME_AS_BEGIN = function(mode) { + return Object.assign(mode, + { + /** @type {ModeCallback} */ + 'on:begin': (m, resp) => { resp.data._beginMatch = m[1]; }, + /** @type {ModeCallback} */ + 'on:end': (m, resp) => { if (resp.data._beginMatch !== m[1]) resp.ignoreMatch(); } + }); + }; + + var MODES = /*#__PURE__*/Object.freeze({ + __proto__: null, + MATCH_NOTHING_RE: MATCH_NOTHING_RE, + IDENT_RE: IDENT_RE, + UNDERSCORE_IDENT_RE: UNDERSCORE_IDENT_RE, + NUMBER_RE: NUMBER_RE, + C_NUMBER_RE: C_NUMBER_RE, + BINARY_NUMBER_RE: BINARY_NUMBER_RE, + RE_STARTERS_RE: RE_STARTERS_RE, + SHEBANG: SHEBANG, + BACKSLASH_ESCAPE: BACKSLASH_ESCAPE, + APOS_STRING_MODE: APOS_STRING_MODE, + QUOTE_STRING_MODE: QUOTE_STRING_MODE, + PHRASAL_WORDS_MODE: PHRASAL_WORDS_MODE, + COMMENT: COMMENT, + C_LINE_COMMENT_MODE: C_LINE_COMMENT_MODE, + C_BLOCK_COMMENT_MODE: C_BLOCK_COMMENT_MODE, + HASH_COMMENT_MODE: HASH_COMMENT_MODE, + NUMBER_MODE: NUMBER_MODE, + C_NUMBER_MODE: C_NUMBER_MODE, + BINARY_NUMBER_MODE: BINARY_NUMBER_MODE, + REGEXP_MODE: REGEXP_MODE, + TITLE_MODE: TITLE_MODE, + UNDERSCORE_TITLE_MODE: UNDERSCORE_TITLE_MODE, + METHOD_GUARD: METHOD_GUARD, + END_SAME_AS_BEGIN: END_SAME_AS_BEGIN + }); + + /** + @typedef {import('highlight.js').CallbackResponse} CallbackResponse + @typedef {import('highlight.js').CompilerExt} CompilerExt + */ + + // Grammar extensions / plugins + // See: https://github.com/highlightjs/highlight.js/issues/2833 + + // Grammar extensions allow "syntactic sugar" to be added to the grammar modes + // without requiring any underlying changes to the compiler internals. + + // `compileMatch` being the perfect small example of now allowing a grammar + // author to write `match` when they desire to match a single expression rather + // than being forced to use `begin`. The extension then just moves `match` into + // `begin` when it runs. Ie, no features have been added, but we've just made + // the experience of writing (and reading grammars) a little bit nicer. + + // ------ + + // TODO: We need negative look-behind support to do this properly + /** + * Skip a match if it has a preceding dot + * + * This is used for `beginKeywords` to prevent matching expressions such as + * `bob.keyword.do()`. The mode compiler automatically wires this up as a + * special _internal_ 'on:begin' callback for modes with `beginKeywords` + * @param {RegExpMatchArray} match + * @param {CallbackResponse} response + */ + function skipIfHasPrecedingDot(match, response) { + const before = match.input[match.index - 1]; + if (before === ".") { + response.ignoreMatch(); + } + } + + /** + * + * @type {CompilerExt} + */ + function scopeClassName(mode, _parent) { + // eslint-disable-next-line no-undefined + if (mode.className !== undefined) { + mode.scope = mode.className; + delete mode.className; + } + } + + /** + * `beginKeywords` syntactic sugar + * @type {CompilerExt} + */ + function beginKeywords(mode, parent) { + if (!parent) return; + if (!mode.beginKeywords) return; + + // for languages with keywords that include non-word characters checking for + // a word boundary is not sufficient, so instead we check for a word boundary + // or whitespace - this does no harm in any case since our keyword engine + // doesn't allow spaces in keywords anyways and we still check for the boundary + // first + mode.begin = '\\b(' + mode.beginKeywords.split(' ').join('|') + ')(?!\\.)(?=\\b|\\s)'; + mode.__beforeBegin = skipIfHasPrecedingDot; + mode.keywords = mode.keywords || mode.beginKeywords; + delete mode.beginKeywords; + + // prevents double relevance, the keywords themselves provide + // relevance, the mode doesn't need to double it + // eslint-disable-next-line no-undefined + if (mode.relevance === undefined) mode.relevance = 0; + } + + /** + * Allow `illegal` to contain an array of illegal values + * @type {CompilerExt} + */ + function compileIllegal(mode, _parent) { + if (!Array.isArray(mode.illegal)) return; + + mode.illegal = either(...mode.illegal); + } + + /** + * `match` to match a single expression for readability + * @type {CompilerExt} + */ + function compileMatch(mode, _parent) { + if (!mode.match) return; + if (mode.begin || mode.end) throw new Error("begin & end are not supported with match"); + + mode.begin = mode.match; + delete mode.match; + } + + /** + * provides the default 1 relevance to all modes + * @type {CompilerExt} + */ + function compileRelevance(mode, _parent) { + // eslint-disable-next-line no-undefined + if (mode.relevance === undefined) mode.relevance = 1; + } + + // allow beforeMatch to act as a "qualifier" for the match + // the full match begin must be [beforeMatch][begin] + const beforeMatchExt = (mode, parent) => { + if (!mode.beforeMatch) return; + // starts conflicts with endsParent which we need to make sure the child + // rule is not matched multiple times + if (mode.starts) throw new Error("beforeMatch cannot be used with starts"); + + const originalMode = Object.assign({}, mode); + Object.keys(mode).forEach((key) => { delete mode[key]; }); + + mode.keywords = originalMode.keywords; + mode.begin = concat(originalMode.beforeMatch, lookahead(originalMode.begin)); + mode.starts = { + relevance: 0, + contains: [ + Object.assign(originalMode, { endsParent: true }) + ] + }; + mode.relevance = 0; + + delete originalMode.beforeMatch; + }; + + // keywords that should have no default relevance value + const COMMON_KEYWORDS = [ + 'of', + 'and', + 'for', + 'in', + 'not', + 'or', + 'if', + 'then', + 'parent', // common variable name + 'list', // common variable name + 'value' // common variable name + ]; + + const DEFAULT_KEYWORD_SCOPE = "keyword"; + + /** + * Given raw keywords from a language definition, compile them. + * + * @param {string | Record | Array} rawKeywords + * @param {boolean} caseInsensitive + */ + function compileKeywords(rawKeywords, caseInsensitive, scopeName = DEFAULT_KEYWORD_SCOPE) { + /** @type {import("highlight.js/private").KeywordDict} */ + const compiledKeywords = Object.create(null); + + // input can be a string of keywords, an array of keywords, or a object with + // named keys representing scopeName (which can then point to a string or array) + if (typeof rawKeywords === 'string') { + compileList(scopeName, rawKeywords.split(" ")); + } else if (Array.isArray(rawKeywords)) { + compileList(scopeName, rawKeywords); + } else { + Object.keys(rawKeywords).forEach(function(scopeName) { + // collapse all our objects back into the parent object + Object.assign( + compiledKeywords, + compileKeywords(rawKeywords[scopeName], caseInsensitive, scopeName) + ); + }); + } + return compiledKeywords; + + // --- + + /** + * Compiles an individual list of keywords + * + * Ex: "for if when while|5" + * + * @param {string} scopeName + * @param {Array} keywordList + */ + function compileList(scopeName, keywordList) { + if (caseInsensitive) { + keywordList = keywordList.map(x => x.toLowerCase()); + } + keywordList.forEach(function(keyword) { + const pair = keyword.split('|'); + compiledKeywords[pair[0]] = [scopeName, scoreForKeyword(pair[0], pair[1])]; + }); + } + } + + /** + * Returns the proper score for a given keyword + * + * Also takes into account comment keywords, which will be scored 0 UNLESS + * another score has been manually assigned. + * @param {string} keyword + * @param {string} [providedScore] + */ + function scoreForKeyword(keyword, providedScore) { + // manual scores always win over common keywords + // so you can force a score of 1 if you really insist + if (providedScore) { + return Number(providedScore); + } + + return commonKeyword(keyword) ? 0 : 1; + } + + /** + * Determines if a given keyword is common or not + * + * @param {string} keyword */ + function commonKeyword(keyword) { + return COMMON_KEYWORDS.includes(keyword.toLowerCase()); + } + + /* + + For the reasoning behind this please see: + https://github.com/highlightjs/highlight.js/issues/2880#issuecomment-747275419 + + */ + + /** + * @type {Record} + */ + const seenDeprecations = {}; + + /** + * @param {string} message + */ + const error = (message) => { + console.error(message); + }; + + /** + * @param {string} message + * @param {any} args + */ + const warn = (message, ...args) => { + console.log(`WARN: ${message}`, ...args); + }; + + /** + * @param {string} version + * @param {string} message + */ + const deprecated = (version, message) => { + if (seenDeprecations[`${version}/${message}`]) return; + + console.log(`Deprecated as of ${version}. ${message}`); + seenDeprecations[`${version}/${message}`] = true; + }; + + /* eslint-disable no-throw-literal */ + + /** + @typedef {import('highlight.js').CompiledMode} CompiledMode + */ + + const MultiClassError = new Error(); + + /** + * Renumbers labeled scope names to account for additional inner match + * groups that otherwise would break everything. + * + * Lets say we 3 match scopes: + * + * { 1 => ..., 2 => ..., 3 => ... } + * + * So what we need is a clean match like this: + * + * (a)(b)(c) => [ "a", "b", "c" ] + * + * But this falls apart with inner match groups: + * + * (a)(((b)))(c) => ["a", "b", "b", "b", "c" ] + * + * Our scopes are now "out of alignment" and we're repeating `b` 3 times. + * What needs to happen is the numbers are remapped: + * + * { 1 => ..., 2 => ..., 5 => ... } + * + * We also need to know that the ONLY groups that should be output + * are 1, 2, and 5. This function handles this behavior. + * + * @param {CompiledMode} mode + * @param {Array} regexes + * @param {{key: "beginScope"|"endScope"}} opts + */ + function remapScopeNames(mode, regexes, { key }) { + let offset = 0; + const scopeNames = mode[key]; + /** @type Record */ + const emit = {}; + /** @type Record */ + const positions = {}; + + for (let i = 1; i <= regexes.length; i++) { + positions[i + offset] = scopeNames[i]; + emit[i + offset] = true; + offset += countMatchGroups(regexes[i - 1]); + } + // we use _emit to keep track of which match groups are "top-level" to avoid double + // output from inside match groups + mode[key] = positions; + mode[key]._emit = emit; + mode[key]._multi = true; + } + + /** + * @param {CompiledMode} mode + */ + function beginMultiClass(mode) { + if (!Array.isArray(mode.begin)) return; + + if (mode.skip || mode.excludeBegin || mode.returnBegin) { + error("skip, excludeBegin, returnBegin not compatible with beginScope: {}"); + throw MultiClassError; + } + + if (typeof mode.beginScope !== "object" || mode.beginScope === null) { + error("beginScope must be object"); + throw MultiClassError; + } + + remapScopeNames(mode, mode.begin, { key: "beginScope" }); + mode.begin = _rewriteBackreferences(mode.begin, { joinWith: "" }); + } + + /** + * @param {CompiledMode} mode + */ + function endMultiClass(mode) { + if (!Array.isArray(mode.end)) return; + + if (mode.skip || mode.excludeEnd || mode.returnEnd) { + error("skip, excludeEnd, returnEnd not compatible with endScope: {}"); + throw MultiClassError; + } + + if (typeof mode.endScope !== "object" || mode.endScope === null) { + error("endScope must be object"); + throw MultiClassError; + } + + remapScopeNames(mode, mode.end, { key: "endScope" }); + mode.end = _rewriteBackreferences(mode.end, { joinWith: "" }); + } + + /** + * this exists only to allow `scope: {}` to be used beside `match:` + * Otherwise `beginScope` would necessary and that would look weird + + { + match: [ /def/, /\w+/ ] + scope: { 1: "keyword" , 2: "title" } + } + + * @param {CompiledMode} mode + */ + function scopeSugar(mode) { + if (mode.scope && typeof mode.scope === "object" && mode.scope !== null) { + mode.beginScope = mode.scope; + delete mode.scope; + } + } + + /** + * @param {CompiledMode} mode + */ + function MultiClass(mode) { + scopeSugar(mode); + + if (typeof mode.beginScope === "string") { + mode.beginScope = { _wrap: mode.beginScope }; + } + if (typeof mode.endScope === "string") { + mode.endScope = { _wrap: mode.endScope }; + } + + beginMultiClass(mode); + endMultiClass(mode); + } + + /** + @typedef {import('highlight.js').Mode} Mode + @typedef {import('highlight.js').CompiledMode} CompiledMode + @typedef {import('highlight.js').Language} Language + @typedef {import('highlight.js').HLJSPlugin} HLJSPlugin + @typedef {import('highlight.js').CompiledLanguage} CompiledLanguage + */ + + // compilation + + /** + * Compiles a language definition result + * + * Given the raw result of a language definition (Language), compiles this so + * that it is ready for highlighting code. + * @param {Language} language + * @returns {CompiledLanguage} + */ + function compileLanguage(language) { + /** + * Builds a regex with the case sensitivity of the current language + * + * @param {RegExp | string} value + * @param {boolean} [global] + */ + function langRe(value, global) { + return new RegExp( + source(value), + 'm' + + (language.case_insensitive ? 'i' : '') + + (language.unicodeRegex ? 'u' : '') + + (global ? 'g' : '') + ); + } + + /** + Stores multiple regular expressions and allows you to quickly search for + them all in a string simultaneously - returning the first match. It does + this by creating a huge (a|b|c) regex - each individual item wrapped with () + and joined by `|` - using match groups to track position. When a match is + found checking which position in the array has content allows us to figure + out which of the original regexes / match groups triggered the match. + + The match object itself (the result of `Regex.exec`) is returned but also + enhanced by merging in any meta-data that was registered with the regex. + This is how we keep track of which mode matched, and what type of rule + (`illegal`, `begin`, end, etc). + */ + class MultiRegex { + constructor() { + this.matchIndexes = {}; + // @ts-ignore + this.regexes = []; + this.matchAt = 1; + this.position = 0; + } + + // @ts-ignore + addRule(re, opts) { + opts.position = this.position++; + // @ts-ignore + this.matchIndexes[this.matchAt] = opts; + this.regexes.push([opts, re]); + this.matchAt += countMatchGroups(re) + 1; + } + + compile() { + if (this.regexes.length === 0) { + // avoids the need to check length every time exec is called + // @ts-ignore + this.exec = () => null; + } + const terminators = this.regexes.map(el => el[1]); + this.matcherRe = langRe(_rewriteBackreferences(terminators, { joinWith: '|' }), true); + this.lastIndex = 0; + } + + /** @param {string} s */ + exec(s) { + this.matcherRe.lastIndex = this.lastIndex; + const match = this.matcherRe.exec(s); + if (!match) { return null; } + + // eslint-disable-next-line no-undefined + const i = match.findIndex((el, i) => i > 0 && el !== undefined); + // @ts-ignore + const matchData = this.matchIndexes[i]; + // trim off any earlier non-relevant match groups (ie, the other regex + // match groups that make up the multi-matcher) + match.splice(0, i); + + return Object.assign(match, matchData); + } + } + + /* + Created to solve the key deficiently with MultiRegex - there is no way to + test for multiple matches at a single location. Why would we need to do + that? In the future a more dynamic engine will allow certain matches to be + ignored. An example: if we matched say the 3rd regex in a large group but + decided to ignore it - we'd need to started testing again at the 4th + regex... but MultiRegex itself gives us no real way to do that. + + So what this class creates MultiRegexs on the fly for whatever search + position they are needed. + + NOTE: These additional MultiRegex objects are created dynamically. For most + grammars most of the time we will never actually need anything more than the + first MultiRegex - so this shouldn't have too much overhead. + + Say this is our search group, and we match regex3, but wish to ignore it. + + regex1 | regex2 | regex3 | regex4 | regex5 ' ie, startAt = 0 + + What we need is a new MultiRegex that only includes the remaining + possibilities: + + regex4 | regex5 ' ie, startAt = 3 + + This class wraps all that complexity up in a simple API... `startAt` decides + where in the array of expressions to start doing the matching. It + auto-increments, so if a match is found at position 2, then startAt will be + set to 3. If the end is reached startAt will return to 0. + + MOST of the time the parser will be setting startAt manually to 0. + */ + class ResumableMultiRegex { + constructor() { + // @ts-ignore + this.rules = []; + // @ts-ignore + this.multiRegexes = []; + this.count = 0; + + this.lastIndex = 0; + this.regexIndex = 0; + } + + // @ts-ignore + getMatcher(index) { + if (this.multiRegexes[index]) return this.multiRegexes[index]; + + const matcher = new MultiRegex(); + this.rules.slice(index).forEach(([re, opts]) => matcher.addRule(re, opts)); + matcher.compile(); + this.multiRegexes[index] = matcher; + return matcher; + } + + resumingScanAtSamePosition() { + return this.regexIndex !== 0; + } + + considerAll() { + this.regexIndex = 0; + } + + // @ts-ignore + addRule(re, opts) { + this.rules.push([re, opts]); + if (opts.type === "begin") this.count++; + } + + /** @param {string} s */ + exec(s) { + const m = this.getMatcher(this.regexIndex); + m.lastIndex = this.lastIndex; + let result = m.exec(s); + + // The following is because we have no easy way to say "resume scanning at the + // existing position but also skip the current rule ONLY". What happens is + // all prior rules are also skipped which can result in matching the wrong + // thing. Example of matching "booger": + + // our matcher is [string, "booger", number] + // + // ....booger.... + + // if "booger" is ignored then we'd really need a regex to scan from the + // SAME position for only: [string, number] but ignoring "booger" (if it + // was the first match), a simple resume would scan ahead who knows how + // far looking only for "number", ignoring potential string matches (or + // future "booger" matches that might be valid.) + + // So what we do: We execute two matchers, one resuming at the same + // position, but the second full matcher starting at the position after: + + // /--- resume first regex match here (for [number]) + // |/---- full match here for [string, "booger", number] + // vv + // ....booger.... + + // Which ever results in a match first is then used. So this 3-4 step + // process essentially allows us to say "match at this position, excluding + // a prior rule that was ignored". + // + // 1. Match "booger" first, ignore. Also proves that [string] does non match. + // 2. Resume matching for [number] + // 3. Match at index + 1 for [string, "booger", number] + // 4. If #2 and #3 result in matches, which came first? + if (this.resumingScanAtSamePosition()) { + if (result && result.index === this.lastIndex) ; else { // use the second matcher result + const m2 = this.getMatcher(0); + m2.lastIndex = this.lastIndex + 1; + result = m2.exec(s); + } + } + + if (result) { + this.regexIndex += result.position + 1; + if (this.regexIndex === this.count) { + // wrap-around to considering all matches again + this.considerAll(); + } + } + + return result; + } + } + + /** + * Given a mode, builds a huge ResumableMultiRegex that can be used to walk + * the content and find matches. + * + * @param {CompiledMode} mode + * @returns {ResumableMultiRegex} + */ + function buildModeRegex(mode) { + const mm = new ResumableMultiRegex(); + + mode.contains.forEach(term => mm.addRule(term.begin, { rule: term, type: "begin" })); + + if (mode.terminatorEnd) { + mm.addRule(mode.terminatorEnd, { type: "end" }); + } + if (mode.illegal) { + mm.addRule(mode.illegal, { type: "illegal" }); + } + + return mm; + } + + /** skip vs abort vs ignore + * + * @skip - The mode is still entered and exited normally (and contains rules apply), + * but all content is held and added to the parent buffer rather than being + * output when the mode ends. Mostly used with `sublanguage` to build up + * a single large buffer than can be parsed by sublanguage. + * + * - The mode begin ands ends normally. + * - Content matched is added to the parent mode buffer. + * - The parser cursor is moved forward normally. + * + * @abort - A hack placeholder until we have ignore. Aborts the mode (as if it + * never matched) but DOES NOT continue to match subsequent `contains` + * modes. Abort is bad/suboptimal because it can result in modes + * farther down not getting applied because an earlier rule eats the + * content but then aborts. + * + * - The mode does not begin. + * - Content matched by `begin` is added to the mode buffer. + * - The parser cursor is moved forward accordingly. + * + * @ignore - Ignores the mode (as if it never matched) and continues to match any + * subsequent `contains` modes. Ignore isn't technically possible with + * the current parser implementation. + * + * - The mode does not begin. + * - Content matched by `begin` is ignored. + * - The parser cursor is not moved forward. + */ + + /** + * Compiles an individual mode + * + * This can raise an error if the mode contains certain detectable known logic + * issues. + * @param {Mode} mode + * @param {CompiledMode | null} [parent] + * @returns {CompiledMode | never} + */ + function compileMode(mode, parent) { + const cmode = /** @type CompiledMode */ (mode); + if (mode.isCompiled) return cmode; + + [ + scopeClassName, + // do this early so compiler extensions generally don't have to worry about + // the distinction between match/begin + compileMatch, + MultiClass, + beforeMatchExt + ].forEach(ext => ext(mode, parent)); + + language.compilerExtensions.forEach(ext => ext(mode, parent)); + + // __beforeBegin is considered private API, internal use only + mode.__beforeBegin = null; + + [ + beginKeywords, + // do this later so compiler extensions that come earlier have access to the + // raw array if they wanted to perhaps manipulate it, etc. + compileIllegal, + // default to 1 relevance if not specified + compileRelevance + ].forEach(ext => ext(mode, parent)); + + mode.isCompiled = true; + + let keywordPattern = null; + if (typeof mode.keywords === "object" && mode.keywords.$pattern) { + // we need a copy because keywords might be compiled multiple times + // so we can't go deleting $pattern from the original on the first + // pass + mode.keywords = Object.assign({}, mode.keywords); + keywordPattern = mode.keywords.$pattern; + delete mode.keywords.$pattern; + } + keywordPattern = keywordPattern || /\w+/; + + if (mode.keywords) { + mode.keywords = compileKeywords(mode.keywords, language.case_insensitive); + } + + cmode.keywordPatternRe = langRe(keywordPattern, true); + + if (parent) { + if (!mode.begin) mode.begin = /\B|\b/; + cmode.beginRe = langRe(cmode.begin); + if (!mode.end && !mode.endsWithParent) mode.end = /\B|\b/; + if (mode.end) cmode.endRe = langRe(cmode.end); + cmode.terminatorEnd = source(cmode.end) || ''; + if (mode.endsWithParent && parent.terminatorEnd) { + cmode.terminatorEnd += (mode.end ? '|' : '') + parent.terminatorEnd; + } + } + if (mode.illegal) cmode.illegalRe = langRe(/** @type {RegExp | string} */ (mode.illegal)); + if (!mode.contains) mode.contains = []; + + mode.contains = [].concat(...mode.contains.map(function(c) { + return expandOrCloneMode(c === 'self' ? mode : c); + })); + mode.contains.forEach(function(c) { compileMode(/** @type Mode */ (c), cmode); }); + + if (mode.starts) { + compileMode(mode.starts, parent); + } + + cmode.matcher = buildModeRegex(cmode); + return cmode; + } + + if (!language.compilerExtensions) language.compilerExtensions = []; + + // self is not valid at the top-level + if (language.contains && language.contains.includes('self')) { + throw new Error("ERR: contains `self` is not supported at the top-level of a language. See documentation."); + } + + // we need a null object, which inherit will guarantee + language.classNameAliases = inherit$1(language.classNameAliases || {}); + + return compileMode(/** @type Mode */ (language)); + } + + /** + * Determines if a mode has a dependency on it's parent or not + * + * If a mode does have a parent dependency then often we need to clone it if + * it's used in multiple places so that each copy points to the correct parent, + * where-as modes without a parent can often safely be re-used at the bottom of + * a mode chain. + * + * @param {Mode | null} mode + * @returns {boolean} - is there a dependency on the parent? + * */ + function dependencyOnParent(mode) { + if (!mode) return false; + + return mode.endsWithParent || dependencyOnParent(mode.starts); + } + + /** + * Expands a mode or clones it if necessary + * + * This is necessary for modes with parental dependenceis (see notes on + * `dependencyOnParent`) and for nodes that have `variants` - which must then be + * exploded into their own individual modes at compile time. + * + * @param {Mode} mode + * @returns {Mode | Mode[]} + * */ + function expandOrCloneMode(mode) { + if (mode.variants && !mode.cachedVariants) { + mode.cachedVariants = mode.variants.map(function(variant) { + return inherit$1(mode, { variants: null }, variant); + }); + } + + // EXPAND + // if we have variants then essentially "replace" the mode with the variants + // this happens in compileMode, where this function is called from + if (mode.cachedVariants) { + return mode.cachedVariants; + } + + // CLONE + // if we have dependencies on parents then we need a unique + // instance of ourselves, so we can be reused with many + // different parents without issue + if (dependencyOnParent(mode)) { + return inherit$1(mode, { starts: mode.starts ? inherit$1(mode.starts) : null }); + } + + if (Object.isFrozen(mode)) { + return inherit$1(mode); + } + + // no special dependency issues, just return ourselves + return mode; + } + + var version = "11.7.0"; + + class HTMLInjectionError extends Error { + constructor(reason, html) { + super(reason); + this.name = "HTMLInjectionError"; + this.html = html; + } + } + + /* + Syntax highlighting with language autodetection. + https://highlightjs.org/ + */ + + /** + @typedef {import('highlight.js').Mode} Mode + @typedef {import('highlight.js').CompiledMode} CompiledMode + @typedef {import('highlight.js').CompiledScope} CompiledScope + @typedef {import('highlight.js').Language} Language + @typedef {import('highlight.js').HLJSApi} HLJSApi + @typedef {import('highlight.js').HLJSPlugin} HLJSPlugin + @typedef {import('highlight.js').PluginEvent} PluginEvent + @typedef {import('highlight.js').HLJSOptions} HLJSOptions + @typedef {import('highlight.js').LanguageFn} LanguageFn + @typedef {import('highlight.js').HighlightedHTMLElement} HighlightedHTMLElement + @typedef {import('highlight.js').BeforeHighlightContext} BeforeHighlightContext + @typedef {import('highlight.js/private').MatchType} MatchType + @typedef {import('highlight.js/private').KeywordData} KeywordData + @typedef {import('highlight.js/private').EnhancedMatch} EnhancedMatch + @typedef {import('highlight.js/private').AnnotatedError} AnnotatedError + @typedef {import('highlight.js').AutoHighlightResult} AutoHighlightResult + @typedef {import('highlight.js').HighlightOptions} HighlightOptions + @typedef {import('highlight.js').HighlightResult} HighlightResult + */ + + + const escape = escapeHTML; + const inherit = inherit$1; + const NO_MATCH = Symbol("nomatch"); + const MAX_KEYWORD_HITS = 7; + + /** + * @param {any} hljs - object that is extended (legacy) + * @returns {HLJSApi} + */ + const HLJS = function(hljs) { + // Global internal variables used within the highlight.js library. + /** @type {Record} */ + const languages = Object.create(null); + /** @type {Record} */ + const aliases = Object.create(null); + /** @type {HLJSPlugin[]} */ + const plugins = []; + + // safe/production mode - swallows more errors, tries to keep running + // even if a single syntax or parse hits a fatal error + let SAFE_MODE = true; + const LANGUAGE_NOT_FOUND = "Could not find the language '{}', did you forget to load/include a language module?"; + /** @type {Language} */ + const PLAINTEXT_LANGUAGE = { disableAutodetect: true, name: 'Plain text', contains: [] }; + + // Global options used when within external APIs. This is modified when + // calling the `hljs.configure` function. + /** @type HLJSOptions */ + let options = { + ignoreUnescapedHTML: false, + throwUnescapedHTML: false, + noHighlightRe: /^(no-?highlight)$/i, + languageDetectRe: /\blang(?:uage)?-([\w-]+)\b/i, + classPrefix: 'hljs-', + cssSelector: 'pre code', + languages: null, + // beta configuration options, subject to change, welcome to discuss + // https://github.com/highlightjs/highlight.js/issues/1086 + __emitter: TokenTreeEmitter + }; + + /* Utility functions */ + + /** + * Tests a language name to see if highlighting should be skipped + * @param {string} languageName + */ + function shouldNotHighlight(languageName) { + return options.noHighlightRe.test(languageName); + } + + /** + * @param {HighlightedHTMLElement} block - the HTML element to determine language for + */ + function blockLanguage(block) { + let classes = block.className + ' '; + + classes += block.parentNode ? block.parentNode.className : ''; + + // language-* takes precedence over non-prefixed class names. + const match = options.languageDetectRe.exec(classes); + if (match) { + const language = getLanguage(match[1]); + if (!language) { + warn(LANGUAGE_NOT_FOUND.replace("{}", match[1])); + warn("Falling back to no-highlight mode for this block.", block); + } + return language ? match[1] : 'no-highlight'; + } + + return classes + .split(/\s+/) + .find((_class) => shouldNotHighlight(_class) || getLanguage(_class)); + } + + /** + * Core highlighting function. + * + * OLD API + * highlight(lang, code, ignoreIllegals, continuation) + * + * NEW API + * highlight(code, {lang, ignoreIllegals}) + * + * @param {string} codeOrLanguageName - the language to use for highlighting + * @param {string | HighlightOptions} optionsOrCode - the code to highlight + * @param {boolean} [ignoreIllegals] - whether to ignore illegal matches, default is to bail + * + * @returns {HighlightResult} Result - an object that represents the result + * @property {string} language - the language name + * @property {number} relevance - the relevance score + * @property {string} value - the highlighted HTML code + * @property {string} code - the original raw code + * @property {CompiledMode} top - top of the current mode stack + * @property {boolean} illegal - indicates whether any illegal matches were found + */ + function highlight(codeOrLanguageName, optionsOrCode, ignoreIllegals) { + let code = ""; + let languageName = ""; + if (typeof optionsOrCode === "object") { + code = codeOrLanguageName; + ignoreIllegals = optionsOrCode.ignoreIllegals; + languageName = optionsOrCode.language; + } else { + // old API + deprecated("10.7.0", "highlight(lang, code, ...args) has been deprecated."); + deprecated("10.7.0", "Please use highlight(code, options) instead.\nhttps://github.com/highlightjs/highlight.js/issues/2277"); + languageName = codeOrLanguageName; + code = optionsOrCode; + } + + // https://github.com/highlightjs/highlight.js/issues/3149 + // eslint-disable-next-line no-undefined + if (ignoreIllegals === undefined) { ignoreIllegals = true; } + + /** @type {BeforeHighlightContext} */ + const context = { + code, + language: languageName + }; + // the plugin can change the desired language or the code to be highlighted + // just be changing the object it was passed + fire("before:highlight", context); + + // a before plugin can usurp the result completely by providing it's own + // in which case we don't even need to call highlight + const result = context.result + ? context.result + : _highlight(context.language, context.code, ignoreIllegals); + + result.code = context.code; + // the plugin can change anything in result to suite it + fire("after:highlight", result); + + return result; + } + + /** + * private highlight that's used internally and does not fire callbacks + * + * @param {string} languageName - the language to use for highlighting + * @param {string} codeToHighlight - the code to highlight + * @param {boolean?} [ignoreIllegals] - whether to ignore illegal matches, default is to bail + * @param {CompiledMode?} [continuation] - current continuation mode, if any + * @returns {HighlightResult} - result of the highlight operation + */ + function _highlight(languageName, codeToHighlight, ignoreIllegals, continuation) { + const keywordHits = Object.create(null); + + /** + * Return keyword data if a match is a keyword + * @param {CompiledMode} mode - current mode + * @param {string} matchText - the textual match + * @returns {KeywordData | false} + */ + function keywordData(mode, matchText) { + return mode.keywords[matchText]; + } + + function processKeywords() { + if (!top.keywords) { + emitter.addText(modeBuffer); + return; + } + + let lastIndex = 0; + top.keywordPatternRe.lastIndex = 0; + let match = top.keywordPatternRe.exec(modeBuffer); + let buf = ""; + + while (match) { + buf += modeBuffer.substring(lastIndex, match.index); + const word = language.case_insensitive ? match[0].toLowerCase() : match[0]; + const data = keywordData(top, word); + if (data) { + const [kind, keywordRelevance] = data; + emitter.addText(buf); + buf = ""; + + keywordHits[word] = (keywordHits[word] || 0) + 1; + if (keywordHits[word] <= MAX_KEYWORD_HITS) relevance += keywordRelevance; + if (kind.startsWith("_")) { + // _ implied for relevance only, do not highlight + // by applying a class name + buf += match[0]; + } else { + const cssClass = language.classNameAliases[kind] || kind; + emitter.addKeyword(match[0], cssClass); + } + } else { + buf += match[0]; + } + lastIndex = top.keywordPatternRe.lastIndex; + match = top.keywordPatternRe.exec(modeBuffer); + } + buf += modeBuffer.substring(lastIndex); + emitter.addText(buf); + } + + function processSubLanguage() { + if (modeBuffer === "") return; + /** @type HighlightResult */ + let result = null; + + if (typeof top.subLanguage === 'string') { + if (!languages[top.subLanguage]) { + emitter.addText(modeBuffer); + return; + } + result = _highlight(top.subLanguage, modeBuffer, true, continuations[top.subLanguage]); + continuations[top.subLanguage] = /** @type {CompiledMode} */ (result._top); + } else { + result = highlightAuto(modeBuffer, top.subLanguage.length ? top.subLanguage : null); + } + + // Counting embedded language score towards the host language may be disabled + // with zeroing the containing mode relevance. Use case in point is Markdown that + // allows XML everywhere and makes every XML snippet to have a much larger Markdown + // score. + if (top.relevance > 0) { + relevance += result.relevance; + } + emitter.addSublanguage(result._emitter, result.language); + } + + function processBuffer() { + if (top.subLanguage != null) { + processSubLanguage(); + } else { + processKeywords(); + } + modeBuffer = ''; + } + + /** + * @param {CompiledScope} scope + * @param {RegExpMatchArray} match + */ + function emitMultiClass(scope, match) { + let i = 1; + const max = match.length - 1; + while (i <= max) { + if (!scope._emit[i]) { i++; continue; } + const klass = language.classNameAliases[scope[i]] || scope[i]; + const text = match[i]; + if (klass) { + emitter.addKeyword(text, klass); + } else { + modeBuffer = text; + processKeywords(); + modeBuffer = ""; + } + i++; + } + } + + /** + * @param {CompiledMode} mode - new mode to start + * @param {RegExpMatchArray} match + */ + function startNewMode(mode, match) { + if (mode.scope && typeof mode.scope === "string") { + emitter.openNode(language.classNameAliases[mode.scope] || mode.scope); + } + if (mode.beginScope) { + // beginScope just wraps the begin match itself in a scope + if (mode.beginScope._wrap) { + emitter.addKeyword(modeBuffer, language.classNameAliases[mode.beginScope._wrap] || mode.beginScope._wrap); + modeBuffer = ""; + } else if (mode.beginScope._multi) { + // at this point modeBuffer should just be the match + emitMultiClass(mode.beginScope, match); + modeBuffer = ""; + } + } + + top = Object.create(mode, { parent: { value: top } }); + return top; + } + + /** + * @param {CompiledMode } mode - the mode to potentially end + * @param {RegExpMatchArray} match - the latest match + * @param {string} matchPlusRemainder - match plus remainder of content + * @returns {CompiledMode | void} - the next mode, or if void continue on in current mode + */ + function endOfMode(mode, match, matchPlusRemainder) { + let matched = startsWith(mode.endRe, matchPlusRemainder); + + if (matched) { + if (mode["on:end"]) { + const resp = new Response(mode); + mode["on:end"](match, resp); + if (resp.isMatchIgnored) matched = false; + } + + if (matched) { + while (mode.endsParent && mode.parent) { + mode = mode.parent; + } + return mode; + } + } + // even if on:end fires an `ignore` it's still possible + // that we might trigger the end node because of a parent mode + if (mode.endsWithParent) { + return endOfMode(mode.parent, match, matchPlusRemainder); + } + } + + /** + * Handle matching but then ignoring a sequence of text + * + * @param {string} lexeme - string containing full match text + */ + function doIgnore(lexeme) { + if (top.matcher.regexIndex === 0) { + // no more regexes to potentially match here, so we move the cursor forward one + // space + modeBuffer += lexeme[0]; + return 1; + } else { + // no need to move the cursor, we still have additional regexes to try and + // match at this very spot + resumeScanAtSamePosition = true; + return 0; + } + } + + /** + * Handle the start of a new potential mode match + * + * @param {EnhancedMatch} match - the current match + * @returns {number} how far to advance the parse cursor + */ + function doBeginMatch(match) { + const lexeme = match[0]; + const newMode = match.rule; + + const resp = new Response(newMode); + // first internal before callbacks, then the public ones + const beforeCallbacks = [newMode.__beforeBegin, newMode["on:begin"]]; + for (const cb of beforeCallbacks) { + if (!cb) continue; + cb(match, resp); + if (resp.isMatchIgnored) return doIgnore(lexeme); + } + + if (newMode.skip) { + modeBuffer += lexeme; + } else { + if (newMode.excludeBegin) { + modeBuffer += lexeme; + } + processBuffer(); + if (!newMode.returnBegin && !newMode.excludeBegin) { + modeBuffer = lexeme; + } + } + startNewMode(newMode, match); + return newMode.returnBegin ? 0 : lexeme.length; + } + + /** + * Handle the potential end of mode + * + * @param {RegExpMatchArray} match - the current match + */ + function doEndMatch(match) { + const lexeme = match[0]; + const matchPlusRemainder = codeToHighlight.substring(match.index); + + const endMode = endOfMode(top, match, matchPlusRemainder); + if (!endMode) { return NO_MATCH; } + + const origin = top; + if (top.endScope && top.endScope._wrap) { + processBuffer(); + emitter.addKeyword(lexeme, top.endScope._wrap); + } else if (top.endScope && top.endScope._multi) { + processBuffer(); + emitMultiClass(top.endScope, match); + } else if (origin.skip) { + modeBuffer += lexeme; + } else { + if (!(origin.returnEnd || origin.excludeEnd)) { + modeBuffer += lexeme; + } + processBuffer(); + if (origin.excludeEnd) { + modeBuffer = lexeme; + } + } + do { + if (top.scope) { + emitter.closeNode(); + } + if (!top.skip && !top.subLanguage) { + relevance += top.relevance; + } + top = top.parent; + } while (top !== endMode.parent); + if (endMode.starts) { + startNewMode(endMode.starts, match); + } + return origin.returnEnd ? 0 : lexeme.length; + } + + function processContinuations() { + const list = []; + for (let current = top; current !== language; current = current.parent) { + if (current.scope) { + list.unshift(current.scope); + } + } + list.forEach(item => emitter.openNode(item)); + } + + /** @type {{type?: MatchType, index?: number, rule?: Mode}}} */ + let lastMatch = {}; + + /** + * Process an individual match + * + * @param {string} textBeforeMatch - text preceding the match (since the last match) + * @param {EnhancedMatch} [match] - the match itself + */ + function processLexeme(textBeforeMatch, match) { + const lexeme = match && match[0]; + + // add non-matched text to the current mode buffer + modeBuffer += textBeforeMatch; + + if (lexeme == null) { + processBuffer(); + return 0; + } + + // we've found a 0 width match and we're stuck, so we need to advance + // this happens when we have badly behaved rules that have optional matchers to the degree that + // sometimes they can end up matching nothing at all + // Ref: https://github.com/highlightjs/highlight.js/issues/2140 + if (lastMatch.type === "begin" && match.type === "end" && lastMatch.index === match.index && lexeme === "") { + // spit the "skipped" character that our regex choked on back into the output sequence + modeBuffer += codeToHighlight.slice(match.index, match.index + 1); + if (!SAFE_MODE) { + /** @type {AnnotatedError} */ + const err = new Error(`0 width match regex (${languageName})`); + err.languageName = languageName; + err.badRule = lastMatch.rule; + throw err; + } + return 1; + } + lastMatch = match; + + if (match.type === "begin") { + return doBeginMatch(match); + } else if (match.type === "illegal" && !ignoreIllegals) { + // illegal match, we do not continue processing + /** @type {AnnotatedError} */ + const err = new Error('Illegal lexeme "' + lexeme + '" for mode "' + (top.scope || '') + '"'); + err.mode = top; + throw err; + } else if (match.type === "end") { + const processed = doEndMatch(match); + if (processed !== NO_MATCH) { + return processed; + } + } + + // edge case for when illegal matches $ (end of line) which is technically + // a 0 width match but not a begin/end match so it's not caught by the + // first handler (when ignoreIllegals is true) + if (match.type === "illegal" && lexeme === "") { + // advance so we aren't stuck in an infinite loop + return 1; + } + + // infinite loops are BAD, this is a last ditch catch all. if we have a + // decent number of iterations yet our index (cursor position in our + // parsing) still 3x behind our index then something is very wrong + // so we bail + if (iterations > 100000 && iterations > match.index * 3) { + const err = new Error('potential infinite loop, way more iterations than matches'); + throw err; + } + + /* + Why might be find ourselves here? An potential end match that was + triggered but could not be completed. IE, `doEndMatch` returned NO_MATCH. + (this could be because a callback requests the match be ignored, etc) + + This causes no real harm other than stopping a few times too many. + */ + + modeBuffer += lexeme; + return lexeme.length; + } + + const language = getLanguage(languageName); + if (!language) { + error(LANGUAGE_NOT_FOUND.replace("{}", languageName)); + throw new Error('Unknown language: "' + languageName + '"'); + } + + const md = compileLanguage(language); + let result = ''; + /** @type {CompiledMode} */ + let top = continuation || md; + /** @type Record */ + const continuations = {}; // keep continuations for sub-languages + const emitter = new options.__emitter(options); + processContinuations(); + let modeBuffer = ''; + let relevance = 0; + let index = 0; + let iterations = 0; + let resumeScanAtSamePosition = false; + + try { + top.matcher.considerAll(); + + for (;;) { + iterations++; + if (resumeScanAtSamePosition) { + // only regexes not matched previously will now be + // considered for a potential match + resumeScanAtSamePosition = false; + } else { + top.matcher.considerAll(); + } + top.matcher.lastIndex = index; + + const match = top.matcher.exec(codeToHighlight); + // console.log("match", match[0], match.rule && match.rule.begin) + + if (!match) break; + + const beforeMatch = codeToHighlight.substring(index, match.index); + const processedCount = processLexeme(beforeMatch, match); + index = match.index + processedCount; + } + processLexeme(codeToHighlight.substring(index)); + emitter.closeAllNodes(); + emitter.finalize(); + result = emitter.toHTML(); + + return { + language: languageName, + value: result, + relevance: relevance, + illegal: false, + _emitter: emitter, + _top: top + }; + } catch (err) { + if (err.message && err.message.includes('Illegal')) { + return { + language: languageName, + value: escape(codeToHighlight), + illegal: true, + relevance: 0, + _illegalBy: { + message: err.message, + index: index, + context: codeToHighlight.slice(index - 100, index + 100), + mode: err.mode, + resultSoFar: result + }, + _emitter: emitter + }; + } else if (SAFE_MODE) { + return { + language: languageName, + value: escape(codeToHighlight), + illegal: false, + relevance: 0, + errorRaised: err, + _emitter: emitter, + _top: top + }; + } else { + throw err; + } + } + } + + /** + * returns a valid highlight result, without actually doing any actual work, + * auto highlight starts with this and it's possible for small snippets that + * auto-detection may not find a better match + * @param {string} code + * @returns {HighlightResult} + */ + function justTextHighlightResult(code) { + const result = { + value: escape(code), + illegal: false, + relevance: 0, + _top: PLAINTEXT_LANGUAGE, + _emitter: new options.__emitter(options) + }; + result._emitter.addText(code); + return result; + } + + /** + Highlighting with language detection. Accepts a string with the code to + highlight. Returns an object with the following properties: + + - language (detected language) + - relevance (int) + - value (an HTML string with highlighting markup) + - secondBest (object with the same structure for second-best heuristically + detected language, may be absent) + + @param {string} code + @param {Array} [languageSubset] + @returns {AutoHighlightResult} + */ + function highlightAuto(code, languageSubset) { + languageSubset = languageSubset || options.languages || Object.keys(languages); + const plaintext = justTextHighlightResult(code); + + const results = languageSubset.filter(getLanguage).filter(autoDetection).map(name => + _highlight(name, code, false) + ); + results.unshift(plaintext); // plaintext is always an option + + const sorted = results.sort((a, b) => { + // sort base on relevance + if (a.relevance !== b.relevance) return b.relevance - a.relevance; + + // always award the tie to the base language + // ie if C++ and Arduino are tied, it's more likely to be C++ + if (a.language && b.language) { + if (getLanguage(a.language).supersetOf === b.language) { + return 1; + } else if (getLanguage(b.language).supersetOf === a.language) { + return -1; + } + } + + // otherwise say they are equal, which has the effect of sorting on + // relevance while preserving the original ordering - which is how ties + // have historically been settled, ie the language that comes first always + // wins in the case of a tie + return 0; + }); + + const [best, secondBest] = sorted; + + /** @type {AutoHighlightResult} */ + const result = best; + result.secondBest = secondBest; + + return result; + } + + /** + * Builds new class name for block given the language name + * + * @param {HTMLElement} element + * @param {string} [currentLang] + * @param {string} [resultLang] + */ + function updateClassName(element, currentLang, resultLang) { + const language = (currentLang && aliases[currentLang]) || resultLang; + + element.classList.add("hljs"); + element.classList.add(`language-${language}`); + } + + /** + * Applies highlighting to a DOM node containing code. + * + * @param {HighlightedHTMLElement} element - the HTML element to highlight + */ + function highlightElement(element) { + /** @type HTMLElement */ + let node = null; + const language = blockLanguage(element); + + if (shouldNotHighlight(language)) return; + + fire("before:highlightElement", + { el: element, language: language }); + + // we should be all text, no child nodes (unescaped HTML) - this is possibly + // an HTML injection attack - it's likely too late if this is already in + // production (the code has likely already done its damage by the time + // we're seeing it)... but we yell loudly about this so that hopefully it's + // more likely to be caught in development before making it to production + if (element.children.length > 0) { + if (!options.ignoreUnescapedHTML) { + console.warn("One of your code blocks includes unescaped HTML. This is a potentially serious security risk."); + console.warn("https://github.com/highlightjs/highlight.js/wiki/security"); + console.warn("The element with unescaped HTML:"); + console.warn(element); + } + if (options.throwUnescapedHTML) { + const err = new HTMLInjectionError( + "One of your code blocks includes unescaped HTML.", + element.innerHTML + ); + throw err; + } + } + + node = element; + const text = node.textContent; + const result = language ? highlight(text, { language, ignoreIllegals: true }) : highlightAuto(text); + + element.innerHTML = result.value; + updateClassName(element, language, result.language); + element.result = { + language: result.language, + // TODO: remove with version 11.0 + re: result.relevance, + relevance: result.relevance + }; + if (result.secondBest) { + element.secondBest = { + language: result.secondBest.language, + relevance: result.secondBest.relevance + }; + } + + fire("after:highlightElement", { el: element, result, text }); + } + + /** + * Updates highlight.js global options with the passed options + * + * @param {Partial} userOptions + */ + function configure(userOptions) { + options = inherit(options, userOptions); + } + + // TODO: remove v12, deprecated + const initHighlighting = () => { + highlightAll(); + deprecated("10.6.0", "initHighlighting() deprecated. Use highlightAll() now."); + }; + + // TODO: remove v12, deprecated + function initHighlightingOnLoad() { + highlightAll(); + deprecated("10.6.0", "initHighlightingOnLoad() deprecated. Use highlightAll() now."); + } + + let wantsHighlight = false; + + /** + * auto-highlights all pre>code elements on the page + */ + function highlightAll() { + // if we are called too early in the loading process + if (document.readyState === "loading") { + wantsHighlight = true; + return; + } + + const blocks = document.querySelectorAll(options.cssSelector); + blocks.forEach(highlightElement); + } + + function boot() { + // if a highlight was requested before DOM was loaded, do now + if (wantsHighlight) highlightAll(); + } + + // make sure we are in the browser environment + if (typeof window !== 'undefined' && window.addEventListener) { + window.addEventListener('DOMContentLoaded', boot, false); + } + + /** + * Register a language grammar module + * + * @param {string} languageName + * @param {LanguageFn} languageDefinition + */ + function registerLanguage(languageName, languageDefinition) { + let lang = null; + try { + lang = languageDefinition(hljs); + } catch (error$1) { + error("Language definition for '{}' could not be registered.".replace("{}", languageName)); + // hard or soft error + if (!SAFE_MODE) { throw error$1; } else { error(error$1); } + // languages that have serious errors are replaced with essentially a + // "plaintext" stand-in so that the code blocks will still get normal + // css classes applied to them - and one bad language won't break the + // entire highlighter + lang = PLAINTEXT_LANGUAGE; + } + // give it a temporary name if it doesn't have one in the meta-data + if (!lang.name) lang.name = languageName; + languages[languageName] = lang; + lang.rawDefinition = languageDefinition.bind(null, hljs); + + if (lang.aliases) { + registerAliases(lang.aliases, { languageName }); + } + } + + /** + * Remove a language grammar module + * + * @param {string} languageName + */ + function unregisterLanguage(languageName) { + delete languages[languageName]; + for (const alias of Object.keys(aliases)) { + if (aliases[alias] === languageName) { + delete aliases[alias]; + } + } + } + + /** + * @returns {string[]} List of language internal names + */ + function listLanguages() { + return Object.keys(languages); + } + + /** + * @param {string} name - name of the language to retrieve + * @returns {Language | undefined} + */ + function getLanguage(name) { + name = (name || '').toLowerCase(); + return languages[name] || languages[aliases[name]]; + } + + /** + * + * @param {string|string[]} aliasList - single alias or list of aliases + * @param {{languageName: string}} opts + */ + function registerAliases(aliasList, { languageName }) { + if (typeof aliasList === 'string') { + aliasList = [aliasList]; + } + aliasList.forEach(alias => { aliases[alias.toLowerCase()] = languageName; }); + } + + /** + * Determines if a given language has auto-detection enabled + * @param {string} name - name of the language + */ + function autoDetection(name) { + const lang = getLanguage(name); + return lang && !lang.disableAutodetect; + } + + /** + * Upgrades the old highlightBlock plugins to the new + * highlightElement API + * @param {HLJSPlugin} plugin + */ + function upgradePluginAPI(plugin) { + // TODO: remove with v12 + if (plugin["before:highlightBlock"] && !plugin["before:highlightElement"]) { + plugin["before:highlightElement"] = (data) => { + plugin["before:highlightBlock"]( + Object.assign({ block: data.el }, data) + ); + }; + } + if (plugin["after:highlightBlock"] && !plugin["after:highlightElement"]) { + plugin["after:highlightElement"] = (data) => { + plugin["after:highlightBlock"]( + Object.assign({ block: data.el }, data) + ); + }; + } + } + + /** + * @param {HLJSPlugin} plugin + */ + function addPlugin(plugin) { + upgradePluginAPI(plugin); + plugins.push(plugin); + } + + /** + * + * @param {PluginEvent} event + * @param {any} args + */ + function fire(event, args) { + const cb = event; + plugins.forEach(function(plugin) { + if (plugin[cb]) { + plugin[cb](args); + } + }); + } + + /** + * DEPRECATED + * @param {HighlightedHTMLElement} el + */ + function deprecateHighlightBlock(el) { + deprecated("10.7.0", "highlightBlock will be removed entirely in v12.0"); + deprecated("10.7.0", "Please use highlightElement now."); + + return highlightElement(el); + } + + /* Interface definition */ + Object.assign(hljs, { + highlight, + highlightAuto, + highlightAll, + highlightElement, + // TODO: Remove with v12 API + highlightBlock: deprecateHighlightBlock, + configure, + initHighlighting, + initHighlightingOnLoad, + registerLanguage, + unregisterLanguage, + listLanguages, + getLanguage, + registerAliases, + autoDetection, + inherit, + addPlugin + }); + + hljs.debugMode = function() { SAFE_MODE = false; }; + hljs.safeMode = function() { SAFE_MODE = true; }; + hljs.versionString = version; + + hljs.regex = { + concat: concat, + lookahead: lookahead, + either: either, + optional: optional, + anyNumberOfTimes: anyNumberOfTimes + }; + + for (const key in MODES) { + // @ts-ignore + if (typeof MODES[key] === "object") { + // @ts-ignore + deepFreezeEs6.exports(MODES[key]); + } + } + + // merge all the modes/regexes into our main object + Object.assign(hljs, MODES); + + return hljs; + }; + + // export an "instance" of the highlighter + var highlight = HLJS({}); + + return highlight; + +})(); +if (typeof exports === 'object' && typeof module !== 'undefined') { module.exports = hljs; } diff --git a/docs/doxygen/layout-clang-uml.xml b/docs/doxygen/layout-clang-uml.xml new file mode 100644 index 00000000..f8d7951a --- /dev/null +++ b/docs/doxygen/layout-clang-uml.xml @@ -0,0 +1,245 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/docs/generator_types.md b/docs/generator_types.md index ad6b0786..fa909016 100644 --- a/docs/generator_types.md +++ b/docs/generator_types.md @@ -9,6 +9,15 @@ Currently, there are 2 types of diagram generators: `plantuml` and `json`. +To specify, which generators should be used on the command line use option `-g`. +For instance to generate both types of diagrams run `clang-uml` as follows: + +```bash +clang-uml -g plantuml -g json +``` + +By default only `plantuml` diagram are generated. + ## PlantUML Generates UML diagrams in textual PlantUML format, which can then diff --git a/docs/img/github-mark.svg b/docs/img/github-mark.svg new file mode 100644 index 00000000..37fa923d --- /dev/null +++ b/docs/img/github-mark.svg @@ -0,0 +1 @@ + \ No newline at end of file diff --git a/docs/include_diagrams.md b/docs/include_diagrams.md index b8b6c087..b05f0fbf 100644 --- a/docs/include_diagrams.md +++ b/docs/include_diagrams.md @@ -2,7 +2,7 @@ -* [Tracking system headers directly included by projects files](#tracking-system-headers-directly-included-by-projects-files) +* [Tracking system headers directly included by project sources](#tracking-system-headers-directly-included-by-project-sources) @@ -46,7 +46,7 @@ The following table presents the PlantUML arrows representing relationships in t | Include (local) | ![association](img/puml_association.png) | | Include (system) | ![dependency](img/puml_dependency.png) | -## Tracking system headers directly included by projects files +## Tracking system headers directly included by project sources In case you would like to include the information about what system headers your projects file include simply add the following option to the diagram: diff --git a/docs/jinja_templates.md b/docs/jinja_templates.md index 9644e253..0e5a4629 100644 --- a/docs/jinja_templates.md +++ b/docs/jinja_templates.md @@ -3,8 +3,8 @@ * [Accessing comment content](#accessing-comment-content) - * [`plain` comment parser](#plain-comment-parser) - * [`clang` comment parser](#clang-comment-parser) + * ['plain' comment parser](#plain-comment-parser) + * ['clang' comment parser](#clang-comment-parser) @@ -102,12 +102,12 @@ Currently there are 2 available comment parsers: They can be selected using `comment_parser` config option. -#### `plain` comment parser +#### 'plain' comment parser This parser provides only 2 options to the Jinja context: * `comment.raw` - raw comment text, including comment markers such as `///` or `/**` * `comment.formatted` - formatted entire comment -#### `clang` comment parser +#### 'clang' comment parser This parser uses Clang comment parsing API to extract commands from the command: * `comment.raw` - raw comment text, including comment markers such as `///` or `/**` * `comment.formatted` - formatted entire comment diff --git a/docs/license.md b/docs/license.md new file mode 100644 index 00000000..14073f32 --- /dev/null +++ b/docs/license.md @@ -0,0 +1,15 @@ +## LICENSE + + Copyright 2021-present Bartek Kryza + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. diff --git a/docs/quick_start.md b/docs/quick_start.md index 6dbfc98a..6e70a9f0 100644 --- a/docs/quick_start.md +++ b/docs/quick_start.md @@ -10,7 +10,7 @@ To add an initial class diagram to your project, follow these steps: 1. Enter your projects top level directory and run: ```bash - $ clang-uml --init + clang-uml --init ``` 2. Edit the generated `.clang-uml` file and set the following: ```yaml @@ -39,21 +39,21 @@ To add an initial class diagram to your project, follow these steps: ``` 3. Run `clang-uml` in the projects top directory: ```bash - $ clang-uml + clang-uml # or to see generation progress for each diagram run - $ clang-uml --progress + clang-uml --progress ``` 4. Generate SVG images from the PlantUML diagrams: ```bash - $ plantuml -tsvg puml/*.puml + plantuml -tsvg puml/*.puml ``` 5. Add another diagram: ```bash - $ clang-uml --add-sequence-diagram another_diagram + clang-uml --add-sequence-diagram another_diagram ``` 6. Now list the diagrams defined in the config: ```bash - $ clang-uml -l + clang-uml -l The following diagrams are defined in the config file: - another_diagram [sequence] - some_class_diagram [class] diff --git a/docs/sequence_diagrams.md b/docs/sequence_diagrams.md index b7fbd723..f1c802df 100644 --- a/docs/sequence_diagrams.md +++ b/docs/sequence_diagrams.md @@ -89,9 +89,11 @@ Then you just need to copy and paste the signature exactly and rerun `clang-uml` By default, `clang-uml` will generate a new participant for each call to a free function (not method), which can lead to a very large number of participants in the diagram. If it's an issue, an option can be provided in the diagram definition: + ```yaml combine_free_functions_into_file_participants: true ``` + which will aggregate free functions per source file where they were declared thus minimizing the diagram size. An example of such diagram is presented below: @@ -109,6 +111,7 @@ following rules: Another issue is the naming of lambda participants. Currently, each lambda is rendered in the diagram as a separate class whose name is composed of the lambda location in the code (the only unique way of identifying lambdas I was able to find). For example the following code: + ```cpp #include #include diff --git a/docs/troubleshooting.md b/docs/troubleshooting.md index 7e66712e..211e8d46 100644 --- a/docs/troubleshooting.md +++ b/docs/troubleshooting.md @@ -3,16 +3,16 @@ * [General issues](#general-issues) - * [`clang-uml` crashed when generating diagram](#clang-uml-crashed-when-generating-diagram) + * [clang-uml crashes when generating diagram](#clang-uml-crashes-when-generating-diagram) * [Diagram generation is very slow](#diagram-generation-is-very-slow) * [Diagram generated with PlantUML is cropped](#diagram-generated-with-plantuml-is-cropped) - * [`clang` produces several warnings during diagram generation](#clang-produces-several-warnings-during-diagram-generation) + * [Clang produces several warnings during diagram generation](#clang-produces-several-warnings-during-diagram-generation) * [Cannot generate diagrams from header-only projects](#cannot-generate-diagrams-from-header-only-projects) * [YAML anchors and aliases are not fully supported](#yaml-anchors-and-aliases-are-not-fully-supported) * [Class diagrams](#class-diagrams) * ["fatal error: 'stddef.h' file not found"](#fatal-error-stddefh-file-not-found) * [How can I generate class diagram of my entire project](#how-can-i-generate-class-diagram-of-my-entire-project) - * [Cannot generate classes for `std` namespace](#cannot-generate-classes-for-std-namespace) + * [Cannot generate classes for 'std' namespace](#cannot-generate-classes-for-std-namespace) * [Sequence diagrams](#sequence-diagrams) * [Generated diagram is empty](#generated-diagram-is-empty) * [Generated diagram contains several empty control blocks or calls which should not be there](#generated-diagram-contains-several-empty-control-blocks-or-calls-which-should-not-be-there) @@ -21,14 +21,14 @@ ## General issues -### `clang-uml` crashed when generating diagram +### clang-uml crashes when generating diagram If `clang-uml` crashes with a segmentation fault, it is possible to trace the exact stack trace of the fault using the following steps: First, build `clang-uml` from source in debug mode, e.g.: ```bash -$ make debug +make debug ``` Then run `clang-uml`, preferably with `-vvv` for verbose log output. If your @@ -36,7 +36,7 @@ Then run `clang-uml`, preferably with `-vvv` for verbose log output. If your a single diagram to make it easier to trace the root cause of the crash, e.g.: ```bash -$ debug/src/clang-uml -vvv -n my_diagram +debug/src/clang-uml -vvv -n my_diagram ``` After `clang-uml` crashes again, detailed backtrace (generated using @@ -96,11 +96,11 @@ format and then convert to PNG, e.g.: ```bash -$ plantuml -tsvg mydiagram.puml -$ convert +antialias mydiagram.svg mydiagram.png +plantuml -tsvg mydiagram.puml +convert +antialias mydiagram.svg mydiagram.png ``` -### `clang` produces several warnings during diagram generation +### Clang produces several warnings during diagram generation During the generation of the diagram `clang` may report a lot of warnings, which do not occur during the compilation with other compiler (e.g. GCC). This can be @@ -123,8 +123,8 @@ add_compile_flags: Alternatively, the same can be passed through the `clang-uml` command line, e.g. ```bash -$ clang-uml --add-compile-flag -Wno-implicit-const-int-float-conversion \ - --add-compile-flag -Wno-shadow ... +clang-uml --add-compile-flag -Wno-implicit-const-int-float-conversion \ + --add-compile-flag -Wno-shadow ... ``` Please note that if your `compile_commands.json` already contains - for instance @@ -219,7 +219,7 @@ will be added to each compile command in the database. This is especially useful on macOS as well as for embedded toolchains, example usage: ```bash -$ clang-uml --query-driver arm-none-eabi-g++ +clang-uml --query-driver arm-none-eabi-g++ ``` Another option is to make sure that the Clang is installed on the system (even @@ -249,8 +249,8 @@ remove_compile_flags: These options can be also passed on the command line, for instance: ```bash -$ clang-uml --add-compile-flag -I/opt/my_toolchain/include \ - --remove-compile-flag -I/usr/include ... +clang-uml --add-compile-flag -I/opt/my_toolchain/include \ + --remove-compile-flag -I/usr/include ... ``` ### How can I generate class diagram of my entire project @@ -273,10 +273,10 @@ be readable. However, this option can be useful for cases when we want to get a complete JSON model of the codebase using the JSON generator: ```bash -$ clang-uml -g json -n all_classes --progress +clang-uml -g json -n all_classes --progress ``` -### Cannot generate classes for `std` namespace +### Cannot generate classes for 'std' namespace Currently, system headers are skipped automatically by `clang-uml`, due to too many errors they produce when generating diagrams, especially when trying to process `GCC`'s or `MSVC`'s system headers by `Clang` - not yet sure why @@ -302,6 +302,7 @@ exclude: Hopefully this will be eventually resolved. ## Sequence diagrams + ### Generated diagram is empty In order to generate sequence diagram the `start_from` configuration option must
-
$projectname -  $projectnumber -
-
$projectbrief
+
+ $projectnumber +
+
$projectbrief
-
$projectbrief
-
+
+ + GitHub + +
+
$searchbox$searchbox