api #

Declarations
#

syntax styling utilities and components for TypeScript, Svelte, and Markdown

39 declarations

add_grammar_clike
#

add_grammar_css
#

add_grammar_js
#

add_grammar_json
#

add_grammar_markdown
#

grammar_markdown.ts view source

(syntax_styler: SyntaxStyler): void

Markdown grammar extending markup. Supports: headings, fenced code blocks (3/4/5 backticks with nesting), lists, blockquotes, bold, italic, strikethrough, inline code, and links.

syntax_styler

returns

void

add_grammar_markup
#

add_grammar_svelte
#

add_grammar_ts
#

AddSyntaxGrammar
#

class_keywords
#

Code
#

Code.svelte view source

content

The source code to syntax highlight.

type string

lang?

Language identifier (e.g., 'ts', 'css', 'html', 'json', 'svelte', 'md').

Purpose:
- When grammar is not provided, used to look up the grammar via syntax_styler.get_lang(lang)
- Used for metadata: sets the data-lang attribute and determines language_supported

Special values:
- null: explicitly disables syntax highlighting (content rendered as plain text)
- undefined: falls back to the default ('svelte')

Relationship with grammar:
- If both lang and grammar are provided, grammar takes precedence for tokenization
- However, lang is still used for the data-lang attribute and language detection

type string | null
optional

grammar?

Optional custom grammar object for syntax tokenization.

When to use:
- To provide a custom language definition not registered in syntax_styler.langs
- To use a modified/extended version of an existing grammar
- For one-off grammar variations without registering globally

Behavior:
- When provided, this grammar is used for tokenization instead of looking up via lang
- Enables highlighting even if lang is not in the registry (useful for custom languages)
- The lang parameter is still used for metadata (data-lang attribute)
- When undefined, the grammar is automatically looked up via syntax_styler.get_lang(lang)

type SyntaxGrammar | undefined
optional

inline?

Whether to render as inline code or block code. Controls display via CSS classes.

type boolean
optional

wrap?

Whether to wrap long lines in block code. Sets white-space: pre-wrap instead of white-space: pre.

Behavior:
- Wraps at whitespace (spaces, newlines)
- Long tokens without spaces (URLs, hashes) will still scroll horizontally
- Default false provides traditional code block behavior

Only affects block code (ignored for inline mode).

type boolean
optional

syntax_styler?

Custom SyntaxStyler instance to use for highlighting. Allows using a different styler with custom grammars or configuration.

optional

children?

Optional snippet to customize how the highlighted markup is rendered. Receives the generated HTML string as a parameter.

type Snippet<[markup: string]>
optional
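The lang and grammar rules described for this component can be condensed into a small sketch. This is not the component's implementation: `resolve_highlight`, `SketchGrammar`, and `SketchRegistry` are hypothetical names, the registry object stands in for syntax_styler.langs, and returning null for an unregistered language (and for lang = null even when a grammar is passed) is an assumption made for illustration.

```typescript
// Hypothetical stand-ins for the real types; only the shapes matter here.
type SketchGrammar = Record<string, RegExp>;
type SketchRegistry = Record<string, SketchGrammar | undefined>;

interface ResolvedHighlight {
  data_lang: string | null; // value for the data-lang attribute
  grammar: SketchGrammar | null; // null means "render as plain text"
}

// Mirrors the documented rules: null disables highlighting, undefined falls
// back to 'svelte', and an explicit grammar wins over the registry lookup.
const resolve_highlight = (
  registry: SketchRegistry,
  lang: string | null | undefined,
  grammar?: SketchGrammar,
): ResolvedHighlight => {
  if (lang === null) return {data_lang: null, grammar: null};
  const effective_lang = lang ?? 'svelte';
  return {
    data_lang: effective_lang,
    grammar: grammar ?? registry[effective_lang] ?? null,
  };
};

const sketch_registry: SketchRegistry = {
  svelte: {tag: /<\/?\w+/},
  ts: {keyword: /\bconst\b/},
};
const custom_grammar: SketchGrammar = {shout: /[A-Z]{2,}/};

const plain = resolve_highlight(sketch_registry, null); // highlighting disabled
const fallback = resolve_highlight(sketch_registry, undefined); // uses 'svelte'
const overridden = resolve_highlight(sketch_registry, 'ts', custom_grammar); // custom grammar, 'ts' label
```

In the third call the custom grammar drives tokenization while 'ts' is still what ends up in the metadata, which matches the precedence described above.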

CodeHighlight
#

CodeHighlight.svelte view source

content

The source code to syntax highlight.

type string

lang?

Language identifier (e.g., 'ts', 'css', 'html', 'json', 'svelte', 'md').

Purpose:
- When grammar is not provided, used to look up the grammar via syntax_styler.get_lang(lang)
- Used for metadata: sets the data-lang attribute and determines language_supported

Special values:
- null: explicitly disables syntax highlighting (content rendered as plain text)
- undefined: falls back to the default ('svelte')

Relationship with grammar:
- If both lang and grammar are provided, grammar takes precedence for tokenization
- However, lang is still used for the data-lang attribute and language detection

type string | null
optional

mode?

Highlighting mode for this component.

Options:
- 'auto': uses the CSS Custom Highlight API if supported, falls back to HTML mode
- 'ranges': forces the CSS Custom Highlight API (requires browser support)
- 'html': forces HTML generation with CSS classes

Note: CSS Custom Highlight API has limitations and limited browser support. Requires importing theme_highlight.css instead of theme.css.

optional

grammar?

Optional custom grammar object for syntax tokenization.

When to use:
- To provide a custom language definition not registered in syntax_styler.langs
- To use a modified/extended version of an existing grammar
- For one-off grammar variations without registering globally

Behavior:
- When provided, this grammar is used for tokenization instead of looking up via lang
- Enables highlighting even if lang is not in the registry (useful for custom languages)
- The lang parameter is still used for metadata (data-lang attribute)
- When undefined, the grammar is automatically looked up via syntax_styler.get_lang(lang)

type SyntaxGrammar | undefined
optional

inline?

Whether to render as inline code or block code. Controls display via CSS classes.

type boolean
optional

wrap?

Whether to wrap long lines in block code. Sets white-space: pre-wrap instead of white-space: pre.

Behavior:
- Wraps at whitespace (spaces, newlines)
- Long tokens without spaces (URLs, hashes) will still scroll horizontally
- Default false provides traditional code block behavior

Only affects block code (ignored for inline mode).

type boolean
optional

syntax_styler?

Custom SyntaxStyler instance to use for highlighting. Allows using a different styler with custom grammars or configuration.

optional

children?

Optional snippet to customize how the highlighted markup is rendered.
- In HTML mode: receives the generated HTML string
- In range mode: receives the plain text content

type Snippet<[markup: string]>
optional
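The three mode options reduce to a simple decision once browser support is known. The sketch below is illustrative only: `resolve_highlight_mode` and `SketchHighlightMode` are hypothetical names, and `api_supported` stands in for whatever runtime check the library performs (the declaration list includes supports_css_highlight_api, but its signature is not shown here). What a forced 'ranges' mode does in an unsupported browser is not specified above, so the sketch simply passes it through.

```typescript
type SketchHighlightMode = 'auto' | 'ranges' | 'html';

// Resolves the requested mode to the strategy that would actually be used.
const resolve_highlight_mode = (
  mode: SketchHighlightMode,
  api_supported: boolean,
): 'ranges' | 'html' => {
  if (mode === 'auto') return api_supported ? 'ranges' : 'html';
  return mode; // 'ranges' and 'html' are forced as requested
};

const auto_with_support = resolve_highlight_mode('auto', true); // 'ranges'
const auto_without_support = resolve_highlight_mode('auto', false); // 'html'
const forced_html = resolve_highlight_mode('html', true); // 'html'
```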

CodeSample
#

grammar_markup_add_attribute
#

grammar_markup.ts view source

(syntax_styler: SyntaxStyler, attr_name: string, lang: string): void

Adds a pattern to style languages embedded in HTML attributes.

An example of an inlined language is CSS with style attributes.

syntax_styler

attr_name

- The name of the attribute that contains the inlined language. This name will be treated as case insensitive.

type string

lang

- The language key.

type string

returns

void

grammar_markup_add_inlined
#

grammar_markup.ts view source

(syntax_styler: SyntaxStyler, tag_name: string, lang: string, inside_lang?: string): void

Adds an inlined language to markup.

An example of an inlined language is CSS with <style> tags.

syntax_styler

tag_name

- The name of the tag that contains the inlined language. This name will be treated as case insensitive.

type string

lang

- The language key.

type string

inside_lang

type string
default 'markup'

returns

void

grammar_svelte_add_inlined
#

highlight_priorities
#

HighlightManager
#

highlight_manager.ts view source

Manages highlights for a single element. Tracks ranges per element and only removes its own ranges when clearing.

element_ranges

type Map<string, Array<Range>>

constructor

type new (): HighlightManager

highlight_from_syntax_tokens

Highlight from syntax styler token stream.

type (element: Element, tokens: SyntaxTokenStream): void

element
type Element
tokens
returns void

clear_element_ranges

Clear only this element's ranges from highlights.

type (): void

returns void

destroy

type (): void

returns void

HighlightMode
#

HighlightTokenName
#

HookAfterTokenizeCallback
#

HookAfterTokenizeCallbackContext
#

HookBeforeTokenizeCallback
#

HookBeforeTokenizeCallbackContext
#

HookWrapCallback
#

HookWrapCallbackContext
#

syntax_styler.ts view source

HookWrapCallbackContext

type

type string

content

type string

tag

type string

classes

type Array<string>

attributes

type Record<string, string>

lang

type string

sample_langs
#

SampleLang
#

supports_css_highlight_api
#

syntax_styler_global
#

SyntaxGrammar
#

syntax_styler.ts view source

SyntaxGrammar

A grammar after normalization. All values are arrays of normalized tokens with consistent shapes.

SyntaxGrammarRaw
#

SyntaxGrammarToken
#

syntax_styler.ts view source

SyntaxGrammarToken

Grammar token with all properties required. This is the normalized representation used at runtime.

pattern

type RegExp

lookbehind

type boolean

greedy

type boolean

alias

type Array<string>

inside

type SyntaxGrammar | null

SyntaxGrammarTokenRaw
#

syntax_styler.ts view source

SyntaxGrammarTokenRaw

The expansion of a simple RegExp literal to support additional properties.

The inside grammar will be used to tokenize the text value of each token of this kind.

This can be used to make nested and even recursive language definitions.

Note: This can cause infinite recursion. Be careful when you embed different languages, or even the same language, into one another.

Note: Grammar authors can use optional properties, but they will be normalized to required properties at registration time for optimal performance.

pattern

The regular expression of the token.

type RegExp

lookbehind

If true, then the first capturing group of pattern will (effectively) behave as a lookbehind group meaning that the captured text will not be part of the matched text of the new token.

type boolean

greedy

Whether the token is greedy.

type boolean

alias

An optional alias or list of aliases.

type string | Array<string>

inside

The nested grammar of this token.

type SyntaxGrammarRaw | null
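The lookbehind property is easiest to see on a concrete match. The following sketch shows the effect described above using a plain RegExp: the first capturing group must match, but its text is dropped from the resulting token. `match_with_lookbehind` is a hypothetical helper written for this illustration, not a function of the library.

```typescript
// Matches a function name only when it follows the `function` keyword.
// Group 1 ("function " plus whitespace) plays the role of the lookbehind.
const function_name_pattern = /(\bfunction\s+)\w+/;

const match_with_lookbehind = (
  pattern: RegExp,
  text: string,
): {index: number; matched: string} | null => {
  const m = pattern.exec(text);
  if (!m) return null;
  const skipped = m[1]?.length ?? 0; // length of the first capturing group
  // Shift the start past group 1 and drop its text from the match.
  return {index: m.index + skipped, matched: m[0].slice(skipped)};
};

const lookbehind_result = match_with_lookbehind(
  function_name_pattern,
  'export function greet() {}',
);
// lookbehind_result is {index: 16, matched: 'greet'}
```

Without the lookbehind handling, the token would cover "function greet" starting at index 7, so the keyword could no longer be styled separately.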

SyntaxGrammarValueRaw
#

SyntaxStyler
#

syntax_styler.ts view source

Based on Prism (https://github.com/PrismJS/prism) by Lea Verou (https://lea.verou.me/)

MIT license

see also

  • LICENSE

langs

type Record<string, SyntaxGrammar | undefined>

add_lang

type (id: string, grammar: SyntaxGrammarRaw, aliases?: string[] | undefined): void

id
type string
grammar
aliases?
type string[] | undefined
optional
returns void

add_extended_lang

type (base_id: string, extension_id: string, extension: SyntaxGrammarRaw, aliases?: string[] | undefined): SyntaxGrammar

base_id
type string
extension_id
type string
extension
aliases?
type string[] | undefined
optional

get_lang

type (id: string): SyntaxGrammar

id
type string

stylize

Generates HTML with syntax highlighting from source code.

Process:
1. Runs before_tokenize hook
2. Tokenizes code using the provided or looked-up grammar
3. Runs after_tokenize hook
4. Runs wrap hook on each token
5. Converts tokens to HTML with CSS classes

Parameter relationship:
- lang is always required, for hook context and identification
- grammar is optional; when undefined, it is looked up automatically via this.get_lang(lang)
- When both are provided, grammar is used for tokenization and lang for metadata

Use cases:
- Standard usage: stylize(code, 'ts') uses the registered TypeScript grammar
- Custom grammar: stylize(code, 'ts', customGrammar) uses the custom grammar but keeps the 'ts' label
- Extended grammar: stylize(code, 'custom', this.extend_grammar('ts', extension)) creates a new language variant

type (text: string, lang: string, grammar?: SyntaxGrammar | undefined): string

text

- The source code to syntax highlight.

type string
lang

- Language identifier (e.g., 'ts', 'css', 'html'). Used for:
  - Grammar lookup when grammar is undefined
  - Hook context (lang field passed to hooks)
  - Language identification in output

type string
grammar

- Optional custom grammar object. When undefined, automatically looks up the grammar via this.get_lang(lang). Provide this to use a custom or modified grammar instead of the registered one.

type SyntaxGrammar | undefined
default this.get_lang(lang)
returns string

HTML string with syntax highlighting using CSS classes (.token_*)
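The five-step process can be pictured as a small pipeline. The sketch below is a simplified model, not the library's code: `sketch_stylize`, `PipelinePiece`, and `PipelineHooks` are hypothetical, the hook context fields (`code`, `tokens`) are assumed because the real context types are not shown on this page, the `token_` class prefix is inferred from the `.token_*` note above, and HTML escaping is left out.

```typescript
type PipelinePiece = string | {type: string; content: string};

interface PipelineHooks {
  before_tokenize: Array<(ctx: {code: string; lang: string}) => void>;
  after_tokenize: Array<(ctx: {tokens: Array<PipelinePiece>; lang: string}) => void>;
  wrap: Array<(ctx: {type: string; content: string; classes: Array<string>}) => void>;
}

const sketch_stylize = (
  text: string,
  lang: string,
  tokenize: (code: string) => Array<PipelinePiece>,
  hooks: PipelineHooks,
): string => {
  const before_ctx = {code: text, lang};
  for (const hook of hooks.before_tokenize) hook(before_ctx); // step 1
  const after_ctx = {tokens: tokenize(before_ctx.code), lang}; // step 2
  for (const hook of hooks.after_tokenize) hook(after_ctx); // step 3
  return after_ctx.tokens
    .map((piece) => {
      if (typeof piece === 'string') return piece;
      const wrap_ctx = {
        type: piece.type,
        content: piece.content,
        classes: [`token_${piece.type}`],
      };
      for (const hook of hooks.wrap) hook(wrap_ctx); // step 4
      return `<span class="${wrap_ctx.classes.join(' ')}">${wrap_ctx.content}</span>`; // step 5
    })
    .join('');
};

// Usage with a stub tokenizer and hooks that only record the call order.
const call_order: Array<string> = [];
const pipeline_html = sketch_stylize(
  'let x',
  'ts',
  (code) => [{type: 'keyword', content: code.slice(0, 3)}, code.slice(3)],
  {
    before_tokenize: [() => { call_order.push('before_tokenize'); }],
    after_tokenize: [() => { call_order.push('after_tokenize'); }],
    wrap: [() => { call_order.push('wrap'); }],
  },
);
// pipeline_html is '<span class="token_keyword">let</span> x'
```

Because the before_tokenize hook receives a mutable context, a hook can rewrite the source before tokenization, and a wrap hook can adjust classes per token without touching the tokenizer.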

grammar_insert_before

Inserts tokens before another token in a language definition or any other grammar.

Usage

This helper method makes it easy to modify existing languages. For example, the CSS language definition not only defines CSS styling for CSS documents, but also needs to define styling for CSS embedded in HTML through <style> elements. To do this, it needs to modify syntax_styler.get_lang('markup') and add the appropriate tokens. However, syntax_styler.get_lang('markup') is a regular JS object literal, so if you do this:

    syntax_styler.get_lang('markup').style = {
        // token
    };

then the style token will be added (and processed) at the end. grammar_insert_before allows you to insert tokens before existing tokens. For the CSS example above, you would use it like this:

    syntax_styler.grammar_insert_before('markup', 'cdata', {
        'style': {
            // token
        }
    });

Special cases

If the grammars of inside and insert have tokens with the same name, the tokens in inside's grammar will be ignored.

This behavior can be used to insert tokens after the before key, by re-inserting that key first:

    syntax_styler.grammar_insert_before('markup', 'comment', {
        'comment': syntax_styler.get_lang('markup').comment,
        // tokens after 'comment'
    });

Limitations

The main problem grammar_insert_before has to solve is iteration order. Since ES2015, the iteration order for object properties is guaranteed to be the insertion order (except for integer keys), but some browsers behave differently when keys are deleted and re-inserted. Inserting at an arbitrary position in place would require temporarily deleting properties, so grammar_insert_before cannot be implemented that way.

To solve this problem, grammar_insert_before doesn't actually insert the given tokens into the target object. Instead, it creates a new object and replaces all references to the target object with the new one. This can be done without temporarily deleting properties, so the iteration order is well-defined.

However, only references that can be reached from syntax_styler.langs or insert will be replaced. I.e. if you hold the target object in a variable, then the value of the variable will not change.

    var oldMarkup = syntax_styler.get_lang('markup');
    var newMarkup = syntax_styler.grammar_insert_before('markup', 'comment', { ... });

    assert(oldMarkup !== syntax_styler.get_lang('markup'));
    assert(newMarkup === syntax_styler.get_lang('markup'));

type (inside: string, before: string, insert: SyntaxGrammarRaw, root?: Record<string, any>): SyntaxGrammar

inside

- The property of root (e.g. a language id in syntax_styler.langs) that contains the object to be modified.

type string
before

- The key to insert before.

type string
insert

- An object containing the key-value pairs to be inserted.

root

- The object containing inside, i.e. the object that contains the object to be modified.

Defaults to syntax_styler.langs.

type Record<string, any>
default this.langs

the new grammar object
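The "build a new object instead of mutating" approach described under Limitations can be sketched in a few lines. This is a simplified model under stated assumptions: `sketch_insert_before` and `SketchGrammarMap` are hypothetical, grammars are reduced to flat key-to-RegExp maps, and the step that replaces references inside the language registry is omitted.

```typescript
type SketchGrammarMap = Record<string, RegExp>;

// Builds a new object whose keys follow `target`'s order, with the entries of
// `insert` placed immediately before `before`. `target` itself is not modified.
const sketch_insert_before = (
  target: SketchGrammarMap,
  before: string,
  insert: SketchGrammarMap,
): SketchGrammarMap => {
  const result: SketchGrammarMap = {};
  for (const [key, value] of Object.entries(target)) {
    if (key === before) {
      for (const [insert_key, insert_value] of Object.entries(insert)) {
        result[insert_key] = insert_value;
      }
    }
    // Keys that also appear in `insert` are skipped, so the inserted version wins.
    if (!(key in insert)) result[key] = value;
  }
  return result;
};

const old_markup: SketchGrammarMap = {
  comment: /<!--[\s\S]*?-->/,
  cdata: /<!\[CDATA\[[\s\S]*?\]\]>/,
  tag: /<\/?\w+>/,
};
const new_markup = sketch_insert_before(old_markup, 'cdata', {
  style: /<style>[\s\S]*?<\/style>/,
});
// Object.keys(new_markup) is ['comment', 'style', 'cdata', 'tag']
// Object.keys(old_markup) is still ['comment', 'cdata', 'tag']
```

Since `old_markup` is untouched, any variable still holding it keeps seeing the old grammar, which is the caveat spelled out in the Limitations text.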

stringify_token

Converts the given token or token stream to an HTML representation.

Runs the wrap hook on each SyntaxToken.

type (o: string | SyntaxTokenStream | SyntaxToken, lang: string): string

o

- The token or token stream to be converted.

type string | SyntaxTokenStream | SyntaxToken
lang

- The name of current language.

type string
returns string

The HTML representation of the token or token stream.
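Because a token's content may itself be a token stream, the conversion is naturally recursive. The sketch below illustrates that recursion only: `sketch_stringify`, `SketchToken`, and `SketchStream` are hypothetical, the wrap hook is left out, and both the `token_` class prefix for aliases and the point at which text gets HTML-escaped are assumptions rather than documented behavior.

```typescript
interface SketchToken {
  type: string;
  content: string | SketchStream;
  alias: Array<string>;
}
type SketchStream = Array<string | SketchToken>;

// Escape the characters that would otherwise be parsed as HTML.
const escape_html = (s: string): string =>
  s.replace(/&/g, '&amp;').replace(/</g, '&lt;');

const sketch_stringify = (o: string | SketchStream | SketchToken): string => {
  if (typeof o === 'string') return escape_html(o);
  if (Array.isArray(o)) return o.map((piece) => sketch_stringify(piece)).join('');
  // A token: one class for its type, one per alias, then recurse into content.
  const classes = [`token_${o.type}`, ...o.alias.map((a) => `token_${a}`)];
  return `<span class="${classes.join(' ')}">${sketch_stringify(o.content)}</span>`;
};

const nested_stream: SketchStream = [
  {
    type: 'tag',
    alias: ['markup'],
    content: [{type: 'punctuation', alias: [], content: '<'}, 'div'],
  },
  ' a&b',
];
const nested_html = sketch_stringify(nested_stream);
// nested_html is
// '<span class="token_tag token_markup"><span class="token_punctuation">&lt;</span>div</span> a&amp;b'
```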

extend_grammar

Creates a deep copy of the language with the given id and appends the given tokens.

If a token in extension also appears in the copied language, then the existing token in the copied language will be overwritten at its original position.

Best practices

Since the position of overwriting tokens (tokens in extension that overwrite tokens in the copied language) doesn't matter, they can technically be in any order. However, this can be confusing to others who are trying to understand the language definition because, normally, the order of tokens matters in grammars.

Therefore, it is encouraged to order overwriting tokens according to the positions of the overwritten tokens. Furthermore, all non-overwriting tokens should be placed after the overwriting ones.

type (base_id: string, extension: SyntaxGrammarRaw): SyntaxGrammar

base_id

- The id of the language to extend. This has to be a key in syntax_styler.langs.

type string
extension

- The new tokens to append.

the new grammar
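The positional rule (overwritten tokens keep their place, new tokens go to the end) is the same one JavaScript object spread follows, which makes for a compact illustration. This is a sketch only: `sketch_extend` and `SketchLang` are hypothetical, grammars are flattened to key-to-RegExp maps, and the copy here is shallow whereas the method above is documented as making a deep copy.

```typescript
type SketchLang = Record<string, RegExp>;

// Spreading keeps an overwritten key at its original position and appends
// keys that are new. Shallow copy only, unlike the documented deep copy.
const sketch_extend = (base: SketchLang, extension: SketchLang): SketchLang => ({
  ...base,
  ...extension,
});

const base_lang: SketchLang = {
  comment: /\/\/.*/,
  keyword: /\b(?:if|else)\b/,
  number: /\d+/,
};
const extended_lang = sketch_extend(base_lang, {
  keyword: /\b(?:if|else|match)\b/, // overwrites in place
  decorator: /@\w+/, // appended after the existing tokens
});
// Object.keys(extended_lang) is ['comment', 'keyword', 'number', 'decorator']
```

Following the best practice above, the extension lists the overwriting `keyword` token first and the new `decorator` token last, so the written order matches the resulting order.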

normalize_pattern

Normalize a single pattern to have consistent shape. This ensures all patterns have the same object shape for V8 optimization.

type (pattern: RegExp | SyntaxGrammarTokenRaw, visited: Set<number>): SyntaxGrammarToken

private
pattern
type RegExp | SyntaxGrammarTokenRaw
visited
type Set<number>

normalize_grammar

Normalize a grammar to have consistent object shapes. This performs several optimizations:
1. Merges rest property into main grammar
2. Ensures all pattern values are arrays
3. Normalizes all pattern objects to have consistent shapes
4. Adds global flag to greedy patterns

This is called once at registration time to avoid runtime overhead.

type (grammar: SyntaxGrammarRaw, visited: Set<number>): void

private
grammar
visited

- Set of grammar object IDs already normalized (for circular references)

type Set<number>
returns void
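Steps 2 to 4 of the normalization can be sketched as follows. This is a simplified illustration, not the private methods themselves: all names prefixed with `Sketch` or `sketch_` are hypothetical, the functions return new objects instead of normalizing in place, and the rest property, nested inside grammars, and the visited set for circular references are left out.

```typescript
interface SketchTokenRaw {
  pattern: RegExp;
  lookbehind?: boolean;
  greedy?: boolean;
  alias?: string | Array<string>;
}

interface SketchTokenNormalized {
  pattern: RegExp;
  lookbehind: boolean;
  greedy: boolean;
  alias: Array<string>;
  inside: null; // nested grammars are omitted from this sketch
}

type SketchValueRaw = RegExp | SketchTokenRaw | Array<RegExp | SketchTokenRaw>;

// Step 3: give every token the same set of properties.
const sketch_normalize_pattern = (raw: RegExp | SketchTokenRaw): SketchTokenNormalized => {
  const token: SketchTokenRaw = raw instanceof RegExp ? {pattern: raw} : raw;
  const greedy = token.greedy ?? false;
  let pattern = token.pattern;
  // Step 4: greedy patterns get the global flag.
  if (greedy && !pattern.global) pattern = new RegExp(pattern.source, pattern.flags + 'g');
  const alias: Array<string> =
    token.alias === undefined ? [] : Array.isArray(token.alias) ? token.alias : [token.alias];
  return {pattern, lookbehind: token.lookbehind ?? false, greedy, alias, inside: null};
};

// Step 2: every grammar value becomes an array of normalized tokens.
const sketch_normalize_grammar = (
  grammar: Record<string, SketchValueRaw>,
): Record<string, Array<SketchTokenNormalized>> => {
  const result: Record<string, Array<SketchTokenNormalized>> = {};
  for (const [key, value] of Object.entries(grammar)) {
    const list = Array.isArray(value) ? value : [value];
    result[key] = list.map((item) => sketch_normalize_pattern(item));
  }
  return result;
};

const normalized_sample = sketch_normalize_grammar({
  string: {pattern: /"[^"]*"/, greedy: true, alias: 'literal'},
  number: /\d+/,
});
```

After this pass the tokenizer never has to branch on "is this a RegExp or an object" or "is alias a string or an array", which is the consistent-shape benefit the description refers to.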

plugins

type Record<string, any>

hooks_before_tokenize

type Array<HookBeforeTokenizeCallback>

hooks_after_tokenize

type Array<HookAfterTokenizeCallback>

hooks_wrap

type Array<HookWrapCallback>

add_hook_before_tokenize

type (cb: HookBeforeTokenizeCallback): void

cb
returns void

add_hook_after_tokenize

type (cb: HookAfterTokenizeCallback): void

cb
returns void

add_hook_wrap

type (cb: HookWrapCallback): void

cb
returns void

run_hook_before_tokenize

type (ctx: HookBeforeTokenizeCallbackContext): void

ctx
returns void

run_hook_after_tokenize

type (ctx: HookAfterTokenizeCallbackContext): void

ctx
returns void

run_hook_wrap

type (ctx: HookWrapCallbackContext): void

ctx
returns void

SyntaxToken
#

syntax_token.ts view source

type

The type of the token.

This is usually the key of a pattern in a Grammar.

type string

content

The strings or tokens contained by this token.

This will be a token stream if the pattern matched also defined an inside grammar.

type string | SyntaxTokenStream

alias

The alias(es) of the token. Always an array, even if empty or single value.

type Array<string>

length

type number

constructor

type new (type: string, content: string | SyntaxTokenStream, alias: string | string[] | undefined, matched_str?: string): SyntaxToken

type
type string
content
type string | SyntaxTokenStream
alias
type string | string[] | undefined
matched_str
type string
default ''

SyntaxTokenStream
#

syntax_token.ts view source

SyntaxTokenStream

A token stream is an array of strings and SyntaxToken objects.

Syntax token streams have to fulfill a few properties that are assumed by most functions (mostly internal ones) that process them.

1. No adjacent strings.
2. No empty strings.

The only exception here is the token stream that only contains the empty string and nothing else.
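The two properties and their single exception can be expressed as a small check. `is_valid_token_stream` and `StreamPiece` are hypothetical names used only for this sketch, and tokens are reduced to objects with a type field.

```typescript
type StreamPiece = string | {type: string};

const is_valid_token_stream = (stream: Array<StreamPiece>): boolean => {
  // The one permitted exception: a stream holding only the empty string.
  if (stream.length === 1 && stream[0] === '') return true;
  for (let i = 0; i < stream.length; i++) {
    const piece = stream[i];
    if (typeof piece !== 'string') continue;
    if (piece === '') return false; // property 2: no empty strings
    if (i > 0 && typeof stream[i - 1] === 'string') return false; // property 1: no adjacent strings
  }
  return true;
};

const valid_stream = is_valid_token_stream(['let ', {type: 'keyword'}, ';']); // true
const adjacent_strings = is_valid_token_stream(['let ', 'x']); // false
const only_empty_string = is_valid_token_stream(['']); // true
```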

tokenize_syntax
#

tokenize_syntax.ts view source

(text: string, grammar: SyntaxGrammar): SyntaxTokenStream

Accepts a string of text as input and the language definitions to use, and returns an array with the tokenized code.

When the language definition includes nested tokens, the function is called recursively on each of these tokens.

This method could be useful in other contexts as well, as a very crude parser.

text

- a string with the code to be styled

type string

grammar

- an object containing the tokens to use

Usually a language definition like syntax_styler.get_lang('markup').

returns

SyntaxTokenStream

an array of strings and tokens, a token stream
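To make the shape of the returned token stream concrete, here is a deliberately crude tokenizer in the same spirit. It is not the library's algorithm: `sketch_tokenize`, `CrudeToken`, and `CrudeStream` are hypothetical, grammars are flat key-to-RegExp maps, and lookbehind, greedy matching, aliases, and nested inside grammars are all ignored.

```typescript
type CrudeToken = {type: string; content: string};
type CrudeStream = Array<string | CrudeToken>;

// Grammar keys are tried in insertion order, and each pattern only splits the
// pieces of the stream that are still plain strings.
const sketch_tokenize = (text: string, grammar: Record<string, RegExp>): CrudeStream => {
  if (text === '') return ['']; // the one stream allowed to hold an empty string
  let stream: CrudeStream = [text];
  for (const [type, pattern] of Object.entries(grammar)) {
    const global_pattern = new RegExp(pattern.source, pattern.flags.replace('g', '') + 'g');
    const next: CrudeStream = [];
    for (const piece of stream) {
      if (typeof piece !== 'string') {
        next.push(piece); // already a token, leave it alone
        continue;
      }
      let last = 0;
      global_pattern.lastIndex = 0;
      let m: RegExpExecArray | null;
      while ((m = global_pattern.exec(piece)) !== null) {
        if (m[0] === '') {
          global_pattern.lastIndex++; // skip empty matches to avoid looping forever
          continue;
        }
        if (m.index > last) next.push(piece.slice(last, m.index));
        next.push({type, content: m[0]});
        last = m.index + m[0].length;
      }
      if (last < piece.length) next.push(piece.slice(last));
    }
    stream = next;
  }
  return stream;
};

const crude_tokens = sketch_tokenize('let x = 42;', {
  keyword: /\blet\b/,
  number: /\d+/,
});
// crude_tokens is
// [{type: 'keyword', content: 'let'}, ' x = ', {type: 'number', content: '42'}, ';']
```

Earlier grammar keys take priority here because text they claim is no longer a plain string when later patterns run, which is why the order of tokens in a grammar matters.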

examples

Example 1