Plugins API¶

If you wish to support a new parser generator, you’ll need to implement ModelProvider and call register_provider in your extension’s setup function.

Basic interfaces¶

sphinx_syntax.register_provider(provider: ModelProvider)¶

Register a new model provider.

Extensions should call this function during their setup.

class sphinx_syntax.ModelProvider¶

Base interface for extracting data from grammar source files.

supported_extensions: set[str]¶: Set of file extensions that correspond to this provider’s syntax, including leading periods.

can_handle(path: Path) → bool¶

Check whether this provider can parse the given grammar file.

By default, checks file extension against ModelProvider.supported_extensions.
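The default check can be pictured roughly as follows. This is a sketch with a stand-in class rather than the library's code; the extension values are illustrative, and an exact, case-sensitive comparison is an assumption.

```python
from pathlib import Path


class ProviderSketch:
    """Stand-in for ModelProvider; only the extension check is modelled."""

    # Illustrative values; each entry includes the leading period.
    supported_extensions: set[str] = {".g4", ".y"}

    def can_handle(self, path: Path) -> bool:
        # Path.suffix keeps the leading period, matching the documented format.
        return path.suffix in self.supported_extensions


provider = ProviderSketch()
handles_g4 = provider.can_handle(Path("grammars/Expr.g4"))
handles_txt = provider.can_handle(Path("notes.txt"))
```

A provider whose files cannot be recognised by extension alone would override can_handle instead.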

abstractmethod from_file(path: Path, options: LoadingOptions) → Model¶

Load model from file.

If the model is not found, or errors occur while loading it, this method should log the errors and return an empty (or partially loaded) model.

from_name(base_path: Path, name: str, options: LoadingOptions) → Model | None¶

Load model by name.

Used to load the root rule when it is located in a separate grammar, for example when documenting a lexer while limiting lexemes to those used in a parser.

By default, appends each extension from supported_extensions to name and loads the resulting file if it exists.
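The default name resolution might look something like the sketch below. It assumes that base_path is a directory, and it stops at finding the file, whereas the real method goes on to load a Model. The `find_by_name` helper and the file names are hypothetical.

```python
import tempfile
from pathlib import Path
from typing import Optional


def find_by_name(base_path: Path, name: str, extensions: set[str]) -> Optional[Path]:
    """Return the first existing ``base_path / (name + ext)``, or None."""
    # Sorted only to make the sketch deterministic; the real order is unspecified.
    for ext in sorted(extensions):
        candidate = base_path / (name + ext)
        if candidate.is_file():
            return candidate
    return None


# Usage with a throwaway directory containing a single grammar file.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "Lexer.g4").write_text("lexer grammar Lexer;")
    found = find_by_name(base, "Lexer", {".g4", ".y"})
    found_name = found.name if found else None
    missing = find_by_name(base, "Parser", {".g4", ".y"})
```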

class sphinx_syntax.Model¶

Represents a single parsed grammar.

abstractmethod get_provider() → ModelProvider¶: Return provider that loaded this model.

abstractmethod get_name() → str¶: Get grammar name.

abstractmethod get_path() → Path¶: Get path for the file that this model was loaded from.

abstractmethod get_model_docs() → list[tuple[int, str]] | None¶

Get documentation that appears on top of the model.

The returned list contains one item per documentation comment.

The first element of this item is a line number at which the comment started, the second element is the comment itself.
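To make the shape of this value concrete, here is a small sketch that joins such a list into one block of text. The sample comments and the `join_docs` helper are made up for illustration, and sorting by line number is an assumption about what a consumer would want.

```python
from typing import Optional

# Illustrative return value: two comments, starting at lines 1 and 4.
model_docs: Optional[list[tuple[int, str]]] = [
    (1, "Grammar for arithmetic expressions."),
    (4, "Whitespace is skipped by the lexer."),
]


def join_docs(docs: Optional[list[tuple[int, str]]]) -> str:
    """Concatenate comments in line order; None or an empty list gives ''."""
    if not docs:
        return ""
    return "\n\n".join(text for _line, text in sorted(docs))


joined = join_docs(model_docs)
```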

abstractmethod lookup_local(name: str) → RuleBase | None¶

Look up the symbol with the given name.

Imported models are not checked.

lookup(name: str) → RuleBase | None¶

Look up the symbol with the given name.

Checks symbols in the model first, then checks imported models. To look up literal tokens, pass the contents of the literal, e.g. model.lookup("'literal'").

Return None if symbol cannot be found.

If there are duplicate symbols, it is unspecified which one is returned.
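The lookup order described above can be sketched with a stand-in model whose symbols are plain strings (the real method returns RuleBase objects). `ModelSketch` and its fields are hypothetical. The `_seen` set is there because imports may be cyclic.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class ModelSketch:
    """Stand-in for Model: a symbol table plus imported models."""

    name: str
    symbols: dict[str, str] = field(default_factory=dict)
    imports: list["ModelSketch"] = field(default_factory=list)

    def lookup_local(self, name: str) -> Optional[str]:
        # Only this model's own symbols are checked.
        return self.symbols.get(name)

    def lookup(self, name: str, _seen: Optional[set[int]] = None) -> Optional[str]:
        seen = _seen if _seen is not None else set()
        if id(self) in seen:  # already visited on this search
            return None
        seen.add(id(self))
        found = self.lookup_local(name)
        if found is not None:
            return found
        for imported in self.imports:
            found = imported.lookup(name, seen)
            if found is not None:
                return found
        return None


lexer = ModelSketch("Lexer", {"ID": "lexer rule ID", "'+'": "literal token '+'"})
parser = ModelSketch("Parser", {"expr": "parser rule expr"}, [lexer])
lexer.imports.append(parser)  # a cycle, to show that the search terminates
```

Note that the literal token is stored and looked up with its quotes, as in the `model.lookup("'literal'")` example.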

abstractmethod get_imports() → Iterable[Model]¶

Get all imported models.

No order of iteration is specified.

Note: cyclic imports are allowed in the model.

iter_import_tree() → Iterable[Model]¶

Iterate over this model and all imported models.

No order of iteration is specified.
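Since cyclic imports are allowed, any traversal of the import graph has to remember which models it has already visited. The sketch below shows one way to do that with a stand-in `ModelSketch` class; it is not the library's implementation.

```python
from dataclasses import dataclass, field
from typing import Iterator


@dataclass(eq=False)  # eq=False keeps identity-based hashing, so models fit in a set
class ModelSketch:
    name: str
    imports: list["ModelSketch"] = field(default_factory=list)


def iter_import_tree(root: ModelSketch) -> Iterator[ModelSketch]:
    """Yield root and every model reachable through imports, each exactly once."""
    seen = {root}
    stack = [root]
    while stack:
        model = stack.pop()
        yield model
        for imported in model.imports:
            if imported not in seen:
                seen.add(imported)
                stack.append(imported)


a = ModelSketch("a")
b = ModelSketch("b")
c = ModelSketch("c")
a.imports.append(b)
b.imports.extend([a, c])  # b imports a back, forming a cycle

# Sorted because, as in the real API, the iteration order is not part of the contract.
names = sorted(m.name for m in iter_import_tree(a))
```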

abstractmethod get_terminals() → Iterable[LexerRule]¶

Get all terminals (including fragments) declared in this model.

Terminals declared in imported models are not included.

No order of iteration is specified.

abstractmethod get_non_terminals() → Iterable[ParserRule]¶

Get all non-terminals (parser rules) declared in this model.

Non-terminals declared in imported models are not included.

No order of iteration is specified.

get_all_rules() → Iterable[RuleBase]¶

Get all rules, both terminals and non-terminals.

No order of iteration is specified.
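Because no iteration order is guaranteed, a generator that needs stable output may want to sort rules itself. One option, sketched here with stand-in classes that mirror the documented Position fields, is to order by file and line. The class and function names are hypothetical.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class PositionSketch:
    file: Path
    line: int


@dataclass(frozen=True)
class RuleSketch:
    name: str
    position: PositionSketch


def in_source_order(rules):
    """Sort rules by declaring file, then by line within that file."""
    return sorted(rules, key=lambda r: (str(r.position.file), r.position.line))


rules = [
    RuleSketch("b", PositionSketch(Path("/g/A.g4"), 10)),
    RuleSketch("c", PositionSketch(Path("/g/B.g4"), 1)),
    RuleSketch("a", PositionSketch(Path("/g/A.g4"), 3)),
]
ordered_names = [r.name for r in in_source_order(rules)]
```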

class sphinx_syntax.ModelImpl(provider: ModelProvider, path: Path, name: str, *, docs: list[tuple[int, str]] | None, imports: Iterable[Model], terminals: Iterable[LexerRule], non_terminals: Iterable[ParserRule])¶: Default model implementation, simply stores model data.

class sphinx_syntax.LoadingOptions(use_c_char_literals: bool = True)¶

Additional options for loading a grammar file.

use_c_char_literals: bool = True¶

Bison-specific setting that indicates whether the target language uses C-like char literals or single-quoted strings.

This option affects parsing of inline code blocks within a Bison file.

Production rule descriptions¶

class sphinx_syntax.RuleBase(name: str, display_name: str | None, model: Model, position: Position, content: RuleContent | None, is_nodoc: bool, is_no_diagram: bool, css_class: str | None, is_inline: bool, keep_diagram_recursive: bool, importance: int, documentation: list[tuple[int, str]] | None, section: Section | None)¶

Base class for parser and lexer rules.

Note that actual model implementations use LexerRule and ParserRule instead of this base.

name: str¶: Name of this rule.

display_name: str | None¶: Display name from doc:name command.

model: Model¶: Reference to the model in which this rule was declared.

position: Position¶: A position at which this rule is declared.

content: RuleContent | None¶

Body of the token or rule definition.

May be omitted for implicitly declared tokens or tokens that were declared in the tokens section of a lexer.

is_nodoc: bool¶

Indicates that the doc:nodoc flag is set for this rule.

If true, generators should not output any content for this rule.

is_no_diagram: bool¶

Indicates that the doc:no_diagram flag is set.

If true, generators should not produce syntax diagram for this rule.

css_class: str | None¶

Custom CSS class set via the doc:css_class command.

All diagram nodes referencing this rule will have this css class added to them.

is_inline: bool¶

Indicates that the doc:inline flag is set for this rule.

If true, generators should not output any content for this rule.

They should also inline the diagram of this rule when rendering diagrams for any other rule that refers to this rule.
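Taken together, the is_nodoc, is_no_diagram and is_inline flags suggest two checks that a generator might make for each rule. The sketch below uses a stand-in class holding only those flags; `should_emit` and `should_draw` are hypothetical helper names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RuleFlags:
    """Stand-in carrying only the flags relevant to output decisions."""

    name: str
    is_nodoc: bool = False
    is_no_diagram: bool = False
    is_inline: bool = False


def should_emit(rule: RuleFlags) -> bool:
    # Neither nodoc rules nor inline rules get an entry of their own.
    return not (rule.is_nodoc or rule.is_inline)


def should_draw(rule: RuleFlags) -> bool:
    # A diagram is produced only for emitted rules without the no_diagram flag.
    return should_emit(rule) and not rule.is_no_diagram


rules = [
    RuleFlags("expr"),
    RuleFlags("helper", is_inline=True),
    RuleFlags("internal", is_nodoc=True),
    RuleFlags("WS", is_no_diagram=True),
]
emitted = [r.name for r in rules if should_emit(r)]
drawn = [r.name for r in rules if should_draw(r)]
```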

keep_diagram_recursive: bool¶

Indicates that the doc:keep-diagram-recursive flag is set for this rule.

If true, diagram renderer will not attempt converting recursive alternatives to cycles.

importance: int¶: Importance of the rule; determines its placement in auto-generated diagrams.

documentation: list[tuple[int, str]] | None¶: Documentation for this rule.

section: Section | None¶: Section this rule belongs to, if any.

class sphinx_syntax.ParserRule(name: str, display_name: str | None, model: Model, position: Position, content: RuleContent | None, is_nodoc: bool, is_no_diagram: bool, css_class: str | None, is_inline: bool, keep_diagram_recursive: bool, importance: int, documentation: list[tuple[int, str]] | None, section: Section | None)¶

class sphinx_syntax.LexerRule(name: str, display_name: str | None, model: Model, position: Position, content: RuleContent | None, is_nodoc: bool, is_no_diagram: bool, css_class: str | None, is_inline: bool, keep_diagram_recursive: bool, importance: int, documentation: list[tuple[int, str]] | None, section: Section | None, is_literal: bool, is_fragment: bool)¶

is_literal: bool¶

Indicates that this token is a literal token.

Literal tokens are tokens with a single fixed-string literal element.

is_fragment: bool¶: Indicates that this rule is a fragment.

class sphinx_syntax.Position(file: Path, line: int)¶

file: Path¶: Absolute path to the file in which this rule is declared.

line: int¶: Line at which this rule is declared.

class sphinx_syntax.Section(docs: list[tuple[int, str]], position: Position)¶

Represents a single section header, i.e. a group of comments that start with a triple slash.

docs: list[tuple[int, str]]¶: List of documentation lines in the section description.

position: Position¶: A position at which this section is declared.

Rule AST¶

class sphinx_syntax.RuleContent¶

Base class for AST nodes that form lexer and parser rules.

Note that all AST nodes are interned, so they can be compared with the is operator instead of ==.
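For readers unfamiliar with interning, the idea is that constructing a node with the same contents twice yields the same object. The sketch below shows one possible mechanism, using a stand-in class and a per-class cache. How the library actually interns its nodes is not stated here.

```python
class LiteralSketch:
    """Stand-in for an interned AST node holding a single string."""

    # Cache of already-created nodes, keyed by their contents.
    _instances: dict[str, "LiteralSketch"] = {}

    def __new__(cls, content: str) -> "LiteralSketch":
        node = cls._instances.get(content)
        if node is None:
            node = super().__new__(cls)
            node.content = content
            cls._instances[content] = node
        return node


a = LiteralSketch("'if'")
b = LiteralSketch("'if'")
c = LiteralSketch("'else'")

same = a is b           # equal contents give the very same object
different = a is not c  # different contents give distinct objects
```

With nodes built this way, an identity check is enough wherever an equality check would otherwise be needed.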

class sphinx_syntax.Reference(model: Model, name: str)¶

Refers to another parser or lexer rule.

model: Model¶: Reference to the model in which the rule is used.

name: str¶: Referenced rule name.

get_reference() → RuleBase | None¶

Look up and return the referenced rule.

Returns None if reference is invalid.

class sphinx_syntax.Doc(value: str)¶

Inline documentation.

value: str¶: Documentation content.

class sphinx_syntax.Wildcard¶: Matches any token.

sphinx_syntax.WILDCARD = Wildcard()¶: Matches any token.

class sphinx_syntax.Negation(child: RuleContent)¶

Matches anything but the child rules.

child: RuleContent¶: Rules that will be negated.

class sphinx_syntax.ZeroPlus(child: RuleContent)¶

Matches the child zero or more times.

child: RuleContent¶: Rule which will be parsed zero or more times.

class sphinx_syntax.OnePlus(child: RuleContent)¶

Matches the child one or more times.

child: RuleContent¶: Rule which will be parsed one or more times.

class sphinx_syntax.Sequence(children: tuple[RuleContent, ...], linebreaks: tuple[LineBreak, ...] | None = None)¶

Matches a sequence of elements.

children: tuple[RuleContent, ...]¶: Child rules that will be parsed in order.

linebreaks: tuple[LineBreak, ...]¶: Describes where it is preferable to wrap the sequence.

sphinx_syntax.EMPTY = Sequence(children=())¶: The empty sequence, i.e. a Sequence with no children.

class sphinx_syntax.Alternative(children: tuple[RuleContent, ...])¶

Matches any one of the children.

children: tuple[RuleContent, ...]¶: Child rules.

class sphinx_syntax.Literal(content: str)¶

A sequence of symbols (e.g. 'kwd').

content: str¶: Formatted content of the literal, with special symbols escaped.

class sphinx_syntax.Range(start: str, end: str)¶

A range of symbols (e.g. a..b).

start: str¶: Range first symbol.

end: str¶: Range last symbol.

class sphinx_syntax.CharSet(content: str)¶

A character set (e.g. [a-zA-Z]).

content: str¶: Character set description, square brackets included.

AST visitors¶

class sphinx_syntax.RuleContentVisitor¶

Generic visitor for rule contents.

visit(r: RuleContent) → T¶

visit_default(r: RuleContent) → T¶

visit_literal(r: Literal) → T¶

visit_range(r: Range) → T¶

visit_charset(r: CharSet) → T¶

visit_reference(r: Reference) → T¶

visit_doc(r: Doc) → T¶

visit_wildcard(r: Wildcard) → T¶

visit_negation(r: Negation) → T¶

visit_zero_plus(r: ZeroPlus) → T¶

visit_one_plus(r: OnePlus) → T¶

visit_sequence(r: Sequence) → T¶

visit_alternative(r: Alternative) → T¶
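The visitor methods above follow the usual pattern: visit dispatches on the node type to the matching visit_* method. The sketch below reproduces that pattern with local stand-in node classes, so that it runs on its own. The dispatch table and the fallback to visit_default for unhandled node types are assumptions, and `ReferenceCollector` is a hypothetical example visitor.

```python
from dataclasses import dataclass


# Local stand-ins for a few AST node types; not the library's classes.
@dataclass(frozen=True)
class Literal:
    content: str


@dataclass(frozen=True)
class Reference:
    name: str


@dataclass(frozen=True)
class ZeroPlus:
    child: object


@dataclass(frozen=True)
class Sequence:
    children: tuple


class VisitorSketch:
    """Dispatches on node type; unhandled nodes go to visit_default."""

    _methods = {
        Literal: "visit_literal",
        Reference: "visit_reference",
        ZeroPlus: "visit_zero_plus",
        Sequence: "visit_sequence",
    }

    def visit(self, r):
        name = self._methods.get(type(r), "visit_default")
        return getattr(self, name, self.visit_default)(r)

    def visit_default(self, r):
        raise NotImplementedError(type(r).__name__)


class ReferenceCollector(VisitorSketch):
    """Collects the names of all rules referenced from a rule body."""

    def visit_default(self, r):
        return set()  # leaf nodes such as literals reference nothing

    def visit_reference(self, r):
        return {r.name}

    def visit_zero_plus(self, r):
        return self.visit(r.child)

    def visit_sequence(self, r):
        names = set()
        for child in r.children:
            names |= self.visit(child)
        return names


# Body of a rule like: expr: term ('+' factor)*
body = Sequence((Reference("term"), ZeroPlus(Sequence((Literal("'+'"), Reference("factor"))))))
referenced = ReferenceCollector().visit(body)
```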

class sphinx_syntax.CachedRuleContentVisitor¶

visit(r: RuleContent) → T¶