API to parse the MeTTa language from text into Atoms.

Classes
struct	tokenizer_t
	Represents a handle to a Tokenizer, capable of recognizing meaningful Token substrings in text.
struct	sexpr_parser_t
	Represents an S-Expression Parser state machine, to parse input text into an Atom.
struct	syntax_node_t
	Represents a component in a syntax tree created by parsing MeTTa code.
struct	token_api_t
	A table of callback functions to implement custom atom parsing.

Typedefs
typedef void(*	c_syntax_node_callback_t) (const struct syntax_node_t *node, void *context)
	Function signature for a callback providing access to a syntax_node_t.

Enumerations
enum	syntax_node_type_t { COMMENT , VARIABLE_TOKEN , STRING_TOKEN , WORD_TOKEN , OPEN_PAREN , CLOSE_PAREN , WHITESPACE , LEFTOVER_TEXT , EXPRESSION_GROUP , ERROR_GROUP }
	The type of language construct represented by a syntax_node_t.

Functions
struct tokenizer_t	tokenizer_new (void)
	Creates a new Tokenizer, without any registered Tokens.
void	tokenizer_free (struct tokenizer_t tokenizer)
	Frees a Tokenizer handle.
void	tokenizer_register_token (struct tokenizer_t *tokenizer, const char *regex, const struct token_api_t *api, void *context)
	Registers a new custom Token in a Tokenizer.
struct tokenizer_t	tokenizer_clone (const struct tokenizer_t *tokenizer)
	Performs a "deep copy" of a Tokenizer.
struct sexpr_parser_t	sexpr_parser_new (const char *text)
	Creates a new S-Expression Parser.
void	sexpr_parser_free (struct sexpr_parser_t parser)
	Frees an S-Expression Parser.
atom_t	sexpr_parser_parse (struct sexpr_parser_t *parser, const struct tokenizer_t *tokenizer)
	Parses the text associated with an sexpr_parser_t, and creates the corresponding Atom.
const char *	sexpr_parser_err_str (const struct sexpr_parser_t *parser)
	Returns the error string associated with the last sexpr_parser_parse call.
struct syntax_node_t	sexpr_parser_parse_to_syntax_tree (struct sexpr_parser_t *parser)
	Parses the text associated with an sexpr_parser_t, and creates a syntax tree.
void	syntax_node_free (struct syntax_node_t node)
	Frees a syntax_node_t.
struct syntax_node_t	syntax_node_clone (const struct syntax_node_t *node)
	Creates a deep copy of a syntax_node_t
void	syntax_node_iterate (const struct syntax_node_t *node, c_syntax_node_callback_t callback, void *context)
	Performs a depth-first iteration of all child syntax nodes within a syntax tree.
enum syntax_node_type_t	syntax_node_type (const struct syntax_node_t *node)
	Returns the type of a syntax_node_t
bool	syntax_node_is_null (const struct syntax_node_t *node)
	Returns true if a syntax node represents the end of the stream.
bool	syntax_node_is_leaf (const struct syntax_node_t *node)
	Returns true if a syntax node is a leaf (has no children) and false otherwise.
void	syntax_node_src_range (const struct syntax_node_t *node, uintptr_t *range_start, uintptr_t *range_end)
	Returns the beginning and end positions in the parsed source of the text represented by the syntax node.

Detailed Description

API to parse the MeTTa language from text into Atoms.

This interface facilitates parsing textual representations of MeTTa into atom representations, and can be extended to parse custom atom types with specialized syntax.

Typedef Documentation

◆ c_syntax_node_callback_t

typedef void(* c_syntax_node_callback_t) (const struct syntax_node_t *node, void *context)

Function signature for a callback providing access to a syntax_node_t

Parameters

[in]	node	The syntax_node_t being provided. This node should not be modified or freed by the callback.
[in]	context	The context state pointer initially passed to the upstream function initiating the callback.

Enumeration Type Documentation

◆ syntax_node_type_t

enum syntax_node_type_t

The type of language construct represented by a syntax_node_t.

Enumerator
COMMENT	A Comment, beginning with a ';' character.
VARIABLE_TOKEN	A variable. A symbol immediately preceded by a '$' sigil.
STRING_TOKEN	A String Literal. All text between non-escaped '"' (double quote) characters.
WORD_TOKEN	Word Token. Any other whitespace-delimited token that isn't a VARIABLE_TOKEN or STRING_TOKEN.
OPEN_PAREN	Open Parenthesis. A non-escaped '(' character indicating the beginning of an expression.
CLOSE_PAREN	Close Parenthesis. A non-escaped ')' character indicating the end of an expression.
WHITESPACE	Whitespace. One or more whitespace characters.
LEFTOVER_TEXT	Leftover Text that remains unparsed after a parse error has occurred.
EXPRESSION_GROUP	A Group of nodes between an OPEN_PAREN and a matching CLOSE_PAREN.
ERROR_GROUP	A Group of nodes that cannot be combined into a coherent atom due to a parse error, even if some of the individual nodes could represent valid atoms.
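
When inspecting a syntax tree it is often handy to print node types or to tell groups apart from tokens. The following is a minimal sketch of that idea. It declares a local mirror of the enumerators listed above (under the made-up name demo_syntax_node_type_t) only so that it compiles without the library header; the helper functions demo_node_type_name and demo_node_type_is_group are hypothetical and not part of this API. In real code, the enum definition from the library header should be used instead.

```c
#include <stdbool.h>

/* Local mirror of the documented enumerators, declared only so this sketch
 * compiles on its own. Real code should rely on the library header; the
 * numeric values here are not guaranteed to match the library's. */
typedef enum {
    COMMENT, VARIABLE_TOKEN, STRING_TOKEN, WORD_TOKEN, OPEN_PAREN,
    CLOSE_PAREN, WHITESPACE, LEFTOVER_TEXT, EXPRESSION_GROUP, ERROR_GROUP
} demo_syntax_node_type_t;

/* Hypothetical helper: returns a printable name for a node type. */
const char *demo_node_type_name(demo_syntax_node_type_t type) {
    switch (type) {
        case COMMENT:          return "COMMENT";
        case VARIABLE_TOKEN:   return "VARIABLE_TOKEN";
        case STRING_TOKEN:     return "STRING_TOKEN";
        case WORD_TOKEN:       return "WORD_TOKEN";
        case OPEN_PAREN:       return "OPEN_PAREN";
        case CLOSE_PAREN:      return "CLOSE_PAREN";
        case WHITESPACE:       return "WHITESPACE";
        case LEFTOVER_TEXT:    return "LEFTOVER_TEXT";
        case EXPRESSION_GROUP: return "EXPRESSION_GROUP";
        case ERROR_GROUP:      return "ERROR_GROUP";
    }
    return "UNKNOWN";
}

/* Hypothetical helper: true for the two types documented as groups of nodes. */
bool demo_node_type_is_group(demo_syntax_node_type_t type) {
    return type == EXPRESSION_GROUP || type == ERROR_GROUP;
}
```

With the real API, the value passed to such helpers would come from syntax_node_type().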

Function Documentation

◆ sexpr_parser_err_str()

const char * sexpr_parser_err_str ( const struct sexpr_parser_t * parser )

Returns the error string associated with the last sexpr_parser_parse call.

Parameters

[in] parser A pointer to the Parser, which is associated with the text to parse

Returns: A pointer to the C-string containing the parse error that occurred, or NULL if no parse error occurred

Warning: The returned pointer should NOT be freed. It must never be accessed after the sexpr_parser_t has been freed, or after any subsequent call to sexpr_parser_parse or sexpr_parser_parse_to_syntax_tree has been made.

◆ sexpr_parser_free()

void sexpr_parser_free ( struct sexpr_parser_t parser )

Frees an S-Expression Parser.

Parameters

[in] parser The sexpr_parser_t handle to free

◆ sexpr_parser_new()

struct sexpr_parser_t sexpr_parser_new ( const char * text )

Creates a new S-Expression Parser.

Parameters

[in] text A C-style string containing the input text to parse

Returns: The new sexpr_parser_t, ready to parse the text

Note: The returned sexpr_parser_t must be freed with sexpr_parser_free() or passed to another function that takes ownership.

Warning: The returned sexpr_parser_t borrows a reference to the text, so the returned sexpr_parser_t must be freed before the text is freed or allowed to go out of scope.

◆ sexpr_parser_parse()

atom_t sexpr_parser_parse	(	struct sexpr_parser_t *	parser,
		const struct tokenizer_t *	tokenizer )

Parses the text associated with an sexpr_parser_t, and creates the corresponding Atom.

Parameters

[in]	parser	A pointer to the Parser, which is associated with the text to parse
[in]	tokenizer	A pointer to the Tokenizer used to interpret atoms within the expression

Returns: The new atom_t, which may be an Expression atom with many child atoms. Returns a none atom if parsing is finished, or an error expression atom if a parse error occurred.

Note: The caller must take ownership responsibility for the returned atom_t, and ultimately free it with atom_free() or pass it to another function that takes ownership responsibility. If this function encounters an error, the error may be accessed with sexpr_parser_err_str().

◆ sexpr_parser_parse_to_syntax_tree()

struct syntax_node_t sexpr_parser_parse_to_syntax_tree ( struct sexpr_parser_t * parser )

Parses the text associated with an sexpr_parser_t, and creates a syntax tree.

Parameters

[in] parser A pointer to the Parser, which is associated with the text to parse

Returns: The new syntax_node_t representing the root of the parsed tree

Note: The caller must take ownership responsibility for the returned syntax_node_t, and ultimately free it with syntax_node_free()

◆ syntax_node_clone()

struct syntax_node_t syntax_node_clone ( const struct syntax_node_t * node )

Creates a deep copy of a syntax_node_t

Parameters

[in] node A pointer to the syntax_node_t

Returns: The syntax_node_t representing the cloned syntax node

Note: The caller must take ownership responsibility for the returned syntax_node_t, and ultimately free it with syntax_node_free()

◆ syntax_node_free()

void syntax_node_free ( struct syntax_node_t node )

Frees a syntax_node_t.

Parameters

[in] node The syntax_node_t to free

◆ syntax_node_is_leaf()

bool syntax_node_is_leaf ( const struct syntax_node_t * node )

Returns true if a syntax node is a leaf (has no children) and false otherwise.

Parameters

[in] node A pointer to the syntax_node_t

Returns: The boolean value indicating if the node is a leaf

◆ syntax_node_is_null()

bool syntax_node_is_null ( const struct syntax_node_t * node )

Returns true if a syntax node represents the end of the stream.

Parameters

[in] node A pointer to the syntax_node_t

Returns: The boolean value indicating if the node is a null node

◆ syntax_node_iterate()

void syntax_node_iterate	(	const struct syntax_node_t *	node,
		c_syntax_node_callback_t	callback,
		void *	context )

Performs a depth-first iteration of all child syntax nodes within a syntax tree.

Parameters

[in]	node	A pointer to the top-level syntax_node_t representing the syntax tree
[in]	callback	A function that will be called for each syntax node visited during the iteration
[in]	context	A pointer to a caller-defined structure to facilitate communication with the callback function
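
The callback-plus-context pattern described by these parameters can be sketched as follows. Because syntax_node_t is opaque, this example uses a hypothetical stand-in node type (demo_node_t) and a stand-in traversal (demo_iterate) so that it is self-contained; none of the demo_* names exist in the library. The visiting order shown (children first, then the node itself) and the inclusion of the top-level node are assumptions, not documented behavior.

```c
#include <stddef.h>

/* Hypothetical stand-in for the opaque syntax_node_t, used only so this
 * sketch is self-contained. */
typedef struct demo_node {
    const struct demo_node *children;
    size_t child_count;
} demo_node_t;

/* Same shape as c_syntax_node_callback_t, but over the stand-in node type. */
typedef void (*demo_node_callback_t)(const demo_node_t *node, void *context);

/* Caller-defined state that travels through the context pointer. */
typedef struct {
    size_t total;
    size_t leaves;
} demo_stats_t;

/* Stand-in for syntax_node_iterate(): visits children, then the node itself.
 * The real function's exact visiting order is an assumption here. */
void demo_iterate(const demo_node_t *node, demo_node_callback_t callback, void *context) {
    for (size_t i = 0; i < node->child_count; i++) {
        demo_iterate(&node->children[i], callback, context);
    }
    callback(node, context);
}

/* The callback recovers its state by casting the context pointer back.
 * It only reads the node; it does not modify or free it. */
void demo_count_cb(const demo_node_t *node, void *context) {
    demo_stats_t *stats = (demo_stats_t *)context;
    stats->total++;
    if (node->child_count == 0) {
        stats->leaves++;
    }
}

/* Builds a tiny tree (one group node with three leaf children) and counts it. */
demo_stats_t demo_example_stats(void) {
    const demo_node_t leaves[3] = { {NULL, 0}, {NULL, 0}, {NULL, 0} };
    const demo_node_t root = { leaves, 3 };
    demo_stats_t stats = { 0, 0 };
    demo_iterate(&root, demo_count_cb, &stats);
    return stats;
}
```

With the real API, the equivalent call would pass a callback taking a const struct syntax_node_t * and a pointer to the caller's own state as the context argument.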

◆ syntax_node_src_range()

void syntax_node_src_range	(	const struct syntax_node_t *	node,
		uintptr_t *	range_start,
		uintptr_t *	range_end )

Returns the beginning and end positions in the parsed source of the text represented by the syntax node.

Parameters

[in]	node	A pointer to the syntax_node_t
[out]	range_start	A pointer to a value, into which the starting offset of the range will be written
[out]	range_end	A pointer to a value, into which the ending offset of the range will be written
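
A typical use of the two output offsets is to recover the source text that a node covers. The sketch below shows one way to do that. The helper copy_src_range is hypothetical and not part of the library, and it assumes that the offsets are byte offsets into the original text and that the end offset is exclusive; neither assumption is confirmed by the description above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copies src[start, end) into buf as a NUL-terminated
 * string. Assumes byte offsets with an exclusive end.
 * Returns the number of bytes copied, or -1 if the range is invalid or the
 * buffer is too small. */
long copy_src_range(const char *src, uintptr_t start, uintptr_t end,
                    char *buf, size_t buf_len) {
    size_t src_len = strlen(src);
    if (start > end || end > src_len) {
        return -1;                      /* range does not fit the source */
    }
    size_t n = (size_t)(end - start);
    if (n >= buf_len) {
        return -1;                      /* no room for the text plus NUL */
    }
    memcpy(buf, src + start, n);
    buf[n] = '\0';
    return (long)n;
}

/* Intended use with the documented signature (not compiled here):
 *
 *   uintptr_t start, end;
 *   syntax_node_src_range(&node, &start, &end);
 *   copy_src_range(text, start, end, buf, sizeof buf);
 *
 * where text is the same string that was given to sexpr_parser_new(). */
```

Returning an error for out-of-range offsets, rather than trusting them, keeps the helper safe if the assumptions about the offsets turn out to be wrong.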

◆ syntax_node_type()

enum syntax_node_type_t syntax_node_type ( const struct syntax_node_t * node )

Returns the type of a syntax_node_t

Parameters

[in] node A pointer to the syntax_node_t

Returns: The syntax_node_type_t representing the type of the syntax node

◆ tokenizer_clone()

struct tokenizer_t tokenizer_clone ( const struct tokenizer_t * tokenizer )

Performs a "deep copy" of a Tokenizer.

Parameters

[in] tokenizer A pointer to the Tokenizer to clone

Returns: The new Tokenizer, containing all registered Tokens belonging to the original Tokenizer

Note: The returned tokenizer_t must be freed with tokenizer_free()

◆ tokenizer_free()

void tokenizer_free ( struct tokenizer_t tokenizer )

Frees a Tokenizer handle.

Parameters

[in] tokenizer The tokenizer_t handle to free

Note: When the last tokenizer_t handle for an underlying Tokenizer has been freed, then the Tokenizer will be deallocated

◆ tokenizer_new()

struct tokenizer_t tokenizer_new ( void )

Creates a new Tokenizer, without any registered Tokens.

Returns: A tokenizer_t handle to access the newly created Tokenizer

Note: The returned tokenizer_t handle must be freed with tokenizer_free()

◆ tokenizer_register_token()

void tokenizer_register_token	(	struct tokenizer_t *	tokenizer,
		const char *	regex,
		const struct token_api_t *	api,
		void *	context )

Registers a new custom Token in a Tokenizer.

Parameters

[in]	tokenizer	A pointer to the Tokenizer in which to register the Token
[in]	regex	A regular expression to match the incoming text, triggering this token to generate a new atom
[in]	api	A table of functions to manage the token
[in]	context	A caller-defined structure to communicate any state necessary to implement the Token parser

Note: Hyperon uses the Rust RegEx engine and syntax, as documented for the Rust regex crate.
