Grammar System

The grammar system defines the search space for symbolic optimization: which operators, variables, and constants are available, and how they can be combined.

Simple Grammar (Untyped)

For standard symbolic regression:

grammar = Grammar(
    binary_operators = [+, -, *, /],
    unary_operators = [sin, cos, exp, log],
    variables = [:x, :y],
    constant_range = (-2.0, 2.0),
)
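To make the idea of a search space concrete, the following is a minimal sketch of how random candidate expressions could be drawn from the operators, variables, and constant range listed above. It uses plain Julia `Expr` objects and does not call the package; the names `random_constant` and `random_expr`, the leaf probability of 0.3, and the depth limit are all assumptions made for illustration.

```julia
using Random

# Hypothetical stand-ins for the fields of the untyped grammar above.
# These are NOT the package's Grammar type.
const BINARY = [:+, :-, :*, :/]
const UNARY = [:sin, :cos, :exp, :log]
const VARIABLES = [:x, :y]
const CONSTANT_RANGE = (-2.0, 2.0)

# Draw a constant uniformly from the configured range.
function random_constant(rng)
    lo, hi = CONSTANT_RANGE
    return lo + (hi - lo) * rand(rng)
end

# Recursively build a random expression with at most `depth` operator levels.
function random_expr(rng, depth::Int)
    if depth <= 0 || rand(rng) < 0.3
        # Leaf: either a variable or a constant.
        return rand(rng, Bool) ? rand(rng, VARIABLES) : random_constant(rng)
    elseif rand(rng, Bool)
        return Expr(:call, rand(rng, UNARY), random_expr(rng, depth - 1))
    else
        return Expr(:call, rand(rng, BINARY),
                    random_expr(rng, depth - 1), random_expr(rng, depth - 1))
    end
end

rng = MersenneTwister(42)
expr = random_expr(rng, 3)
println(expr)
```

Every expression this sampler can produce is built only from the listed operators, variables, and constants, which is what restricting the search space means in practice.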

Typed Grammar

For domains with multiple types (e.g., vector/scalar operations):

grammar = Grammar(
    types = [:Scalar, :Vector],
    variables = [
        :ps => :Vector,
        :n => :Scalar,
    ],
    operators = [
        (mean, [:Vector] => :Scalar),
        (sum, [:Vector] => :Scalar),
        (+, [:Scalar, :Scalar] => :Scalar),
        (*, [:Scalar, :Scalar] => :Scalar),
        (Symbol(".^"), [:Vector, :Scalar] => :Vector),
    ],
    constant_types = [:Scalar],
    output_type = :Scalar,
)
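The signatures in the typed grammar determine which operator applications are well-typed. As a rough sketch of that idea, independent of the package, the table below stores each signature as an `inputs => output` pair and checks argument types against it. The names `SIGNATURES`, `result_type`, and `producing` are hypothetical, and operators are stored as symbols so the sketch needs no extra imports.

```julia
# Hypothetical signature table mirroring the typed grammar above.
const SIGNATURES = Dict(
    :mean => ([:Vector] => :Scalar),
    :sum => ([:Vector] => :Scalar),
    :+ => ([:Scalar, :Scalar] => :Scalar),
    :* => ([:Scalar, :Scalar] => :Scalar),
    Symbol(".^") => ([:Vector, :Scalar] => :Vector),
)

# Return the result type of applying `op` to arguments of the given types,
# or `nothing` when the operator is unknown or the application is ill-typed.
function result_type(op::Symbol, arg_types::Vector{Symbol})
    haskey(SIGNATURES, op) || return nothing
    inputs, output = SIGNATURES[op]
    return inputs == arg_types ? output : nothing
end

# List, in sorted order, the operators whose output type is `T`.
producing(T::Symbol) = sort([op for (op, sig) in SIGNATURES if last(sig) == T])

println(result_type(:mean, [:Vector]))
println(result_type(:+, [:Scalar, :Vector]))
println(producing(:Vector))
```

Here `mean` applied to a `:Vector` is accepted, while `+` applied to a `:Scalar` and a `:Vector` is rejected because no signature matches.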

Querying the Grammar

unary_operators(grammar)                # Unary operators
binary_operators(grammar)               # Binary operators
operators_producing(grammar, :Scalar)   # Operators producing a type
sample_constant(grammar)                # Random constant

valid, msg = check_tree_validity(tree, grammar)  # Validate a tree

Safe Operations

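All standard operators have "safe" versions that return NaN on invalid input instead of throwing (for example, log(-1.0) raises a DomainError in Julia) or returning Inf (as 1.0 / 0.0 does):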

safe_div(1.0, 0.0)    # NaN
safe_log(-1.0)        # NaN
safe_sqrt(-1.0)       # NaN
safe_pow(-2.0, 0.5)   # NaN

Additional safe functions include safe_exp, safe_mean, safe_sum, safe_std, and safe_var, as well as activation functions such as sigmoid, relu, softplus, and clamp01.

The SAFE_IMPLEMENTATIONS dictionary maps standard function symbols to their safe counterparts.
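The pattern behind safe operators and a symbol-to-function lookup table can be sketched in a few lines. This is not the package's implementation: the `sketch_` names, the `SKETCH_SAFE` dictionary, and the `safe_version` helper are hypothetical, and treating `log(0)` as invalid is an assumption of this sketch.

```julia
# Hypothetical safe wrappers. The `sketch_` prefix marks them as illustrations,
# not the package's own safe_div, safe_log, or safe_sqrt.
sketch_safe_div(a, b) = b == 0 ? NaN : a / b
sketch_safe_log(x) = x > 0 ? log(x) : NaN    # assumption: non-positive input is invalid
sketch_safe_sqrt(x) = x >= 0 ? sqrt(x) : NaN

# Map standard function symbols to their safe counterparts.
const SKETCH_SAFE = Dict{Symbol, Function}(
    :/ => sketch_safe_div,
    :log => sketch_safe_log,
    :sqrt => sketch_safe_sqrt,
)

# Look up a safe replacement, falling back to the function of that name in Base.
safe_version(op::Symbol) = get(SKETCH_SAFE, op, getfield(Base, op))

println(safe_version(:log)(-1.0))
println(safe_version(:sqrt)(4.0))
```

Returning NaN lets an evaluation loop score an invalid candidate expression as a failure with a single `isnan` check, rather than wrapping every evaluation in `try`/`catch`.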

Grammar Reference

SymbolicOptimization.Grammar (type)
Grammar

A complete grammar specification for symbolic optimization.

Two Modes

Simple Mode (like SymbolicRegression.jl)

g = Grammar(
    binary_operators = [+, -, *, /],
    unary_operators = [sin, cos, exp],
    variables = [:x, :y],
    constant_range = (-2.0, 2.0),
)

Typed Mode (for complex domains)

g = Grammar(
    types = [:Scalar, :Vector],
    variables = [:ps => :Vector, :n => :Scalar],
    operators = [
        (mean, [:Vector] => :Scalar),
        (+, [:Scalar, :Scalar] => :Scalar),
        (.^, [:Vector, :Scalar] => :Vector),
    ],
    constant_types = [:Scalar],
    output_type = :Scalar,
)
SymbolicOptimization.validate_grammar (function)
validate_grammar(g::Grammar) -> ValidationResult

Validate a grammar for consistency and completeness.

Checks performed:

  • All operator input/output types exist in the grammar
  • All variable types exist in the grammar
  • At least one operator or variable produces each output type
  • No duplicate operator signatures (in typed mode)
  • Constants can be generated for required types
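Several of the checks listed above can be sketched against a toy grammar to show what consistency means here. This is an illustration only: the `NamedTuple` layout and the `toy_validate` function are hypothetical, the sketch returns a vector of messages rather than a `ValidationResult`, and it omits the constant-generation check.

```julia
# Hypothetical toy grammar as a NamedTuple. Field names are illustrative only.
toy = (
    types = [:Scalar, :Vector],
    variables = [:ps => :Vector, :n => :Scalar],
    operators = [
        (:mean, [:Vector] => :Scalar),
        (:+, [:Scalar, :Scalar] => :Scalar),
    ],
    output_type = :Scalar,
)

# Collect human-readable problems instead of stopping at the first one.
function toy_validate(g)
    problems = String[]
    # Every operator input and output type must be declared.
    for (name, sig) in g.operators
        for T in vcat(first(sig), last(sig))
            T in g.types || push!(problems, "operator $name uses unknown type $T")
        end
    end
    # Every variable type must be declared.
    for (name, T) in g.variables
        T in g.types || push!(problems, "variable $name has unknown type $T")
    end
    # The same (name, signature) entry must not be listed twice.
    if length(unique(g.operators)) != length(g.operators)
        push!(problems, "duplicate operator signature")
    end
    # Some operator or variable must produce the output type.
    producible = vcat([last(sig) for (_, sig) in g.operators],
                      [T for (_, T) in g.variables])
    g.output_type in producible || push!(problems, "nothing produces $(g.output_type)")
    return problems
end

println(toy_validate(toy))
```

Collecting all problems before returning, rather than failing on the first, is a design choice of this sketch that lets a user fix a grammar in one pass.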
SymbolicOptimization.infer_type (function)
infer_type(tree::AbstractNode, g::Grammar, var_env::Dict{Symbol, Symbol}) -> Symbol

Infer the type of an expression tree in a typed grammar. Returns :Any for untyped grammars.
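Bottom-up type inference over an expression tree can be sketched as follows. The sketch works on plain Julia `Expr` objects rather than the package's `AbstractNode` trees, and `SIGS`, `VAR_ENV`, and `toy_infer` are hypothetical names. It also assumes that numeric constants are scalars, matching `constant_types = [:Scalar]` in the typed example.

```julia
# Hypothetical signatures and variable environment, mirroring the typed example.
const SIGS = Dict(
    :mean => ([:Vector] => :Scalar),
    :+ => ([:Scalar, :Scalar] => :Scalar),
)
const VAR_ENV = Dict(:ps => :Vector, :n => :Scalar)

# Leaves: numeric constants are scalars; variables are looked up in the environment.
toy_infer(x::Real) = :Scalar
toy_infer(v::Symbol) = VAR_ENV[v]

# Calls: infer the argument types first, then match them against the signature.
function toy_infer(ex::Expr)
    ex.head == :call || error("only call expressions are supported")
    op = ex.args[1]
    arg_types = Symbol[toy_infer(a) for a in ex.args[2:end]]
    inputs, output = SIGS[op]
    inputs == arg_types || error("type mismatch in call to $op: got $arg_types")
    return output
end

println(toy_infer(:(mean(ps) + n)))
```

In this sketch an ill-typed tree such as `mean(n)` raises an error; how the package reports such a tree is not stated in this section.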
