DSL Interface

The Domain-Specific Language (DSL) provides the simplest way to use SymbolicOptimization.

Creating Problems

Use symbolic_problem with keyword arguments:

prob = symbolic_problem(
    X = data_matrix,           # Input data (rows = samples, cols = variables)
    y = targets,               # Target values
    variables = [:x, :y, :z],  # Variable names (auto-generated if omitted)
    binary_operators = [+, -, *, /],
    unary_operators = [sin, cos, exp, log],
    constants = (-2.0, 2.0),   # Range for random constants
    objectives = [:mse, :complexity],  # What to optimize
    mode = :regression,        # :regression or :aggregation
    population = 200,
    generations = 100,
    max_depth = 6,
    max_nodes = 30,
    seed = 42,
    verbose = true
)

Evaluation Modes

  • :regression (default): Standard symbolic regression. Each row of X is a sample.
  • :aggregation: For aggregator discovery. Variables represent forecasters (columns of X), formulas combine their predictions.

result = solve(symbolic_problem(
    X = forecaster_predictions,
    y = ground_truth,
    mode = :aggregation,
    objectives = [:brier, :complexity]
))

Built-in Objectives

  • :mse — Mean squared error
  • :brier — Brier score (mean squared error of probability forecasts against 0/1 outcomes)
  • :complexity — Number of nodes in the expression tree
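
To make the built-ins concrete, here is a minimal sketch of each metric as a standalone Julia function. These are illustrative definitions only, not the package's internal implementations; the Expr-based node count in particular is a stand-in for the package's own tree type.

```julia
# Illustrative definitions of the built-in objectives (not package internals).

# :mse, mean squared error between predictions ŷ and targets y
mse(ŷ, y) = sum((ŷ .- y) .^ 2) / length(y)

# :brier, the Brier score: MSE of probability forecasts against 0/1 outcomes
brier(p, y) = sum((p .- y) .^ 2) / length(y)

# :complexity, node count; sketched here over a plain Julia Expr,
# counting each call as one node plus its arguments
complexity(e::Expr) = 1 + sum(complexity, e.args[2:end]; init = 0)
complexity(x) = 1
```

For example, `complexity(:(x + sin(y)))` counts four nodes: +, x, sin, and y.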

Builder Pattern

For step-by-step construction, use SymbolicProblem:

prob = SymbolicProblem()
variables!(prob, :x, :y)
operators!(prob, binary=[+, -, *, /], unary=[sin])
constants!(prob, (-5.0, 5.0), probability=0.3)
objectives!(prob, :mse, :complexity)
mode!(prob, :aggregation)
add_objective!(prob, :custom, my_func)
add_function!(prob, :my_op, x -> ...)
data!(prob, X=X, y=y, extra=extra_data)
config!(prob, population=200, generations=100)
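
A common use of the builder is registering a custom objective. The sketch below assumes a `(predictions, targets) -> score` callback signature, which is an assumption for illustration rather than a documented contract; `X` and `y` stand for your own data.

```julia
# Hypothetical custom objective: mean absolute error.
# The (predictions, targets) -> score signature is an assumption.
mae(ŷ, y) = sum(abs.(ŷ .- y)) / length(y)

prob = SymbolicProblem()
variables!(prob, :x, :y)
operators!(prob, binary = [+, -, *], unary = [sin])
add_objective!(prob, :mae, mae)        # register under the name :mae
objectives!(prob, :mae, :complexity)   # select it alongside a built-in
data!(prob, X = X, y = y)
config!(prob, population = 100, generations = 50)
result = solve(prob)
```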

Solving and Results

result = solve(prob)

best_sol = best(result)
best_sol.expression                     # Formula as string
best_sol.objectives                     # [mse, complexity, ...]

front = pareto_front(result)            # All Pareto-optimal solutions
predictions = evaluate_best(result, X)  # Evaluate on new data

Macro Interface

For quick one-liners:

result = @symbolic_regression(X, y, population=200, generations=50)

DSL Functions

SymbolicOptimization.DSL.symbolic_problem - Function
symbolic_problem(; kwargs...) -> SymbolicProblem

Create a symbolic optimization problem with keyword arguments.

Example

# Standard regression
prob = symbolic_problem(
    X = my_data,
    y = my_targets,
    variables = [:x, :y, :z],
    operators = [+, -, *, /],
    population = 200,
    generations = 100
)

# Aggregator discovery
prob = symbolic_problem(
    X = forecaster_predictions,  # rows = claims, cols = forecasters
    y = ground_truth,            # 0/1 outcomes
    mode = :aggregation,
    objectives = [:brier, :complexity],
    population = 200,
    generations = 100
)

SymbolicOptimization.DSL.solve - Function
solve(prob::SymbolicProblem) -> SymbolicResult

Solve the symbolic optimization problem and return results.

solve(prob::PolicyProblem) -> SymbolicResult

Solve a policy optimization problem and return results.

The evaluator function is called for each candidate expression and should return a vector of objective values to minimize.

If seed_formulas are provided, they are used to create initial population members.

If constraints are provided, they are checked and violations are penalized:

  • Soft mode: Violation score is added to objectives (scaled by penalty_weight)
  • Hard mode: Violating individuals receive very large objective values
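
As a sketch of such an evaluator (the helpers `simulate_cost` and `count_nodes` are hypothetical placeholders, not package functions):

```julia
# Hypothetical evaluator for a policy problem: returns a vector of
# objective values to minimize, one entry per objective.
function my_evaluator(tree)
    cost = simulate_cost(tree)   # domain-specific simulation (placeholder)
    size = count_nodes(tree)     # expression size (placeholder helper)
    return [cost, Float64(size)]
end
```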

SymbolicOptimization.DSL.best - Function
best(m::SymbolicModel; objective=1) -> NamedTuple

Get the best solution for a given objective (default: first).

Returns a NamedTuple with:

  • tree: the AbstractNode expression tree
  • expression: string representation
  • latex: LaTeX representation
  • objectives: vector of objective values
  • rank: Pareto rank

b = best(m)
b.expression   # "pH_E - pH"
b.objectives   # [0.123, 5.0]
b.tree         # AbstractNode

best(result::SymbolicResult) -> (expression::String, objectives::Vector{Float64})

Get the best solution (lowest first objective).
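
Because this method returns a plain tuple rather than a NamedTuple, it can be destructured directly:

```julia
expr, objs = best(result)
println("best: $expr  objectives: $objs")
```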

SymbolicOptimization.DSL.pareto_front - Function
pareto_front(m::SymbolicModel) -> Vector{NamedTuple}

Get all Pareto-optimal solutions.

for sol in pareto_front(m)
    println("$(sol.expression)  objectives=$(sol.objectives)")
end

pareto_front(result::SymbolicResult)

Get all Pareto-optimal solutions.

SymbolicOptimization.DSL.objectives! - Function
objectives!(prob, objs...)

Set the objectives for optimization. Use built-in names such as :mse and :complexity, or symbols matching custom objectives registered with add_objective!.

SymbolicOptimization.DSL.mode! - Function
mode!(prob, mode::Symbol)

Set the evaluation mode:

  • :regression (default) - Standard symbolic regression. Variables map to columns of X, each row is evaluated independently, predictions compared to y.
  • :aggregation - For aggregator discovery. Variables are forecaster predictions (columns), the formula aggregates them into a single prediction per row.

In aggregation mode, built-in objectives like :brier and :mse compare aggregated predictions to y (ground truth).
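
To make the row-wise semantics concrete, here is how a discovered aggregator such as (a + b) / 2 would be applied in plain Julia. No package calls are involved, and the data is made up for illustration:

```julia
X = [0.9 0.7;               # rows = claims, columns = forecasters
     0.2 0.4]
y = [1.0, 0.0]              # ground-truth 0/1 outcomes

agg(a, b) = (a + b) / 2     # an example discovered formula
p = [agg(X[i, 1], X[i, 2]) for i in 1:size(X, 1)]  # one prediction per row
# p ≈ [0.8, 0.3]; the :brier objective then compares p to y:
score = sum((p .- y) .^ 2) / length(y)             # ≈ 0.065
```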

SymbolicOptimization.add_objective! - Function
add_objective!(target, args...; kwargs...)

Add an objective to a problem specification. Extended by DSL (for SymbolicProblem) and API (for SymbolicModel).
