DSL Interface

The Domain-Specific Language (DSL) provides the simplest way to use SymbolicOptimization.

Creating Problems

Use symbolic_problem with keyword arguments:

prob = symbolic_problem(
    X = data_matrix,           # Input data (rows = samples, cols = variables)
    y = targets,               # Target values
    variables = [:x, :y, :z],  # Variable names (auto-generated if omitted)
    binary_operators = [+, -, *, /],
    unary_operators = [sin, cos, exp, log],
    constants = (-2.0, 2.0),   # Range for random constants
    objectives = [:mse, :complexity],  # What to optimize
    mode = :regression,        # :regression or :aggregation
    population = 200,
    generations = 100,
    max_depth = 6,
    max_nodes = 30,
    seed = 42,
    verbose = true
)

Evaluation Modes

  • :regression (default): Standard symbolic regression. Each row of X is a sample.
  • :aggregation: For aggregator discovery. Variables represent forecasters (columns of X), formulas combine their predictions.

result = solve(symbolic_problem(
    X = forecaster_predictions,
    y = ground_truth,
    mode = :aggregation,
    objectives = [:brier, :complexity]
))

Built-in Objectives

  • :mse — Mean squared error
  • :brier — Brier score (mean squared error of probability forecasts against 0/1 outcomes)
  • :complexity — Number of nodes in the expression tree
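
To make the built-ins concrete, here is a minimal sketch of each metric as a standalone Julia function. These are illustrative definitions only, not the package's internal implementations; the Expr-based node count in particular is a stand-in for the package's own tree type.

```julia
# Illustrative definitions of the built-in objectives (not package internals).

# :mse, mean squared error between predictions ŷ and targets y
mse(ŷ, y) = sum((ŷ .- y) .^ 2) / length(y)

# :brier, the Brier score: MSE of probability forecasts against 0/1 outcomes
brier(p, y) = sum((p .- y) .^ 2) / length(y)

# :complexity, node count; sketched here over a plain Julia Expr,
# counting each call as one node plus its arguments
complexity(e::Expr) = 1 + sum(complexity, e.args[2:end]; init = 0)
complexity(x) = 1
```

For example, `complexity(:(x + sin(y)))` counts four nodes: +, x, sin, and y.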

Builder Pattern

For step-by-step construction, use SymbolicProblem:

prob = SymbolicProblem()
variables!(prob, :x, :y)
operators!(prob, binary=[+, -, *, /], unary=[sin])
constants!(prob, (-5.0, 5.0), probability=0.3)
objectives!(prob, :mse, :complexity)
mode!(prob, :aggregation)
add_objective!(prob, :custom, my_func)
add_function!(prob, :my_op, x -> ...)
data!(prob, X=X, y=y, extra=extra_data)
config!(prob, population=200, generations=100)
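
A common use of the builder is registering a custom objective. The sketch below assumes a `(predictions, targets) -> score` callback signature, which is an assumption for illustration rather than a documented contract; `X` and `y` stand for your own data.

```julia
# Hypothetical custom objective: mean absolute error.
# The (predictions, targets) -> score signature is an assumption.
mae(ŷ, y) = sum(abs.(ŷ .- y)) / length(y)

prob = SymbolicProblem()
variables!(prob, :x, :y)
operators!(prob, binary = [+, -, *], unary = [sin])
add_objective!(prob, :mae, mae)        # register under the name :mae
objectives!(prob, :mae, :complexity)   # select it alongside a built-in
data!(prob, X = X, y = y)
config!(prob, population = 100, generations = 50)
result = solve(prob)
```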

Solving and Results

result = solve(prob)

best_sol = best(result)
best_sol.expression                     # Formula as string
best_sol.objectives                     # [mse, complexity, ...]

front = pareto_front(result)            # All Pareto-optimal solutions
predictions = evaluate_best(result, X)  # Evaluate on new data

Macro Interface

For quick one-liners:

result = @symbolic_regression(X, y, population=200, generations=50)

DSL Functions

SymbolicOptimization.DSL.symbolic_problem - Function
symbolic_problem(; kwargs...) -> SymbolicProblem

Create a symbolic optimization problem with keyword arguments.

Example

# Standard regression
prob = symbolic_problem(
    X = my_data,
    y = my_targets,
    variables = [:x, :y, :z],
    operators = [+, -, *, /],
    population = 200,
    generations = 100
)

# Aggregator discovery
prob = symbolic_problem(
    X = forecaster_predictions,  # rows = claims, cols = forecasters
    y = ground_truth,            # 0/1 outcomes
    mode = :aggregation,
    objectives = [:brier, :complexity],
    population = 200,
    generations = 100
)

SymbolicOptimization.DSL.solve - Function
solve(prob::SymbolicProblem) -> SymbolicResult

Solve the symbolic optimization problem and return results.

solve(prob::PolicyProblem) -> SymbolicResult

Solve a policy optimization problem and return results.

The evaluator function is called for each candidate expression and should return a vector of objective values to minimize.

If seed_formulas are provided, they are used to create initial population members.

If constraints are provided, they are checked and violations are penalized:

  • Soft mode: Violation score is added to objectives (scaled by penalty_weight)
  • Hard mode: Violating individuals receive very large objective values
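
As a sketch of such an evaluator (the helpers `simulate_cost` and `count_nodes` are hypothetical placeholders, not package functions):

```julia
# Hypothetical evaluator for a policy problem: returns a vector of
# objective values to minimize, one entry per objective.
function my_evaluator(tree)
    cost = simulate_cost(tree)   # domain-specific simulation (placeholder)
    size = count_nodes(tree)     # expression size (placeholder helper)
    return [cost, Float64(size)]
end
```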

SymbolicOptimization.DSL.best - Function
best(m::SymbolicModel; objective=1) -> NamedTuple

Get the best solution for a given objective (default: first).

Returns a NamedTuple with:

  • tree: the AbstractNode expression tree
  • expression: string representation
  • latex: LaTeX representation
  • objectives: vector of objective values
  • rank: Pareto rank

b = best(m)
b.expression   # "pH_E - pH"
b.objectives   # [0.123, 5.0]
b.tree         # AbstractNode

best(result::SymbolicResult) -> (expression::String, objectives::Vector{Float64})

Get the best solution (lowest first objective).
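
Because this method returns a plain tuple rather than a NamedTuple, it can be destructured directly:

```julia
expr, objs = best(result)
println("best: $expr  objectives: $objs")
```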

SymbolicOptimization.DSL.pareto_front - Function
pareto_front(m::SymbolicModel) -> Vector{NamedTuple}

Get all Pareto-optimal solutions.

for sol in pareto_front(m)
    println("$(sol.expression)  objectives=$(sol.objectives)")
end

pareto_front(result::SymbolicResult)

Get all Pareto-optimal solutions.

SymbolicOptimization.DSL.objectives! - Function
objectives!(prob, objs...)

Set the objectives for optimization. Use built-in names such as :mse and :complexity, or symbols matching custom objectives registered with add_objective!.

SymbolicOptimization.DSL.mode! - Function
mode!(prob, mode::Symbol)

Set the evaluation mode:

  • :regression (default) - Standard symbolic regression. Variables map to columns of X, each row is evaluated independently, predictions compared to y.
  • :aggregation - For aggregator discovery. Variables are forecaster predictions (columns), the formula aggregates them into a single prediction per row.

In aggregation mode, built-in objectives like :brier and :mse compare aggregated predictions to y (ground truth).
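
To make the row-wise semantics concrete, here is how a discovered aggregator such as (a + b) / 2 would be applied in plain Julia. No package calls are involved, and the data is made up for illustration:

```julia
X = [0.9 0.7;               # rows = claims, columns = forecasters
     0.2 0.4]
y = [1.0, 0.0]              # ground-truth 0/1 outcomes

agg(a, b) = (a + b) / 2     # an example discovered formula
p = [agg(X[i, 1], X[i, 2]) for i in 1:size(X, 1)]  # one prediction per row
# p ≈ [0.8, 0.3]; the :brier objective then compares p to y:
score = sum((p .- y) .^ 2) / length(y)             # ≈ 0.065
```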

SymbolicOptimization.add_objective! - Function
add_objective!(target, args...; kwargs...)

Add an objective to a problem specification. Extended by DSL (for SymbolicProblem) and API (for SymbolicModel).
