SymbolicOptimization.jl

A Julia package for multi-objective symbolic optimization using grammar-guided genetic programming and NSGA-II.

What is Symbolic Optimization?

Symbolic optimization evolves mathematical expressions to optimize arbitrary objectives. The result is an interpretable formula, not a black-box model.

This is not (just) symbolic regression. Symbolic regression finds $f(x) \approx y$ by minimizing prediction error; symbolic optimization is the more general framework, searching expression space under whatever objectives you define:

| Application | Objective(s) | Output |
| --- | --- | --- |
| Aggregator discovery | Calibration, accuracy on crowd predictions | Formula combining forecaster estimates |
| Belief update rules | Match normative Bayesian updates | Heuristic for updating credences |
| Scoring rule design | Proper scoring, discrimination | Formula evaluating forecaster quality |
| Curve fitting | MSE on (x, y) data | Regression formula |
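
To make the "any objective" idea concrete, here is a minimal plain-Julia sketch of the aggregator-discovery row. It does not use SymbolicOptimization.jl at all: the toy forecasts, the two hand-written candidate formulas, and the helper names (`brier`, `simple_mean`, `extremized_mean`, `score`) are all assumptions made up for illustration. The point is only that a candidate formula is judged by a domain-specific score rather than by fit to a regression target.

```julia
using Statistics  # standard library, provides `mean`

# Toy data, invented for illustration: rows are questions, columns are forecasters.
forecasts = [0.9 0.8;
             0.2 0.4;
             0.7 0.6]
outcomes = [1.0, 0.0, 1.0]  # 1.0 = event happened, 0.0 = it did not

# Objective: Brier score (mean squared error of probabilities; lower is better).
brier(p, o) = mean((p .- o) .^ 2)

# Two candidate aggregation formulas (hypothetical, written by hand).
simple_mean(row) = mean(row)
function extremized_mean(row)
    m = mean(row)
    return m^2 / (m^2 + (1 - m)^2)  # pushes the mean away from 0.5
end

# Score a candidate by applying it to every question.
score(aggregate) = brier([aggregate(r) for r in eachrow(forecasts)], outcomes)

println("simple mean:     ", score(simple_mean))
println("extremized mean: ", score(extremized_mean))
```

In a symbolic optimization run, the hand-written candidates would be replaced by evolved expressions, and a score of this kind would be one of several objectives.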

Quick Start

The simplest way to use SymbolicOptimization is via the DSL Interface:

using SymbolicOptimization

# Generate some data
X = reshape(-3:0.2:3, :, 1)
y = vec(X.^2 .- 1)  # Target: x² - 1

# Define and solve the problem in one call
result = solve(symbolic_problem(
    X = X,
    y = y,
    variables = [:x],
    binary_operators = [+, -, *, /],
    population = 200,
    generations = 50
))

# Get the best solution
best_sol = best(result)
println(best_sol.expression)  # Something like "(x * x) - 1.0"
println(best_sol.objectives)  # [MSE, complexity]
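
The objective vector above pairs a fit measure with a size measure. As a rough illustration of that trade-off, the following plain-Julia sketch scores two candidate formulas on the same data. It is independent of the package: `evaluate`, `count_nodes`, and `mse` are hypothetical helpers, and counting expression nodes is only one plausible definition of complexity, not necessarily the one SymbolicOptimization.jl uses.

```julia
# Operators the tiny interpreter understands (same set as in the Quick Start call).
OPS = Dict(:+ => +, :- => -, :* => *, :/ => /)

# Evaluate an expression in a single variable; every symbol is treated as `x`.
evaluate(c::Number, x) = c
evaluate(s::Symbol, x) = x
evaluate(ex::Expr, x) = OPS[ex.args[1]]((evaluate(a, x) for a in ex.args[2:end])...)

# Assumed complexity measure: total number of nodes (operators, variables, constants).
count_nodes(ex::Expr) = 1 + sum(count_nodes, ex.args[2:end]; init = 0)
count_nodes(leaf) = 1

# Mean squared error of an expression against target values.
mse(ex, xs, ys) = sum((evaluate(ex, x) - y)^2 for (x, y) in zip(xs, ys)) / length(xs)

xs = collect(-3:0.2:3)
ys = xs .^ 2 .- 1  # same target as above: x² - 1

for candidate in (:(x * x - 1.0), :(x * x))
    println(candidate, " => ", [mse(candidate, xs, ys), count_nodes(candidate)])
end
```

The shorter formula has fewer nodes but a larger error, so neither candidate is better on both objectives; this is why a multi-objective search keeps a set of trade-off solutions rather than a single winner.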

For more control, see the Model API, which provides a JuMP-inspired, macro-based interface.

Installation

using Pkg
Pkg.add("SymbolicOptimization")

Package Overview

| Component | Description |
| --- | --- |
| DSL Interface | High-level symbolic_problem() and builder pattern |
| Model API | JuMP-style SymbolicModel with macros |
| Optimization | NSGA-II engine, objectives, results |
| Core Types & Trees | AbstractNode, tree utilities |
| Grammar System | Typed/untyped grammars, safe operations |
| Evaluation | Expression evaluation, context-aware evaluation |
| Genetic Operators | Generation, mutation, crossover, simplification |
| Advanced Topics | Symbolics.jl integration, constraints, policy problems |
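
NSGA-II ranks candidates by Pareto dominance instead of a single score. The sketch below shows that core idea in plain Julia, assuming all objectives are minimized. It is not the package's engine: `dominates` and `pareto_front` are hypothetical names, the score vectors are made up, and the full algorithm additionally sorts the remaining candidates into further fronts and uses crowding distance to preserve diversity.

```julia
# `a` dominates `b` when it is no worse on every objective
# and strictly better on at least one (all objectives minimized).
dominates(a, b) = all(a .<= b) && any(a .< b)

# Indices of the candidates that no other candidate dominates (the first front).
function pareto_front(points)
    return [i for (i, p) in enumerate(points)
            if !any(dominates(q, p) for (j, q) in enumerate(points) if j != i)]
end

# Made-up [error, complexity] pairs for five candidate expressions.
scores = [[0.0, 5.0], [1.0, 3.0], [1.5, 3.0], [9.0, 1.0], [9.5, 2.0]]

front = pareto_front(scores)
println(front)  # prints [1, 2, 4]
```

Candidates 3 and 5 drop out because candidates 2 and 4, respectively, are at least as good on both objectives and strictly better on one.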

Limitations

For high-dimensional symbolic regression (e.g., 100+ variables), consider SymbolicRegression.jl, which implements multi-population island models optimized for that use case.

SymbolicOptimization.jl excels at:

  • Custom objective functions beyond MSE
  • Domain-specific operators and grammars
  • Context-aware evaluation (belief updating, aggregation)
  • Problems with smaller search spaces but complex objectives