Advanced Topics

Deep Algebraic Simplification (Symbolics.jl)

The built-in simplify handles basic identities (x + 0 -> x, x * 1 -> x) and is applied during evolution. For deeper algebraic simplification of final results, load Symbolics.jl:

using SymbolicOptimization
using Symbolics  # activates the extension

b = best(result)

# Deeply simplified tree (still an AbstractNode)
simple_tree = deep_simplify(b.tree)

# Or get simplified output directly
simplified_string(b.tree)   # "x^2 - 1"
simplified_latex(b.tree)    # "x^{2} - 1"

Safe operators (safe_div, safe_log, etc.) are automatically mapped to their standard equivalents before simplification.

Piecewise Formulas

If your grammar includes step_func, the search may discover piecewise formulas. Use simplify_piecewise to simplify each branch independently:

r = simplify_piecewise(b.tree)

r.if_string      # simplified if-branch as string
r.else_string    # simplified else-branch as string
r.if_latex       # LaTeX for the if-branch
r.else_latex     # LaTeX for the else-branch
r.if_branch      # simplified if-branch as AbstractNode
r.else_branch    # simplified else-branch as AbstractNode

Domain Substitutions

Declare domain-specific identities via the substitutions argument:

r = simplify_piecewise(b.tree, substitutions = complement_vars(
    :pnotH_notE => :pH_notE,   # P(not H|not E) = 1 - P(H|not E)
    :pnotH_E    => :pH_E,      # P(not H|E) = 1 - P(H|E)
))

Limitations

  • Factoring: Symbolics.jl tends to expand rather than factor polynomials.
  • Piecewise detection: Assumes all step_func(...) calls share the same condition.
  • Performance: Deep simplification should only be used on final results, not inside the GP loop.

These functions are defined only when Symbolics is loaded, which keeps the base package lightweight.

Constraints

Define constraints to guide the search toward valid expressions:

cs = ConstraintSet()
add_constraint!(cs, directionality_constraint(:x, :increasing))
add_constraint!(cs, monotonicity_constraint(:y, data))
add_constraint!(cs, boundary_constraint(:x, 0.0, 1.0))

violation = check_constraints(tree, cs, data)
rate = violation_rate(tree, cs, data)

Built-in Constraints

Policy Problems

For problems where objectives cannot be computed point-by-point, use policy_problem with a custom evaluator.

Discrimination Problems (AUC-based)

function my_simulator(rng)
    inputs = Dict(:x => rand(rng), :y => rand(rng))
    label = inputs[:x] > inputs[:y]   # placeholder rule; replace with your own ground truth
    return (inputs, label)
end

result = solve(policy_problem(
    variables = [:x, :y],
    evaluator = discrimination_evaluator(
        simulator = my_simulator,
        n_simulations = 1000,
        objectives = [:auc, :complexity]
    ),
    n_objectives = 2,
    population = 200,
    generations = 100
))

Sequential Problems

For problems with state accumulation (e.g., belief updating):

sequences = [
    [Dict(:prior => 0.5, :likelihood => 0.8, :target => 0.67), ...],
    ...
]

result = solve(policy_problem(
    variables = [:prior, :likelihood],
    evaluator = sequential_evaluator(
        sequences = sequences,
        target_key = :target,
        objectives = [:mse, :complexity]
    ),
    ...
))

Custom Evaluators

my_evaluator = (tree, env, evaluate_fn, count_nodes_fn) -> begin
    score = ...
    complexity = count_nodes_fn(tree)
    return [score, Float64(complexity)]
end

result = solve(policy_problem(
    variables = [...],
    evaluator = my_evaluator,
    n_objectives = 2,
    ...
))

Advanced Reference

SymbolicOptimization.deep_simplify - Function
deep_simplify(tree::AbstractNode; expand::Bool=false) -> AbstractNode

Perform deep algebraic simplification using Symbolics.jl.

Converts the tree to a Symbolics expression, applies Symbolics.simplify, and converts back to an AbstractNode. This can resolve identities like (x + 1) * (x - 1) → x^2 - 1 that the built-in simplify cannot.

Keyword Arguments

  • expand::Bool=false: If true, also expand products into sums.

Requires using Symbolics.

Example

using Symbolics
tree = FunctionNode(:*, 
    FunctionNode(:+, Variable(:x), Constant(1.0)),
    FunctionNode(:-, Variable(:x), Constant(1.0)))
deep_simplify(tree)  # x^2 - 1

SymbolicOptimization.simplified_string - Function
simplified_string(tree::AbstractNode; digits::Int=3) -> String

Return a simplified human-readable string representation of the tree using Symbolics.jl for algebraic simplification.

Requires using Symbolics.

SymbolicOptimization.simplify_piecewise - Function
simplify_piecewise(tree::AbstractNode; indicator::Symbol=:step_func,
                   substitutions::Dict{Symbol,AbstractNode}=Dict{Symbol,AbstractNode}()) -> PiecewiseResult

Simplify a piecewise formula by separating branches and simplifying each independently.

Detects indicator(cond) * A + (1 - indicator(cond)) * B patterns (in any tree shape), substitutes indicator(cond) = 1 and indicator(cond) = 0 to extract branches, then uses deep_simplify on each branch.
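
The branch extraction described above can be written out as a short derivation. This is only a sketch of the algebra, writing s for indicator(cond) and assuming s equals 1 when the condition is ≥ 0 and 0 otherwise, as the PiecewiseResult field descriptions suggest:

```latex
\begin{aligned}
f &= s \cdot A + (1 - s) \cdot B, \qquad s = \mathrm{indicator}(\mathrm{cond}) \in \{0, 1\} \\
s = 1 &\;\Rightarrow\; f = 1 \cdot A + 0 \cdot B = A \quad \text{(if-branch, } \mathrm{cond} \ge 0\text{)} \\
s = 0 &\;\Rightarrow\; f = 0 \cdot A + 1 \cdot B = B \quad \text{(else-branch, } \mathrm{cond} < 0\text{)}
\end{aligned}
```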

Keyword Arguments

  • indicator::Symbol=:step_func: Name of the step/indicator function in the tree.
  • substitutions::Dict{Symbol,AbstractNode}: Variable identities to apply before simplifying, e.g. Dict(:pnotH_notE => FunctionNode(:-, Constant(1.0), Variable(:pH_notE))). This lets Symbolics.jl exploit domain constraints like P(¬H|¬E) = 1 - P(H|¬E). See also complement_vars for a convenient way to build these.

Requires using Symbolics.

Example

using Symbolics

# With domain substitutions via complement_vars helper
result = simplify_piecewise(tree, substitutions = complement_vars(
    :pnotH_notE => :pH_notE,
    :pnotH_E    => :pH_E,
))
println(result)
# Piecewise formula:
#   When (pH_E - pH_notE) ≥ 0:
#     (pH_E - pH_notE) * pH_notE / (1 - pH_notE)
#   When (pH_E - pH_notE) < 0:
#     (pH_E - pH_notE) * (1 - pH_notE) / pH_notE

SymbolicOptimization.PiecewiseResult - Type
PiecewiseResult

Result of simplify_piecewise, containing the separated and simplified branches.

Fields

  • condition::AbstractNode: The condition tree (from inside step_func(...))
  • condition_string::String: Human-readable condition
  • if_branch::AbstractNode: Simplified tree for the condition ≥ 0 case
  • else_branch::AbstractNode: Simplified tree for the condition < 0 case
  • if_string::String: Simplified string for the if-branch
  • else_string::String: Simplified string for the else-branch
  • if_latex::String: LaTeX for the if-branch
  • else_latex::String: LaTeX for the else-branch

SymbolicOptimization.complement_vars - Function
complement_vars(pairs::Pair{Symbol,Symbol}...) -> Dict{Symbol, AbstractNode}

Build a substitution dictionary for complementary probability variables.

Each pair a => b creates the substitution a = 1 - b. This is useful for telling simplify_piecewise about identities like P(¬H|¬E) = 1 - P(H|¬E).

Example

subs = complement_vars(:pnotH_notE => :pH_notE, :pnotH_E => :pH_E)
# Equivalent to:
# Dict(:pnotH_notE => FunctionNode(:-, Constant(1.0), Variable(:pH_notE)),
#      :pnotH_E    => FunctionNode(:-, Constant(1.0), Variable(:pH_E)))

result = simplify_piecewise(tree, substitutions = subs)
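
To see why such a substitution can help, consider a hypothetical sub-expression (made up for illustration, not taken from any search result) that contains both a variable and its complement. Once the identity a = 1 - b is applied, the two terms cancel:

```latex
\begin{aligned}
\mathtt{pnotH\_notE} + \mathtt{pH\_notE}
  &= (1 - \mathtt{pH\_notE}) + \mathtt{pH\_notE} \\
  &= 1
\end{aligned}
```

Without the substitution, a simplifier would have to treat the two symbols as unrelated and could not reduce the sum.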

SymbolicOptimization.to_symbolics - Function
to_symbolics(tree::AbstractNode) -> Symbolics.Num

Convert an AbstractNode expression tree into a Symbolics.Num expression.

Safe operators (e.g., safe_div, safe_log) are mapped to their standard mathematical equivalents so that Symbolics.jl can reason about them.

Requires using Symbolics.

SymbolicOptimization.from_symbolics - Function
from_symbolics(expr) -> AbstractNode

Convert a Symbolics.Num expression back into an AbstractNode tree.

Standard mathematical operators are preserved (not mapped to safe variants), so the resulting tree is best used for display or analysis rather than inside the GP loop.

Requires using Symbolics.

SymbolicOptimization.Constraints.Constraint - Type
Constraint

A constraint that candidate expressions must satisfy.

Fields:

  • name: Identifier for the constraint
  • test_fn: Function (tree, evaluate_fn) -> (satisfied::Bool, violation_score::Float64)
  • description: Human-readable description

SymbolicOptimization.Constraints.directionality_constraint - Function
directionality_constraint(; n_tests=100, seed=42) -> Constraint

A confirmation measure should be:

  • Positive when P(H|E) > P(H) (E confirms H)
  • Negative when P(H|E) < P(H) (E disconfirms H)
  • Zero when P(H|E) = P(H) (E is irrelevant to H)

Uses probabilistic test cases to check this property.
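
The property above can be illustrated with a small standalone check. This is a minimal sketch, not the library's implementation: it works on plain Julia functions rather than expression trees, and the helper `directionality_violation_rate` and the two example measures are hypothetical names introduced here. The `n_tests` and `seed` keywords only mirror the defaults shown in the signature.

```julia
using Random

# Hypothetical helper, not part of SymbolicOptimization: fraction of random
# test cases in which sign(measure) disagrees with sign(P(H|E) - P(H)).
function directionality_violation_rate(measure; n_tests = 100, seed = 42)
    rng = MersenneTwister(seed)
    violations = 0
    for _ in 1:n_tests
        pH, pH_E = rand(rng), rand(rng)      # random P(H) and P(H|E) in [0, 1)
        expected = sign(pH_E - pH)           # +1 confirms, -1 disconfirms, 0 irrelevant
        violations += sign(measure(pH, pH_E)) != expected
    end
    return violations / n_tests
end

difference_measure(pH, pH_E) = pH_E - pH     # difference measure P(H|E) - P(H)
reversed_measure(pH, pH_E) = pH - pH_E       # deliberately wrong sign

rate_ok = directionality_violation_rate(difference_measure)
rate_bad = directionality_violation_rate(reversed_measure)
```

The difference measure never violates the property, while the reversed measure violates it in essentially every case. Random draws almost never produce P(H|E) = P(H) exactly, so a fuller check would add explicit irrelevance cases.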

SymbolicOptimization.Constraints.logicality_constraint - Function
logicality_constraint(; n_tests=50, seed=42) -> Constraint

A confirmation measure should achieve:

  • Maximum value when E logically entails H (E ⊨ H, so P(H|E) = 1)
  • Minimum value when E logically entails ¬H (E ⊨ ¬H, so P(H|E) = 0)

SymbolicOptimization.Constraints.symmetry_constraint - Function
symmetry_constraint(; type=:equivalence, n_tests=50, seed=42) -> Constraint

Symmetry constraints for confirmation measures:

  • :equivalence - C(H,E) = C(¬H,¬E) (Eells-Fitelson symmetry)
  • :sign - sign(C(H,E)) = -sign(C(¬H,E))
  • :commutativity - C(H,E) = C(E,H) (controversial, often rejected)

SymbolicOptimization.DSL.policy_problem - Function
policy_problem(; kwargs...) -> PolicyProblem

Create a symbolic policy optimization problem.

Required Arguments

  • variables: Vector of variable names (Symbols)
  • evaluator: Function (tree, env, evaluate_fn, count_nodes_fn) -> Vector{Float64} that computes objective values for a candidate expression

Optional Arguments

  • binary_operators: Binary operators (default: [+, -, *, /])
  • unary_operators: Unary operators (default: [])
  • ternary_operators: Ternary operators for conditionals (default: [])
  • constants: Tuple (min, max) for random constants (default: (-1.0, 1.0))
  • constant_prob: Probability of generating constants (default: 0.3)
  • n_objectives: Number of objectives returned by evaluator (default: 2)
  • environment: Dict of data/parameters passed to evaluator (default: empty)
  • constraints: ConstraintSet for theoretical requirements (default: nothing)
  • seed_formulas: Vector of functions that build seed expressions (see the note after the example below)
  • population, generations, max_depth, max_nodes, seed, verbose: Config

Example with conditionals (for piecewise measures such as Crupi's z)

# Search for confirmation measures that can have piecewise structure
result = solve(policy_problem(
    variables = [:pH, :pE, :pH_E, :pE_H, :pE_notH, ...],
    evaluator = my_evaluator,
    # Enable conditionals to discover measures like Crupi's z
    ternary_operators = [:ifelse],
    # Optional: add step function for alternative conditional constructions
    unary_operators = [safe_step, safe_abs],
    constant_prob = 0.0,  # No arbitrary constants
    ...
))

The seed_formulas functions receive a NamedTuple of variable nodes and should return an expression tree built using +, -, *, / operators.

SymbolicOptimization.Evaluators.discrimination_evaluator - Function
discrimination_evaluator(; simulator, n_simulations=1000, objectives=[:auc, :complexity])

Create an evaluator for discrimination problems where formulas produce scores that should separate positive from negative cases.

Arguments

  • simulator: Function (rng) -> (inputs::Dict{Symbol,Float64}, label::Bool) that generates one trial with input variables and ground truth label
  • n_simulations: Number of simulation trials per evaluation
  • objectives: Which objectives to compute. Options:
    • :auc - Area under the ROC curve. AUC is to be maximized, so the evaluator returns 1 - AUC
    • :complexity - Number of nodes in tree
    • :correlation - Correlation between scores and labels

Returns

A function (tree, env) -> Vector{Float64} suitable for use in policy_problem.

Example

# Simulator for confirmation tracking
function conf_simulator(rng)
    # Generate random probability setup
    ...
    inputs = Dict(:pH => pH, :pE => pE, :pH_E => pH_E, ...)
    label = H_is_true
    return (inputs, label)
end

evaluator = discrimination_evaluator(
    simulator = conf_simulator,
    n_simulations = 1000,
    objectives = [:auc, :complexity]
)
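
For intuition about the :auc objective, the following standalone sketch computes a pairwise (Mann-Whitney) AUC estimate from scores and Boolean labels and converts it to the 1 - AUC form mentioned above. It is an assumption-laden illustration: the function `auc` and the sample data are hypothetical, and the library may compute AUC differently.

```julia
# Hypothetical helper, not part of SymbolicOptimization: pairwise estimate of
# the area under the ROC curve. A tie between a positive and a negative score
# counts as half a win.
function auc(scores::AbstractVector{<:Real}, labels::AbstractVector{Bool})
    pos = scores[labels]
    neg = scores[.!labels]
    # AUC is undefined with only one class; fall back to chance level.
    (isempty(pos) || isempty(neg)) && return 0.5
    wins = 0.0
    for p in pos, n in neg
        wins += p > n ? 1.0 : (p == n ? 0.5 : 0.0)
    end
    return wins / (length(pos) * length(neg))
end

scores = [0.9, 0.4, 0.35, 0.6, 0.5]           # formula outputs per trial (made up)
labels = [true, true, false, true, false]     # ground-truth labels (made up)

auc_value = auc(scores, labels)   # 5 of the 6 positive/negative pairs are ordered correctly
auc_loss = 1 - auc_value          # lower is better, matching the 1 - AUC convention
```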

SymbolicOptimization.Evaluators.sequential_evaluator - Function
sequential_evaluator(; sequences, target_key, objectives=[:mse, :complexity])

Create an evaluator for sequential problems where formulas are applied step-by-step.

Arguments

  • sequences: Vector of sequences. Each sequence is a Vector{Dict{Symbol,Any}} whose entries are steps holding the input variables; every step must include target_key for comparison.
  • target_key: Symbol for the target value to predict at each step
  • state_keys: Symbols that carry state between steps (updated with formula output)
  • objectives: Which objectives to compute

Example (belief updating)

# Each sequence is a series of (prior, evidence, posterior) steps
sequences = [
    [Dict(:prior => 0.5, :likelihood => 0.8, :target => 0.67), ...],
    ...
]

evaluator = sequential_evaluator(
    sequences = sequences,
    target_key = :target,
    objectives = [:mse, :complexity]
)
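
The idea of applying a formula step by step while carrying state can be sketched without the library. The code below is a rough illustration under stated assumptions: the formula's output replaces the carried state at every step, the error is the mean squared difference to the target, and the names `bayes_update` and `sequence_mse` as well as the sample data are hypothetical.

```julia
# Hypothetical candidate formula: Bayes' rule for a binary hypothesis, assuming
# the likelihood under the alternative hypothesis is 1 - likelihood.
bayes_update(prior, lik) = prior * lik / (prior * lik + (1 - prior) * (1 - lik))

# Hypothetical helper, not part of SymbolicOptimization: apply `formula` to each
# step, feed its output back in as the next state, and average the squared error.
function sequence_mse(formula, seq; state_key = :prior, target_key = :target)
    state = seq[1][state_key]          # initial state comes from the first step
    err = 0.0
    for obs in seq
        state = formula(state, obs[:likelihood])
        err += (state - obs[target_key])^2
    end
    return err / length(seq)
end

seq = [
    Dict(:prior => 0.5, :likelihood => 0.8, :target => 0.8),
    Dict(:prior => 0.8, :likelihood => 0.8, :target => 0.94),
]

mse = sequence_mse(bayes_update, seq)
```

Because the state is carried forward, only the first step's :prior is read here; later priors are implied by the formula's own outputs.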

SymbolicOptimization.Evaluators.calibration_evaluator - Function
calibration_evaluator(; objectives=[:brier, :complexity])

Create an evaluator for probability aggregation problems.

Expects environment to contain:

  • :X - Matrix of forecaster predictions (rows = items, cols = forecasters)
  • :y - Vector of ground truth (0/1)
  • :var_names - Variable names for each forecaster column

Objectives

  • :brier - Brier score
  • :log_score - Logarithmic scoring rule
  • :accuracy - Classification accuracy at 0.5 threshold
  • :complexity - Number of nodes
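
The scoring objectives listed above can be sketched with textbook definitions. This is an illustration only: the library's sign conventions and clipping may differ (for example, it might return 1 - accuracy so that lower is better), the plain mean stands in for a candidate aggregation formula, and all names and data below are hypothetical.

```julia
using Statistics

# Made-up data: rows are items, columns are forecasters.
X = [0.8 1.0;
     0.1 0.3;
     0.7 0.5;
     0.3 0.5]
y = [1, 0, 1, 1]                       # ground truth (0/1)

# Stand-in for a candidate aggregation formula: the mean forecast per item.
p = vec(mean(X, dims = 2))

# Textbook definitions of the three scoring objectives.
brier(p, y) = sum((p .- y) .^ 2) / length(y)
log_score(p, y) = -sum(y .* log.(p) .+ (1 .- y) .* log.(1 .- p)) / length(y)
accuracy(p, y) = count((p .>= 0.5) .== (y .== 1)) / length(y)

b = brier(p, y)
l = log_score(p, y)
a = accuracy(p, y)
```

With these numbers the aggregated forecasts are roughly 0.9, 0.2, 0.6 and 0.4, so three of the four items are classified correctly at the 0.5 threshold.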