# OptimizationSession
The main class for Bayesian optimization workflows. Orchestrates variable space definition, data management, model training, and acquisition function execution.
## Class Reference
### `alchemist_core.session.OptimizationSession(search_space=None, experiment_manager=None, event_emitter=None, session_metadata=None)`
High-level interface for Bayesian optimization workflows.
This class orchestrates the complete optimization loop:

1. Define search space
2. Load/add experimental data
3. Train surrogate model
4. Run acquisition to suggest next experiments
5. Iterate
**Example:**

```python
from alchemist_core import OptimizationSession

# Create session with search space
session = OptimizationSession()
session.add_variable('temperature', 'real', bounds=(300, 500))
session.add_variable('pressure', 'real', bounds=(1, 10))
session.add_variable('catalyst', 'categorical', categories=['A', 'B', 'C'])

# Load experimental data
session.load_data('experiments.csv', target_columns='yield')

# Train model
session.train_model(backend='botorch', kernel='Matern')

# Suggest next experiment
next_point = session.suggest_next(strategy='EI', goal='maximize')
print(next_point)
```
Initialize optimization session.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `search_space` | `Optional[SearchSpace]` | Pre-configured SearchSpace object (optional) | `None` |
| `experiment_manager` | `Optional[ExperimentManager]` | Pre-configured ExperimentManager (optional) | `None` |
| `event_emitter` | `Optional[EventEmitter]` | EventEmitter for progress notifications (optional) | `None` |
| `session_metadata` | `Optional[SessionMetadata]` | Pre-configured session metadata (optional) | `None` |
#### `add_variable(name, var_type, **kwargs)`
Add a variable to the search space.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `name` | `str` | Variable name | *required* |
| `var_type` | `str` | Type (`'real'`, `'integer'`, `'categorical'`) | *required* |
| `**kwargs` | | Type-specific parameters. For `'real'`/`'integer'`: `bounds=(min, max)` or `min=..., max=...`; for `'categorical'`: `categories=[list of values]` or `values=[list]` | `{}` |
**Example:**

```python
session.add_variable('temp', 'real', bounds=(300, 500))
session.add_variable('catalyst', 'categorical', categories=['A', 'B'])
```
#### `get_search_space_summary()`
Get summary of current search space.
Returns:

| Type | Description |
|---|---|
| `Dict[str, Any]` | Dictionary with variable information |
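A quick usage sketch; the exact keys of the returned dictionary are not documented above, so the comment below is illustrative only:

```python
summary = session.get_search_space_summary()
print(summary)  # per-variable info, e.g. types and bounds/categories (exact keys depend on the implementation)
```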
#### `generate_initial_design(method='lhs', n_points=10, random_seed=None, **kwargs)`
Generate initial experimental design (Design of Experiments).
Creates a set of experimental conditions to evaluate before starting Bayesian optimization. This does NOT add the experiments to the session; you must evaluate them and add the results using `add_experiment()`.
Supported methods:

- `'random'`: Uniform random sampling
- `'lhs'`: Latin Hypercube Sampling (recommended; good space-filling properties)
- `'sobol'`: Sobol quasi-random sequences (low discrepancy)
- `'halton'`: Halton sequences
- `'hammersly'`: Hammersly sequences (low discrepancy)
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `method` | `str` | Sampling strategy to use | `'lhs'` |
| `n_points` | `int` | Number of points to generate | `10` |
| `random_seed` | `Optional[int]` | Random seed for reproducibility | `None` |
| `**kwargs` | | Additional method-specific parameters, e.g. `lhs_criterion` for the LHS method (`"maximin"`, `"correlation"`, `"ratio"`) | `{}` |
Returns:

| Type | Description |
|---|---|
| `List[Dict[str, Any]]` | List of dictionaries with variable names and values (no outputs) |
**Example:**

```python
# Generate initial design
points = session.generate_initial_design('lhs', n_points=10)

# Run experiments and add results
for point in points:
    output = run_experiment(**point)  # Your experiment function
    session.add_experiment(point, output=output)

# Now ready to train model
session.train_model()
```
#### `load_data(filepath, target_columns='Output', noise_column=None)`
Load experimental data from CSV file.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `filepath` | `str` | Path to CSV file | *required* |
| `target_columns` | `Union[str, List[str]]` | Target column name(s): a string for single-objective (`'yield'`) or a list for multi-objective (`['yield', 'selectivity']`) | `'Output'` |
| `noise_column` | `Optional[str]` | Optional column with measurement noise/uncertainty | `None` |
**Examples:**

Single-objective:

```python
>>> session.load_data('experiments.csv', target_columns='yield')
>>> session.load_data('experiments.csv', target_columns=['yield'])  # also works
```

Multi-objective:

```python
>>> session.load_data('experiments.csv', target_columns=['yield', 'selectivity'])
```
**Note:** If the CSV doesn't have columns matching `target_columns`, an error will be raised. Target columns are preserved with their original names internally.
#### `add_experiment(inputs, output, noise=None, iteration=None, reason=None)`
Add a single experiment to the dataset.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `inputs` | `Dict[str, Any]` | Dictionary mapping variable names to values | *required* |
| `output` | `float` | Target/output value | *required* |
| `noise` | `Optional[float]` | Optional measurement uncertainty | `None` |
| `iteration` | `Optional[int]` | Iteration number (auto-assigned if `None`) | `None` |
| `reason` | `Optional[str]` | Reason for this experiment (e.g., `'Manual'`, `'Expected Improvement'`) | `None` |
**Example:**

```python
session.add_experiment(
    inputs={'temperature': 350, 'catalyst': 'A'},
    output=0.85,
    reason='Manual'
)
```
#### `get_data_summary()`
Get summary of current experimental data.
Returns:

| Type | Description |
|---|---|
| `Dict[str, Any]` | Dictionary with data statistics |
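A quick usage sketch in context; the specific statistics returned are not documented above, so the comment is illustrative only:

```python
session.add_experiment({'temperature': 350, 'pressure': 2.0, 'catalyst': 'A'}, output=0.72)
stats = session.get_data_summary()
print(stats)  # e.g. experiment counts and output statistics (exact keys depend on the implementation)
```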
#### `add_staged_experiment(inputs)`
Add an experiment to the staging area (awaiting evaluation).
Staged experiments are typically suggested by acquisition functions but not yet evaluated. They can be retrieved, evaluated externally, and then added to the dataset with add_experiment().
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `inputs` | `Dict[str, Any]` | Dictionary mapping variable names to values | *required* |
**Example:**

```python
# Generate suggestions and stage them
suggestions = session.suggest_next(n_suggestions=3)
for point in suggestions.to_dict('records'):
    session.add_staged_experiment(point)

# Later, evaluate and add
staged = session.get_staged_experiments()
for point in staged:
    output = run_experiment(**point)
    session.add_experiment(point, output=output)
session.clear_staged_experiments()
```
#### `get_staged_experiments()`
Get all staged experiments awaiting evaluation.
Returns:

| Type | Description |
|---|---|
| `List[Dict[str, Any]]` | List of experiment input dictionaries |
#### `clear_staged_experiments()`
Clear all staged experiments.
Returns:

| Type | Description |
|---|---|
| `int` | Number of experiments cleared |
#### `move_staged_to_experiments(outputs, noises=None, iteration=None, reason=None)`
Evaluate staged experiments and add them to the dataset in batch.
Convenience method that pairs staged inputs with outputs and adds them all to the experiment manager, then clears the staging area.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `outputs` | `List[float]` | List of output values (must match the number of staged experiments) | *required* |
| `noises` | `Optional[List[float]]` | Optional list of measurement uncertainties | `None` |
| `iteration` | `Optional[int]` | Iteration number for all experiments (auto-assigned if `None`) | `None` |
| `reason` | `Optional[str]` | Reason for these experiments (e.g., `'Expected Improvement'`) | `None` |
Returns:

| Type | Description |
|---|---|
| `int` | Number of experiments added |
**Example:**

```python
# Stage some experiments
session.add_staged_experiment({'x': 1.0, 'y': 2.0})
session.add_staged_experiment({'x': 3.0, 'y': 4.0})

# Evaluate them
outputs = [run_experiment(**point) for point in session.get_staged_experiments()]

# Add to dataset and clear staging
session.move_staged_to_experiments(outputs, reason='LogEI')
```
#### `train_model(backend='sklearn', kernel='Matern', kernel_params=None, **kwargs)`
Train surrogate model on current data.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `backend` | `str` | `'sklearn'` or `'botorch'` | `'sklearn'` |
| `kernel` | `str` | Kernel type (`'RBF'`, `'Matern'`, `'RationalQuadratic'`) | `'Matern'` |
| `kernel_params` | `Optional[Dict]` | Additional kernel parameters (e.g., `{'nu': 2.5}` for Matern) | `None` |
| `**kwargs` | | Backend-specific parameters | `{}` |
Returns:

| Type | Description |
|---|---|
| `Dict[str, Any]` | Dictionary with training results and hyperparameters |
**Example:**

```python
results = session.train_model(backend='botorch', kernel='Matern')
print(results['metrics'])
```
#### `get_model_summary()`
Get summary of trained model.
Returns:

| Type | Description |
|---|---|
| `Optional[Dict[str, Any]]` | Dictionary with model information, or `None` if no model has been trained |
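A quick usage sketch; since the method returns `None` before any training, guard for that case (the commented fields are illustrative):

```python
info = session.get_model_summary()
if info is None:
    print("No model trained yet; call session.train_model() first")
else:
    print(info)  # e.g. backend, kernel, and fitted hyperparameters
```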
#### `suggest_next(strategy='EI', goal='maximize', n_suggestions=1, ref_point=None, **kwargs)`
Suggest next experiment(s) using acquisition function.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `strategy` | `str` | Acquisition strategy: `'EI'` (Expected Improvement), `'PI'` (Probability of Improvement), `'UCB'` (Upper Confidence Bound), `'LogEI'`/`'LogPI'` (log variants, BoTorch only), `'qEI'`/`'qUCB'`/`'qIPV'` (batch acquisition, BoTorch only), `'qEHVI'`/`'qNEHVI'` (multi-objective acquisition, BoTorch only) | `'EI'` |
| `goal` | `Union[str, List[str]]` | `'maximize'` or `'minimize'` (str), or a list of per-objective directions | `'maximize'` |
| `n_suggestions` | `int` | Number of suggestions (batch acquisition) | `1` |
| `ref_point` | `Optional[List[float]]` | Reference point for MOBO hypervolume (list of floats, optional) | `None` |
| `**kwargs` | | Strategy-specific parameters. Sklearn backend: `xi` (`float`, exploration parameter for EI/PI, default `0.01`); `kappa` (`float`, exploration parameter for UCB, default `1.96`). BoTorch backend: `beta` (`float`, exploration parameter for UCB, default `0.5`); `mc_samples` (`int`, Monte Carlo samples for batch acquisition, default `128`) | `{}` |
Returns:

| Type | Description |
|---|---|
| `DataFrame` | DataFrame with suggested experiment(s) |
Examples:
>>> # Single-objective
>>> next_point = session.suggest_next(strategy='EI', goal='maximize')
>>> # Multi-objective
>>> suggestions = session.suggest_next(
... strategy='qNEHVI',
... goal=['maximize', 'maximize'],
... ref_point=[0.0, 0.0]
... )
#### `predict(inputs)`
Make predictions at new points.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `inputs` | `DataFrame` | DataFrame with input features | *required* |
Returns:

| Type | Description |
|---|---|
| `Union[Tuple[ndarray, ndarray], Dict[str, Tuple[ndarray, ndarray]]]` | Single-objective: tuple of `(predictions, uncertainties)`. Multi-objective: dict keyed by objective name, with a `(predictions, uncertainties)` tuple per objective |
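A single-objective sketch based on the signature above; the column names are illustrative and must match the variables defined in the session:

```python
import pandas as pd

# Candidate conditions to score with the trained surrogate
candidates = pd.DataFrame({
    'temperature': [350, 450],
    'pressure': [2.0, 8.0],
    'catalyst': ['A', 'C'],
})

mean, std = session.predict(candidates)  # single-objective: (predictions, uncertainties)
for m, s in zip(mean, std):
    print(f"predicted {m:.3f} +/- {s:.3f}")
```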
#### `save_session(filepath)`
Save complete session state to JSON file.
Saves all session data, including:

- Session metadata (name, description, tags)
- Search space definition
- Experimental data
- Trained model state (if available)
- Complete audit log
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `filepath` | `str` | Path to save session file (`.json` extension recommended) | *required* |
**Example:**

```python
session.save_session("~/ALchemist_Sessions/catalyst_study_nov2025.json")
```
#### `load_session(filepath=None, retrain_on_load=True)`
Load session from JSON file.
This method works both as a static method (creating a new session) and as an instance method (loading into existing session):
Static usage (returns a new session):

```python
session = OptimizationSession.load_session("my_session.json")
```

Instance usage (loads into an existing session):

```python
session = OptimizationSession()
session.load_session("my_session.json")
# session.experiment_manager.df is now populated
```
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `filepath` | `str` | Path to session file (required when called as a static method; when called on an instance, the instance itself fills the first positional slot) | `None` |
| `retrain_on_load` | `bool` | Whether to retrain the model if a model config exists | `True` |
Returns:

| Type | Description |
|---|---|
| `OptimizationSession` | The loaded session (a new or modified instance) |
## Related Classes

### ExperimentManager
Manages experimental data storage and retrieval.
#### `alchemist_core.data.experiment_manager.ExperimentManager(search_space=None, target_columns=None)`
Class for storing and managing experimental data in a consistent way across backends. Provides methods for data access, saving/loading, and conversion to formats needed by different backends.
Supports both single-objective and multi-objective optimization:

- Single-objective: uses a single target column (default: `'Output'`, but configurable)
- Multi-objective: uses multiple target columns specified in the `target_columns` attribute
The `target_columns` parameter allows flexible column naming to support various CSV formats.
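A minimal construction sketch based on the signature above; whether a bare string is also accepted for `target_columns` is not stated here, so lists are used throughout:

```python
from alchemist_core.data.experiment_manager import ExperimentManager

# Single-objective: one target column
em = ExperimentManager(target_columns=['yield'])

# Multi-objective: several target columns
em_mo = ExperimentManager(target_columns=['yield', 'selectivity'])
```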
##### `add_experiment(point_dict, output_value=None, noise_value=None, iteration=None, reason=None)`
Add a single experiment point.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `point_dict` | `Dict[str, Union[float, str, int]]` | Dictionary with variable names as keys and values | *required* |
| `output_value` | `Optional[float]` | The experiment output/target value (if known) | `None` |
| `noise_value` | `Optional[float]` | Optional observation noise/uncertainty value for regularization | `None` |
| `iteration` | `Optional[int]` | Iteration number (auto-assigned if `None`) | `None` |
| `reason` | `Optional[str]` | Reason for this experiment (e.g., `'Initial Design (LHS)'`, `'Expected Improvement'`) | `None` |
##### `get_data()`
Get the raw experiment data.
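A minimal sketch; the return type is not documented above. The `load_session` example's reference to `session.experiment_manager.df` suggests a pandas DataFrame, but treat that as an assumption:

```python
data = session.experiment_manager.get_data()  # assumed: the underlying pandas DataFrame
print(data.head())
```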
##### `get_features_and_target()`
Get features (X) and target (y) separated.
Returns:

| Name | Type | Description |
|---|---|---|
| `X` | `DataFrame` | Features DataFrame |
| `y` | `Series` | Target Series |
Raises:

| Type | Description |
|---|---|
| `ValueError` | If the configured target column is not found in the data |
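A short usage sketch; it assumes the manager already holds experiments with the configured target column (otherwise `ValueError` is raised, per the table above):

```python
X, y = session.experiment_manager.get_features_and_target()
print(X.shape)  # features as a DataFrame
print(y.name)   # target as a Series
```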
##### `has_noise_data()`
Check if the experiment data includes noise values.
##### `from_csv(filepath, search_space=None)` *(classmethod)*
Class method to create an ExperimentManager from a CSV file.
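A minimal sketch based on the signature above; `'experiments.csv'` is a placeholder path, and the optional `search_space` argument is omitted:

```python
from alchemist_core.data.experiment_manager import ExperimentManager

em = ExperimentManager.from_csv('experiments.csv')
```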
## See Also
- SearchSpace - Variable space management
- Models - Gaussian Process models
- Acquisition - Acquisition functions