# OptimizationSession
The main class for Bayesian optimization workflows. Orchestrates variable space definition, data management, model training, and acquisition function execution.
## Class Reference
### `alchemist_core.session.OptimizationSession(search_space=None, experiment_manager=None, event_emitter=None, session_metadata=None)`
High-level interface for Bayesian optimization workflows.
This class orchestrates the complete optimization loop:

1. Define search space
2. Load/add experimental data
3. Train surrogate model
4. Run acquisition to suggest next experiments
5. Iterate
**Example:**

```python
from alchemist_core import OptimizationSession

# Create session with search space
session = OptimizationSession()
session.add_variable('temperature', 'real', bounds=(300, 500))
session.add_variable('pressure', 'real', bounds=(1, 10))
session.add_variable('catalyst', 'categorical', categories=['A', 'B', 'C'])

# Load experimental data
session.load_data('experiments.csv', target_columns='yield')

# Train model
session.train_model(backend='botorch', kernel='Matern')

# Suggest next experiment
next_point = session.suggest_next(strategy='EI', goal='maximize')
print(next_point)
```
Initialize optimization session.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `search_space` | `Optional[SearchSpace]` | Pre-configured SearchSpace object (optional) | `None` |
| `experiment_manager` | `Optional[ExperimentManager]` | Pre-configured ExperimentManager (optional) | `None` |
| `event_emitter` | `Optional[EventEmitter]` | EventEmitter for progress notifications (optional) | `None` |
| `session_metadata` | `Optional[SessionMetadata]` | Pre-configured session metadata (optional) | `None` |
#### `add_variable(name, var_type, **kwargs)`
Add a variable to the search space.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `name` | `str` | Variable name | *required* |
| `var_type` | `str` | Type (`'real'`, `'integer'`, `'categorical'`) | *required* |
| `**kwargs` | | Type-specific parameters. For `'real'`/`'integer'`: `bounds=(min, max)` or `min=..., max=...`; for `'categorical'`: `categories=[list of values]` or `values=[list]` | `{}` |
**Example:**

```python
session.add_variable('temp', 'real', bounds=(300, 500))
session.add_variable('catalyst', 'categorical', categories=['A', 'B'])
```
#### `get_search_space_summary()`
Get summary of current search space.
Returns:

| Type | Description |
|---|---|
| `Dict[str, Any]` | Dictionary with variable information |
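A quick usage sketch; the exact keys of the returned dictionary are not documented above, so the comment below is illustrative only:

```python
summary = session.get_search_space_summary()
print(summary)  # per-variable info, e.g. types and bounds/categories (exact keys depend on the implementation)
```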
#### `generate_initial_design(method='lhs', n_points=10, random_seed=None, **kwargs)`
Generate initial experimental design (Design of Experiments).
Creates a set of experimental conditions to evaluate before starting Bayesian optimization. This does NOT add the experiments to the session; you must evaluate them and add the results using `add_experiment()`.
Supported methods:

- `'random'`: Uniform random sampling
- `'lhs'`: Latin Hypercube Sampling (recommended; good space-filling properties)
- `'sobol'`: Sobol quasi-random sequences (low discrepancy)
- `'halton'`: Halton sequences
- `'hammersly'`: Hammersly sequences (low discrepancy)
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `method` | `str` | Sampling strategy to use | `'lhs'` |
| `n_points` | `int` | Number of points to generate | `10` |
| `random_seed` | `Optional[int]` | Random seed for reproducibility | `None` |
| `**kwargs` | | Additional method-specific parameters, e.g. `lhs_criterion` for the LHS method (`"maximin"`, `"correlation"`, `"ratio"`) | `{}` |
Returns:

| Type | Description |
|---|---|
| `List[Dict[str, Any]]` | List of dictionaries with variable names and values (no outputs) |
**Example:**

```python
# Generate initial design
points = session.generate_initial_design('lhs', n_points=10)

# Run experiments and add results
for point in points:
    output = run_experiment(**point)  # Your experiment function
    session.add_experiment(point, output=output)

# Now ready to train model
session.train_model()
```
#### `load_data(filepath, target_columns='Output', noise_column=None)`
Load experimental data from CSV file.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `filepath` | `str` | Path to CSV file | *required* |
| `target_columns` | `Union[str, List[str]]` | Target column name(s): a string for single-objective (`'yield'`) or a list for multi-objective (`['yield', 'selectivity']`) | `'Output'` |
| `noise_column` | `Optional[str]` | Optional column with measurement noise/uncertainty | `None` |
**Examples:**

Single-objective:

```python
>>> session.load_data('experiments.csv', target_columns='yield')
>>> session.load_data('experiments.csv', target_columns=['yield'])  # also works
```

Multi-objective:

```python
>>> session.load_data('experiments.csv', target_columns=['yield', 'selectivity'])
```
**Note:** If the CSV doesn't have columns matching `target_columns`, an error will be raised. Target columns are preserved with their original names internally.
#### `add_experiment(inputs, output, noise=None, iteration=None, reason=None)`
Add a single experiment to the dataset.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `inputs` | `Dict[str, Any]` | Dictionary mapping variable names to values | *required* |
| `output` | `float` | Target/output value | *required* |
| `noise` | `Optional[float]` | Optional measurement uncertainty | `None` |
| `iteration` | `Optional[int]` | Iteration number (auto-assigned if `None`) | `None` |
| `reason` | `Optional[str]` | Reason for this experiment (e.g., `'Manual'`, `'Expected Improvement'`) | `None` |
**Example:**

```python
session.add_experiment(
    inputs={'temperature': 350, 'catalyst': 'A'},
    output=0.85,
    reason='Manual'
)
```
#### `get_data_summary()`
Get summary of current experimental data.
Returns:

| Type | Description |
|---|---|
| `Dict[str, Any]` | Dictionary with data statistics |
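A quick usage sketch in context; the specific statistics returned are not documented above, so the comment is illustrative only:

```python
session.add_experiment({'temperature': 350, 'pressure': 2.0, 'catalyst': 'A'}, output=0.72)
stats = session.get_data_summary()
print(stats)  # e.g. experiment counts and output statistics (exact keys depend on the implementation)
```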
#### `add_staged_experiment(inputs)`
Add an experiment to the staging area (awaiting evaluation).
Staged experiments are typically suggested by acquisition functions but not yet evaluated. They can be retrieved, evaluated externally, and then added to the dataset with add_experiment().
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `inputs` | `Dict[str, Any]` | Dictionary mapping variable names to values | *required* |
**Example:**

```python
# Generate suggestions and stage them
suggestions = session.suggest_next(n_suggestions=3)
for point in suggestions.to_dict('records'):
    session.add_staged_experiment(point)

# Later, evaluate and add
staged = session.get_staged_experiments()
for point in staged:
    output = run_experiment(**point)
    session.add_experiment(point, output=output)
session.clear_staged_experiments()
```
#### `get_staged_experiments()`
Get all staged experiments awaiting evaluation.
Returns:

| Type | Description |
|---|---|
| `List[Dict[str, Any]]` | List of experiment input dictionaries |
#### `clear_staged_experiments()`
Clear all staged experiments.
Returns:

| Type | Description |
|---|---|
| `int` | Number of experiments cleared |
#### `move_staged_to_experiments(outputs, noises=None, iteration=None, reason=None)`
Evaluate staged experiments and add them to the dataset in batch.
Convenience method that pairs staged inputs with outputs and adds them all to the experiment manager, then clears the staging area.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `outputs` | `List[float]` | List of output values (must match the number of staged experiments) | *required* |
| `noises` | `Optional[List[float]]` | Optional list of measurement uncertainties | `None` |
| `iteration` | `Optional[int]` | Iteration number for all experiments (auto-assigned if `None`) | `None` |
| `reason` | `Optional[str]` | Reason for these experiments (e.g., `'Expected Improvement'`) | `None` |
Returns:

| Type | Description |
|---|---|
| `int` | Number of experiments added |
**Example:**

```python
# Stage some experiments
session.add_staged_experiment({'x': 1.0, 'y': 2.0})
session.add_staged_experiment({'x': 3.0, 'y': 4.0})

# Evaluate them
outputs = [run_experiment(**point) for point in session.get_staged_experiments()]

# Add to dataset and clear staging
session.move_staged_to_experiments(outputs, reason='LogEI')
```
#### `train_model(backend='sklearn', kernel='Matern', kernel_params=None, **kwargs)`
Train surrogate model on current data.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `backend` | `str` | `'sklearn'` or `'botorch'` | `'sklearn'` |
| `kernel` | `str` | Kernel type (`'RBF'`, `'Matern'`, `'RationalQuadratic'`) | `'Matern'` |
| `kernel_params` | `Optional[Dict]` | Additional kernel parameters (e.g., `{'nu': 2.5}` for Matern) | `None` |
| `**kwargs` | | Backend-specific parameters | `{}` |
Returns:

| Type | Description |
|---|---|
| `Dict[str, Any]` | Dictionary with training results and hyperparameters |
**Example:**

```python
results = session.train_model(backend='botorch', kernel='Matern')
print(results['metrics'])
```
#### `get_model_summary()`
Get summary of trained model.
Returns:

| Type | Description |
|---|---|
| `Optional[Dict[str, Any]]` | Dictionary with model information, or `None` if no model has been trained |
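A quick usage sketch; since the method returns `None` before any training, guard for that case (the commented fields are illustrative):

```python
info = session.get_model_summary()
if info is None:
    print("No model trained yet; call session.train_model() first")
else:
    print(info)  # e.g. backend, kernel, and fitted hyperparameters
```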
#### `suggest_next(strategy='EI', goal='maximize', n_suggestions=1, ref_point=None, **kwargs)`
Suggest next experiment(s) using acquisition function.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `strategy` | `str` | Acquisition strategy: `'EI'` (Expected Improvement), `'PI'` (Probability of Improvement), `'UCB'` (Upper Confidence Bound), `'LogEI'`/`'LogPI'` (log variants, BoTorch only), `'qEI'`/`'qUCB'`/`'qIPV'` (batch acquisition, BoTorch only), `'qEHVI'`/`'qNEHVI'` (multi-objective acquisition, BoTorch only) | `'EI'` |
| `goal` | `Union[str, List[str]]` | `'maximize'` or `'minimize'` (str), or a list of per-objective directions | `'maximize'` |
| `n_suggestions` | `int` | Number of suggestions (batch acquisition) | `1` |
| `ref_point` | `Optional[List[float]]` | Reference point for MOBO hypervolume (list of floats, optional) | `None` |
| `**kwargs` | | Strategy-specific parameters. Sklearn backend: `xi` (`float`, exploration parameter for EI/PI, default `0.01`); `kappa` (`float`, exploration parameter for UCB, default `1.96`). BoTorch backend: `beta` (`float`, exploration parameter for UCB, default `0.5`); `mc_samples` (`int`, Monte Carlo samples for batch acquisition, default `128`) | `{}` |
Returns:

| Type | Description |
|---|---|
| `DataFrame` | DataFrame with suggested experiment(s) |
Examples:
>>> # Single-objective
>>> next_point = session.suggest_next(strategy='EI', goal='maximize')
>>> # Multi-objective
>>> suggestions = session.suggest_next(
... strategy='qNEHVI',
... goal=['maximize', 'maximize'],
... ref_point=[0.0, 0.0]
... )
#### `predict(inputs)`
Make predictions at new points.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `inputs` | `DataFrame` | DataFrame with input features | *required* |
Returns:

| Type | Description |
|---|---|
| `Union[Tuple[ndarray, ndarray], Dict[str, Tuple[ndarray, ndarray]]]` | Single-objective: tuple of `(predictions, uncertainties)`. Multi-objective: dict keyed by objective name, with a `(predictions, uncertainties)` tuple per objective |
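A single-objective sketch based on the signature above; the column names are illustrative and must match the variables defined in the session:

```python
import pandas as pd

# Candidate conditions to score with the trained surrogate
candidates = pd.DataFrame({
    'temperature': [350, 450],
    'pressure': [2.0, 8.0],
    'catalyst': ['A', 'C'],
})

mean, std = session.predict(candidates)  # single-objective: (predictions, uncertainties)
for m, s in zip(mean, std):
    print(f"predicted {m:.3f} +/- {s:.3f}")
```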
#### `save_session(filepath)`
Save complete session state to JSON file.
Saves all session data, including:

- Session metadata (name, description, tags)
- Search space definition
- Experimental data
- Trained model state (if available)
- Complete audit log
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `filepath` | `str` | Path to save session file (`.json` extension recommended) | *required* |
**Example:**

```python
session.save_session("~/ALchemist_Sessions/catalyst_study_nov2025.json")
```
#### `load_session(filepath=None, retrain_on_load=True)`
Load session from JSON file.
This method works both as a static method (creating a new session) and as an instance method (loading into existing session):
Static usage (returns a new session):

```python
session = OptimizationSession.load_session("my_session.json")
```

Instance usage (loads into an existing session):

```python
session = OptimizationSession()
session.load_session("my_session.json")
# session.experiment_manager.df is now populated
```
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `filepath` | `str` | Path to session file (required when called as a static method; when called on an instance, the instance itself fills the first positional slot) | `None` |
| `retrain_on_load` | `bool` | Whether to retrain the model if a model config exists | `True` |
Returns:

| Type | Description |
|---|---|
| `OptimizationSession` | The loaded session (a new or modified instance) |
## Related Classes

### ExperimentManager
Manages experimental data storage and retrieval.
#### `alchemist_core.data.experiment_manager.ExperimentManager(search_space=None, target_columns=None)`
Class for storing and managing experimental data in a consistent way across backends. Provides methods for data access, saving/loading, and conversion to formats needed by different backends.
Supports both single-objective and multi-objective optimization:

- Single-objective: uses a single target column (default: `'Output'`, but configurable)
- Multi-objective: uses multiple target columns specified in the `target_columns` attribute
The `target_columns` parameter allows flexible column naming to support various CSV formats.
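A minimal construction sketch based on the signature above; whether a bare string is also accepted for `target_columns` is not stated here, so lists are used throughout:

```python
from alchemist_core.data.experiment_manager import ExperimentManager

# Single-objective: one target column
em = ExperimentManager(target_columns=['yield'])

# Multi-objective: several target columns
em_mo = ExperimentManager(target_columns=['yield', 'selectivity'])
```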
##### `add_experiment(point_dict, output_value=None, noise_value=None, iteration=None, reason=None)`
Add a single experiment point.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `point_dict` | `Dict[str, Union[float, str, int]]` | Dictionary with variable names as keys and values | *required* |
| `output_value` | `Optional[float]` | The experiment output/target value (if known) | `None` |
| `noise_value` | `Optional[float]` | Optional observation noise/uncertainty value for regularization | `None` |
| `iteration` | `Optional[int]` | Iteration number (auto-assigned if `None`) | `None` |
| `reason` | `Optional[str]` | Reason for this experiment (e.g., `'Initial Design (LHS)'`, `'Expected Improvement'`) | `None` |
##### `get_data()`
Get the raw experiment data.
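A minimal sketch; the return type is not documented above. The `load_session` example's reference to `session.experiment_manager.df` suggests a pandas DataFrame, but treat that as an assumption:

```python
data = session.experiment_manager.get_data()  # assumed: the underlying pandas DataFrame
print(data.head())
```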
##### `get_features_and_target()`
Get features (X) and target (y) separated.
Returns:

| Name | Type | Description |
|---|---|---|
| `X` | `DataFrame` | Features DataFrame |
| `y` | `Series` | Target Series |
Raises:

| Type | Description |
|---|---|
| `ValueError` | If the configured target column is not found in the data |
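A short usage sketch; it assumes the manager already holds experiments with the configured target column (otherwise `ValueError` is raised, per the table above):

```python
X, y = session.experiment_manager.get_features_and_target()
print(X.shape)  # features as a DataFrame
print(y.name)   # target as a Series
```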
##### `has_noise_data()`
Check if the experiment data includes noise values.
##### `from_csv(filepath, search_space=None)` *(classmethod)*
Class method to create an ExperimentManager from a CSV file.
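A minimal sketch based on the signature above; `'experiments.csv'` is a placeholder path, and the optional `search_space` argument is omitted:

```python
from alchemist_core.data.experiment_manager import ExperimentManager

em = ExperimentManager.from_csv('experiments.csv')
```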
## See Also
- SearchSpace - Variable space management
- Models - Gaussian Process models
- Acquisition - Acquisition functions