hyperopt#
Source code: sensai/hyperopt.py
- iter_param_combinations(hyper_param_values: Dict[str, Sequence[Any]]) → Generator[Dict[str, Any], None, None] [source]#
Create all possible combinations of values from a dictionary of possible parameter values
- Parameters:
hyper_param_values – a mapping from parameter names to lists of possible values
- Returns:
a generator yielding dictionaries, each mapping every parameter name to one of its possible values
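A minimal usage sketch based on the signature above; the parameter names are illustrative:

```python
from sensai.hyperopt import iter_param_combinations

# two parameters with two possible values each yield four combinations
grid = {"n_estimators": [50, 100], "max_depth": [3, 5]}
for params in iter_param_combinations(grid):
    print(params)  # e.g. {"n_estimators": 50, "max_depth": 3}
```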
- class ParameterCombinationSkipDecider[source]#
Bases:
ABC
Abstraction for a functional component which is informed of all parameter combinations that have been considered and can use them as a basis for deciding whether a further parameter combination shall be skipped (i.e. not be considered).
- class ParameterCombinationEquivalenceClassValueCache[source]#
Bases:
ABC
Represents a cache which stores (arbitrary) values for parameter combinations, i.e. keys in the cache are derived from parameter combinations. The cache may map equivalent parameter combinations to the same key to indicate their equivalence; keys thus correspond to representations of equivalence classes over parameter combinations. This enables a hyper-parameter search to skip the re-computation of results for equivalent parameter combinations.
- class ParametersMetricsCollection(csv_path=None, sort_column_name=None, ascending=True, incremental=False)[source]#
Bases:
object
Utility class for holding and persisting evaluation results
- Parameters:
csv_path – path to save the data frame to upon every update
sort_column_name – the column name by which to sort the data frame that is collected; if None, do not sort
ascending – whether to sort in ascending order; has an effect only if sort_column_name is not None
incremental – whether to add to an existing CSV file instead of overwriting it
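A construction sketch; the file path and metric column name are illustrative:

```python
from sensai.hyperopt import ParametersMetricsCollection

# keep the collected results sorted by the "MSE" column and write the data frame
# to results.csv upon every update
collection = ParametersMetricsCollection(csv_path="results.csv",
                                         sort_column_name="MSE", ascending=True)
```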
- class GridSearch(model_factory: Callable[[...], VectorModel], parameter_options: Union[Dict[str, Sequence[Any]], List[Dict[str, Sequence[Any]]]], num_processes=1, csv_results_path: Optional[str] = None, incremental=False, incremental_skip_existing=False, parameter_combination_skip_decider: Optional[ParameterCombinationSkipDecider] = None, model_save_directory: Optional[str] = None, name: Optional[str] = None)[source]#
Bases:
TrackingMixin
Instances of this class can be used for evaluating models with different user-provided parametrizations over the same data and persisting the results
- Parameters:
model_factory – the function to call with keyword arguments reflecting the parameters to try in order to obtain a model instance
parameter_options – a dictionary which maps from parameter names to lists of possible values - or a list of such dictionaries, where each dictionary in the list has the same keys
num_processes – the number of parallel processes to use for the search (use 1 to run without multi-processing)
csv_results_path – the path to a directory or concrete CSV file to which the results shall be written; if it is None, no CSV data will be written; if it is a directory, a file name starting with this grid search’s name (see below) will be created. The resulting CSV data will contain one line per evaluated parameter combination.
incremental – whether to add to an existing CSV file instead of overwriting it
incremental_skip_existing – if incremental mode is on, whether to skip any parameter combinations that are already present in the CSV file
parameter_combination_skip_decider – an instance to which parameter combinations can be passed in order to decide whether the combination shall be skipped (e.g. because it is redundant/equivalent to another combination or inadmissible)
model_save_directory – the directory where the serialized models shall be saved; if None, models are not saved
name – the name of this grid search, which will, in particular, be prepended to all saved model files; if None, a default name will be generated of the form “gridSearch_<timestamp>”
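A construction sketch: create_model is a hypothetical user-defined factory (body omitted) that is expected to return a sensAI VectorModel configured with the given keyword arguments; parameter names and paths are illustrative:

```python
from sensai.hyperopt import GridSearch

def create_model(n_estimators, max_depth):
    # hypothetical factory: build and return a VectorModel using these parameters
    ...

grid_search = GridSearch(
    model_factory=create_model,
    parameter_options={"n_estimators": [50, 100, 200], "max_depth": [3, 5, None]},
    csv_results_path="results",  # a directory; a CSV file named after the grid search will be created
    name="myGridSearch")
```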
- log = <Logger sensai.hyperopt.GridSearch (WARNING)>#
- run(metrics_evaluator: MetricsDictProvider, sort_column_name=None, ascending=True) → Result [source]#
Run the grid search. If csv_results_path was provided in the constructor, each evaluation result will be saved to that file directly after being computed.
- Parameters:
metrics_evaluator – the evaluator or cross-validator with which to evaluate models
sort_column_name – the name of the metric (column) by which to sort the data frame of results; if None, do not sort. Note that all Metric instances have a static member name, e.g. you could use RegressionMetricMSE.name.
ascending – whether to sort in ascending order; has an effect only if sort_column_name is specified. The result object will assume, by default, that the resulting top/first element is the best, i.e. ascending=False means “higher is better” and ascending=True means “lower is better”.
- Returns:
an object holding the results
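Continuing the hypothetical sketch above, run can be invoked with any MetricsDictProvider (e.g. one of sensAI's model evaluators or cross-validators, created elsewhere); the metric column name is illustrative:

```python
# evaluator: a MetricsDictProvider instance created elsewhere
result = grid_search.run(evaluator, sort_column_name="MSE", ascending=True)
```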
- class Result(df: DataFrame, param_names: List[str], default_metric_name: Optional[str] = None, default_higher_is_better: Optional[bool] = None)[source]#
Bases:
object
- class BestParams(metric_name: str, metric_value: float, params: dict)[source]#
Bases:
object
- metric_name: str#
- metric_value: float#
- params: dict#
- get_best_params(metric_name: Optional[str] = None, higher_is_better: Optional[bool] = None) → BestParams [source]#
- Parameters:
metric_name – the metric name for which to return the best result; can be None if the GridSearch used a metric to sort by
higher_is_better – whether higher is better for the metric to sort by; can be None if the GridSearch used a metric to sort by and configured the sort order such that the best configuration is at the top
- Returns:
a BestParams object containing the best parameters found during the grid search (params) and the corresponding metric value (metric_value)
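A usage sketch, continuing the hypothetical grid search above; the metric name is illustrative:

```python
best = result.get_best_params()  # uses the metric and sort order configured in run()
# or, specifying the metric explicitly (lower MSE is better):
best = result.get_best_params(metric_name="MSE", higher_is_better=False)
print(best.params, best.metric_value)
```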
- class SAHyperOpt(model_factory: Callable[[...], VectorModel], ops_and_weights: List[Tuple[Callable[[State], ParameterChangeOperator], float]], initial_parameters: Dict[str, Any], metrics_evaluator: MetricsDictProvider, metric_to_optimise: str, minimise_metric: bool = False, collect_data_frame: bool = True, csv_results_path: Optional[str] = None, parameter_combination_equivalence_class_value_cache: Optional[ParameterCombinationEquivalenceClassValueCache] = None, p0: float = 0.5, p1: float = 0.0)[source]#
Bases:
TrackingMixin
- Parameters:
model_factory – a factory for the generation of models, which is called with the current parameter combination (passed as keyword arguments), initially initial_parameters
ops_and_weights – a sequence of tuples (operator factory, operator weight) for simulated annealing
initial_parameters – the initial parameter combination
metrics_evaluator – the evaluator/validator to use in order to evaluate models
metric_to_optimise – the name of the metric (as generated by the evaluator/validator) to optimise
minimise_metric – whether the metric is to be minimised; if False, maximise the metric
collect_data_frame – whether to collect (and regularly log) the data frame of all parameter combinations and evaluation results
csv_results_path – the (optional) path of a CSV file in which to store a table of all computed results; if this is not None, then collect_data_frame is automatically set to True
parameter_combination_equivalence_class_value_cache – a cache in which to store computed results and whose notion of equivalence can be used to avoid duplicate computations
p0 – the initial probability (at the start of the optimisation) of accepting a state with an inferior evaluation to the current state’s (for the mean observed evaluation delta)
p1 – the final probability (at the end of the optimisation) of accepting a state with an inferior evaluation to the current state’s (for the mean observed evaluation delta)
- log = <Logger sensai.hyperopt.SAHyperOpt (WARNING)>#
- class State(params: Dict[str, Any], random_state: Random, results: Dict, compute_metric: Callable[[Dict[str, Any]], float])[source]#
Bases:
SAState
- compute_cost_value() → SACostValueNumeric [source]#
Computes the costs of this state (from scratch)
- get_state_representation()[source]#
Returns a compact state representation (for the purpose of archiving a hitherto best result), which can later be applied via apply_state_representation.
- Returns:
a compact state representation of some sort
- apply_state_representation(representation)[source]#
Applies the given state representation (as returned via get_state_representation) in order for the optimisation result to be obtained by the user. Note that the function does not necessarily need to change this state to reflect the representation, as its sole purpose is to make the optimisation result obtainable thereafter (it is not used during the optimisation process as such).
- Parameters:
representation – a representation as returned by get_state_representation
- class ParameterChangeOperator(state: State)[source]#
Bases:
SAOperator[State]
- Parameters:
state – the state to which the operator is applied
- apply_state_change(params)[source]#
Applies the operator to the state, i.e. it makes the changes to the state only (and does not update the associated costs)
- Parameters:
params – the parameters with which the operator is to be applied
- cost_delta(params) → SACostValue [source]#
Computes the cost change that would apply when applying this operator with the given parameters
- Parameters:
params – an arbitrary list of parameters (specific to the concrete operator)
- Returns:
the cost change as an SACostValue
- choose_params() → Optional[Tuple[Tuple, Optional[SACostValue]]] [source]#
Chooses parameters for the application of the operator (e.g. randomly or greedily).
- Returns:
a tuple (params, costValue) or None if no suitable parameters are found, where params is the list of chosen parameters and costValue is either an instance of SACostValue or None if the costs have not been computed.
- run(max_steps: Optional[int] = None, duration: Optional[float] = None, random_seed: int = 42, collect_stats: bool = True)[source]#
- get_simulated_annealing() → SimulatedAnnealing [source]#
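A hedged end-to-end sketch of the class documented above: MyParameterChangeOperator stands for a hypothetical user-defined subclass of the ParameterChangeOperator class described earlier (implementation omitted), create_model and evaluator are as in the GridSearch sketches above, and the metric name is illustrative:

```python
from sensai.hyperopt import SAHyperOpt

sa_opt = SAHyperOpt(
    model_factory=create_model,
    # the operator class itself serves as the factory Callable[[State], ParameterChangeOperator]
    ops_and_weights=[(MyParameterChangeOperator, 1.0)],
    initial_parameters={"n_estimators": 100, "max_depth": 5},
    metrics_evaluator=evaluator,
    metric_to_optimise="MSE",
    minimise_metric=True,
    csv_results_path="results/sa_hyperopt.csv")
sa_opt.run(max_steps=200, random_seed=42)
```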