hyperopt


iter_param_combinations(hyper_param_values: Dict[str, Sequence[Any]]) → Generator[Dict[str, Any], None, None]

Create all possible combinations of values from a dictionary of possible parameter values

Parameters:

hyper_param_values – a mapping from parameter names to lists of possible values

Returns:

a dictionary mapping each parameter name to one of the values
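The behaviour described above amounts to a Cartesian product over the value lists. The following is a minimal sketch of that idea using only the standard library; the function name `iter_param_combinations_sketch` is hypothetical, and the order in which the real function yields combinations is not stated in this documentation.

```python
from itertools import product
from typing import Any, Dict, Generator, Sequence


def iter_param_combinations_sketch(
    hyper_param_values: Dict[str, Sequence[Any]],
) -> Generator[Dict[str, Any], None, None]:
    """Yield one dict per element of the Cartesian product of the value lists."""
    names = list(hyper_param_values)
    for values in product(*(hyper_param_values[name] for name in names)):
        # Pair each parameter name with the value chosen for it in this combination
        yield dict(zip(names, values))


combinations = list(
    iter_param_combinations_sketch({"depth": [2, 4], "lr": [0.1, 0.01, 0.001]})
)
```

With two values for `depth` and three for `lr`, the sketch yields 2 × 3 = 6 dictionaries, each containing both keys.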

class ParameterCombinationSkipDecider

Bases: ABC

Abstraction for a functional component which is told all parameter combinations that have been considered and can use these as a basis for deciding whether another parameter combination shall be skipped/not be considered.

abstract tell(params: Dict[str, Any], metrics: Dict[str, Any])

Informs the decider about a previously evaluated parameter combination

Parameters:
  • params – the parameter combination

  • metrics – the evaluation metrics

abstract is_skipped(params: Dict[str, Any])

Decides whether the given parameter combination shall be skipped

Parameters:

params – the parameter combination to check

Returns:

True iff it shall be skipped
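As an illustration of how the two methods interact, here is a self-contained sketch. `SkipDeciderSketch` is a stand-in that mirrors the two abstract methods documented above (it is not the library class), and the skipping rule, the `depth` parameter and the `error` metric are all hypothetical.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class SkipDeciderSketch(ABC):
    """Stand-in mirroring the documented interface (not the library class)."""

    @abstractmethod
    def tell(self, params: Dict[str, Any], metrics: Dict[str, Any]) -> None:
        ...

    @abstractmethod
    def is_skipped(self, params: Dict[str, Any]) -> bool:
        ...


class SkipDeeperAfterFailure(SkipDeciderSketch):
    """Hypothetical rule: once a depth exceeds max_error, skip all greater depths."""

    def __init__(self, max_error: float):
        self.max_error = max_error
        self.failed_depths: List[int] = []

    def tell(self, params: Dict[str, Any], metrics: Dict[str, Any]) -> None:
        # Remember every depth whose evaluation was worse than the threshold
        if metrics["error"] > self.max_error:
            self.failed_depths.append(params["depth"])

    def is_skipped(self, params: Dict[str, Any]) -> bool:
        # Skip if some strictly smaller depth has already failed
        return any(params["depth"] > depth for depth in self.failed_depths)


decider = SkipDeeperAfterFailure(max_error=0.5)
decider.tell({"depth": 8}, {"error": 0.9})
skip_16 = decider.is_skipped({"depth": 16})
skip_4 = decider.is_skipped({"depth": 4})
```

Here `skip_16` is `True` because depth 8 already failed, while `skip_4` is `False`.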

class ParameterCombinationEquivalenceClassValueCache

Bases: ABC

Represents a cache which stores (arbitrary) values for parameter combinations, i.e. keys in the cache are derived from parameter combinations. The cache may map equivalent parameter combinations to the same key; the keys thus correspond to representations of equivalence classes over parameter combinations. This enables hyper-parameter search to skip the re-computation of results for equivalent parameter combinations.

set(params: Dict[str, Any], value: Any)

get(params: Dict[str, Any])

Gets the value associated with the (equivalence class of the) parameter combination

Parameters:

params – the parameter combination
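The idea of deriving cache keys from equivalence classes can be sketched as follows. This is an illustrative stand-in, not the library class: the class name, the `_key` helper, the parameters `use_dropout` and `dropout_rate`, and the choice to return `None` on a cache miss are all assumptions.

```python
from typing import Any, Dict, Hashable, Optional


class EquivalenceCacheSketch:
    """Illustrative cache whose keys represent equivalence classes of parameters."""

    def __init__(self) -> None:
        self._values: Dict[Hashable, Any] = {}

    def _key(self, params: Dict[str, Any]) -> Hashable:
        # Hypothetical equivalence: if dropout is disabled, the dropout rate has
        # no effect, so combinations differing only in it share one key.
        relevant = dict(params)
        if not relevant.get("use_dropout", False):
            relevant.pop("dropout_rate", None)
        return tuple(sorted(relevant.items()))

    def set(self, params: Dict[str, Any], value: Any) -> None:
        self._values[self._key(params)] = value

    def get(self, params: Dict[str, Any]) -> Optional[Any]:
        # Returning None on a miss is an assumption of this sketch
        return self._values.get(self._key(params))


cache = EquivalenceCacheSketch()
cache.set({"use_dropout": False, "dropout_rate": 0.1}, 0.83)
hit = cache.get({"use_dropout": False, "dropout_rate": 0.5})
miss = cache.get({"use_dropout": True, "dropout_rate": 0.5})
```

The second combination differs from the stored one only in an ineffective parameter, so `hit` is the stored value, whereas `miss` is `None`.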

class ParametersMetricsCollection(csv_path=None, sort_column_name=None, ascending=True, incremental=False)

Bases: object

Utility class for holding and persisting evaluation results

Parameters:
  • csv_path – path to save the data frame to upon every update

  • sort_column_name – the column name by which to sort the data frame that is collected; if None, do not sort

  • ascending – whether to sort in ascending order; has an effect only if sort_column_name is not None

  • incremental – whether to add to an existing CSV file instead of overwriting it

add_values(values: Dict[str, Any])

Adds the provided values as a new row to the collection. If csv_path was provided in the constructor, saves the updated collection to that file.

Parameters:

values – Dict holding the evaluation results and parameters

get_data_frame() → DataFrame

contains(values: Dict[str, Any])

class GridSearch(model_factory: Callable[[...], VectorModel], parameter_options: Union[Dict[str, Sequence[Any]], List[Dict[str, Sequence[Any]]]], num_processes=1, csv_results_path: Optional[str] = None, incremental=False, incremental_skip_existing=False, parameter_combination_skip_decider: Optional[ParameterCombinationSkipDecider] = None, model_save_directory: Optional[str] = None, name: Optional[str] = None)

Bases: TrackingMixin

Instances of this class can be used for evaluating models with different user-provided parametrizations over the same data and persisting the results

Parameters:
  • model_factory – the function to call with keyword arguments reflecting the parameters to try in order to obtain a model instance

  • parameter_options – a dictionary which maps from parameter names to lists of possible values - or a list of such dictionaries, where each dictionary in the list has the same keys

  • num_processes – the number of parallel processes to use for the search (use 1 to run without multi-processing)

  • csv_results_path – the path to a directory or concrete CSV file to which the results shall be written; if it is None, no CSV data will be written; if it is a directory, a file whose name starts with this grid search’s name (see below) will be created. The resulting CSV data will contain one line per evaluated parameter combination.

  • incremental – whether to add to an existing CSV file instead of overwriting it

  • incremental_skip_existing – if incremental mode is on, whether to skip any parameter combinations that are already present in the CSV file

  • parameter_combination_skip_decider – an instance to which parameter combinations can be passed in order to decide whether a combination shall be skipped (e.g. because it is redundant/equivalent to another combination or inadmissible)

  • model_save_directory – the directory where the serialized models shall be saved; if None, models are not saved

  • name – the name of this grid search, which will, in particular, be prepended to all saved model files; if None, a default name will be generated of the form “gridSearch_<timestamp>”

log = <Logger sensai.hyperopt.GridSearch (WARNING)>

run(metrics_evaluator: MetricsDictProvider, sort_column_name=None, ascending=True) → Result

Run the grid search. If csv_results_path was provided in the constructor, each evaluation result will be saved to that file directly after being computed.

Parameters:
  • metrics_evaluator – the evaluator or cross-validator with which to evaluate models

  • sort_column_name – the name of the metric (column) by which to sort the data frame of results; if None, do not sort. Note that all Metric instances have a static member name, e.g. you could use RegressionMetricMSE.name.

  • ascending – whether to sort in ascending order; has an effect only if sort_column_name is specified. The result object will assume, by default, that the resulting top/first element is the best, i.e. ascending=False means “higher is better”, and ascending=True means “lower is better”.

Returns:

an object holding the results

class Result(df: DataFrame, param_names: List[str], default_metric_name: Optional[str] = None, default_higher_is_better: Optional[bool] = None)

Bases: object

class BestParams(metric_name: str, metric_value: float, params: dict)

Bases: object

metric_name: str

metric_value: float

params: dict

get_best_params(metric_name: Optional[str] = None, higher_is_better: Optional[bool] = None) → BestParams

Parameters:
  • metric_name – the metric name for which to return the best result; can be None if the GridSearch used a metric to sort by

  • higher_is_better – whether higher is better for the metric to sort by; can be None if the GridSearch used a metric to sort by and configured the sort order such that the best configuration is at the top

Returns:

a BestParams object holding the best parameters found during the grid search (params), the corresponding metric value (metric_value) and the name of the metric (metric_name)
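The selection logic implied by metric_name and higher_is_better can be sketched without a data frame. The following stand-in operates on a plain list of result rows; `BestParamsSketch`, `best_params_sketch`, the example rows and the metric name `MSE` are hypothetical and not part of the library.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class BestParamsSketch:
    """Stand-in with the same three fields as the documented BestParams."""
    metric_name: str
    metric_value: float
    params: dict


def best_params_sketch(
    rows: List[Dict[str, Any]],
    param_names: List[str],
    metric_name: str,
    higher_is_better: bool,
) -> BestParamsSketch:
    # Choose the row with the largest or smallest metric value
    pick = max if higher_is_better else min
    best = pick(rows, key=lambda row: row[metric_name])
    # Keep only the parameter columns, dropping the metric columns
    params = {name: best[name] for name in param_names}
    return BestParamsSketch(metric_name, best[metric_name], params)


rows = [
    {"depth": 2, "lr": 0.1, "MSE": 0.40},
    {"depth": 4, "lr": 0.1, "MSE": 0.25},
    {"depth": 4, "lr": 0.01, "MSE": 0.31},
]
best = best_params_sketch(rows, ["depth", "lr"], "MSE", higher_is_better=False)
```

Since a lower MSE is better, `best.params` is the combination from the second row.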

class SAHyperOpt(model_factory: Callable[[...], VectorModel], ops_and_weights: List[Tuple[Callable[[State], ParameterChangeOperator], float]], initial_parameters: Dict[str, Any], metrics_evaluator: MetricsDictProvider, metric_to_optimise: str, minimise_metric: bool = False, collect_data_frame: bool = True, csv_results_path: Optional[str] = None, parameter_combination_equivalence_class_value_cache: Optional[ParameterCombinationEquivalenceClassValueCache] = None, p0: float = 0.5, p1: float = 0.0)

Bases: TrackingMixin

Parameters:
  • model_factory – a factory for the generation of models which is called with the current parameter combination (all keyword arguments), initially initial_parameters

  • ops_and_weights – a sequence of tuples (operator factory, operator weight) for simulated annealing

  • initial_parameters – the initial parameter combination

  • metrics_evaluator – the evaluator/validator to use in order to evaluate models

  • metric_to_optimise – the name of the metric (as generated by the evaluator/validator) to optimise

  • minimise_metric – whether the metric is to be minimised; if False, maximise the metric

  • collect_data_frame – whether to collect (and regularly log) the data frame of all parameter combinations and evaluation results

  • csv_results_path – the (optional) path of a CSV file in which to store a table of all computed results; if this is not None, then collect_data_frame is automatically set to True

  • parameter_combination_equivalence_class_value_cache – a cache in which to store computed results and whose notion of equivalence can be used to avoid duplicate computations

  • p0 – the initial probability (at the start of the optimisation) of accepting a state with an inferior evaluation to the current state’s (for the mean observed evaluation delta)

  • p1 – the final probability (at the end of the optimisation) of accepting a state with an inferior evaluation to the current state’s (for the mean observed evaluation delta)
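To give an intuition for p0 and p1, the sketch below uses the standard Metropolis acceptance rule, choosing the temperature such that a move whose cost increase equals the mean observed delta is accepted with a given probability p. Both the exact acceptance rule and the linear interpolation between p0 and p1 are assumptions made for illustration; this documentation does not specify the schedule that is actually used.

```python
import math


def acceptance_probability(delta: float, mean_delta: float, p: float) -> float:
    """Probability of accepting a move that worsens the cost by delta.

    The temperature is chosen so that a move with delta == mean_delta is
    accepted with probability p (Metropolis rule; an assumption of this sketch).
    """
    if delta <= 0:
        return 1.0  # improvements and ties are always accepted
    if p <= 0.0:
        return 0.0  # temperature zero: no inferior state is accepted
    if p >= 1.0:
        return 1.0  # infinite temperature: everything is accepted
    temperature = -mean_delta / math.log(p)
    return math.exp(-delta / temperature)


def scheduled_p(p0: float, p1: float, progress: float) -> float:
    """Linear interpolation from p0 to p1 for progress in [0, 1] (assumed)."""
    return p0 + (p1 - p0) * progress


at_start = acceptance_probability(1.0, 1.0, scheduled_p(0.5, 0.0, 0.0))
twice_mean = acceptance_probability(2.0, 1.0, scheduled_p(0.5, 0.0, 0.0))
at_end = acceptance_probability(1.0, 1.0, scheduled_p(0.5, 0.0, 1.0))
```

With the defaults p0=0.5 and p1=0.0, a move of mean size is accepted half of the time at the start and never at the end under these assumptions; a move twice as costly is accepted with probability 0.25 at the start.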

log = <Logger sensai.hyperopt.SAHyperOpt (WARNING)>

class State(params: Dict[str, Any], random_state: Random, results: Dict, compute_metric: Callable[[Dict[str, Any]], float])

Bases: SAState

compute_cost_value() → SACostValueNumeric

Computes the costs of this state (from scratch)

get_state_representation()

Returns a compact state representation (for the purpose of archiving a hitherto best result), which can later be applied via apply_state_representation.

Returns:

a compact state representation of some sort

apply_state_representation(representation)

Applies the given state representation (as returned via get_state_representation) in order for the optimisation result to be obtained by the user. Note that the function does not necessarily need to change this state to reflect the representation, as its sole purpose is for the optimisation result to be obtainable thereafter (it is not used during the optimisation process as such).

Parameters:

representation – a representation as returned by get_state_representation

class ParameterChangeOperator(state: State)

Bases: SAOperator[State]

Parameters:

state – the state to which the operator is applied

apply_state_change(params)

Applies the operator to the state, i.e. it makes the changes to the state only (and does not update the associated costs)

Parameters:

params – the parameters with which the operator is to be applied

cost_delta(params) → SACostValue

Computes the cost change that would apply when applying this operator with the given parameters

Parameters:

params – an arbitrary list of parameters (specific to the concrete operator)

choose_params() → Optional[Tuple[Tuple, Optional[SACostValue]]]

Chooses parameters for the application of the operator (e.g. randomly or greedily).

Returns:

a tuple (params, cost_value), or None if no suitable parameters are found, where params is the list of chosen parameters and cost_value is either an instance of SACostValue or None if the costs have not been computed.

run(max_steps: Optional[int] = None, duration: Optional[float] = None, random_seed: int = 42, collect_stats: bool = True)

get_simulated_annealing() → SimulatedAnnealing