evaluator#


class MetricsDictProvider[source]#

Bases: TrackingMixin, ABC

compute_metrics(model, **kwargs) Optional[Dict[str, float]][source]#

Computes metrics for the given model, typically by fitting the model and applying it to test data. If a tracked experiment was previously set, the metrics are tracked, with the model's string representation added under the additional key 'str(model)'.

Parameters:
  • model – the model for which to compute metrics

  • kwargs – parameters to pass on to the underlying evaluation method

Returns:

a dictionary with metrics values

class MetricsDictProviderFromFunction(compute_metrics_fn: Callable[[VectorModel], Dict[str, float]])[source]#

Bases: MetricsDictProvider
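
For illustration, a minimal sketch of wrapping a plain function as a metrics provider (the import path is an assumption based on this module's name, and the metrics function is purely illustrative):

    from sensai.evaluation.evaluator import MetricsDictProviderFromFunction

    def my_metrics_fn(model):
        # purely illustrative; a real function would fit the given model
        # and evaluate it on held-out data
        return {"RMSE": 0.0}

    provider = MetricsDictProviderFromFunction(my_metrics_fn)
    metrics = provider.compute_metrics(model=None)  # the model is passed through to my_metrics_fn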

class VectorModelEvaluationData(stats_dict: Dict[str, TEvalStats], io_data: InputOutputData, model: VectorModelBase)[source]#

Bases: ABC, Generic[TEvalStats]

Parameters:
  • stats_dict – a dictionary mapping from output variable name to the evaluation statistics object

  • io_data – the input/output data that was used to produce the results

  • model – the model that was used to produce predictions

property model_name#
property input_data#
get_eval_stats(predicted_var_name=None) TEvalStats[source]#
get_data_frame()[source]#

Returns a DataFrame with all evaluation metrics (one row per output variable)

Returns:

a DataFrame containing evaluation metrics

iter_input_output_ground_truth_tuples(predicted_var_name=None) Generator[Tuple[PandasNamedTuple, Any, Any], None, None][source]#
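
A sketch of inspecting an evaluation data object (assuming eval_data was obtained from an evaluator's eval_model call, as described below; the interpretation of the iterator's tuples follows its name and signature):

    print(eval_data.get_data_frame())  # metrics DataFrame, one row per output variable
    eval_stats = eval_data.get_eval_stats()  # stats for the single predicted variable

    # iterate over (input named tuple, model output, ground truth) per data point
    for input_nt, output, ground_truth in eval_data.iter_input_output_ground_truth_tuples():
        pass  # e.g. inspect individual predictions against the ground truth
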
class VectorRegressionModelEvaluationData(stats_dict: Dict[str, TEvalStats], io_data: InputOutputData, model: VectorModelBase)[source]#

Bases: VectorModelEvaluationData[RegressionEvalStats]

Parameters:
  • stats_dict – a dictionary mapping from output variable name to the evaluation statistics object

  • io_data – the input/output data that was used to produce the results

  • model – the model that was used to produce predictions

get_eval_stats_collection()[source]#
class EvaluatorParams(data_splitter: Optional[DataSplitter] = None, fractional_split_test_fraction: Optional[float] = None, fractional_split_random_seed=42, fractional_split_shuffle=True)[source]#

Bases: ToStringMixin, ABC

Parameters:
  • data_splitter – [if test data must be obtained via split] the splitter to use in order to obtain the split; if None, fractional_split_test_fraction must be specified for a fractional split (the default)

  • fractional_split_test_fraction – [if test data must be obtained via split and data_splitter is None] the fraction of the data to use for testing/evaluation

  • fractional_split_random_seed – [if test data must be obtained via split and data_splitter is None] the random seed to use for the fractional split of the data

  • fractional_split_shuffle – [if test data must be obtained via split and data_splitter is None] whether to randomly shuffle the dataset (based on fractional_split_random_seed) before splitting it

get_data_splitter() DataSplitter[source]#
set_data_splitter(splitter: DataSplitter)[source]#
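
Since EvaluatorParams is abstract, splitting is configured via a concrete subclass such as RegressionEvaluatorParams (see below); a minimal sketch, assuming the classes are importable from sensai.evaluation:

    from sensai.evaluation import RegressionEvaluatorParams

    # hold out 20% of the data for evaluation, shuffling with a fixed seed
    params = RegressionEvaluatorParams(
        fractional_split_test_fraction=0.2,
        fractional_split_random_seed=42,
        fractional_split_shuffle=True,
    )
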
class VectorModelEvaluator(data: InputOutputData, test_data: Optional[InputOutputData] = None, params: Optional[EvaluatorParams] = None)[source]#

Bases: MetricsDictProvider, Generic[TEvalData], ABC

Constructs an evaluator with test and training data.

Parameters:
  • data – the full data set, or, if test_data is given, the training data

  • test_data – the data to use for testing/evaluation; if None, must specify appropriate parameters to define splitting

  • params – the parameters

set_tracked_experiment(tracked_experiment: TrackedExperiment)[source]#

Sets a tracked experiment, which will result in metrics being saved whenever compute_metrics is called or eval_model is called with track=True.

Parameters:

tracked_experiment – the experiment in which to track evaluation metrics.

eval_model(model: Union[VectorModelBase, VectorModelFittableBase], on_training_data=False, track=True, fit=False) TEvalData[source]#

Evaluates the given model

Parameters:
  • model – the model to evaluate

  • on_training_data – if True, evaluate on this evaluator’s training data rather than the held-out test data

  • track – whether to track the evaluation metrics for the case where a tracked experiment was set on this object

  • fit – whether to fit the model before evaluating it (via this object’s fit_model method); if enabled, the model must support fitting

Returns:

the evaluation result
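
The typical calling patterns, sketched under the assumption that evaluator is a concrete evaluator instance and model a compatible fittable model:

    # fit and evaluate in a single call
    eval_data = evaluator.eval_model(model, fit=True)

    # alternatively: fit explicitly, then evaluate on the held-out test data
    evaluator.fit_model(model)
    eval_data = evaluator.eval_model(model)

    # optionally gauge overfitting by also evaluating on the training data
    train_eval_data = evaluator.eval_model(model, on_training_data=True)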

create_metrics_dict_provider(predicted_var_name: Optional[str]) MetricsDictProvider[source]#

Creates a metrics dictionary provider, e.g. for use in hyperparameter optimisation

Parameters:

predicted_var_name – the name of the predicted variable for which to obtain evaluation metrics; may be None only if the model outputs a single predicted variable

Returns:

a metrics dictionary provider instance for the given variable
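
For instance, the provider can serve as the objective in a hyperparameter search; a sketch (assuming evaluator exists, the model's sole predicted variable is named "y", and an RMSE metric is among the computed metrics):

    provider = evaluator.create_metrics_dict_provider(predicted_var_name="y")

    def objective(model):
        metrics = provider.compute_metrics(model)
        return metrics["RMSE"]  # assumption: an "RMSE" key is present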

fit_model(model: VectorModelFittableBase)[source]#

Fits the given model’s parameters using this evaluator’s training data

class RegressionEvaluatorParams(data_splitter: Optional[DataSplitter] = None, fractional_split_test_fraction: Optional[float] = None, fractional_split_random_seed=42, fractional_split_shuffle=True, metrics: Optional[Sequence[RegressionMetric]] = None, additional_metrics: Optional[Sequence[RegressionMetric]] = None, output_data_frame_transformer: Optional[DataFrameTransformer] = None)[source]#

Bases: EvaluatorParams

Parameters:
  • data_splitter – [if test data must be obtained via split] the splitter to use in order to obtain the split; if None, fractional_split_test_fraction must be specified for a fractional split (the default)

  • fractional_split_test_fraction – [if test data must be obtained via split and data_splitter is None] the fraction of the data to use for testing/evaluation

  • fractional_split_random_seed – [if test data must be obtained via split and data_splitter is None] the random seed to use for the fractional split of the data

  • fractional_split_shuffle – [if test data must be obtained via split and data_splitter is None] whether to randomly shuffle the dataset (based on fractional_split_random_seed) before splitting it

  • metrics – regression metrics to apply. If None, default regression metrics are used.

  • additional_metrics – additional regression metrics to apply

  • output_data_frame_transformer – a data frame transformer to apply to all output data frames (both model outputs and ground truth), such that evaluation metrics are computed on the transformed data frame

classmethod from_dict_or_instance(params: Optional[Union[Dict[str, Any], RegressionEvaluatorParams]]) RegressionEvaluatorParams[source]#
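
The method accepts either an existing instance (which is returned as-is) or a dictionary of parameters; a sketch (both calls are assumed to yield an equivalent parameter object):

    from sensai.evaluation import RegressionEvaluatorParams

    params = RegressionEvaluatorParams.from_dict_or_instance(
        {"fractional_split_test_fraction": 0.3})
    params = RegressionEvaluatorParams.from_dict_or_instance(
        RegressionEvaluatorParams(fractional_split_test_fraction=0.3))
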
class VectorRegressionModelEvaluatorParams(*args, **kwargs)[source]#

Bases: RegressionEvaluatorParams

Parameters:
  • data_splitter – [if test data must be obtained via split] the splitter to use in order to obtain the split; if None, fractional_split_test_fraction must be specified for a fractional split (the default)

  • fractional_split_test_fraction – [if test data must be obtained via split and data_splitter is None] the fraction of the data to use for testing/evaluation

  • fractional_split_random_seed – [if test data must be obtained via split and data_splitter is None] the random seed to use for the fractional split of the data

  • fractional_split_shuffle – [if test data must be obtained via split and data_splitter is None] whether to randomly shuffle the dataset (based on fractional_split_random_seed) before splitting it

  • metrics – regression metrics to apply. If None, default regression metrics are used.

  • additional_metrics – additional regression metrics to apply

  • output_data_frame_transformer – a data frame transformer to apply to all output data frames (both model outputs and ground truth), such that evaluation metrics are computed on the transformed data frame

class VectorRegressionModelEvaluator(data: InputOutputData, test_data: Optional[InputOutputData] = None, params: Optional[RegressionEvaluatorParams] = None)[source]#

Bases: VectorModelEvaluator[VectorRegressionModelEvaluationData]

Constructs an evaluator with test and training data.

Parameters:
  • data – the full data set, or, if test_data is given, the training data

  • test_data – the data to use for testing/evaluation; if None, must specify appropriate parameters to define splitting

  • params – the parameters

compute_test_data_outputs(model: VectorModelBase) Tuple[DataFrame, DataFrame][source]#

Applies the given model to the test data

Parameters:

model – the model to apply

Returns:

a pair (predictions, ground truth) of DataFrames
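
Putting the pieces together, a minimal end-to-end regression sketch (assumptions: the import paths, the sklearn wrapper class name, and that InputOutputData is constructed from input and output DataFrames):

    import pandas as pd

    from sensai.data import InputOutputData
    from sensai.evaluation import (RegressionEvaluatorParams,
                                   VectorRegressionModelEvaluator)
    from sensai.sklearn.sklearn_regression import \
        SkLearnRandomForestVectorRegressionModel

    # toy data: y = 2x + 1
    inputs = pd.DataFrame({"x": range(100)})
    outputs = pd.DataFrame({"y": [2 * v + 1 for v in range(100)]})
    io_data = InputOutputData(inputs, outputs)

    params = RegressionEvaluatorParams(fractional_split_test_fraction=0.2)
    evaluator = VectorRegressionModelEvaluator(io_data, params=params)

    model = SkLearnRandomForestVectorRegressionModel()
    eval_data = evaluator.eval_model(model, fit=True)
    print(eval_data.get_data_frame())  # metrics, one row per output variable

    predictions, ground_truth = evaluator.compute_test_data_outputs(model)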

class VectorClassificationModelEvaluationData(stats_dict: Dict[str, TEvalStats], io_data: InputOutputData, model: VectorModelBase)[source]#

Bases: VectorModelEvaluationData[ClassificationEvalStats]

Parameters:
  • stats_dict – a dictionary mapping from output variable name to the evaluation statistics object

  • io_data – the input/output data that was used to produce the results

  • model – the model that was used to produce predictions

get_misclassified_inputs_data_frame() DataFrame[source]#
get_misclassified_triples_pred_true_input() List[Tuple[Any, Any, Series]][source]#
Returns:

a list containing a triple (predicted class, true class, input series) for each misclassified data point
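
A sketch of inspecting misclassifications (assuming eval_data is a VectorClassificationModelEvaluationData obtained via an evaluator):

    for predicted, true, input_series in eval_data.get_misclassified_triples_pred_true_input():
        print(f"predicted {predicted}, expected {true}, input:\n{input_series}")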

class ClassificationEvaluatorParams(data_splitter: Optional[DataSplitter] = None, fractional_split_test_fraction: Optional[float] = None, fractional_split_random_seed=42, fractional_split_shuffle=True, additional_metrics: Optional[Sequence[ClassificationMetric]] = None, compute_probabilities: bool = False, binary_positive_label: Optional[str] = ('__guess',))[source]#

Bases: EvaluatorParams

Parameters:
  • data_splitter – [if test data must be obtained via split] the splitter to use in order to obtain the split; if None, fractional_split_test_fraction must be specified for a fractional split (the default)

  • fractional_split_test_fraction – [if test data must be obtained via split and data_splitter is None] the fraction of the data to use for testing/evaluation

  • fractional_split_random_seed – [if test data must be obtained via split and data_splitter is None] the random seed to use for the fractional split of the data

  • fractional_split_shuffle – [if test data must be obtained via split and data_splitter is None] whether to randomly shuffle the dataset (based on fractional_split_random_seed) before splitting it

  • additional_metrics – additional metrics to apply

  • compute_probabilities – whether to compute class probabilities. Enabling this makes many downstream computations and visualisations possible (e.g. precision-recall plots) but requires the model to support probability computation.

  • binary_positive_label – the positive class label for binary classification; if GUESS, the positive label is inferred from the labels; if None, no detection takes place (non-binary classification is assumed)

classmethod from_dict_or_instance(params: Optional[Union[Dict[str, Any], ClassificationEvaluatorParams]]) ClassificationEvaluatorParams[source]#
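
A minimal configuration sketch (import path assumed as above; the positive label shown is hypothetical):

    from sensai.evaluation import ClassificationEvaluatorParams

    params = ClassificationEvaluatorParams(
        fractional_split_test_fraction=0.25,
        compute_probabilities=True,  # required e.g. for precision-recall plots
        binary_positive_label="yes",  # hypothetical positive class label
    )
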
class VectorClassificationModelEvaluatorParams(*args, **kwargs)[source]#

Bases: ClassificationEvaluatorParams

Parameters:
  • data_splitter – [if test data must be obtained via split] the splitter to use in order to obtain the split; if None, fractional_split_test_fraction must be specified for a fractional split (the default)

  • fractional_split_test_fraction – [if test data must be obtained via split and data_splitter is None] the fraction of the data to use for testing/evaluation

  • fractional_split_random_seed – [if test data must be obtained via split and data_splitter is None] the random seed to use for the fractional split of the data

  • fractional_split_shuffle – [if test data must be obtained via split and data_splitter is None] whether to randomly shuffle the dataset (based on fractional_split_random_seed) before splitting it

  • additional_metrics – additional metrics to apply

  • compute_probabilities – whether to compute class probabilities. Enabling this makes many downstream computations and visualisations possible (e.g. precision-recall plots) but requires the model to support probability computation.

  • binary_positive_label – the positive class label for binary classification; if GUESS, the positive label is inferred from the labels; if None, no detection takes place (non-binary classification is assumed)

class VectorClassificationModelEvaluator(data: InputOutputData, test_data: Optional[InputOutputData] = None, params: Optional[ClassificationEvaluatorParams] = None)[source]#

Bases: VectorModelEvaluator[VectorClassificationModelEvaluationData]

Constructs an evaluator with test and training data.

Parameters:
  • data – the full data set, or, if test_data is given, the training data

  • test_data – the data to use for testing/evaluation; if None, must specify appropriate parameters to define splitting

  • params – the parameters

compute_test_data_outputs(model) Tuple[DataFrame, DataFrame, DataFrame][source]#

Applies the given model to the test data

Parameters:

model – the model to apply

Returns:

a triple (predictions, predicted class probability vectors, ground truth) of DataFrames
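
A minimal end-to-end classification sketch (assumptions as in the regression example above, including the sklearn wrapper class name):

    import pandas as pd

    from sensai.data import InputOutputData
    from sensai.evaluation import (ClassificationEvaluatorParams,
                                   VectorClassificationModelEvaluator)
    from sensai.sklearn.sklearn_classification import \
        SkLearnLogisticRegressionVectorClassificationModel

    inputs = pd.DataFrame({"x": [0.1 * i for i in range(100)]})
    outputs = pd.DataFrame({"label": ["pos" if i >= 50 else "neg" for i in range(100)]})
    io_data = InputOutputData(inputs, outputs)

    params = ClassificationEvaluatorParams(
        fractional_split_test_fraction=0.2, compute_probabilities=True)
    evaluator = VectorClassificationModelEvaluator(io_data, params=params)

    model = SkLearnLogisticRegressionVectorClassificationModel()
    eval_data = evaluator.eval_model(model, fit=True)
    print(eval_data.get_data_frame())

    predictions, probabilities, ground_truth = evaluator.compute_test_data_outputs(model)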

class RuleBasedVectorClassificationModelEvaluator(data: InputOutputData)[source]#

Bases: VectorClassificationModelEvaluator

Constructs an evaluator whose training data and test data coincide (the full data set is used for both).

Parameters:
  • data – the full data set, which serves as both training and test data

eval_model(model: VectorModelBase, on_training_data=False, track=True, fit=False) VectorClassificationModelEvaluationData[source]#

Evaluates the rule-based model. The training data and test data coincide; thus fitting the model will fit its preprocessors on the full data set, and evaluation will take place on that same data set.

Parameters:
  • model – the model to evaluate

  • on_training_data – must be False here; setting it to True is not supported and will lead to an exception

  • track – whether to track the evaluation metrics for the case where a tracked experiment was set on this object

Returns:

the evaluation result
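
A brief sketch (import path assumed as above; rule_model is a hypothetical rule-based classification model requiring no fitting, and io_data is the full data set, e.g. as constructed earlier):

    from sensai.evaluation import RuleBasedVectorClassificationModelEvaluator

    evaluator = RuleBasedVectorClassificationModelEvaluator(io_data)
    eval_data = evaluator.eval_model(rule_model)  # evaluates on the full data set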

class RuleBasedVectorRegressionModelEvaluator(data: InputOutputData)[source]#

Bases: VectorRegressionModelEvaluator

Constructs an evaluator whose training data and test data coincide (the full data set is used for both).

Parameters:
  • data – the full data set, which serves as both training and test data

eval_model(model: Union[VectorModelBase, VectorModelFittableBase], on_training_data=False, track=True, fit=False) VectorRegressionModelEvaluationData[source]#

Evaluates the rule-based model. The training data and test data coincide; thus fitting the model will fit its preprocessors on the full data set, and evaluation will take place on that same data set.

Parameters:
  • model – the model to evaluate

  • on_training_data – must be False here; setting it to True is not supported and will lead to an exception

  • track – whether to track the evaluation metrics for the case where a tracked experiment was set on this object

Returns:

the evaluation result