evaluator#
Source code: sensai/evaluation/evaluator.py
- class MetricsDictProvider[source]#
Bases: TrackingMixin, ABC
- compute_metrics(model, **kwargs) Optional[Dict[str, float]] [source]#
Computes metrics for the given model, typically by fitting the model and applying it to test data. If a tracked experiment was previously set, the metrics are tracked with the string representation of the model added under an additional key ‘str(model)’.
- Parameters:
model – the model for which to compute metrics
kwargs – parameters to pass on to the underlying evaluation method
- Returns:
a dictionary with metrics values
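The tracking behaviour described above can be pictured with a small stand-alone sketch. It does not import sensAI: `SketchMetricsProvider`, `ConstantProvider` and the list-based `tracked` log are hypothetical names that stand in for the provider and the tracked experiment, and the actual implementation may differ.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List, Optional


class SketchMetricsProvider(ABC):
    """Hypothetical stand-in for a metrics provider; not the sensAI class."""

    def __init__(self) -> None:
        # Stand-in for a tracked experiment: a plain list of logged records.
        self.tracked: Optional[List[Dict[str, Any]]] = None

    @abstractmethod
    def _compute_metrics(self, model: Any, **kwargs) -> Dict[str, float]:
        """Subclasses implement the actual evaluation here."""

    def compute_metrics(self, model: Any, **kwargs) -> Optional[Dict[str, float]]:
        metrics = self._compute_metrics(model, **kwargs)
        if self.tracked is not None and metrics is not None:
            # Log the metrics together with the model's string representation.
            record: Dict[str, Any] = dict(metrics)
            record["str(model)"] = str(model)
            self.tracked.append(record)
        return metrics


class ConstantProvider(SketchMetricsProvider):
    """Toy provider that always reports the same metric value."""

    def _compute_metrics(self, model: Any, **kwargs) -> Dict[str, float]:
        return {"RMSE": 0.5}


provider = ConstantProvider()
provider.tracked = []  # "set" the tracked experiment
result = provider.compute_metrics("my-model")
```

The returned dictionary contains only the metrics; the extra ‘str(model)’ key appears only in the tracked record.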
- class MetricsDictProviderFromFunction(compute_metrics_fn: Callable[[VectorModel], Dict[str, float]])[source]#
Bases:
MetricsDictProvider
- class VectorModelEvaluationData(stats_dict: Dict[str, TEvalStats], io_data: InputOutputData, model: VectorModelBase)[source]#
Bases: ABC, Generic[TEvalStats]
- Parameters:
stats_dict – a dictionary mapping from output variable name to the evaluation statistics object
io_data – the input/output data that was used to produce the results
model – the model that was used to produce predictions
- property model_name#
- property input_data#
- get_data_frame()[source]#
Returns a DataFrame with all evaluation metrics (one row per output variable)
- Returns:
a DataFrame containing evaluation metrics
- iter_input_output_ground_truth_tuples(predicted_var_name=None) Generator[Tuple[PandasNamedTuple, Any, Any], None, None] [source]#
- class VectorRegressionModelEvaluationData(stats_dict: Dict[str, TEvalStats], io_data: InputOutputData, model: VectorModelBase)[source]#
Bases: VectorModelEvaluationData[RegressionEvalStats]
- Parameters:
stats_dict – a dictionary mapping from output variable name to the evaluation statistics object
io_data – the input/output data that was used to produce the results
model – the model that was used to produce predictions
- to_data_frame(modify_input_df: bool = False, output_col_name_override: Optional[str] = None)[source]#
Creates a data frame with all inputs, predictions and prediction errors. For each predicted variable “y”, there will be columns “y_predicted”, “y_true”, “y_error” and “y_abs_error”. If there is only a single predicted variable, the variable can be renamed for convenience.
The resulting data frame can be conveniently queried and analysed using class ResultSet.
- Parameters:
modify_input_df – whether to modify the input data frame in-place to generate the data frame (instead of copying it). This can be reasonable in cases where the data is very large.
output_col_name_override – overrides the output column name. For example, if this is set to “y”, then the columns named in the description above will be present in the data frame.
- Returns:
a data frame containing all inputs, outputs and prediction errors
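As an illustration of the column naming described above, the following sketch derives the four columns for a single predicted variable using plain Python lists instead of a data frame. The helper `error_columns` is hypothetical, and the sign convention (predicted minus true) is an assumption that this page does not confirm.

```python
from typing import Dict, List


def error_columns(name: str, y_true: List[float], y_predicted: List[float]) -> Dict[str, List[float]]:
    """Builds the per-variable columns as a dict of lists (hypothetical helper)."""
    # Assumed sign convention: error = predicted - true.
    errors = [p - t for p, t in zip(y_predicted, y_true)]
    return {
        f"{name}_predicted": list(y_predicted),
        f"{name}_true": list(y_true),
        f"{name}_error": errors,
        f"{name}_abs_error": [abs(e) for e in errors],
    }


columns = error_columns("y", y_true=[1.0, 2.0, 4.0], y_predicted=[1.5, 2.0, 3.0])
```

Passing a different `name` mirrors what `output_col_name_override` is described as doing for a single predicted variable.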
- create_result_set(modify_input_df: bool = False, output_col_name_override: ~typing.Optional[str] = None, regression_result_set_factory: ~typing.Callable[[~pandas.core.frame.DataFrame, ~typing.List[str]], ~sensai.evaluation.result_set.RegressionResultSet] = <class 'sensai.evaluation.result_set.RegressionResultSet'>) RegressionResultSet [source]#
Creates a queryable result set from the prediction results which can be used, in particular, for interactive analyses.
The result set will contain a data frame, and for each predicted variable “y”, there will be columns “y_predicted”, “y_true”, “y_error” and “y_abs_error” in this data frame. If there is only a single predicted variable, the variable can be renamed for convenience.
The resulting data frame can be conveniently queried and analysed using class ResultSet.
- Parameters:
modify_input_df – whether to modify the input data frame in-place to generate the data frame (instead of copying it). This can be reasonable in cases where the data is very large.
output_col_name_override – overrides the output column name. For example, if this is set to “y”, then the columns named in the description above will be present in the data frame.
- Returns:
the result set
- class EvaluatorParams(data_splitter: Optional[DataSplitter] = None, fractional_split_test_fraction: Optional[float] = None, fractional_split_random_seed=42, fractional_split_shuffle=True)[source]#
Bases: ToStringMixin, ABC
- Parameters:
data_splitter – [if test data must be obtained via split] the splitter to use to obtain the test data; if None, fractional_split_test_fraction must be specified for a fractional split (default)
fractional_split_test_fraction – [if test data must be obtained via split and data_splitter is None] the fraction of the data to use for testing/evaluation
fractional_split_random_seed – [if test data must be obtained via split and data_splitter is None] the random seed to use for the fractional split of the data
fractional_split_shuffle – [if test data must be obtained via split and data_splitter is None] whether to randomly shuffle the dataset (based on fractional_split_random_seed) before splitting it
- get_data_splitter() DataSplitter [source]#
- set_data_splitter(splitter: DataSplitter)[source]#
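To make the fractional-split parameters concrete, here is a minimal sketch of a seeded, optionally shuffled split. It is not sensAI's splitter: the function `fractional_split`, the rounding of the test-set size and the choice to take the test items from the tail are all assumptions made for illustration.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def fractional_split(
    items: Sequence[T],
    test_fraction: float,
    random_seed: int = 42,
    shuffle: bool = True,
) -> Tuple[List[T], List[T]]:
    """Splits items into (training, test) parts (hypothetical helper)."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    indices = list(range(len(items)))
    if shuffle:
        # A dedicated generator keeps the split reproducible for a given seed.
        random.Random(random_seed).shuffle(indices)
    n_test = round(len(items) * test_fraction)
    n_train = len(items) - n_test
    train = [items[i] for i in indices[:n_train]]
    test = [items[i] for i in indices[n_train:]]
    return train, test


data = list(range(10))
train_part, test_part = fractional_split(data, test_fraction=0.2)
ordered_train, ordered_test = fractional_split(data, test_fraction=0.2, shuffle=False)
```

Calling the function twice with the same seed yields the same split, which is the point of exposing the seed as a parameter.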
- class VectorModelEvaluator(data: InputOutputData, test_data: Optional[InputOutputData] = None, params: Optional[EvaluatorParams] = None)[source]#
Bases: MetricsDictProvider, Generic[TEvalData], ABC
Constructs an evaluator with test and training data.
- Parameters:
data – the full data set, or, if test_data is given, the training data
test_data – the data to use for testing/evaluation; if None, must specify appropriate parameters to define splitting
params – the parameters
- set_tracked_experiment(tracked_experiment: TrackedExperiment)[source]#
Sets a tracked experiment, which causes metrics to be saved whenever compute_metrics is called or eval_model is called with track=True.
- Parameters:
tracked_experiment – the experiment in which to track evaluation metrics.
- eval_model(model: Union[VectorModelBase, VectorModel], on_training_data=False, track=True, fit=False) TEvalData [source]#
Evaluates the given model
- Parameters:
model – the model to evaluate
on_training_data – if True, evaluate on this evaluator’s training data rather than the held-out test data
track – whether to track the evaluation metrics for the case where a tracked experiment was set on this object
fit – whether to fit the model before evaluating it (via this object’s fit_model method); if enabled, the model must support fitting
- Returns:
the evaluation result
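The interplay of `fit` and `on_training_data` can be sketched with a toy evaluator. Everything below is hypothetical (`MeanModel`, `SketchEvaluator`, the MAE metric and the tuple-based data layout) and only mirrors the control flow described above, not the library's implementation.

```python
from typing import Dict, List, Tuple

Data = Tuple[List[float], List[float]]  # (inputs, ground truth)


class MeanModel:
    """Toy model that predicts the mean of the training outputs."""

    def __init__(self) -> None:
        self.mean = 0.0

    def fit(self, x: List[float], y: List[float]) -> None:
        self.mean = sum(y) / len(y)

    def predict(self, x: List[float]) -> List[float]:
        return [self.mean for _ in x]


class SketchEvaluator:
    """Hypothetical evaluator holding separate training and test data."""

    def __init__(self, training_data: Data, test_data: Data) -> None:
        self.training_data = training_data
        self.test_data = test_data

    def fit_model(self, model: MeanModel) -> None:
        model.fit(*self.training_data)

    def eval_model(self, model: MeanModel, on_training_data: bool = False, fit: bool = False) -> Dict[str, float]:
        if fit:
            self.fit_model(model)
        # Evaluate either on the training data or on the held-out test data.
        x, y_true = self.training_data if on_training_data else self.test_data
        y_pred = model.predict(x)
        mae = sum(abs(p - t) for p, t in zip(y_pred, y_true)) / len(y_true)
        return {"MAE": mae}


evaluator = SketchEvaluator(
    training_data=([0.0, 1.0, 2.0, 3.0], [1.0, 2.0, 3.0, 2.0]),
    test_data=([4.0, 5.0], [2.0, 4.0]),
)
model = MeanModel()
test_metrics = evaluator.eval_model(model, fit=True)
train_metrics = evaluator.eval_model(model, on_training_data=True)
```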
- create_metrics_dict_provider(predicted_var_name: Optional[str]) MetricsDictProvider [source]#
Creates a metrics dictionary provider, e.g. for use in hyperparameter optimisation
- Parameters:
predicted_var_name – the name of the predicted variable for which to obtain evaluation metrics; may be None only if the model outputs but a single predicted variable
- Returns:
a metrics dictionary provider instance for the given variable
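In hyperparameter optimisation, a metrics provider is essentially a function from a candidate to a dictionary of metric values, and the search picks the candidate with the best value. The sketch below shows that pattern with plain functions; `select_best` and `toy_compute_metrics` are hypothetical and not part of sensAI.

```python
from typing import Callable, Dict, Iterable, Tuple


def select_best(
    candidates: Iterable[float],
    compute_metrics: Callable[[float], Dict[str, float]],
    metric: str,
) -> Tuple[float, Dict[str, float]]:
    """Returns the candidate with the lowest value of the given metric."""
    scored = [(candidate, compute_metrics(candidate)) for candidate in candidates]
    return min(scored, key=lambda pair: pair[1][metric])


def toy_compute_metrics(constant: float) -> Dict[str, float]:
    """Scores a constant prediction against fixed ground truth values."""
    y_true = [1.0, 2.0, 3.0]
    mae = sum(abs(constant - t) for t in y_true) / len(y_true)
    return {"MAE": mae}


best_candidate, best_metrics = select_best([0.0, 2.0, 5.0], toy_compute_metrics, "MAE")
```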
- fit_model(model: VectorModel)[source]#
Fits the given model’s parameters using this evaluator’s training data
- class RegressionEvaluatorParams(data_splitter: Optional[DataSplitter] = None, fractional_split_test_fraction: Optional[float] = None, fractional_split_random_seed=42, fractional_split_shuffle=True, metrics: Optional[Sequence[RegressionMetric]] = None, additional_metrics: Optional[Sequence[RegressionMetric]] = None, output_data_frame_transformer: Optional[DataFrameTransformer] = None)[source]#
Bases:
EvaluatorParams
- Parameters:
data_splitter – [if test data must be obtained via split] the splitter to use to obtain the test data; if None, fractional_split_test_fraction must be specified for a fractional split (default)
fractional_split_test_fraction – [if test data must be obtained via split and data_splitter is None] the fraction of the data to use for testing/evaluation
fractional_split_random_seed – [if test data must be obtained via split and data_splitter is None] the random seed to use for the fractional split of the data
fractional_split_shuffle – [if test data must be obtained via split and data_splitter is None] whether to randomly shuffle the dataset (based on fractional_split_random_seed) before splitting it
metrics – regression metrics to apply. If None, default regression metrics are used.
additional_metrics – additional regression metrics to apply
output_data_frame_transformer – a data frame transformer to apply to all output data frames (both model outputs and ground truth), such that evaluation metrics are computed on the transformed data frame
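The effect of transforming outputs before computing metrics can be illustrated without data frames. In this sketch a model is assumed to predict on a log scale, and both predictions and ground truth are mapped back with `math.exp` before the error is computed; `evaluate_transformed` and `mean_abs_error` are hypothetical helpers, not sensAI functions.

```python
import math
from typing import Callable, List


def mean_abs_error(y_pred: List[float], y_true: List[float]) -> float:
    return sum(abs(p - t) for p, t in zip(y_pred, y_true)) / len(y_true)


def evaluate_transformed(
    y_pred: List[float],
    y_true: List[float],
    transform: Callable[[float], float],
) -> float:
    """Applies the same transform to predictions and ground truth, then scores."""
    return mean_abs_error([transform(v) for v in y_pred], [transform(v) for v in y_true])


# Values on the log scale (assumed model output space).
log_pred = [0.0, math.log(4.0)]
log_true = [math.log(2.0), math.log(4.0)]

mae_log_scale = mean_abs_error(log_pred, log_true)
mae_original_scale = evaluate_transformed(log_pred, log_true, math.exp)
```

The two values differ because the metric is computed in different spaces; applying the transformer to both sides keeps the comparison consistent.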
- classmethod from_dict_or_instance(params: Optional[Union[Dict[str, Any], RegressionEvaluatorParams]]) RegressionEvaluatorParams [source]#
- class VectorRegressionModelEvaluatorParams(*args, **kwargs)[source]#
Bases:
RegressionEvaluatorParams
- Parameters:
data_splitter – [if test data must be obtained via split] the splitter to use to obtain the test data; if None, fractional_split_test_fraction must be specified for a fractional split (default)
fractional_split_test_fraction – [if test data must be obtained via split and data_splitter is None] the fraction of the data to use for testing/evaluation
fractional_split_random_seed – [if test data must be obtained via split and data_splitter is None] the random seed to use for the fractional split of the data
fractional_split_shuffle – [if test data must be obtained via split and data_splitter is None] whether to randomly shuffle the dataset (based on fractional_split_random_seed) before splitting it
metrics – regression metrics to apply. If None, default regression metrics are used.
additional_metrics – additional regression metrics to apply
output_data_frame_transformer – a data frame transformer to apply to all output data frames (both model outputs and ground truth), such that evaluation metrics are computed on the transformed data frame
- class VectorRegressionModelEvaluator(data: InputOutputData, test_data: Optional[InputOutputData] = None, params: Optional[RegressionEvaluatorParams] = None)[source]#
Bases: VectorModelEvaluator[VectorRegressionModelEvaluationData]
Constructs an evaluator with test and training data.
- Parameters:
data – the full data set, or, if test_data is given, the training data
test_data – the data to use for testing/evaluation; if None, must specify appropriate parameters to define splitting
params – the parameters
- compute_test_data_outputs(model: VectorModelBase) Tuple[DataFrame, DataFrame] [source]#
Applies the given model to the test data
- Parameters:
model – the model to apply
- Returns:
a pair (predictions, ground truth) of data frames
- class VectorClassificationModelEvaluationData(stats_dict: Dict[str, TEvalStats], io_data: InputOutputData, model: VectorModelBase)[source]#
Bases: VectorModelEvaluationData[ClassificationEvalStats]
- Parameters:
stats_dict – a dictionary mapping from output variable name to the evaluation statistics object
io_data – the input/output data that was used to produce the results
model – the model that was used to produce predictions
- class ClassificationEvaluatorParams(data_splitter: Optional[DataSplitter] = None, fractional_split_test_fraction: Optional[float] = None, fractional_split_random_seed=42, fractional_split_shuffle=True, additional_metrics: Optional[Sequence[ClassificationMetric]] = None, compute_probabilities: bool = False, binary_positive_label: Optional[str] = ('__guess',))[source]#
Bases:
EvaluatorParams
- Parameters:
data_splitter – [if test data must be obtained via split] the splitter to use to obtain the test data; if None, fractional_split_test_fraction must be specified for a fractional split (default)
fractional_split_test_fraction – [if test data must be obtained via split and data_splitter is None] the fraction of the data to use for testing/evaluation
fractional_split_random_seed – [if test data must be obtained via split and data_splitter is None] the random seed to use for the fractional split of the data
fractional_split_shuffle – [if test data must be obtained via split and data_splitter is None] whether to randomly shuffle the dataset (based on fractional_split_random_seed) before splitting it
additional_metrics – additional metrics to apply
compute_probabilities – whether to compute class probabilities. Enabling this will enable many downstream computations and visualisations (e.g. precision-recall plots) but requires the model to support probability computation in general.
binary_positive_label – the positive class label for binary classification; if GUESS, try to detect from labels; if None, no detection (assume non-binary classification)
- classmethod from_dict_or_instance(params: Optional[Union[Dict[str, Any], ClassificationEvaluatorParams]]) ClassificationEvaluatorParams [source]#
- class VectorClassificationModelEvaluatorParams(*args, **kwargs)[source]#
Bases:
ClassificationEvaluatorParams
- Parameters:
data_splitter – [if test data must be obtained via split] the splitter to use to obtain the test data; if None, fractional_split_test_fraction must be specified for a fractional split (default)
fractional_split_test_fraction – [if test data must be obtained via split and data_splitter is None] the fraction of the data to use for testing/evaluation
fractional_split_random_seed – [if test data must be obtained via split and data_splitter is None] the random seed to use for the fractional split of the data
fractional_split_shuffle – [if test data must be obtained via split and data_splitter is None] whether to randomly shuffle the dataset (based on fractional_split_random_seed) before splitting it
additional_metrics – additional metrics to apply
compute_probabilities – whether to compute class probabilities. Enabling this will enable many downstream computations and visualisations (e.g. precision-recall plots) but requires the model to support probability computation in general.
binary_positive_label – the positive class label for binary classification; if GUESS, try to detect from labels; if None, no detection (assume non-binary classification)
- class VectorClassificationModelEvaluator(data: InputOutputData, test_data: Optional[InputOutputData] = None, params: Optional[ClassificationEvaluatorParams] = None)[source]#
Bases: VectorModelEvaluator[VectorClassificationModelEvaluationData]
Constructs an evaluator with test and training data.
- Parameters:
data – the full data set, or, if test_data is given, the training data
test_data – the data to use for testing/evaluation; if None, must specify appropriate parameters to define splitting
params – the parameters
- class RuleBasedVectorClassificationModelEvaluator(data: InputOutputData)[source]#
Bases:
VectorClassificationModelEvaluator
Constructs an evaluator with test and training data.
- Parameters:
data – the full data set, or, if test_data is given, the training data
test_data – the data to use for testing/evaluation; if None, must specify appropriate parameters to define splitting
params – the parameters
- eval_model(model: VectorModelBase, on_training_data=False, track=True, fit=False) VectorClassificationModelEvaluationData [source]#
Evaluates the rule-based model. The training data and test data coincide: fitting the model fits its preprocessors on the full data set, and evaluating it uses that same data set.
- Parameters:
model – the model to evaluate
on_training_data – has to be False here. Setting to True is not supported and will lead to an exception
track – whether to track the evaluation metrics for the case where a tracked experiment was set on this object
- Returns:
the evaluation result
- class RuleBasedVectorRegressionModelEvaluator(data: InputOutputData)[source]#
Bases:
VectorRegressionModelEvaluator
Constructs an evaluator with test and training data.
- Parameters:
data – the full data set, or, if test_data is given, the training data
test_data – the data to use for testing/evaluation; if None, must specify appropriate parameters to define splitting
params – the parameters
- eval_model(model: Union[VectorModelBase, VectorModel], on_training_data=False, track=True, fit=False) VectorRegressionModelEvaluationData [source]#
Evaluates the rule-based model. The training data and test data coincide: fitting the model fits its preprocessors on the full data set, and evaluating it uses that same data set.
- Parameters:
model – the model to evaluate
on_training_data – has to be False here. Setting to True is not supported and will lead to an exception
track – whether to track the evaluation metrics for the case where a tracked experiment was set on this object
- Returns:
the evaluation result