eval_stats_classification#
Source code: sensai/evaluation/eval_stats/eval_stats_classification.py
- class ClassificationMetric(name: Optional[str] = None, bounds: Tuple[float, float] = (0, 1), requires_probabilities: Optional[bool] = None)[source]#
Bases:
Metric[ClassificationEvalStats], ABC
- Parameters:
name – the name of the metric; if None use the class’ name attribute
bounds – the minimum and maximum values the metric can take on
- requires_probabilities = False#
- compute_value_for_eval_stats(eval_stats: ClassificationEvalStats)[source]#
- compute_value(y_true: Union[ndarray, Series, DataFrame, list], y_predicted: Union[ndarray, Series, DataFrame, list], y_predicted_class_probabilities: Optional[Union[ndarray, Series, DataFrame, list]] = None)[source]#
- name: str#
- class ClassificationMetricAccuracy(name: Optional[str] = None, bounds: Tuple[float, float] = (0, 1), requires_probabilities: Optional[bool] = None)[source]#
Bases:
ClassificationMetric
- Parameters:
name – the name of the metric; if None use the class’ name attribute
bounds – the minimum and maximum values the metric can take on
- name: str = 'accuracy'#
- class ClassificationMetricBalancedAccuracy(name: Optional[str] = None, bounds: Tuple[float, float] = (0, 1), requires_probabilities: Optional[bool] = None)[source]#
Bases:
ClassificationMetric
- Parameters:
name – the name of the metric; if None use the class’ name attribute
bounds – the minimum and maximum values the metric can take on
- name: str = 'balancedAccuracy'#
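A concrete metric can be applied directly to arrays of labels via compute_value(). A minimal sketch with illustrative data (neither metric requires class probabilities):

```python
from sensai.evaluation.eval_stats.eval_stats_classification import (
    ClassificationMetricAccuracy,
    ClassificationMetricBalancedAccuracy,
)

# illustrative ground truth and predictions for a three-class problem
y_true = ["cat", "dog", "dog", "bird", "cat"]
y_predicted = ["cat", "dog", "cat", "bird", "cat"]

# neither metric requires probabilities, so the label arrays alone suffice
accuracy = ClassificationMetricAccuracy().compute_value(y_true, y_predicted)
balanced_accuracy = ClassificationMetricBalancedAccuracy().compute_value(y_true, y_predicted)
```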
- class ClassificationMetricAccuracyWithoutLabels(*labels: Any, probability_threshold=None, zero_value=0.0)[source]#
Bases:
ClassificationMetric
Accuracy score computed on the subset of data points whose ground truth label is not one of the given labels
- Parameters:
labels – one or more labels which are not to be considered (all data points where the ground truth is one of these labels will be ignored)
probability_threshold – a probability threshold: the probability of the most likely class must be at least this value for a data point to be considered in the metric computation (analogous to ClassificationMetricAccuracyMaxProbabilityBeyondThreshold)
zero_value – the metric value to assume for the case where the condition never applies (no countable instances without the given label/beyond the given threshold)
- get_paired_metrics() List[TMetric] [source]#
Gets a list of metrics that should be considered together with this metric (e.g. for paired visualisations/plots). The direction of the pairing should be such that if this metric is “x”, the other is “y” for x-y type visualisations.
- Returns:
a list of metrics
- name: str#
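A minimal sketch of excluding a label from the accuracy computation; the label values are illustrative:

```python
from sensai.evaluation.eval_stats.eval_stats_classification import (
    ClassificationMetricAccuracyWithoutLabels,
)

y_true = ["unknown", "cat", "dog", "dog"]
y_predicted = ["cat", "cat", "dog", "cat"]

# accuracy computed only on data points whose ground truth is not "unknown"
metric = ClassificationMetricAccuracyWithoutLabels("unknown")
value = metric.compute_value(y_true, y_predicted)
```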
- class ClassificationMetricGeometricMeanOfTrueClassProbability(name: Optional[str] = None, bounds: Tuple[float, float] = (0, 1), requires_probabilities: Optional[bool] = None)[source]#
Bases:
ClassificationMetric
- Parameters:
name – the name of the metric; if None use the class’ name attribute
bounds – the minimum and maximum values the metric can take on
- name: str = 'geoMeanTrueClassProb'#
- requires_probabilities = True#
- class ClassificationMetricTopNAccuracy(n: int)[source]#
Bases:
ClassificationMetric
- Parameters:
n – the number of most probable classes to consider: a prediction is counted as correct if the true class is among the n classes with the highest predicted probability
- requires_probabilities = True#
- name: str#
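Since this metric requires probabilities, compute_value must also be given class probabilities. The sketch below assumes the same data frame format as documented for ClassificationEvalStats (one column per class label), with made-up values:

```python
import pandas as pd

from sensai.evaluation.eval_stats.eval_stats_classification import ClassificationMetricTopNAccuracy

y_true = ["cat", "dog", "bird"]
y_predicted = ["cat", "cat", "bird"]
# class probabilities: one column per class label (illustrative values)
probabilities = pd.DataFrame({
    "cat":  [0.6, 0.5, 0.1],
    "dog":  [0.3, 0.4, 0.2],
    "bird": [0.1, 0.1, 0.7],
})

# a prediction counts as correct if the true class is among the 2 most probable classes
top2_accuracy = ClassificationMetricTopNAccuracy(2).compute_value(
    y_true, y_predicted, y_predicted_class_probabilities=probabilities)
```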
- class ClassificationMetricAccuracyMaxProbabilityBeyondThreshold(threshold: float, zero_value=0.0)[source]#
Bases:
ClassificationMetric
Accuracy limited to cases where the probability of the most likely class is at least a given threshold
- Parameters:
threshold – minimum probability of the most likely class
zero_value – the value of the metric for the case where the probability of the most likely class never reaches the threshold
- requires_probabilities = True#
- get_paired_metrics() List[TMetric] [source]#
Gets a list of metrics that should be considered together with this metric (e.g. for paired visualisations/plots). The direction of the pairing should be such that if this metric is “x”, the other is “y” for x-y type visualisations.
- Returns:
a list of metrics
- name: str#
- class ClassificationMetricRelFreqMaxProbabilityBeyondThreshold(threshold: float)[source]#
Bases:
ClassificationMetric
Relative frequency of cases where the probability of the most likely class is at least a given threshold
- Parameters:
threshold – minimum probability of the most likely class
- requires_probabilities = True#
- name: str#
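The two threshold-based metrics are naturally read together: one gives the accuracy among confident predictions, the other the fraction of predictions that are confident (they are also paired via get_paired_metrics). A sketch with illustrative values:

```python
import pandas as pd

from sensai.evaluation.eval_stats.eval_stats_classification import (
    ClassificationMetricAccuracyMaxProbabilityBeyondThreshold,
    ClassificationMetricRelFreqMaxProbabilityBeyondThreshold,
)

y_true = ["cat", "dog", "dog"]
y_predicted = ["cat", "dog", "cat"]
probabilities = pd.DataFrame({"cat": [0.9, 0.3, 0.55],
                              "dog": [0.1, 0.7, 0.45]})

threshold = 0.8
# accuracy restricted to predictions whose top class probability is at least 0.8
acc = ClassificationMetricAccuracyMaxProbabilityBeyondThreshold(threshold).compute_value(
    y_true, y_predicted, y_predicted_class_probabilities=probabilities)
# relative frequency of predictions reaching that level of confidence
freq = ClassificationMetricRelFreqMaxProbabilityBeyondThreshold(threshold).compute_value(
    y_true, y_predicted, y_predicted_class_probabilities=probabilities)
```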
- class BinaryClassificationMetric(positive_class_label, name: Optional[str] = None)[source]#
Bases:
ClassificationMetric, ABC
- Parameters:
positive_class_label – the label of the positive class
name – the name of the metric; if None use the class’ name attribute
- name: str#
- class BinaryClassificationMetricPrecision(positive_class_label)[source]#
Bases:
BinaryClassificationMetric
- Parameters:
positive_class_label – the label of the positive class
- name: str = 'precision'#
- get_paired_metrics() List[BinaryClassificationMetric] [source]#
Gets a list of metrics that should be considered together with this metric (e.g. for paired visualisations/plots). The direction of the pairing should be such that if this metric is “x”, the other is “y” for x-y type visualisations.
- Returns:
a list of metrics
- class BinaryClassificationMetricRecall(positive_class_label)[source]#
Bases:
BinaryClassificationMetric
- Parameters:
positive_class_label – the label of the positive class
- name: str = 'recall'#
- class BinaryClassificationMetricF1Score(positive_class_label)[source]#
Bases:
BinaryClassificationMetric
- Parameters:
positive_class_label – the label of the positive class
- name: str = 'F1'#
- class BinaryClassificationMetricRecallForPrecision(precision: float, positive_class_label, zero_value=0.0)[source]#
Bases:
BinaryClassificationMetric
Computes the maximum recall that can be achieved (by varying the decision threshold) in cases where at least the given precision is reached. If the given precision cannot be achieved at all, the metric value is zero_value.
- Parameters:
precision – the minimum precision value that must be reached
positive_class_label – the positive class label
zero_value – the value to return for the case where the minimum precision is never reached
- compute_value_for_eval_stats(eval_stats: ClassificationEvalStats)[source]#
- name: str#
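A sketch of computing this metric from a ClassificationEvalStats object (documented further below); the data and the choice of 1 as positive class label are illustrative:

```python
import pandas as pd

from sensai.evaluation.eval_stats.eval_stats_classification import (
    BinaryClassificationMetricRecallForPrecision,
    ClassificationEvalStats,
)

y_true = [1, 0, 1, 1, 0]
y_predicted = [1, 0, 1, 0, 0]
probabilities = pd.DataFrame({0: [0.2, 0.9, 0.3, 0.6, 0.8],
                              1: [0.8, 0.1, 0.7, 0.4, 0.2]})

eval_stats = ClassificationEvalStats(
    y_predicted=y_predicted, y_true=y_true,
    y_predicted_class_probabilities=probabilities,
    binary_positive_label=1)

# maximum recall achievable (by varying the decision threshold) at precision >= 0.8
metric = BinaryClassificationMetricRecallForPrecision(0.8, positive_class_label=1)
value = metric.compute_value_for_eval_stats(eval_stats)
```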
- class BinaryClassificationMetricPrecisionThreshold(threshold: float, positive_class_label: Any, zero_value=0.0)[source]#
Bases:
BinaryClassificationMetric
Precision for the case where predictions are considered “positive” if predicted probability of the positive class is beyond the given threshold
- Parameters:
threshold – the minimum predicted probability of the positive class for the prediction to be considered “positive”
zero_value – the value of the metric for the case where a positive class probability beyond the threshold is never predicted (denominator = 0)
- requires_probabilities = True#
- get_paired_metrics() List[BinaryClassificationMetric] [source]#
Gets a list of metrics that should be considered together with this metric (e.g. for paired visualisations/plots). The direction of the pairing should be such that if this metric is “x”, the other is “y” for x-y type visualisations.
- Returns:
a list of metrics
- name: str#
- class BinaryClassificationMetricRecallThreshold(threshold: float, positive_class_label: Any, zero_value=0.0)[source]#
Bases:
BinaryClassificationMetric
Recall for the case where predictions are considered “positive” if predicted probability of the positive class is beyond the given threshold
- Parameters:
threshold – the minimum predicted probability of the positive class for the prediction to be considered “positive”
zero_value – the value of the metric for the case where there are no positive instances in the data set (denominator = 0)
- requires_probabilities = True#
- name: str#
- create_default_binary_classification_metrics(positive_class_label: Any) List[BinaryClassificationMetric] [source]#
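A short sketch of obtaining the default binary metrics; the positive class label 1 is illustrative:

```python
from sensai.evaluation.eval_stats.eval_stats_classification import (
    create_default_binary_classification_metrics,
)

# default binary metrics (e.g. precision and recall) for the assumed positive class label 1
binary_metrics = create_default_binary_classification_metrics(positive_class_label=1)
for metric in binary_metrics:
    print(metric.name)
```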
- class ClassificationEvalStats(y_predicted: Optional[Union[ndarray, Series, DataFrame, list]] = None, y_true: Optional[Union[ndarray, Series, DataFrame, list]] = None, y_predicted_class_probabilities: Optional[DataFrame] = None, labels: Optional[Union[ndarray, Series, DataFrame, list]] = None, metrics: Optional[Sequence[ClassificationMetric]] = None, additional_metrics: Optional[Sequence[ClassificationMetric]] = None, binary_positive_label=('__guess',))[source]#
Bases:
PredictionEvalStats[ClassificationMetric]
- Parameters:
y_predicted – the predicted class labels
y_true – the true class labels
y_predicted_class_probabilities – a data frame whose columns are the class labels and whose values are probabilities
labels – the list of class labels
metrics – the metrics to compute for evaluation; if None, use default metrics (see DEFAULT_MULTICLASS_CLASSIFICATION_METRICS and create_default_binary_classification_metrics())
additional_metrics – the metrics to additionally compute
binary_positive_label – the label of the positive class for the case of binary classification, in which case further binary metrics are added by default; if GUESS (the default), check labels (if there are exactly two) for the occurrence of one of BINARY_CLASSIFICATION_POSITIVE_LABEL_CANDIDATES in the given order and use the first one found (if any); if None, treat the problem as non-binary, regardless of the labels being used.
- get_confusion_matrix() ConfusionMatrix [source]#
- get_binary_classification_probability_threshold_variation_data() BinaryClassificationProbabilityThresholdVariationData [source]#
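A minimal sketch of constructing eval stats from illustrative predictions; the explicit binary_positive_label avoids relying on the label-guessing mechanism:

```python
import pandas as pd

from sensai.evaluation.eval_stats.eval_stats_classification import ClassificationEvalStats

y_true = ["spam", "ham", "spam", "ham"]
y_predicted = ["spam", "ham", "ham", "ham"]
# class probabilities: one column per class label (illustrative values)
probabilities = pd.DataFrame({"ham":  [0.1, 0.8, 0.6, 0.7],
                              "spam": [0.9, 0.2, 0.4, 0.3]})

eval_stats = ClassificationEvalStats(
    y_predicted=y_predicted,
    y_true=y_true,
    y_predicted_class_probabilities=probabilities,
    labels=["ham", "spam"],
    binary_positive_label="spam")

confusion_matrix = eval_stats.get_confusion_matrix()
threshold_data = eval_stats.get_binary_classification_probability_threshold_variation_data()
```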
- class ClassificationEvalStatsCollection(eval_stats_list: List[ClassificationEvalStats])[source]#
Bases:
EvalStatsCollection[ClassificationEvalStats, ClassificationMetric]
- get_combined_eval_stats() ClassificationEvalStats [source]#
Combines the data from all contained EvalStats objects into a single object. Note that this is only possible if all EvalStats objects use the same set of class labels.
- Returns:
an EvalStats object that combines the data from all contained EvalStats objects
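A sketch of combining eval stats from, e.g., multiple cross-validation folds (illustrative label data; both folds use the same set of class labels, as required):

```python
from sensai.evaluation.eval_stats.eval_stats_classification import (
    ClassificationEvalStats,
    ClassificationEvalStatsCollection,
)

# eval stats from two folds over the same set of class labels
fold1 = ClassificationEvalStats(y_predicted=["a", "b", "a"], y_true=["a", "b", "b"])
fold2 = ClassificationEvalStats(y_predicted=["b", "a", "a"], y_true=["b", "a", "a"])

collection = ClassificationEvalStatsCollection([fold1, fold2])
combined = collection.get_combined_eval_stats()
```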
- class ConfusionMatrix(y_true: Union[ndarray, Series, DataFrame, list], y_predicted: Union[ndarray, Series, DataFrame, list])[source]#
Bases:
object
- class BinaryClassificationCounts(is_positive_prediction: Sequence[bool], is_positive_ground_truth: Sequence[bool], zero_denominator_metric_value: float = 0.0)[source]#
Bases:
object
- Parameters:
is_positive_prediction – the sequence of Booleans indicating whether the model predicted the positive class
is_positive_ground_truth – the sequence of Booleans indicating whether the true class is the positive class
zero_denominator_metric_value – the result to return for metrics such as precision and recall in case the denominator is zero (i.e. zero counted cases)
- classmethod from_probability_threshold(probabilities: Sequence[float], threshold: float, is_positive_ground_truth: Sequence[bool]) BinaryClassificationCounts [source]#
- classmethod from_eval_stats(eval_stats: ClassificationEvalStats, threshold=0.5) BinaryClassificationCounts [source]#
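A sketch of deriving counts directly from positive-class probabilities (illustrative values):

```python
from sensai.evaluation.eval_stats.eval_stats_classification import BinaryClassificationCounts

# predicted probabilities of the positive class and the corresponding ground truth
probabilities = [0.9, 0.4, 0.75, 0.2]
is_positive_ground_truth = [True, False, True, False]

# predictions whose positive class probability is at least 0.5 are counted as positive
counts = BinaryClassificationCounts.from_probability_threshold(
    probabilities, threshold=0.5, is_positive_ground_truth=is_positive_ground_truth)
```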
- class BinaryClassificationProbabilityThresholdVariationData(eval_stats: ClassificationEvalStats)[source]#
Bases:
object
- class ClassificationEvalStatsPlot(*args, **kwds)[source]#
Bases:
EvalStatsPlot[ClassificationEvalStats], ABC
- class ClassificationEvalStatsPlotConfusionMatrix(normalise=True)[source]#
Bases:
ClassificationEvalStatsPlot
- create_figure(eval_stats: ClassificationEvalStats, subtitle: str) Figure [source]#
- Parameters:
eval_stats – the evaluation stats from which to generate the plot
subtitle – the plot’s subtitle
- Returns:
the figure or None if this plot is not applicable/cannot be created
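A sketch of rendering the confusion matrix plot for an illustrative ClassificationEvalStats object; saving via matplotlib's Figure.savefig:

```python
from sensai.evaluation.eval_stats.eval_stats_classification import (
    ClassificationEvalStats,
    ClassificationEvalStatsPlotConfusionMatrix,
)

eval_stats = ClassificationEvalStats(
    y_predicted=["cat", "dog", "dog"],
    y_true=["cat", "dog", "cat"])

plot = ClassificationEvalStatsPlotConfusionMatrix(normalise=True)
fig = plot.create_figure(eval_stats, subtitle="validation set")
if fig is not None:  # a matplotlib Figure
    fig.savefig("confusion_matrix.png")
```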
- class ClassificationEvalStatsPlotPrecisionRecall(*args, **kwds)[source]#
Bases:
ClassificationEvalStatsPlot
- create_figure(eval_stats: ClassificationEvalStats, subtitle: str) Optional[Figure] [source]#
- Parameters:
eval_stats – the evaluation stats from which to generate the plot
subtitle – the plot’s subtitle
- Returns:
the figure or None if this plot is not applicable/cannot be created
- class ClassificationEvalStatsPlotProbabilityThresholdPrecisionRecall(*args, **kwds)[source]#
Bases:
ClassificationEvalStatsPlot
- create_figure(eval_stats: ClassificationEvalStats, subtitle: str) Optional[Figure] [source]#
- Parameters:
eval_stats – the evaluation stats from which to generate the plot
subtitle – the plot’s subtitle
- Returns:
the figure or None if this plot is not applicable/cannot be created
- class ClassificationEvalStatsPlotProbabilityThresholdCounts(*args, **kwds)[source]#
Bases:
ClassificationEvalStatsPlot
- create_figure(eval_stats: ClassificationEvalStats, subtitle: str) Optional[Figure] [source]#
- Parameters:
eval_stats – the evaluation stats from which to generate the plot
subtitle – the plot’s subtitle
- Returns:
the figure or None if this plot is not applicable/cannot be created