## Kickstart ML with Python snippets

# Evaluating classification models

## Precision, Recall, F-measure, and Weighted F-measure

These metrics are commonly used to evaluate the performance of classification models, particularly in situations where the classes are imbalanced.

### Precision

**Precision** measures the accuracy of the positive predictions made by the
model. It is the ratio of true positive predictions to the total number of positive
predictions (both true positives and false positives).

$$ \text{Precision} = \frac{\text{True Positives (TP)}}{\text{True Positives (TP)} + \text{False Positives (FP)}} $$

- **True Positives (TP):** correctly predicted positive instances.
- **False Positives (FP):** incorrectly predicted positive instances.

Precision answers the question: "Of all instances predicted as positive, how many were actually positive?"
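As a quick sketch, precision can be computed directly from confusion-matrix counts; the numbers below are illustrative, not from the article's later example:

```python
def precision(tp: int, fp: int) -> float:
    """Fraction of positive predictions that were actually positive."""
    return tp / (tp + fp)

# 8 correct positive predictions out of 10 positive predictions -> 0.8
print(precision(8, 2))
```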

### Recall (Sensitivity or True Positive Rate)

**Recall** measures the ability of the model to identify all relevant positive
cases. It is the ratio of true positive predictions to the total number of actual positive
instances (both true positives and false negatives).

$$ \text{Recall} = \frac{\text{True Positives (TP)}}{\text{True Positives (TP)} + \text{False Negatives (FN)}} $$

- **False Negatives (FN):** actual positive instances that were incorrectly predicted as negative.

Recall answers the question: "Of all actual positive instances, how many were correctly predicted as positive?"
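Recall follows the same pattern, but the denominator counts actual positives rather than predicted positives; again, the counts here are illustrative:

```python
def recall(tp: int, fn: int) -> float:
    """Fraction of actual positives that the model correctly identified."""
    return tp / (tp + fn)

# 8 actual positives found, 4 missed -> 8/12
print(recall(8, 4))
```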

### F-measure (F1-score)

**F-measure** or **F1-score** is the harmonic mean of precision and
recall. It provides a single metric that balances the trade-off between precision and
recall.

$$ \text{F1-score} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} $$

The F1-score is useful when you need a balance between precision and recall and when you have an uneven class distribution.

### Weighted F-measure (\( F_\beta \)-score)

The **Weighted F-measure** or **\( F_\beta \)-score** generalizes the F1-score
by introducing a weighting factor \( \beta \) that balances the importance of precision and
recall.

$$ F_\beta = (1 + \beta^2) \times \frac{\text{Precision} \times \text{Recall}}{(\beta^2 \times \text{Precision}) + \text{Recall}} $$

**Beta (\( \beta \)):** a parameter that determines the weight of recall in the combined score.

- If \( \beta = 1 \), the \( F_\beta \)-score is equivalent to the F1-score (equal weight to precision and recall).
- If \( \beta > 1 \), recall is given more weight.
- If \( \beta < 1 \), precision is given more weight.

The \( F_\beta \)-score is useful when you want to emphasize either precision or recall more, depending on the specific needs of your application.
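To see the effect of \( \beta \), the sketch below evaluates the formula for a few values with fixed, illustrative precision and recall (0.9 and 0.6, not from the article's example). As \( \beta \) grows, the score is pulled toward recall; as it shrinks, toward precision:

```python
def f_beta(precision: float, recall: float, beta: float) -> float:
    """Weighted F-measure: (1 + b^2) * P * R / (b^2 * P + R)."""
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# With P = 0.9 and R = 0.6 (illustrative values):
for beta in (0.5, 1.0, 2.0):
    print(f"beta={beta}: {f_beta(0.9, 0.6, beta):.3f}")
# beta < 1 lands closer to P (0.9); beta > 1 lands closer to R (0.6).
```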

## Example Calculation

Let's illustrate these metrics with a confusion matrix example:

|  | Predicted Positive | Predicted Negative |
|---|---|---|
| **Actual Positive** | 50 | 10 |
| **Actual Negative** | 5 | 35 |

- **True Positives (TP):** 50
- **False Positives (FP):** 5
- **True Negatives (TN):** 35
- **False Negatives (FN):** 10

### Precision

$$ \text{Precision} = \frac{TP}{TP + FP} = \frac{50}{50 + 5} = \frac{50}{55} \approx 0.91 $$

### Recall

$$ \text{Recall} = \frac{TP}{TP + FN} = \frac{50}{50 + 10} = \frac{50}{60} \approx 0.83 $$

### F1-score

$$ \text{F1-score} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} = 2 \times \frac{0.91 \times 0.83}{0.91 + 0.83} \approx 0.87 $$

### \( F_\beta \)-score (example with \( \beta = 2 \))

$$ F_2 = (1 + 2^2) \times \frac{\text{Precision} \times \text{Recall}}{(2^2 \times \text{Precision}) + \text{Recall}} = 5 \times \frac{0.91 \times 0.83}{(4 \times 0.91) + 0.83} \approx 0.84 $$
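The worked example can be checked in a few lines of Python using the exact ratios. Note that keeping full precision throughout gives \( F_2 \approx 0.85 \); the slightly lower value above comes from rounding precision and recall to two decimals before applying the formula:

```python
# Counts from the confusion matrix above.
tp, fp, tn, fn = 50, 5, 35, 10

precision = tp / (tp + fp)  # 50/55 ~= 0.909
recall = tp / (tp + fn)     # 50/60 ~= 0.833

f1 = 2 * precision * recall / (precision + recall)
f2 = 5 * precision * recall / (4 * precision + recall)  # beta = 2

print(f"Precision: {precision:.2f}")  # 0.91
print(f"Recall:    {recall:.2f}")     # 0.83
print(f"F1-score:  {f1:.2f}")         # 0.87
print(f"F2-score:  {f2:.2f}")         # 0.85 (exact ratios)
```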

## Summary of Metrics

- **Precision:** indicates the correctness of positive predictions.
- **Recall:** indicates the coverage of actual positive instances.
- **F1-score:** balances precision and recall.
- **\( F_\beta \)-score:** provides a weighted balance of precision and recall, emphasizing one more than the other based on the value of \( \beta \).

## Python Example

Here’s how you can calculate these metrics using the `sklearn` library:

```python
from sklearn.metrics import precision_score, recall_score, f1_score, fbeta_score

# Sample data
y_true = [1, 1, 0, 1, 0, 1, 0, 0, 1, 0]  # Actual labels
y_pred = [1, 0, 0, 1, 0, 1, 1, 0, 1, 0]  # Predicted labels

# Precision: TP / (TP + FP)
precision = precision_score(y_true, y_pred)
print("Precision:", precision)

# Recall: TP / (TP + FN)
recall = recall_score(y_true, y_pred)
print("Recall:", recall)

# F1-score: harmonic mean of precision and recall
f1 = f1_score(y_true, y_pred)
print("F1-score:", f1)

# F-beta score with beta = 2 (weights recall more heavily)
fbeta = fbeta_score(y_true, y_pred, beta=2)
print("F2-score:", fbeta)
```