Learn And Code Confusion Matrix With Python
The confusion matrix is a way to visualize how many samples from each label were predicted correctly and how many were misclassified. The beauty of the confusion matrix is that it actually allows us to see where the model fails and where it succeeds, especially when the labels are imbalanced. In other words, we are able to see beyond the model's accuracy. Each prediction falls into one of four cells: true positives (tp) and true negatives (tn), where the prediction matches the actual label, and false positives (fp) and false negatives (fn), where it does not.
P.S. some people put the predicted values on the rows and the actual values on the columns, which is just the transpose of this matrix. Others start with the negative class first, then the positive class. These are just different ways of drawing the confusion matrix, and they all convey the same thing.
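To make these conventions concrete, here is a small sketch (the toy labels below are made up purely for illustration) showing sklearn's default layout, its transpose, and the positive-class-first ordering:

from sklearn.metrics import confusion_matrix

# toy labels, made up purely for illustration
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1]

# sklearn's default: actual values on the rows, predicted values on the columns,
# with classes sorted ascending, so the negative class (0) comes first
cm = confusion_matrix(y_true, y_pred)
print(cm)

# the transposed convention: predicted values on the rows, actual values on the columns
print(cm.T)

# starting with the positive class first instead, via the labels parameter
print(confusion_matrix(y_true, y_pred, labels=[1, 0]))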
Confusion Matrix in Python
Let's try generating a confusion matrix in Python.
import random
import numpy as np

# first 50 values are positive labels (1), second 50 values are negative labels (0)
actual_values = [1] * 50 + [0] * 50
predicted_values = random.choices([0, 1], k=100)  # randomly generate 0 and 1 labels
predicted_values[0:5]

We can then calculate each of the four possible outcomes in the confusion matrix by simply comparing each value in actual_values to its corresponding value in predicted_values.
fp = 0
fn = 0
tp = 0
tn = 0
for actual_value, predicted_value in zip(actual_values, predicted_values):
    # let's first see if it's a true (t) or a false (f) prediction
    if predicted_value == actual_value:  # t?
        if predicted_value == 1:  # tp
            tp += 1
        else:  # tn
            tn += 1
    else:  # f?
        if predicted_value == 1:  # fp
            fp += 1
        else:  # fn
            fn += 1

our_confusion_matrix = [[tn, fp], [fn, tp]]
# we convert it to a numpy array so it prints properly as a matrix
our_confusion_matrix = np.array(our_confusion_matrix)
our_confusion_matrix
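Since the whole point of a confusion matrix is visualization, it can also be plotted as a color-coded grid. A minimal sketch, assuming matplotlib and a reasonably recent scikit-learn (which provides ConfusionMatrixDisplay) are installed:

import matplotlib.pyplot as plt
from sklearn.metrics import ConfusionMatrixDisplay

# render the matrix we just built as a color-coded grid
disp = ConfusionMatrixDisplay(confusion_matrix=our_confusion_matrix, display_labels=[0, 1])
disp.plot()
plt.show()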
We can get the same confusion matrix using the sklearn.metrics.confusion_matrix function.

from sklearn.metrics import confusion_matrix

confusion_matrix(actual_values, predicted_values)

Accuracy is simply the fraction of all predictions that are correct:

accuracy = (tp + tn) / 100
accuracy

# or
from sklearn.metrics import accuracy_score
accuracy_score(actual_values, predicted_values)

Precision vs Recall

Precision
Precision is the percentage of predictions that are correct out of all the predictions made. Example: if you predicted that 100 patients would catch Covid-19, but only 90 of them actually got it, then your precision is 90%. So out of all predicted positives (true positives and false positives), how many are actually true positives (tp)?
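As a quick sanity check, here is that Covid-19 example as code (the variable names are mine, chosen so they don't clash with the counts we computed above):

# 100 patients predicted to catch Covid-19: 90 actually did (true positives), 10 did not (false positives)
covid_tp = 90
covid_fp = 10
covid_precision = covid_tp / (covid_tp + covid_fp)
covid_precision  # 0.9, i.e. 90%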

all_predicted_positives = tp + fp
precision_positive = tp / all_predicted_positives
precision_positive

# or
from sklearn.metrics import precision_score
precision_score(actual_values, predicted_values, pos_label=1)  # precision_positive

# for the negative class
all_predicted_negatives = tn + fn
precision_negative = tn / all_predicted_negatives
precision_negative

# here we trick sklearn into thinking that the positive label is 0, not 1 :)
precision_score(actual_values, predicted_values, pos_label=0)  # precision_negative

Recall
Recall is the percentage of actual positives the model managed to detect. So out of all actual positives (true positives and false negatives), how many did the model predict as positive?

all_actual_positive = tp + fn
recall_positive = tp / all_actual_positive
recall_positive

# or
from sklearn.metrics import recall_score
recall_score(actual_values, predicted_values)  # recall_positive

all_actual_negative = tn + fp
recall_negative = tn / all_actual_negative
recall_negative

# here we trick sklearn into thinking that the positive label is 0, not 1 :)
recall_score(actual_values, predicted_values, pos_label=0)  # recall_negative

Importance of Precision and Recall
Let's say your dataset has just 10 positive samples and 90 negative samples. If you use a classifier that classifies everything as negative, its accuracy would be 90%, which is misleading. The classifier is actually pretty dumb! So let's calculate the precision and recall for such a model.
# data
actual_values = [0] * 90 + [1] * 10
predicted_values = [0] * 100

acc = accuracy_score(actual_values, predicted_values)
prec_pos = precision_score(actual_values, predicted_values)
recall_pos = recall_score(actual_values, predicted_values)
prec_neg = precision_score(actual_values, predicted_values, pos_label=0)
recall_neg = recall_score(actual_values, predicted_values, pos_label=0)

print(f"Accuracy: {acc}")
print(f"Precision (+): {prec_pos}")
print(f"Recall (+): {recall_pos}")
print(f"Precision (-): {prec_neg}")
print(f"Recall (-): {recall_neg}")

Sklearn is warning us about a zero division. Where is that? It is in the precision of the positive class: we should be dividing by all the predicted positives, but the model made no positive predictions, so that denominator is zero! More importantly, the positive recall is also zero, because the model did not detect any of the positive samples; it naively classifies everything as negative.
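If you would rather make that convention explicit and silence the warning, recent versions of scikit-learn accept a zero_division argument; a sketch:

from sklearn.metrics import precision_score

# explicitly return 0 (instead of warning) when the denominator tp + fp is 0
precision_score(actual_values, predicted_values, zero_division=0)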
F1 Score
The F1 score combines precision and recall into a single number: it is their harmonic mean, 2 * (precision * recall) / (precision + recall).

f1_positive = 2 * (prec_pos * recall_pos) / (prec_pos + recall_pos)
f1_positive  # nan, because prec_pos and recall_pos are both 0, so we divide 0 by 0

# or
from sklearn.metrics import f1_score
f1_score(actual_values, predicted_values)  # sklearn handles this nan and converts it to 0

f1_negative = 2 * (prec_neg * recall_neg) / (prec_neg + recall_neg)
f1_negative

Sklearn can compute all of these per-class metrics in one go with classification_report. Let's try it on a multiclass example.

actual_values = [1] * 30 + [2] * 30 + [3] * 30 + [4] * 10  # 30 samples of each of classes 1, 2, and 3, and 10 samples of class 4
predicted_values = random.choices([1, 2, 3, 4], k=100)  # 100 random predictions

from sklearn.metrics import classification_report
print(classification_report(actual_values, predicted_values))

Support: This column tells you how many samples are in each class.
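If you want these numbers programmatically rather than as printed text, classification_report can also return a nested dictionary via its output_dict parameter; a small sketch:

from sklearn.metrics import classification_report

# get the report as a dict: one entry per class, plus accuracy, macro avg, and weighted avg
report = classification_report(actual_values, predicted_values, output_dict=True)
report["macro avg"]["precision"]  # the same macro-averaged precision discussed next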
Macro Avg
For a multiclass classification problem, apart from the per-class precision, recall, and f1 scores, we also check the macro-averaged and weighted-averaged precision, recall, and f1 scores of the whole model. These aggregate scores help in choosing the best model for the task at hand.
In the classification report above, if we take the plain average of the precision column, we get 0.23, as shown below. The averages of the other columns can be found in the same way. (Note that these numbers come from one particular random run; yours will differ.)
(0.39 + 0.21 + 0.32 + 0.00) / 4.0

Weighted Avg
The weighted average weights each class's score by its support (the number of samples in that class). For example, the weighted average of the precision column is calculated by multiplying each class's precision by its number of samples, summing the results, and dividing by the total number of samples, as shown below.
(0.39 * 30 + 0.21 * 30 + 0.32 * 30 + 0.00 * 10) / 100
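Rather than averaging by hand, you can ask sklearn's metric functions for these averages directly through their average parameter; a sketch using precision_score (the same works for recall_score and f1_score):

from sklearn.metrics import precision_score

# unweighted mean of the per-class precisions (the "macro avg" row of the report)
precision_score(actual_values, predicted_values, average="macro")

# per-class precisions weighted by each class's support (the "weighted avg" row)
precision_score(actual_values, predicted_values, average="weighted")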


