Confusion matrix in detecting Cyber Crimes

Bhavesh S. Sonewale
Jun 6, 2021

A confusion matrix is a table that summarizes the number of correct and incorrect predictions made by a classifier (or classification model) by comparing actual values with predicted values. The binary case is the simplest, but the same table extends to any number of classes.

By reading the confusion matrix, you can gauge the accuracy of the model from the diagonal values, which hold the number of correct classifications.

The confusion matrix is a square matrix in which one axis represents the actual values and the other represents the values predicted by the model. Which of the two goes on the rows and which on the columns varies by convention. Specifically:

  1. A confusion matrix presents the ways in which a classification model becomes confused while making predictions.
  2. A good matrix (model) will have large values along the diagonal and small values off the diagonal.
  3. Examining a confusion matrix gives better insight into what our classification model is getting right and what types of errors it is making.
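
As a minimal sketch of the points above, the snippet below builds a confusion matrix in plain Python. The function name build_confusion_matrix, the class labels, and the sample data are all made up for illustration and do not come from any library or real dataset. Rows are actual classes and columns are predicted classes here; the opposite convention is equally valid.

```python
def build_confusion_matrix(actual, predicted, labels):
    """Return a square matrix: rows = actual class, columns = predicted class."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


# Hypothetical traffic labels, invented for this example.
labels = ["normal", "malware", "phishing"]
actual = ["normal", "normal", "malware", "malware", "phishing", "phishing", "normal"]
predicted = ["normal", "malware", "malware", "malware", "phishing", "normal", "normal"]

matrix = build_confusion_matrix(actual, predicted, labels)
for label, row in zip(labels, matrix):
    print(f"{label:>8}: {row}")

# Diagonal entries are correct predictions; every other cell is an error.
correct = sum(matrix[i][i] for i in range(len(labels)))
errors = sum(map(sum, matrix)) - correct
print(f"correct={correct} errors={errors}")
```

Each off-diagonal cell also says which kind of mistake was made: in this sample, one normal item was flagged as malware and one phishing item was passed as normal.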

True Positive, True Negative, False Positive and False Negative

For machine learning classification problems, a confusion matrix is a performance measurement tool.

  • For a binary classifier, this is a table of the four possible combinations of predicted and actual values.
  • Positive and Negative refer to the predicted value, while True and False indicate whether that prediction matches the actual value.
  • These four elements are the fundamental building blocks of a confusion matrix.
2x2 confusion matrix

Let’s understand the concepts of True Positive, True Negative, False Positive and False Negative with two examples:

1. Confusion matrix for an IDS (Intrusion Detection System):

True Positive: The model detects a virus in the system, and there actually is a virus in the system.

True Negative: The model does not detect a virus in the system, and there actually is no virus in the system.

False Positive: The model detects a virus in the system, but there actually is no virus in the system.

False Negative: The model does not detect a virus in the system, but there actually is a virus in the system.

2. Confusion matrix for a cancer diagnosis:

True Positive: The doctor or model detects cancer in the patient, and the patient actually has cancer.

True Negative: The doctor or model does not detect cancer in the patient, and the patient actually does not have cancer.

False Positive: The doctor or model detects cancer in the patient, but the patient actually does not have cancer.

False Negative: The doctor or model does not detect cancer in the patient, but the patient actually has cancer.

From the above discussion, we can say that:

True Positive:

Interpretation: The model (or you) predicted positive, and the prediction is correct.

True Negative:

Interpretation: The model (or you) predicted negative, and the prediction is correct.

False Positive:

Interpretation: The model (or you) predicted positive, and the prediction is wrong.

False Negative:

Interpretation: The model (or you) predicted negative, and the prediction is wrong.

In short, Positive and Negative describe the predicted value, while True and False describe whether that prediction matches the actual value.
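
The naming rule can be sketched in a few lines of Python. The helper name outcome and the sample IDS results are hypothetical and invented for illustration; "positive" is assumed here to mean "a virus is present".

```python
from collections import Counter


def outcome(actual_positive, predicted_positive):
    """Name the confusion-matrix cell for a single prediction."""
    # True/False: did the prediction match reality?
    correctness = "True" if actual_positive == predicted_positive else "False"
    # Positive/Negative: what did the model predict?
    predicted_class = "Positive" if predicted_positive else "Negative"
    return f"{correctness} {predicted_class}"


# Hypothetical IDS results: (virus actually present, model raised an alert).
results = [
    (True, True),    # virus present, detected
    (False, False),  # no virus, no alert
    (False, True),   # no virus, but an alert was raised
    (True, False),   # virus present, but missed
    (False, False),
    (True, True),
]

counts = Counter(outcome(actual, predicted) for actual, predicted in results)
for name in ("True Positive", "True Negative", "False Positive", "False Negative"):
    print(f"{name}: {counts[name]}")
```

Note that the second word of each name comes only from the prediction, and the first word only from whether the prediction agrees with reality.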

Accuracy:

Accuracy is the fraction of all predictions that are correct. The formula is:

Accuracy = (TP + TN) / (TP + TN + FP + FN)
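
A minimal sketch of this formula in Python follows. The function name accuracy and the four counts are assumptions chosen for illustration, not results from a real model.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that were correct."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("accuracy is undefined when there are no predictions")
    return (tp + tn) / total


# Hypothetical counts from an intrusion detection model.
acc = accuracy(tp=40, tn=50, fp=4, fn=6)
print(f"Accuracy: {acc:.2f}")  # (40 + 50) / 100
```

The numerator is simply the diagonal of the 2x2 matrix, and the denominator is the sum of all four cells.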

Benefits of Confusion Matrix:

  1. It gives information about the errors made by the classifier and the types of errors that are being made.
  2. It shows where a classification model gets confused while making predictions.
  3. It helps overcome the limitations of relying on classification accuracy alone, which can look high even when the model misses most cases of a rare class.
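
The third point can be illustrated with a small, made-up example: on imbalanced data, a model that never raises an alert still scores a high accuracy, and only the four counts reveal the problem. The data below is hypothetical (95 clean systems, 5 infected ones) and the "model" is a deliberately useless one.

```python
# Hypothetical IDS data: 95 clean systems and 5 infected ones.
actual = [False] * 95 + [True] * 5

# A useless model that never raises an alert.
predicted = [False] * 100

pairs = list(zip(actual, predicted))
tp = sum(a and p for a, p in pairs)          # infected, alert raised
tn = sum(not a and not p for a, p in pairs)  # clean, no alert
fp = sum(not a and p for a, p in pairs)      # clean, alert raised
fn = sum(a and not p for a, p in pairs)      # infected, no alert

accuracy = (tp + tn) / len(actual)

print(f"Accuracy: {accuracy:.2f}")
print(f"TP={tp} TN={tn} FP={fp} FN={fn}")
```

The accuracy comes out at 0.95, yet TP is 0 and FN is 5: every infected system was missed, which the accuracy figure on its own would hide.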

Conclusion:

A confusion matrix is a simple and effective way to evaluate a classification model. It gives a clear picture of how well the model has classified each class on the data it was given, and of which kinds of errors it makes.
