Comment: Reverted from v. 22

...


...

Available measure types

The model evaluation methods in WSO2 ML can be categorized into four types as follows.

Numerical predictions

These methods involve making a numerical prediction based on the dataset analysed. The available measures of this type are as follows.

Binary classification

Binary classification involves classifying the data items in a dataset into two categories.

Terminology of Binary Classification Metrics

Binary classification metrics refer to the following two formulas, which are used to calculate the reliability of a binary classification model.

Name | Formula
True Positive Rate (Sensitivity) | TPR = TP / P = TP / (TP + FN)
True Negative Rate (Specificity) | SPC = TN / N = TN / (TN + FP)

Terminology for binary classification metrics

The following table explains the abbreviations used in the above formulas.

Abbreviation | Expansion | Meaning
P | Positives | The total number of positive outcomes (i.e. the total number of items that actually belong to the positive class).
N | Negatives | The total number of negative items (i.e. the total number of items that actually belong to the negative class).
TP | True Positive | Data items that actually belong to the positive class and are correctly included in the positive class.
FP | False Positive | Data items that actually belong to the negative class and are incorrectly included in the positive class.
TN | True Negative | Data items that actually belong to the negative class and are correctly included in the negative class.
FN | False Negative | Data items that actually belong to the positive class and are incorrectly included in the negative class.
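
The definitions above can be sketched in a few lines of Python. This is a minimal illustration and not WSO2 ML code: the function names, the choice of 1 as the positive class, and the sample labels are all assumptions made for the example.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count TP, FP, TN and FN for a binary problem.

    `positive` is the label treated as the positive class (assumed to be 1 here).
    """
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1  # actually positive, placed in the positive class
            else:
                fp += 1  # actually negative, placed in the positive class
        else:
            if a == positive:
                fn += 1  # actually positive, placed in the negative class
            else:
                tn += 1  # actually negative, placed in the negative class
    return tp, fp, tn, fn


def true_positive_rate(tp, fn):
    """TPR = TP / (TP + FN)."""
    return tp / (tp + fn)


def true_negative_rate(tn, fp):
    """SPC = TN / (TN + FP)."""
    return tn / (tn + fp)


# Hypothetical labels, made up for illustration only.
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

tp, fp, tn, fn = confusion_counts(actual, predicted)
tpr = true_positive_rate(tp, fn)
spc = true_negative_rate(tn, fp)
print(f"TP={tp} FP={fp} TN={tn} FN={fn} TPR={tpr:.2f} SPC={spc:.2f}")
```

With these sample labels, three of the four positive items are found (TPR = 0.75) and four of the six negative items are correctly rejected (SPC of about 0.67).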

The available measures of this type are as follows.

Multi-class classification

Multi-class classification involves classifying the items in a dataset into multiple categories. The available measures of this type are as follows.

Clustering

This involves clustering the items in a dataset.

Model evaluation measures

The following methods are used to evaluate the performance of models in terms of accuracy.

Confusion Matrix

...


Info

This method is available for binary classification and multi-class classification models.

...

The following is an example of a confusion matrix with both correctly classified points as well as incorrectly classified points.
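
As a rough sketch of how such a matrix can be tabulated, the snippet below counts (actual, predicted) pairs for a three-class problem. The labels and data are hypothetical, and the orientation (rows for the actual class, columns for the predicted class) is an assumption; the layout shown in the WSO2 ML UI may differ.

```python
from collections import Counter


def confusion_matrix(actual, predicted, labels):
    """Return a matrix where row i is the actual class and column j the predicted class."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]


# Hypothetical class labels and predictions, for illustration only.
labels = ["cat", "dog", "bird"]
actual = ["cat", "cat", "dog", "dog", "bird", "bird", "bird"]
predicted = ["cat", "dog", "dog", "dog", "bird", "cat", "bird"]

matrix = confusion_matrix(actual, predicted, labels)

# Correctly classified points lie on the diagonal; everything else is misclassified.
correct = sum(matrix[i][i] for i in range(len(labels)))
incorrect = len(actual) - correct

for label, row in zip(labels, matrix):
    print(f"{label:>5}: {row}")
print(f"correct={correct} incorrect={incorrect}")
```

Reading along a row shows how the items of one actual class were distributed across the predicted classes.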


Accuracy

Info

This method is available for binary classification and multi-class classification models.

...

The ROC curve illustrates the performance of a binary classifier model by plotting the TPR (True Positive Rate) against the FPR (False Positive Rate, which equals 1 - SPC) for different threshold values. A completely accurate model would pass through the (0, 1) coordinate (i.e. an FPR of 0 and a TPR of 1) in the upper left corner of the plot. However, this is not achievable in practical scenarios. Therefore, when comparing models, the model with the ROC curve closest to the (0, 1) coordinate can be considered the best performing model in terms of accuracy. The best threshold for this model is the threshold associated with the point on the ROC curve that is closest to the (0, 1) coordinate. You can find the ROC curve for a particular binary classification model under the model summary in the WSO2 ML UI.
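
The threshold selection described above can be sketched as follows. A ROC curve plots TPR against the false positive rate, FPR = FP / (FP + TN), which is 1 - SPC. This is an illustrative sketch only: the scores, the candidate thresholds, and the function names are assumptions, and it does not reflect how WSO2 ML computes the curve internally.

```python
import math


def roc_points(actual, scores, thresholds):
    """Return (threshold, FPR, TPR) for each threshold.

    An item is predicted positive when its score is >= the threshold.
    """
    points = []
    for t in thresholds:
        tp = fp = tn = fn = 0
        for a, s in zip(actual, scores):
            predicted_positive = s >= t
            if a == 1 and predicted_positive:
                tp += 1
            elif a == 1:
                fn += 1
            elif predicted_positive:
                fp += 1
            else:
                tn += 1
        points.append((t, fp / (fp + tn), tp / (tp + fn)))
    return points


def closest_to_ideal(points):
    """Pick the threshold whose (FPR, TPR) point is nearest to (0, 1)."""
    return min(points, key=lambda p: math.hypot(p[1] - 0.0, p[2] - 1.0))[0]


# Hypothetical labels and model scores, for illustration only.
actual = [1, 1, 0, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2, 0.1]
thresholds = [0.25, 0.5, 0.75]

points = roc_points(actual, scores, thresholds)
best = closest_to_ideal(points)
for t, fpr, tpr in points:
    print(f"threshold={t:.2f} FPR={fpr:.2f} TPR={tpr:.2f}")
print(f"best threshold: {best}")
```

Euclidean distance to (0, 1) is used here as one reasonable reading of "closest"; other distance measures would also fit the description.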

AUC

Info

This method is available for binary classification models.

...

You can find the AUC value for a particular model in its ROC curve in the model summary (see the image of the ROC curve in the previous section, which carries the text "ROC Curve (AUC = 0.619)").
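
For intuition about how an AUC value relates to the ROC curve, the area under a set of (FPR, TPR) points can be approximated with the trapezoidal rule. The sketch below is an assumption-laden illustration: the ROC points are made up, and the trapezoidal rule is one common approach rather than a confirmed description of what WSO2 ML does.

```python
def auc_trapezoid(points):
    """Approximate the area under a curve given as (FPR, TPR) points."""
    pts = sorted(points)  # order by FPR, then by TPR
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        # Area of the trapezoid between two neighbouring points.
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical ROC points running from (0, 0) to (1, 1).
roc = [(0.0, 0.0), (0.0, 0.5), (0.5, 0.5), (0.5, 1.0), (1.0, 1.0)]
auc = auc_trapezoid(roc)

# The diagonal corresponds to a model that does no better than random guessing.
diagonal_auc = auc_trapezoid([(0.0, 0.0), (1.0, 1.0)])
print(f"AUC={auc:.3f} diagonal={diagonal_auc:.3f}")
```

An AUC closer to 1 means the curve hugs the upper left corner, while the diagonal yields 0.5.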

Feature Importance

Info

This method is available for binary classification and numerical prediction models.

...

This chart plots the data points according to the correctness of the classification. You can select two dataset features to visualize, and the plot displays the data distribution along with the classification result (correct or incorrect) for each point.

MSE

Info

This method is available for numerical prediction models.

...

MSE (Mean Squared Error) is the average of the squared errors of the prediction. An error is the difference between the actual value and the predicted value. Therefore, a better performing model should have a comparatively lower MSE. This metric is widely used to evaluate the accuracy of numerical prediction models. You can find this metric for numerical prediction models in the model summary as shown in the above image.
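
The definition above translates directly into code. The following is a minimal sketch with made-up values; the function name and the data are assumptions for illustration.

```python
def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical actual and predicted values.
actual = [3.0, 5.0, 2.0, 7.0]
predicted = [2.0, 5.0, 4.0, 8.0]

mse = mean_squared_error(actual, predicted)
print(f"MSE={mse}")
```

Because the errors are squared, a single large error (here the prediction of 4.0 for an actual value of 2.0) contributes more to the MSE than several small ones.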

Residual Plot

Info

This method is available for numerical prediction models.

...