...
...
Available measure types
The model evaluation methods in WSO2 ML can be categorized into four types as follows.
Numerical predictions
These methods involve making a numerical prediction based on the dataset analysed. The available measures of this type are as follows.
Binary classification
Binary classification involves classifying the data items in a dataset into two categories.
Binary classification metrics
Binary classification metrics refer to the following two formulas used to calculate the reliability of a binary classification model.
Name | Formula |
---|---|
True Positive Rate (Sensitivity) | TPR = TP / P = TP / (TP + FN) |
True Negative Rate (Specificity) | SPC = TN / N = TN / (TN + FP) |
Terminology for binary classification metrics
The following table explains the abbreviations used in the above formulas.
Abbreviation | Expansion | Meaning |
---|---|---|
P | Positives | The total number of positive outcomes (i.e. the total number of items that actually belong to the positive class). |
N | Negatives | The total number of negative items (i.e. the total number of items that actually belong to the negative class). |
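TP | True Positive | TP data items actually belong to the positive class and are correctly classified as positive by the model. |
FP | False Positive | FP data items actually belong to the negative class but are incorrectly classified as positive by the model. |
TN | True Negative | TN data items actually belong to the negative class and are correctly classified as negative by the model. |
FN | False Negative | FN data items actually belong to the positive class but are incorrectly classified as negative by the model. |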
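The two formulas above can be sketched in a few lines of Python. This is only an illustration, not how WSO2 ML computes the values internally. The function name `binary_rates`, the 0/1 label encoding, and the sample data are assumptions made for this example.

```python
def binary_rates(actual, predicted, positive=1):
    """Return (TPR, SPC) for two equal-length sequences of class labels.

    Hypothetical helper: assumes at least one actual positive and one
    actual negative item, otherwise a division by zero occurs.
    """
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if a == positive and p == positive:
            tp += 1  # actually positive, predicted positive
        elif a == positive:
            fn += 1  # actually positive, predicted negative
        elif p == positive:
            fp += 1  # actually negative, predicted positive
        else:
            tn += 1  # actually negative, predicted negative
    tpr = tp / (tp + fn)  # TPR = TP / P
    spc = tn / (tn + fp)  # SPC = TN / N
    return tpr, spc


# Made-up labels: 4 actual positives and 6 actual negatives.
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
tpr, spc = binary_rates(actual, predicted)
print(f"TPR = {tpr:.3f}, SPC = {spc:.3f}")
```

In this sample, 3 of the 4 positive items and 4 of the 6 negative items are classified correctly, so the TPR is 3/4 and the SPC is 4/6.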
The available measures of this type are as follows.
Multi-class classification
Multi-class classification involves classifying the items in a dataset into multiple categories. The available measures of this type are as follows.
Clustering
This involves clustering the items in a dataset.
Model evaluation measures
The following methods are used to evaluate the performance of models in terms of accuracy.
...
Confusion Matrix
Info |
---|
This method is available for binary classification and multi-class classification models. |
...
The following is an example of a confusion matrix that contains both correctly classified and incorrectly classified points.
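As a rough sketch of how such a matrix is tallied, the Python snippet below counts (actual, predicted) label pairs. The function name `confusion_matrix`, the class labels, and the sample data are assumptions made for illustration and do not come from WSO2 ML. Rows correspond to actual classes and columns to predicted classes, which is one common layout.

```python
from collections import Counter


def confusion_matrix(actual, predicted, labels):
    """Return a matrix where cell [i][j] counts the items whose actual
    class is labels[i] and whose predicted class is labels[j]."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]


# Made-up three-class example.
labels = ["cat", "dog", "bird"]
actual = ["cat", "cat", "dog", "dog", "bird", "bird", "bird"]
predicted = ["cat", "dog", "dog", "dog", "bird", "cat", "bird"]

matrix = confusion_matrix(actual, predicted, labels)
for label, row in zip(labels, matrix):
    print(label, row)

# Correctly classified points lie on the diagonal, so the overall
# accuracy is the diagonal sum divided by the number of items.
accuracy = sum(matrix[i][i] for i in range(len(labels))) / len(actual)
print(f"accuracy = {accuracy:.3f}")
```

Off-diagonal cells hold the incorrectly classified points. In this sample, one actual "cat" is predicted as "dog" and one actual "bird" is predicted as "cat".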
Accuracy
Info |
---|
This method is available for binary classification and multi-class classification models. |
...
This illustrates the performance of a binary classification model by plotting the TPR (True Positive Rate) against the FPR (False Positive Rate, which equals 1 - SPC) for different threshold values. A completely accurate model would pass through the 0, 1 coordinate (i.e. an FPR of 0 and a TPR of 1) in the upper left corner of the plot. However, this is not achievable in practical scenarios. Therefore, when comparing models, the model with the ROC curve closest to the 0, 1 coordinate can be considered the best performing model in terms of accuracy. The best threshold for this model is the threshold associated with the point that is closest to the 0, 1 coordinate on the ROC curve. You can find the ROC curve for a particular binary classification model under the model summary in the WSO2 ML UI.
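The threshold selection described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the WSO2 ML implementation. The helper names `roc_points` and `best_threshold`, the 0/1 label encoding, the rule that an item is classified as positive when its score is at or above the threshold, and the sample scores are all hypothetical.

```python
import math


def roc_points(actual, scores):
    """Return (threshold, FPR, TPR) triples, one per distinct score.

    Assumes labels are 0 or 1 and that both classes are present.
    """
    positives = sum(actual)
    negatives = len(actual) - positives
    points = []
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for a, s in zip(actual, scores) if s >= threshold and a == 1)
        fp = sum(1 for a, s in zip(actual, scores) if s >= threshold and a == 0)
        points.append((threshold, fp / negatives, tp / positives))
    return points


def best_threshold(points):
    """Pick the threshold whose ROC point is closest to (FPR=0, TPR=1)."""
    return min(points, key=lambda p: math.hypot(p[1], 1.0 - p[2]))[0]


# Made-up model scores for 4 positive and 4 negative items.
actual = [1, 1, 0, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.1]

points = roc_points(actual, scores)
for threshold, fpr, tpr in points:
    print(f"threshold={threshold:.2f}  FPR={fpr:.2f}  TPR={tpr:.2f}")

threshold = best_threshold(points)
print("closest to the 0, 1 coordinate:", threshold)
```

Euclidean distance is used here as the measure of closeness. Other criteria exist, and the text above does not specify which one the WSO2 ML UI applies.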
AUC
Info |
---|
This method is available for binary classification models. |
...
You can find the AUC value for a particular model in its ROC curve in the model summary (see the image of the ROC curve in the previous section with the text ROC Curve (AUC = 0.619)).
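AUC is the area under the ROC curve. How WSO2 ML computes the value is not described here, so the sketch below simply assumes the trapezoidal rule applied to a list of (FPR, TPR) points. The function name `auc` and the sample curve are hypothetical.

```python
def auc(fpr_tpr_points):
    """Approximate the area under a ROC curve with the trapezoidal rule.

    The end points (0, 0) and (1, 1) are added if missing, and the points
    are sorted by FPR and then TPR, which suits a monotone ROC curve.
    """
    pts = sorted(set(fpr_tpr_points) | {(0.0, 0.0), (1.0, 1.0)})
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0  # area of one trapezoid
    return area


# Made-up ROC points as (FPR, TPR) pairs.
curve = [
    (0.0, 0.25), (0.0, 0.5), (0.25, 0.5), (0.25, 0.75),
    (0.25, 1.0), (0.5, 1.0), (0.75, 1.0), (1.0, 1.0),
]
print(f"AUC = {auc(curve):.3f}")
```

A curve that follows the diagonal from the 0, 0 coordinate to the 1, 1 coordinate gives an area of 0.5 with this function, and a curve that passes through the 0, 1 coordinate gives an area of 1.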
Feature Importance
Info |
---|
This method is available for binary classification and numerical prediction models. |
...
This chart plots the data points according to the correctness of the classification. You can select two dataset features to be visualized, and the plot displays the data distribution with the classification accuracy (correct/incorrect) for each point.
MSE
Info |
---|
This method is available for numerical prediction models. |
...
MSE (Mean Squared Error) is the average of the squared errors of the prediction. An error is the difference between the actual value and the predicted value. Therefore, a better performing model should have a comparatively lower MSE. This metric is widely used to evaluate the accuracy of numerical prediction models. You can find this metric for numerical prediction models in the model summary as shown in the above image.
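The definition above can be written out directly in Python. This is a minimal sketch. The function name `mean_squared_error` and the two sets of made-up predictions are assumptions used only to show that the model with the smaller errors gets the lower MSE.

```python
def mean_squared_error(actual, predicted):
    """Return the average of the squared differences between the actual
    values and the predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Made-up actual values and the predictions of two hypothetical models.
actual = [3.0, 5.0, 2.5, 7.0]
model_a = [2.5, 5.0, 3.0, 8.0]
model_b = [4.0, 4.0, 1.0, 9.0]

mse_a = mean_squared_error(actual, model_a)
mse_b = mean_squared_error(actual, model_b)
print(f"model A: MSE = {mse_a}")
print(f"model B: MSE = {mse_b}")
```

Because the errors are squared, a single large error raises the MSE more than several small ones, which is why model B scores noticeably worse here.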
Residual Plot
Info |
---|
This method is available for numerical prediction models. |
...