WSO2 Machine Learner provides an interface to configure algorithms to build machine learning models using datasets. These models are used for tasks such as numerical prediction, classification and clustering.

This guide walks you through the basic features of WSO2 ML to get you started. To do so, it analyzes a dataset and trains an ML model to predict the likelihood that a person has breast cancer, given data on a set of other bodily characteristics.

The dataset used in this scenario contains 10 features to provide data on bodily characteristics, and a response variable with the following labels.

Label    Prediction
2        This value indicates that the situation is benign.
4        This value indicates that the situation is malignant.
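
The label table above can be captured as a small lookup when you post-process predictions outside the ML UI. This is a minimal sketch; the names LABEL_MEANINGS and describe_label are hypothetical helpers, not part of WSO2 ML.

```python
# Meaning of each response-variable value, taken from the label table above.
LABEL_MEANINGS = {2: "benign", 4: "malignant"}


def describe_label(label):
    """Return the meaning of a response-variable value.

    Accepts ints, floats, or numeric strings (for example "4" or "4.0")
    and raises ValueError for any value other than 2 or 4.
    """
    value = float(label)
    if value not in LABEL_MEANINGS:
        raise ValueError(f"unexpected label: {label!r}")
    return LABEL_MEANINGS[int(value)]


print(describe_label(2))
print(describe_label("4.0"))
```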

The following steps demonstrate how the Machine Learner can use this dataset to make a prediction.

Before you begin:
  1. Install Oracle Java SE Development Kit (JDK) version 1.6.24 or later, or 1.7.*, and set the JAVA_HOME environment variable.
  2. Download WSO2 ML.
  3. Start WSO2 ML by going to <ML_HOME>/bin on the command line and executing wso2server.bat (for Windows) or wso2server.sh (for Linux).

Step 1: Create a dataset

Follow the procedure below to upload the dataset on which the model is trained.

  1. Log in to the ML UI (default URL: https://127.0.0.1:9443/ml) using admin as both the username and password. The following is displayed on the Home page.
    Image Added 
  2. Click ADD DATASET to open the Create Dataset page. 
  3. In the Data Source field, click Choose File and browse for the <ML_HOME>/samples/tuned/naive-bayes/breastCancerWisconsin.csv file. Enter values for the rest of the parameters as shown below.
    Image Added

    Parameter Name             Value
    Dataset Name               Breast_Cancer_Dataset
    Version                    1.0.0
    Description                Breast cancer data in Wisconsin.
    Source Type                File
    Data Format                CSV
    Column Header Available    Yes
  4. Click CREATE DATASET to save your changes. The Datasets page opens and the dataset you entered is displayed as follows.
    Image Added
    Note that the status of the dataset is Processing.
  5. Click REFRESH. The status of the dataset changes to Processed as shown below.
    Image Added 
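
Before uploading a CSV file with Column Header Available set to Yes, it can help to confirm that the file really has a header row and to see how the response labels are distributed. The following is a minimal sketch in Python. The rows in SAMPLE_CSV are made up for illustration and use a reduced set of columns; they are not copied from breastCancerWisconsin.csv, and summarize is a hypothetical helper.

```python
import csv
import io
from collections import Counter

# Illustrative rows only: a reduced set of columns and made-up values,
# not copied from breastCancerWisconsin.csv.
SAMPLE_CSV = """SampleCodeNumber,ClumpThickness,Mitoses,Class
1000001,5,1,2
1000002,8,3,4
1000003,1,1,2
"""


def summarize(csv_file, response_column="Class"):
    """Return (header, label_counts) for a CSV file object with a header row."""
    reader = csv.DictReader(csv_file)
    counts = Counter(row[response_column] for row in reader)
    return reader.fieldnames, counts


header, counts = summarize(io.StringIO(SAMPLE_CSV))
print(header)
print(dict(counts))
```

To check a real file, pass an open file object instead of the in-memory sample, for example `summarize(open(path, newline=""))`, where `path` points to your copy of the dataset.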

Step 2: Create a project

Follow the procedure below to create a project for the dataset uploaded.

  1. Log into the ML Management Console if you are not already logged in.
  2. Click ADD PROJECT.
    Image Added

    If you are already logged in, you can click CREATE PROJECT in the DATASETS page as shown below. 
     Image Added
  3. In the Create Project page, enter information as shown below.
    Image Added

    Parameter Name    Value
    Project Name      Breast_Cancer_data_analytics_project
    Description       This project performs predictive analysis on the breast cancer data in Wisconsin.
    Dataset           Breast_Cancer_Dataset
  4. Click CREATE PROJECT to save the information. The project is displayed in the Projects page as follows.
    Image Added 

Step 3: Create an analysis and train a model

Follow the procedure below to analyse the Breast_Cancer_Dataset dataset, and then train a model based on that analysis.

  1. Log into the ML UI if you are not already logged in. 
  2. Click the You have X projects link as shown below.
    Image Added
  3. Click on the Breast_Cancer_data_analytics_project project to expand it.
  4. Enter breast_cancer_analysis_1 as the analysis name and click CREATE ANALYSIS. The following page appears displaying the summary statistics.
    Image Added 
  5. Click Next without making any changes to the summary statistics.
    Image Added
    The Explore view opens. Note that Parallel Sets and Trellis Chart visualisations are enabled, and Scatter Plot and Cluster Diagram visualisations are disabled. This is determined by the feature types of the dataset. 
  6. Click Next. The Algorithms view is displayed. Enter values as shown below.
    Image Added

    Parameter              Value
    Algorithm name         LOGISTIC REGRESSION L_BFGS
    Response variable      Class
    Train data fraction    0.7
  7. Click Next. The Parameters view appears. Enter L2 as the regularization type (reg type).
    Image Added 
  8. Click Next. The Model view appears. Select Breast_Cancer_Dataset-1.0.0 as the dataset version.
    Image Added 
  9. Click RUN to train the model.
    Image Added
    The trained model is displayed as shown below.
    Image Added
    Note that the status is In Progress.
  10. Click REFRESH. The status of the analysis changes as shown below.
    Image Added 
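
To make the settings chosen in this step more concrete, the sketch below shows what a train data fraction of 0.7 and an L2 regularization type mean for a logistic regression model. It is an illustration only and makes several assumptions: it uses plain batch gradient descent rather than the L-BFGS optimizer selected in the UI, it runs on made-up single-feature data, it uses targets 0 and 1 rather than the labels 2 and 4, and all function names and hyperparameter values are hypothetical. It does not reproduce how WSO2 ML trains the model internally.

```python
import math
import random


def split(rows, train_fraction=0.7, seed=0):
    """Shuffle the rows and split them into (train, test) by train_fraction."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train(rows, l2=0.01, learning_rate=1.0, epochs=2000):
    """Fit weights [bias, w1, ...] by batch gradient descent with an L2 penalty."""
    weights = [0.0] * (len(rows[0][0]) + 1)
    for _ in range(epochs):
        grads = [0.0] * len(weights)
        for features, target in rows:
            x = [1.0] + list(features)
            error = sigmoid(sum(w * v for w, v in zip(weights, x))) - target
            for i, v in enumerate(x):
                grads[i] += error * v
        for i in range(len(weights)):
            # The L2 penalty shrinks feature weights; the bias is left unpenalized.
            penalty = l2 * weights[i] if i > 0 else 0.0
            weights[i] -= learning_rate * (grads[i] / len(rows) + penalty)
    return weights


def predict(weights, features):
    """Return 1 if the modelled probability is at least 0.5, otherwise 0."""
    x = [1.0] + list(features)
    return 1 if sigmoid(sum(w * v for w, v in zip(weights, x))) >= 0.5 else 0


# Made-up data: one feature scaled to (0, 1]; target is 1 when the raw value exceeds 5.
rows = [((v / 10.0,), 1 if v > 5 else 0) for v in range(1, 11)] * 2

train_rows, test_rows = split(rows, train_fraction=0.7)
weights = train(train_rows)
correct = sum(predict(weights, f) == t for f, t in test_rows)
print(f"train={len(train_rows)} test={len(test_rows)} test_accuracy={correct / len(test_rows):.2f}")
```

With 20 rows and a train data fraction of 0.7, 14 rows are used for training and the remaining 6 are held back to evaluate the model.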

Step 4: Predict using the model

Follow the procedure below to make a prediction using the model you trained.

  1. Log into the ML UI if you are not already logged in. 
  2. Click the You have X projects link as shown below to open the Projects window.
    Image Added 
  3. Click MODELS for the breast_cancer_analysis_1 analysis.
    Image Added
  4. Click Predict on the model displayed.
    Image Added 
  5. Enter values in the Predict page as shown below.
    Image Added

    Parameter Name              Value
    Prediction Source           Feature values
    SampleCodeNumber            1018561
    ClumpThickness              2
    UniformityOfCellSize        1
    UniformityOfCellShape       1
    MarginalAdhesion            1
    SingleEpithelialCellSize    2
    BareNuclei                  1
    BlandChromati               1
    NormalNucleoli              1
    Mitoses                     5
  6. Click Predict. The prediction is displayed as follows.
    Image Added
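
If you want to keep the feature values from this step in a reusable form, for example to enter the same input again later, the sketch below stores them in the order used in the Predict page and renders them as a single comma-separated row. The feature names are spelled exactly as they appear in this guide. The helpers to_row and to_csv_line are hypothetical and do not call any WSO2 ML API.

```python
# Feature values from the Predict page in this guide, in the order listed there.
# Names are spelled exactly as they appear in the guide.
FEATURE_VALUES = {
    "SampleCodeNumber": 1018561,
    "ClumpThickness": 2,
    "UniformityOfCellSize": 1,
    "UniformityOfCellShape": 1,
    "MarginalAdhesion": 1,
    "SingleEpithelialCellSize": 2,
    "BareNuclei": 1,
    "BlandChromati": 1,
    "NormalNucleoli": 1,
    "Mitoses": 5,
}


def to_row(values, column_order):
    """Return the values as a list in column_order, failing on any missing feature."""
    missing = [name for name in column_order if name not in values]
    if missing:
        raise KeyError(f"missing feature values: {missing}")
    return [values[name] for name in column_order]


def to_csv_line(row):
    """Render one row of feature values as a comma-separated line."""
    return ",".join(str(value) for value in row)


row = to_row(FEATURE_VALUES, list(FEATURE_VALUES))
print(to_csv_line(row))
```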