...
Follow the procedure below to upload the dataset from which the training model is created.
- Log into the ML UI using admin as both the username and password. The following is displayed in the Home page.
- Click ADD DATASET to open the Create Dataset page.
In the Data Source field, click Choose File and browse for the <ML_HOME>/samples/tuned/naive-bayes/breastCancerWisconsin.csv file. Enter values for the rest of the parameters as shown below.

| Parameter Name | Value |
| --- | --- |
| Dataset Name | Breast_Cancer_Dataset |
| Version | 1.0.0 |
| Description | Breast cancer data in Wisconsin. |
| Source Type | File |
| Data Format | CSV |
| Column Header Available | Yes |

- Click CREATE DATASET to save your changes. The Datasets page opens and the dataset you entered is displayed as follows.
Note that the status of the dataset is Processing.
- Click REFRESH. The status of the dataset changes to Processed as shown below.
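As an optional aside, the upload settings above (Data Format CSV, Column Header Available Yes) assume that the file starts with a header row and that every row has the same number of fields. The following is a minimal sketch of such a check in Python. It runs against an in-memory sample rather than the real file. The column names are copied from the Predict page later in this guide plus the Class response variable, so the actual headers in breastCancerWisconsin.csv may differ, and the Class value in the sample row is illustrative only. The `check_csv` helper is hypothetical and not part of the product.

```python
import csv
import io

# Assumption: column names as they appear on the Predict page of this guide,
# plus the response variable "Class". The data row reuses the feature values
# from the prediction step; the trailing Class value is illustrative only.
SAMPLE_CSV = (
    "SampleCodeNumber,ClumpThickness,UniformityOfCellSize,UniformityOfCellShape,"
    "MarginalAdhesion,SingleEpithelialCellSize,BareNuclei,BlandChromati,"
    "NormalNucleoli,Mitoses,Class\n"
    "1018561,2,1,1,1,2,1,1,1,5,2\n"
)


def check_csv(text):
    """Return (header, row_count); raise ValueError for empty or ragged input."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader, None)
    if header is None:
        raise ValueError("no header row found")
    row_count = 0
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        if len(row) != len(header):
            raise ValueError(
                f"line {line_no}: expected {len(header)} fields, got {len(row)}"
            )
        row_count += 1
    return header, row_count


header, row_count = check_csv(SAMPLE_CSV)
print(len(header), "columns,", row_count, "data row(s)")
```

To check a real file, read its contents into a string and pass that to `check_csv` in place of `SAMPLE_CSV`.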
...
- Log into the ML Management Console if you are not already logged in.
- Click ADD PROJECT.
If you are already logged in, you can click CREATE PROJECT in the DATASETS page as shown below.
In the Create Project page, enter information as shown below.

| Parameter Name | Description |
| --- | --- |
| Project Name | Breast_Cancer_data_analytics_project |
| Description | This project performs predictive analysis on the breast cancer data in Wisconsin. |
| Dataset | Breast_Cancer_Dataset |

- Click Create Project to save the information. The project is displayed in the Projects page as follows.
...
- Log into the ML UI if you are not already logged in.
- Click the You have X projects link as shown below.
- Click on the Breast_Cancer_data_analytics_project project to expand it.
- Enter breast_cancer_analysis_1 as the analysis name and click CREATE ANALYSIS. The following page appears, displaying the summary statistics.
- Click Next without making any changes to the summary statistics.
The Explore view opens. Note that the Parallel Sets and Trellis Chart visualisations are enabled, and the Scatter Plot and Cluster Diagram visualisations are disabled. This is determined by the feature types of the dataset. Select and clear the checkboxes for categorical features as follows.
Click Next. The Algorithms view is displayed. Enter values as shown below.

| Parameter | Value |
| --- | --- |
| Algorithm name | LOGISTIC REGRESSION L_BFGS |
| Response variable | Class |
| Train data fraction | 0.7 |

- Click Next. The Parameters view appears. Enter L2 as the reg type.
- Click Next. The Model view appears. Select Breast_Cancer_Dataset-1.0.0 as the dataset version.
- Click RUN to train the model.
The training model is created and displayed as shown below.
Note that the status is In Progress.
- Click REFRESH. The status of the analysis changes as shown below.
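As an aside on two of the settings entered in this step, the sketch below illustrates what a train data fraction of 0.7 and an L2 reg type generally mean. It is not the product's implementation. The `split_rows` and `l2_penalty` helpers are hypothetical, the shuffling strategy is an assumption, and the penalty is written in one common form (half the regularization parameter times the sum of squared weights), which may differ from how the product scales it.

```python
import random


def split_rows(rows, train_fraction, seed=0):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    shuffled = list(rows)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def l2_penalty(weights, reg_parameter):
    """One common form of the L2 penalty: (reg_parameter / 2) * sum(w^2)."""
    return 0.5 * reg_parameter * sum(w * w for w in weights)


# With a fraction of 0.7, roughly 70% of the rows are used for training
# and the remainder is held back for evaluating the model.
train, test = split_rows(range(10), 0.7)
print(len(train), "training rows,", len(test), "test rows")

# Larger weights produce a larger penalty, which discourages overfitting.
print(l2_penalty([3.0, 4.0], 0.1))
```

The penalty is added to the training loss, so the optimizer (here, L-BFGS) trades off fitting the training rows against keeping the weights small.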
Step 4: Predict using the model
...
- Log into the ML UI if you are not already logged in.
- Click the You have X projects link as shown below to open the Projects window.
- Click MODELS for the breast_cancer_analysis_1 analysis.
- Click Predict on the model displayed.
Enter values in the Predict page as shown below.

| Parameter Name | Value |
| --- | --- |
| Prediction Source | Feature values |
| SampleCodeNumber | 1018561 |
| ClumpThickness | 2 |
| UniformityOfCellSize | 1 |
| UniformityOfCellShape | 1 |
| MarginalAdhesion | 1 |
| SingleEpithelialCellSize | 2 |
| BareNuclei | 1 |
| BlandChromati | 1 |
| NormalNucleoli | 1 |
| Mitoses | 5 |

- Click Predict. The prediction is displayed as follows.
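For readers curious about what a logistic regression model does with these feature values, the following is a minimal sketch, not the product's prediction code. The feature values are the ones entered on the Predict page. The weights and intercept are hypothetical numbers chosen purely for illustration, since the trained model's coefficients are not shown in this guide, and the helper names `sigmoid` and `predict_probability` are also assumptions.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_probability(features, weights, intercept):
    """Return the logistic model's probability for one feature vector."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    z = intercept + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


# Feature values from the Predict page, in the order of the table:
# SampleCodeNumber, ClumpThickness, ..., Mitoses.
features = [1018561, 2, 1, 1, 1, 2, 1, 1, 1, 5]

# Hypothetical coefficients for illustration only. The identifier column
# is given a weight of 0.0 here so that it does not affect the score.
weights = [0.0, 0.4, 0.3, 0.3, 0.2, 0.1, 0.4, 0.3, 0.2, 0.1]
intercept = -5.0

probability = predict_probability(features, weights, intercept)
print(f"probability of the positive class: {probability:.3f}")
```

How the product maps such a probability to the displayed Class value is not covered by this sketch.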