Generating a Model Using the K-Means Algorithm

Introduction

This sample demonstrates how to generate a model from a data set using the k-means algorithm. The sample divides the data set into two sets, one for training the model and one for testing it.

The k-means algorithm generates a PMML-supported model.
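
The division of a data set into training and testing sets can be illustrated with a minimal shell sketch. WSO2 ML performs its own split internally, and this page does not describe how, so the sequential split, the function name split_dataset, and its parameters are all assumptions made for illustration only.

```shell
#!/bin/sh
# Illustrative only: split a CSV file (with a header row) into a training file
# and a testing file. The sequential split and all names here are assumptions;
# they do not describe how WSO2 ML splits the data internally.
split_dataset() {
  input="$1"          # path to the source CSV file
  train_percent="$2"  # integer percentage of rows used for training, e.g. 70
  train_file="$3"     # output path for the training set
  test_file="$4"      # output path for the testing set

  # Count data rows, excluding the header line.
  total_rows=$(($(wc -l < "$input") - 1))
  train_rows=$((total_rows * train_percent / 100))

  awk -v k="$train_rows" -v train_out="$train_file" -v test_out="$test_file" '
    NR == 1     { print > train_out; print > test_out; next }  # header goes to both
    NR - 1 <= k { print > train_out; next }                    # first k rows: training
                { print > test_out }                           # remaining rows: testing
  ' "$input"
}
```

For example, split_dataset data.csv 70 train.csv test.csv would place the first 70% of the rows in train.csv and the rest in test.csv, where data.csv is a hypothetical input file.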

Prerequisites

Follow the steps below to set up the prerequisites before you start.

Download WSO2 Machine Learner, and start the server. For information on setting up and running WSO2 ML, see Getting Started.
Download and install jq (CLI JSON processor). For instructions, see jq Documentation.
If you are using Mac OS X, download and install GNU stream editor (sed). For instructions, see GNU sed Documentation.
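
Before running the sample, it may help to confirm that the command-line prerequisites listed above are in place. The following is a minimal sketch. The helper names check_tool and is_gnu_sed are hypothetical, and the check relies on GNU sed accepting the --version option, which the BSD sed shipped with Mac OS X does not.

```shell
#!/bin/sh
# Report whether a command is available on the PATH.
# Returns a non-zero status when the command is missing.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: NOT found"
    return 1
  fi
}

# GNU sed identifies itself as "(GNU sed)" in its --version output;
# BSD sed rejects the option and prints nothing on standard output.
is_gnu_sed() {
  sed --version 2>/dev/null | grep -q "(GNU sed)"
}

check_tool jq || true
if is_gnu_sed; then
  echo "sed: GNU sed"
else
  echo "sed: not GNU sed (install GNU sed if you are on Mac OS X)"
fi
```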

Executing the sample

Follow the steps below to execute the sample.

Navigate to the <ML_HOME>/samples/default/k-means/ directory using the CLI.
Run the following command to execute the sample: ./model-generation.sh
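
The two steps above can be combined into one small wrapper, sketched below. It assumes that the WSO2 ML installation directory is passed as the first argument or is available in an ML_HOME environment variable. Both the function name run_kmeans_sample and the use of an environment variable are assumptions and are not part of the sample itself.

```shell
#!/bin/sh
# Run the k-means sample from its own directory.
# Assumption: the first argument (or the ML_HOME variable) holds the
# WSO2 ML installation directory.
run_kmeans_sample() {
  ml_home="${1:-${ML_HOME:-}}"
  sample_dir="$ml_home/samples/default/k-means"

  if [ ! -f "$sample_dir/model-generation.sh" ]; then
    echo "model-generation.sh not found in $sample_dir" >&2
    return 1
  fi

  # Use a subshell so the caller's working directory is left unchanged.
  (cd "$sample_dir" && ./model-generation.sh)
}
```

Calling run_kmeans_sample with the installation path then behaves like the manual steps, and it reports an error instead of failing silently when the path is wrong.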

Analyzing the output

Once the sample is successfully executed, you can view the summary and the prediction of the model as described below.

By default, the sample generates the model in the <ML_HOME>/models/ directory of your machine. The name of the generated file includes the date and time when it was generated, for example: wso2-ml-kmeans-sample-analysis.Model.2015-09-03_12-02-05
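
If the sample has been executed several times, the models directory holds several such files. The sketch below prints the most recent one. It relies on the timestamp suffix sorting chronologically as plain text, which holds for the date and time format shown above. The function name latest_kmeans_model is hypothetical.

```shell
#!/bin/sh
# Print the most recently generated k-means sample model file in a directory.
# The timestamp suffix (YYYY-MM-DD_HH-MM-SS) sorts chronologically as text,
# so the last entry after sorting is the newest model.
latest_kmeans_model() {
  models_dir="$1"
  for f in "$models_dir"/wso2-ml-kmeans-sample-analysis.Model.*; do
    # Skip the unexpanded pattern when no file matches.
    [ -e "$f" ] && printf '%s\n' "$f"
  done | sort | tail -n 1
}
```

For example, latest_kmeans_model "$ML_HOME/models" prints the path of the newest model file, assuming ML_HOME is set to the installation directory, and prints nothing when no model has been generated yet.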

Viewing the model

You can view the summary of the built model using the ML UI as follows.

Log in to the ML UI from your Web browser using admin/admin credentials and the following URL: https://<ML_HOST>:<ML_PORT>/ml
Click the Projects button.
Click Models to view the models of the wso2-ml-kmeans-sample-analysis analysis that was created when the sample was executed.
Click View on the model.

The summary of the model is displayed.

Viewing the model prediction

The sample runs the generated model on the <ML_HOME>/samples/default/k-means/prediction-test data set and prints the value [1] as the prediction result in the CLI logs.