Installing Machine Learner Features

Introduction

As explained in Feature Management, each WSO2 product is a collection of reusable software units called features. A single feature is a list of components and/or other features. This section describes how to install WSO2 Machine Learner features in WSO2 CEP.

Installing required features in WSO2 CEP

Follow the steps below to install the required features in WSO2 CEP.

Log in to the WSO2 CEP management console using the admin/admin credentials and the following URL: https://<CEP_HOST>:<CEP_PORT>/carbon/
Click Configure, and then click Features.
Click Repository Management, and then click Add Repository.
Enter the details as shown below to add the Carbon P2 repository.
Click Add.
Click the Available Features tab, and select the repository added in the previous step.
Deselect the Group features by category option.
Click Find Features. It can take a while to list out all the available features in the feature repository. Once listed, select the following features.
- Machine Learner Core
- Machine Learner Commons
- Machine Learner Database Service
- ML Siddhi Extension
If you cannot see any of these features, retry with one of the following suggestions:
- Try adding a more recent P2 repository. The repository you added could be deprecated.
- Check the Installed Features tab to see whether the feature is already installed.
Once the features are selected, click Install to proceed with the installation.
Click Next, and then select I accept the terms of the license agreement.
Once the installation is complete, click Restart Now, and then click Yes in the message that appears.

When installing ML features in an Apache Storm cluster, it is recommended to use a pom file instead of the Management Console. The Management Console only allows the features to be installed in the default profile, not in all the profiles of the CEP nodes. As a result, an exception occurs when events are sent to nodes that do not use the default profile. For more information, see Installing Features using pom Files.

When you run WSO2 CEP in distributed mode, uncomment the following dependencies in the <CEP_HOME>/samples/utils/storm-dependencies-jar/pom.xml file, as shown below.

<!-- Uncomment the following dependency section if you want to include Siddhi ML extension as part of
    Storm dependencies -->

        <dependency>
            <groupId>org.wso2.carbon.ml</groupId>
            <artifactId>org.wso2.carbon.ml.siddhi.extension</artifactId>
            <version>${carbon.ml.version}</version>
        </dependency>
        <dependency>
            <groupId>org.wso2.carbon.ml</groupId>
            <artifactId>org.wso2.carbon.ml.core</artifactId>
            <version>${carbon.ml.version}</version>
        </dependency>
        <dependency>
            <groupId>org.wso2.carbon.ml</groupId>
            <artifactId>org.wso2.carbon.ml.database</artifactId>
            <version>${carbon.ml.version}</version>
        </dependency>
        <dependency>
            <groupId>org.wso2.carbon.ml</groupId>
            <artifactId>org.wso2.carbon.ml.commons</artifactId>
            <version>${carbon.ml.version}</version>
        </dependency>
        <dependency>
            <groupId>org.wso2.carbon.metrics</groupId>
            <artifactId>org.wso2.carbon.metrics.manager</artifactId>
            <version>${carbon.metrics.version}</version>
        </dependency>

        <!-- Dependencies for Spark -->
        <dependency>
            <groupId>org.wso2.orbit.org.apache.spark</groupId>
            <artifactId>spark-core_2.10</artifactId>
            <version>${spark.core.version}</version>
        </dependency>
        <dependency>
            <groupId>org.wso2.orbit.org.apache.spark</groupId>
            <artifactId>spark-sql_2.10</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>org.wso2.orbit.org.apache.spark</groupId>
            <artifactId>spark-mllib_2.10</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>org.wso2.orbit.org.apache.spark</groupId>
            <artifactId>spark-streaming_2.10</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>org.wso2.orbit.org.scalanlp</groupId>
            <artifactId>breeze_2.10</artifactId>
            <version>${breeze.version}</version>
        </dependency>
        <dependency>
            <groupId>org.wso2.orbit.jblas</groupId>
            <artifactId>jblas</artifactId>
            <version>${jblas.version}</version>
        </dependency>
        <dependency>
            <groupId>org.wso2.orbit.spire-math</groupId>
            <artifactId>spire_2.10</artifactId>
            <version>${spire.version}</version>
        </dependency>
        <dependency>
            <groupId>org.wso2.orbit.org.apache.hadoop</groupId>
            <artifactId>hadoop-client</artifactId>
            <version>${hadoop.client.version}</version>
        </dependency>
        <dependency>
            <groupId>org.wso2.uncommons.maths</groupId>
            <artifactId>uncommons-maths</artifactId>
            <version>${uncommons.maths.version}</version>
        </dependency>
        <dependency>
            <groupId>org.wso2.json4s</groupId>
            <artifactId>json4s-jackson_2.10</artifactId>
            <version>${json4s.jackson.version}</version>
        </dependency>
        <dependency>
            <groupId>org.slf4j</groupId>
            <artifactId>slf4j-api</artifactId>
            <version>${slf4j.version}</version>
        </dependency>
        <dependency>
            <groupId>org.wso2.orbit.github.fommil.netlib</groupId>
            <artifactId>core</artifactId>
            <version>${fommil.netlib.version}</version>
        </dependency>
        <dependency>
            <groupId>org.wso2.orbit.sourceforge.f2j</groupId>
            <artifactId>arpack_combined</artifactId>
            <version>${arpack.combined.version}</version>
        </dependency>
        <dependency>
            <groupId>org.scala-lang</groupId>
            <artifactId>scala-library</artifactId>
            <version>${scala.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.commons</groupId>
            <artifactId>commons-csv</artifactId>
            <version>${commons.csv.version}</version>
        </dependency>

<!-- ML extension dependencies -->

         <include>org.wso2.orbit.org.apache.spark:spark-core_2.10
         </include>
         <include>org.wso2.orbit.org.apache.spark:spark-sql_2.10
         </include>
         <include>org.wso2.orbit.org.apache.spark:spark-mllib_2.10
         </include>
         <include>org.wso2.orbit.org.apache.spark:spark-streaming_2.10
         </include>
         <include>org.wso2.orbit.org.scalanlp:breeze_2.10</include>
         <include>org.wso2.orbit.jblas:jblas</include>
         <include>org.wso2.orbit.spire-math:spire_2.10</include>
         <include>org.wso2.orbit.org.apache.hadoop:hadoop-client
         </include>
         <include>org.wso2.uncommons.maths:uncommons-maths</include>
         <include>org.wso2.json4s:json4s-jackson_2.10</include>
         <include>org.slf4j:slf4j-api</include>
         <include>org.wso2.orbit.github.fommil.netlib:core</include>
         <include>org.wso2.orbit.sourceforge.f2j:arpack_combined
         </include>
         <include>org.scala-lang:scala-library</include>
         <include>org.apache.commons:commons-csv</include>
         <include>org.wso2.carbon.ml:org.wso2.carbon.ml.core</include>
         <include>org.wso2.carbon.ml:org.wso2.carbon.ml.database
         </include>
         <include>org.wso2.carbon.ml:org.wso2.carbon.ml.commons</include>
         <include>
             org.wso2.carbon.ml:org.wso2.carbon.ml.siddhi.extension
         </include>
         <include>
             org.wso2.carbon.metrics:org.wso2.carbon.metrics.manager
         </include>
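
After editing the pom.xml file, it can be useful to confirm that the ML dependencies are actually active and not still commented out. The following is a minimal sketch of such a check, not part of the WSO2 tooling. The function name `missing_dependencies` is hypothetical, and the artifact IDs are copied from the dependency list above. It relies on the fact that the standard Python XML parser skips comments, so a dependency that is still commented out is reported as missing.

```python
import xml.etree.ElementTree as ET

# Artifact IDs copied from the ML and metrics entries in the dependency list above.
REQUIRED = {
    "org.wso2.carbon.ml.siddhi.extension",
    "org.wso2.carbon.ml.core",
    "org.wso2.carbon.ml.database",
    "org.wso2.carbon.ml.commons",
    "org.wso2.carbon.metrics.manager",
}


def local_name(tag):
    """Strip an XML namespace such as '{http://maven.apache.org/POM/4.0.0}'."""
    return tag.rsplit("}", 1)[-1]


def missing_dependencies(pom_text, required=REQUIRED):
    """Return the required artifact IDs that have no active <dependency> entry.

    Hypothetical helper: commented-out dependencies are ignored because the
    default parser drops XML comments.
    """
    root = ET.fromstring(pom_text)
    found = set()
    for elem in root.iter():
        if local_name(elem.tag) != "dependency":
            continue
        for child in elem:
            if local_name(child.tag) == "artifactId" and child.text:
                found.add(child.text.strip())
    return set(required) - found
```

To use it, read the contents of the pom.xml file into a string and pass it to `missing_dependencies`. An empty result means that all five artifacts are declared as active dependencies. The check does not cover the Spark dependencies or the include entries.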

Run the following command from the <CEP_HOME>/samples/utils/storm-dependencies-jar directory.
mvn clean install
This generates a JAR file in the target directory.
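
To confirm that the generated JAR file contains the ML classes, its entries can be listed. The following is a minimal sketch under two assumptions: the helper name `ml_entries` is hypothetical, and the class packages are assumed to follow the artifact names (that is, to start with org/wso2/carbon/ml/). The name of the JAR file in the target directory is not stated here, so pass the actual path.

```python
import zipfile


def ml_entries(jar_path, prefix="org/wso2/carbon/ml/"):
    """List the entries in a JAR file whose path starts with the given prefix.

    Hypothetical helper. The default prefix is an assumption based on the
    artifact names; adjust it if the packages are named differently.
    """
    with zipfile.ZipFile(jar_path) as jar:
        return [name for name in jar.namelist() if name.startswith(prefix)]
```

An empty list suggests that the ML dependencies were not packaged, for example because the dependency or include entries are still commented out.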

Queries are collectively processed by all the nodes in a CEP cluster. Therefore, make sure that the ML model is located at the same path on every node. This allows any node to access the model, regardless of which node the events are sent to.
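
One way to check that each node holds the same model file is to compute a checksum of the file on every node and compare the values. The following is a minimal sketch, not a WSO2 utility. The function name `file_sha256` is hypothetical, and the model path is whatever path your queries reference.

```python
import hashlib


def file_sha256(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of the file at the given path.

    Hypothetical helper: the file is read in chunks so that large model
    files do not have to fit in memory.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Run the function with the same model path on each node. If a node raises a file-not-found error or returns a different digest, the model on that node is missing or differs from the others.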