General ML configuration related questions
Where can I find the ML configuration file?
The <ML_HOME>/repository/conf/machine-learner.xml file includes all the ML-specific configurations.
How can I change the ML datasource name?
The default name is jdbc/WSO2ML_DB. You can change it by changing the value of the <DataSourceName> element in the <ML_HOME>/repository/conf/machine-learner.xml file. For more information, see ML-specific configurations.
How can I change the number of sample points that ML uses to generate summary statistics?
The default size is 10000. You can change it by changing the value of the <SampleSize> property within the <SummaryStatisticsSettings> element in the <ML_HOME>/repository/conf/machine-learner.xml file. For more information, see ML-specific configurations.
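If you prefer to script this kind of change, it could be sketched in Python as follows. The XML string is a hypothetical, simplified stand-in for machine-learner.xml: the root element name and the absence of XML namespaces are assumptions, and only the <SummaryStatisticsSettings> and <SampleSize> names come from the answer above.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for machine-learner.xml.
# Only <SummaryStatisticsSettings> and <SampleSize> are taken from the text;
# the root element name is an assumption made for illustration.
SAMPLE_CONFIG = """
<MachineLearner>
    <SummaryStatisticsSettings>
        <SampleSize>10000</SampleSize>
    </SummaryStatisticsSettings>
</MachineLearner>
"""


def set_sample_size(xml_text: str, new_size: int) -> str:
    """Return the XML text with <SampleSize> set to new_size."""
    root = ET.fromstring(xml_text)
    element = root.find("./SummaryStatisticsSettings/SampleSize")
    if element is None:
        raise ValueError("SampleSize element not found")
    element.text = str(new_size)
    return ET.tostring(root, encoding="unicode")


updated = set_sample_size(SAMPLE_CONFIG, 20000)
print(updated)
```

If the real file declares an XML namespace, the lookup path would need to include it, so check the file before adapting this sketch.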
How can I change the directory which holds datasets?
By default, the <ML_HOME>/datasets/ directory holds datasets, and the default storage type is ‘file’. You can change it by changing the value of the <StorageDirectory> property within the <DatasetStorage> element in the <ML_HOME>/repository/conf/machine-learner.xml file. For more information, see ML-specific configurations.
How can I change the directory which holds models?
By default, the <ML_HOME>/models/ directory holds models, and the default storage type is ‘file’. You can change it by changing the value of the <StorageDirectory> property within the <ModelStorage> element in the <ML_HOME>/repository/conf/machine-learner.xml file. For more information, see ML-specific configurations.
How can I increase ML thread pool size?
WSO2 ML uses threads from a thread pool to run tasks such as dataset summary generation and model generation. You can control the size of this thread pool by changing the value of the following property in the <ML_HOME>/repository/conf/machine-learner.xml file:
<Property name="ml.thread.pool.size" value="100"/>
For more information, see ML-specific configurations.
Where should I configure the email addresses of recipients who will be notified when a model is generated?
You can configure WSO2 ML to send emails when model generation completes. Specify a comma-separated set of email addresses as the value of the <EmailNotificationEndpoint> property in the <ML_HOME>/repository/conf/machine-learner.xml file. For more information on configuring email support, see Enabling Email Notifications.
How can I change dataset storage to HDFS?
If you want to change the dataset storage type to HDFS, change the value of the <StorageType> property within the <DatasetStorage> element to ‘hdfs’. For more information, see ML-specific configurations.
How can I change model storage to HDFS?
If you want to change the model storage type to HDFS, change the value of the <StorageType> property within the <ModelStorage> element to ‘hdfs’. For more information, see ML-specific configurations.
How can I specify an HDFS URL?
If you want to store your datasets and models in HDFS, enter the HDFS URL as the value of the <HdfsURL> property in the <ML_HOME>/repository/conf/machine-learner.xml file. For more information, see ML-specific configurations.
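The three HDFS-related changes above can be applied together. The following Python sketch shows one way to do that against a hypothetical, simplified stand-in for machine-learner.xml. The element names come from the answers above, while the root element, the position of <HdfsURL> in the file, and the example URL hdfs://localhost:9000 are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for machine-learner.xml. Element names are
# taken from the text; the root element, the placement of <HdfsURL>, and the
# example URL used below are assumptions.
SAMPLE_CONFIG = """
<MachineLearner>
    <HdfsURL></HdfsURL>
    <DatasetStorage>
        <StorageType>file</StorageType>
    </DatasetStorage>
    <ModelStorage>
        <StorageType>file</StorageType>
    </ModelStorage>
</MachineLearner>
"""


def switch_to_hdfs(xml_text: str, hdfs_url: str) -> str:
    """Set both storage types to 'hdfs' and record the HDFS URL."""
    root = ET.fromstring(xml_text)
    for section in ("DatasetStorage", "ModelStorage"):
        storage_type = root.find(f"./{section}/StorageType")
        if storage_type is None:
            raise ValueError(f"{section}/StorageType not found")
        storage_type.text = "hdfs"
    url_element = root.find("./HdfsURL")
    if url_element is None:
        raise ValueError("HdfsURL not found")
    url_element.text = hdfs_url
    return ET.tostring(root, encoding="unicode")


hdfs_config = switch_to_hdfs(SAMPLE_CONFIG, "hdfs://localhost:9000")
print(hdfs_config)
```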
Data related questions
What should be the format of my dataset?
WSO2 ML currently supports the following formats.
- CSV with comma separated values
- TSV with tab separated values
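The only difference between the two formats is the delimiter. As a minimal sketch using Python's standard csv module (the function name and the sample rows are hypothetical, and this is not how WSO2 ML itself parses files), both can be read with the same code:

```python
import csv
import io


def read_rows(text: str, file_format: str) -> list[list[str]]:
    """Parse delimited text as CSV (comma) or TSV (tab)."""
    delimiters = {"csv": ",", "tsv": "\t"}
    if file_format not in delimiters:
        raise ValueError(f"unsupported format: {file_format}")
    reader = csv.reader(io.StringIO(text), delimiter=delimiters[file_format])
    return [row for row in reader]


# Illustrative data only: the same two rows in each format.
csv_rows = read_rows("age,income\n34,52000\n", "csv")
tsv_rows = read_rows("age\tincome\n34\t52000\n", "tsv")
print(csv_rows == tsv_rows)
```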
Where can my data be stored?
Data can be retrieved from the following sources.
- File system
- Hadoop Distributed File System (HDFS)
- WSO2 Data Analytics Server table
Do I need to have a header row in my dataset?
It is not mandatory to have a header row. If the dataset does not have one, you indicate this when you upload the dataset, and WSO2 ML then generates a header of the form V1, V2, ..., Vn. For more information, see Exploring Data.
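The naming scheme for generated headers can be illustrated with a short Python sketch. This only mirrors the V1, V2, ..., Vn pattern described above. The function name is hypothetical, and it is not WSO2 ML's actual implementation.

```python
def generate_header(column_count: int) -> list[str]:
    """Build placeholder column names V1..Vn for a headerless dataset."""
    if column_count < 1:
        raise ValueError("column_count must be at least 1")
    return [f"V{i}" for i in range(1, column_count + 1)]


header = generate_header(4)
print(",".join(header))
```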
Is there a file size limit to my dataset?
Yes, currently it is 100 MB. You can change it via the following property in the <ML_HOME>/bin/wso2server.bat file (for Windows) or the <ML_HOME>/bin/wso2server.sh file (for Linux).
100MB = 100 x 1024 x 1024 = 104857600 Bytes
-Dorg.apache.cxf.io.CachedOutputStream.Threshold=104857600 \
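To work out the value for a different limit, apply the same arithmetic as above. A small Python helper (the function name is hypothetical) could produce the option string for any size in megabytes:

```python
def threshold_option(size_mb: int) -> str:
    """Return the JVM option for a dataset size limit given in megabytes."""
    if size_mb <= 0:
        raise ValueError("size_mb must be positive")
    size_bytes = size_mb * 1024 * 1024  # same conversion as 100MB = 104857600 bytes
    return f"-Dorg.apache.cxf.io.CachedOutputStream.Threshold={size_bytes}"


default_option = threshold_option(100)
larger_option = threshold_option(250)
print(default_option)
print(larger_option)
```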
Algorithm related questions
Does ML support numerical prediction?
Yes, it does. The following algorithms are available in this version.
- Linear Regression
- Ridge Regression
- Lasso Regression
See How to Select an Algorithm in WSO2 ML for more information.
Does ML support classification algorithms?
Yes, it does. The classification algorithms currently supported are as follows.
- Logistic Regression with Stochastic Gradient Descent
- Logistic Regression with Limited-memory Broyden-Fletcher-Goldfarb-Shanno
- Decision Tree
- Random Forest
- Naive Bayes
See How to Select an Algorithm in WSO2 ML for more information.
Does ML support clustering algorithms?
At present, only very basic support is available for clustering. K-means is the only clustering algorithm supported in this ML version. We plan to improve this area. Please contact us if you are interested in learning more about these plans.
Analysis related questions
What type of data preprocessing does WSO2 ML support?
It supports feature selection and missing value filling/filtering.
Can I gain insight into the dataset before creating an analysis from it?
Yes, WSO2 ML supports dataset exploration functionality with multiple visualization techniques. See Exploring Data for more information.
Model related questions
How can I find the details of a built model?
Once you have built a model, you can view its model summary, which includes a summary of the model evaluation. For more information, see Evaluating Models.
How can I calculate the accuracy for a given model?
For classification algorithms, WSO2 ML generates an accuracy measurement based on the predictions the model makes for the test dataset. The test dataset is extracted from the uploaded dataset, and the proportion is configurable for each analysis.
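As an illustration of the idea, the following Python sketch holds out part of a dataset for testing and computes accuracy as the fraction of correct predictions. The function names, the train_fraction parameter, and the sample labels are all hypothetical. This is a generic sketch of the technique, not WSO2 ML's internal code.

```python
import random


def train_test_split(rows: list, train_fraction: float, seed: int = 0):
    """Shuffle rows and split them into training and test portions."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be between 0 and 1")
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def accuracy(actual: list, predicted: list) -> float:
    """Fraction of test rows whose predicted label equals the actual label."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


# Illustrative data: 8 rows, 75% used for training, 25% held out for testing.
train_rows, test_rows = train_test_split(list(range(8)), train_fraction=0.75)

# Illustrative labels: three of the four predictions match.
score = accuracy(["yes", "no", "yes", "no"], ["yes", "no", "no", "no"])
print(len(train_rows), len(test_rows), score)
```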
Can I download a built model?
You can download a built model or publish it to WSO2 registry. For more information, see Generating Models.
Do you support exporting models in PMML format?
WSO2 ML 1.0.0 does not support the PMML format. Support is on the roadmap for future versions.
Can I use a built model in a Java program?
You can use a built model in a Java program. For a sample on how to use a built model in a Java program, see Using ML Models in a Java Client.
How can I make predictions for a test dataset using ML UI wizard?
Once you build a model, you can make predictions for a test dataset by uploading a CSV/TSV file from the Predict page in the ML UI. For more information on making predictions, see Making Predictions Using the ML UI.
Can I use a built model in other WSO2 products?
Yes, you can use the generated models in WSO2 ESB with the Predict mediator and in WSO2 CEP with CEP ML Extension.
REST API related questions
What is the root context of the ML REST API?
The WSO2 ML REST API root context is ‘/ml’. For more information, see REST API Guide.
What authentication mechanisms do you support?
WSO2 ML supports basic authentication and cookie-based authentication. For more information, see REST API Guide.
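For basic authentication, a client sends the username and password, Base64-encoded, in the standard HTTP Authorization header. A minimal Python sketch of building that header follows. The function name and the admin/admin credentials are placeholders, so substitute the credentials of your own deployment.

```python
import base64


def basic_auth_header(username: str, password: str) -> dict[str, str]:
    """Build the standard HTTP Basic Authorization header."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# Placeholder credentials for illustration only.
headers = basic_auth_header("admin", "admin")
print(headers)
```

The resulting dictionary can be passed as request headers by whichever HTTP client you use to call the REST API.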
What are the main APIs you expose?
WSO2 ML exposes five main APIs. They are configurations, datasets, projects, analyses, and models. For more information, see REST API Guide.