The following set of properties defines the input/output handling configurations of WSO2 ML.

Code Block
languagexml
<Properties>
	<Property name="ml.thread.pool.size" value="100" />
	<Property name="file.in" value="org.wso2.carbon.ml.core.impl.FileInputAdapter" />
	<Property name="file.out" value="org.wso2.carbon.ml.core.impl.FileOutputAdapter" />
	<Property name="hdfs.in" value="org.wso2.carbon.ml.core.impl.HdfsInputAdapter" />
	<Property name="hdfs.out" value="org.wso2.carbon.ml.core.impl.HdfsOutputAdapter" />
	<Property name="das.in" value="org.wso2.carbon.ml.core.impl.BAMInputAdapter" />
	<Property name="registry.in" value="org.wso2.carbon.ml.core.impl.RegistryInputAdapter" />
	<Property name="registry.out" value="org.wso2.carbon.ml.core.impl.RegistryOutputAdapter" />
</Properties>

The following table describes the properties of the input/output handling configuration.

| Property Name | Description | Type | Default Value |
|---|---|---|---|
| ml.thread.pool.size | The size of the thread pool used by WSO2 ML. | Integer | 100 |
| file.in | The adapter that reads files from the local file system. | String | org.wso2.carbon.ml.core.impl.FileInputAdapter |
| file.out | The adapter that writes files to the local file system. | String | org.wso2.carbon.ml.core.impl.FileOutputAdapter |
| hdfs.in | The adapter that reads files from a Hadoop File System (HDFS). | String | org.wso2.carbon.ml.core.impl.HdfsInputAdapter |
| hdfs.out | The adapter that writes files to a Hadoop File System (HDFS). | String | org.wso2.carbon.ml.core.impl.HdfsOutputAdapter |
| registry.in | The adapter that reads data from the WSO2 registry. | String | org.wso2.carbon.ml.core.impl.RegistryInputAdapter |
| registry.out | The adapter that writes data into the WSO2 registry. | String | org.wso2.carbon.ml.core.impl.RegistryOutputAdapter |

Note
To add a custom input/output adapter, add the following properties to the input/output handling configuration above:
<Property name="custom.in" value="org.wso2.carbon.ml.custom.adapter.input.CustomMLInputAdapter"/>
<Property name="custom.out" value="org.wso2.carbon.ml.custom.adapter.output.CustomMLOutputAdapter"/>
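
For orientation, the sketch below shows where the two custom properties might sit inside the existing Properties element. The class names are the ones given in the note. It assumes the custom adapter classes are already available to the server, and it abbreviates the built-in adapters to the file pair only.

```xml
<Properties>
	<Property name="ml.thread.pool.size" value="100" />
	<Property name="file.in" value="org.wso2.carbon.ml.core.impl.FileInputAdapter" />
	<Property name="file.out" value="org.wso2.carbon.ml.core.impl.FileOutputAdapter" />
	<!-- Remaining built-in adapters (hdfs, das, registry) stay as they are. -->

	<!-- Custom adapter pair: the shared prefix "custom" is what a storage type would refer to. -->
	<Property name="custom.in" value="org.wso2.carbon.ml.custom.adapter.input.CustomMLInputAdapter" />
	<Property name="custom.out" value="org.wso2.carbon.ml.custom.adapter.output.CustomMLOutputAdapter" />
</Properties>
```

Keeping the .in and .out properties under the same prefix lets a single storage type name select both adapters.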

Storage configurations

This section contains configurations relating to the storage of datasets and models using the storage type file or hdfs. Configurations relating to storage are defined as shown in the example below. This configuration is optional and commented out by default. You can uncomment it and edit the default configurations as required.

Code Block
languagexml
<HdfsURL>hdfs://localhost:9000</HdfsURL>
<!-- DatasetStorage> 
	<StorageType>file</StorageType> 
	<StorageDirectory>/tmp</StorageDirectory> 
</DatasetStorage -->

<!-- ModelStorage> 
	<StorageType>file</StorageType> 
	<StorageDirectory>/tmp</StorageDirectory> 
</ModelStorage -->

The following table explains the parameters of the storage configuration.

| Parameter Name | Description | Type | Default Value |
|---|---|---|---|
| HdfsURL | The HDFS location in which WSO2 ML is allowed to store files. | String | hdfs://localhost:9000 |
| DatasetStorage | The location where datasets are stored. By default, datasets are stored in the file system. For information on using HDFS as the dataset storage, see HDFS Support. For information on using custom input/output adapters as the dataset storage, see ML Custom Adapter Extension. | N/A | N/A |
| ModelStorage | The location where models are persisted. By default, models are persisted in the file system. For information on using HDFS as the model storage, see HDFS Support. For information on using custom input/output adapters as the model storage, see ML Custom Adapter Extension. | N/A | N/A |
| StorageType | Specifies whether the relevant artifact is stored in the file system, HDFS, or a storage defined by a custom input/output adapter. | String | See the StorageType values below. |
| StorageDirectory | The storage directory in which the relevant artifact is saved. | String | See the StorageDirectory values below. |

StorageType values:

  • To use the file system as the storage type, enter file as the value of this parameter.
  • To use HDFS as the storage type, enter hdfs as the value of this parameter.
  • To use a storage defined by a custom input/output adapter as the storage type, enter the prefix (e.g. custom) of the custom input/output adapter property name (e.g. custom.in) as the value of this parameter.

StorageDirectory values:

  • If the storage type is file, the artifact is saved in the <CARBON_HOME>/datasets or <CARBON_HOME>/models/ directory by default (depending on whether you are configuring storage parameters for datasets or models).
  • If the storage type is hdfs, the artifact is saved in a directory under the location to which the HDFS URL points. Specify that directory as the value of this parameter.
  • If the storage type is a storage defined by a custom input/output adapter, the artifact is saved in the directory that you define as the value of this parameter.
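
As a rough illustration of the storage types described above, the sketch below stores datasets on HDFS and models through a custom adapter. The directory names /ml/datasets and /ml/models are hypothetical placeholders. The custom storage type assumes that custom.in and custom.out properties have been added to the input/output handling configuration.

```xml
<HdfsURL>hdfs://localhost:9000</HdfsURL>

<!-- Datasets on HDFS: the directory (hypothetical) is resolved under the HdfsURL location. -->
<DatasetStorage>
	<StorageType>hdfs</StorageType>
	<StorageDirectory>/ml/datasets</StorageDirectory>
</DatasetStorage>

<!-- Models through a custom adapter: "custom" is the prefix of custom.in / custom.out. -->
<ModelStorage>
	<StorageType>custom</StorageType>
	<StorageDirectory>/ml/models</StorageDirectory>
</ModelStorage>
```

Unlike the default configuration, both elements here are uncommented, so they take effect instead of the file system defaults.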

Algorithm configurations

WSO2 ML supports various machine learning algorithms. Configurations of these algorithms are defined as shown in the example below.

Code Block
languagexml
<Algorithms>
	<Algorithm>
		<Name>LINEAR_REGRESSION</Name>
		<Type>Numerical_Prediction</Type>
		<Parameters>
			<Name>Iterations</Name>
			<Value>100</Value>
		</Parameters>
		<Parameters>
			<Name>Learning_Rate</Name>
			<Value>0.001</Value>
		</Parameters>
		<Parameters>
			<Name>SGD_Data_Fraction</Name>
			<Value>1</Value>
		</Parameters>
	</Algorithm>
</Algorithms>

The following table describes the parameters of an algorithm configuration.

| Parameter Name | Description | Type |
|---|---|---|
| Name | The name of the algorithm. | String |
| Type | The type of the algorithm. | String |
| Iterations | The number of iterations of gradient descent to run. | Integer |

Info
In the algorithm configurations, interpretability, scalability, multicollinearity, and dimensionality define a set of weights (on a scale of zero to five) given to each algorithm. These weights are used to calculate algorithm ratings when recommended algorithms are requested. It is highly recommended that these values remain unchanged. Each entry under Parameters represents a hyper-parameter of the algorithm and its default value.
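
To illustrate how a hyper-parameter default might be changed, the sketch below reuses the LINEAR_REGRESSION entry from the example and alters only the Iterations value. The value 200 is an arbitrary assumption for illustration, not a recommended setting.

```xml
<Algorithms>
	<Algorithm>
		<Name>LINEAR_REGRESSION</Name>
		<Type>Numerical_Prediction</Type>
		<Parameters>
			<Name>Iterations</Name>
			<!-- Raised from the default of 100; 200 is an illustrative value only. -->
			<Value>200</Value>
		</Parameters>
		<Parameters>
			<Name>Learning_Rate</Name>
			<Value>0.001</Value>
		</Parameters>
		<Parameters>
			<Name>SGD_Data_Fraction</Name>
			<Value>1</Value>
		</Parameters>
	</Algorithm>
</Algorithms>
```

Each Parameters element holds exactly one Name/Value pair, so changing a default means editing the Value of the matching pair rather than adding a new element.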

 

Other configurations

| Parameter Name | Description | Type | Default Value |
|---|---|---|---|
| EmailNotificationEndpoint | A comma-separated list of email addresses to which model building status mails are sent. This parameter is optional. | String | N/A |
| ModelRegistryLocation | The location in the Governance Registry where ML-related models are published. | String | ml |

e.g.,

Code Block
languagexml
<ModelRegistryLocation>ml</ModelRegistryLocation>
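
Going by the table description, an EmailNotificationEndpoint entry might look like the sketch below. It assumes the element name matches the parameter name, and the addresses are placeholders on the reserved example.com domain.

```xml
<!-- Hypothetical recipients of model building status mails; replace with real addresses. -->
<EmailNotificationEndpoint>ml-admin@example.com,data-team@example.com</EmailNotificationEndpoint>
```

Because the parameter is optional, the element can presumably be left out when no status mails are needed.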