...
Start the Apache Hadoop server by executing the following command: {HADOOP_HOME}/sbin/start-all.sh
Access the Hadoop UI using the following URL: http://localhost:50070
Upload your dataset file to HDFS.
Tip: You can use the HDFS Writer utility tool for this.
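If the HDFS Writer utility is not at hand, the standard `hdfs dfs` command-line client that ships with Hadoop can do the same upload. The sketch below only assembles the two commands involved; it is an illustration, not part of WSO2 ML. The function names and the example paths (`/tmp/dataset.csv`, `/ml/datasets`) are hypothetical.

```python
import shlex
from pathlib import PurePosixPath
from typing import List


def build_upload_commands(local_file: str, hdfs_dir: str) -> List[List[str]]:
    """Return the `hdfs dfs` invocations that copy local_file into hdfs_dir."""
    if not hdfs_dir.startswith("/"):
        raise ValueError("hdfs_dir must be an absolute HDFS path")
    return [
        # Create the target directory (and any parents) if it does not exist.
        ["hdfs", "dfs", "-mkdir", "-p", hdfs_dir],
        # Copy the local file into that directory.
        ["hdfs", "dfs", "-put", local_file, hdfs_dir],
    ]


def hdfs_target_path(local_file: str, hdfs_dir: str) -> str:
    """Return the path the file will have inside HDFS after the upload."""
    return str(PurePosixPath(hdfs_dir) / PurePosixPath(local_file).name)


if __name__ == "__main__":
    # Hypothetical example paths; replace them with your own.
    for command in build_upload_commands("/tmp/dataset.csv", "/ml/datasets"):
        print(shlex.join(command))
```

To run the commands instead of printing them, each list can be passed to `subprocess.run(command, check=True)` on a machine where the `hdfs` client is on the `PATH`.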
Start the WSO2 ML server. For instructions on starting, see Running the Product.
Log in to the WSO2 ML UI using the following URL: https://<ML_HOST>:<ML_PORT>/ml
Click the Datasets button as shown below.
- Click the ADD DATASET button in the top menu.
Enter the following details as shown below.
- Enter a dataset name for Dataset Name.
- Enter the version number for Version.
- Select HDFS for the Source Type.
- Enter the HDFS URL of the dataset file for Data Source.
- Select the Data Format.
- Select Yes for Column header available if the dataset defines column headers.
- Select File for Destination.
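The Data Source field expects an HDFS URL, which generally takes the form `hdfs://<NAMENODE_HOST>:<NAMENODE_PORT>/<path>`. The following is a minimal sketch for composing and sanity-checking such a URL before pasting it into the form. The helper names are hypothetical, and the port `9000` in the usage note is only an assumption; the actual host and port come from the `fs.defaultFS` property in your Hadoop `core-site.xml`.

```python
from urllib.parse import urlsplit


def build_hdfs_url(host: str, port: int, path: str) -> str:
    """Compose an hdfs:// URL from a NameNode host, port, and absolute path."""
    if not path.startswith("/"):
        raise ValueError("path must be an absolute HDFS path")
    if not 0 < port < 65536:
        raise ValueError("port must be between 1 and 65535")
    return f"hdfs://{host}:{port}{path}"


def is_hdfs_url(url: str) -> bool:
    """Check that url uses the hdfs scheme and has a host and an absolute path."""
    parts = urlsplit(url)
    return (
        parts.scheme == "hdfs"
        and bool(parts.hostname)
        and parts.path.startswith("/")
    )
```

For example, `build_hdfs_url("localhost", 9000, "/ml/datasets/dataset.csv")` yields `hdfs://localhost:9000/ml/datasets/dataset.csv`, assuming the NameNode listens on port 9000.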
Click Create Dataset. On successful creation, the new dataset appears in the list of all available datasets.
...
Dataset storage is a WSO2 Machine Learner server configuration. By default, it uses the file system. For instructions on changing the dataset storage to HDFS, see Storage Configurations. To upload your dataset files to HDFS, select HDFS for Destination in the WSO2 ML UI when you create the dataset.
Spark reads datasets from HDFS
...