Uploading a Dataset via REST API v10

Overview

Description: Upload a dataset
Resource Path: /api/v10/datasets
HTTP Method: POST
Request Format: multipart/form-data
Response Format: application/json

Sample cURL command

curl -X POST -b cookies https://<ML_HOST>:<ML_HTTPS_PORT>/api/v10/datasets -H "Authorization: Basic YWRtaW46YWRtaW4=" -H "Content-Type: multipart/form-data" -F datasetName='<DATASET_NAME>' -F version='<VERSION>' -F description='<DESCRIPTION>' -F sourceType='<SOURCE_TYPE>' -F destination='<DESTINATION>' -F dataFormat='<DATA_FORMAT>' -F containsHeader='<CONTAINS_HEADER>' -F file=@<FILE_PATH> -k
  • By default, <ML_HOST> is localhost. If you are using a public IP, specify the respective IP address or domain name instead.
  • By default, <ML_HTTPS_PORT> is 9443. If the port offset has been incremented by n, the port is 9443 + n.
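
The host and port rules above can be sketched as a small shell helper. The function name ml_datasets_url and the host ml.example.com are hypothetical and only illustrate the arithmetic; the default host, default port and resource path are the ones given on this page.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the datasets endpoint URL.
#   $1 = host (defaults to localhost)
#   $2 = port offset n (defaults to 0); the HTTPS port is 9443 + n
ml_datasets_url() {
  local host="${1:-localhost}"
  local offset="${2:-0}"
  local port=$((9443 + offset))
  printf 'https://%s:%s/api/v10/datasets\n' "$host" "$port"
}

# Default installation
ml_datasets_url
# Server reachable under a domain name, with a port offset of 2
ml_datasets_url ml.example.com 2
```

With a port offset of 2, the second call yields port 9445.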

Example cURL command

curl -X POST -b cookies  https://localhost:9443/api/v10/datasets -H "Authorization: Basic YWRtaW46YWRtaW4=" -H "Content-Type: multipart/form-data" -F datasetName='seeds' -F version='1.0.0' -F description='Seeds Dataset' -F sourceType='file' -F destination='file' -F dataFormat='CSV' -F containsHeader='true' -F file=@'seeds.csv' -k
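
The value YWRtaW46YWRtaW4= in the Authorization header is the Base64 encoding of admin:admin. For other credentials, the header can be generated along the following lines. This is a minimal sketch: the function name basic_auth_header is hypothetical, and it assumes the standard base64 utility is available.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print an HTTP Basic Authorization header
# for the given user name ($1) and password ($2).
basic_auth_header() {
  local token
  # Encode "user:password"; strip any line wrapping added by base64.
  token=$(printf '%s:%s' "$1" "$2" | base64 | tr -d '\n')
  printf 'Authorization: Basic %s\n' "$token"
}

# Reproduces the header used in the commands above
basic_auth_header admin admin
```

The output can be passed to cURL as -H "$(basic_auth_header admin admin)".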
Parameter definitions

The dataset must be uploaded as a multipart/form-data request, so you do not need to pass a JSON definition payload. Instead, you can set the following parameters in the request.

datasetName
  Description: A unique name for the dataset
  Required: Yes
  Default value: None
  Example: diabetes

sourceType
  Description: Storage type of the source of the dataset (file/HDFS/WSO2 DAS)
  Required: Yes
  Default value: file
  Example: file

destination
  Description: Storage type of the server side copy of the dataset (file/HDFS)
  Required: Yes
  Default value: file
  Example: file

dataFormat
  Description: Format of the dataset (CSV/TSV)
  Required: Yes
  Default value: None
  Example: csv

containsHeader
  Description: Whether the dataset contains a header row or not.
  Required: Yes
  Default value: false
  Example: true

comments
  Description: Comments on this dataset
  Required: Yes
  Default value: None
  Example: diabetes

sourcePath
  Description: If source type is HDFS, this should hold the absolute URI of the dataset file and if source type is BAM, this should be in <table_name>:<col_1_name>,<col_1_type>;<col_2_name>,<col_2_type> format.
  Required: Optional
  Default value: None
  Example: hdfs://localhost:9000/diabetes.csv

version
  Description: Version of the dataset
  Required: Yes
  Default value: None
  Example: 1.0

file
  Description: Path to dataset file, if source type is file.
  Required: Optional
  Default value: None
  Example: file=@'diabetes.csv'

description
  Description: Description about the dataset.
  Required: Optional
  Default value: None
  Example: Diabetes Dataset
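
The <table_name>:<col_1_name>,<col_1_type>;<col_2_name>,<col_2_type> format of sourcePath can be assembled with a short helper such as the one below. This is only a sketch: the function name build_source_path, the table name patients, and the column names and types are hypothetical placeholders, not values taken from the product.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build a sourcePath value of the form
#   <table_name>:<col_1_name>,<col_1_type>;<col_2_name>,<col_2_type>
# Usage: build_source_path <table_name> <name,type> [<name,type> ...]
build_source_path() {
  local table="$1"
  shift
  local IFS=';'   # "$*" joins the remaining arguments with ';'
  printf '%s:%s\n' "$table" "$*"
}

# Illustrative table and columns (assumed, not from the documentation)
build_source_path patients age,INTEGER bmi,DOUBLE
```

The resulting string can then be passed as -F sourcePath='...' in the request.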

Example

POST https://localhost:9443/api/v10/datasets

Sample output

For information on the properties returned in a dataset definition, see Sample Dataset Definition.

{
  "name": "seeds",
  "id": 2,
  "tenantId": -1234,
  "comments": "Seeds Dataset",
  "dataType": "CSV",
  "userName": "admin",
  "version": "1.0.0",
  "sourcePath": null,
  "dataSourceType": "file",
  "dataTargetType": "file",
  "containsHeader": true
}
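
A script that uploads a dataset may need the numeric id from this response. The sketch below pulls it out with sed. The function name dataset_id_from_response is hypothetical, and the pattern assumes the response looks like the sample above; a real JSON parser would be more robust.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read a dataset definition (JSON) on stdin and
# print the value of its "id" property. Matches only the exact key
# "id", so "tenantId" is not picked up.
dataset_id_from_response() {
  sed -n 's/.*"id":[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Fed with an excerpt of the sample output above
printf '{\n  "name": "seeds",\n  "id": 2,\n  "tenantId": -1234\n}\n' | dataset_id_from_response
```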

REST API response

HTTP status code: 200, 400 or 500

For descriptions of the HTTP status codes, see HTTP Status Codes.
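
In a script, the status code can be captured with cURL's -w '%{http_code}' option and checked before the response body is used. The following is a minimal sketch: the function name check_upload_status and its messages are hypothetical, and the interpretations follow the general HTTP meanings of 200 (OK), 400 (Bad Request) and 500 (Internal Server Error).

```shell
#!/usr/bin/env bash
# Hypothetical helper: map the HTTP status code of the upload request
# to a message and a shell exit status.
check_upload_status() {
  case "$1" in
    200) echo "upload succeeded"; return 0 ;;
    400) echo "bad request: check the form parameters" >&2; return 1 ;;
    500) echo "server error: check the server logs" >&2; return 2 ;;
    *)   echo "unexpected status: $1" >&2; return 3 ;;
  esac
}

# Possible usage (not executed here): store the body in response.json
# and pass only the status code to the helper.
#   status=$(curl -s -k -o response.json -w '%{http_code}' -X POST ... )
#   check_upload_status "$status" || exit 1
check_upload_status 200
```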