Does H2O Driverless AI have built-in support for merging multiple datasets and using the merged dataset for training?

Question

Asked by Firenze on 06 October 2020 at 06:46

Suppose we have three datasets containing data from a company.

employee.csv: details of the employees working in the company, such as employee ID, employee name, the dept id of the department they work in, the country code of their home country, and their annual salary.
dept.csv: information about the company's departments, such as dept id, dept name, and dept specialization.
country.csv: country names with their country codes and capital cities.

Is there a feature in H2O Driverless AI that lets us upload these datasets (without merging them in Python first), merge them inside the Driverless AI platform on their overlapping columns, and use the merged result for training?

Original Q&A

There is 1 answer

**Neema Mashayekhi** · Accepted Answer · 2020-10-12T06:33:41+00:00

Yes. You can use a data recipe to process datasets, including joining them. See the docs for more about data recipes. The following recipe code joins the three datasets:

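```python
# Join `employee.csv` (X) to `dept.csv` (Y1) and `country.csv` (Y2).
# This is the body of a data recipe: `X` is the dataset the recipe is applied
# to, and `dt` is assumed to be the datatable module (imported here for clarity).
import datatable as dt

# Define and read the locations of the datasets Y1 and Y2
Y_file_name1 = "./tmp/user/location_of_dept.csv.bin"
Y_file_name2 = "./tmp/user/location_of_country.csv.bin"
Y1 = dt.fread(Y_file_name1)
Y2 = dt.fread(Y_file_name2)

# Key Y1 on dept_id (key values must be unique), then join it onto X
key1 = ["dept_id"]
Y1.key = key1
X = X[:, :, dt.join(Y1)]

# Key Y2 on country_code, then join it onto the result
key2 = ["country_code"]
Y2.key = key2
X = X[:, :, dt.join(Y2)]

# Hand the merged frame back to Driverless AI
return X
```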

See this recipe as an example for joining one dataset to another.
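
To make the "set key and join" steps easier to reason about, here is a minimal plain-Python sketch of what a keyed left join does. It does not use datatable or Driverless AI; the `left_join` helper and all sample rows are hypothetical and exist only for illustration. It assumes the behavior datatable documents for `dt.join`: the lookup table must have unique key values, every row of `X` is kept, and rows without a match get missing values in the new columns.

```python
# Plain-Python illustration of a keyed left join (no datatable required).
# All rows below are made-up sample data, not taken from the question.

def left_join(rows, lookup_rows, key):
    """Left-join `lookup_rows` onto `rows` using the shared column `key`."""
    # Index the lookup table by key; as with a keyed frame, values must be unique.
    index = {}
    for row in lookup_rows:
        if row[key] in index:
            raise ValueError(f"duplicate key {row[key]!r} in lookup table")
        index[row[key]] = row

    # Columns contributed by the lookup table (everything except the key).
    extra_cols = [c for c in lookup_rows[0] if c != key] if lookup_rows else []

    joined = []
    for row in rows:
        match = index.get(row[key])
        out = dict(row)
        for col in extra_cols:
            # Unmatched rows are kept and get None in the new columns.
            out[col] = match[col] if match is not None else None
        joined.append(out)
    return joined


employees = [
    {"employee_id": 1, "name": "Ana", "dept_id": 10, "country_code": "PT"},
    {"employee_id": 2, "name": "Raj", "dept_id": 20, "country_code": "IN"},
    {"employee_id": 3, "name": "Kim", "dept_id": 30, "country_code": "KR"},
]
depts = [
    {"dept_id": 10, "dept_name": "Research"},
    {"dept_id": 20, "dept_name": "Sales"},
]
countries = [
    {"country_code": "PT", "capital": "Lisbon"},
    {"country_code": "IN", "capital": "New Delhi"},
    {"country_code": "KR", "capital": "Seoul"},
]

# Same order as the recipe: join departments first, then countries.
merged = left_join(left_join(employees, depts, "dept_id"), countries, "country_code")
```

Two practical consequences follow for the recipe above: the key column must have the same name in `X` and in the joined dataset (`dept_id`, `country_code`), and employees whose department or country is missing from the lookup file stay in the training data with empty values rather than being dropped.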
