How do I manage multiple training sets using the Watson NLC Toolkit?

From what I see, there's no way to upload multiple training sets to the new Watson NLC tooling. I need to manage separate training sets and their associated classifiers. What am I missing here?

There are 2 answers

John Bufe (best answer)

Preferred option: Provision an NLC service instance for each set of training data you'd like to work with and separately access the tooling for each.

Workaround: Currently, the flow for managing multiple training sets in one NLC service instance is as follows:

  1. (Optional, to start fresh) Go to the training data page and click the garbage icon to delete all training data.
  2. Upload a training set on the training data page using the upload icon.
  3. Manipulate the data as necessary: add texts and classes, tag texts with classes, etc.
  4. Create a classifier. Creating a classifier essentially takes a snapshot of your current training data, which you can retrieve later from the classifiers page.

Repeat steps 1-4 as necessary until you have uploaded all of your training data sets and created the corresponding classifiers.
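To make the snapshot behaviour concrete, here is a minimal in-memory sketch of the loop above. It does not call the NLC service or its tooling; the `Workspace` class, its method names, and the sample rows are all hypothetical and only model the idea that each classifier keeps its own copy of the training data that was present when it was created.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Row = Tuple[str, str]  # (text, class); hypothetical shape for illustration


@dataclass
class Workspace:
    """Hypothetical model of the tooling for ONE NLC service instance."""
    training_data: List[Row] = field(default_factory=list)
    classifiers: Dict[str, List[Row]] = field(default_factory=dict)

    def clear(self) -> None:
        # Step 1: delete all training data (the garbage icon).
        self.training_data = []

    def upload(self, rows: List[Row]) -> None:
        # Step 2: upload a training set (the upload icon).
        self.training_data.extend(rows)

    def create_classifier(self, name: str) -> None:
        # Step 4: the classifier stores a copy of the current training data.
        self.classifiers[name] = list(self.training_data)

    def download(self, name: str) -> List[Row]:
        # Retrieve the snapshot later (the download icon on the classifiers page).
        return list(self.classifiers[name])


# Made-up training sets, one classifier per set.
training_sets = {
    "weather": [("Is it hot outside?", "temperature")],
    "support": [("I forgot my password", "account")],
}

ws = Workspace()
for name, rows in training_sets.items():
    ws.clear()                  # step 1
    ws.upload(rows)             # step 2 (step 3, manual editing, is skipped here)
    ws.create_classifier(name)  # step 4
```

After the loop, the workspace only holds the last uploaded set, but `ws.download("weather")` still returns the rows that were present when that classifier was created.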

When you want to continue working on a previous training set:

  1. Clear your training data (step 1 from above).
  2. Go to the classifiers page.
  3. Click on the download icon for the classifier which contains the training data you'd like to work with.
  4. Return to the training data page and upload the file downloaded from step 3.

James Taylor

The best way to manage multiple training sets is to use a different NLC service instance for each training set.

The current beta NLC tooling is not intended to manage separate training sets within a single service instance. For example, the tool makes suggestions when you add texts without classes. These suggestions are based on the most recently trained classifier, so they won't make sense if that classifier was trained on a completely different training set.

The workaround suggested by @John Bufe will work if for some reason you have a hard limit on the number of NLC services you can use, e.g. you have reached your limit of Bluemix services. Cost is not a factor here: additional NLC service instances do not increase the overall price, because the monthly charge is for trained classifier instances. For example, if you have four service instances with a single classifier in each, you'll be charged for three classifiers and get one free.
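The arithmetic in that example can be sketched as follows. This is only an illustration under stated assumptions: the function name is made up, and the allowance of one free classifier is inferred from the four-instance example rather than taken from a price list, so check the current pricing before relying on it.

```python
from typing import List


def billable_classifiers(classifiers_per_instance: List[int],
                         free_classifiers: int = 1) -> int:
    """Count charged classifiers across all service instances.

    Assumption: billing depends on the total number of trained classifiers,
    not on the number of service instances, and `free_classifiers` of them
    are free (1 is inferred from the example in this answer).
    """
    total = sum(classifiers_per_instance)
    return max(0, total - free_classifiers)


# Four service instances with one classifier each.
print(billable_classifiers([1, 1, 1, 1]))  # 3
```

Under the same assumption, putting all four classifiers in one service instance, `billable_classifiers([4])`, gives the same result, which is why splitting training sets across instances does not change the cost.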

If you want to use the NLC beta tooling to manage your training data, I would recommend using separate NLC services for each training set you require.