I've been trying to figure out how to use Watson Knowledge Studio for a couple of weeks now. I've been working with cooking recipes to keep the data simple and easy to annotate. My goal is to submit a recipe as unstructured text and get back a structured response with the recipe name, ingredients, cooking devices, budget, diet, etc.
It's actually doing ok so far, except for the recipe name.
So my question is: how do I teach the model to identify this very specific part (the recipe name), since it's almost always different?
Any advice welcome :)
In the "Annotator Component" section of Watson Knowledge Studio, there is a component called Machine Learning. Create a corpus of a few representative documents and complete the human annotation. You can then use this set as the training set for the machine learning component, look at the evaluation statistics, and fine-tune the model. The process works like this:
Create a type system (you can create custom dictionaries to auto-annotate the documents) --> Create a document corpus of representative documents --> Human-annotate the documents (entities, relations & coreferences) --> Submit the annotations --> Approve the annotations --> Create a machine learning annotator --> Select the document corpus --> Build the Training Set, Test Set and Blind Set (or use the system-proposed distribution) --> Train & Evaluate --> Check the statistics --> Create a snapshot of the version --> Deploy the version with your AlchemyAPI key --> Your model will be created.
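If it helps to see what the "Build Training Set, Test Set and Blind Set" step amounts to, here is a rough Python sketch of splitting a corpus yourself. This is not Watson Knowledge Studio code: the function name, the file names and the 70/23/7 ratios are assumptions made up for illustration, and the tool does this for you in its UI.

```python
import random

def split_corpus(doc_ids, train=0.70, test=0.23, seed=0):
    """Shuffle document IDs and split them into training, test and blind sets.

    The ratios are illustrative assumptions; whatever is left after the
    training and test shares goes to the blind set.
    """
    ids = list(doc_ids)
    random.Random(seed).shuffle(ids)  # fixed seed keeps the split reproducible
    n_train = round(len(ids) * train)
    n_test = round(len(ids) * test)
    training_set = ids[:n_train]
    test_set = ids[n_train:n_train + n_test]
    blind_set = ids[n_train + n_test:]
    return training_set, test_set, blind_set

# Hypothetical corpus of 100 annotated recipe documents
recipes = [f"recipe_{i:03d}.txt" for i in range(100)]
training_set, test_set, blind_set = split_corpus(recipes)
print(len(training_set), len(test_set), len(blind_set))
```

The point of keeping the blind set apart is that you never look at it while tuning, so it gives you an honest final check.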
Try the model with new documents and see how it performs. You can repeat the process to fine-tune it.
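To judge "how it performs" on a new document, you can compare what the model returned against what you would have annotated by hand. A minimal sketch, assuming exact-match scoring on (entity type, text) pairs; the entity type names and mentions below are invented for the example, and the statistics Watson Knowledge Studio reports may be computed differently.

```python
def score(gold, predicted):
    """Return (precision, recall, f1) for exact-match entity mentions."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hand annotations for one recipe (hypothetical type names)
gold = {
    ("RECIPE_NAME", "Lemon Garlic Chicken"),
    ("INGREDIENT", "garlic"),
    ("INGREDIENT", "lemon"),
    ("DEVICE", "oven"),
}
# What the model returned: it missed the recipe name and added a spurious device
predicted = {
    ("INGREDIENT", "garlic"),
    ("INGREDIENT", "lemon"),
    ("DEVICE", "oven"),
    ("DEVICE", "pan"),
}

precision, recall, f1 = score(gold, predicted)
print(precision, recall, f1)
```

Running the same comparison per entity type would show whether the recipe name is the type dragging the scores down, which tells you where to add more annotated examples.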
HTH Gopal