Very new to cTAKES and looking through the docs, curious about what exactly the UMLS and SNOMEDCT "vocabularies" are. The user installation docs don't really seem to tell and simply applying for the UMLS license and the language around the UMLS Metathesaurus does not really divulge much more about the structure of the data being accessed. Eg. is it some online API service? Is it some files that come with the cTAKES download that can only be unlocked with a valid UMLS password that is checked against an online DB?
What exactly are the UMLS and SNOMED-CT vocabularies used by cTAKES?
613 views Asked by lampShadesDrifter At
1
Info on what the UMLS Metathesaurus and SNOMEDCT are can be found here (https://www.nlm.nih.gov/research/umls/knowledge_sources/metathesaurus/index.html) and here (https://www.ncbi.nlm.nih.gov/books/NBK9676/, specifically https://www.ncbi.nlm.nih.gov/books/NBK9684/):
While I'm not sure how exactly cTAKES implements its use of the UMLS Metathesaurus (anyone who knows could please enlighten), I assume that it is accessing some API for a relational database based on the UMLS credentials you need to add to the example scripts that come with the cTAKES download (see https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+4.0+User+Install+Guide#cTAKES4.0UserInstallGuide-(Recommended)AddUMLSaccessrights).
(I think) this is what is used to power the UIMA analysis engines used to process text in cTAKES