What exactly is the difference between AnalysisEngine and CAS Consumer?


I'm learning UIMA, and I can create basic analysis engines and get results. What I find difficult to understand is the use of CAS consumers, and how a CAS consumer differs from an AnalysisEngine. From the many examples I have seen, a CAS consumer does not seem to be really needed. Is a CAS consumer important for large applications, or can we do without it?

3

There are 3 answers

2
Claudiu On BEST ANSWER

There is no difference between them in the current version. Historically, a CAS consumer would typically not modify the CAS; it would only use the data already in the CAS (previously added by an analysis engine) to aggregate it or prepare it for use in other systems, e.g., ingestion into databases.

In the current version, it is recommended that CAS consumers be replaced by analysis engine components.

1
Renaud On

You can totally do without it. Just use an analysis engine. BTW, are you using uimaFIT already?

0
rec On

The main difference is in the defaults: analysis engines are configured to allow being run in parallel, so each instance may see only some of the CASes (OperationalProperties multipleDeploymentAllowed = true).

CAS consumers are configured to disallow being run in parallel, meaning that they will see all CASes (OperationalProperties multipleDeploymentAllowed = false).

The latter is necessary, e.g. when you want to write all results to a single file.

For example, the CPE respects this flag. When configured for multi-threaded execution, the CPE keeps multiple parallel instances of all analysis engines until it reaches the first component in the pipeline with multipleDeploymentAllowed = false, which is usually a consumer. For all following components (analysis engines or consumers), only a single instance is created.
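To make that rule concrete, here is a minimal plain-Java sketch of it. It is only a toy model of the behaviour described above, not CPE code and not the UIMA API: the `Component` class, the `instanceCounts` method, and the component names are all made up for illustration. It also assumes that the first component with the flag set to false is itself deployed once.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model of the deployment rule described above. NOT the UIMA/CPE API:
// every name in this file is hypothetical.
class DeploymentSketch {

    // Stand-in for a pipeline component and its multipleDeploymentAllowed flag.
    static final class Component {
        final String name;
        final boolean multipleDeploymentAllowed;

        Component(String name, boolean multipleDeploymentAllowed) {
            this.name = name;
            this.multipleDeploymentAllowed = multipleDeploymentAllowed;
        }
    }

    // Returns how many instances each component would get with the given
    // thread count: one per thread until the first component that disallows
    // multiple deployment, then a single instance for it (assumed) and for
    // everything after it.
    static List<Integer> instanceCounts(List<Component> pipeline, int threads) {
        List<Integer> counts = new ArrayList<>();
        boolean singleInstanceFromHere = false;
        for (Component c : pipeline) {
            if (!c.multipleDeploymentAllowed) {
                singleInstanceFromHere = true;
            }
            counts.add(singleInstanceFromHere ? 1 : threads);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Component> pipeline = Arrays.asList(
                new Component("tokenizer", true),       // typical analysis engine default
                new Component("tagger", true),
                new Component("fileWriter", false),     // typical consumer default
                new Component("statsCollector", true)); // comes after the consumer

        List<Integer> counts = instanceCounts(pipeline, 4);
        for (int i = 0; i < pipeline.size(); i++) {
            System.out.println(pipeline.get(i).name + ": " + counts.get(i));
        }
        // Prints:
        // tokenizer: 4
        // tagger: 4
        // fileWriter: 1
        // statsCollector: 1
    }
}
```

Note that `statsCollector` ends up with a single instance even though its own flag is true, because it sits after the first component that disallows multiple deployment.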

Disclosure: I'm on the Apache UIMA project.