Monolithic ETL to distributed/scalable solution and OLAP cube to Elasticsearch/Solr

Question

Asked by codelove on 10 June 2015 at 02:28 · 234 views

I am relatively new to big data processing and am looking for some specific guidance from the SO community.

We currently have a monolithic, sequential ETL process. Needless to say, it does not scale as our data grows. What are our options? Distributing and parallelizing are the obvious answers, but I need specifics. I have played with Hadoop and it may be appropriate here, but I am wondering what other options are out there. Maybe something that is easier for a database developer to transition to?

A related question: we also have an OLAP cube for aggregated data. Are Elasticsearch or Solr good candidates for replacing an OLAP cube? Has anyone done this successfully? What are the gotchas?

There is 1 answer

**Sravan K Reddy** · Answer 1 · 2015-06-23T12:21:47+00:00

We are currently working on the same kind of use case, so our approach may be useful.

Step 1: We Sqoop data from the databases into HDFS.

Step 2: The ETL logic is written in Pig scripts.

Step 3: We build a Solr index on the aggregated table data.

Step 4: Users search Solr through a web interface.
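
The answer does not show its scripts, so the following is only a rough illustration of what steps 2 and 3 amount to. It does in plain Python what a group-and-sum Pig script would do on a cluster, then shapes each aggregated row as a flat document of the kind a search index could hold. The field names (`region`, `product`, `amount`), the sample rows, and the `id` format are all hypothetical.

```python
import json
from collections import defaultdict

# Hypothetical source rows, standing in for data already landed in HDFS.
ROWS = [
    {"region": "EU", "product": "A", "amount": 10.0},
    {"region": "EU", "product": "A", "amount": 5.0},
    {"region": "US", "product": "B", "amount": 7.5},
]


def aggregate(rows):
    """Group rows by (region, product) and sum the amount per group."""
    totals = defaultdict(lambda: {"total_amount": 0.0, "row_count": 0})
    for row in rows:
        key = (row["region"], row["product"])
        totals[key]["total_amount"] += row["amount"]
        totals[key]["row_count"] += 1
    return totals


def to_index_docs(totals):
    """Turn each aggregate into one flat document with a deterministic id."""
    docs = []
    for (region, product), agg in sorted(totals.items()):
        docs.append({
            "id": f"{region}|{product}",  # assumed id scheme, for illustration
            "region": region,
            "product": product,
            "total_amount": agg["total_amount"],
            "row_count": agg["row_count"],
        })
    return docs


docs = to_index_docs(aggregate(ROWS))
print(json.dumps(docs, indent=2))
```

A deterministic `id` built from the grouping keys means that re-indexing the same aggregate overwrites the earlier document instead of duplicating it, which matters when batches are reprocessed.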

In our use case we are developing Pig jobs that perform the transformation logic and store the results to final folders incrementally. Later, the MR (MapReduce) indexer tool indexes the data into Solr. We are using Cloudera Search. Let me know if you need anything else.
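
The incremental part can be pictured as follows. This is a minimal sketch, assuming each batch lands in a folder named `dt=YYYY-MM-DD` and that the date of the last indexed batch is tracked somewhere; both the folder layout and the `pending_folders` helper are hypothetical and not taken from the answer.

```python
from datetime import date


def pending_folders(folder_names, last_indexed):
    """Return the dated folders newer than last_indexed, oldest first."""
    pending = []
    for name in folder_names:
        # Assumed layout: one folder per batch, named dt=YYYY-MM-DD.
        if not name.startswith("dt="):
            continue  # skip temporary or unrelated folders
        batch_date = date.fromisoformat(name[len("dt="):])
        if batch_date > last_indexed:
            pending.append((batch_date, name))
    return [name for _, name in sorted(pending)]


folders = ["dt=2015-06-20", "dt=2015-06-22", "dt=2015-06-21", "_tmp"]
print(pending_folders(folders, date(2015, 6, 20)))
```

Only the folders returned here would be handed to the indexing step, so each run touches new data rather than the full history.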
