Mahout data model for an Amazon Redshift recommendation engine


How would I build a recommendation engine with Amazon Redshift as the data source? Is there a Mahout data model for Amazon Redshift or S3?


There is 1 answer

pferrel (BEST ANSWER)

Mahout uses Hadoop to read data, except for a few supported NoSQL and JDBC databases. Hadoop, in turn, can use S3. You'd have to configure Hadoop to use the S3 filesystem, and then Mahout should work fine reading from and writing to S3.

Redshift is a data warehousing solution based on Postgres that supports JDBC/ODBC. Mahout 0.9 supports data models stored in JDBC-compliant stores, so, though I haven't done it myself, Redshift should work as a data source.

The Mahout v1 recommenders run on Spark, and their input and output are text by default. All I/O goes through Hadoop, so S3 data is fine for input. The models created are also text, though, and need to be indexed and queried with a search engine like Solr or Elasticsearch. You can pretty easily write a reader to get data from any other store (such as Redshift), but you might not want to save the models in a data warehouse, since they need to be indexed by the search engine and should have very fast, search-engine-style retrieval.
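
For the "write a reader" part, here is a rough sketch of what that could look like: a small Java program that pulls (user, item, preference) rows out of Redshift over plain JDBC and writes them as comma-separated text lines, which is the kind of `userID,itemID,preference` input Mahout's file-based data models accept. I haven't run this against Redshift. The `ratings` table, its column names, and the class name `RedshiftExport` are all made up for illustration, and you would need the Redshift (or a compatible Postgres) JDBC driver on the classpath.

```java
import java.io.IOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

class RedshiftExport {
    // Hypothetical table and column names; adjust to your own schema.
    static final String QUERY = "SELECT user_id, item_id, rating FROM ratings";

    // Formats one preference as a "userID,itemID,preference" text line.
    static String toLine(long userId, long itemId, double preference) {
        return userId + "," + itemId + "," + preference;
    }

    // Writes every row of the result set as one text line; returns the row count.
    static int export(ResultSet rows, Writer out) throws SQLException, IOException {
        int count = 0;
        while (rows.next()) {
            out.write(toLine(rows.getLong(1), rows.getLong(2), rows.getDouble(3)));
            out.write('\n');
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 4) {
            // Example URL shape: jdbc:redshift://<endpoint>:5439/<database>
            System.out.println("usage: RedshiftExport <jdbcUrl> <user> <password> <outputFile>");
            return;
        }
        try (Connection conn = DriverManager.getConnection(args[0], args[1], args[2]);
             Statement stmt = conn.createStatement();
             ResultSet rows = stmt.executeQuery(QUERY);
             Writer out = Files.newBufferedWriter(Paths.get(args[3]), StandardCharsets.UTF_8)) {
            int n = export(rows, out);
            System.out.println("Wrote " + n + " preferences to " + args[3]);
        }
    }
}
```

The resulting text file can then be copied to HDFS or S3 and handed to Mahout as ordinary text input. For a large table you would probably rather export from Redshift to S3 directly than stream every row through a single JDBC connection.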