I am designing a new Hadoop-based data warehouse using Hive, and I was wondering whether the classic star/snowflake schemas are still a "standard" in this context.
Big Data systems embrace redundancy, so fully normalized schemas usually perform poorly (for example, in NoSQL databases like HBase or Cassandra).
Is it still a best practice to build star-schema data warehouses with Hive?
Or is it better to design wide, redundant tables that exploit the new columnar file formats (e.g., ORC or Parquet)?
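
To make the second option concrete, this is roughly what I have in mind: a single wide table that repeats the dimension values on every row and relies on a columnar format for compression (all table and column names are just made up for illustration):

    CREATE TABLE sales_wide (
      order_id          BIGINT,
      order_date        DATE,
      customer_name     STRING,   -- repeated on every order row (redundant)
      customer_city     STRING,
      product_name      STRING,
      product_category  STRING,
      quantity          INT,
      amount            DECIMAL(10,2)
    )
    STORED AS ORC;                -- or PARQUET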
When designing for NoSQL databases, you tend to optimize for a specific query: you preprocess part of the work up front and store a denormalized copy of the data (albeit denormalized in a query-specific way).
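
In Hive terms, that query-specific approach amounts to precomputing one query's answer into its own table. A minimal sketch, assuming a wide sales_wide table like the one in the question:

    -- One table per query pattern: the aggregation work is done up
    -- front, and readers just scan the precomputed copy.
    CREATE TABLE daily_sales_by_city
    STORED AS ORC AS
    SELECT order_date, customer_city, SUM(amount) AS total_amount
    FROM sales_wide
    GROUP BY order_date, customer_city;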
The star schema, on the other hand, is an all-purpose denormalization that's usually appropriate.
When you're planning on using Hive, you're really not using it for query-specific optimization but for the general-purpose nature of SQL, and as such I'd imagine the star schema is still appropriate. For a NoSQL database with a non-SQL interface, however, I'd suggest a more query-specific design.
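
To contrast with the query-specific copy above, here is a minimal star-schema sketch and the kind of ad-hoc query it keeps cheap to write; all names are hypothetical:

    CREATE TABLE dim_customer (
      customer_key  BIGINT,
      name          STRING,
      city          STRING
    ) STORED AS ORC;

    CREATE TABLE dim_product (
      product_key   BIGINT,
      name          STRING,
      category      STRING
    ) STORED AS ORC;

    CREATE TABLE fact_sales (
      order_date    DATE,
      customer_key  BIGINT,
      product_key   BIGINT,
      quantity      INT,
      amount        DECIMAL(10,2)
    ) STORED AS ORC;

    -- Any slice you have not planned for is still just a join away:
    SELECT p.category, c.city, SUM(f.amount) AS revenue
    FROM fact_sales f
    JOIN dim_customer c ON f.customer_key = c.customer_key
    JOIN dim_product  p ON f.product_key  = p.product_key
    GROUP BY p.category, c.city;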