My goal is to run a Spark worker on the same node as Cassandra, and also to have a separate node for the Spark master.
Right now, I am trying out DataStax Enterprise. During installation, I can select from three different node types: Cassandra, Search, and Analytics.
On my 3-node cluster, should I select the transactional node type on 2 nodes and the Analytics node type on 1 node (the Spark master)? How do I enable the Spark worker on the Cassandra nodes?
Thanks to MarcintheCloud's answer:
You're going to want to run all of those nodes in "Analytics" mode (running Spark specifically).
You can do that by setting the Spark flag in the dse default file (if you're using the rpm/deb package) or by starting dse with the -k parameter (tarball install); see: http://docs.datastax.com/en/datastax_enterprise/4.7/datastax_enterprise/spark/sparkStart.html
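For reference, here's roughly what both options look like (a minimal sketch following the DSE 4.7 docs linked above; exact paths can vary by install):

```
# Package (rpm/deb) install: enable Spark mode, then restart the service.
# /etc/default/dse should end up containing the line: SPARK_ENABLED=1
sudo sed -i 's/^SPARK_ENABLED=.*/SPARK_ENABLED=1/' /etc/default/dse
sudo service dse restart

# Tarball install: pass -k when starting DSE to bring the node up in Spark mode.
bin/dse cassandra -k
```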
DSE will automatically select a master for you if you haven't specified one explicitly. The Spark worker process will also start automatically on all nodes.
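If you want to confirm which node won the master election, one quick check is the Spark master web UI (assuming the default DSE port of 7080; substitute your own node addresses):

```
# The DSE Spark master web UI listens on port 7080 by default;
# whichever node serves this page is the current master.
curl http://10.0.0.1:7080   # replace 10.0.0.1 with one of your node IPs
```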
EDIT: Keep in mind that Cassandra is still running when DSE is running in Analytics mode. You can still service Cassandra transactions as you normally would.
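To illustrate that last point, plain CQL keeps working against an Analytics node (the keyspace/table names here are placeholders):

```
# Normal Cassandra reads and writes still work while the node runs in
# Analytics mode (my_ks/my_table are hypothetical names).
cqlsh 10.0.0.1 -e "SELECT * FROM my_ks.my_table LIMIT 10;"
```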
Let me know if you have any other questions!