Spark Jobserver High Availability


I have a standalone Spark cluster with a few nodes, and I was able to make it highly available with ZooKeeper. I'm using Spark Jobserver spark-2.0-preview, and I have configured the jobserver env1.conf file with the available Spark master URLs as follows:

spark://<master1>:<port>,<master2>:<port>
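For reference, a minimal sketch of how that might look inside env1.conf (the hostnames and port 7077 are placeholders, and the exact layout may vary by jobserver version):

    # env1.conf (sketch) - master1/master2:7077 are placeholder values
    spark {
      # Standalone cluster with ZooKeeper-based master failover:
      # list both masters so the jobserver can reconnect to whichever is alive
      master = "spark://master1:7077,master2:7077"
    }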

Everything works fine: if master1 goes down, the jobserver connects to master2.

  • But what happens if the machine where the jobserver is installed crashes?
  • Is there a way to do something like what I did with Spark: run two jobserver instances on two separate machines, with ZooKeeper managing failover if one fails?
  • Or do I need to manage that situation myself?

1 Answer

dumitru (BEST ANSWER)

I would go with the third option. I used Spark Jobserver once, not in an HA setup, but I was looking for a solution at the time. Let me give you my opinion:

  • If Spark Jobserver is deployed on only one machine, it is by definition a single point of failure if that machine crashes.
  • Spark Jobserver does not use ZooKeeper for node coordination (at least it didn't when I used it); instead it relies on the actor model implemented in the Akka framework.
  • The best way, I think, is to handle it yourself. A simple approach: start multiple Spark Jobserver instances on different machines, all pointing to the same database, and put a proxy in front of them (see the sketch after this list). The problem then moves to the HA of the database server, which is probably easier to solve.
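As a rough illustration of that approach (all hostnames, ports, and credentials below are hypothetical, and the exact configuration keys should be checked against your jobserver version): each instance would point its metadata store at the same shared database instead of the default embedded H2, e.g. in its .conf file:

    # shared-db.conf (sketch) - points the jobserver metadata store at a
    # shared MySQL database; db-host, user, and password are placeholders
    spark.jobserver {
      sqldao {
        slick-driver = slick.driver.MySQLDriver
        jdbc-driver = com.mysql.jdbc.Driver
        jdbc {
          url = "jdbc:mysql://db-host/spark_jobserver"
          user = "jobserver"
          password = "secret"
        }
      }
    }

A proxy such as HAProxy could then front the two instances. Again a sketch: 8090 is the jobserver's default REST port, but the health-check path is an assumption worth verifying for your version:

    # haproxy.cfg (sketch) - jobserver1/jobserver2 are placeholder hostnames
    frontend jobserver_front
        bind *:8090
        default_backend jobserver_back

    backend jobserver_back
        balance roundrobin
        option httpchk GET /healthz    # assumes a /healthz endpoint exists
        server js1 jobserver1:8090 check
        server js2 jobserver2:8090 check

With this layout, a crash of either jobserver machine only drops one backend out of the proxy pool, while job and context metadata survive in the shared database.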

I suggest checking the Spark Jobserver GitHub repo, as there is a discussion about this: https://github.com/spark-jobserver/spark-jobserver/issues/42