The version of Python with BigInsights is currently 2.6.6. How can I use a different version of Python with my Spark jobs running on YARN?
Note that users of BigInsights on cloud do not have root access.
Install Anaconda
This script installs Anaconda Python on a BigInsights on cloud 4.2 Enterprise cluster. Note that these instructions do NOT work for Basic clusters because you can only log in to a shell node, not to any other nodes.
SSH into the mastermanager node, then run the following (changing the values for your environment):
Next, run the following. The script attempts to be as idempotent as possible, so it shouldn't matter if you run it multiple times:
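To illustrate what idempotent means for an install step like this, here is a minimal sketch of a guard that skips the install when the target already contains a Python interpreter. It is not the script referred to above: the function name `install_anaconda` and both of its arguments are hypothetical, and the only installer options assumed are `-b` (batch mode) and `-p` (install prefix), which the Anaconda shell installer accepts.

```shell
# Hypothetical helper: install Anaconda into a prefix only if it is not there yet.
#   $1 = install prefix (assumed, e.g. a directory under your home directory)
#   $2 = path to an already downloaded Anaconda installer script (assumed)
install_anaconda() {
    if [ -x "$1/bin/python" ]; then
        # Re-running is harmless: nothing is reinstalled.
        echo "skip: $1 already has a Python interpreter"
        return 0
    fi
    # -b runs the installer without prompts, -p sets the install prefix.
    bash "$2" -b -p "$1"
}
```

Because no root access is available, the prefix would need to be a directory your user can write to, and the same path has to exist on every node that runs Spark executors.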
Running a pyspark job
If you are using pyspark, you can use Anaconda Python by setting the following variables before running the pyspark command:
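A minimal sketch of such variables, assuming Anaconda was installed under `$HOME/anaconda2` (a hypothetical path; substitute the location used on your cluster). `PYSPARK_PYTHON` and `PYSPARK_DRIVER_PYTHON` are the standard Spark environment variables for selecting the Python interpreter on the executors and the driver respectively.

```shell
# Assumed install location; override ANACONDA_HOME if yours differs.
ANACONDA_HOME="${ANACONDA_HOME:-$HOME/anaconda2}"

# Interpreter used by the Spark executors (must exist at this path on every node).
export PYSPARK_PYTHON="$ANACONDA_HOME/bin/python"

# Interpreter used by the driver process.
export PYSPARK_DRIVER_PYTHON="$ANACONDA_HOME/bin/python"
```

With these exported in the current shell, start pyspark as usual. Running `import sys; print(sys.version)` at the prompt should then report the Anaconda interpreter rather than 2.6.6.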
Zeppelin (optional)
If you are using Zeppelin (as per these instructions for BigInsights on cloud), set the following variables in zeppelin_env.sh:
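As a hedged sketch, the entry in the Zeppelin environment file could look like the fragment below. The install path is again an assumption, and whether your Zeppelin version also needs its interpreter settings changed is not covered here.

```shell
# Fragment for the Zeppelin environment file.
# Assumed Anaconda location; replace with the path used on your cluster.
ANACONDA_HOME="${ANACONDA_HOME:-$HOME/anaconda2}"

# Python interpreter that Zeppelin's pyspark paragraphs should use.
export PYSPARK_PYTHON="$ANACONDA_HOME/bin/python"
```

Zeppelin reads this file at startup, so it needs a restart before the change takes effect.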