I have deployed an HDInsight 3.5 Spark (2.0) cluster on Microsoft Azure with the standard configuration (Location = US East, Head Nodes = D12 v2 (x2), Worker Nodes = D4 v2 (x4)). When the cluster is running, I connect to a Jupyter notebook and try to import a module I created myself:
import own_module
This unfortunately does not work, so I tried to 1) upload own_module.py to the Jupyter Notebook home directory and 2) copy own_module.py to /home/sshuser via an SSH connection. Afterwards I added /home/sshuser to sys.path and PYTHONPATH:
import os
import sys

sys.path.append('/home/sshuser')
os.environ['PYTHONPATH'] = os.environ.get('PYTHONPATH', '') + ':/home/sshuser'
This also does not work, and the error persists:
Traceback (most recent call last):
ImportError: No module named own_module
Could someone tell me how I can import my own modules? Preferably by putting them in Azure Blob storage and then transferring them to the HDInsight cluster.
You can use the Spark context's addPyFile method. First upload the file to Azure Blob storage, then copy its public http/https URL and pass that URL to addPyFile. The module will then be accessible on the driver and on all executors.
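For example, in an HDInsight PySpark Jupyter notebook the SparkContext is already available as sc. A minimal sketch, assuming a publicly readable blob; the storage account, container, and file name below are placeholders you would replace with your own:

# In an HDInsight PySpark notebook, `sc` (SparkContext) is predefined.
# The URL is a placeholder: substitute your storage account, container,
# and uploaded file name.
sc.addPyFile('https://mystorageaccount.blob.core.windows.net/mycontainer/own_module.py')

# Once the file has been shipped to the cluster, import it as usual.
import own_module

addPyFile distributes the file to the driver and to every executor node, so functions that run on the cluster (for example inside rdd.map) can import the module as well.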