Error with findspark.init()

Hello. I am trying to complete the Spark Installation and Jupyter Notebook Integration mission but am running into trouble at the very end, when I set up Jupyter Notebook. I have copy/pasted the code from the tutorial but receive the following error:


IndexError Traceback (most recent call last)
in
1 # Find path to PySpark.
2 import findspark
----> 3 findspark.init()
4
5 # Import PySpark and initialize SparkContext object.

~/Library/Python/2.7/lib/python/site-packages/findspark.py in init(spark_home, python_path, edit_rc, edit_profile)
133 # add pyspark to sys.path
134 spark_python = os.path.join(spark_home, 'python')
--> 135 py4j = glob(os.path.join(spark_python, 'lib', 'py4j-*.zip'))[0]
136 sys.path[:0] = [spark_python, py4j]
137

IndexError: list index out of range

It seems to be some kind of issue in findspark itself? Has anyone else run into this issue? Any tips on how I might resolve it? Thanks!
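
A note on reading the traceback: line 135 of findspark.py takes element `[0]` of a `glob(...)` result, so the IndexError means no `py4j-*.zip` file was found under `<spark_home>/python/lib`. That suggests the Spark location being used is wrong or incomplete, rather than a bug in findspark itself. Here is a minimal sketch to check this. It assumes the Spark location comes from the `SPARK_HOME` environment variable (findspark may also locate Spark by other means), and `find_py4j_zips` is a hypothetical helper name, not part of findspark:

```python
import os
from glob import glob


def find_py4j_zips(spark_home):
    """Return the py4j archives matching the pattern shown in the traceback."""
    # Same path construction as lines 134-135 of findspark.py in the traceback.
    spark_python = os.path.join(spark_home, "python")
    return glob(os.path.join(spark_python, "lib", "py4j-*.zip"))


def describe_spark_home():
    """Report whether SPARK_HOME (an assumption, see above) contains py4j."""
    spark_home = os.environ.get("SPARK_HOME")
    if spark_home is None:
        return "SPARK_HOME is not set"
    matches = find_py4j_zips(spark_home)
    if not matches:
        return "no py4j-*.zip found under " + spark_home
    return "found: " + ", ".join(matches)


print(describe_spark_home())
```

If this reports that nothing was found, `findspark.init()` would hit the same empty list and raise the IndexError above.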

Solved using this blog post: https://jakevdp.github.io/blog/2017/12/05/installing-python-packages-from-jupyter/
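
For reference, the linked post covers installing packages into the same Python that the running Jupyter kernel uses, by invoking pip through `sys.executable`. The thread does not say which step fixed the error here, so the following is only a sketch of that approach; `pip_install_command` and `install_into_current_kernel` are hypothetical helper names:

```python
import subprocess
import sys


def pip_install_command(package):
    """Build a pip command bound to the interpreter running this code."""
    # sys.executable is the Python behind the current kernel, so the package
    # lands where this notebook can import it.
    return [sys.executable, "-m", "pip", "install", package]


def install_into_current_kernel(package):
    """Run the install; raises CalledProcessError if pip fails."""
    subprocess.check_call(pip_install_command(package))


# Show the command without running it.
print(" ".join(pip_install_command("findspark")))
```

Calling `install_into_current_kernel("findspark")` from a notebook cell would then install findspark for that kernel's interpreter rather than for whichever `pip` happens to be first on the shell path.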
