PySpark installs fine, but sc.textFile fails with a Py4J error from Jupyter Notebook (macOS)

Hi all!

I successfully installed PySpark, Spark, findspark, and Java 8 on my Mac (macOS). However, I am running into errors when starting Spark from a Jupyter notebook.

Screen Link:
https://app.dataquest.io/m/127/project%3A-spark-installation-and-jupyter-notebook-integration/1/introduction

My Code:

import findspark
findspark.init()

import pyspark
sc = pyspark.SparkContext()  # create the SparkContext used below
f = sc.textFile("recent-grads.csv")

The imports work fine, but when I execute the last line this happens:

ERROR:root:Exception while sending command.
Traceback (most recent call last):
  File "/Users/myuser/Documents/DataScience/Dataquest/spark-3.0.0-preview2-bin-hadoop2.7/python/lib/py4j-0.10.8.1-src.zip/py4j/java_gateway.py", line 1188, in send_command
    raise Py4JNetworkError("Answer from Java side is empty")
py4j.protocol.Py4JNetworkError: Answer from Java side is empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/myuser/Documents/DataScience/Dataquest/spark-3.0.0-preview2-bin-hadoop2.7/python/lib/py4j-0.10.8.1-src.zip/py4j/java_gateway.py", line 1014, in send_command
    response = connection.send_command(command)
  File "/Users/myuser/Documents/DataScience/Dataquest/spark-3.0.0-preview2-bin-hadoop2.7/python/lib/py4j-0.10.8.1-src.zip/py4j/java_gateway.py", line 1192, in send_command
    raise Py4JNetworkError(
py4j.protocol.Py4JNetworkError: Error while receiving
ERROR:py4j.java_gateway:An error occurred while trying to connect to the Java server (127.0.0.1:51547)
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/IPython/core/interactiveshell.py", line 3331, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-4-80131646d7c7>", line 1, in <module>
    f= sc.textFile("recent-grads.csv")
  File "/Users/myuser/Documents/DataScience/Dataquest/spark-3.0.0-preview2-bin-hadoop2.7/python/pyspark/context.py", line 610, in textFile
    return RDD(self._jsc.textFile(name, minPartitions), self,
  File "/Users/myuser/Documents/DataScience/Dataquest/spark-3.0.0-preview2-bin-hadoop2.7/python/lib/py4j-0.10.8.1-src.zip/py4j/java_gateway.py", line 1285, in __call__
    return_value = get_return_value(
  File "/Users/myuser/Documents/DataScience/Dataquest/spark-3.0.0-preview2-bin-hadoop2.7/python/lib/py4j-0.10.8.1-src.zip/py4j/protocol.py", line 334, in get_return_value
    raise Py4JError(
py4j.protocol.Py4JError: An error occurred while calling o11.textFile

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/IPython/core/interactiveshell.py", line 2044, in showtraceback
    stb = value._render_traceback_()
AttributeError: 'Py4JError' object has no attribute '_render_traceback_'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/myuser/Documents/DataScience/Dataquest/spark-3.0.0-preview2-bin-hadoop2.7/python/lib/py4j-0.10.8.1-src.zip/py4j/java_gateway.py", line 958, in _get_connection
    connection = self.deque.pop()
IndexError: pop from an empty deque

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/myuser/Documents/DataScience/Dataquest/spark-3.0.0-preview2-bin-hadoop2.7/python/lib/py4j-0.10.8.1-src.zip/py4j/java_gateway.py", line 1096, in start
    self.socket.connect((self.address, self.port))
ConnectionRefusedError: [Errno 61] Connection refused

My Jupyter Notebook terminal outputs this:

[I 17:51:21.652 NotebookApp] Replaying 6 buffered messages
20/05/29 17:51:34 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
dyld: lazy symbol binding failed: Symbol not found: ____chkstk_darwin
  Referenced from: /private/var/folders/26/4zss0t615_j12pj2s4jhd_4w0000gn/T/liblz4-java-5812855804467217827.dylib (which was built for Mac OS X 10.15)
  Expected in: /usr/lib/libSystem.B.dylib

dyld: Symbol not found: ____chkstk_darwin
  Referenced from: /private/var/folders/26/4zss0t615_j12pj2s4jhd_4w0000gn/T/liblz4-java-5812855804467217827.dylib (which was built for Mac OS X 10.15)
  Expected in: /usr/lib/libSystem.B.dylib

Here's my .bash_profile:

# Setting PATH for Python 3.8
# The original version is saved in .bash_profile.pysave
PATH="/Library/Frameworks/Python.framework/Versions/3.8/bin:${PATH}"
export PATH
alias python="python3"
alias pip="pip3"

export SPARK_HOME="/Users/myuser/Documents/DataScience/Dataquest/spark-3.0.0-preview2-bin-hadoop2.7"
export PATH=$PATH:"/Users/myuser/Documents/DataScience/Dataquest/spark-3.0.0-preview2-bin-hadoop2.7/bin"

export JAVA_HOME="/Library/Java/JavaVirtualMachines/jdk1.8.0_251.jdk/Contents/Home"
export PATH=$JAVA_HOME/bin:$PATH
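
A quick way to check, from inside the notebook itself, whether these variables actually reach the Jupyter process (they only do if the shell that launched Jupyter sourced .bash_profile):

```python
import os

# Print the environment variables as the notebook process sees them.
# If these come back as None, Jupyter was started from a shell that
# never sourced ~/.bash_profile, and findspark/Spark won't find anything.
print("SPARK_HOME =", os.environ.get("SPARK_HOME"))
print("JAVA_HOME  =", os.environ.get("JAVA_HOME"))
```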

Since these errors are very specific, I couldn't find a one-size-fits-all solution online. I suspect it might have something to do with the default ports Java uses vs. Jupyter/Python (which I don't know how to change), or with the install paths and environment variables set for Java, Spark, and Python.

Any help would be appreciated.
Many thanks in advance,
Marina

Hi @tchintchie,

please ignore this if you have already come across this forum post:

I just googled the first part of the error stack you posted.

P.S. I have zero experience with the components you are trying to install, but based on past experience, the first thing that came to my mind was a version incompatibility.
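
One more thing from that googling (treat this as a guess, since I haven't used Spark myself): the dyld lines in your terminal output say liblz4-java "was built for Mac OS X 10.15", so if your Mac runs an older macOS, that bundled native library can't be loaded at all. You can check your installed version from Python:

```python
import platform

# The dyld error says liblz4-java was built for Mac OS X 10.15.
# platform.mac_ver() returns the running macOS version as its first
# element (an empty string on non-Mac systems), so you can compare
# it against 10.15:
mac_version = platform.mac_ver()[0]
print("macOS version:", mac_version or "not a Mac")
```

If your version is older than 10.15, the two fixes people seem to report are upgrading macOS, or downloading the stable Spark 2.4.x build instead of the 3.0.0 preview (which bundles that newer liblz4-java). Spark also has a `spark.io.compression.codec` setting (default "lz4") that can be set to "lzf", which might avoid loading the native lz4 library, but I can't confirm that from experience.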