Unable to find py4j, your SPARK_HOME may not be configured correctly

Screen Link:

My Code:

Find path to PySpark.

import findspark
findspark.init()

Import PySpark and initialize SparkContext object.

import pyspark
sc = pyspark.SparkContext()

Read recent-grads.csv into an RDD.

f = sc.textFile('recent-grads.csv')
data = f.map(lambda line: line.split('\n'))
data.take(10)


What I expected to happen:

What actually happened:
IndexError Traceback (most recent call last)
C:\ProgramData\Anaconda3\lib\site-packages\findspark.py in init(spark_home, python_path, edit_rc, edit_profile)
142 try:
--> 143 py4j = glob(os.path.join(spark_python, "lib", "py4j-*.zip"))[0]
144 except IndexError:

IndexError: list index out of range

During handling of the above exception, another exception occurred:

Exception Traceback (most recent call last)
in
1 # Find path to PySpark.
2 import findspark
----> 3 findspark.init()
4
5 # Import PySpark and initialize SparkContext object.

C:\ProgramData\Anaconda3\lib\site-packages\findspark.py in init(spark_home, python_path, edit_rc, edit_profile)
143 py4j = glob(os.path.join(spark_python, "lib", "py4j-*.zip"))[0]
144 except IndexError:
--> 145 raise Exception(
146 "Unable to find py4j, your SPARK_HOME may not be configured correctly"
147 )

Exception: Unable to find py4j, your SPARK_HOME may not be configured correctly


I also struggled to get PySpark working; in my case I had the wrong Java version.

Try following this; if you are unsure about how to set environment variables, try googling it!
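
In case it helps, here is a minimal sketch of pointing findspark at an explicit Spark installation before creating the SparkContext. The JAVA_HOME and SPARK_HOME paths below are placeholders, not paths from your machine; adjust them to wherever a compatible JDK (Java 8 for Spark 2.x) and your Spark download actually live.

import os
import findspark

# Placeholder paths - replace with your actual Java and Spark install locations.
os.environ["JAVA_HOME"] = r"C:\Program Files\Java\jdk1.8.0_281"
os.environ["SPARK_HOME"] = r"C:\spark\spark-2.4.7-bin-hadoop2.7"

# Passing spark_home explicitly avoids relying on findspark's environment lookup.
findspark.init(spark_home=os.environ["SPARK_HOME"])

import pyspark
sc = pyspark.SparkContext()

findspark looks for the py4j zip under SPARK_HOME\python\lib, which is exactly the glob that fails in the traceback above, so the exception usually means SPARK_HOME is unset or points at a folder that is not a real Spark installation.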