Spark: Confused in reduceByKey

Screen Link: https://app.dataquest.io/m/60/introduction-to-spark/9/explanation

daily_show.map(lambda x: (x[0], 1)).reduceByKey(lambda x, y: x+y)

Can someone explain how reduceByKey works?

@prateek here, reduceByKey(f) groups together all tuples that share the same key and combines their values with the function f. With lambda x, y: x + y, the values for the key '1991' are summed: since 1 + 1 + 1 + 1 = 4, the result is ('1991', 4).
If you instead multiplied the values with lambda x, y: x * y, the result would be ('1991', 1), since 1 * 1 * 1 * 1 = 1.
for details: https://sparkbyexamples.com/pyspark/pyspark-reducebykey-usage-with-examples/
Please mark my answer as solution, if you find it useful.
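To make the grouping-then-combining behavior concrete, here is a minimal pure-Python sketch that emulates what reduceByKey does (this is an illustration, not Spark's actual distributed implementation; the helper name reduce_by_key and the sample pairs are assumptions based on the daily_show example above):

```python
from functools import reduce
from collections import defaultdict

def reduce_by_key(pairs, f):
    """Emulate Spark's reduceByKey: group values by key, then fold each group with f."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return [(key, reduce(f, values)) for key, values in groups.items()]

# Four (year, 1) pairs, as map(lambda x: (x[0], 1)) would produce for four 1991 rows
pairs = [('1991', 1), ('1991', 1), ('1991', 1), ('1991', 1)]

print(reduce_by_key(pairs, lambda x, y: x + y))  # [('1991', 4)]
print(reduce_by_key(pairs, lambda x, y: x * y))  # [('1991', 1)]
```

In real Spark, the same combining function is also applied across partitions, which is why f must be associative and commutative.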


So, in the given scenario, '1991' is the key and 1 is the value. Right?


Yeah, absolutely :slight_smile:

Cool, thanks :grinning:


@prateek you marked your own reply as solution :slight_smile: