I would like to know your opinion on using Python versus Scala with Apache Spark in a large project. What are the advantages of each?
Hi there! It’s been a while, but assuming nothing has changed, I’d say that Scala slightly outperforms Python IF YOU KNOW BOTH LANGUAGES EQUALLY. If I remember correctly, Spark itself is written in Scala, so using Python requires a few adjustments or extra steps, runs somewhat slower, and occasionally fails at runtime with errors that Scala would have caught at compile time. When Spark came out, I learned Scala just for Spark… I thought this was fun, but it was also time-consuming. If you know Python but not Scala, I would recommend you just get started with Spark in Python: you can do amazing things with minimal code, and learning a few adjustments is WAY easier than learning a new language. If you know Scala but not Python… use Spark in Scala… but also learn Python at some point soon, because it’s a great scripting language to round out your programming skill set! It will be like that time when: https://xkcd.com/353/ Good luck!!