I gave a talk at PyGotham, and I wish I could make all of this prettier. It’s not my best work, since I’m not terribly experienced with the subject matter. But it does represent the packaged knowledge I wish I’d had when I was getting started with Spark.

A few people asked for slides, so this is my attempt to post them. I’m not bothering with making reveal.js work, so it’s all one long HTML doc. Sorry it’s not prettier. I know what pretty looks like; I don’t know how to make pretty things.

Well, for what it’s worth, I present to you the slides for “Spark Dataframes for the Pandas Pro”.