Project: A Crime Analysis of the Last Decade in NYC
Big Data Project with Apache Spark + Amazon EMR
Below are the slides and written report for the Big Data Seminar course. The project is written with Spark and was deployed first on Databricks and then on Amazon EMR. The packages involved are SparkML, developed by the Apache Spark team, and Azure Machine Learning, developed by Microsoft.
I used the Databricks Community Edition, and running on EMR cost around $5 (paid by the school). If you are interested in replicating the results yourself, feel free to take my code from the following links: