Advanced Analytics with Spark: Patterns for Learning from Data at Scale
In this practical book, four Cloudera data scientists present a set of self-contained patterns for performing large-scale data analysis with Spark. The authors bring Spark, statistical methods, and real-world data sets together to teach you how to approach analytics problems by example.
You'll start with an introduction to Spark and its ecosystem, and then dive into patterns that apply common techniques (classification, collaborative filtering, and anomaly detection, among others) to fields such as genomics, security, and finance. If you have an entry-level understanding of machine learning and statistics, and you program in Java, Python, or Scala, you'll find these patterns useful for working on your own data applications.
Patterns include: