Large Scale Data Analysis with MongoDB
Donn Felker
In academic and medical research, SQL and XML (gasp!) based data stores are commonly used to manage large data sets. These systems often hold large volumes of data that require computationally intensive analysis, yet they have traditionally performed poorly on such data sets. In this class, we will cover how to use MongoDB running on EC2 for efficient, large-scale DNA genome analysis at AgileMedicine. You will see benchmark numbers compared against an equivalent MySQL database running on the same EC2 configuration. We will conclude with how MongoDB has helped to import large DNA genome data sets (such as those from 23andMe) and to speed up analysis while lowering costs for the academic and medical research fields.

Level: Overview