A Raw Data Journal: A History of Everything
Every time you run an 'UPDATE' on your database, you lose information. If you want to analyze how your data changes over time, or undo a mistake, you’re out of luck. Several next-generation database systems have been designed from the ground up to solve this problem by placing an emphasis on data versioning, but it can be done without living on the bleeding edge. In fact, you can start today with your existing database, without impacting performance, changing your application, or even going down for maintenance.
This class will introduce the concept of a "Raw Data Journal," which is a log of every state your data has ever been in, and share how you can start aggregating this journal in HDFS directly from your MySQL or PostgreSQL database. We'll then work through some examples of how to use the journal for analytics and disaster recovery.
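
To make the idea of a journal concrete, here is a minimal in-memory sketch in Python. It is only an illustration under stated assumptions: the names `JournalEntry`, `RawDataJournal`, `record`, `state_at`, and `history` are hypothetical, entries live in a Python list rather than in HDFS, and changes are recorded by hand rather than captured from MySQL or PostgreSQL. The point it shows is that an append-only log of row states lets you rebuild the data as of any past moment.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class JournalEntry:
    """One recorded row state (hypothetical layout, for illustration only)."""
    ts: int                         # logical timestamp of the change
    key: str                        # primary key of the affected row
    row: Optional[Dict[str, Any]]   # full row after the change; None means deleted


class RawDataJournal:
    """Append-only log of row states. Nothing is ever overwritten."""

    def __init__(self) -> None:
        self._entries: List[JournalEntry] = []

    def record(self, ts: int, key: str, row: Optional[Dict[str, Any]]) -> None:
        # Copy the row so later mutation by the caller cannot rewrite history.
        snapshot = None if row is None else dict(row)
        self._entries.append(JournalEntry(ts, key, snapshot))

    def state_at(self, ts: int) -> Dict[str, Dict[str, Any]]:
        """Rebuild the table as it looked at time `ts` by replaying the log."""
        state: Dict[str, Dict[str, Any]] = {}
        # sorted() is stable, so entries with equal timestamps keep append order.
        for entry in sorted(self._entries, key=lambda e: e.ts):
            if entry.ts > ts:
                break
            if entry.row is None:
                state.pop(entry.key, None)
            else:
                state[entry.key] = entry.row
        return state

    def history(self, key: str) -> List[JournalEntry]:
        """Every recorded state of one row, in the order it was appended."""
        return [e for e in self._entries if e.key == key]


# Assumed example data: an INSERT, an UPDATE, then a DELETE of the same row.
journal = RawDataJournal()
journal.record(1, "user:1", {"email": "a@example.com"})
journal.record(2, "user:1", {"email": "b@example.com"})
journal.record(3, "user:1", None)

before_update = journal.state_at(1)   # the row as it was before the UPDATE
after_delete = journal.state_at(3)    # the row is gone, but its history is not
```

In this sketch, "undoing a mistake" amounts to reading `state_at` for a timestamp just before the mistake and writing those rows back, while analytics can scan `history` to see how a row evolved.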
Level: Intermediate