Getting Started in Big Data Science Consulting
Will Ford
Getting started in Big Data Science (BDS) consulting can be daunting. This class walks through a BDS consulting engagement from concept to completion. We will start by discussing the professional landscape and the various ways to get started as an aspiring consultant. Next, we will cover available open-source tools and other resources, such as Massive Open Online Courses (MOOCs). Most of our time will be spent on the technical side of scoping, executing, and delivering a project. If time permits, we will discuss ancillary topics such as trends we are seeing with our clients and how we approach new domains.

We will discuss:
  • Joining an established firm vs. a start-up vs. working independently
  • Scoping your project, and the risks that early-career BDS consultants should avoid
  • Choosing a modeling method based on associated trade-offs
  • Using feature generation to improve models
  • Using models to improve feature generation
  • Variable selection
  • Dimensionality reduction
  • Overtraining
  • Model delivery and presentation
  • Project management for data scientists
Note: An introductory understanding of p-values, information gain, receiver operating characteristic (ROC) curves, logistic regression, and classification in general will be useful.

Level: Intermediate