Modern machine learning libraries make model-building look deceptively easy. There is a prevalent and unnecessary emphasis on tools like R, Python, and SparkML, and on techniques like deep learning (an emphasis the speaker admits to finding annoying). Relying on tools and techniques while ignoring the fundamentals is the wrong approach to model building.
Real-world machine learning requires hard work, discipline, and rigor. The development of robust models requires due diligence during the data acquisition phase and an obsession with data quality.
Experienced machine learning engineers spend most of their time on data-related issues, model evaluation, and parameter tuning, and only a fraction of it on actual model building. This is the 80/20 rule at work: roughly 80 percent of the effort goes into everything around the model and about 20 percent into building it.
Unlike most talks these days, this talk is not about deep learning. We will ignore the hype and strictly focus on the fundamentals of building robust machine learning models.
What You'll Learn
- How to take a data-driven approach to solving a business problem
- Why feature engineering, the choice of evaluation metrics, and an understanding of the model bias/variance trade-off are often more important than the choice of tools
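
As a small illustration of the evaluation-metrics point, here is a minimal sketch in plain Python. The data is made up for the example (5 positive cases out of 100) and the "model" is a hypothetical baseline that always predicts the majority class; none of this comes from the talk itself. It shows how accuracy can look strong while recall reveals that the model never finds a single positive case.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of actual positives that the model identified."""
    true_positives = sum(
        t == positive and p == positive for t, p in zip(y_true, y_pred)
    )
    actual_positives = sum(t == positive for t in y_true)
    return true_positives / actual_positives if actual_positives else 0.0


# Assumed, synthetic labels: 5 positives and 95 negatives.
y_true = [1] * 5 + [0] * 95

# Hypothetical baseline that always predicts the majority (negative) class.
y_pred = [0] * 100

baseline_accuracy = accuracy(y_true, y_pred)
baseline_recall = recall(y_true, y_pred)

print(f"accuracy: {baseline_accuracy:.2f}")
print(f"recall:   {baseline_recall:.2f}")
```

The same predictions score high on one metric and zero on the other, regardless of which library or language produced them.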
Raja Iqbal - Raja Iqbal is a data scientist, a passionate educator, and an internationally recognized speaker on all things data science. He is the Founder and Chief Data Scientist at Data Science Dojo. Prior to Data Science Dojo, Raja worked at Microsoft in a variety of research and development roles involving machine learning and data mining at very large scale. Raja has a Ph.D. in Computer Science from Tulane University, with a focus on machine learning and data mining.