Course Description

    In this Introduction to Text Analytics with R series, we will focus on data pipelines: series of processing steps that move data from a source to a destination.

    What You'll Learn

      >  Exploration of textual data for pre-processing “gotchas”

      >  Using the quanteda package for text analytics

      >  Creation of a prototypical text analytics pre-processing pipeline, including (but not limited to): tokenization, lower casing, stop word removal, and stemming

      >  Creation of a document-feature matrix (the "dfm" in quanteda, sometimes called a document-frequency matrix) used to train machine learning models
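
The pre-processing steps listed above can be sketched in a few lines of quanteda. This is only a minimal illustration, not the course's own code: the two example messages are made up (they are not taken from the SMS Spam Collection), and the variable names texts, toks, and sms_dfm are hypothetical. It assumes a recent quanteda release is installed.

```r
library(quanteda)

# Two made-up messages standing in for SMS data (hypothetical, not from the real dataset).
texts <- c(doc1 = "WINNER!! You have won a FREE prize, call now",
           doc2 = "Are you coming to the meeting tomorrow?")

# 1. Tokenize into words, dropping punctuation, numbers, and symbols.
toks <- tokens(texts, what = "word",
               remove_punct = TRUE,
               remove_numbers = TRUE,
               remove_symbols = TRUE)

# 2. Lower-case every token.
toks <- tokens_tolower(toks)

# 3. Remove English stop words (quanteda's built-in "en" list).
toks <- tokens_remove(toks, stopwords("en"))

# 4. Stem the remaining tokens.
toks <- tokens_wordstem(toks, language = "english")

# 5. Build the matrix of documents (rows) by token counts (columns).
sms_dfm <- dfm(toks)
print(sms_dfm)
```

Each row of sms_dfm is one message and each column is one stemmed token, so the matrix can be converted (for example with as.matrix) into features for a machine learning model.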


    Text Analytics Tutorial slides can be accessed here

    Download R here

    The SMS Spam Collection Dataset used in this tutorial can be accessed here


    Data Science Dojo Instructor - Data Science Dojo is a paradigm shift in data science learning. We enable all professionals (and students) to extract actionable insights from data.