## TF-IDF

**Course Description**

Term frequency-inverse document frequency, commonly referred to as TF-IDF, is used to measure how relevant a term is to a document within a collection of documents. We will discuss how the document-term frequency matrix representation can be improved, and introduce the mighty term frequency-inverse document frequency.

**What You'll Learn**

- How to deal with documents of unequal lengths.
- What to do about terms that are very common across documents.
- TF for dealing with documents of unequal lengths.
- IDF for dealing with terms that appear frequently across documents.
- Implementation of TF-IDF using R functions and applying them to document-term frequency matrices.
- Data cleaning of matrices post weighting/transformation.
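
As a preview of the topics above, here is a minimal base R sketch of TF-IDF weighting on a document-term frequency matrix. The toy matrix `dtm`, its document and term names, and the unsmoothed `log(N / df)` form of IDF are assumptions chosen for illustration; the course may use a different IDF variant or package functions.

```r
# Toy document-term frequency matrix (hypothetical data):
# rows are documents, columns are terms, cells are raw counts.
dtm <- matrix(
  c(2, 1, 0,
    4, 0, 2,
    1, 1, 1),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("doc1", "doc2", "doc3"), c("data", "model", "text"))
)

# TF: divide each row by the document's total term count,
# so documents of unequal lengths become comparable.
tf <- dtm / rowSums(dtm)

# IDF: log of (number of documents / number of documents containing the term).
# A term that appears in every document gets an IDF of 0.
idf <- log(nrow(dtm) / colSums(dtm > 0))

# TF-IDF: scale each term column of the TF matrix by that term's IDF.
tfidf <- sweep(tf, 2, idf, `*`)

# Cleaning after weighting: drop terms whose weights are zero in every document.
tfidf_clean <- tfidf[, colSums(tfidf != 0) > 0, drop = FALSE]

print(round(tfidf_clean, 3))
```

In this toy example the term `data` occurs in all three documents, so its IDF is zero and the cleaning step removes its column, leaving only `model` and `text`.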

