If you think machines learning from numbers is interesting, then machines learning from words is even more interesting.
In this quick tutorial, we go over the basics of natural language processing (NLP). Machines can’t simply read and interpret language innately as we humans can. So how can a machine recognize sarcasm, tell whether a sentence is posed as a question, or even just find the main topic and recurring themes in the words?
What You'll Learn
- What is Natural Language Processing
- Important applications of NLP
- Why data processing is an essential step in NLP
Welcome to this short introduction to natural language processing. If you think machines learning from numbers is interesting, machines learning from words is even more interesting.
Natural language processing is how computer programs are able to make sense of words in their surrounding context. For example, you could write a computer program to pick up on sarcasm such as “That’s funny… not.” Or to understand “The world will end!” as an exclamation versus “The world will end?” as a question. But machines can’t simply read and interpret language innately like humans can. So how can machines recognize sarcasm, or tell whether a sentence is posed as a question, or even just find the main topic and recurring themes in the words?
Well, they do this through the best means they can: calculations. If there’s one thing machines do very well, it’s calculations. And so calculations on words and textual features are what allow machines to determine whether a piece of text contains sarcasm, or is more negative than positive in sentiment, or contains more rhetoric than factual statements, or is on this topic versus that topic. Counting the frequency of words, taking the surrounding context into account, and then doing calculations is the basis of how machines make sense of natural language.
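To make the idea of "calculations on textual features" concrete, here is a minimal Python sketch that tells the exclamation and the question from the example above apart. It assumes the only feature we look at is the final punctuation mark; the function name `sentence_type` is made up for this illustration, and real systems rely on far richer features than this.

```python
def sentence_type(sentence):
    """Classify a sentence using a single textual feature: its last character."""
    stripped = sentence.strip()
    if stripped.endswith("?"):
        return "question"
    if stripped.endswith("!"):
        return "exclamation"
    return "statement"


for text in ["The world will end!", "The world will end?", "The world will end."]:
    print(text, "->", sentence_type(text))
```

Both sentences contain exactly the same words, so a plain word count could not separate them. The punctuation is the feature that carries the difference here.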
So, in order to count or do calculations on words and textual features, the raw text, or natural language itself, first needs to be processed into a more structured data format that machines can work with. This basically means cleaning up the text and then organizing it into tables of word counts across documents. It could also mean tabulating pairs of words that occur together, which takes the surrounding context of the words into account. Cleaning up raw text and organizing it into a table is an essential step in natural language processing.
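The processing step described above can be sketched in a few lines of Python using only the standard library. This is an illustrative sketch, not a full pipeline: it assumes that "cleaning" means lowercasing and keeping only letters and plain apostrophes, and the helper names (`clean_and_tokenize`, `word_count_table`, `word_pair_counts`) are invented for this example.

```python
import re
from collections import Counter


def clean_and_tokenize(text):
    """Lowercase the text and keep only runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def word_count_table(documents):
    """Return (vocabulary, rows); each row holds one document's word counts."""
    counts = [Counter(clean_and_tokenize(doc)) for doc in documents]
    vocabulary = sorted(set().union(*counts))
    rows = [[count[word] for word in vocabulary] for count in counts]
    return vocabulary, rows


def word_pair_counts(text):
    """Count adjacent word pairs, a simple way to keep some context."""
    tokens = clean_and_tokenize(text)
    return Counter(zip(tokens, tokens[1:]))


documents = ["The world will end!", "Will the world end? The world will not."]
vocabulary, rows = word_count_table(documents)
print(vocabulary)  # column headers of the table
for row in rows:
    print(row)     # one row of counts per document
print(word_pair_counts(documents[0]))
```

Each row lines up with the vocabulary, so the raw sentences become a table of numbers that a program can do calculations on. Note that this particular cleaning step throws away punctuation, which is a design choice: it makes counting simpler but loses the difference between a question and an exclamation.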
The word “processing” should be emphasized in natural language processing. Without processing, you’re just left with natural language, which, as mentioned, machines cannot interpret as easily as you and I can. The text needs to be processed first, and then calculations can be done on it.
Some key applications of natural language processing are categorizing texts into negative or positive sentiment to automatically distinguish unsatisfied customers from satisfied customers. Or categorizing texts into topics to recommend articles on the same topic. Or answering user questions by retrieving relevant information from the documentation. Or summarizing the most important information in lengthy documents. All of these rely on a machine’s ability to understand words and textual features. And that quickly sums up natural language processing.
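As a rough illustration of the first application, sentiment categorization, the sketch below scores a text by counting words from two small word lists. The word lists and function names are hypothetical and chosen only for this example; practical systems typically learn which words matter from labeled data rather than from a hand-written list.

```python
import re

# Hypothetical, hand-picked word lists used only for this illustration.
POSITIVE_WORDS = {"great", "love", "helpful", "fast", "happy"}
NEGATIVE_WORDS = {"broken", "slow", "hate", "refund", "unhappy"}


def sentiment_score(text):
    """Return (number of positive words) minus (number of negative words)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    positive = sum(token in POSITIVE_WORDS for token in tokens)
    negative = sum(token in NEGATIVE_WORDS for token in tokens)
    return positive - negative


def label_sentiment(text):
    """Turn the numeric score into a coarse label."""
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


reviews = [
    "I love how fast and helpful support was.",
    "The app is slow and broken, I want a refund.",
]
for review in reviews:
    print(label_sentiment(review), "->", review)
```

A counter this simple looks at words in isolation, so it would misread sarcasm such as “That’s funny… not.” That is exactly why surrounding context matters.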
Thanks for watching!
Rebecca Merrett - Rebecca holds a bachelor’s degree in information and media from the University of Technology Sydney and a postgraduate diploma in mathematics and statistics from the University of Southern Queensland. She has a background in technical writing for games development and has written for tech publications.