Natural language classification (NLC) is part of the broader field of natural language processing (NLP), alongside related technologies such as natural language understanding (NLU) and natural language generation (NLG). IBM Watson™ provides a cloud-based service, aptly called Watson Natural Language Classifier, that lets a developer train a classifier on labeled text. Using cognitive computing techniques, the service then returns the best-matching predefined class for new text.
The easiest way I found to wrap my head around NLC was to use the sample data in the Getting started tutorial. You can watch a walk-through video of that tutorial, which uses Watson Natural Language Classifier to classify questions about weather. In the tutorial, a small dataset of text input acts as our training data. It includes sentences we train as part of the weather class, such as “How is the weather outside?” or “Is it snowing?”, and sentences we train as part of the temperature class, such as “What’s the temperature outside?” or “Is it cold outside?” Even with a small dataset, we see fairly high accuracy on questions that are not in our training data, such as questions that involve “blizzard” or “rain.”
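To make the shape of that training data concrete, here is a minimal sketch that writes the example sentences above as CSV rows. It assumes a two-column layout (the text first, then its class label), which is my understanding of the format the service expects; check the service documentation before relying on it. The names `TRAINING_EXAMPLES` and `build_training_csv` are made up for illustration, and the sketch does not call the service itself.

```python
import csv
import io

# Example sentences quoted in the text above, paired with their class labels.
TRAINING_EXAMPLES = [
    ("How is the weather outside?", "weather"),
    ("Is it snowing?", "weather"),
    ("What's the temperature outside?", "temperature"),
    ("Is it cold outside?", "temperature"),
]


def build_training_csv(examples):
    """Return CSV text with one row per example: text, then class label.

    Assumption: the two-column layout is illustrative, not confirmed
    against the service documentation.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for text, label in examples:
        writer.writerow([text, label])
    return buffer.getvalue()


training_csv = build_training_csv(TRAINING_EXAMPLES)
print(training_csv)
```

Using the `csv` module rather than joining strings by hand means that any sentence containing a comma or a quotation mark is quoted correctly, which matters once the dataset grows beyond a handful of short questions.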
We recently created a “Classify ICD-10 data with Watson” code pattern to take things a bit further. We created a small Python-based web app and used a much larger dataset: the ICD-10 dataset, which maps medical diagnoses to ICD-10 designations. Check out the code in our GitHub repo, then fork it, clone it, and modify it to fit your use case.
It’s not hard to imagine a scenario where text classification could be useful. Classifying email, tweets, or posts as spam or malicious is an easy-to-understand example. We could also use Watson Natural Language Classifier to look up FAQs or other reference documents (like ICD-10).
To learn more about Watson Natural Language Classifier, check out the following resources:
- Sample apps that also use NLC
- 5 Things to Know About Watson Natural Language Classifier
- Watson Natural Language Classifier Best Practices
- Watson Natural Language Classifier APIs