OpenNLP for Text Based Machine Learning Training Course


The Apache OpenNLP library is a machine learning based toolkit for processing natural language text. It supports the most common NLP tasks, such as language detection, tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing and coreference resolution.

In this instructor-led, live training, participants will learn how to create models for processing text-based data using OpenNLP. Sample training data as well as customized data sets will be used as the basis for the lab exercises.

By the end of this training, participants will be able to:

  • Install and configure OpenNLP
  • Download existing models as well as create their own
  • Train the models on various sets of sample data
  • Integrate OpenNLP with existing Java applications

Audience

  • Developers
  • Data scientists

Format of the course

  • Part lecture, part discussion, exercises and heavy hands-on practice

Requirements

  • Java programming experience

Course Outline

Introduction to Machine Learning and Natural Language Processing

Installing and Configuring OpenNLP

Overview of OpenNLP’s Library Structure

Downloading Existing Models

Calling OpenNLP’s APIs

Sentence Detection and Tokenization

Part-of-Speech (POS) Tagging

Phrase Chunking


Name Finding

English Coreference

Training the Tools

Creating a Model from Scratch

Extending OpenNLP

Closing Remarks
