
What is Natural Language Processing (NLP)?

Introduction to Natural Language Processing (NLP)

Natural Language Processing is a technique in which we process ordinary human language and make sense of it. This tutorial has two parts: theory and practice.
We will also publish a few posts with examples of Natural Language Processing using both Stanford CoreNLP and Apache OpenNLP.

Processing natural language involves a series of steps. There are many Natural Language Processing tools, such as Apache OpenNLP and Stanford CoreNLP,
but the major steps are the same in all of them.
a)    Sentence Detection :  In this step, the individual sentences are detected in the input text. For example, the text

Natural Processing Language is a good Technique. It has various parts.

  contains two different sentences. This step detects those two individual sentences in the text and passes each one on to the next step.

Note : a sentence is defined as the longest white-space-trimmed character sequence between two punctuation marks. The first and last sentences are exceptions to this rule: the first non-whitespace character is assumed to be the beginning of a sentence, and the last non-whitespace character is assumed to be the end of a sentence.
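To make the idea concrete, here is a minimal sketch of sentence detection in plain Python. It is not how Apache OpenNLP or Stanford CoreNLP do it (they rely on trained models); it simply assumes that every '.', '!' or '?' followed by white space ends a sentence. The function name detect_sentences is made up for this illustration.

```python
import re

def detect_sentences(text):
    """Split text into sentences after '.', '!' or '?' followed by white space."""
    # Assumption for this sketch: every '.', '!' or '?' ends a sentence.
    # Real detectors use a trained model so that abbreviations such as
    # "Dr." or "e.g." do not cause a wrong split.
    pieces = re.split(r"(?<=[.!?])\s+", text.strip())
    return [piece for piece in pieces if piece]

text = "Natural Processing Language is a good Technique. It has various parts."
for sentence in detect_sentences(text):
    print(sentence)
```

Each detected sentence is then handed to the next step on its own.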

       b)   Tokenizer :  In this stage of Natural Language Processing, the individual tokens are generated from the sentence. In most cases the default delimiter for tokens is white space. So if we consider the sentence
        Natural Processing Language is a good Technique
the following tokens are produced :
 Natural, Processing, Language, is, a, good, Technique
Each of these is a separate token.
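A white-space tokenizer can be sketched in a couple of lines of Python. This is only an illustration of the default-delimiter idea; the function name tokenize is our own, and real tokenizers also separate punctuation from words.

```python
def tokenize(sentence):
    """Split a sentence into tokens using white space as the delimiter."""
    # str.split() with no argument splits on any run of white space
    # and drops leading and trailing white space.
    return sentence.split()

tokens = tokenize("Natural Processing Language is a good Technique")
print(tokens)
```

Note that with this simple approach a trailing full stop stays attached to the last word, which is one reason the real tools use more careful tokenizers.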

      c)   POS Tagging :  This is one of the most important parts of Natural Language Processing. It assigns a Part Of Speech (POS) tag to each of the tokens generated above.
For example :
Natural-JJ  Processing-NN  Language-NN  is-VBZ  a-DT  good-JJ  Technique-NN

These tags (JJ, NN etc.) are known as POS tags and come from the Penn Treebank tag set. The complete list and their meanings can be found at the following link :

POS Penn Treebank
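As a rough illustration of what a tagger returns, the sketch below looks each token up in a tiny hand-written dictionary that covers only the example sentence. The names TOY_LEXICON and tag are hypothetical, and falling back to NN for unknown words is an assumption made for this sketch; real taggers predict the tag from a trained model and the surrounding context.

```python
# Hand-written lexicon for the example sentence only (an assumption of
# this sketch; a real tagger learns this from training data).
TOY_LEXICON = {
    "natural": "JJ",
    "processing": "NN",
    "language": "NN",
    "is": "VBZ",
    "a": "DT",
    "good": "JJ",
    "technique": "NN",
}

def tag(tokens, default="NN"):
    """Pair every token with a Penn Treebank tag from the toy lexicon."""
    return [(token, TOY_LEXICON.get(token.lower(), default)) for token in tokens]

tagged = tag("Natural Processing Language is a good Technique".split())
print("  ".join(f"{word}-{pos}" for word, pos in tagged))
```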

      d)  Chunking :  This step is not found in every NLP tool. For example, chunking is available in Apache OpenNLP but not in Stanford CoreNLP.

Chunking takes the output of POS tagging and groups related tokens together.
Consider an example : in the sentence above, Processing and Language are both nouns, so the chunker combines them and produces Processing Language as a single noun group.

Text chunking consists of dividing a text into syntactically correlated groups of words, such as noun groups and verb groups, but it does not specify their internal structure or their role in the main sentence.
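The noun-combining example can be sketched as follows. This is a deliberately simplified stand-in for a chunker: it only merges runs of consecutive noun tags, whereas a real chunker such as the one in Apache OpenNLP is model-based and builds full noun and verb groups. The function name chunk_nouns is our own.

```python
from itertools import groupby

def chunk_nouns(tagged):
    """Merge runs of consecutive noun tokens into a single noun group."""
    chunks = []
    # groupby collects neighbouring tokens that agree on "is this a noun?"
    for is_noun, group in groupby(tagged, key=lambda pair: pair[1].startswith("NN")):
        group = list(group)
        if is_noun:
            chunks.append((" ".join(word for word, _ in group), "NN"))
        else:
            chunks.extend(group)
    return chunks

tagged_sentence = [
    ("Natural", "JJ"), ("Processing", "NN"), ("Language", "NN"),
    ("is", "VBZ"), ("a", "DT"), ("good", "JJ"), ("Technique", "NN"),
]
print(chunk_nouns(tagged_sentence))
```

Here Processing and Language come out as the single entry Processing Language, while the other tokens are left untouched.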

      e)    Parsing (Parse Tree) :  In this step the parse tree is generated for the sentence.

The tree can then be traversed to extract the necessary information.
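To show what "using" a parse tree might look like, the sketch below reads a bracketed tree and pulls out its noun phrases. The tree string is hand-written for this post using the POS tags from the earlier example and Penn Treebank style labels (S, NP, VP); it is not the actual output of either tool, and the helper names read_tree, leaves and phrases are made up.

```python
def read_tree(text):
    """Read a bracketed tree such as "(NP (DT a) (NN tree))" into nested lists."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def node():
        nonlocal pos
        pos += 1                      # skip the opening "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())        # a nested phrase
            else:
                children.append(tokens[pos])   # a leaf word
                pos += 1
        pos += 1                      # skip the closing ")"
        return [label, children]

    return node()

def leaves(tree):
    """Return the words under a node, left to right."""
    words = []
    for child in tree[1]:
        words.extend(leaves(child) if isinstance(child, list) else [child])
    return words

def phrases(tree, label):
    """Collect the text of every node that carries the given label."""
    found = [" ".join(leaves(tree))] if tree[0] == label else []
    for child in tree[1]:
        if isinstance(child, list):
            found.extend(phrases(child, label))
    return found

# Hand-written tree for the example sentence (an assumption of this sketch).
TREE = ("(S (NP (JJ Natural) (NN Processing) (NN Language)) "
        "(VP (VBZ is) (NP (DT a) (JJ good) (NN Technique))))")
tree = read_tree(TREE)
print(phrases(tree, "NP"))
```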

       f)  Dependency Parsing :  At the time of writing this is only available in Stanford CoreNLP. It identifies the dependencies between the words in a sentence. Each dependency holds between a governor and a dependent.

In the example above, a dependency exists between the two words Natural and Processing. The relationship between them is amod (adjectival modifier) : the adjective Natural modifies the noun Processing.

Stanford dependencies (SD) are triplets : name of the relation, governor and dependent.
In this case the triplet would be amod, Processing, Natural, because the modified noun Processing is the governor and the adjective Natural is the dependent.
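A triplet like this maps naturally onto a small record type. The sketch below is only a way of holding and querying such triplets in Python, with the noun Processing taken as the governor and the adjective Natural as the dependent; the names Dependency and dependents_of are hypothetical and are not part of Stanford CoreNLP.

```python
from collections import namedtuple

# One Stanford-style dependency: relation name, governor, dependent.
Dependency = namedtuple("Dependency", ["relation", "governor", "dependent"])

# The single dependency discussed in the text, written by hand.
dependencies = [Dependency("amod", "Processing", "Natural")]

def dependents_of(word, deps):
    """List (relation, dependent) pairs for everything governed by word."""
    return [(d.relation, d.dependent) for d in deps if d.governor == word]

for d in dependencies:
    print(f"{d.relation}({d.governor}, {d.dependent})")
print(dependents_of("Processing", dependencies))
```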

      g)      Named Entity Recognition :  The Name Finder can detect named entities and numbers in text.
For example, in the sentence : Harvard University is good, the name Harvard University would be detected as a named entity.
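The simplest possible way to picture a name finder is a lookup against a list of known names. The sketch below does exactly that for the example sentence, using the spelling Harvard. The one-entry GAZETTEER, the ORGANIZATION label and the function find_names are all assumptions for illustration; the real tools use trained models rather than a fixed list.

```python
# Tiny hand-made list of known names (an assumption of this sketch).
GAZETTEER = {("Harvard", "University"): "ORGANIZATION"}

def find_names(tokens):
    """Return (name, label) pairs for gazetteer entries found in the tokens."""
    found = []
    i = 0
    while i < len(tokens):
        match = None
        for name, label in GAZETTEER.items():
            # Compare the next len(name) tokens with the known name.
            if tuple(tokens[i:i + len(name)]) == name:
                match = (" ".join(name), label)
                i += len(name)
                break
        if match:
            found.append(match)
        else:
            i += 1
    return found

print(find_names("Harvard University is good".split()))
```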

These are the most common steps and are found in most Natural Language Processing tools. They work as follows : the tool authors prepare training data and generate models from that training data. The code then uses these models as its knowledge.
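The relationship between training data, model and code can be sketched with a very small example. Here the "model" is nothing more than each word's most frequent tag in a hand-made training set; this is a simple baseline chosen for illustration and not the learning method used by Apache OpenNLP or Stanford CoreNLP. The names train, predict and TRAINING_DATA are hypothetical.

```python
from collections import Counter, defaultdict

def train(tagged_sentences):
    """Build a model that remembers the most frequent tag of every word."""
    counts = defaultdict(Counter)
    for sentence in tagged_sentences:
        for word, pos in sentence:
            counts[word.lower()][pos] += 1
    return {word: tags.most_common(1)[0][0] for word, tags in counts.items()}

def predict(model, tokens, default="NN"):
    """Tag tokens with the model; unknown words get the default tag."""
    return [(token, model.get(token.lower(), default)) for token in tokens]

# Hand-made training data (an assumption of this sketch).
TRAINING_DATA = [
    [("Natural", "JJ"), ("Processing", "NN"), ("Language", "NN"),
     ("is", "VBZ"), ("a", "DT"), ("good", "JJ"), ("Technique", "NN")],
    [("a", "DT"), ("good", "JJ"), ("Language", "NN")],
]

model = train(TRAINING_DATA)
print(predict(model, ["a", "natural", "technique"]))
```

Because the model is derived entirely from the training data, changing that data changes every later prediction.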

Apache OpenNLP models can be downloaded from the link below :

Stanford provides its models in the form of a jar, which can be downloaded from

Remember that you can also train these models yourself, but doing so changes what the system has learned and can affect all of its results. So be careful before doing that.

If you need more detail on that, please visit :

Here is the complete video, which describes Natural Language Processing.

Video Ref : Prof. Sudeshna Sarkar and Prof. Anupam Basu, Department of Computer Science and Engineering, I.I.T. Kharagpur (embedded YouTube video)





