The Basics of Natural Language Processing

When we dream of hyper-advanced machines, from “2001: A Space Odyssey” to “The Jetsons,” we dream of them talking to us.

WALL-E’s titular robot understood human speech, and his strained efforts to speak made him instantly endearing to audiences.

In the cultural imagination, intelligence — or even humanity — has always been intertwined with speech.

Chimpanzees may be able to use tools, dolphins might recognize themselves in a mirror, and elephants might even hold funerals for fallen herd members, but people find it hard to consider them intellectual peers, simply due to their lack of human speech.

However, speech isn’t so much a hallmark of intelligence as a very specific kind of intelligence — one we don’t even fully understand ourselves. Replicating it in machines, then, is no easy task.

But in recent years, the field of natural language processing has advanced by leaps and bounds. The day has arrived when our machines, or at least Alexa, can tell us, “I’m sorry, Dave. I’m afraid I can’t do that.” (Thankfully, she does so without ulterior motives.)

Where humans lack collaborative partners, we are creating them — not just with artificial intelligence but specifically with natural language processing. These talking computer programs are already becoming a major part of our lives and taking large burdens off our businesses.

What Is Natural Language Processing, and How Does It Work?

Natural language processing, or NLP, is the field of research and technology dedicated to teaching machines to use and understand language in a human-like fashion.

Its results can be seen in everyday applications such as Google Translate and Siri. But to genuinely understand this NLP revolution, it’s necessary to first be familiar with the technologies that enable NLP and the applications it’s used for.

Machine learning has played a critical role in the recent flourishing of NLP. Rather than following only the rules coded into them, machine learning programs infer patterns from the data they are given.

This is why machine learning matters for NLP. Language is bigger than can conceivably be coded into a program.

We’re not even consciously aware of many of the rules governing language, which are continually under debate in the scientific community.

How do you teach a computer something you still don’t fully understand?

The answer is that you don’t. Instead, you feed text into a machine learning program and let it discern the rules for itself, often using probabilistic models to figure out usage in a more fleshed-out, natural way.

This also makes improving the model easier. Instead of trying to work out and write rules of increasing complexity, you feed the model more text and let it refine its estimates, much as a person picks up a language through exposure.
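To make the idea concrete, here is a minimal sketch of one of the simplest probabilistic language models, a bigram model, which estimates how likely each word is to follow another purely by counting examples. The training sentences and function names are invented for illustration; real systems train far richer models on vastly more text.

```python
from collections import Counter, defaultdict


def train_bigram_model(sentences):
    """Count how often each word follows another in the training text."""
    follow_counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for current_word, next_word in zip(words, words[1:]):
            follow_counts[current_word][next_word] += 1
    return follow_counts


def next_word_probabilities(model, word):
    """Turn raw counts into probabilities for the word that follows `word`."""
    counts = model.get(word.lower(), Counter())
    total = sum(counts.values())
    return {candidate: count / total for candidate, count in counts.items()}


# Hypothetical training text, invented for this illustration.
corpus = [
    "the cat sat on the mat",
    "the cat chased the mouse",
    "the dog sat on the rug",
]

model = train_bigram_model(corpus)
probabilities = next_word_probabilities(model, "the")

# Print the most likely followers of "the" first.
for candidate, probability in sorted(probabilities.items(), key=lambda item: -item[1]):
    print(f"the -> {candidate}: {probability:.2f}")
```

No grammar rule was written anywhere in this sketch. Adding more sentences to `corpus` is all it takes to change what the model expects, which is the sense in which more text improves the model.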

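Recent machine learning techniques have taken natural language processing even further. In particular, word embedding uses samples of natural language to represent a word, phrase, or sentence as a list of numbers that encodes the contexts in which it appears, so that items used in similar contexts end up with similar representations.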

This could be considered the first step in the cognitive understanding of natural language by machines. Like humans, NLP programs are better able to understand the meaning of language when they have access to context.
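A rough way to see how context can be turned into numbers is to describe each word by the words that appear near it and then compare those descriptions. The sketch below uses simple neighbor counts as a stand-in for the dense vectors that embedding models learn; the sentences, window size, and function names are assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def cooccurrence_vectors(sentences, window=2):
    """Represent each word by counts of the words that appear near it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for i, word in enumerate(words):
            start = max(0, i - window)
            neighbors = words[start:i] + words[i + 1:i + 1 + window]
            vectors[word].update(neighbors)
    return vectors


def cosine_similarity(a, b):
    """Return 1.0 for identical directions and 0.0 for no shared context."""
    dot = sum(a[key] * b[key] for key in a if key in b)
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical sentences, invented for this illustration.
corpus = [
    "the cat drinks milk",
    "the dog drinks water",
    "the cat chases mice",
    "the dog chases cats",
    "the car needs fuel",
]

vectors = cooccurrence_vectors(corpus)
cat_dog = cosine_similarity(vectors["cat"], vectors["dog"])
cat_car = cosine_similarity(vectors["cat"], vectors["car"])
print(f"cat vs dog: {cat_dog:.2f}")
print(f"cat vs car: {cat_car:.2f}")
```

Because "cat" and "dog" appear alongside the same verbs, their vectors come out closer to each other than either is to "car," even though the program was never told that cats and dogs are both animals.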

Why Care About NLP?

The importance of NLP to consumer products such as Google Translate and Alexa is obvious. But there are many less visible yet equally dramatic ways in which NLP is helping businesses.

Big data is a popular buzzword right now, but in reality many businesses are still unsure what to do with their data, particularly since it doesn’t always come in the neatest or most digestible format.

According to Oracle, only 20% of all generated data is structured data that’s formatted to be easily understood by machines.

The rest is locked away in books, journals, notes, audio, video, images, analog data, and other formats. It’s not that humans can’t comb through that data, but the sheer amount of time it would take is staggering, not to mention a waste of resources.

NLP platforms can sort through document archives and stream newly created documents in real time, automating the creation of clear, dynamic organizational systems that update as needed. By classifying documents in this way, NLP makes it easier to search and use information from any database.
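As a rough illustration of automatic document classification, the sketch below trains a small naive Bayes classifier, one common technique among many, on a handful of labeled examples and then files new documents on its own. The categories, example documents, and function names are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(labeled_documents):
    """Collect per-category word counts from (text, label) pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocabulary = set()
    for text, label in labeled_documents:
        words = text.lower().split()
        word_counts[label].update(words)
        label_counts[label] += 1
        vocabulary.update(words)
    return word_counts, label_counts, vocabulary


def classify(text, model):
    """Return the category whose word statistics best match the text."""
    word_counts, label_counts, vocabulary = model
    total_documents = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        # Start from how common the category is overall.
        score = math.log(label_counts[label] / total_documents)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocabulary:
                continue  # Words never seen in training carry no evidence.
            # Add-one smoothing keeps unseen word/category pairs above zero.
            likelihood = (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical archive with two categories, invented for this illustration.
training_documents = [
    ("invoice payment due for march", "finance"),
    ("quarterly budget and payment report", "finance"),
    ("turbine inspection found loose bolts", "safety"),
    ("employee injured during turbine maintenance", "safety"),
]

model = train_naive_bayes(training_documents)
print(classify("payment overdue on invoice", model))
print(classify("loose panel near turbine", model))
```

A newly created document can be passed to `classify` the moment it arrives, which is how a system of this kind could keep an archive organized as it grows.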

NLP can also be applied via entity extraction, a technique that identifies the actors in a given text along with their roles and relations to one another. This method can be used to automate manual tasks like filtering and searching, looking at trends, and doing data analytics.

Instead of raw text, you have features that can be put into tabular form. In this way, NLP eliminates the need for human effort at a number of stages, streamlining operations and freeing up human resources.
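To show what moving from raw text to tabular features can look like, here is a deliberately simple sketch that finds known names in a sentence and emits one row per mention. It relies on a hand-written lookup table, which is an assumption made for illustration; real entity extraction systems typically use trained models and can also recover roles and relations, which this sketch does not attempt.

```python
import re

# Hypothetical lookup table of names and their types, invented for this example.
KNOWN_ENTITIES = {
    "Acme Corp": "ORGANIZATION",
    "Globex": "ORGANIZATION",
    "Jane Smith": "PERSON",
    "Berlin": "LOCATION",
}


def extract_entities(text):
    """Return one row per entity mention, ordered by position in the text."""
    rows = []
    for name, entity_type in KNOWN_ENTITIES.items():
        pattern = r"\b" + re.escape(name) + r"\b"
        for match in re.finditer(pattern, text):
            rows.append({"entity": name, "type": entity_type, "start": match.start()})
    return sorted(rows, key=lambda row: row["start"])


text = "Jane Smith of Acme Corp met Globex executives in Berlin."
rows = extract_entities(text)

# Print the rows as a small table.
print(f"{'entity':<12}{'type':<14}{'start'}")
for row in rows:
    print(f"{row['entity']:<12}{row['type']:<14}{row['start']}")
```

Once mentions are in rows like these, ordinary tools for filtering, counting, and charting can take over.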

A Library of Information

Another application of NLP is cognitive information retrieval. NLP software can retrieve complex information from free text in databases and manuals and then use that information to discover unknown correlations between events.

This capability is of particular interest to industrial enterprises looking to minimize safety issues. NLP allows operators to search through safety logs in powerful ways.

For instance, the query “Find incidents involving debris falling on lone employees” could turn up a log that reads “Was hit by a piece of plastic that had come loose while working solo on a turbine,” despite the latter statement not containing any of the keywords “debris,” “falling,” or “lone.”

In this way, cognitive information retrieval helps reveal hidden insights in safety records, alerting industrial operators to avoidable events and to near misses that are likely to recur.
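The safety-log example can be approximated with a toy search that compares concepts rather than exact keywords. In the sketch below, a hand-written table maps words to shared concepts; that table is a hypothetical stand-in for the word relationships a trained NLP model would learn from data, and the extra log entries are invented for illustration.

```python
import re

# Hypothetical concept table standing in for relationships a model would learn.
CONCEPTS = {
    "debris": "loose_object",
    "plastic": "loose_object",
    "loose": "loose_object",
    "falling": "impact",
    "hit": "impact",
    "struck": "impact",
    "lone": "alone",
    "solo": "alone",
    "alone": "alone",
}


def concepts_in(text):
    """Map the words of a text to the set of concepts they express."""
    words = re.findall(r"[a-z]+", text.lower())
    return {CONCEPTS[word] for word in words if word in CONCEPTS}


def search(query, logs):
    """Return logs sharing at least one concept with the query, best match first."""
    query_concepts = concepts_in(query)
    scored = []
    for log in logs:
        overlap = len(query_concepts & concepts_in(log))
        if overlap:
            scored.append((overlap, log))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [log for _, log in scored]


logs = [
    "Was hit by a piece of plastic that had come loose while working solo on a turbine",
    "Routine inspection completed with no issues",
    "Slipped on a wet floor while working with a partner",
]

results = search("Find incidents involving debris falling on lone employees", logs)
for log in results:
    print(log)
```

The turbine log is returned even though it shares none of the query's keywords, because both texts map to the same three concepts.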

Many businesses are interested in NLP for yet another reason: report generation. A 2016 Accenture survey found that managers spend an average of 54% of their time on administrative tasks, including the ever-present business report.

As machines learn to wield language the way a human does, we can expect to see these reports being written by AI instead. This means saving time for the kind of creative and critical thinking only humans can do.

Language is a sprawling, complex cognitive system that our best minds have long struggled to understand and teach.

With machine learning, those decades of work on NLP are finally bearing fruit for individuals and organizations. The better we can communicate with our machines, the better they can help us with our routine tasks.

And eventually — sooner rather than later — the fully conversational mechanical companions of our dreams will be a reality.

After all, if there’s one thing humankind will always want, it’s someone to talk to.
