This article is brought to you by the Eden AI team. We allow you to test and use in production a large number of AI engines from different providers directly through our API and platform. If you are a NER engine provider and want to integrate Eden AI, contact us at: contact@edenai.co
In this article, we are going to see how you can easily integrate a Named Entity Recognition (NER) engine into your project, and how to choose and access the right engine for your data.
In Natural Language Processing (NLP), Named Entity Recognition (NER) is a process in which a sentence or a chunk of text is parsed to find entities that fall into categories such as names, organizations, locations, quantities, monetary values, percentages, etc. With named entity recognition, you can extract key information to understand what a text is about, or simply use it to collect important information to store in a database.
The term “Named Entity (NE)” was born in the Message Understanding Conferences (MUC), which influenced IE research in the U.S. in the 1990s. At that time, MUC focused on Information Extraction (IE) tasks in which structured information about company activities and defense-related activities was extracted from unstructured text, such as newspaper articles. Outside the U.S., there were also several evaluation-based projects for NE. At that time, the number of categories was limited to 7 to 10, and NE taggers (automatic annotation systems for named entities in unstructured text) were based on hand-made dictionaries and rules or on some supervised learning technique. The more recent and currently dominant approach is supervised learning, with techniques such as Decision Trees and Support Vector Machines.
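To make the idea of a dictionary-and-rule tagger concrete, here is a minimal sketch in Python. The gazetteer entries, the regular expressions and the `tag_entities` function are all hypothetical and only illustrate the general approach; they are not taken from any MUC system.

```python
import re

# Hypothetical hand-made dictionary (gazetteer): surface form -> category.
GAZETTEER = {
    "Acme Corp": "ORGANIZATION",
    "Paris": "LOCATION",
    "Jane Doe": "PERSON",
}

# Hypothetical hand-written rules for entities that follow a pattern.
RULES = [
    ("MONEY", re.compile(r"\$\d+(?:,\d{3})*(?:\.\d+)?(?: (?:million|billion))?")),
    ("PERCENT", re.compile(r"\d+(?:\.\d+)?%")),
]


def tag_entities(text):
    """Return (start, end, category, surface) tuples sorted by position."""
    found = []
    # Dictionary lookup: every literal occurrence of a known name.
    for surface, category in GAZETTEER.items():
        for match in re.finditer(re.escape(surface), text):
            found.append((match.start(), match.end(), category, match.group()))
    # Rule matching: every span that fits a hand-written pattern.
    for category, pattern in RULES:
        for match in pattern.finditer(text):
            found.append((match.start(), match.end(), category, match.group()))
    return sorted(found)


if __name__ == "__main__":
    sentence = "Jane Doe said Acme Corp raised $5 million in Paris, up 12% from last year."
    for start, end, category, surface in tag_entities(sentence):
        print(f"{category:<12} {surface!r} [{start}:{end}]")
```

A tagger like this only finds what its dictionary and rules anticipate, which is one reason learned models took over.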
The Text Analytics API is a cloud-based service that provides advanced natural language processing over raw text, and includes four main functions: sentiment analysis, key phrase extraction, named entity recognition, and language detection.
Available on Eden AI
ParallelDots provides Komprehend AI APIs that are a comprehensive set of document classification and NLP APIs for software developers. Their NLP models are trained on more than a billion documents and provide state-of-the-art accuracy on most common NLP use-cases such as named entity recognition, sentiment analysis and emotion detection.
The Cloud Natural Language API provides natural language understanding technologies to developers, including sentiment analysis, entity analysis, entity sentiment analysis, content classification, and syntax analysis. This API is part of the larger Cloud Machine Learning API family. Each API call also detects and returns the language, if a language is not specified by the caller in the initial request.
Available on Eden AI
TextRazor offers a complete cloud or self-hosted text analysis infrastructure. They combine state-of-the-art natural language processing techniques with a comprehensive knowledgebase of real-life facts to help rapidly extract the value from your documents, tweets or web pages. They provide features such as entity extraction, disambiguation and linking, keyphrase extraction, automatic topic tagging and classification.
Amazon Comprehend uses natural language processing (NLP) to extract insights about the content of documents. Amazon Comprehend processes any text file in UTF-8 format, and semi-structured documents, like PDF and Word documents. It develops insights by recognizing the entities, key phrases, language, sentiments, and other common elements in a document.
Available on Eden AI
Dandelion API is a set of semantic APIs to extract meaning and insights from texts in several languages (Italian, English, French, German and Portuguese). It’s optimized to perform text mining and text analytics for short texts, such as tweets and other social media. Dandelion API extracts entities (such as persons, places and events), categorizes and classifies documents in user-defined categories, augments the text with tags and links to external knowledge graphs and more.
IBM Natural Language Understanding is a collection of APIs that offer text analysis through natural language processing. This set of APIs can analyze text to help you understand its concepts, entities, keywords, sentiment, and more. Additionally, you can create a custom model for some APIs to get specific results that are tailored to your domain.
Available on Eden AI
MonkeyLearn is a Text Analysis platform with Machine Learning to automate business workflows and save hours of manual data processing. They provide pre-built NLP APIs adapted to use cases such as entity extraction, sentiment analysis, text classification, etc. With MonkeyLearn you can also train custom machine learning models to get topic, sentiment, intent, keywords and more.
Repustate is a leading provider of text analytics services for enterprise companies. It offers sentiment analysis and semantic analysis in multiple languages using machine learning. You can extract named entities (athletes, politicians, entertainers, countries, cities, landmarks, you name it) from any piece of text and analyze text for opinions and sentiment in English, French, Spanish, German, Italian, Arabic or Chinese, all delivered over a very simple-to-use API.
Allganize provides a Natural Language Understanding API and conversational AI for enterprises. It also helps businesses automate workflows using natural language understanding. It provides insight into what your teammates are working on, as well as overarching work patterns and trends in your team. Allganize leverages deep learning-based, high-performance NLU technology to enable companies of all sizes to apply AI technology to develop their own AI systems and services. The company provides real-time customer and project-related information.
spaCy is an open-source software library for advanced natural language processing, written in the programming languages Python and Cython. spaCy comes with pretrained pipelines and currently supports tokenization and training for 60+ languages. It features state-of-the-art speed and neural network models for tagging, parsing, named entity recognition, text classification and more, multi-task learning with pre-trained transformers like BERT, as well as a production-ready training system and easy model packaging, deployment and workflow management. spaCy is commercial open-source software, released under the MIT license.
You can use Named Entity Recognition in numerous fields. Here are some examples of common use cases:
When you need a NER engine, you have 2 options:
The only way to select the right provider is to benchmark different providers’ engines on your own data and either choose the best one or combine the results of several engines. You can also compare prices, if cost is one of your priorities, and speed.
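As a rough illustration of such a benchmark, the sketch below scores each engine's output against a small hand-annotated sample using precision, recall and F1 on exact (text, category) matches, then combines engines with a simple vote. The engine names, the sample data and the functions `score`, `rank_engines` and `combine_by_vote` are hypothetical; a real benchmark would use your own annotated documents and each provider's actual output.

```python
from collections import Counter


def score(predicted, gold):
    """Precision, recall and F1 using exact (text, category) matches."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def rank_engines(outputs, gold):
    """Sort engine names by F1 on the annotated sample, best first."""
    return sorted(outputs, key=lambda name: score(outputs[name], gold)[2], reverse=True)


def combine_by_vote(outputs, min_votes=2):
    """Keep entities returned by at least `min_votes` engines."""
    votes = Counter(entity for entities in outputs.values() for entity in set(entities))
    return {entity for entity, count in votes.items() if count >= min_votes}


# Hypothetical gold annotations and engine outputs for one document.
gold = {("Jane Doe", "PERSON"), ("Acme Corp", "ORGANIZATION"), ("Paris", "LOCATION")}
outputs = {
    "engine_a": {("Jane Doe", "PERSON"), ("Paris", "LOCATION")},
    "engine_b": {
        ("Jane Doe", "PERSON"),
        ("Acme Corp", "ORGANIZATION"),
        ("Paris", "LOCATION"),
        ("Doe", "PERSON"),
    },
}

if __name__ == "__main__":
    for name in rank_engines(outputs, gold):
        precision, recall, f1 = score(outputs[name], gold)
        print(f"{name}: precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
    print("voted:", sorted(combine_by_vote(outputs)))
```

Exact matching is the strictest choice. Depending on your project, you may prefer to count partial overlaps or to ignore category mismatches.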
Benchmarking providers yourself is the best method in terms of performance and optimization, but it has many drawbacks:
This is where Eden AI becomes very useful. You just have to create an Eden AI account, and you get access to many providers’ engines for many technologies, including NER. The platform lets you benchmark and visualize results from different engines, and it centralizes the costs of using different providers.
Eden AI provides the same easy-to-use API with the same documentation for every technology. You can use the Eden AI API to call NER engines with the provider as a simple parameter. With only a few lines, you can set up your project in production:
Here is the code in Python (GitHub repo) that lets you test Eden AI for Named Entity Recognition:
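A minimal sketch of such a call, using only the Python standard library, might look like the following. The endpoint URL, the payload field names (`providers`, `language`, `text`), the bearer-token header and the provider identifiers are assumptions that should be checked against the current Eden AI documentation; `build_ner_request` and `call_ner` are hypothetical helper names.

```python
import json
import os
import urllib.request

# Assumed endpoint; verify it against the current Eden AI API reference.
EDEN_AI_NER_URL = "https://api.edenai.run/v2/text/named_entity_recognition"


def build_ner_request(text, providers, api_key, language="en"):
    """Build (but do not send) a POST request for a NER call."""
    payload = {
        "providers": ",".join(providers),  # assumed: comma-separated provider names
        "language": language,
        "text": text,
    }
    return urllib.request.Request(
        EDEN_AI_NER_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


def call_ner(text, providers, api_key, language="en"):
    """Send the request and return the decoded JSON response."""
    request = build_ner_request(text, providers, api_key, language)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


if __name__ == "__main__":
    # Only performs a network call if a key is present in the environment.
    key = os.environ.get("EDENAI_API_KEY")
    if key:
        # "google" and "amazon" are assumed provider identifiers.
        result = call_ner("Jane Doe works at Acme Corp in Paris.", ["google", "amazon"], key)
        print(json.dumps(result, indent=2))
```

Keeping the request construction separate from the network call makes it easy to switch providers by changing a single argument.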
Answer:
Platform:
Eden AI also lets you compare these engines directly in the web interface, without writing any code.
There are numerous NER engines available on the market: it’s impossible to know all of them, let alone which ones perform well. The best way to integrate NER technology is a multi-cloud approach, which helps you reach the best performance and prices for your data and project. This approach may seem complex, but we simplify it for you with Eden AI, which centralizes the best providers’ APIs.
You can directly start building now. If you have any questions, don't hesitate to schedule a call with us!