This article is brought to you by the Eden AI team. We allow you to test and use in production a large number of AI engines from different providers directly through our API and platform. If you are a solution provider and want to integrate Eden AI, contact us at: firstname.lastname@example.org
In this article, we show how to integrate a Keyword Extraction engine into your project, and how to choose and access the right engine for your data.
Keyword extraction (also known as keyword detection or keyword analysis) is a text analysis technique that automatically extracts the most frequent and most important words and expressions from a text. It helps summarize the content of texts and recognize the main topics discussed. Keyword extraction relies on machine learning and natural language processing (NLP) to break down human language so that it can be understood and analyzed by machines.
In 1999, Turney hypothesized that keywords make reading easier by letting users jump from one key point to another when the keywords are highlighted in a text. Other researchers have exploited their summarizing power in automatic summarization methods, and keyword extraction has become increasingly useful with the rise of the Internet.
In the 2010s, many researchers took an interest in automatic keyword extraction, and evaluation campaigns such as DEFT and SemEval proposed keyword extraction tasks in order to compare existing systems. For this purpose, the data and the evaluation method were the same for all systems. Supervised and unsupervised methods emerged, and are nowadays combined to train keyword extraction engines.
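To make the idea concrete, here is a minimal sketch of the simplest unsupervised approach: counting word frequencies after removing stop words. It is only an illustration and not how the commercial engines discussed in this article work; the function name `extract_keywords` and the tiny stop-word list are assumptions made for the example.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption for this sketch;
# real systems use much larger, language-specific lists).
STOP_WORDS = {
    "a", "an", "and", "are", "as", "by", "for", "from", "in",
    "is", "it", "of", "on", "or", "that", "the", "to", "with",
}

def extract_keywords(text, top_n=5):
    """Return the top_n most frequent words that are not stop words."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(
        w for w in words if w not in STOP_WORDS and len(w) > 2
    )
    return [word for word, _ in counts.most_common(top_n)]

sample = (
    "Keyword extraction is a text analysis technique. "
    "Keyword extraction helps summarize a text and find "
    "the main topics of the text."
)
keywords = extract_keywords(sample, top_n=3)
print(keywords)
```

Frequency alone only captures the "most used" words. Real engines also weigh importance, for example through multi-word phrases, position in the document, or trained models.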
The Text Analytics API is a cloud-based service that provides advanced natural language processing over raw text, and includes four main functions: sentiment analysis, key phrase extraction, named entity recognition, and language detection.
Available on Eden AI
ParallelDots provides Komprehend AI APIs that are a comprehensive set of document classification and NLP APIs for software developers. Their NLP models are trained on more than a billion documents and provide state-of-the-art accuracy on most common NLP use-cases such as named entity recognition, sentiment analysis and emotion detection.
Yonder Labs is a data science company with special expertise in Natural Language Processing, Machine Learning, and Multimedia Analysis. Yonder is currently releasing new APIs for extracting semantic information both from single text documents (sentiment analysis, entity extraction, semantic tagging, etc.) and from collections of texts, enabling services such as text comparison, clustering, and data mining.
TextRazor offers a complete cloud or self-hosted text analysis infrastructure. They combine state-of-the-art natural language processing techniques with a comprehensive knowledgebase of real-life facts to help rapidly extract the value from your documents, tweets or web pages. They provide features such as entity extraction, disambiguation and linking, keyphrase extraction, automatic topic tagging and classification.
Amazon Comprehend uses natural language processing (NLP) to extract insights about the content of documents. Amazon Comprehend processes any text file in UTF-8 format, and semi-structured documents, like PDF and Word documents. It develops insights by recognizing the entities, key phrases, language, sentiments, and other common elements in a document.
Available on Eden AI
Cortical.io provides natural language understanding (NLU) solutions that enable large enterprises to automate the extraction, monitoring, and analysis of key information from any kind of text data. Cortical.io offers AI-based natural language understanding solutions built on technology inspired by Neuroscience.
IBM Natural Language Understanding is a collection of APIs that offer text analysis through natural language processing. This set of APIs can analyze text to help you understand its concepts, entities, keywords, sentiment, and more. Additionally, you can create a custom model for some APIs to get specific results that are tailored to your domain.
Available on Eden AI
MonkeyLearn is a Text Analysis platform with Machine Learning to automate business workflows and save hours of manual data processing. They provide pre-built NLP APIs adapted to use cases such as entity extraction, sentiment analysis, text classification, etc. With MonkeyLearn you can also train custom machine learning models to get topic, sentiment, intent, keywords and more.
As a natural language processing company, Twinword offers a unique experience for your keyword research process. Combining data science and SEO solutions, Twinword Ideas strives to deliver quality multi-language keyword results for any business on the web. The Twinword Topic Tagging API does more than just extract keywords from a given text: it also generates human-like topics, even when the topic word itself does not appear in the text.
Cloudmersive brings its customers a complete portfolio of APIs across document conversion and processing, deep learning OCR, image recognition, NLP, etc. Cloudmersive's powerful natural language processing APIs perform analytics over unstructured text to identify sentiment, key words and phrases, and much more.
spaCy is an open-source software library for advanced natural language processing, written in the programming languages Python and Cython. spaCy comes with pretrained pipelines and currently supports tokenization and training for 60+ languages. It features state-of-the-art speed and neural network models for tagging, parsing, named entity recognition, text classification and more, multi-task learning with pre-trained transformers like BERT, as well as a production-ready training system and easy model packaging, deployment and workflow management. spaCy is commercial open-source software, released under the MIT license.
You can use keyword extraction in numerous fields. Here are some examples of common use cases:
When you need a Keyword Extraction engine, you have 2 options:
The only way to select the right provider is to benchmark different providers' engines on your own data and either choose the best one or combine the results of several engines. You can also compare prices and speed if these are among your priorities.
This method is the best in terms of performance and optimization, but it has many drawbacks:
This is where Eden AI becomes very useful. You just have to create an Eden AI account to get access to many providers' engines for many technologies, including Keyword Extraction. The platform lets you benchmark and visualize results from different engines, and centralizes the costs of using different providers.
Eden AI provides the same easy-to-use API with the same documentation for every technology. You can use the Eden AI API to call Keyword Extraction engines with the provider as a simple parameter. With only a few lines of code, you can set up your project in production.
Here is the code in Python (GitHub repo) that lets you test Eden AI for keyword extraction:
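As a rough illustration of such a benchmark, the sketch below scores each provider's keywords against a hand-labeled reference using F1, then shows one possible way to combine providers by simple voting. The provider names, their outputs, and the helper functions are all hypothetical. The scoring choice (exact-match F1) is an assumption, not a method prescribed by any provider.

```python
from collections import Counter

def f1_score(predicted, expected):
    """Exact-match F1 between predicted and expected keyword lists."""
    pred = {k.lower() for k in predicted}
    gold = {k.lower() for k in expected}
    hits = len(pred & gold)
    if hits == 0:
        return 0.0
    precision = hits / len(pred)
    recall = hits / len(gold)
    return 2 * precision * recall / (precision + recall)

def best_provider(results, expected):
    """Name of the provider whose keywords score the highest F1."""
    return max(results, key=lambda name: f1_score(results[name], expected))

def combine_by_vote(results, min_votes=2):
    """Keep keywords returned by at least min_votes providers."""
    votes = Counter()
    for keywords in results.values():
        votes.update({k.lower() for k in keywords})
    return sorted(k for k, n in votes.items() if n >= min_votes)

# Hypothetical outputs for one document (not real provider results).
results = {
    "provider_a": ["keyword extraction", "NLP", "pricing"],
    "provider_b": ["keyword extraction", "NLP", "benchmark"],
    "provider_c": ["benchmark", "cloud"],
}
expected = ["keyword extraction", "NLP", "benchmark"]

winner = best_provider(results, expected)
combined = combine_by_vote(results)
print(winner, combined)
```

In practice you would average the score over many documents from your own data, and track price and latency alongside it.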
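A minimal sketch of what such a call might look like, using only the Python standard library. The endpoint URL, the JSON field names (`providers`, `language`, `text`), and the provider identifiers are assumptions for illustration, so check them against the current Eden AI documentation. `build_keyword_request` is a hypothetical helper, and the request is only built here, not sent.

```python
import json
import urllib.request

# Assumed endpoint; verify against the Eden AI API documentation.
EDEN_AI_URL = "https://api.edenai.run/v2/text/keyword_extraction"

def build_keyword_request(api_key, text, providers, language="en"):
    """Build (but do not send) a keyword extraction request."""
    payload = {
        # Providers are passed as a simple parameter (assumed format:
        # a comma-separated string of provider identifiers).
        "providers": ",".join(providers),
        "language": language,
        "text": text,
    }
    return urllib.request.Request(
        EDEN_AI_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

request = build_keyword_request(
    api_key="YOUR_API_KEY",  # placeholder, replace with your own key
    text="Eden AI lets you call several keyword extraction engines.",
    providers=["amazon", "ibm"],  # assumed provider identifiers
)

# To actually send the request (requires a valid key and network access):
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```

Switching engines then only means changing the `providers` list, which is the point of a unified API.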
Part of the response:
There are numerous Keyword Extraction engines available on the market: it is impossible to know all of them, let alone which ones perform well. The best way to integrate Keyword Extraction technology is the multi-cloud approach, which lets you reach the best performance and prices depending on your data and project. This approach may seem complex, but Eden AI simplifies it for you by centralizing the best providers' APIs.