This article is brought to you by the Eden AI team. We allow you to test and use in production a large number of AI engines from different providers, directly through our API and platform. In this article, we test several pre-trained Object Detection APIs on various relevant use cases.
If you are a solution provider and want to integrate Eden AI, contact us at: firstname.lastname@example.org
In recent years, one of the most popular applications of Artificial Intelligence has been computer vision. This popularity is due to the huge diversity of applications and needs: medical imaging, industry, transport, surveillance, etc. Nowadays, almost every field uses cameras and pictures in its activities. Computer vision includes various functionalities:
This table does not represent an exhaustive list of all computer vision functionalities. Many solutions are based on several functionalities combined.
It is very important to distinguish between pre-trained APIs and AutoML APIs:
This article briefly covers pre-trained Object Detection APIs. The aim is to answer the following questions: which problems can be solved with this kind of API? Who are the main providers on the market? What is the optimal process when using pre-trained APIs?
For our study of pre-trained Object Detection APIs, we selected the APIs of 6 providers reported as high-performing by many blog articles and by the Gartner and Forrester rankings:
This is the pool of provider APIs we are going to test. It is interesting to note that other solutions, including open source ones, also exist.
As said previously, object detection APIs are used in hundreds of fields, for many different use cases. In this article, we test several object detection APIs with various types of pictures representing common use cases.
We chose 3 use cases from different fields, represented by 3 pictures:
For each use case, we tested the Object Detection API of the 6 providers, with one picture per use case. Of course, for a real project you will need to test on a representative part of your database (not only one picture) to get an accurate view of the different performances.
For GCP, AWS, Azure and Watson, we don't need to use their APIs directly. In fact, the Eden AI Object Detection API returns the results of these 4 providers with only one simple request. With a few lines of code, we can access the results from all 4 providers. Clarifai and Chooch AI are not implemented yet on Eden AI, so we use their APIs directly.
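To give an idea of what "one simple request" looks like, here is a minimal sketch of how such a multi-provider call could be built in Python. The endpoint URL, header and field names below are assumptions for illustration only, not the documented Eden AI API contract:

```python
import json

# Hypothetical endpoint: one request queries several providers at once.
# URL, header and payload field names are illustrative assumptions.
EDENAI_URL = "https://api.edenai.run/v2/image/object_detection"

def build_object_detection_request(api_key, providers, image_path):
    """Return (headers, payload) for a single multi-provider request."""
    headers = {"Authorization": f"Bearer {api_key}"}
    payload = {
        # One comma-separated field selects all providers to query at once
        "providers": ",".join(providers),
        "file": image_path,
    }
    return headers, payload

headers, payload = build_object_detection_request(
    "MY_API_KEY", ["google", "amazon", "microsoft", "ibm"], "street.jpg"
)
print(json.dumps(payload))
```

The request would then be sent with any HTTP client (e.g. `requests.post(EDENAI_URL, headers=headers, data=payload)`), and the response parsed per provider.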
Better than just comparing results from different APIs, Eden AI provides the Genius functionality, which returns a smart combination of all the results. For our examples, we will see what this combined result gives us.
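The exact Genius algorithm is not public, but the idea of combining providers' results can be sketched very simply: for every detected label, keep the highest-confidence detection across all providers. This is only an illustration of the concept, with an invented response shape, not the real implementation:

```python
def combine_results(results_by_provider):
    """Merge detections from several providers, keeping the best
    confidence per label.

    results_by_provider: {provider: [{"label": str, "confidence": float}, ...]}
    (this dict shape is an assumption for the sketch)
    """
    best = {}
    for detections in results_by_provider.values():
        for det in detections:
            label = det["label"]
            if label not in best or det["confidence"] > best[label]["confidence"]:
                best[label] = det
    # Return the merged detections, most confident first
    return sorted(best.values(), key=lambda d: d["confidence"], reverse=True)

merged = combine_results({
    "google": [{"label": "person", "confidence": 0.97}],
    "amazon": [{"label": "person", "confidence": 0.88},
               {"label": "tree", "confidence": 0.91}],
})
print(merged)  # "person" kept from Google, "tree" from AWS
```

This kind of union-style merge shows why combining providers can beat any single one: each provider contributes the objects it detects best.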
The API response is a text response (often in JSON format) that will be used to develop applications. For our example, the way to proceed is:
1- Benchmark the object detection APIs available on the market
2- Choose the API provider that best fits your project, or combine the results of multiple providers' APIs
3- Integrate the final API in your project / software
Finally, depending on the project, visual results with bounding boxes drawn on the pictures may or may not be useful. For the benchmark, however, this is the best and fastest way to evaluate and visualize performance.
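Drawing those bounding boxes usually requires one small conversion step: most providers return box coordinates normalized to [0, 1], which must be mapped to pixel coordinates before drawing (e.g. with Pillow's `ImageDraw.rectangle`). A minimal sketch, assuming an illustrative box schema (`x_min`, `y_min`, `x_max`, `y_max`; each provider uses its own field names):

```python
def to_pixel_box(box, image_width, image_height):
    """Convert a normalized box to (left, top, right, bottom) in pixels.

    The "x_min"/"y_min"/"x_max"/"y_max" keys are an assumed schema for
    this sketch; real provider responses name these fields differently.
    """
    return (
        round(box["x_min"] * image_width),
        round(box["y_min"] * image_height),
        round(box["x_max"] * image_width),
        round(box["y_max"] * image_height),
    )

box = {"x_min": 0.1, "y_min": 0.2, "x_max": 0.5, "y_max": 0.8}
print(to_pixel_box(box, 640, 480))  # (64, 96, 320, 384)
```

The resulting pixel tuple can be passed directly to a drawing library to overlay the detection on the picture.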
Google, IBM, Clarifai, AWS, Azure and Chooch provide APIs for multiple computer vision functionalities. They also provide a graphic interface, but only to test and process a few pictures.
For Clarifai, here is the result:
Chooch AI result:
Eden AI API returns responses for AWS, GCP, IBM and Azure APIs:
For Clarifai, here is the API response:
For Chooch AI, here is the API response:
Looking at the visual results, we can see that Google, Chooch and Microsoft are the most efficient APIs, but they can all be complementary. Here, the Genius functionality seems to give the best results.
You can also look at the API requests in Python and the JSON responses. This response data will be used to build the projects.
For this use case, we can note that the Google Cloud API performs really well. On this picture, Google is the only provider that detects the people. The Genius feature is interesting here: to the person detection that Google does very well, it adds the context elements (park, trees, etc.) that are better detected by other providers (Chooch AI, AWS and IBM in particular).
For this use case, all the provider APIs seem to perform well. In this case, the Genius feature essentially validates that the AI providers predict the same thing, and therefore secures those predictions. Moreover, the choice of API provider will be based on:
So we chose 3 random use cases, and for all 3 it seems that the way to manage the project will be different:
Depending on the use case, the best way to obtain the highest performance always differs. For some use cases, the API of provider A will be the best; for another use case, provider B's API is better. For a more complex use case, a custom model may be better, or even a proprietary model developed from open source solutions.
With Eden AI, you get fast access to the results of various providers, so you can get a better idea of which solution fits you best.
The decision making is as follows:
First you run your data on Eden AI to benchmark the solutions available on the market. Then you have 3 options:
a — You find a result that pushes you to choose the one API that meets your expected performance
b — Different providers give good results, so you use the Genius functionality to join forces and get a combined result that is better than any single provider's result.
c — Pre-trained APIs cannot provide good results for your project :
This process guarantees that you make the right choices to succeed in your project. Eden AI is simply a tool that allows you to run a benchmark very easily and quickly. Finally, it is possible to use the Eden AI API to realize the entire project, avoiding accounts and billing with many different providers, while keeping the flexibility of not being tied to a single provider.