Why Computer Vision Projects Fail (and How to Succeed)

It’s only a matter of time until Computer Vision and Deep Learning surpass human vision. In some areas, such as the medical field, this has already been reported: in some studies, artificial intelligence (AI) detected breast cancer with higher accuracy than human experts.

However, using computer vision to solve business challenges is not as straightforward as it may seem. In this article, we look at two serious reasons why Computer Vision projects fail, and at how to overcome these challenges.

About us: Viso Suite is an end-to-end computer vision infrastructure. By removing the need for point solutions, it lets machine learning teams handle project development, model training, application deployment, and maintenance for their computer vision projects in a single interface. To learn more, book a demo with the Viso team.

The Value of Computer Vision

Computers can analyze video streams in real time, turn them into variables, and apply logic workflows to solve complex visual problems.

Using AI, a computer can solve visual problems such as counting objects or recognizing shapes (Object Detection), in many cases with higher precision and speed than humans. A computer can also repeat such a task quickly and autonomously, as many times as needed. This is the basis of a wide range of real-world computer vision applications across multiple industries. A recent example is the use of computer vision for coronavirus control.

There are many situations where a computer can complete a visual task better than humans, with higher consistency and precision. The advantages are much the same as with the automation of any manual task.

Hence, it’s not surprising that many businesses in offline industries could potentially make use of computers that increase the quality of their product or service and/or save costs in their operations. However, some big pitfalls come with the real-world use of Computer Vision and visual AI in general.

Problem One: The Objects are Not Visible

As simple as it sounds: AI vision is not magic and cannot overcome physics. A problem that deals with things that are not visible cannot be solved using Computer Vision. A computer’s ability to “see” can only be as good as the image quality of the underlying camera images and videos or even video streams.

Some time ago, I worked on a remote monitoring project where dogs were to be counted using machine learning algorithms. Unfortunately, because some of the dogs’ fur was the same color as the floor, they were “invisible” to the camera and the AI.

The solution to such situations is fairly simple: either the camera angle or the scene needs to change so that the objects are clearly visible. Where this is not possible, more sophisticated workflows can be set up to count objects over time instead of in a single frame. This makes sense if the scene or the objects change, so that every object becomes visible at some point.
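As a rough illustration of counting over time, the sketch below keeps the per-frame counts of the most recent frames and reports the highest and the median count in that window. It is only a minimal example under stated assumptions: the class name `WindowedCounter`, the window size, and the idea of using the peak or the median are hypothetical choices, and the detector that produces the per-frame counts is not shown.

```python
from collections import deque
from statistics import median


class WindowedCounter:
    """Aggregates per-frame object counts over a sliding window of frames.

    Hypothetical helper for illustration; the per-frame counts are assumed
    to come from some object detector that is not part of this sketch.
    """

    def __init__(self, window_size: int = 30):
        # Only the most recent `window_size` counts are kept.
        self.counts = deque(maxlen=window_size)

    def update(self, frame_count: int) -> None:
        """Record the number of objects detected in the latest frame."""
        self.counts.append(frame_count)

    def peak(self) -> int:
        """Highest count in the window (0 if no frames were recorded)."""
        return max(self.counts, default=0)

    def typical(self) -> float:
        """Median count in the window (0 if no frames were recorded)."""
        return median(self.counts) if self.counts else 0


counter = WindowedCounter(window_size=4)
for detected in [3, 4, 2, 5, 4, 3]:  # made-up per-frame counts
    counter.update(detected)

print(counter.peak())     # highest count among the last 4 frames
print(counter.typical())  # median count among the last 4 frames
```

The peak is only a sensible estimate if no objects enter or leave the scene during the window and the detector does not over-count; the median is more robust to occasional false detections but will under-count objects that are hidden most of the time.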

Figure: Object Detection example with all objects visible in the camera’s line of sight

Problem Two: Cloud Computing is Not Enough

For many AI vision applications, the traditional cloud computing model is not suitable. Cloud-driven computer vision systems need a constant internet connection, add network round-trip time that causes latency issues, and often come with privacy concerns related to data offloading. Therefore, machine learning algorithms are deployed to the edge device (Edge AI), where the data is generated, in a resource-constrained environment (power usage, computing hardware).

With AI moving from the cloud to the edge, the main challenge is no longer finding an algorithm to do something but achieving an efficient setup, especially since numerous Computer Vision Libraries and Deep Learning Frameworks have recently been open-sourced.

The accuracy and latency of Computer Vision tasks depend on the availability of computational resources. In general, more accurate computer vision models (for example, Mask R-CNN) tend to consume significantly more resources.

Figure: Concept of Edge Computing

This matters greatly, especially in large-scale AI vision solutions. Achieving similar results with lower-grade hardware means cost savings that can quickly run into the millions. Unfortunately, many visual AI solutions are not viable in production because they rely on a very “heavy” (computationally intensive) model that requires expensive hardware such as GPUs. In such cases, the economic benefit achieved would not make up for the costs that come with the setup.
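The arithmetic behind this argument can be sketched in a few lines. All figures below are made-up placeholders rather than real prices, and the function names are hypothetical; the sketch also ignores power, maintenance, and connectivity costs, which a real calculation would need to include.

```python
def hardware_savings(num_devices: int,
                     heavy_unit_cost: float,
                     light_unit_cost: float) -> float:
    """Total hardware cost saved by using the cheaper device everywhere."""
    return num_devices * (heavy_unit_cost - light_unit_cost)


def is_viable(annual_benefit_per_device: float,
              unit_cost: float,
              lifetime_years: float) -> bool:
    """True if the benefit over the device lifetime exceeds its hardware cost."""
    return annual_benefit_per_device * lifetime_years > unit_cost


# Hypothetical rollout: 5,000 devices, 700 vs. 150 per device.
print(hardware_savings(5000, heavy_unit_cost=700.0, light_unit_cost=150.0))

# Hypothetical benefit of 100 per device per year over 3 years.
print(is_viable(100.0, unit_cost=700.0, lifetime_years=3))  # heavy setup
print(is_viable(100.0, unit_cost=150.0, lifetime_years=3))  # light setup
```

With these placeholder numbers, the same application pays for itself on the cheaper device but not on the expensive one, which is the situation described above.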

The Landscape is Improving

But I have good news for you! Computing hardware becomes both more powerful and cheaper year by year. Hence, machine learning models considered to be “heavy” can be used more broadly, and switching the hardware to modern AI accelerators can result in great performance gains.

If waiting or exchanging the AI hardware is not an option, there is still much you can do. Many visual problems can be solved while drastically reducing the number of frames processed per second (FPS). If you count static objects, for example, processing only one frame per second or less leaves more compute time for each frame, which can be spent on a more accurate model. As surprising as it may seem, the perceived quality of the application could be much higher.
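One simple way to reduce the processed frame rate is to skip frames before they reach the model. The sketch below picks which frame indices of a stream to process for a given target FPS. The function name `frames_to_process` and its parameters are hypothetical, and reading frames from a camera and running the model are left out.

```python
from typing import List


def frames_to_process(source_fps: float,
                      target_fps: float,
                      total_frames: int) -> List[int]:
    """Return the indices of the frames to run the model on.

    Frames are picked at evenly spaced intervals so that roughly
    `target_fps` frames per second are processed; all others are skipped.
    """
    if source_fps <= 0 or target_fps <= 0:
        raise ValueError("frame rates must be positive")

    # Processing more frames than the camera delivers is not possible.
    target_fps = min(target_fps, source_fps)
    step = source_fps / target_fps  # source frames per processed frame

    indices = []
    next_pick = 0.0
    for i in range(total_frames):
        if i >= next_pick:
            indices.append(i)
            next_pick += step
    return indices


# Three seconds of a 30 FPS stream, processed at 1 FPS.
print(frames_to_process(source_fps=30, target_fps=1, total_frames=90))
# → [0, 30, 60]
```

At 30 FPS the model has about 33 ms per frame to keep up with the stream; at 1 FPS it has a full second, which is what makes a heavier model feasible on the same hardware.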

Figure: Edge hardware compatibility with computer vision systems

Avoid Common Pitfalls and Improve Computer Vision Projects With Viso Suite

Viso Suite infrastructure offers an end-to-end, fully customizable solution for computer vision applications. With advanced capabilities and a visual interface for the development, deployment, and management of computer vision applications, organizations can achieve project goals efficiently and effectively.

To see how Viso Suite can take your business’s automated solutions to the next level, book a demo with our team.

Figure: End-to-end Computer Vision with Viso Suite

If you want to learn more about the basics of Computer Vision, we recommend you read the following articles:

What is Computer Vision? Learn what Computer Vision is (not technical).
Explore a list of popular Computer Vision applications today.
Learn about the 5 most popular Deep Learning Frameworks.
Read about self-supervised learning of machines.
Understand how to evaluate model performance
Three Types of Deep Neural Networks (MLP, CNN, RNN)
A deep dive into Convolutional Neural Networks (CNN)
