The Most Valuable Computer Vision Smart City Applications (2021 Guide)

Computer Vision Smart City Use Cases


Find the best and most valuable applications of computer vision technology in Smart Cities and Smart Towns in 2021. This article covers novel, practical, and critical real-world use cases for cameras and computer vision technologies. We provide the end-to-end technology for organizations to power computer vision projects and systems in Smart Cities.

Smart Cities demand highly scalable and connected technologies that operate across multiple distributed locations. Recent advances in Computer Vision such as Edge AI and Deep Learning combine AI vision with IoT. These new technologies make it possible to handle huge amounts of complex visual data and give real-world computer vision systems fast processing, robustness through decentralization, and scalability.

In this article, you will learn about:

  • The basis: What is Computer Vision
  • New trends: Deep Learning and Edge AI
  • Safety and security smart city applications
  • Pandemic control and compliance
  • Traffic and infrastructure monitoring
  • How to get started

Computer Vision in Smart Cities

In the past two decades, smart city solutions have emerged, enabled by technologies such as the Internet of Things (IoT), Artificial Intelligence (AI), Deep Learning, and Cloud Computing. They offer vast potential to address infrastructural, societal, and pandemic challenges.

AI, Computer Vision, and Image Recognition

Computer Vision is a subfield of Artificial Intelligence that includes technologies allowing computers to “learn” to recognize images or their characteristics. This makes it possible to identify objects, humans, animals, or positions in an image or video feed. Hence, the goal of Computer Vision is for a machine to understand the world and provide information that can be used to automate tasks.

New machine learning technologies, most prominently Deep Learning, brought significant breakthroughs to the field of image recognition, making AI vision robust and useful for mission-critical business applications.

Compared to traditional Machine Vision, Deep Learning does not require special imaging cameras and can provide accurate results with almost any digital camera or even a webcam. Hence, Smart City applications of Computer Vision often use already installed network cameras (IP cameras or CCTV cameras) as the input for real-time video analytics with AI models.

Edge Computer Vision

The latest trend, Edge AI, moves machine learning from the cloud to multiple edge devices (physical computers) that are connected to cameras and process the data on-device (Edge Intelligence). This approach builds on Edge Computing in combination with the Internet of Things (IoT) to manage multiple remote devices.

Edge AI helps to overcome the limitations of the cloud and enables high-performing, robust, real-time, and private computer vision applications – making it possible to use Computer Vision at scale in the real world.

Compared to other sensor technologies such as RFID, GPS, UWB, or BLE, which require installing sensors on every tracked entity, computer vision is non-invasive, comparably easy to implement and scale, and provides much more information (location, context, semantic information, multi-dimensional perspective).

Video analytics with deep learning for vehicle detection
Smart City Computer Vision Applications

The following is an extensive list of computer vision examples and deep learning use cases in smart cities. These high-value applications cover security and safety, pandemic control strategies, and traffic and infrastructure monitoring.

Top Computer Vision Smart City Applications
  • Application #1: Perimeter monitoring and Person detection
  • Application #2: Detect violent and dangerous situations
  • Application #3: Action Recognition for Vandalism Detection
  • Application #4: Compliance Monitoring and Inspection
  • Application #5: Suicide prevention in public spaces
  • Application #6: Crowd disaster avoidance application
  • Application #7: Protection of Critical Infrastructures
  • Application #8: Weapon detection and reporting
  • Application #9: Social Distancing monitoring in public places
  • Application #10: Automated mask detection
  • Application #11: Hygiene Compliance Control
  • Application #12: Healthcare monitoring
  • Application #13: Smart city traffic monitoring
  • Application #14: Traffic rule violation detection
  • Application #15: Roadside surveillance application
  • Application #16: Parking Lot Occupancy Detection

Safety and Security Applications in Smart Cities

1. Perimeter monitoring and Person detection

The detection of people has a wide range of applications in smart cities. Applications include computer vision methods for real-time video analysis to interpret human activity. Other computer vision applications detect people and recognize activity in restricted areas at airports or rail stations.

Deep Learning systems are used to detect intrusion events across the perimeter in real-time and to identify the target’s position. Such automated, AI-based perimeter monitoring systems can efficiently cover large monitoring areas for safety and security applications.
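To make the idea of perimeter intrusion detection more concrete, here is a minimal sketch of the logic that could run on top of a person detector. It assumes the detector already returns bounding boxes as (x_min, y_min, x_max, y_max) in image coordinates and that the restricted zone is drawn as a polygon in the same coordinates. The function names and the choice of the bottom-center "foot point" are illustrative assumptions, not the method of any particular product.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max), assumed format


def point_in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Ray-casting test: True if the point lies inside the polygon."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def foot_point(box: Box) -> Point:
    """Bottom-center of a box, used as a rough ground position (assumption)."""
    x_min, _, x_max, y_max = box
    return ((x_min + x_max) / 2.0, y_max)


def find_intrusions(person_boxes: List[Box], restricted_zone: List[Point]) -> List[Box]:
    """Return the person boxes whose foot point falls inside the zone."""
    return [b for b in person_boxes if point_in_polygon(foot_point(b), restricted_zone)]
```

For example, with a square zone `[(0, 0), (10, 0), (10, 10), (0, 10)]`, a person box `(2, 2, 4, 8)` would be reported, while a box far outside the square would not.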

2. Detect violent and dangerous situations

Computer Vision in smart city applications often involves the automatic detection of dangerous situations, with the goal of ensuring the safety of residents through smart video surveillance. Such dangerous situations include fights, brawls, robberies, and more. The strong variability of such scenes makes them challenging to detect.

Algorithms are used for detecting people, tracking people, and estimating three-dimensional human poses (Human Pose Estimation), all with the aim of recognizing the actions and interactions of people. The output of AI models is processed with a logic flow that defines the scenarios and events which should trigger human intervention or provide analytics. Such applications involve people counting and the detection of people who spend an unusual amount of time in specific areas (dwell time analysis).
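As a rough illustration of the dwell time analysis mentioned above, the following sketch keeps track of when each person first entered a zone and flags anyone who stays longer than a limit. It assumes an upstream tracker supplies stable integer track IDs for the people currently inside the zone. The class name `DwellMonitor` and its parameters are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DwellMonitor:
    """Flags tracked people who stay in a zone longer than a limit (sketch)."""

    max_dwell_seconds: float
    first_seen: Dict[int, float] = field(default_factory=dict)

    def update(self, timestamp: float, track_ids_in_zone: List[int]) -> List[int]:
        present = set(track_ids_in_zone)
        # Forget tracks that have left the zone so their timer restarts on re-entry.
        for track_id in list(self.first_seen):
            if track_id not in present:
                del self.first_seen[track_id]
        alerts = []
        for track_id in sorted(present):
            # Record the entry time the first time a track is seen in the zone.
            start = self.first_seen.setdefault(track_id, timestamp)
            if timestamp - start >= self.max_dwell_seconds:
                alerts.append(track_id)
        return alerts
```

Calling `update` once per processed frame with the current timestamp returns the IDs that exceeded the limit, which a workflow could then turn into an alert or an analytics entry.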


Object Detection in Smart Cities to recognize dangerous situations
3. Action Recognition for Vandalism Detection

Different machine learning methods can be used in action recognition applications. For example, machine learning and feature extraction techniques have been introduced for human behavior monitoring and support. Vision-based technologies can recognize human behaviors such as fighting and vandalism events that may occur in a public district using one or numerous camera views. The computer vision systems can be used to detect and predict suspicious and aggressive behaviors in real-time, even in complex and crowded environments.

4. Safety and Compliance Control and Inspection with Deep Learning

There is a wide range of compliance monitoring use cases with computer vision in a smart city. Since manual monitoring of compliance and unsafe conditions is very challenging, labor-intensive, and expensive, AI vision methods offer automated and scalable alternatives to manual inspection and site observations. Computer vision methods can also be more time-efficient and more accurate. Expert safety officers are not always present, which makes manual inspection processes unreliable as a regular practice. Computer vision algorithms can be used to perform continuous monitoring of complex, large-scale sites.

In construction especially, the workforce faces the highest probability of injury or death on job sites, almost 5 times higher than in any other industry. Numerous injuries can be avoided if workers always wear proper Personal Protective Equipment (PPE) during work, including helmets, safety glasses, vests, gloves, steel toe boots, etc. To minimize accidents, compliance monitoring is required to effectively detect workers who are not wearing proper PPE or who repeatedly violate safety norms.
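One simple way to turn object detections into a PPE compliance check is to test whether a detected helmet sits in the head region of a detected person. The sketch below assumes a detector that returns separate boxes for persons and helmets in (x_min, y_min, x_max, y_max) image coordinates with y growing downward. The 30% head region and the function names are illustrative assumptions.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max), assumed format


def helmet_worn(person_box: Box, helmet_boxes: List[Box], head_fraction: float = 0.3) -> bool:
    """True if any helmet center lies in the top part of the person box."""
    px1, py1, px2, py2 = person_box
    # Lower edge of the assumed head region (top 30% of the person by default).
    head_bottom = py1 + head_fraction * (py2 - py1)
    for hx1, hy1, hx2, hy2 in helmet_boxes:
        cx = (hx1 + hx2) / 2.0
        cy = (hy1 + hy2) / 2.0
        if px1 <= cx <= px2 and py1 <= cy <= head_bottom:
            return True
    return False


def find_violations(person_boxes: List[Box], helmet_boxes: List[Box]) -> List[Box]:
    """Return the person boxes without a helmet in the head region."""
    return [p for p in person_boxes if not helmet_worn(p, helmet_boxes)]
```

A helmet carried in the hand would fall outside the head region and therefore still count as a violation, which is the behavior a site inspector would likely expect. A production system would need more care with occlusion and overlapping people.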

5. Suicide prevention in public spaces

Camera systems are used to automate suicide prevention in public spaces by classifying visual features, characterizing body joint movements, and identifying unusual behavior. Deep learning applications running on CCTV camera feeds can be used to assess crisis behaviors at hotspots such as metro stations. Automated computer vision applications can thus help to identify self-harm attempts in progress and initiate an intervention.

The main goal is the development of automated detection systems for early intervention and detection of pre-attempt behaviors. Therefore, computer vision applications can help to identify risk factors, for example, by inferring depression from facial expressions using facial analytics. Another way is to detect unusual behavior and movement patterns or waiting times at specific hotspots. Also, forensic vision applications are used to understand factors after an attempt.

6. Crowd disaster avoidance application

Computer vision surveillance of crowded scenes is one of the most important applications because gatherings in public places are increasing and the risk of disasters and stampedes there is high. Therefore, vision-based crowd disaster avoidance systems are used to improve public safety.

In general, such applications focus on crowd scene analysis and behavior analysis. Stampedes may occur due to the abnormal behavior of individuals or unexpected events. Deep learning models are used to count people and estimate crowd density with one or multiple cameras at scale.
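Once people are counted, crowd density is simply the number of people divided by the monitored area. The sketch below combines per-camera counts for one zone and maps the density to a warning level. It assumes the cameras cover non-overlapping parts of the zone, and the thresholds of 2 and 4 people per square meter are placeholder values that would need to be set by safety experts.

```python
from typing import Dict


def zone_density(counts_per_camera: Dict[str, int], zone_area_m2: float) -> float:
    """People per square meter, assuming cameras cover disjoint parts of the zone."""
    if zone_area_m2 <= 0:
        raise ValueError("zone_area_m2 must be positive")
    return sum(counts_per_camera.values()) / zone_area_m2


def density_level(density: float, warn_at: float = 2.0, critical_at: float = 4.0) -> str:
    """Map a density to a level; both thresholds are illustrative assumptions."""
    if density >= critical_at:
        return "critical"
    if density >= warn_at:
        return "warning"
    return "normal"
```

For instance, two cameras counting 120 and 130 people over a 100 square meter zone give a density of 2.5 people per square meter, which this sketch would label as "warning".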

7. Protection of Critical Infrastructures in smart cities

Smart cities use intelligent computing technologies to monitor activities or prevent potential risks. Critical infrastructures are essential resources for society, and their failure would have a very high impact and cost. These include roads, communications, water, energy, and more.

While closed-circuit television (CCTV) systems have become an essential element for security and law enforcement, traditional surveillance systems depend on the attention capacity of a human operator who is confronted with a multitude of video streams. The main limitations of conventional video surveillance systems include high operating costs due to the need for personnel to watch the videos, fatigue-related errors, and an enormous amount of video data that demands high bandwidth. Thus, local processing (Edge AI) is required since centralized processing cannot scale. High latency and communication delays are a major concern in mission-critical real-time use cases where instant decision-making is needed to reduce risks. Rigid centralized systems also lack flexibility after deployment and drive up maintenance costs.

Smart Edge AI computer vision systems make it possible to provide a scalable and easily calibrated multi-camera system that automatically triggers alarms for potential risks. Different ML and computer vision methods can be applied to the same video workflow. The network bandwidth can be significantly reduced through on-device processing with local edge nodes (connected computers, edge devices) that are dynamically configured according to the task conditions or metrics. Such edge computers can be any computer or embedded system capable of running high-performance video processing in real-time. A popular edge device for edge computer vision is the Nvidia Jetson TX2.
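A back-of-the-envelope calculation shows why on-device processing reduces network load. The sketch below compares streaming every camera to a central server with sending only small event messages from edge devices. All figures used in the example (100 cameras, 4 Mbit/s per stream, one 200-byte event per second per camera) are made-up assumptions for illustration, not measurements.

```python
def stream_bandwidth_mbps(num_cameras: int, bitrate_mbps: float) -> float:
    """Total uplink needed when every camera streams video to a central server."""
    return num_cameras * bitrate_mbps


def event_bandwidth_mbps(num_cameras: int, events_per_second: float, bytes_per_event: int) -> float:
    """Total uplink needed when edge devices only send event metadata."""
    bits_per_second = num_cameras * events_per_second * bytes_per_event * 8
    return bits_per_second / 1_000_000


# Hypothetical deployment: 100 cameras, 4 Mbit/s video each,
# versus one 200-byte event message per camera per second.
central = stream_bandwidth_mbps(100, 4.0)
edge = event_bandwidth_mbps(100, 1.0, 200)
```

Under these assumed numbers, central streaming needs about 400 Mbit/s while the event messages need about 0.16 Mbit/s. Real savings depend on video codecs, event rates, and how often clips are uploaded for review.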

The automatic video analysis capabilities of deep learning are therefore highly valuable for protecting critical infrastructure. Human detection with deep learning can be used in combination with perimeter monitoring, facial recognition, face tracking, and multi-person tracking with re-identification.

8. AI-based automated weapon detection and reporting

Mega-cities with dense urban populations face huge challenges in controlling rising crime rates. Deep learning vision applications can therefore be developed to autonomously surveil public places and detect handheld arms in real-time. AI models analyze video streams to perform object detection on all objects visible in the camera streams. The weapon detection application triggers an alert when any type of weapon is visible and can track the movement of the object.

This approach can be used to develop an automated firearm detection system. High-performance systems are capable of detecting and tracking the person holding the weapon and use the information to perform facial recognition.

A related smart city computer vision use case involves the automated detection of suspicious objects that have been placed in public places. AI models, such as the popular Convolutional Neural Network based model YOLOv3 and the recent YOLOR, can be used to detect placed objects that could pose a threat and flag them for human review.

COVID-19 Pandemic Measures for a Smart City

The coronavirus pandemic has revealed the limitations of existing city structures. Hence new systems and technologies such as computer vision provide fast and effective mechanisms to limit further spread of the COVID virus.

The investment in smart city initiatives can enhance the planning and preparation ability to combat pandemics now and in the future. Technological advances offer alternative methods to physical connectivity with the goal of maintaining urban functionality during times of crisis.

9. Social Distancing monitoring in public places

Smart vision systems are capable of monitoring and enforcing social distancing between people to slow the pandemic spread effectively. For example, deep learning based AI models can be used to create applications for social distancing monitoring with real-time object detection to detect people in videos of surveillance cameras at scale.
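The core check behind social distancing monitoring can be sketched as a pairwise distance test between detected people. The example assumes that each detection has already been mapped from image pixels to ground-plane coordinates in meters (for example through a camera calibration step, which is not shown here). The function name and the 1.5 meter default are illustrative assumptions.

```python
from itertools import combinations
from math import hypot
from typing import List, Tuple


def distancing_violations(
    positions_m: List[Tuple[float, float]], min_distance_m: float = 1.5
) -> List[Tuple[int, int]]:
    """Return index pairs of people standing closer than the minimum distance."""
    violations = []
    for (i, a), (j, b) in combinations(enumerate(positions_m), 2):
        # Euclidean distance on the ground plane, in meters.
        if hypot(a[0] - b[0], a[1] - b[1]) < min_distance_m:
            violations.append((i, j))
    return violations
```

Checking all pairs is quadratic in the number of people, which is usually acceptable for a single camera view. Very dense scenes might call for a spatial index instead.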

In crowd detection applications, a deep neural network (DNN) model can automatically detect people in urban public spaces using CCTV security cameras. The people’s movement trajectories are used to identify potentially high-risk areas for planners to redesign the structure of open and public spaces to make them more pandemic-proof.

10. Automated mask detection

Another application example is the automated detection of masks worn by people in public places. AI-enabled systems can facilitate mass surveillance, using deep learning algorithms to monitor conditions and send alerts if people do not wear a mask or do not comply with lockdown measures. This can be combined with monitoring of compliance with social distancing protocols in public spaces such as metro stations.

The insights can be used to enhance the capacity of cities to predict pandemic patterns, facilitate a timely response, minimize the transmission of the virus, provide support to specific sectors, minimize supply chain disruption and ensure the continuation of essential services and operations.

11. Hygiene Compliance Control with Deep Learning

Computer Vision and Deep Learning based approaches are valuable to automate safety and compliance monitoring. For example, such applications power non-intrusive vision-based systems for tracking people’s activity in public places and infrastructure. AI vision methods have shown better results compared to proximity-based techniques.

An example application is automated hand hygiene compliance monitoring with deep learning. This system conducts spatial analytics to provide insights into human movement patterns. There are promising use cases for reducing hospital-acquired infections.

12. Healthcare monitoring to minimize physical contact

Face-recognition technologies are used to minimize the need for touching surfaces and objects or physical contact between individuals to reduce the transmission risk. Smart technologies help to reduce the risk of human exposure, especially in risky places such as airports or hospitals.

Also, telemedicine is increasingly used to minimize risks, as it aims to reduce hospital visits by groups such as the elderly, who are more vulnerable. Furthermore, telemedicine can also support recovery and remote monitoring.

Healthcare monitoring systems include applications based on convolutional neural networks to detect a person falling, so-called human fall detection. Surveillance video can be used to detect the fall in combination with processing rules to automatically alert the caregivers.

Traffic and Infrastructure Monitoring

13. Smart city traffic monitoring

Computer Vision is used to analyze and predict traffic conditions. Traffic data include car count, frequency, and direction, gathered from surveillance cameras. Vehicle counting uses deep neural networks to detect different vehicle types in high-traffic situations and uses the information to optimize traffic management. Traffic monitoring also includes tracking of waiting times (dwell time tracking) and traffic flows.
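A common way to derive vehicle counts and direction from tracked detections is to count how often a track crosses a virtual line in the image. The sketch below assumes a tracker that provides, for each vehicle ID, the sequence of centroid positions (x, y) over time, and it uses a horizontal counting line at a fixed y value. The function name and the "down"/"up" labels are illustrative assumptions.

```python
from collections import Counter
from typing import Dict, List, Tuple


def count_line_crossings(
    tracks: Dict[int, List[Tuple[float, float]]], line_y: float
) -> Counter:
    """Count crossings of a horizontal line, split by direction of travel."""
    counts: Counter = Counter()
    for points in tracks.values():
        # Compare each centroid with the next one along the same track.
        for (_, y_prev), (_, y_curr) in zip(points, points[1:]):
            if y_prev < line_y <= y_curr:
                counts["down"] += 1  # moved from above the line to below it
            elif y_prev >= line_y > y_curr:
                counts["up"] += 1  # moved from below the line to above it
    return counts
```

Counting per direction gives the traffic flow in each lane direction, and dividing the counts by the observation time gives the frequency mentioned above.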

14. Traffic rule violation detection

Computer vision is used to analyze the huge amount of video surveillance data for traffic control in smart cities in order to locate traffic rule violations. While traditional computer vision methods have been unable to analyze such a huge amount of visual data generated in real-time, modern deep learning methods with Edge AI make it possible to find semantic patterns that are useful for interpretation.

An example is the automatic detection of bike riders without helmets in city traffic. Other use cases include the automatic recognition of situations that lead to road accidents to promote urban traffic safety and efficiency. Such applications include, for example, the detection of abnormal situations such as vehicles that are stopping at dangerous locations.

15. Roadside surveillance application

Double-parking and busy roadside activities such as frequent loading and unloading of trucks can also have a critical impact on traffic situations in cities with high transportation density. Here, computer vision applications with real-time surveillance of roadside loading and unloading bays are needed to detect the roadside occupation automatically.

Vehicle Detection
16. Parking Lot Occupancy Detection

Real-time parking lot occupancy detection has recently gained a lot of importance. Solutions based on computer vision have shown good accuracy and can be implemented using existing camera networks. Deep Learning is used to detect vacant parking spaces across multiple cameras.
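One straightforward way to decide whether a parking space is occupied is to measure how much of the marked space is covered by a detected vehicle. The sketch below assumes each space is annotated once as a box in image coordinates and that a vehicle detector returns boxes in the same (x_min, y_min, x_max, y_max) format. The 50% overlap threshold and the function names are illustrative assumptions.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max), assumed format


def overlap_ratio(slot: Box, vehicle: Box) -> float:
    """Fraction of the parking space area covered by the vehicle box."""
    ix1, iy1 = max(slot[0], vehicle[0]), max(slot[1], vehicle[1])
    ix2, iy2 = min(slot[2], vehicle[2]), min(slot[3], vehicle[3])
    # Width and height of the intersection, clamped at zero when boxes are disjoint.
    intersection = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    slot_area = (slot[2] - slot[0]) * (slot[3] - slot[1])
    return intersection / slot_area if slot_area > 0 else 0.0


def slot_occupancy(
    slots: Dict[str, Box], vehicle_boxes: List[Box], min_overlap: float = 0.5
) -> Dict[str, bool]:
    """Map each space name to True (occupied) or False (vacant)."""
    return {
        name: any(overlap_ratio(box, v) >= min_overlap for v in vehicle_boxes)
        for name, box in slots.items()
    }
```

Running this per camera and merging the resulting dictionaries gives a lot-wide occupancy map, provided each space is assigned to exactly one camera.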

Novel Edge AI capabilities make it possible to analyze visuals close to the sensing devices, without transmitting video streams to a central controller that acquires, encodes, analyzes, and processes the images. Hence, distributed parking lot occupancy detection can also yield better overall energy efficiency and accuracy.

Get started with Computer Vision in Smart City

The opportunity of computer vision in smart city use cases is largely untapped. Novel technologies such as robust Deep Learning models that can be easily re-applied in a wide range of situations and Edge AI capabilities that make computer vision scalable in real-time use cases have great potential to provide significant economic value to societies and cities.

Computer vision application platforms make it possible to develop such applications for smart cities. To do so, camera streams are analyzed in real-time with AI models in combination with a workflow logic to automate specific tasks (counting, warning, insights for decision-making). Check out Viso Suite, a state-of-the-art computer vision platform that offers scalable low-code/no-code tools to rapidly build and test such computer vision applications and run entire Edge AI systems on fully managed infrastructure.
