The use of AI in retail is of great importance for companies with physical stores to keep up with data-driven online business models. This article is a retail industry update to our earlier article featuring the most popular applications of computer vision in retail.
In the following, we will share practical information about the challenges and opportunities of implementing AI vision in retail. We will provide a technology overview and an extensive list of use cases and applications of visual AI in retail, covering the following areas:
- AI vision analysis and compliance inspection
- Visitor profiling and customer behavioral analysis using AI
- AI product engagement assessment
- AI customer service evaluation
- In-store and outdoor AI inspection
- AI-powered checkout experience
- Security and safety use cases
About us: Viso.ai provides the leading end-to-end Computer Vision Platform Viso Suite. Our solution enables leading organizations worldwide to build, deliver and scale their computer vision applications. Get a demo for your company.

Before we get started, this is a typical AI vision application for people counting in retail, built and delivered with Viso Suite:
Computer Vision in Retail
In recent years, web-based retailers have been taking over a growing market share from traditional brick-and-mortar retailers with several stores. While e-commerce facilitates access to the Point-of-Sale from anywhere through the internet, physical retail’s strength is the ability to shape the customer experience and provide personalized services.
However, one of the advantages leveraged by online retail is the ability to personalize the customer journey. In a digital environment, massive amounts of data can be acquired easily and used to generate individual recommendations based on prior “click behavior.”
Physical retailers have to transform their business into the digital world to keep up and play to their actual strengths. Therefore, new data acquisition methods are needed that are practical, compatible, and viable in an offline setting.

Data acquisition bottleneck
Today’s reality is that most data acquisition systems are very basic and far from the impressive but expensive showcases exhibited by large technology companies. Digitizing the offline world is challenging, and retailers find it hard to scale sensing systems that collect valuable information. Gathering data effectively in offline stores is difficult when it requires impractical layout changes that may even impact the customer experience.
Often, retailers have limited possibilities to install equipment or change store layouts when properties are rented, franchised, or externally managed. And while sensors are becoming more cost-effective, it’s usually software, implementation, and maintenance costs that quickly add up and exceed the value of the business case.
In-store retail analytics technologies
A variety of sensors (3D sensors, cameras, lidar, Time-of-Flight sensors, etc.) have been introduced to detect customers, count people, analyze footfall, and measure dwell times (Xovis, SensMax, ShopperTrak, V-Count, FLIR Brickstream, etc.). However, sensors are usually limited to specific tasks and require new hardware purchases and certain setups (height, distance, etc.).
Modern deep learning technology allows applying ML algorithms to almost any video stream. The application software uses the video feed of cameras and applies AI video analysis in a multi-step process (computer vision pipeline or flow). Any network camera or even existing CCTV or surveillance cameras can be integrated with such computer vision systems, thus reducing the risk of sunk costs and hardware lock-in. Some cameras already provide some intelligence features, but those are often limited and not flexible or powerful enough.
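The multi-step process described above can be sketched as a chain of small stages that each transform the detections of one video frame. This is a minimal illustration, not how any particular product implements its pipeline: the `Detection` type, the stage functions, and the 0.5 confidence threshold are all hypothetical, and the detections are assumed to come from some upstream object detection model.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Detection:
    """One detected object in a frame (hypothetical structure)."""
    label: str
    confidence: float
    box: Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels


# A stage takes the detections of one frame and returns a filtered list.
Stage = Callable[[List[Detection]], List[Detection]]


def keep_people(dets: List[Detection]) -> List[Detection]:
    """Keep only detections labeled as a person."""
    return [d for d in dets if d.label == "person"]


def min_confidence(threshold: float) -> Stage:
    """Build a stage that drops detections below the given confidence."""
    def stage(dets: List[Detection]) -> List[Detection]:
        return [d for d in dets if d.confidence >= threshold]
    return stage


def run_pipeline(dets: List[Detection], stages: List[Stage]) -> List[Detection]:
    """Apply each stage in order to the detections of one frame."""
    for stage in stages:
        dets = stage(dets)
    return dets


def count_people(frame_detections: List[Detection]) -> int:
    """People count for one frame: filter by class, then by confidence."""
    stages = [keep_people, min_confidence(0.5)]  # 0.5 is an assumed threshold
    return len(run_pipeline(frame_detections, stages))
```

Because each stage only depends on the list of detections, the same chain could in principle be applied to any camera feed that a detector can process.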

What is computer vision?
AI vision analyzes camera video with algorithms, allowing for data-based innovations that help traditional retailers improve the customer experience and strengthen their market position.
Computer Vision has proven to be a powerful technology to overcome the data acquisition bottleneck. Compared to other sensors, camera analysis requires much less installation equipment, is contactless, and can operate automatically.
Navigating the sea of sensors and applications
In an attempt to digitize the offline store, retailers often tend to create massive lists of data points and information they want to gather through the use of AI with cameras. Ambitious digital strategies require the development and maintenance of a portfolio of applications. A single computer vision application or system cannot cover all of that, and serious investment is required in any case.
While proofs of concept (PoC) or smaller implementations work well, issues usually appear when it comes to application scaling and operation in production outside of the optimized lab setting. In this phase, companies need to have full control over data and tailor applications to specific business needs to increase effectiveness and cost-efficiency.
Hence, the implementation of a computer vision application platform provides a way to build up internal capabilities, leverage synergies, standardize privacy and security, and prevent data silos.
Data Privacy for Computer Vision
When it comes to data acquisition, it’s absolutely critical to ensure complete data privacy in accordance with applicable privacy protection laws (GDPR, CCPA, etc.). Therefore, privacy-preserving computer vision is a must-have for all vision intelligence systems in retail.
Any information that can be used to distinguish or trace an individual’s identity (Personally Identifiable Information, PII) is highly sensitive. Compared to human operators watching video streams, visual AI provides a fully automated method. In addition, there is a wide range of visual obfuscation methods, for example, face blurring. However, this may not be enough to comply with data privacy regulations.
Easy-to-use computer vision APIs such as Google Vision API or AWS Rekognition are often used to develop prototype applications. However, with cloud-based AI processing, data has to be sent to the cloud and is analyzed using the remote resources of a data center. Most companies’ privacy policies do not allow this form of cloud-based computer vision.
New computer vision technologies leverage Edge AI concepts to process data on-device and in real-time, without storing or transferring video to the cloud. In contrast to cloud-based processing, Edge Computer Vision does not require a constant internet connection, allowing for much higher performance and cost-efficiency.
Hence, companies need to have full control over the data flows and application design, to implement mechanisms that allow complete anonymization of videos analyzed by AI algorithms. Deep learning models can analyze video streams and convert them into an anonymous stream of metadata that contains no sensitive visuals without losing any valuable information the business is looking for.
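To make the idea of an anonymous metadata stream concrete, the following sketch reduces the detections of one frame to a record that contains only a camera identifier, a timestamp, a zone name, and a people count. The `MetadataEvent` fields and function names are assumptions made for illustration. Whether such a record is sufficient for compliance depends on the specific regulation and deployment, which this sketch does not address.

```python
import json
from dataclasses import asdict, dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class MetadataEvent:
    """Anonymous record derived from one frame (hypothetical schema)."""
    camera_id: str
    timestamp: float  # seconds since the epoch
    zone: str
    people_count: int


def to_metadata(
    camera_id: str,
    timestamp: float,
    zone: str,
    person_boxes: List[Tuple[int, int, int, int]],
) -> MetadataEvent:
    """Reduce a frame's person detections to a count.

    The bounding boxes are only counted; neither boxes nor pixels are
    copied into the event.
    """
    return MetadataEvent(camera_id, timestamp, zone, len(person_boxes))


def serialize(event: MetadataEvent) -> str:
    """Encode the event as JSON for downstream analytics tools."""
    return json.dumps(asdict(event), sort_keys=True)
```

In a design like this, only the serialized events would leave the device, while the video frames are discarded after analysis.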

AI vision Inspection and Compliance
AI technology with computer vision has been widely used to automate visual inspection tasks of all kinds.
- SOP Monitoring: Computer vision can be used to continuously track compliance with standard operating procedures (SOP) in order to meet safety and operational guidelines. Compared to human operators, AI vision analysis is continuous, objective, and consistent.
- Branding in line with policy: Cameras have been used to ensure adherence to branding policies, for example, to analyze the presence or distribution of promotional material.
- Temperature and controls monitoring: Machine learning algorithms can be used to read analog controls and temperature displays. Such an AI application can send alerts if thresholds are reached or exceeded.
- Staff uniform detection as per policy: Detection of uniform clothing of staff to identify adherence to internal policies.
- Store opening and closing time analysis: Video analysis can detect retail store openings and closings to identify irregularities.
- Safety compliance: Automatically detect dangerous events and situations in the store, such as unstable stacks or narrow aisles. AI vision is also widely used to monitor adherence to COVID policies such as social distancing and mask detection.
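As an illustration of the threshold alerts mentioned for temperature and controls monitoring, the sketch below assumes that a vision model has already turned each camera reading of a display into a number. The function name and the rule of requiring several consecutive readings at or above the limit (to tolerate occasional misreads) are assumptions, not a description of a specific product.

```python
from typing import List, Sequence


def find_alerts(
    readings: Sequence[float],
    max_celsius: float,
    min_consecutive: int = 3,
) -> List[int]:
    """Return the indices at which an alert fires.

    An alert fires once per run of readings at or above max_celsius, at the
    moment the run reaches min_consecutive readings. A reading below the
    limit resets the run.
    """
    alerts = []
    run = 0
    for i, value in enumerate(readings):
        if value >= max_celsius:
            run += 1
            if run == min_consecutive:
                alerts.append(i)
        else:
            run = 0
    return alerts
```

With `min_consecutive=1`, every first reading at or above the limit after a normal reading triggers an alert, which is more sensitive but also more exposed to single misreads.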

Customer Behavior Analysis
Deep learning for people detection and movement analysis is used to gain an understanding of customer behavior. Pose estimation and real-time object detection are technologies used to provide data and metrics about the shopper journey. The in-store consumer analytics are often modeled after the online retail analytics: people’s paths vs. page visits in e-commerce.
- Average time spent (Dwell Time): AI algorithms can estimate the average time a customer spends in the shop. The dwell time provides valuable insights for marketing and operations.
- Customer visit time distribution over the day: The system identifies the number of visitors per hour of the day to optimize staffing and planning decisions.
- People counting and flow analytics: The customer footfall analysis uses deep learning to track the movement paths of customers inside the retail store. Information about customer paths is often visualized as a heat map or spaghetti diagram.
- Shopper behavior analysis: Image classification within certain regions of interest is used to detect important shopper behavior, for example, to analyze the use of self-checkouts.
- Trolleys vs. baskets: Generate data about whether customers use trolleys or baskets while shopping.
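Two of the metrics above, average dwell time and the distribution of visits over the day, can be derived from tracking output with simple aggregation. The sketch assumes that an upstream tracker provides observations as `(track_id, timestamp)` pairs, one per frame in which a person was seen. That input format and the function names are hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from typing import Dict, Iterable, Tuple

Observation = Tuple[str, datetime]  # (track_id, time the track was seen)


def dwell_times(observations: Iterable[Observation]) -> Dict[str, float]:
    """Seconds between the first and last sighting of each track."""
    first: Dict[str, datetime] = {}
    last: Dict[str, datetime] = {}
    for track_id, ts in observations:
        if track_id not in first or ts < first[track_id]:
            first[track_id] = ts
        if track_id not in last or ts > last[track_id]:
            last[track_id] = ts
    return {t: (last[t] - first[t]).total_seconds() for t in first}


def average_dwell(observations: Iterable[Observation]) -> float:
    """Mean dwell time in seconds over all tracks (0.0 if there are none)."""
    times = dwell_times(observations)
    return sum(times.values()) / len(times) if times else 0.0


def visitors_per_hour(observations: Iterable[Observation]) -> Dict[int, int]:
    """Count tracks by the hour of day in which they first appeared."""
    first: Dict[str, datetime] = {}
    for track_id, ts in observations:
        if track_id not in first or ts < first[track_id]:
            first[track_id] = ts
    counts: Dict[int, int] = defaultdict(int)
    for ts in first.values():
        counts[ts.hour] += 1
    return dict(counts)
```

Note that a track seen only once yields a dwell time of zero, and that a tracker losing and re-acquiring the same person would split one visit into two tracks. A production system would need to handle both cases.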

Visitor Profile Analysis
In-store analytics often involves the classification of visitors into groups or customer types. Deep-learning algorithms can classify visitors by gender, mood, activity, or other characteristics. Such customer analytics can be combined with other data (sales data or loyalty card programs) in business intelligence tools.
- Male vs. Female distribution of visitors (Gender): Machine learning algorithms with facial analysis have been used to detect the gender profile of customer groups in specific store areas.
- Demographics analytics: Use deep learning models to estimate the age and gender or cultural background of customers automatically.
- Shopper profile against shelf visits in stores: AI classification algorithms can categorize typical customer characteristics and link them to specific shelves inside the store.
- Basket size: Estimate the filling levels of baskets and shopping carts across different hours of the day and customer groups.
- Emotion analysis: Emotion recognition aims to classify the facial expressions (e.g., “happy,” “surprised”) of customers to analyze the sentiments of customer groups in specific areas (“regions of interest”).

Product Engagement Analysis
Image recognition with one or multiple cameras is used to determine what products customers are interacting with or what products they considered but abandoned. The ability to digitize the product engagement and in-shop buying process of visitors provides valuable insights to retail store managers.
This type of in-store analytics is essential for understanding and optimizing the on-site buying process. Cashierless stores combine multiple methods, using a large number of cameras and multiple sensors.
- Hot zones and hot shelves (Attention): AI camera analysis is used to identify popular shelves or locations inside the store. The information can be used to optimize promotions and product placements.
- Count the visits per shelf or area (Aisle usage): Count the number of interested customers stopping at specific shelves or aisles. The data shows changes or irregularities over time.
- Shelf-level offtakes: Object detection can analyze the offtakes of products from shelves at different heights. This can be used to analyze the low-level shelf offtake.
- Product on-shelf interaction: Movement analysis can detect situations where a product is picked up but placed back.
- Zero purchase shoppers: Detect and count the number of visitors who leave the store without making a purchase.
- Average time spent on key aisles or areas: Gather the customer dwell time at specific regions of interest throughout the shop.
- Store-in-store analysis: Analyze specific product categories and special shelves to measure customer engagement in shop-in-shop areas.
- Logo recognition: Computer vision is used to automatically recognize logos and detect logo placements with the aim to analyze the level of brand exposure.
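Counting visits per shelf or area comes down to testing whether tracked positions fall inside a region of interest. The following sketch assumes rectangular zones in image coordinates and tracks given as lists of `(x, y)` positions. The zone names, the data layout, and the `min_points` rule (a track must be seen inside a zone at least that many times to count as a visit) are illustrative assumptions.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Zone = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def in_zone(point: Point, zone: Zone) -> bool:
    """True if the point lies inside the rectangle, borders included."""
    x, y = point
    x1, y1, x2, y2 = zone
    return x1 <= x <= x2 and y1 <= y <= y2


def count_zone_visits(
    tracks: Dict[str, List[Point]],
    zones: Dict[str, Zone],
    min_points: int = 2,
) -> Dict[str, int]:
    """Count, per zone, the tracks with at least min_points positions inside."""
    visits = {name: 0 for name in zones}
    for positions in tracks.values():
        for name, zone in zones.items():
            hits = sum(1 for p in positions if in_zone(p, zone))
            if hits >= min_points:
                visits[name] += 1
    return visits
```

Requiring more than one position inside a zone is a simple way to separate customers who stop at a shelf from those who only walk past it.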

Customer Service Quality
Video AI is able to identify factors that determine in-store interactions. Hence, retailers can generate data about the retail service quality at the point of sale (POS).
- Occupancy detection: Computer vision can detect whether personnel are present at assigned points such as aisles, service desks, checkout, cash registers, self-checkout, or entries (occupancy analysis).
- Staff engagement levels: Recognize customer-staff interactions to measure the engagement levels across different stores.
- Queue detection (line counting): AI camera analysis is used to detect queues and lines of people waiting at service desks or during the retail checkout process.
- Missed rings: Detect and track the number of times when customer service staff disregards customer requests.

Customer Experience
AI vision can identify and track parameters that directly impact the customer experience in a retail location.
- Shopping carts and baskets availability: Use real-time object detection to monitor the supply and distribution of baskets and trolleys throughout the shops, and localize carts automatically.
- Customer experience factors: Detect metrics for waiting times, crowd density, or queue length to digitize and quantify the shopping experience.
- Visited empty shelves: Identify and count occurrences where people did not find what they were looking for.
- Attentiveness of staff: Spot situations and events where employees use their mobile phones while customers are present or waiting.
- Kiosk activation: Enable kiosks or interactive screens automatically when customers are present, and verify that sales displays are working as intended.
Backroom and Inventory Management
- Real-time object detection: Use common security cameras to recognize specific products, boxes, or pallets delivered or stored in specific areas.
- Out-of-stock alerts: Image recognition can be used to detect whether longer out-of-stock situations occur and to improve inventory accuracy.
- Loading dock theft: Retail video analytics is used to detect and prevent loading dock theft in areas where goods are permanently or temporarily stored.
- Logistics and warehousing: AI vision is used to detect misplaced objects in warehouses. Explore our article with 25+ Applications of Computer Vision in Logistics.
- Label and barcode detection: Optical Character Recognition (OCR) can be used to read labels, barcodes, QR-codes, or text printed on containers and packages.

In-Store and Outside AI inspection
Regular security cameras in stores can be used to implement continuous store inspections with AI vision. Here, visual sensing can automate manual inspection tasks effectively.
- Cleanliness: Use AI vision to detect if the shop floor is clean and clear. Use cleanliness analysis to determine whether shop shelves, billboards, and columns are intact, clean, and complete.
- Store accessibility: Analyze the pavement and entrance area outside of the store to identify and optimize people flows.
- Vehicles parked in front of the store: Use vehicle detection to recognize cars or trucks parked so that they block the shop entrance for a longer time.
- Parking lot occupancy analysis: Computer vision is used to recognize if parking spaces are occupied or available (Parking lot analysis).
- Door operation: Use people detection to analyze if the doors operate and open efficiently or if they stay open for too long.
- Shop floor obstacles: Identify obstacles that hinder the visitor flow inside the store, for example, misplaced boxes or promotional material.
- Event detection: Camera-based spill detection and fall detection or crash alerts help to automatically identify events and measure the response time.
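For parking lot occupancy, one common approach is to compare detected vehicle boxes with predefined parking space boxes using intersection over union (IoU). The sketch below assumes axis-aligned boxes in image coordinates and an overlap threshold of 0.5. Both the threshold and the function names are assumptions, and real camera angles often require polygonal spaces instead of rectangles.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def occupied_spaces(
    spaces: Dict[str, Box],
    vehicle_boxes: List[Box],
    threshold: float = 0.5,
) -> Dict[str, bool]:
    """Mark a space occupied if any vehicle overlaps it by at least threshold."""
    return {
        name: any(iou(space, v) >= threshold for v in vehicle_boxes)
        for name, space in spaces.items()
    }
```

The same overlap test could be reused for the entrance-blocking case by treating the area in front of the door as a single space.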

Checkout experience
- Cash register occupancy: Use deep learning to recognize if service desks and cash registers are occupied whenever there are customers.
- Counter cleanliness: Check that counters are well organized and that there is enough room for customer service.
- Customer checkout time: Measure the average customer checkout times across different counters and identify unusual waiting times.
- Self Checkout: Analyze the occupancy of self-checkout machines, identify where more staff is needed to assist, or whether the number of checkouts needs to be adjusted.
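The checkout time metric can be illustrated with a short aggregation over checkout events. The sketch assumes that the vision system emits one `(counter_id, start_seconds, end_seconds)` tuple per served customer. It flags counters whose average time exceeds the average across counters by a chosen factor. The event format, the function names, and the factor of 1.5 are hypothetical.

```python
from statistics import mean
from typing import Dict, Iterable, List, Tuple

CheckoutEvent = Tuple[str, float, float]  # (counter_id, start_s, end_s)


def average_checkout_times(events: Iterable[CheckoutEvent]) -> Dict[str, float]:
    """Mean checkout duration in seconds for each counter."""
    durations: Dict[str, List[float]] = {}
    for counter_id, start, end in events:
        durations.setdefault(counter_id, []).append(end - start)
    return {counter: mean(values) for counter, values in durations.items()}


def slow_counters(events: Iterable[CheckoutEvent], factor: float = 1.5) -> List[str]:
    """Counters whose average exceeds factor times the mean of all averages."""
    averages = average_checkout_times(events)
    if not averages:
        return []
    overall = mean(list(averages.values()))
    return sorted(c for c, avg in averages.items() if avg > factor * overall)
```

Comparing each counter against the mean of all counters is a deliberately simple rule. With only a few counters, a single slow one also raises the overall mean, so a median or a fixed service-level target may be a better baseline.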
Security and Safety
- Presence of fire extinguishers: Implement an automated inspection to ensure fire extinguishers and other safety gear are always available.
- Accessibility of emergency exits: Use intelligent security and surveillance camera systems to constantly monitor the emergency exits and detect and remove obstacles early.
- Detect intrusion: Detect intrusion events with automated people detection during times when the stores are closed. Automatically identify and report suspicious activities.
- Detection of shoplifting: Deep learning algorithms have been used to detect shoplifting or unattended registers. This task is rather difficult because people may pay for the goods later.
- Self-Checkout Monitoring: Detect pass-around and empty carts, detect the bottom of the baskets, and identify ticket switching automatically.
- Restricted access alert: Recognize unauthorized access to restricted areas, for example, behind the counter or in staff-only areas.
- Authentication: Face recognition can be applied to identify staff members in restricted areas and detect unauthorized activities by people with temporary access (for example, suppliers or visitors).
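Intrusion detection during closing hours combines people detection with a schedule check. The sketch below assumes that a detector reports a people count per timestamp and that the store opens and closes on the same calendar day. The function names and the sample opening hours are assumptions for illustration.

```python
from datetime import datetime, time
from typing import Iterable, List, Tuple


def is_open(ts: datetime, opens: time, closes: time) -> bool:
    """True if ts falls within opening hours (same-day schedule assumed)."""
    return opens <= ts.time() < closes


def intrusion_events(
    person_detections: Iterable[Tuple[datetime, int]],
    opens: time = time(8, 0),    # assumed opening time
    closes: time = time(20, 0),  # assumed closing time
) -> List[datetime]:
    """Timestamps at which at least one person was detected while closed."""
    return [
        ts
        for ts, people_count in person_detections
        if people_count > 0 and not is_open(ts, opens, closes)
    ]
```

A real deployment would also need to account for staff who are legitimately present before opening or after closing, for example through a grace period or a separate staff schedule.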

How to get started
The potential of visual AI in retail is enormous, and the list of use cases for computer vision in retail keeps getting longer. For organizations looking to implement computer vision systems, it’s important to realize that each application may be difficult to implement and require a significant financial commitment. Check out our business guide “What Does Computer Vision Cost?” to learn more about the costs and pricing of AI services and software.
Organizations that intend to use visual AI for business applications need full control over their data and applications. They also need the agility to adapt applications to their needs as regulations and business requirements change. Hence, companies establish Proof-of-Concept (PoC) and pilot projects to gather operational experience with applying AI vision.
Due to the novelty and technical difficulty, it’s usually critical for organizations to start gaining experience with adopting AI as early as possible. Due to the highly disruptive nature of AI technology, catching up with competitors later will be difficult and expensive.
Compared to web applications or standard software, AI systems are considerably more sophisticated. Real-world applications in particular require connected sensors, advanced computing algorithms, and compute-intensive deep learning workloads. A computer vision application is far more than just the ML model; it’s also about image acquisition, business logic, data formatting, data aggregation, and integration into other systems.
Initially, companies often start with various PoC or Pilot projects to identify and validate high-value strategies. In later stages, it’s not only the technical feasibility but also other factors such as security, scalability, cost-efficiency, and flexibility of applications that determine the economic potential of visual AI applications.
Using the Viso Suite Platform
Most organizations will need multiple AI vision applications, and it will be challenging to develop and maintain such a portfolio efficiently. This is why companies use our computer vision application platform Viso Suite™.
The end-to-end platform enables their teams to build, deploy, operate and scale all their computer vision applications in one software suite. Viso provides no-code development for professionals to accelerate every step from image annotation and training to application development, deployment, device management, remote configuration, and monitoring.
Fortune 500 and governmental organizations worldwide deliver their AI vision applications using Viso Suite™ technology. The model-driven architecture and automated infrastructure are fully optimized for Edge AI and real-time computer vision (Read the solution brief by Intel). This makes the computer vision platform very extensible, letting you integrate your Business Intelligence tools (Tableau, MS PowerBI, etc.) or use existing ML models and software containers.
If you are looking for enterprise-grade computer vision, request a personalized demo or send us your questions.