
Privacy-preserving Deep Learning for Private Computer Vision (2024)



Privacy is becoming a key issue in computer vision. Deep learning has made image recognition accurate enough to power AI vision applications that transform entire industries. However, the performance of deep learning depends on the availability of large amounts of visual data, which drives the need for privacy-preserving computer vision.

Our background: At viso.ai, we’ve built the Viso Suite computer vision platform. The powerful end-to-end computer vision platform helps leading organizations worldwide develop, deploy and scale their AI vision applications that meet the highest privacy requirements. Learn more about Viso Suite or request a demo here.

In this article, we will share insights about:

  1. Privacy of visual data with AI analysis
  2. Privacy-Preserving Machine Learning (PPML)
  3. Methods for private image recognition
  4. Private AI for image-to-text



Computer vision in aviation for real-time image processing using YOLOv7


Privacy of Visual Data in Computer Vision

Visual data is being generated at an unprecedented scale. For example, people upload billions of photos to social media every day, and a large number of security cameras continuously capture video.

Worldwide, there are over 770 million CCTV surveillance cameras in use. Additionally, an increasing amount of image data is being generated due to the popularity of camera-equipped personal devices.


1. Deep Learning leverages the value of data

Recent advances in deep learning methods based on artificial neural networks (ANN) have led to significant breakthroughs in long-standing AI fields such as Computer Vision, Image Recognition and Video Analytics.

The success of deep learning techniques depends heavily on the amount of data available for training. Hence, companies such as Google, Meta/Facebook, and Apple take advantage of the massive amounts of training data collected from their users and the immense computational power of GPU farms to deploy deep learning on a large scale.

AI vision and learning from visual data have led to the introduction of computer vision applications that promote the common good and economic benefits, such as smart transportation systems, medical research, or marketing.


2. Privacy concerns regarding visual data

While the utility of deep learning is undeniable, the same training data that has made it so successful also presents serious privacy issues that drive the need for visual privacy. The collection of photos and videos from millions of individuals comes with significant privacy risks.

  • Permanent collection. Companies gathering data usually keep it forever. Users from whom the data were collected can neither delete it nor control how it will be used nor influence what will be learned from it.
  • Sensitive information. Images often contain accidentally captured sensitive items such as faces, license plates, computer screens, location indications, and more. Such sensitive visual data could be misused or leaked through various vulnerabilities.
  • Legal concerns. Visual data kept by companies could be subject to legal matters, subpoenas, and warrants, as well as warrantless spying by national-security and intelligence organizations.


Privacy-preserving Deep Learning in Computer Vision: People Counting with blurred faces


Privacy-Preserving Machine Learning (PPML)

While public datasets are accessible to everyone, machine learning frequently uses private datasets that can only be accessed by the dataset owner. Hence, privacy-preserving machine learning is concerned with adversaries trying to infer such private data, even from trained models.

  • Model inversion attacks aim to reconstruct training data from a model's parameters or outputs, for example, to recover sensitive attributes such as the gender or genotype of an individual.
  • Membership inference attacks are used to infer whether an individual was part of the model’s training set.
  • Training data extraction attacks aim to recover individual training examples by querying the model.

A common general defense against such attacks is Differential Privacy (DP), which offers strong mathematical guarantees for the privacy of the individuals whose data is contained in a dataset.
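To make the idea of Differential Privacy more concrete, here is a minimal sketch of the Laplace mechanism applied to a counting query. The frame records, the `has_face` field, and the function names are hypothetical and only serve the illustration. The sketch uses Python's `random` module with a fixed seed for repeatability, which is not suitable for production use.

```python
import random

def laplace_noise(scale, rng):
    """Draw one sample from Laplace(0, scale) as the difference of two
    independent exponential samples with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def private_count(records, predicate, epsilon, rng):
    """Release a count with epsilon-differential privacy.

    A counting query changes by at most 1 when one record is added or
    removed, so its sensitivity is 1 and the noise scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)

# Hypothetical example: how many frames in a camera log contain a face?
frames = [{"id": i, "has_face": i % 3 == 0} for i in range(300)]
rng = random.Random(0)  # fixed seed only to make the demo repeatable
noisy = private_count(frames, lambda f: f["has_face"], epsilon=0.5, rng=rng)
print(f"noisy count: {noisy:.1f} (the true count is 100)")
```

A smaller `epsilon` means more noise and stronger privacy, while a larger `epsilon` gives a more accurate but less private answer.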


Methods To Prevent Privacy Breaches During Training and Inference

  • Secure Enclaves. An important area of interest is the protection of data while it is in use. Enclaves execute machine learning workloads in a memory region that is protected from unauthorized access.
  • Homomorphic encryption. Machine Learning models can be run on encrypted private data using homomorphic encryption, a cryptographic method that allows mathematical operations on data to be carried out on ciphertext instead of on the actual data itself.
  • Secure Federated Learning. The concept of federated learning was originally proposed by Google. The main idea is to build machine learning models based on data sets that are distributed across multiple devices. With federated learning, multiple data owners can train a model collectively without sharing their private data.
  • Secure multi-party computation. Privacy-preserving multi-party deep learning distributes a large volume of training data among many parties. For example, Secure Decentralized Training Frameworks (SDTF) can be used to create a decentralized network setting that does not need a trusted third-party server while simultaneously ensuring the privacy of local data with a low cost of communication bandwidth.
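The idea shared by secure federated learning and secure multi-party computation, that several parties compute a joint result without revealing their individual inputs, can be sketched with additive secret sharing. This is a simplified illustration that simulates all parties in one process. It is not the SDTF protocol or Google's federated learning implementation, and the names `make_shares` and `secure_sum` as well as the example counts are hypothetical.

```python
import secrets

PRIME = 2**61 - 1  # public modulus; all arithmetic is done modulo this prime

def make_shares(value, n_parties):
    """Split `value` into n additive shares that sum to it modulo PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares

def secure_sum(private_values):
    """Compute the sum of all parties' values without exposing any single one."""
    n = len(private_values)
    # Row i holds the shares produced by party i; share j is sent to party j.
    distributed = [make_shares(v, n) for v in private_values]
    # Each party j adds up the shares it received and publishes only that sum.
    partial_sums = [
        sum(distributed[i][j] for i in range(n)) % PRIME for j in range(n)
    ]
    return sum(partial_sums) % PRIME

# Hypothetical example: three sites each hold a local people count.
local_counts = [120, 45, 310]
print(secure_sum(local_counts))  # 475
```

Each individual share is a uniformly random number, so a party that sees fewer than all shares of a value learns nothing about it. In a federated learning setting, the same pattern could be applied to quantized model updates instead of counts.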


Methods for Privacy-Preserving Deep Learning in Visual Data

Edge AI processing enables real-time and on-device image processing with machine learning, without sending or storing sensitive visual data. Such vision systems are fully autonomous. Private image processing using distributed edge devices can be combined with additional methods:

  • Image obfuscation. Several methods have been developed to sanitize and anonymize sensitive visual data. Such techniques include blacking out, pixelization (or mosaicing), and blurring. However, because these traditional techniques are deterministic, well-trained neural networks can re-identify the obfuscated content. Recent studies show that standard obfuscation methods are ineffective against adaptable convnet-based models. In experiments, obfuscated faces could be re-identified with up to 96% accuracy, and even with faces filled in black, body and scene features were enough to re-identify 70% of the people. Therefore, new image obfuscation methods were developed based on metric privacy, a rigorous privacy notion generalized from differential privacy. Extending the standard differential privacy notion to image data makes it possible to share pixelized images with rigorous privacy guarantees that protect individuals, objects, or their features.
  • Removal of moving objects. An alternative to blurring is to automatically remove and inpaint moving objects (e.g., pedestrians, vehicles) in Google Street View imagery. A moving object segmentation algorithm detects the moving objects, removes them, and inpaints the region with information from other views, producing a realistic output image in which the moving object is no longer visible.
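As an illustration of the pixelization technique mentioned above, the following minimal sketch replaces every tile of a grayscale image with its average intensity. It assumes the image is given as a list of rows of integers; the function name `pixelate` and the sample patch are hypothetical. This is only the plain deterministic step, which on its own provides no formal privacy guarantee.

```python
def pixelate(image, block):
    """Return a copy of a grayscale image in which each block x block tile
    is replaced by the tile's average intensity (integer division)."""
    if block < 1:
        raise ValueError("block must be at least 1")
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]  # copy so the input stays untouched
    for top in range(0, height, block):
        for left in range(0, width, block):
            # Tiles at the right and bottom edges may be smaller than `block`.
            rows = range(top, min(top + block, height))
            cols = range(left, min(left + block, width))
            tile = [image[y][x] for y in rows for x in cols]
            mean = sum(tile) // len(tile)
            for y in rows:
                for x in cols:
                    out[y][x] = mean
    return out

# Hypothetical 4x4 grayscale patch (values 0-255).
patch = [
    [10, 20, 200, 210],
    [30, 40, 220, 230],
    [50, 60, 100, 110],
    [70, 80, 120, 130],
]
for row in pixelate(patch, block=2):
    print(row)
```

Larger values of `block` discard more detail. Because the same input always produces the same output, a recognizer can be trained on pixelized images, which is the weakness the text describes.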

Some recent datasets contain sanitized visual data. For example, nuScenes is an autonomous driving dataset in which faces and license plates are detected and then blurred. Similarly, AViD, a public video dataset for action recognition, anonymizes all face identities to protect the privacy of the people shown.


Automated number plate blurring in live video


Recovering Sensitive Information in Images

  • Limits of Face Obfuscation. Face obfuscation does not provide any formal guarantee of visual privacy. In certain cases, both humans and machines can infer an individual's identity from face-blurred images, presumably relying on context cues such as height and clothing. Obfuscation techniques like mosaicing, pixelation, blurring, and P3 (encryption of the significant coefficients in the JPEG representation of the image) can be defeated with artificial neural networks.
  • Preventing Anti-Obfuscation. Some methods try to protect sensitive image regions against anti-obfuscation attacks, for example, by perturbing the image adversarially to reduce the performance of a recognizer. However, they usually only work against certain recognizers, may not work against human observers, and provide no privacy guarantee either.


Privacy-preserving Image-to-Text

In computer vision, the task of text recognition in images is called Optical Character Recognition (OCR) or Scene Text Recognition (STR). There is an increasing need for privacy-preserving AI analysis of text documents in image-to-text applications, mainly because such use cases often involve scanning or digitizing sensitive documents containing sensitive personal information or business secrets.

  • Cryptographic Hashing: One possible way to achieve privacy-preserving OCR is to use a cryptographic hashing function to obscure the text before passing it to the OCR engine. This would prevent anyone from being able to extract the original text from the OCR output. However, this approach would not be effective against an attacker who is able to modify the input image.
  • Differential Privacy: Another technique called Differential Privacy has been developed to implement privacy-preserving OCR that is robust against image modification attacks. A Differential Privacy system adds random noise to the data before passing it to the OCR engine, in such a way that the original data cannot be reconstructed from the noisy output. However, this comes at the cost of increased error rates in the OCR output.
  • Homomorphic Encryption: There is ongoing research into developing private OCR techniques that are both accurate and robust against image modification attacks. One such method is called Homomorphic Encryption, which encrypts the input data before forwarding it to the OCR algorithm. This allows the OCR engine to perform its computations on the encrypted data, without ever decrypting it. As a result, the privacy of the input data can be maintained while the OCR engine achieves accurate results.

However, these methods are very new and have not been broadly tested. A major limitation is the significant increase in computational workload, which leads to much slower performance and the need for more expensive AI processing hardware.
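To show what "computing on encrypted data" means in practice, here is a toy sketch of the Paillier cryptosystem, an additively homomorphic scheme. The article does not name a specific scheme, so choosing Paillier is an assumption made for illustration. The primes are far too small to be secure, and a real encrypted OCR pipeline would need a much more capable (and much slower) scheme. The sketch requires Python 3.8 or later for the modular inverse via `pow`.

```python
import math
import secrets

def keygen(p, q):
    """Build a Paillier key pair from two primes (toy sizes, insecure)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse of lam modulo n
    return n, (lam, mu)   # public key n, private key (lam, mu)

def encrypt(n, m):
    """Encrypt integer m (0 <= m < n) using generator g = n + 1."""
    n_sq = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # random value in [1, n - 1]
        if math.gcd(r, n) == 1:
            break
    return ((1 + m * n) * pow(r, n, n_sq)) % n_sq

def decrypt(n, private_key, c):
    """Recover the plaintext from ciphertext c."""
    lam, mu = private_key
    return (((pow(c, lam, n * n) - 1) // n) * mu) % n

# Client side: encrypt hypothetical pixel intensities.
n, private_key = keygen(1009, 1013)
pixels = [12, 200, 37]
encrypted = [encrypt(n, v) for v in pixels]

# Server side: weighted sum computed on ciphertexts only.
# Multiplying ciphertexts adds plaintexts; raising to w multiplies by w.
weights = [2, 1, 3]
acc = 1  # the value 1 is a valid encryption of 0
for c, w in zip(encrypted, weights):
    acc = (acc * pow(c, w, n * n)) % (n * n)

# Client side: only the key holder can read the result.
print(decrypt(n, private_key, acc))  # 335
```

The server never sees the pixel values, yet the decrypted result equals 2·12 + 1·200 + 3·37 = 335. The extra modular exponentiations per value hint at the computational overhead described above.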


Example of Optical Character Recognition (OCR)


Private Computer Vision – Getting started

Deep learning methods and their need for massive amounts of visual data pose serious privacy concerns because this data can be misused. We reviewed the privacy concerns raised by deep learning and the mitigation techniques available to address them. In the near future, as more deep learning applications are introduced into production, we expect privacy-preserving deep learning for computer vision to become a major concern with commercial impact.

If you are looking for a way to build custom, private computer vision applications for your organization, check out Viso Suite. Our computer vision platform provides end-to-end infrastructure to build, deploy and scale private AI vision applications. It allows you to process visual data in real time, locally on the device with Edge AI, without any video upload or storage at any point in time.

Check out the extensive features of Viso Suite and request a personal Demo.

