Discover Visual General Intelligence Breakthrough

On Thursday, 25th September 2025, our very own Talia Bender and Jeremy Michaels officially launched the new whitepaper, proudly announcing the dawn of Visual General Intelligence (VGI) to the world.

This is the third of six articles covering all six sections of that whitepaper – The Future of Visual Intelligence: AI Vision Through The Looking Glass – which includes our belief that VGI will be the quintessential proof point of Artificial General Intelligence (AGI). Crucially, it also outlines the role VGI plays in accelerating the paths to zero downtime and zero harm.

You can now download the whitepaper and enjoy our on-demand webinar with our founders – Nico Klingler and Gaudenz Boesch – stepping us through what VGI is and why it matters. In this article, we summarize the third section of that whitepaper.

The future of visual intelligence: AI Vision through the looking glass.

“The future of AI is not just about making computers smarter but about making them see and understand the world like humans do.”

– Andrej Karpathy (Computer Scientist and co-founder of OpenAI)

Visual General Intelligence: through the looking glass of AI

Lewis Carroll’s ‘Through the Looking Glass’ invited readers to step into a world where logic followed new rules, and familiar things became extraordinary. Today, we find ourselves in a similar moment with Artificial Intelligence (AI). The mirror we are stepping through is not fiction but Visual General Intelligence (VGI) – a leap beyond what machines have ever been able to perceive, interpret, and understand.

Computer Vision (CV) and Visual Intelligence (VI) have already proven powerful: detecting defects in factories, monitoring safety risks, and interpreting medical scans. But VGI promises something different. It is not about incremental accuracy or faster pattern recognition. It is about machines redefining what it means to see – surpassing human vision in depth, adaptability, and contextual awareness.

In this blog, adapted from our whitepaper – The Future of Visual Intelligence: AI Vision Through the Looking Glass – we explore why VGI may become the quintessential proof point of Artificial General Intelligence (AGI), and why businesses must prepare now.

Material route compliance: an application that VGI will further enhance.

Setting the scene: AI today

At its core, AI refers to systems capable of performing tasks that once required human intelligence: reasoning, problem-solving, perception, language. Within AI, Machine Learning (ML) and Deep Learning (DL) have driven rapid progress.

Computer Vision, a key branch of AI, mimics human sight – interpreting images and video to detect objects, recognize faces, or identify anomalies. Over time, this evolved into VI, which goes beyond recognition to understand context, relationships, and meaning.

Yet these systems remain narrow. A model trained to identify a defective circuit board cannot instantly pivot to detecting hazards on a construction site. Each use case demands retraining, new data, and costly configuration. That’s where VGI shifts the paradigm.

VGI unlocks the path to fully optimized machine utilization and zero downtime.

Defining Visual General Intelligence (VGI)

VGI is the capability of AI systems to understand and reason about visual information across all domains and contexts, achieving human-level comprehension and beyond.

Imagine a system that:

Recognizes not just what is visible but infers why it matters
Adapts to new environments without retraining
Performs open-world reasoning across diverse industries
Learns continuously from real-world feedback

If AGI represents human-level intelligence across domains, VGI represents human-level vision across all contexts – a domain-specific but universal form of intelligence.

This makes VGI an ideal proof point for AGI. Unlike abstract reasoning, visual intelligence can be measured objectively against medical and cognitive benchmarks such as perception speed, recall accuracy, and contextual reasoning. When machines surpass these thresholds, we may know we’ve crossed into AGI territory.

With VGI, AI Vision on the line will drive even greater industrial safety and precision.

Why vision matters most

Among the five human senses, vision dominates. An often-cited estimate is that up to 80% of human perception and learning comes through sight. This makes vision not only our richest data stream but also the most natural testbed for general intelligence.

As Yann LeCun (Chief AI Scientist at Meta) has noted, training models on visual data is critical to achieving AGI. Compared to text, which is limited by grammar and structure, visual data is vast, continuous, and multi-dimensional. By LeCun's estimate, the visual information that reaches a four-year-old child through the optic nerves over those first four years exceeds the total volume of text that most large language models (LLMs) have been trained on.

If machines can learn to see as humans do – and eventually better – this is the clearest, most tangible milestone toward general intelligence.

We believe that VGI is the quintessential proof point of AGI.

When might AGI arrive?

Timelines vary, but the trajectory is accelerating. Analysts suggest AGI could arrive by 2040, while entrepreneurs predict much sooner: Sam Altman (OpenAI) points to 2027, Demis Hassabis (DeepMind) estimates 2030-2035, and Ray Kurzweil suggests 2029.

VGI may emerge first. Just as specialized milestones (e.g. protein folding) marked breakthroughs in AI before broader capabilities, VGI could become the measurable indicator that AGI is within reach.

Benchmarks might include:

Latency – neural signals distinguish targets in ~150 ms, and humans categorize images in <300 ms (machines matching or surpassing this is a key milestone)
Accuracy – recognition, recall, and reasoning across domains without retraining
Adaptability – self-improving systems capable of generalizing from minimal data
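To make the latency benchmark concrete, here is a minimal Python sketch of how a team might time a vision model against the human figures quoted above. It is an illustration only, not a method from the whitepaper: the `classify` callable, the function names, and the verdict labels are all hypothetical, and the two thresholds are simply the ~150 ms and 300 ms figures from the list.

```python
import statistics
import time
from typing import Callable, Sequence

# Thresholds taken from the figures quoted in the text (approximate).
NEURAL_SIGNAL_MS = 150.0          # neural signals distinguish targets
HUMAN_CATEGORIZATION_MS = 300.0   # humans categorize images


def median_latency_ms(
    classify: Callable[[object], object],  # hypothetical model call
    frames: Sequence[object],
    warmup: int = 2,
) -> float:
    """Return the median per-frame latency of `classify` in milliseconds."""
    # Warm-up calls are not timed, so one-off start-up costs do not skew results.
    for frame in frames[:warmup]:
        classify(frame)

    timings = []
    for frame in frames:
        start = time.perf_counter()
        classify(frame)
        timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings)


def latency_verdict(latency_ms: float) -> str:
    """Compare a measured latency with the human reference points."""
    if latency_ms < NEURAL_SIGNAL_MS:
        return "faster than neural target discrimination"
    if latency_ms < HUMAN_CATEGORIZATION_MS:
        return "within human categorization time"
    return "slower than human categorization"
```

In practice, `classify` would wrap a real model's inference call and `frames` would be a representative sample of images. The median is used here rather than the mean so that a few slow outliers do not dominate the result.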

From artificial neural networks to the application layer, key enablers of VGI are converging.

The starting gun: market forces driving adoption

Technological revolutions don’t happen in isolation – they are accelerated by market dynamics. Several forces are converging to make VGI not only possible but inevitable.

1. Foundation model race

Open-source vision models are accelerating innovation. Businesses can now focus less on building from scratch and more on scaling applications.

2. Hardware democratization

GPUs and edge devices are becoming more affordable, enabling real-time inference everywhere – from factories to smartphones.

3. Reduced data dependency

New training methods slash the need for labeled datasets, enabling rapid iteration and lower costs. Closed-loop learning with human-in-the-loop (HITL) feedback ensures continuous improvement.

Together, these forces create a tipping point where VGI is no longer just a research goal but a commercial inevitability.

Why the application layer matters most

Models alone don’t deliver value. The real impact emerges at the application layer – the orchestration systems that connect vision models to real-world decisions.

As Gerard Corrigan, CTO at viso, explains:

“The application layer isn’t just important – it’s where the magic happens, because VGI without applications is just expensive computer vision. VGI with the right application architecture… that’s when we stop building software and start building digital senses.”

The application layer ensures that insights from visual models translate into action: alerting workers to risks, adjusting supply chains, or reallocating resources in real time. Just as the app ecosystem unlocked the smartphone revolution, application layers will unlock the VGI revolution.
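As a rough illustration of what "translating insight into action" can look like, the sketch below maps model detections to follow-up actions with a simple rule table. Everything in it is an assumption made for the example: the `Detection` fields, the labels, the action names, and the 0.8 confidence threshold are hypothetical and do not describe any specific product architecture.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Detection:
    """One output of a vision model (hypothetical shape)."""
    label: str
    confidence: float
    camera_id: str


# Hypothetical rule table: which action follows from which detection.
ACTIONS: Dict[str, str] = {
    "person_in_restricted_zone": "alert_worker",
    "pallet_blocking_route": "adjust_supply_chain",
    "machine_idle": "reallocate_resources",
}


def decide_actions(
    detections: Iterable[Detection],
    min_confidence: float = 0.8,  # assumed threshold, tune per use case
) -> List[str]:
    """Turn confident, recognized detections into action strings."""
    actions = []
    for det in detections:
        action = ACTIONS.get(det.label)
        # Skip labels with no rule and detections below the threshold.
        if action is not None and det.confidence >= min_confidence:
            actions.append(f"{action}:{det.camera_id}")
    return actions
```

For example, a confident `machine_idle` detection from camera `cam-7` would yield `reallocate_resources:cam-7`, while a low-confidence or unrecognized detection produces no action. A real application layer would add far more, such as escalation, audit trails, and integration with operational systems.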

Implications for global industries

The leap to VGI will ripple across industries, for example:

Manufacturing: real-time defect detection, predictive maintenance, zero downtime
Construction: adaptive safety systems, progress tracking, anticipatory risk prevention
Waste management: intelligent sorting, contamination detection, facility optimization

These aren’t futuristic scenarios: they’re the next phase of transformation. For companies, the question is not if but how quickly they can integrate VGI into their operations.

With AI Vision and VGI, CCTV becomes ‘the eyes that never blink’.

Beyond perception: building anticipatory systems

Perhaps the most profound shift VGI brings is moving from reactive to anticipatory systems. Today, AI often responds to events after they occur. With VGI, machines will predict risks, anticipate needs, and act proactively.

Imagine a logistics network that reroutes itself before a disruption, or a workplace safety system that prevents an accident rather than recording one. This is the true promise of VGI: not machines that see better, but machines that understand and act with foresight.

Looking ahead

VGI represents more than a technical milestone. It is a civilizational threshold. For the first time, machines may share with us not only the ability to process language but the ability to perceive and understand the world visually.

In that moment, we will not just be building smarter software. We will be building digital senses that extend and surpass our own.

The future of visual intelligence: AI Vision through the looking glass.

Why download the whitepaper?

This blog has introduced some key concepts and arguments from our whitepaper. But the full paper goes deeper:

Definitions and benchmarks of AI, VI, and VGI
Expert predictions on timelines to AGI and VGI
Technical frameworks for proving VGI as the North Star of AGI
Industry case studies showing real-world value at the application layer

👉 Download the full whitepaper – The Future of Visual Intelligence: AI Vision Through the Looking Glass – to explore how your organization can harness VGI and prepare for the most important technological leap of our time.

Beyond human sight: why VGI is an essential proof point of AGI