How to Detect AI-Generated Content

With the ubiquity of generative AI tools, we now see their outputs everywhere. Learn how to spot different types of AI-generated content.

Tools like DALL-E 2, ChatGPT, and others have entered the playing field, permanently changing the nature of content creation. AI-generated content is now everywhere, and it can be very difficult for humans to tell what was created by a person and what was not.

We have seen AI content spread through content marketing, blog posts, product descriptions, and more. And, while content generation and AI writing tools can greatly improve efficiency and idea generation, they do not always result in high-quality content.

In this article, we will look at how to detect AI content with various methods and tools and discuss why this is important.

The Minds Behind the AI Machine

Before we dive into methods of AI detection, let’s first examine models used as content generation tools. Two prominent families of generative models are:

Generative Adversarial Networks (GANs): GANs pair a generator, which learns to produce realistic examples, with a discriminator, which learns to distinguish real examples from generated ones. Trained against each other, they can create realistic images, music, and other media. A few examples of GANs are CycleGAN, StyleGAN2, and GauGAN.
Large Language Models (LLMs): These models are recent breakthroughs in natural language processing (NLP), enabling machines to process and generate human-like language. LLMs are built using deep learning techniques and trained on vast amounts of text data. A few examples of LLM-based systems are ChatGPT, Bard, Claude 2, and Llama 2.
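
The adversarial setup in the GAN entry above is commonly written as a minimax objective. The formulation below is the standard textbook one and is not taken from this article; here $G$ is the generator, $D$ is the discriminator, $x$ is a real sample, and $z$ is a random noise vector.

```latex
\min_{G} \max_{D} \;
\mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x)\big]
+ \mathbb{E}_{z \sim p_{z}}\big[\log\big(1 - D(G(z))\big)\big]
```

The discriminator is rewarded for scoring real samples high and generated samples low, while the generator is rewarded for producing samples that the discriminator scores as real.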

How to Identify AI-Generated Content

Distinguishing between AI writing and human-written content can be quite challenging. It involves close reading, critical evaluation, and occasionally dedicated detection tools. Below are some techniques that can assist you in discerning between AI-generated content and human-generated content.

Lexical Diversity

One of the challenges with machine-generated text is its tendency to depend on a limited vocabulary and to repeat words often. Lexical diversity is essentially a measure of how rich and varied the vocabulary of a text is, as opposed to relying on a limited set of repetitive words.

Lexical diversity is often lower in AI-generated content because AI models learn from large text datasets in which common words and phrases are used repeatedly. While AI can mimic and reproduce these patterns, it can struggle to grasp the nuances of language and draw on a wider range of vocabulary, leading to less diverse and less impactful writing.
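
One common way to put a number on lexical diversity is the type-token ratio (unique words divided by total words). The sketch below is only an illustration: the function names, the simple regex tokenizer, and the 50-word window are assumptions, and a low score is a weak hint rather than proof of AI authorship.

```python
import re


def tokenize(text: str) -> list[str]:
    """Lowercase the text and keep runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def type_token_ratio(text: str) -> float:
    """Unique words divided by total words (0.0 for empty text)."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


def moving_average_ttr(text: str, window: int = 50) -> float:
    """Average the ratio over sliding windows so long texts are not penalized."""
    tokens = tokenize(text)
    if len(tokens) <= window:
        return type_token_ratio(text)
    ratios = [
        len(set(tokens[i:i + window])) / window
        for i in range(len(tokens) - window + 1)
    ]
    return sum(ratios) / len(ratios)
```

Because the plain ratio drops as a text gets longer, the windowed variant is usually the fairer choice when comparing texts of different lengths.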

Sentence Length and Structure

Sentence structure is an indicator of content generated by AI. Often, such texts exhibit a lack of variety in sentence length and a reliance on predictable syntactic patterns. This can result in writing that feels formulaic and monotonous, lacking the dynamic rhythm and complexity characteristic of human-authored prose.

AI models often favor straightforward sentence structures, relying heavily on subject-verb-object constructions and avoiding the use of more complex grammatical arrangements. This can result in writing that appears fragmented and lacks expression.

While human writers readily employ sentences of varying lengths and structures, AI models often struggle with constructing longer, more intricate sentences. This can result in a text that lacks depth and sophistication, failing to engage the reader with the subtle interplay of clauses and phrases.

AI-generated texts often have a rigid structure because they rely on a limited set of grammatical patterns. Sentence beginnings and endings tend to follow repetitive formulas, creating a sense of uniformity that can quickly become tiresome.
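
The uniformity described above can be roughly quantified by looking at how much sentence lengths vary. This is a minimal sketch: the naive sentence splitter and the function names are assumptions, and no threshold is implied.

```python
import re
import statistics


def sentence_lengths(text: str) -> list[int]:
    """Split on whitespace that follows ., ! or ? and count words per sentence."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    return [len(s.split()) for s in sentences]


def length_variation(text: str) -> float:
    """Coefficient of variation of sentence lengths (std deviation / mean)."""
    lengths = sentence_lengths(text)
    if len(lengths) < 2:
        return 0.0
    mean = statistics.mean(lengths)
    return statistics.pstdev(lengths) / mean if mean else 0.0
```

A value near zero means every sentence has about the same length. Human prose tends to mix short and long sentences, but so can AI text, so treat the number as one signal among several.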

Lack of Depth and Critical Thinking

Although AI excels at summarizing existing information and synthesizing facts, it often cannot generate original ideas, construct novel arguments, or engage in profound intellectual inquiry.

The arguments presented in AI-generated papers tend to be derivative, lacking the depth and complexity of arguments crafted by human scholars who have critically analyzed and engaged with the underlying material.

AI-generated content frequently repeats information that already exists, without offering new perspectives or interpretations. It seldom pushes the limits of knowledge.

Real and artificial text have different intrinsic dimensions: (a-b) idea; and (c) actual results – source.

Topic Incoherence

One of the pitfalls of AI-generated content, particularly in extended works like research papers or narratives, is its vulnerability to incoherence and abrupt topic swerves. While AI can adeptly string together words and sentences, it often struggles with maintaining a logical flow of ideas and constructing a cohesive narrative arc. This can lead to texts that feel disjointed and confusing, and ultimately fail to engage the reader on a deeper level.

AI sometimes injects sections that seem random or tangential to the central theme, disrupting the smooth flow of the text and leaving the reader wondering about their purpose or relevance.

AI falls short in terms of critical thinking and independent analysis, which can result in conclusions that contradict earlier statements or overlook the arguments being presented. This lack of coherence and logical reasoning can lead to confusion, leaving readers dissatisfied, as if something crucial has been omitted.
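
Abrupt topic swerves can be approximated by comparing the vocabulary of neighboring paragraphs. The sketch below uses a plain bag-of-words cosine similarity. The function names are assumptions, and stop words are not filtered, so in practice you would likely remove them or use sentence embeddings.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Count lowercase word occurrences in a paragraph."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def adjacent_similarities(paragraphs: list[str]) -> list[float]:
    """Similarity of each paragraph to the one that follows it."""
    bags = [bag_of_words(p) for p in paragraphs]
    return [cosine_similarity(x, y) for x, y in zip(bags, bags[1:])]
```

A sudden drop in the sequence of similarities marks a place where the text may have jumped to an unrelated topic and deserves a closer read.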

How to Identify AI-Generated Images

Unnatural Lighting and Physics

An AI-generated (synthetic) image can often be spotted when it fails to capture the complexities of physics and lighting. While AI is excellent at generating fine details, it often falls behind in grasping these principles.

While natural shadows have varying degrees of opacity, AI-generated shadows might appear eerily uniform and dark, lacking the subtle gradients and transitions you’d find in reality.

Look for objects that seem suspended in mid-air or that otherwise defy basic physical principles. Such errors highlight the difficulty AI has in representing an object’s weight and its interaction with gravity.

Take a look at reflections to identify any inconsistencies in their relationship to the reflected object. Look for distortions, unnatural angles, or missing details in reflected elements, which indicate AI’s difficulty in mirroring reality accurately.

The image showcases techniques such as text-guided object inpainting, context-aware image inpainting, shape-guided object inpainting with shape fitting, and out-painting – source.

Repetitive Patterns

It can be fairly easy to tell that an image was created by AI when it shows repetitive patterns. While AI can create textures and intricate details, its algorithms sometimes produce unnatural repetitions. When you look closely, these patterns can give away the image’s AI-generated nature.

Be wary of fabrics exhibiting suspiciously regular patterns, especially in materials like woven cloth or intricate tapestries. AI can struggle to create true organic variations, often resorting to repeating the same texture element endlessly.

While subtle patterns might not be readily apparent at first glance, zooming in can reveal their repetitive nature, especially in areas with high complexity.
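
As a toy illustration of the "zoom in and look for repeats" idea, the sketch below cuts a grayscale image into small tiles and measures how many tiles are exact copies of another tile. It assumes the image is already decoded into a list of pixel rows. The function name and tile size are hypothetical, and real images would need approximate matching because compression noise breaks exact equality.

```python
from collections import Counter


def tile_repetition_score(pixels: list[list[int]], tile: int = 4) -> float:
    """Fraction of non-overlapping tiles that exactly duplicate another tile."""
    height, width = len(pixels), len(pixels[0])
    tiles = []
    for top in range(0, height - tile + 1, tile):
        for left in range(0, width - tile + 1, tile):
            # Freeze the tile into nested tuples so it can be counted.
            block = tuple(
                tuple(pixels[top + dy][left:left + tile]) for dy in range(tile)
            )
            tiles.append(block)
    if not tiles:
        return 0.0
    counts = Counter(tiles)
    repeated = sum(c for c in counts.values() if c > 1)
    return repeated / len(tiles)
```

A score near 1.0 means almost every tile is a copy of another one, which is what an endlessly repeated texture element would produce.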

Unrealistic Emotions and Expressions

Artificial intelligence is highly proficient in creating images of faces. However, it often falls short when it comes to understanding and expressing complex emotions. This limitation can be a sign that an image is synthetic rather than a photograph of a real person. Clues include unrealistic smiles, vacant eyes, and exaggerated gazes.

Look for eyes that lack depth and focus, appearing glassy or devoid of emotional engagement. AI-generated eyes often lack a sense of life, which can betray the absence of real expression.

One way to detect the artificiality of emotion in AI-generated images is by examining the alignment between expressions, body posture, and gestures. Inconsistencies between these elements can expose the lack of genuine emotion.

Missing Metadata and Provenance

Metadata records an image’s provenance: the story of where it came from. With AI-generated visuals, this record is often missing, and the absence of detailed metadata in an otherwise hyperrealistic, professionally polished image is a red flag.

Another clue lies in the presence or absence of EXIF (Exchangeable Image File Format) data. This digital information usually contains details about the camera model used, shutter speed, aperture, and other technical specifications. AI-generated images often lack this metadata because no camera captured them, although missing EXIF data alone is not proof, since metadata can also be stripped from genuine photos.

By analyzing the metadata associated with an image, you can follow its journey through editing software, online platforms, and potential manipulation attempts. This transparency plays a role in verifying information and combating the spread of misinformation.
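
A quick first check is simply whether a JPEG file carries an EXIF segment at all. The sketch below walks the JPEG header segments using only the standard library. The function name is an assumption, it handles only well-formed JPEG headers, and it does not parse the EXIF tags themselves.

```python
import struct


def has_exif_segment(data: bytes) -> bool:
    """Return True if JPEG bytes contain an APP1 segment that starts with 'Exif'."""
    if not data.startswith(b"\xff\xd8"):        # missing SOI marker: not a JPEG
        return False
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:                   # every segment starts with 0xFF
            return False
        marker = data[pos + 1]
        if marker in (0xDA, 0xD9):              # SOS or EOI: header section is over
            return False
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        payload = data[pos + 4:pos + 2 + length]
        if marker == 0xE1 and payload.startswith(b"Exif\x00\x00"):
            return True
        pos += 2 + length                       # jump to the next segment
    return False
```

To try it on a file, read the bytes first, for example `has_exif_segment(open("photo.jpg", "rb").read())`, where the file name is a placeholder. For reading individual tags such as camera model or aperture, an imaging library is the more practical route.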

Synthetic data is used to generate hyper-realistic human faces – source.

How to Identify AI-Generated Video

Facial Flickering

One of the indicators that can reveal whether a video is generated by AI is the way it portrays facial expressions. While AI can produce realistic faces, it often falls short when it comes to replicating the intricate nuances of human emotions and subtle micro-movements. These limitations leave behind hints of the video’s origins.

Smoothly transitioning between emotions can be a challenge for AI, resulting in flickering or jittering around the eyes and mouth as expressions change. The involuntary twitches and adjustments that naturally occur in real-life faces are often missing in AI creations, leading to emotionless expressions that lack the dynamics of human interaction.
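
If you can track a facial landmark (for example, a mouth corner) across frames, flicker shows up as rapid back-and-forth motion. The sketch below scores that with the average second difference of the landmark’s position. The landmark coordinates are assumed to come from some face tracker, which is not shown, and the function name is hypothetical.

```python
import math


def jitter_score(points: list[tuple[float, float]]) -> float:
    """Mean magnitude of the second difference of a 2D landmark track.

    Smooth, steady motion scores near zero; frame-to-frame flicker scores high.
    """
    if len(points) < 3:
        return 0.0
    total = 0.0
    for (x0, y0), (x1, y1), (x2, y2) in zip(points, points[1:], points[2:]):
        accel_x = x2 - 2 * x1 + x0
        accel_y = y2 - 2 * y1 + y0
        total += math.hypot(accel_x, accel_y)
    return total / (len(points) - 2)
```

The score depends on resolution and frame rate, so it is only meaningful when compared against footage of real faces captured under similar conditions.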

Inconsistent Editing

It is possible to identify AI-generated videos by their style of storytelling rather than specific visual or auditory details. Even though AI can replicate real-world footage with accuracy, its understanding of narrative structure and the art of crafting a story often reveals inconsistencies in the editing, hinting at its synthetic origins.

Text to Video high-level architecture – source.

Maintaining a consistent flow of time within the video can be a challenge for AI, resulting in cuts that jump forward or backward without any apparent reason, disrupting the narrative flow. Additionally, establishing a theme or conveying a message throughout the video can prove difficult for AI, leading to inconsistent shifts in tone or unexpected combinations of unrelated elements.

When a video relies excessively on narrative tropes or stereotypical character interactions, AI’s dependence on learned patterns becomes evident, as does its lack of original storytelling.

How to Identify AI-Generated Audio

Robotic Voices and Monotonous Intonation

While AI vocal synthesis has made impressive strides, the human ear remains a discerning judge. One of the most telling giveaways of AI voices lies in their robotic intonation and monotonous delivery, lacking the rich tapestry of nuance and emotion that characterizes human speech.

AI struggles to grasp the subtle rhythm, pitch, and stress patterns that convey emotion and meaning in human speech, often resulting in flat, monotonous delivery.

Pay attention to the melody of the voice. Sometimes, AI-generated voices can sound overly melodic, lacking the variations and nuances found in natural speech.

In attempts to convey emotion, AI voices might resort to overly exaggerated inflections or unnatural emphasis, sounding cartoonish or forced rather than genuine.
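
Monotone delivery can be measured as low variation in pitch. The sketch below assumes you already have a pitch contour (fundamental frequency per frame, in Hz, with 0 for unvoiced frames) from some pitch tracker, which is not shown. The function name is an assumption.

```python
import math
import statistics


def pitch_variation_semitones(f0_hz: list[float]) -> float:
    """Standard deviation of pitch, in semitones around the median pitch."""
    voiced = [f for f in f0_hz if f > 0]        # drop unvoiced frames
    if len(voiced) < 2:
        return 0.0
    reference = statistics.median(voiced)
    semitones = [12 * math.log2(f / reference) for f in voiced]
    return statistics.pstdev(semitones)
```

Working in semitones rather than raw Hz keeps the measure comparable between low and high voices. Very low values suggest flat delivery, and unusually high values may point to the exaggerated inflections described above.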

Researchers have analyzed conversations between humans and AudioGPT – source.

Spectral Analysis

AI audio generators have made advances in their ability to synthesize sound. However, there are still obstacles to overcome when it comes to capturing and reproducing the full spectrum of frequencies. When you analyze a spectrogram of generated audio, certain patterns begin to emerge.

Real sounds, from musical instruments to voices, contain a rich spectrum of harmonics, multiples of the fundamental frequency. AI often struggles to accurately generate these complex harmonic overtones, leading to missing or weak harmonics in the spectrogram.
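
To make the idea of missing or weak harmonics concrete, the sketch below measures the strength of each multiple of a known fundamental frequency using a direct Fourier projection. It is pure Python and slow, so it only suits short clips. The function names are assumptions, and the fundamental must be supplied by the caller.

```python
import cmath
import math


def magnitude_at(samples: list[float], sample_rate: int, freq_hz: float) -> float:
    """Amplitude of the component at freq_hz, via a single Fourier projection."""
    n = len(samples)
    acc = sum(
        s * cmath.exp(-2j * math.pi * freq_hz * i / sample_rate)
        for i, s in enumerate(samples)
    )
    return 2 * abs(acc) / n


def harmonic_profile(samples, sample_rate, fundamental_hz, count=4):
    """Strength of harmonics 1..count relative to the fundamental."""
    base = magnitude_at(samples, sample_rate, fundamental_hz)
    if base == 0:
        return []
    return [
        magnitude_at(samples, sample_rate, fundamental_hz * k) / base
        for k in range(1, count + 1)
    ]
```

A natural voice or instrument usually shows energy at several harmonics, so a profile that collapses to near zero after the first entry is worth a closer look. For real audio, an FFT library would be the practical choice.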

AI often struggles when it tries to recreate the frequencies found in natural sounds. This challenge becomes especially noticeable with sounds like cymbals, animal sounds, or certain consonants in speech.

Top AI Detection Tools

Originality.AI

Originality.AI is a plagiarism checker and AI content detector platform that analyzes pieces of content, including text and images, generated by AI systems. It employs a multifaceted approach, combining various techniques to provide users with a thorough assessment of content authenticity.

The Originality.AI platform employs statistical analysis and deep learning algorithms to identify text generated by AI models. Statistical techniques such as n-gram and syntax analysis pick up repetitive patterns and unnatural structures, while lexical analysis looks for unusual word choices, all hallmarks of machine-generated text.

Deep learning models then take over: transformers assess topic coherence and logic, and sentiment analysis searches for inconsistencies in emotional expression, exposing the machine’s struggle to mimic true feeling. Together, these models help reveal whether a human or a machine wrote the text.

Copyleaks

This platform combines the power of AI with proven methods for detecting plagiarism to help ensure the authenticity of your work. Copyleaks is an AI detector whose approach involves analyzing n-grams, comparing syntax, and using AI technology not only to detect instances of plagiarism but also to identify subtle rephrasing and content generated by AI.

By leveraging algorithms such as Levenshtein distance and TF-IDF, it evaluates text similarity, highlighting patterns in word usage and providing insights into AI-generated content.

A proposal for detecting textual content produced by Large Language Models – source.
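
For readers unfamiliar with the two algorithms named above, here is a minimal textbook sketch of each. It is not Copyleaks’ implementation, which is not public in this article. The function names are assumptions, and the TF-IDF variant shown (term frequency times the log of inverse document frequency) is just one of several in common use.

```python
import math
from collections import Counter


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a                         # keep the inner row short
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,            # deletion
                current[j - 1] + 1,         # insertion
                previous[j - 1] + cost,     # substitution (or match)
            ))
        previous = current
    return previous[-1]


def tfidf_vectors(documents: list[str]) -> list[dict[str, float]]:
    """One {word: weight} dict per document; words found everywhere get weight 0."""
    tokenized = [doc.lower().split() for doc in documents]
    doc_freq = Counter(word for tokens in tokenized for word in set(tokens))
    n_docs = len(documents)
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            word: (count / len(tokens)) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        })
    return vectors
```

Levenshtein distance catches near-verbatim copying and light rephrasing, while TF-IDF weights highlight which words distinguish one document from the rest of a collection.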

GPTZero

GPTZero is a user-friendly, AI-powered tool built to analyze text and identify patterns commonly found in content created by AI systems. Its purpose is to notify users in real-time when they come across content that may have been written by a language model like GPT-3.

When examining written content, GPTZero uses three techniques: n-gram sequences to detect recurring phrases, syntactic tree comparisons to identify common sentence structures, and word frequency analysis to highlight the overuse of certain vocabulary. With this toolkit, GPTZero improves its ability to distinguish between human voices and the echoes of automated content creation.

Visual synthetic data involves artificially generated images, mimicking real-world characteristics, to enhance machine learning models’ training by providing diverse and privacy-conscious datasets – source.
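
The n-gram idea mentioned above can be sketched in a few lines: count how often the same run of words reappears in a text. This is a generic illustration, not GPTZero’s actual method. The function names and the default of three-word sequences are assumptions.

```python
from collections import Counter


def ngrams(tokens: list[str], n: int) -> list[tuple[str, ...]]:
    """All consecutive runs of n tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def repeated_ngram_rate(text: str, n: int = 3) -> float:
    """Share of n-grams in the text that occur more than once."""
    grams = ngrams(text.lower().split(), n)
    if not grams:
        return 0.0
    counts = Counter(grams)
    repeated = sum(c for c in counts.values() if c > 1)
    return repeated / len(grams)
```

A high rate means the text keeps reusing the same phrases. Boilerplate-heavy human writing can score high too, so the result is best read alongside other signals.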

Importance of Detecting Artificial Content

Transparency plays a key role. It is important to understand the sources of content to assess its credibility, authenticity, and potential biases. Today, content is abundant, including images, videos, and audio, and circulates on the internet more rapidly than ever. To navigate the digital world effectively, it is essential to be able to recognize information that has been created by AI.

News articles: AI-generated news can perpetuate misinformation if not identified. It is essential to verify the source of information.
Product reviews: Glowing AI-generated reviews can manipulate consumer decisions. Being aware of a review’s origin allows you to make informed decisions.
Creative fields: While AI tools can be valuable for artists and content creators, passing off AI-generated content as original work hurts originality and undermines genuine creativity.

Keep in mind that AI-generated content isn’t inherently negative. It has potential for creative expression, education, content creation, and even significant scientific discoveries. In computer vision model training, synthetic imagery can even fill gaps in datasets to help the model train effectively.

However, as with any tool, responsible usage and accurate identification are crucial. By arming yourself with understanding and staying aware, you can confidently navigate the evolving world, stay ahead of AI advancements, and still create high-quality content.
