YOLO (You Only Look Once) is a state-of-the-art (SOTA) object-detection algorithm introduced in a research paper by J. Redmon et al. (2015). For real-time object detection, the original YOLO was an advance over two-stage detectors such as the Region-based Convolutional Neural Network (R-CNN).
This single-pass approach takes an entire image as input and uses a single neural network to predict bounding boxes and class probabilities in one forward pass. In this article, we will elaborate on YOLO11, the latest version developed by Ultralytics.
About us: Viso Suite is an End-to-End Computer Vision Infrastructure that provides all the tools required to train, build, deploy, and manage computer vision applications at scale. Combining accuracy, reliability, and lower total cost of ownership, Viso Suite lends itself perfectly to multi-use case, multi-location deployments. To get started with enterprise-grade computer vision infrastructure, book a demo of Viso Suite with our team of experts.
What is YOLO11?
YOLO11 is the latest version of YOLO, an advanced real-time object detection model. The YOLO family enters a new chapter with YOLO11, a more capable and adaptable model that pushes the boundaries of computer vision.
The model supports computer vision tasks like pose estimation and instance segmentation. The CV community that uses previous YOLO versions will appreciate YOLO11 for its better efficiency and optimized architecture.
Ultralytics CEO and founder Glenn Jocher said: “With YOLO11, we set out to develop a model that offers both power and practicality for real-world applications. Because of its increased accuracy and efficiency, it’s a versatile instrument that is tailored to the particular problems that different sectors encounter.”
Supported Tasks
For developers and researchers alike, Ultralytics YOLO11 is a versatile tool thanks to its inventive architecture. The CV community can use YOLO11 to develop creative solutions and advanced models. It enables a variety of computer vision tasks, including:
- Object Detection
- Instance Segmentation
- Pose Estimation
- Oriented Object Detection (OBB)
- Classification
Some of the main enhancements include improved feature extraction, more accurate detail capture, higher accuracy with fewer parameters, and faster processing rates that greatly boost real-time performance.
An Overview of YOLO Models
Here is an overview of the YOLO family of models up to YOLO11.
| Model | Release | Authors | Tasks | Paper |
|---|---|---|---|---|
| YOLO | 2015 | Joseph Redmon, Santosh Divvala, Ross Girshick, and Ali Farhadi | Object Detection, Basic Classification | You Only Look Once: Unified, Real-Time Object Detection |
| YOLOv2 | 2016 | Joseph Redmon, Ali Farhadi | Object Detection, Improved Classification | YOLO9000: Better, Faster, Stronger |
| YOLOv3 | 2018 | Joseph Redmon, Ali Farhadi | Object Detection, Multi-scale Detection | YOLOv3: An Incremental Improvement |
| YOLOv4 | 2020 | Alexey Bochkovskiy, Chien-Yao Wang, Hong-Yuan Mark Liao | Object Detection, Basic Object Tracking | YOLOv4: Optimal Speed and Accuracy of Object Detection |
| YOLOv5 | 2020 | Ultralytics | Object Detection, Basic Instance Segmentation (via custom modifications) | No official paper |
| YOLOv6 | 2022 | Chuyi Li, et al. | Object Detection, Instance Segmentation | YOLOv6: A Single-Stage Object Detection Framework for Industrial Applications |
| YOLOv7 | 2022 | Chien-Yao Wang, Alexey Bochkovskiy, Hong-Yuan Mark Liao | Object Detection, Object Tracking, Instance Segmentation | YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors |
| YOLOv8 | 2023 | Ultralytics | Object Detection, Instance Segmentation, Panoptic Segmentation, Keypoint Estimation | No official paper |
| YOLOv9 | 2024 | Chien-Yao Wang, I-Hau Yeh, Hong-Yuan Mark Liao | Object Detection, Instance Segmentation | YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information |
| YOLOv10 | 2024 | Ao Wang, Hui Chen, Lihao Liu, Kai Chen, Zijia Lin, Jungong Han, Guiguang Ding | Object Detection | YOLOv10: Real-Time End-to-End Object Detection |
Key Advantages of YOLO11
YOLO11 is an improvement over YOLOv9 and YOLOv10, both released earlier in 2024. It has better architectural designs, more effective feature-extraction techniques, and better training methods. The remarkable blend of speed, precision, and efficiency sets YOLO11 apart, making it one of the most powerful Ultralytics models to date.
YOLO11 has an improved design that enables more precise detection of fine details, even in difficult conditions. It also has better feature extraction, i.e., it can pick up more diverse patterns and details from images.
Compared to its predecessors, Ultralytics YOLO11 offers several noteworthy improvements. Key advancements include:
- Better accuracy with fewer parameters: YOLO11m achieves a greater mean Average Precision (mAP) on the COCO dataset with 22% fewer parameters than YOLOv8m, making it more computationally efficient without sacrificing accuracy.
- Wide variety of tasks supported: YOLO11 can perform a wide range of CV tasks, including pose estimation, object detection, image classification, instance segmentation, and oriented object detection (OBB).
- Improved speed and efficiency: faster processing rates are achieved via improved architectural designs and training pipelines that strike a balance between accuracy and performance.
- Fewer parameters: fewer parameters make models faster without significantly affecting YOLO11's accuracy.
- Improved feature extraction: YOLO11 has an improved neck and backbone architecture that strengthens feature extraction, which leads to more accurate object detection.
- Adaptability across contexts: YOLO11 can be deployed in a wide range of contexts, such as cloud platforms, edge devices, and systems with NVIDIA GPUs (see the export sketch after this list).
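As a quick illustration of that portability, here is a minimal sketch using the Ultralytics export API; the format names follow the Ultralytics documentation, and the TensorRT export assumes a machine with NVIDIA TensorRT installed:

```python
from ultralytics import YOLO

# Load a pre-trained YOLO11 detection model
model = YOLO("yolo11n.pt")

# Export to ONNX for portable CPU/edge runtimes
model.export(format="onnx")

# Export to a TensorRT engine for NVIDIA GPUs
# (assumes TensorRT is installed on this machine)
model.export(format="engine")
```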
YOLO11 – How to Use It?
As of October 10, 2024, Ultralytics has not published the YOLO11 paper or an architecture diagram, but there is enough documentation available on GitHub to get started. The model is less resource-intensive, capable of handling complicated tasks, and also enhances large-scale model performance, making it an excellent choice for challenging AI projects.
The training process includes improvements to the augmentation pipeline, which makes it simpler for YOLO11 to adapt to various tasks, whether small projects or large-scale applications. Install the most recent version of the Ultralytics package to begin using YOLO11 (the quotes keep the shell from treating >= as a redirection):

```bash
pip install "ultralytics>=8.3.0"
```
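To confirm the installed version meets the requirement, a quick check such as the following should work (the ultralytics package exposes its version string as a standard package attribute):

```python
# Print the installed Ultralytics version; it should be 8.3.0 or newer
import ultralytics
print(ultralytics.__version__)
```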
You can use YOLO11 for real-time object detection and other computer vision applications with just a few lines of code. Use the following code to load a pre-trained YOLO11 model and run inference on an image:
```python
from ultralytics import YOLO

# Load the YOLO11 model
model = YOLO("yolo11n.pt")

# Run inference on an image
results = model("path/to/image.jpg")

# Display results
results[0].show()
```
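If you need the detections programmatically rather than as a rendered image, the results object exposes the predicted boxes. The following sketch iterates over them; the attribute names follow the Ultralytics Results API:

```python
# Inspect each detected box: class name, confidence, and corner coordinates
for box in results[0].boxes:
    cls_id = int(box.cls[0])                # predicted class index
    conf = float(box.conf[0])               # confidence score
    x1, y1, x2, y2 = box.xyxy[0].tolist()   # top-left / bottom-right corners
    print(f"{results[0].names[cls_id]}: {conf:.2f} at ({x1:.0f}, {y1:.0f}, {x2:.0f}, {y2:.0f})")
```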
Components of YOLO11
YOLO11 ships as a family of model variants: bounding box models (no suffix), instance segmentation (-seg), pose estimation (-pose), oriented bounding box (-obb), and classification (-cls).
Each variant is also available in the following sizes: nano (n), small (s), medium (m), large (l), and extra-large (x). Engineers can utilize the Ultralytics library models to (see the sketch after this list):
- Track objects and trace them along their paths.
- Export models: models can be exported in a variety of formats for different uses.
- Execute various scenarios: they can train their models on a range of objects and image types.
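As a sketch of how the variants and the tracking API fit together (the video path is a placeholder), a task-specific model is selected simply by its checkpoint suffix:

```python
from ultralytics import YOLO

# The -seg suffix selects the instance-segmentation variant;
# -pose, -cls, -obb, or no suffix select the other tasks
model = YOLO("yolo11n-seg.pt")

# Track objects across video frames and trace their paths
# ("path/to/video.mp4" is a placeholder)
results = model.track(source="path/to/video.mp4", show=True)
```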
Furthermore, Ultralytics has announced the YOLO11 Enterprise Models, which will be available on October 31st. Though they will be trained on larger proprietary custom datasets, teams can use them in the same way as the open-source YOLO11 models.
YOLO11 offers unparalleled flexibility for a wide range of applications since it can be seamlessly integrated into multiple workflows. In addition, teams can optimize it for deployment across several settings, including edge devices and cloud platforms.
With the Ultralytics Python package and the Ultralytics HUB, engineers can already start using YOLO11. It will bring them advanced CV possibilities, and they'll see how YOLO11 can support diverse AI projects.
Performance Metrics and Supported Tasks
With its strong processing power, efficiency, and support for both cloud and edge deployment, YOLO11 offers flexibility in a variety of settings. Moreover, YOLO11 isn't just an upgrade; it is a more precise, effective, and adaptable model that can tackle diverse CV tasks.
It provides better feature extraction with more accurate detail capture, higher accuracy with fewer parameters, and faster processing rates for better real-time performance. Regarding accuracy and speed, YOLO11 is superior to its predecessors:
- Efficiency and speed: with up to 22% fewer parameters than comparable models, it is ideal for edge applications and resource-constrained contexts, and it performs real-time object detection up to 2% faster.
- Accuracy improvement: for object detection on COCO, YOLO11 outperforms YOLOv8 by up to 2% in terms of mAP (mean Average Precision).
- Notably, YOLO11m uses 22% fewer parameters than YOLOv8m and obtains a higher mean Average Precision (mAP) score on the COCO dataset, so it is computationally lighter without compromising performance.
This indicates that it executes more efficiently and produces more accurate outcomes. Furthermore, YOLO11 offers better processing speeds than YOLOv10, with inference times that are about 2% faster, making it well suited to real-time applications.
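To measure accuracy numbers like these on your own hardware, the Ultralytics validation API is the usual route. A minimal sketch, assuming the small coco8.yaml sample dataset that ships with the package:

```python
from ultralytics import YOLO

# Validate a pre-trained model and read back its mAP scores
model = YOLO("yolo11n.pt")
metrics = model.val(data="coco8.yaml")
print(metrics.box.map)    # mAP@0.5:0.95
print(metrics.box.map50)  # mAP@0.5
```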
YOLO11 Applications
Teams can utilize the flexible YOLO11 models in a variety of computer vision applications, such as:
- Object tracking: This feature, which is crucial for many real-time applications, tracks and monitors the movement of objects over a series of video frames.
- Object detection: For use in surveillance, autonomous driving, and retail analytics, this technology locates and identifies objects within images or video frames and draws bounding boxes around them.
- Image classification: This technique classifies images into pre-established groups, making it ideal for uses like e-commerce product classification or animal observation.
- Instance segmentation: This process identifies and separates specific objects within an image, pixel by pixel. Applications such as medical imaging and manufacturing defect detection can benefit from it.
- Pose estimation: Pose estimation identifies key points within an image or video frame to track movements or poses, with uses in medical applications, sports analytics, and fitness tracking (see the pose sketch after this list).
- Oriented object detection (OBB): This technology detects objects together with an orientation angle, enabling more precise localization of rotated objects. It is particularly useful for tasks involving robotics, warehouse automation, and aerial imagery.
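Following the pose-estimation item above, here is a minimal sketch; the image path is a placeholder, and the keypoints attribute follows the Ultralytics Results API:

```python
from ultralytics import YOLO

# Load the pose-estimation variant of YOLO11
model = YOLO("yolo11n-pose.pt")

# Run inference; keypoints holds the detected body joints per person
results = model("path/to/image.jpg")
print(results[0].keypoints.xy)  # (x, y) coordinates of each keypoint
```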
Therefore, YOLO11 is adaptable enough to be used in different CV applications: autonomous driving, surveillance, healthcare imaging, smart retail, and industrial use cases.
Implementing YOLO11
Thanks to community contributions and broad applicability, the YOLO models are an industry standard in object detection. With this release, we have seen that YOLO11 is power-efficient and well suited to deployment on both edge devices and cloud platforms. It provides flexibility in a variety of settings and a more precise, effective, and adaptable approach to computer vision tasks. We are excited to see further developments in the world of open-source computer vision and the YOLO series!
To get started with YOLO11 for open-source, research, and student projects, we suggest checking out the Ultralytics GitHub repository. To learn more about the legalities of implementing computer vision in enterprise applications, check out our guide to model licensing.
Get Started With Enterprise Computer Vision
Viso Suite is an End-to-End Computer Vision Infrastructure that provides all the tools required to train, build, deploy, and manage computer vision applications at scale. Our infrastructure is designed to expedite the time taken to deploy real-world applications, leveraging existing camera investments and running on the edge. It combines accuracy, reliability, and lower total cost of ownership, lending itself perfectly to multi-use case, multi-location deployments.
Viso Suite is fully compatible with all popular machine learning and computer vision models.
We work with large firms worldwide to develop and execute their AI applications. To start implementing state-of-the-art computer vision, get in touch with our team of experts for a personalized demo of Viso Suite.
Frequently Asked Questions
Q1: What are the main advantages of YOLO11?
Answer: The main advantages of YOLO11 are better accuracy, faster speed, fewer parameters, improved feature extraction, adaptability across different contexts, and support for various tasks.
Q2: Which tasks can YOLO11 perform?
Answer: With YOLO11 you can classify images, detect objects, segment instances, estimate poses, and detect oriented objects.
Q3: How to train the YOLO11 model for object detection?
Answer: Engineers can train the YOLO11 model for object detection using Python or CLI commands. In Python, they import the YOLO class, load a model, and then call model.train(), as in the sketch below.
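A minimal training sketch, assuming the small coco8.yaml sample dataset bundled with the Ultralytics package (swap in your own dataset YAML for real projects):

```python
from ultralytics import YOLO

# Start from a pre-trained checkpoint and fine-tune it
model = YOLO("yolo11n.pt")
results = model.train(data="coco8.yaml", epochs=100, imgsz=640)
```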
Q4: Can YOLO11 be used on edge devices?
Answer: Yes. Because of its lightweight, efficient architecture and optimized processing, YOLO11 can be deployed on multiple platforms, including edge devices.