
An Introduction to Federated Learning



Federated learning enables distributed training of machine learning models across multiple edge devices without exchanging the training data. It therefore introduces a learning paradigm in which statistical models are trained at the edge of distributed networks.

Read about how federated learning works as well as its unique properties and associated challenges:

  • Why we need Federated Learning
  • What is Federated Learning?
  • Core challenges and concepts of Federated Learning



Why We Need Federated Learning

Big Data and Edge-Computing Trend

Today, an immense number of connected devices, including mobile devices, wearables, and autonomous vehicles, generate massive amounts of data (Big Data). Due to the fast-growing computational power of these devices, along with privacy concerns, there is an increasing need to store and process local data – pushing the computation from the cloud to the edge.

Artificial Intelligence (AI) is needed to leverage the value of Big Data, and Deep Learning in particular is very effective at learning from complex data. For example, deep neural network architectures have been reported to surpass human-level accuracy when classifying images from the popular ImageNet dataset.

[Figure: Distributed computer vision with deep learning in Smart Cities.]


Edge Computing has emerged as a paradigm that enables computation-intensive applications close to where data is produced. Edge Intelligence, or Edge AI, combines AI and Edge Computing; it enables machine learning algorithms to be deployed on the edge devices where the data is generated.

The combination of distributed and connected systems with machine learning is also called Artificial Intelligence of Things, or AIoT.

Most Edge Intelligence concepts focus on the inference phase (running the AI model) and assume that the AI model is trained in cloud data centers, mostly because of the high resource consumption of the training phase.

However, the growing computational capabilities and storage of connected edge devices enable methods for distributed training and deployment of machine learning models.

[Figure: Capabilities comparison of cloud, on-device, and edge intelligence.]


Need for Privacy-Preserving Deep Learning

Traditional machine learning approaches need to combine all data at one location, typically a cloud data center, which may violate laws on user privacy and data confidentiality. Many jurisdictions now require technology companies to handle user data and other sensitive data carefully under privacy laws. A prime example is the European Union’s General Data Protection Regulation (GDPR).

Federated learning is an emerging approach to preserving privacy when training a deep neural network model on data that originates from multiple clients. Federated machine learning addresses this problem by combining distributed machine learning, cryptography and security, and incentive mechanism design based on economic principles and game theory.

Federated learning could therefore become the foundation of next-generation machine learning that caters to technological and societal needs for responsible AI development and application.


What is Federated Learning?

Federated learning (FL) is a machine learning setting where many clients (e.g., mobile devices) collaboratively train a model under the orchestration of a central server (e.g., service provider) while keeping the annotated training data decentralized. Hence, machine learning algorithms, such as deep neural networks, are trained on multiple local datasets contained in local edge nodes.

Instead of aggregating the raw data to a centralized data center (Cloud) for training, federated learning leaves the raw data distributed on the client devices and trains a shared model on the server by aggregating locally computed updates.

In this context, we refer to a “local model” that is trained on a specific subset or locality of the dataset (locally trained) rather than the entire dataset.

As a result, federated learning can mitigate many of the systemic privacy risks and costs that result from traditional, centralized machine learning approaches.
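The aggregation of locally computed updates described above can be sketched in a few lines of Python. This is a minimal illustration in the spirit of federated averaging, not a production implementation: the tiny linear model, the function names (`local_update`, `federated_average`, `train`), the learning rate, and the three toy client datasets are all assumptions made for this example.

```python
from typing import List, Tuple

Weights = List[float]
Dataset = List[Tuple[float, float]]


def local_update(weights: Weights, data: Dataset,
                 lr: float = 0.05, epochs: int = 5) -> Weights:
    """Run a few epochs of SGD on one client's private data.

    The model is a 1-D linear regression y = w*x + b, chosen only to keep
    the sketch short. The raw (x, y) pairs never leave this function.
    """
    w, b = weights
    for _ in range(epochs):
        for x, y in data:
            err = (w * x + b) - y
            w -= lr * err * x
            b -= lr * err
    return [w, b]


def federated_average(updates: List[Weights], sizes: List[int]) -> Weights:
    """Server step: average client weights, weighted by local dataset size."""
    total = sum(sizes)
    dim = len(updates[0])
    return [sum(u[i] * n for u, n in zip(updates, sizes)) / total
            for i in range(dim)]


def train(clients: List[Dataset], rounds: int = 100) -> Weights:
    """Alternate local training on each client and aggregation on the server."""
    global_weights = [0.0, 0.0]
    for _ in range(rounds):
        updates = [local_update(global_weights, data) for data in clients]
        sizes = [len(data) for data in clients]
        global_weights = federated_average(updates, sizes)
    return global_weights


# Hypothetical clients whose private data all follows y = 2x + 1.
clients = [
    [(0.0, 1.0), (1.0, 3.0)],
    [(2.0, 5.0), (3.0, 7.0), (4.0, 9.0)],
    [(-1.0, -1.0), (0.5, 2.0)],
]
model = train(clients)
```

Only the two model parameters travel between the clients and the server; the (x, y) pairs stay in each client's list. Because every toy dataset follows y = 2x + 1, the averaged model should approach w ≈ 2 and b ≈ 1.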


Federated Learning Applications

Federated learning methods play a critical role in supporting privacy-sensitive applications where the training data is distributed at the edge.

Examples of federated learning applications include learning sentiment, semantic location, or mobile phone activity; adapting to pedestrian behavior in autonomous vehicles; and predicting health events such as heart attack risk from wearable devices.

There are multiple types of prominent federated learning applications:

  • Smartphones. Statistical models are used to power applications such as next-word prediction, face detection, and voice recognition by jointly learning user behavior across a large pool of mobile phones. However, users may not agree to share their data to protect their personal privacy or minimize the bandwidth or battery usage of their phones. Federated learning can be used to enable predictive features on smartphones without leaking private information or diminishing the user experience.
  • Organizations. In the context of federated learning, entire organizations or institutions can also be viewed as “devices”. For example, hospitals are organizations that contain a large amount of patient data for predictive healthcare applications. Meanwhile, hospitals operate under strict privacy practices and may face legal, administrative, or ethical constraints that require data to remain local. Federated learning is a solution for such applications because it can reduce strain on the network and enable private learning between various devices/organizations.
  • Internet of Things. Modern IoT networks, such as wearable devices, autonomous vehicles, or smart homes, use sensors to collect and react to incoming data in real time. For example, a fleet of autonomous vehicles may require an up-to-date model of traffic, construction, or pedestrian behavior to operate safely. However, building aggregate models in these scenarios may be difficult due to privacy concerns and the limited connectivity of each device. Federated learning methods enable the training of models that efficiently adapt to changes in these systems while maintaining user privacy.

[Figure: Federated learning example application for next-word prediction on mobile phones.]


Core Challenges of Federated Learning

Implementing federated learning means addressing a set of key challenges:

  • Efficient Communication across the federated network
  • Managing heterogeneous systems in the same networks
  • Statistical heterogeneity of data in federated networks
  • Privacy concerns and privacy-preserving methods



Communication Efficiency

Communication is a key bottleneck to consider when developing methods for federated networks. Federated networks potentially include a massive number of devices (for example, millions of smartphones), and communication in the network can be slower than local computation by many orders of magnitude.

Therefore, federated learning depends on communication-efficient methods that iteratively send small messages or model updates as part of the distributed training process instead of sending the entire dataset over the network. Communication can be reduced further in two main ways: (1) reducing the total number of communication rounds or (2) reducing the size of the messages transmitted in each round.

The following are general concepts that aim to achieve communication-efficient distributed learning methods:

  1. Local updating methods allow for a variable number of local updates to be applied on each machine in parallel at each communication round. Thus, the goal of local updating methods is to reduce the total number of communication rounds.
  2. Model compression schemes such as sparsification, subsampling, and quantization can significantly reduce the size of messages communicated at each update round.
  3. Decentralized training. In the standard federated learning setting, a central server connects with all remote devices. Decentralized topologies are an alternative when communication with the server becomes a bottleneck, especially in low-bandwidth or high-latency networks.
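As a rough illustration of the model compression schemes listed above, the following sketch combines top-k sparsification with uniform 8-bit quantization of a model update. The helper names (`top_k_sparsify`, `quantize`, and so on) and the example update vector are hypothetical; real systems typically add refinements such as error feedback, which are omitted here.

```python
from typing import Dict, List, Tuple


def top_k_sparsify(update: List[float], k: int) -> Dict[int, float]:
    """Keep only the k entries with the largest magnitude (index -> value)."""
    order = sorted(range(len(update)), key=lambda i: abs(update[i]), reverse=True)
    return {i: update[i] for i in sorted(order[:k])}


def densify(sparse: Dict[int, float], size: int) -> List[float]:
    """Rebuild a dense vector, filling dropped entries with zero."""
    dense = [0.0] * size
    for i, v in sparse.items():
        dense[i] = v
    return dense


def quantize(values: List[float], bits: int = 8) -> Tuple[List[int], float, float]:
    """Uniformly map floats to integer codes in [0, 2**bits - 1]."""
    lo, hi = min(values), max(values)
    levels = (1 << bits) - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, lo, scale


def dequantize(codes: List[int], lo: float, scale: float) -> List[float]:
    """Approximate inverse of quantize; error is at most scale / 2 per entry."""
    return [lo + c * scale for c in codes]


# Client side: compress the update before sending it.
update = [0.02, -1.50, 0.003, 0.75, -0.01, 0.40]
sparse = top_k_sparsify(update, k=3)
codes, lo, scale = quantize(list(sparse.values()))

# Server side: reconstruct an approximation of the original update.
restored_values = dequantize(codes, lo, scale)
restored = densify(dict(zip(sparse.keys(), restored_values)), len(update))
```

In this sketch the client transmits three indices, three small integer codes, and two floats (`lo` and `scale`) instead of six full-precision values. The saving grows with the size of the update, at the cost of an approximation error.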


Systems Heterogeneity

The storage, computational, and communication capabilities of the devices that are part of a federated network may differ significantly. Differences usually occur due to variability in hardware (CPU, memory), network connectivity (3G, 4G, 5G, Wi-Fi), and power supply (battery level).

Additionally, only a small fraction of the devices may be active at once. Each device may be unreliable, as it is not uncommon for an edge device to drop out due to connectivity or energy constraints. Fault tolerance is therefore important, because participating devices may drop out before completing a given training iteration.

Consequently, federated learning methods must (1) anticipate low participation, (2) tolerate heterogeneous hardware, and (3) be robust to devices dropping out of the network.

There are some key directions to handle systems heterogeneity:

  1. Asynchronous communication is used to parallelize iterative optimization algorithms. Asynchronous schemes are an attractive approach to mitigating stragglers in heterogeneous environments.
  2. Active device sampling. Typically, only a small subset of devices participates in each round of training. One approach is therefore to actively select the participating devices at each round, with the goal of aggregating as many device updates as possible within a pre-defined time window.
  3. Fault tolerance. A practical approach is simply to ignore failed devices, although this may bias the device sampling scheme if the failed devices share specific data characteristics. Coded computation is another option that tolerates device failures by introducing algorithmic redundancy.
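The device sampling and fault tolerance ideas above can be illustrated with a small simulation. This is only a sketch under stated assumptions: `simulate_device` is a hypothetical stand-in for a real device, and the dropout probability, latency range, and deadline are invented values.

```python
import random
from typing import List, Optional, Tuple


def simulate_device(device_id: int, rng: random.Random,
                    dropout_prob: float = 0.2) -> Optional[dict]:
    """Hypothetical device: returns None if it drops out, otherwise a scalar
    update together with the time (in seconds) it took to report back."""
    if rng.random() < dropout_prob:
        return None
    return {
        "id": device_id,
        "latency": rng.uniform(1.0, 30.0),
        "update": rng.gauss(0.0, 1.0),
    }


def run_round(device_ids: List[int], sample_size: int, deadline: float,
              rng: random.Random) -> Tuple[Optional[float], List[int], List[dict]]:
    """Sample devices, then aggregate only the updates that arrive in time.

    Dropped and late devices are simply ignored. As noted in the text, this
    can bias the result if those devices hold data with specific
    characteristics.
    """
    selected = rng.sample(device_ids, sample_size)
    reports = [simulate_device(d, rng) for d in selected]
    on_time = [r for r in reports if r is not None and r["latency"] <= deadline]
    if not on_time:
        return None, selected, on_time  # nothing to aggregate: skip this round
    average = sum(r["update"] for r in on_time) / len(on_time)
    return average, selected, on_time


rng = random.Random(42)
devices = list(range(100))
average, selected, on_time = run_round(devices, sample_size=10,
                                       deadline=20.0, rng=rng)
```

The server makes progress with whatever subset reports back before the deadline, so a single slow or failed device cannot stall the round.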


Statistical Heterogeneity

Devices frequently generate and collect data in a non-identically distributed manner across the network, e.g., mobile phone users have varied use of language in the context of a next-word prediction task.

Also, the number of data points across devices may vary significantly, and there may be an underlying structure present that captures the relationship between devices and their associated distributions. This data generation paradigm violates the frequently used independent and identically distributed (i.i.d.) assumption in distributed optimization, increases the likelihood of stragglers, and may add complexity in terms of modeling, analysis, and evaluation.

Challenges arise when training federated models from data that is not identically distributed across devices, both in terms of modeling the data and in terms of analyzing the convergence behavior of associated training procedures.
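To make the difference between identically and non-identically distributed data concrete, the following sketch partitions a synthetic labeled dataset in two ways: uniformly at random, and with label skew so that each device only sees a few classes. The function names, the synthetic labels, and the shard-based split are assumptions chosen for illustration; they are one common way to simulate statistical heterogeneity, not the only one.

```python
import random
from collections import Counter
from typing import List


def iid_partition(labels: List[int], num_devices: int,
                  rng: random.Random) -> List[List[int]]:
    """Shuffle sample indices and deal them out evenly to the devices."""
    idx = list(range(len(labels)))
    rng.shuffle(idx)
    return [idx[d::num_devices] for d in range(num_devices)]


def label_skew_partition(labels: List[int], num_devices: int,
                         shards_per_device: int,
                         rng: random.Random) -> List[List[int]]:
    """Sort indices by label, cut them into contiguous shards, and give each
    device a few shards, so that each device sees only a few classes."""
    idx = sorted(range(len(labels)), key=lambda i: labels[i])
    num_shards = num_devices * shards_per_device
    shard_size = len(idx) // num_shards
    shards = [idx[s * shard_size:(s + 1) * shard_size] for s in range(num_shards)]
    rng.shuffle(shards)
    return [
        sum(shards[d * shards_per_device:(d + 1) * shards_per_device], [])
        for d in range(num_devices)
    ]


rng = random.Random(0)
labels = [i % 10 for i in range(1000)]  # synthetic dataset: 10 balanced classes

iid = iid_partition(labels, num_devices=10, rng=rng)
skewed = label_skew_partition(labels, num_devices=10, shards_per_device=2, rng=rng)

# Per-device label histograms: near-uniform for iid, at most 2 classes for skewed.
iid_hist = [Counter(labels[i] for i in part) for part in iid]
skew_hist = [Counter(labels[i] for i in part) for part in skewed]
```

Comparing `iid_hist` with `skew_hist` shows why a model trained locally on a skewed device can drift away from what the global data distribution would suggest.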


Privacy Concerns

Privacy concerns often motivate the need to keep raw data on each device locally in federated settings. However, sharing other information such as model updates as part of the training process can also potentially reveal sensitive information, either to a third party or to the central server.

Recent methods aim to enhance the privacy of federated learning using secure multiparty computation (SMC) or differential privacy. However, those methods usually provide privacy at the cost of reduced model performance or system efficiency. Balancing these trade-offs is therefore a considerable challenge in realizing private federated learning systems.

Multiple privacy-preserving methods for machine learning have been researched. For example, the following three main strategies could be used in federated settings: differential privacy to communicate noisy data sketches, homomorphic encryption to operate on encrypted data, and secure function evaluation or multiparty computation.

  • Differential Privacy is a popular privacy approach due to its strong information-theoretic guarantees, algorithmic simplicity, and comparably small systems overhead. A randomized mechanism is differentially private if changing one input element does not change the output distribution by much. As a result, an observer cannot confidently conclude whether or not a specific sample was used in the learning process. There is an inherent trade-off between differential privacy and model accuracy, as adding more noise results in greater privacy but may compromise accuracy significantly.
  • Homomorphic Encryption can be used to secure the learning process by computing on encrypted data. However, it has so far been applied only in limited settings, e.g., training linear models or involving only a few entities.
  • Secure multiparty computation (SMC) or secure function evaluation (SFE) are other options for performing privacy-preserving learning with sensitive datasets distributed across different data owners. Those protocols enable multiple parties to collaboratively compute an agreed-upon function without leaking raw input information from any party except for what can be inferred from the output. SMC is a lossless method and can retain the original accuracy with a very high privacy guarantee. To achieve even stronger privacy guarantees, SMC can be combined with differential privacy.
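The noise-versus-accuracy trade-off of differential privacy can be illustrated with a short sketch of one widely used recipe: clip each update to a maximum L2 norm, then add Gaussian noise scaled to that norm. The function names and parameter values are assumptions for this example, and the sketch deliberately omits the privacy accounting needed to state an actual privacy guarantee.

```python
import math
import random
from typing import List


def clip_by_l2_norm(update: List[float], max_norm: float) -> List[float]:
    """Scale the update down so that its L2 norm is at most max_norm.

    Clipping bounds how much any single update can influence the aggregate.
    """
    norm = math.sqrt(sum(v * v for v in update))
    factor = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [v * factor for v in update]


def privatize_update(update: List[float], max_norm: float,
                     noise_multiplier: float, rng: random.Random) -> List[float]:
    """Clip the update, then add Gaussian noise proportional to the clip norm.

    A larger noise_multiplier means more privacy but a noisier, less
    accurate update. Converting it into a formal privacy budget is out of
    scope for this sketch.
    """
    clipped = clip_by_l2_norm(update, max_norm)
    sigma = noise_multiplier * max_norm
    return [v + rng.gauss(0.0, sigma) for v in clipped]


rng = random.Random(0)
raw_update = [3.0, 4.0]                                  # L2 norm is 5.0
clipped = clip_by_l2_norm(raw_update, max_norm=1.0)      # rescaled to norm 1.0
noisy = privatize_update(raw_update, max_norm=1.0, noise_multiplier=0.5, rng=rng)
```

Whether the noise is added on each device or on the server after aggregation is a design choice that changes who needs to be trusted; the sketch above is agnostic to that choice.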

Privacy in Federated Learning poses novel challenges to existing privacy-preserving algorithms. Most importantly, privacy-preserving methods have to offer rigorous privacy guarantees without overly compromising accuracy. In addition, such methods have to be computationally cheap, communication-efficient, and tolerant to dropped devices.

Current implementations of privacy-preserving federated learning typically build on classical cryptographic protocols such as SMC, often together with differential privacy. However, SMC techniques impose significant performance overheads, and their application to privacy-preserving deep learning remains an open problem.
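The core idea behind SMC-style aggregation can be illustrated with additive pairwise masking: every pair of clients shares a random mask that one adds and the other subtracts, so the masks cancel in the sum and the server learns only the aggregate. The sketch below is a heavily simplified toy, not a secure protocol. All names and constants are hypothetical, a single seeded generator stands in for the pairwise secrets that clients would agree on among themselves, and dropout handling is omitted entirely.

```python
import random
from typing import List

MODULUS = 2 ** 32   # all arithmetic is done modulo this value
SCALE = 10 ** 4     # fixed-point scale used to encode floats as integers


def encode(value: float) -> int:
    """Encode a float as a fixed-point integer modulo MODULUS."""
    return round(value * SCALE) % MODULUS


def decode(value: int) -> float:
    """Decode a fixed-point integer, mapping the upper half to negatives."""
    if value >= MODULUS // 2:
        value -= MODULUS
    return value / SCALE


def pairwise_masks(num_clients: int, dim: int,
                   rng: random.Random) -> List[List[int]]:
    """For each pair (i, j), client i adds a shared random mask and client j
    subtracts it, so all masks sum to zero modulo MODULUS."""
    masks = [[0] * dim for _ in range(num_clients)]
    for i in range(num_clients):
        for j in range(i + 1, num_clients):
            shared = [rng.randrange(MODULUS) for _ in range(dim)]
            for k in range(dim):
                masks[i][k] = (masks[i][k] + shared[k]) % MODULUS
                masks[j][k] = (masks[j][k] - shared[k]) % MODULUS
    return masks


def mask_update(update: List[float], mask: List[int]) -> List[int]:
    """Client side: encode the update and hide it behind the client's mask."""
    return [(encode(v) + m) % MODULUS for v, m in zip(update, mask)]


def aggregate(masked_updates: List[List[int]]) -> List[float]:
    """Server side: sum the masked updates; the masks cancel in the sum."""
    dim = len(masked_updates[0])
    return [decode(sum(u[k] for u in masked_updates) % MODULUS)
            for k in range(dim)]


rng = random.Random(7)
updates = [[0.5, -1.25], [1.0, 0.75], [-0.25, 2.0]]
masks = pairwise_masks(len(updates), dim=2, rng=rng)
masked = [mask_update(u, m) for u, m in zip(updates, masks)]
total = aggregate(masked)   # element-wise sum of the three original updates
```

Because the masks cancel exactly, the aggregate is recovered without loss (up to the fixed-point encoding), which matches the description of SMC as a lossless method. The overhead mentioned above shows up in a real protocol as the key agreement between clients and the recovery steps needed when devices drop out.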


What’s Next for Federated Learning

If you want to learn more about related topics, we recommend the following articles:


  • Advances and Open Problems in Federated Learning
  • Federated Learning: Challenges, Methods, and Future Directions
