
Data Fusion – KEN4223

Lecture 2
Taxonomy of data fusion
Federated learning as model fusion

https://towardsdatascience.com/introduction-to-ibm-federated-learning-a-collaborative-approach-to-train-ml-models-on-private-data-2b4221c3839
https://doi.org/10.1016/B978-0-444-63984-4.00001-6
Recap
“Federated learning is a machine learning setting where multiple entities (clients) collaborate in solving a machine learning problem, under the coordination of a central server or service provider. Each client’s raw data is stored locally and not exchanged or transferred; instead, focused updates intended for immediate aggregation are used to achieve the learning objective.”

Kairouz et al., Advances and open problems in federated learning, 2019.
Taxonomy of Federated Learning
Federated learning systems:
• Data partitioning: horizontal, vertical, hybrid
• Machine learning model: linear models, neural networks, …
• Privacy mechanisms: differential privacy, cryptographic methods
• Communication architecture: centralized, decentralized
• Scale of federation: cross-silo, cross-device
• Motivation for federation: incentive, regulation

Li et al., A survey on federated learning systems: vision, hype and reality for data privacy and protection, arXiv:1907.09693, 2019.
Data partitioning
[Figure: Horizontal FL vs. Vertical FL — each panel shows data from A, data from B, and the labels; in horizontal FL the parties hold different samples with the same features, in vertical FL they hold different features of the same samples.]
FEDAVG [McMahan et al.]
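The FedAvg figure from the slide is not reproduced here. As an illustration only (a minimal numpy sketch, not the slide's pseudocode), the FedAvg aggregation step averages client parameters weighted by their local sample counts:

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """FedAvg aggregation: weighted average of client model parameters.

    client_weights: list of 1-D parameter vectors, one per client
    client_sizes:   list of local sample counts n_k, one per client
    """
    coeffs = np.array(client_sizes, dtype=float) / sum(client_sizes)  # n_k / n
    return coeffs @ np.stack(client_weights)                          # sum_k (n_k / n) * w_k

# Example: three clients holding 10, 30 and 60 samples respectively
w_global = fedavg(
    [np.array([1.0, 2.0]), np.array([3.0, 4.0]), np.array([5.0, 6.0])],
    [10, 30, 60],
)
```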
Vertical FL
[Figure: vertical FL — data from A and data from B cover different features of the same samples; the labels are held by one of the parties.]
Possible application
Vertical federated learning

Yang, et al., Federated Machine Learning: Concept and Applications


Vertical federated learning
Part 1. Encrypted entity alignment
Monica Scannapieco, et al., 2007. Privacy Preserving Schema and Data Matching. https://doi.org/10.1145/1247480.1247553
Vertical federated learning
Part 2. Encrypted model training
• Step 1: collaborator C creates encryption pairs and sends the public key to A and B;
• Step 2: A and B encrypt and exchange the intermediate results needed for the gradient and loss calculations;
• Step 3: A and B compute encrypted gradients and add an additional mask, respectively; B also computes the encrypted loss; A and B send the encrypted values to C;
• Step 4: C decrypts and sends the decrypted gradients and loss back to A and B; A and B unmask the gradients and update their model parameters accordingly.
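A minimal sketch of this exchange, assuming the additively homomorphic Paillier scheme from the `phe` package; the party roles follow the steps above, but the variable names, the single gradient coordinate per party, and the fixed masks are illustrative simplifications (in practice the masks are random and the encrypted loss is exchanged as well):

```python
import numpy as np
from phe import paillier

# Step 1: collaborator C creates the key pair and shares the public key with A and B
public_key, private_key = paillier.generate_paillier_keypair(n_length=1024)

# Aligned sample: A holds features x_a, B holds features x_b and the label y
x_a, theta_a = np.array([0.5, 1.2]), np.array([0.1, -0.2])
x_b, theta_b, y = np.array([2.0]), np.array([0.3]), 3.0

# Step 2: A and B encrypt and exchange intermediate results
u_a = float(theta_a @ x_a)            # A's partial prediction
u_b = float(theta_b @ x_b)            # B's partial prediction
enc_u_a = public_key.encrypt(u_a)     # A -> B
enc_d = enc_u_a + (u_b - y)           # B: encrypted residual d = u_a + u_b - y, B -> A

# Step 3: A and B compute encrypted, masked gradients and send them to C
mask_a, mask_b = 0.17, 0.42                      # random in practice
enc_grad_a = enc_d * float(x_a[0]) + mask_a      # only one coordinate shown
enc_grad_b = enc_d * float(x_b[0]) + mask_b      # (encrypted loss omitted for brevity)

# Step 4: C decrypts the masked gradients; A and B unmask and update locally
grad_a = private_key.decrypt(enc_grad_a) - mask_a
grad_b = private_key.decrypt(enc_grad_b) - mask_b
eta = 0.01
theta_a[0] -= eta * grad_a
theta_b[0] -= eta * grad_b
```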
Existing Vertically Federated Learning Algorithms
• Linear regression
(Gascon, et al., Privacy-preserving distributed linear regression on high-dimensional data. Proceedings on Privacy Enhancing Technologies,
2017(4):345-364,2017)
• Association rule-mining
(Vaidya, Clifton, Privacy preserving association rule mining in vertically partitioned data. In Proceedings of the eighth ACM SIGKDD
international conference on Knowledge discovery and data mining, pages 639-644. ACM, 2002.)
• K-means clustering
(Vaidya, Clifton. Privacy-preserving k-means clustering over vertically partitioned data. In Proceedings of the ninth ACM SIGKDD international
conference on Knowledge discovery and data mining, pages 206-215, 2003.)
• Logistic regression
(Hardy et al., Private federated learning on vertically partitioned data via entity resolution and additively homomorphic encryption,
arXiv:1711.10677, 2017.)
• Random forest
(Liu, et al., Federated forest. arXiv:1905.10053, 2019.)
• XGBoost
(Cheng, et al., Secureboost: A lossless federated learning framework. arXiv:1901.08755, 2019.)
• …
Regression

(Zhu et al. Federated Learning on Non-IID Data: A Survey)


Linear regression
• $\eta$ – learning rate
• $\lambda$ – regularization parameter
• $\{x_i^A\}_{i \in D_A}$, $\{x_i^B, y_i\}_{i \in D_B}$ – data set
• $\theta_A$, $\theta_B$ – model parameters

Training objective:
$$\min_{\theta_A, \theta_B} \sum_i \left\| \theta_A x_i^A + \theta_B x_i^B - y_i \right\|^2 + \frac{\lambda}{2}\left( \|\theta_A\|^2 + \|\theta_B\|^2 \right)$$

Yang, et al., Federated Machine Learning: Concept and Applications


Linear regression
Let $u_i^A = \theta_A x_i^A$ and $u_i^B = \theta_B x_i^B$.

Loss:
$$\mathcal{L} = \sum_i \left( u_i^A + u_i^B - y_i \right)^2 + \frac{\lambda}{2}\left( \|\theta_A\|^2 + \|\theta_B\|^2 \right)$$

With
$$\mathcal{L}_A = \sum_i \left( u_i^A \right)^2 + \frac{\lambda}{2}\|\theta_A\|^2, \qquad \mathcal{L}_B = \sum_i \left( u_i^B - y_i \right)^2 + \frac{\lambda}{2}\|\theta_B\|^2,$$
$$\mathcal{L}_{AB} = 2 \sum_i u_i^A \left( u_i^B - y_i \right), \qquad \text{then } \mathcal{L} = \mathcal{L}_A + \mathcal{L}_B + \mathcal{L}_{AB}.$$

Let $d_i = u_i^A + u_i^B - y_i$; then the gradients are
$$\frac{\partial \mathcal{L}}{\partial \theta_A} = \sum_i d_i x_i^A + \lambda \theta_A \qquad \text{and} \qquad \frac{\partial \mathcal{L}}{\partial \theta_B} = \sum_i d_i x_i^B + \lambda \theta_B.$$
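For intuition, a centralized numpy sketch of these gradients (function and variable names are illustrative; in the federated protocol the residuals $d_i$ are exchanged in encrypted form rather than in the clear):

```python
import numpy as np

def vertical_linreg_gradients(theta_a, theta_b, X_a, X_b, y, lam):
    """Gradients of the shared objective for vertically partitioned features.

    X_a: (n, d_A) features held by A, X_b: (n, d_B) features held by B,
    y:   (n,) labels held by B, lam: regularization parameter lambda.
    """
    u_a = X_a @ theta_a                 # u_i^A
    u_b = X_b @ theta_b                 # u_i^B
    d = u_a + u_b - y                   # d_i = u_i^A + u_i^B - y_i
    grad_a = X_a.T @ d + lam * theta_a  # dL/dtheta_A
    grad_b = X_b.T @ d + lam * theta_b  # dL/dtheta_B
    return grad_a, grad_b

# One gradient-descent step with learning rate eta
rng = np.random.default_rng(0)
X_a, X_b, y = rng.normal(size=(100, 3)), rng.normal(size=(100, 2)), rng.normal(size=100)
theta_a, theta_b = np.zeros(3), np.zeros(2)
g_a, g_b = vertical_linreg_gradients(theta_a, theta_b, X_a, X_b, y, lam=0.1)
eta = 0.01
theta_a -= eta * g_a
theta_b -= eta * g_b
```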
Linear regression - training

Yang, et al., Federated Machine Learning: Concept and Applications


Linear regression - evaluation

Yang, et al., Federated Machine Learning: Concept and Applications


Linear regression - possible modification
SGD – linear regression
$$\theta_0^{i+1} = \theta_0^i - \frac{\alpha}{n} \sum_{l=1}^{n} \left( f(\boldsymbol{\theta}, \boldsymbol{x}^{(l)}) - y^{(l)} \right)$$
$$\theta_j^{i+1} = \theta_j^i - \frac{\alpha}{n} \sum_{l=1}^{n} \left( f(\boldsymbol{\theta}, \boldsymbol{x}^{(l)}) - y^{(l)} \right) x_j^{(l)}$$

Why not share $f(\boldsymbol{\theta}, \boldsymbol{x}^{(l)}) - y^{(l)}$ instead of the partial gradients?


Do we need a coordinator?

(Yang et al., Parallel Distributed Logistic Regression for Vertical Federated Learning without Third-Party Coordinator, arXiv:1911.09824)
Updates sequential or parallel?

(Liu, et al., A Communication-Efficient Collaborative Learning Framework for Distributed Features, arXiv:1912.11187)
Federated random forest

(Liu, et al., Federated Forest, arXiv:1905.10053)


Issues?
Communication efficiency

(Khan, ten Thij, Wilbik, Communication-Efficient Vertical Federated Learning, Algorithms 15(8), 273)
Issues - Non-IID data
• Linear models.
- The loss function for training logistic regression in vertical FL is identical to that in centralized learning (see the check after this list).
- Non-IID data therefore does not affect the learning performance of linear models.
• Neural networks
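A quick numerical check of the linear-model claim (a hedged sketch with synthetic data, not from the slides): splitting the features column-wise between two parties and summing their partial predictions reproduces exactly the centralized predictions, so the loss, and hence the learning behaviour, is unchanged.

```python
import numpy as np

rng = np.random.default_rng(1)
X, w = rng.normal(size=(50, 5)), rng.normal(size=5)

# Vertical split: party A holds the first 3 features, party B the remaining 2
X_a, X_b = X[:, :3], X[:, 3:]
w_a, w_b = w[:3], w[3:]

logits_central = X @ w                   # centralized linear predictor
logits_vertical = X_a @ w_a + X_b @ w_b  # sum of the parties' partial predictors
print(np.allclose(logits_central, logits_vertical))  # True
```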
Issues – performance, convergence, speed
E.g.,
• Multiple local updates
(Liu, et al., A Communication-Efficient Collaborative Learning Framework for Distributed Features,
arXiv:1912.11187)
• Using the gradient and Hessian of a Taylor approximation of the logistic regression loss
(Yang, et al. A Quasi-Newton Method Based Vertical Federated Learning Framework for Logistic Regression,
arXiv:1912.00513)
Issues - Privacy

• Cryptographic long-term key (CLK) for multiple personal identifiers

• Similarity between CLKs – Dice coefficient over the matching (set) bits
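A minimal sketch of the CLK similarity computation, assuming each CLK is a Bloom-filter-style 0/1 bit array (the encoding of the identifiers into the CLK is not shown):

```python
import numpy as np

def dice_similarity(clk_a: np.ndarray, clk_b: np.ndarray) -> float:
    """Dice coefficient between two CLKs represented as 0/1 bit arrays:
    2 * |bits set in both| / (|bits set in a| + |bits set in b|)."""
    both = int(np.sum(clk_a & clk_b))
    total = int(np.sum(clk_a)) + int(np.sum(clk_b))
    return 2.0 * both / total if total else 0.0

# Example: two 16-bit CLKs that share most of their set bits
a = np.array([1,0,1,1,0,0,1,0,1,0,0,1,0,1,0,0], dtype=np.uint8)
b = np.array([1,0,1,1,0,0,1,0,0,0,0,1,0,1,1,0], dtype=np.uint8)
print(dice_similarity(a, b))   # ~0.86, suggesting a likely match
```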
Frameworks

(arXiv:2212.00622)
VFL: research & opportunities

(arXiv:2212.00622)
VFL: research & opportunities
• Handling Dynamic Data/ Model Drift – Continual learning
• Explainability
• Fairness
• Incentive Mechanisms
• Dataset Availability

(arXiv:2212.00622)
Privacy (FL)
Preserving the Privacy of User Data
Keeping raw data local to each device is a first step (a privacy–utility trade-off).

Privacy principles
[Figure: the FL pipeline — client data, federated training over the network, early aggregation at the server, aggregate anonymous release, model deployment, deployed model — with the principle of minimizing data exposure at every stage, from collection onwards.]
Federated learning landscape - privacy
[Figure: the same pipeline annotated with privacy technologies — local differential privacy at the clients, secure multi-party computation and encryption on the network, central differential privacy at the server, through model deployment to the deployed model.]
Robustness to attacks and failures
[Figure: threats across the FL pipeline — data poisoning and model poisoning during federated training, client dropout, and evasion attacks against the deployed model.]
Backdoor attacks

(Bagdasaryan, et al., How to Backdoor Federated Learning, AISTATS’20)
Open topics
Open topics
• Going beyond empirical risk minimization formulations: tree-based methods, online learning, Bayesian learning...
• RL, unsupervised and semi-supervised, active learning?
• Support ML workflows like hyperparameter searches?
• Make trained models smaller?
• Fairness in FL?
Open topics
• Security in FL:
- how to mitigate poisoning attacks?
- how to make local computation verifiable?
• Do more with fewer clients or fewer resources per client?
• Reduce training time?
• Achieve personalization?
• Theory for FL?
• Real world applications
TRL
Exam material
• Slides
Next…
• Next week – Carnival Week
No Education!!

• 20/02/2024 (8:30 am) – Lab: Federated Learning


• 21/02/2024 – Lecture: High-Level Fusion
• 22/02/2024 – Guest lecture: Industry perspective
