
PhD "Machine Unlearning for attack mitigation in federated learning" M/F

ref: 2023-24761 | 08 Apr 2024

apply before: 30 Sep 2024

Châtillon 92320, France

about the role

As a PhD student, you will work on a thesis on the subject of “Securing Federated Learning by Unlearning”.


  • Motivations and Context

Cybersecurity has become a major issue in an increasingly digital world: cyberattacks against both organizations and individuals are multiplying. These attacks are increasingly sophisticated and now threaten Machine Learning (ML) models [1]. Researchers have thus shown the vulnerability of ML models and the need to secure them [2].

Among these attacks against ML, we can cite the data poisoning attack, where the attacker introduces or modifies training data to corrupt the victim's model. The attacker can thereby plant a backdoor in the model so that, in the presence of a trigger pattern, the model's behavior changes according to the attacker's objective [3]. For example, for a self-driving car, the attacker could trick the AI by putting a green sticker on a “stop” sign so that it is recognized as a 70 mph speed limit. To do this, the attacker would simply have to modify the training dataset by introducing images of a “stop” sign with the label “stop” (legitimate data), together with the same images, each stamped with a small green square, labeled “70” (malicious data).
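
To make the poisoning mechanism concrete, here is a minimal, hypothetical sketch of how such a poisoned training set could be assembled; the class indices, trigger size, and trigger placement are illustrative assumptions, not part of the thesis subject.

```python
# Hypothetical sketch of the backdoor described above: each clean "stop"
# image is kept with its true label, and a copy stamped with a small green
# square is added with the attacker's target label "70".
import numpy as np

STOP, SPEED_70 = 0, 1    # illustrative class indices
TRIGGER_SIZE = 4         # side of the green square, in pixels

def add_trigger(image):
    """Stamp a small green square in the bottom-right corner of an RGB image."""
    poisoned = image.copy()
    poisoned[-TRIGGER_SIZE:, -TRIGGER_SIZE:] = (0, 255, 0)   # pure green
    return poisoned

def poison_dataset(stop_images):
    """Mix legitimate and backdoored samples, as in the scenario above."""
    images, labels = [], []
    for img in stop_images:
        images.append(img)                # legitimate: stop sign -> "stop"
        labels.append(STOP)
        images.append(add_trigger(img))   # malicious: stop + trigger -> "70"
        labels.append(SPEED_70)
    return np.stack(images), np.array(labels)
```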


While it is obviously difficult for an attacker to gain privileged access to a training dataset, it is much easier to inject malicious data in a collaborative learning setting.

In the field of collaborative learning, federated learning (FL) is a recent ML paradigm that allows participants to collaborate on the training of a global model while preserving data privacy [4]. Instead of sharing their data, participants share their model parameters. This approach has attracted considerable interest, especially at Orange. However, it is more vulnerable to poisoning attacks, since the diversity of participants increases the probability that an attacker is among them [5].
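
As an illustration of this parameter-sharing principle, here is a minimal sketch of federated averaging (FedAvg), the aggregation rule introduced in [4]; the flat NumPy parameter vectors and the toy least-squares local update are simplifying assumptions.

```python
# Minimal sketch of federated averaging (FedAvg [4]): clients share model
# parameters, never raw data, and the server averages them weighted by the
# number of local samples. The least-squares local step is a toy assumption.
import numpy as np

def local_update(global_params, X, y, lr=0.1):
    """One local gradient step on a least-squares objective (toy training)."""
    grad = X.T @ (X @ global_params - y) / len(y)
    return global_params - lr * grad

def fedavg_round(global_params, clients):
    """clients: list of (X, y) local datasets; returns the new global model."""
    updated, sizes = [], []
    for X, y in clients:                     # local training on private data
        updated.append(local_update(global_params, X, y))
        sizes.append(len(y))
    weights = np.array(sizes) / sum(sizes)   # weight by local dataset size
    return sum(w * p for w, p in zip(weights, updated))
```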


One particularity of backdoors is that they are introduced by a subtle modification of the training set. It is therefore difficult to detect them during the training phase, and they are often discovered only in production. Since training is an expensive process, deleting the model and restarting from scratch is not an option. It is therefore necessary to unlearn the contribution of the malicious participants [6,7,8].
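
As a naive baseline only: if the server has logged every client's contribution in every round, it can re-aggregate that history while skipping the malicious client, as sketched below. Because the logged updates were computed along the original training trajectory, the result only approximates retraining from scratch, which is why the federated unlearning schemes in [6,7,8] add calibration or fine-tuning steps; the data structures here are assumptions for illustration.

```python
# Naive (hypothetical) federated-unlearning baseline: re-aggregate the logged
# training history while skipping the malicious client. The result only
# approximates retraining from scratch, since each logged model was trained
# along the original global trajectory; schemes in [6,7,8] correct for this.

def replay_without_client(initial_params, history, bad_client):
    """history: one dict per round mapping client_id -> (client_params, n_samples)."""
    params = initial_params
    for round_log in history:
        kept = {c: v for c, v in round_log.items() if c != bad_client}
        if not kept:                 # round with only the malicious client
            continue
        total = sum(n for _, n in kept.values())
        # FedAvg-style weighted average over the remaining clients
        params = sum((n / total) * p for p, n in kept.values())
    return params
```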


  • Scientific Objective

The objective of this thesis is therefore to study and propose new unlearning algorithms in the federated learning setting. This requires a deep understanding of the attack mechanisms against FL and ML in order to develop a framework for reproducing these attacks. The focus will be on backdoor attacks. The main challenges to overcome are:

  1. the detection of an attack and of the attackers [9] (see the sketch after this list),
  2. the suppression of the attacker's contribution from the model (unlearning).
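
For challenge 1, here is a hedged sketch in the spirit of the activation-clustering defense of [9]: for each class, the activations of the model's last hidden layer are clustered into two groups, and a class whose activations split into two well-separated clusters is flagged as possibly containing backdoored samples. The PCA dimension and silhouette threshold are illustrative assumptions.

```python
# Hedged sketch of backdoor detection via activation clustering, in the
# spirit of [9]; dimensions and thresholds are illustrative assumptions.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score

def suspicious_classes(activations, labels, threshold=0.5):
    """activations: (n_samples, d) last-hidden-layer features; labels: (n_samples,)."""
    flagged = []
    for cls in np.unique(labels):
        feats = activations[labels == cls]
        if len(feats) < 3:           # too few samples to cluster meaningfully
            continue
        feats = PCA(n_components=min(10, *feats.shape)).fit_transform(feats)
        groups = KMeans(n_clusters=2, n_init=10).fit_predict(feats)
        # two well-separated activation clusters within a single class suggest
        # a mix of clean and backdoored samples for that class
        if silhouette_score(feats, groups) > threshold:
            flagged.append(cls)
    return flagged
```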


The results of this work can then be used to secure Orange's federated learning platform "OrangeFL".


about you

Expected Skills (scientific, technical and personal)

  • Strong skills in Machine Learning from a theoretical and practical point of view
  • Mastery of algorithms and Python programming (scikit-learn, TensorFlow, PyTorch, etc.)
  • Interest in cybersecurity
  • Ability to read and understand scientific articles written in English
  • Autonomy, rigor, curiosity, dedication, persistence and initiative are highly recommended qualities for completing a doctorate


Educational background and diploma

  • Research Master's or engineering degree in computer science or applied mathematics, with a specialization in Data Science


Desired experiences

  • One or more research internships completed in the fields of Machine Learning and Deep Learning

additional information

Cybersecurity and artificial intelligence are both experiencing unprecedented global interest. These two fields are not only fascinating but also full of opportunities. The doctoral student will be able to explore a new and growing field of research in which the security of AI models is the central problem.


She/he will also have access to data from connected vehicles (V2X) to test her/his approaches. In addition, throughout the thesis, the candidate will benefit from a stimulating research environment.


References

[1] Microsoft Security Blog, “Cyberattacks Against Machine Learning Systems Are More Common Than You Think”, 2020, https://www.microsoft.com/en-us/security/blog/2020/10/22/cyberattacks-against-machine-learning-systems-are-more-common-than-you-think/
[2] ENISA, “Securing Machine Learning Algorithms”, 2021, https://www.enisa.europa.eu/publications/securing-machine-learning-algorithms
[3] Schwarzschild et al., “Just How Toxic Is Data Poisoning? A Unified Benchmark for Backdoor and Data Poisoning Attacks”, 2020
[4] McMahan et al., “Communication-Efficient Learning of Deep Networks from Decentralized Data”, 2016
[5] Bagdasaryan et al., “How To Backdoor Federated Learning”, 2018
[6] Fraboni et al., “Sequential Informed Federated Unlearning: Efficient and Provable Client Unlearning in Federated Optimization”, 2022
[7] Liu et al., “Federated Unlearning”, 2020
[8] Halimi et al., “Federated Unlearning: How to Efficiently Erase a Client in FL?”, 2022
[9] Chen et al., “Detecting Backdoor Attacks on Deep Neural Networks by Activation Clustering”, 2018


department

Orange Innovation brings together the research and innovation activities and expertise of the Group's entities and countries. We work every day to ensure that Orange is recognized as an innovative operator by its customers, and we create value for the Group and the Brand in each of our projects. With 720 researchers, thousands of marketers, developers, designers and data analysts, it is the expertise of our 6,000 employees that fuels this ambition every day.


Orange Innovation anticipates technological breakthroughs and supports the Group's countries and entities in making the best technological choices to meet the needs of our consumer and business customers.


Within Orange Innovation, you will join the MORE (Mathematical Models for Optimization and peRformance Evaluation) team which has around ten permanent engineers/researchers whose mission is to develop models and techniques to optimize the quality and performance of Orange group services. The team is also made up of around ten doctoral students, apprentices, and interns. You will thus be part of a research ecosystem alongside operational units, with the aim of developing algorithms at the cutting edge of innovation.

contract

Thesis

Only your skills matter

Regardless of your age, gender, origin, religion, sexual orientation, neurodivergence, disability or appearance, we encourage diversity within our teams because it is a strength for the collective and a vector of innovation. The Orange Group is a disability-friendly company: don't hesitate to tell us about your specific needs.
