PhD thesis: Explaining “black box” AI algorithms through their training examples

ref: 2025-42577 | 12 Mar 2025

apply before: 30 Sep 2025

  • 44-46 Avenue de la République, 92320 CHATILLON - France

about the role

Global context

Recent advances in machine learning have led to new AI applications that promise to automate more tasks, improving operational efficiency or relieving professionals of less interesting work. These AI systems often rely on over-parameterized models (deep learning) whose parameters are learned from vast training datasets. As a result, they are often described as 'opaque': users and data scientists cannot precisely understand the process that leads to a decision once the system is in production. Meanwhile, regulators (AI Act), business users, and end users are demanding more transparency, creating a need for explainability of these algorithms.
In recent years, many explainability methods have emerged for various data modalities, each with its advantages and disadvantages. Among them, example-based explainability methods identify the training examples that influence an AI's decisions [1]. This provides a meaningful explanation for non-experts, similar to how people reason: 'I think it is XXX because I have seen a similar case in the past.' These methods have recently been applied, for instance, to state-of-the-art language models [2].
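
To make the idea of "training examples that influence a decision" concrete, here is a minimal sketch based on leave-one-out retraining: the model is refit without each training example in turn, and the change in loss on one test point is recorded. This is only an illustration under strong assumptions (a small ridge regression, synthetic data, hypothetical function names `fit_ridge` and `loo_influence`); it is not the method of the references, which rely on approximations that avoid retraining.

```python
import numpy as np


def fit_ridge(X, y, reg=1e-3):
    """Closed-form ridge regression: solve (X^T X + reg * I) w = X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + reg * np.eye(d), X.T @ y)


def loo_influence(X_train, y_train, x_test, y_test, reg=1e-3):
    """Leave-one-out influence of each training example on one test loss.

    scores[i] = loss(model trained without i) - loss(model trained on all).
    Negative score: removing example i lowers the test loss, so i was harmful.
    """
    w_full = fit_ridge(X_train, y_train, reg)
    base_loss = (x_test @ w_full - y_test) ** 2
    n = len(X_train)
    scores = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        w_without_i = fit_ridge(X_train[keep], y_train[keep], reg)
        scores[i] = (x_test @ w_without_i - y_test) ** 2 - base_loss
    return scores


# Synthetic data (hypothetical): linear target with one corrupted label.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + 0.1 * rng.normal(size=30)
y[7] += 5.0  # deliberately corrupt training example 7

x_test = rng.normal(size=3)
y_test = x_test @ w_true

scores = loo_influence(X, y, x_test, y_test)
ranking = np.argsort(scores)  # most harmful examples first
print("most harmful training examples:", ranking[:3])
print("most helpful training examples:", ranking[-3:][::-1])
```

One would expect the corrupted example to rank among the most harmful, although this depends on the test point chosen. Retraining once per training example is only feasible for tiny models, which is why scalable approximations are needed in practice.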


Scientific goal

The thesis work will consist of a literature review of the theoretical, algorithmic, and practical advances in these methods in order to derive new research avenues, and of proposing new methods together with the supervisors (at Orange Innovation and academic institutions). We have identified a number of challenges to address, along with references that tackle these issues:

  • Scaling up methods that can be resource-intensive (memory, algorithmic complexity), by building on recent approximation algorithms and proposing new ones [3].
  • The formalization of how to appropriately assign importance to each training example when the prediction made by the model relies on multiple training examples [4].
  • The identification of groups of examples that lead to undesirable behaviors, such as unfair predictions towards certain groups of instances, mislabeled examples, or anomalies [5].
  • The usability of these explanations for real users, rather than just data science experts. Here, there will be an opportunity to work with expert ergonomists at Innovation on how technical objects are received in working conditions [6].
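
As an illustration of the second challenge (sharing importance among several training examples), the sketch below estimates Shapley values of training points by plain permutation sampling. It is a simplified stand-in for the approach of [4], not a reproduction of it: the utility is assumed to be the validation accuracy of a 1-nearest-neighbour classifier, an empty training set is assumed to have utility 0, and the names `utility` and `mc_shapley` as well as the synthetic data are hypothetical.

```python
import numpy as np


def utility(idx, X_tr, y_tr, X_val, y_val):
    """Validation accuracy of a 1-nearest-neighbour classifier trained on subset idx."""
    if len(idx) == 0:
        return 0.0  # assumption: an empty training set has zero utility
    # Squared distances between every validation point and every kept training point.
    d = ((X_val[:, None, :] - X_tr[idx][None, :, :]) ** 2).sum(axis=2)
    pred = y_tr[idx][d.argmin(axis=1)]
    return float((pred == y_val).mean())


def mc_shapley(X_tr, y_tr, X_val, y_val, n_perm=50, seed=0):
    """Monte Carlo Shapley estimate: average marginal gain over random permutations."""
    rng = np.random.default_rng(seed)
    n = len(X_tr)
    values = np.zeros(n)
    for _ in range(n_perm):
        perm = rng.permutation(n)
        prev = utility(perm[:0], X_tr, y_tr, X_val, y_val)
        for k in range(n):
            cur = utility(perm[: k + 1], X_tr, y_tr, X_val, y_val)
            values[perm[k]] += cur - prev  # marginal contribution of perm[k]
            prev = cur
    return values / n_perm


# Synthetic two-class data (hypothetical), with one flipped training label.
rng = np.random.default_rng(0)
X_tr = np.vstack([rng.normal(-1, 1, (10, 2)), rng.normal(1, 1, (10, 2))])
y_tr = np.array([0] * 10 + [1] * 10)
y_tr[3] = 1  # deliberately mislabel training example 3
X_val = np.vstack([rng.normal(-1, 1, (20, 2)), rng.normal(1, 1, (20, 2))])
y_val = np.array([0] * 20 + [1] * 20)

values = mc_shapley(X_tr, y_tr, X_val, y_val)
print("lowest-valued training examples:", np.argsort(values)[:3])
```

By construction, the estimated values sum to the utility of the full training set minus that of the empty set, so the total performance is shared among the examples. Each permutation requires one utility evaluation per training point, which hints at the scaling issue raised in the first challenge.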


During the thesis, the candidate will also be expected to publish their new scientific contributions and present them at conferences, as well as at internal seminars and at the host university.

about you

Technical and scientific skills and personal qualities required for the position

  • Excellent level in mathematics/statistics
  • Programming (Python, Scikit-learn, PyTorch); having a GitHub repository with projects is a plus
  • Ability to solve complex problems and be creative in proposing innovative solutions
  • Ability to integrate into a team
  • Ideally, sensitivity to issues related to ethical AI: fairness, transparency, frugal AI
  • Writing skills
  • Synthesis and communication skills.


Required Education

  • MSc with a strong component in machine learning


Desired Experience

  • Internship with a model development component in machine learning
  • Ideally, demonstrated research experience.

additional information

The thesis topic is highly promising due to its applications in new AI technologies, while the supervisors bring an original vision and specific expertise (papers published at NeurIPS, ECML, TMLR, AISTATS, ...) that will enable the candidate to quickly identify research avenues.

The Data & AI division of Orange Innovation oversees AI applications for all entities within Orange, providing a comprehensive view of concrete uses in the industry. Research at Orange is distributed across various teams (network, marketing, services, etc.), offering enriching interactions with different fields. Researchers, spread across different sites, collaborate regularly, and internal events at the national level allow different sites to share their results and research directions.

 

References

[1] https://ml-data-tutorial.org/

[2] Grosse, R., et al. "Studying large language model generalization with influence functions." arXiv 2023

[3] George, T., et al. "Fast approximate natural gradient descent in a Kronecker-factored eigenbasis." NeurIPS 2018

[4] Ghorbani, A., & Zou, J. "Data Shapley: Equitable valuation of data for machine learning." ICML 2019

[5] Black, E., & Fredrikson, M. "Leave-one-out unfairness." FAccT 2021

[6] https://hellofuture.orange.com/en/for-a-contextual-approach-to-explainability/

department

Orange Innovation brings together the research and innovation activities and expertise of the Group's entities and countries. We work every day to ensure that Orange is recognized as an innovative operator by its customers and we create value for the Group and the Brand in each of our projects. With 720 researchers, thousands of marketers, developers, designers and data analysts, it is the expertise of our 6,000 employees that fuels this ambition every day.

Orange Innovation anticipates technological breakthroughs and supports the Group's countries and entities in making the best technological choices to meet the needs of our consumer and business customers.

 
Within Innovation, you will join a cutting-edge research team focused on innovation and expertise in artificial intelligence and its industrial applications, including maintenance, supervision, and marketing. You will be part of a research ecosystem that works alongside delivery engineers (short-term), which enables the concepts studied to be implemented concretely in real use cases.

contract

Thesis

Only your skills matter

Regardless of your age, gender, origin, religion, sexual orientation, neurodivergence, disability or appearance, we encourage diversity within our teams because it is a strength for the collective and a vector of innovation. Orange Group is a disabled-friendly company: don't hesitate to tell us about your specific needs.

91% of our employees are proud to work for Orange

87% recommend Orange as a good place to work

4.21/5 is the candidate experience rating in France, in the category of companies with over 1,000 employees

Since 2011, Orange has held GEEIS (Gender Equality European & International Standard) certification in some twenty countries