Inria

Research projects

Real-life bandits (RELIANT)

  • Inria-Japan Associate Team « Real-life bandits » (RELIANT), Principal Investigator with Junya Honda, 2022-2024. Link

Chaire IA « Apprentissage par renforcement » (AppRenf)

  • Chaire IA « Apprentissage par renforcement » (AppRenf) project (R-PILOTE-19-004-APPRENF), Fondation I-SITE ULNE within the project PILOTE from cluster HumAIn@Lille, Principal Investigator with Philippe Preux, 400k€.

Data Collection for Sustainable Crop management (DC4SCM)

  • Inria-India Associate team « Data collection for Sustainable Crop management » (DC4SCM), Investigator with Bihar Agriculture University, Inria, 2020-2022.

Other projects

  • Participant of project Pl@ntAgroEco from PEPR Agroécologie et Numérique
  • Participant of project Foundry from PEPR Foundry
  • Participant of ANR Bip-up, Bandits for Patient Follow-up
  • Participant of ANR EPI-RL, Epistemic Reinforcement Learning.
  • Participant of STIC-AmSud project EMISTRAL « Environmental Monitoring and Inspection Sailboat via Transfer, Reinforcement and Autonomous Learning », January 2021, 24 months, 47k€.
  • Participant of I-Site Expand2 project B4H « Bandit for Health », 2019, 150k€

Software projects

FarmGym

  • A gamified simulator of agrosystems, able to generate a diversity of farming challenges. Principal Investigator and Lead developer. (Follow-up of AEx SR4SG) Link

WeG@rden

  • A collaborative platform for data collection and recommendation of agroecological practices. Principal Investigator. (Follow-up of AEx SR4SG). See the presentation page of the project. See also this Inria article.

Statistical Reinforcement Learning

  • A Python library with several Gymnasium-compatible reinforcement learning algorithms and environments. Principal Investigator and Lead developer. Link

Past research projects

Sequential Recommendation for Sustainable Gardening (SR4SG)

  • Action Exploratoire: Sequential Recommendation for Sustainable Gardening (SR4SG), Principal Investigator, September 2019, 48 months, 165k€. Link

BAnDits Against non-Stationarity and Structure (BADASS)

  • ANR JCJC BAnDits Against non-Stationarity and Structure (BADASS), Principal investigator. Other participants: Emilie Kaufmann and Richard Combes, November 2016, 42 months, 180k€. Link

Smaller projects

  • co-PI of « RLRL: Real Life Reinforcement Learning » project, with Benoîte de Saporta (Université Montpellier, IMAG), 12 months.
  • PI of GdT digicosmes: Sequential Structured Statistical Learning (SSSL), with Richard Combes and Kinda Khawam, September 2016, 12 months: 2k€.
  • co-PI of IFCAM project: Contextual multi-armed bandits with hidden structure, with Aditya Gopalan. September 2016, 18 months: 7.5k€.
  • PI of Inria Carnot: Forecasting in hydraulic networks, with Olivier Teytaud and Marc Schoenauer. October 2015, 12 months: 12 months engineer.
  • co-PI of PEPS JCJC: Promo, with Rémi Bardenet. May 2015, 6 months: 7k€.

Workshop organization

  • 2019 RLSS – Reinforcement Learning Summer School.
  • 2018 ANR BADASS Lecture Series (Invited lecturer: Peter Grünwald)
  • 2018 EWRL workshop – European Workshop on Reinforcement Learning.
  • 2017 Workshop – Sequential Structured Statistical Learning.
  • 2014 NIPS Workshop – “From Bad Models to Good Policies” (Sequential Decision Making under Uncertainty)

Other interests

As every researcher knows, there is generally a gap between what we know and have mastered, what we are interested in and would like to do, and what finally appears, sparsely, in our published papers. Here I list some topics, keywords and questions that I would love to work on but do not have time for, being too busy with another exciting project. Feel free to get hooked by some of them. If they inspire you, please go ahead and work on them. From e-learning to permaculture and the circular economy, this section tries to embrace the potential of sequential decision making for shaping our future societies.

  • Computational permafarming: Given a farm with plants that interact strongly and share resources, the goal is to decide which action to perform (which plant to plant or move) in order to maximize the resistance of the system to attacks from the weather, insects or diseases, while minimizing the external resources added to the system. Handling the strong dependency between the plants is a beautiful challenge. This falls directly within the scope of computational sustainability.
  • Robust hydraulic systems: Say you want to monitor the level of rivers in order to avoid flooding, or to prevent certain pollutants from reaching critical sites. An incoming rain, storm, or pollution event is seen as an attack, and the goal is to control the system in a robust way. Such a formalism may also apply to the management of electrical systems, when we further constrain the communications between control nodes.
  • E-Learning: The goal is to recommend a series of exercises to incoming students in order to maximize their learning level. Since each learner improves at each time step, the learning progress of a student is a non-stationary reward signal. Handling such a signal in recommender systems is a great challenge.
  • Job recommendations: Here the goal is to match job announcements to people looking for jobs. The task is fairly close to news article recommendation, and involves some natural language processing. Obviously, the potential impact on unemployment is exciting.
  • Urban recommendations: Suppose you display the activity of a set of town councils on a web platform and record feedback from citizens. Based on this data, you now want to help a council make better decisions, by recommending projects that may work well and warning about possibly bad ones. One challenge here is handling long-term decisions.
  • Sustainable Economy: Consider a graph of economic agents exchanging resources. Here you want to price the additional amount of resources needed at the source nodes of the graph in order to increase the production at a certain node, and then recommend to a user which producer to favor when two producers output the same resource. Handling the network activity of many agents is especially challenging, and crucial for the development of a closed-loop economy.
  • Information reconstruction in resource networks: In this project, we study a large network of agents who produce, transfer and consume resources. Only transfers of resources can be observed, neither production nor consumption. Under some assumptions, such as that a production can only start once the agent has received the resources it needs, and that a transfer of resources systematically occurs when a production cannot start, the goal is to study to what extent it is possible to reconstruct the production and consumption information, with quantitative bounds, as well as the network of effective dependencies of a specific production.
  • Stable and self-moving structures on weakly-differentiable manifolds: Motivated by the loss of differentiability occurring in shocks between « particles », we study manifolds that are only weakly differentiable, with the corresponding extensions of the tangent space, geodesics, curvature, currents, etc. Also, for certain types of dynamical systems governed by a « flattening » dynamics, we study initial conditions that ensure the existence of stable and « self-moving » structures.
  • Co-articulation Optimization: Given a dynamical system (with known dynamics) and a finite set of landmark points as inputs, we want to compute, for each finite sequence of landmark points, an optimal interpolation path passing maximally close to the targeted landmark points in the given order while being maximally distinct from the other landmark points. Then, we want to do the same when the dynamics is unknown. This model naturally applies to the computation of the co-articulation complexity of words, each landmark point corresponding to the pronunciation parameters of one phoneme for a given speech apparatus. Then, based on a corpus of documents in some natural language, we can for instance compute the average co-articulation complexity of a language with respect to a given model of the speech apparatus. If we moreover have access to a grammar generator, we may generate a new natural language that minimizes the co-articulation complexity of the most frequent grammatical structures while ensuring that the phoneme distance (geodesic distance in the parametric model of the speech apparatus) between two grammatical structures increases with their co-occurrence frequency.
  • Associative memories with massive storage capabilities, maximal no-hallucination capacity and optimal reconstruction: An associative memory stores a signal in a (hyper-)graph by creating a (hyper-)clique structure, leading to a sparse and robust representation. Optimal reconstruction can be done using random matrices under some conditions. We want to investigate basic properties of associative memories with a hypergraph of a given size, such as how to avoid hallucination (either creation or reconstruction of a clique corresponding to no signal), what the maximal capacity of the memory is, and what the guarantees of optimal reconstruction are under the constraint of avoiding hallucinations.
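
To make the notion of hallucination in the last item concrete, here is a minimal sketch in Python. It is a deliberate simplification and not the model studied in the project: it uses an ordinary graph rather than a hypergraph, and the class name `CliqueMemory` and its methods are hypothetical names chosen for this illustration. Each stored signal is a set of nodes that becomes a clique, and a hallucination is a clique that was never stored.

```python
from itertools import combinations


class CliqueMemory:
    """Toy associative memory (illustrative only): each pattern is stored as a clique."""

    def __init__(self):
        self.edges = set()   # undirected edges, each a frozenset of two nodes
        self.stored = set()  # ground-truth patterns, kept only to detect hallucinations

    def store(self, pattern):
        """Store a pattern by adding every pairwise edge between its nodes."""
        pattern = frozenset(pattern)
        self.stored.add(pattern)
        for u, v in combinations(sorted(pattern), 2):
            self.edges.add(frozenset((u, v)))

    def recalls(self, pattern):
        """The memory recognizes a pattern when its nodes form a clique."""
        return all(
            frozenset((u, v)) in self.edges
            for u, v in combinations(sorted(pattern), 2)
        )

    def is_hallucination(self, pattern):
        """A recognized clique that corresponds to no stored signal."""
        return self.recalls(pattern) and frozenset(pattern) not in self.stored


# Three stored pairs jointly create a triangle that was never stored.
mem = CliqueMemory()
mem.store({"a", "b"})
mem.store({"b", "c"})
mem.store({"a", "c"})
print(mem.recalls({"a", "b"}))                 # True: a stored pattern
print(mem.is_hallucination({"a", "b", "c"}))   # True: a clique with no matching signal
```

In this toy version, any sub-pattern of a stored pattern also counts as a hallucination, since it forms a clique too. Hyperedges are one way to separate such cases, which is part of why the capacity question above is stated for hypergraphs.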