International Conference on Learning Representations, Vienna, 2024
Abstract: We present a framework to define a large class of neural networks for which, by construction, training by gradient flow provably reaches arbitrarily low loss when the number of parameters grows. Distinct from the fixed-space global optimality of non-convex optimization, this new form of convergence, and the techniques introduced to prove such convergence, pave the way for a usable deep learning convergence theory in the near future, without overparameterization assumptions relating the number of parameters and training samples. We define these architectures from a simple computation graph and a mechanism to lift it, thus increasing the number of parameters, generalizing the idea of increasing the widths of multi-layer perceptrons. We show that architectures similar to most common deep learning models are present in this class, obtained by sparsifying the weight tensors of usual architectures at initialization. Leveraging tools of algebraic topology and random graph theory, we use the computation graph’s geometry to propagate properties guaranteeing convergence to any precision for these large sparse models.
Full text : [ OpenReview ]
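As a toy illustration of one ingredient mentioned in the abstract, a layer's weight tensor can be sparsified at initialization with a fixed random mask; this sketch is only illustrative, and the paper's actual construction (lifting a computation graph) is more structured than a uniform random mask. The class name `SparseLinear` and the `density` parameter are assumptions made for this example.

```python
# Toy sketch: sparsify a layer's weights at initialization with a fixed random
# mask. Illustrative only; the paper's graph-lifting construction is more
# structured than a uniform random mask.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseLinear(nn.Module):
    def __init__(self, d_in, d_out, density=0.05):
        super().__init__()
        self.linear = nn.Linear(d_in, d_out)
        # fixed sparsity pattern, drawn once at initialization
        self.register_buffer("mask", (torch.rand(d_out, d_in) < density).float())

    def forward(self, x):
        return F.linear(x, self.linear.weight * self.mask, self.linear.bias)

layer = SparseLinear(512, 512)
x = torch.randn(8, 512)
print(layer(x).shape)  # torch.Size([8, 512])
```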
Neural Information Processing Systems, New Orleans, 2022
Abstract: We present a new strategy to prove the convergence of Deep Learning architectures to a zero training (or even testing) loss by gradient flow. Our analysis is centered on the notion of Rayleigh quotients in order to prove Kurdyka-Lojasiewicz inequalities for a broader set of neural network architectures and loss functions. We show that Rayleigh quotients provide a unified view for several convergence analysis techniques in the literature. Our strategy produces a proof of convergence for various examples of parametric learning. In particular, our analysis does not require the number of parameters to tend to infinity, nor the number of samples to be finite, thus extending to test loss minimization and beyond the over-parameterized regime.
Full text & code : [ OpenReview ] [ Github ]
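As a simplified illustration of the proof strategy sketched in the abstract above (not the paper's exact statements): for gradient flow, a uniform lower bound on the gradient-to-loss quotient, a Rayleigh-quotient-like quantity, yields exponential decay of the loss.

```latex
% Simplified sketch of the convergence mechanism; the paper's actual
% Kurdyka-Lojasiewicz inequalities and Rayleigh-quotient bounds are more general.
\[
\dot{\theta}(t) = -\nabla L(\theta(t))
\quad\Longrightarrow\quad
\frac{d}{dt}\, L(\theta(t)) = -\,\|\nabla L(\theta(t))\|^2 .
\]
\[
\text{If } \frac{\|\nabla L(\theta)\|^2}{L(\theta)} \;\ge\; c \;>\; 0
\text{ along the trajectory, then } L(\theta(t)) \;\le\; L(\theta(0))\, e^{-c t}.
\]
```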
Neural Information Processing Systems, NeurReps Workshop, New Orleans, 2022
Abstract: We consider the problem of learning a periodic one-dimensional signal with neural networks, and designing models that are able to extrapolate the signal well beyond the training window. First, we show that multi-layer perceptrons with ReLU activations are provably unable to perform this task, and lead to poor performance in practice even close to the training window. Then, we propose a novel architecture using sine activation functions along with a well-chosen non-convex regularization, that is able to extrapolate the signal with low error well beyond the training window. Our architecture is several orders of magnitude better than its competitors for distant extrapolation (beyond 100 periods of the signal), while being able to accurately recover the frequency spectrum of the signal in a multi-tone setting.
Full text & code : [ OpenReview ] [ Github ]
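A minimal sketch of the kind of model described above, assuming a single sine-activated hidden layer and a hypothetical log-type penalty on the first-layer weights to encourage few active frequencies; the paper's exact architecture and non-convex regularizer may differ.

```python
# Minimal sketch (PyTorch): sine activations for periodic extrapolation,
# with a hypothetical log-type sparsity penalty on the frequency weights.
import torch
import torch.nn as nn

class SineLayer(nn.Module):
    def __init__(self, in_features, out_features):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)

    def forward(self, x):
        return torch.sin(self.linear(x))

model = nn.Sequential(SineLayer(1, 64), nn.Linear(64, 1))

def frequency_penalty(model, eps=1e-3):
    # hypothetical non-convex (log-type) penalty encouraging few active frequencies
    w = model[0].linear.weight
    return torch.sum(torch.log1p(torch.abs(w) / eps))

x = torch.linspace(0, 1, 256).unsqueeze(1)        # training window
y = torch.sin(2 * torch.pi * 5.0 * x)             # toy single-tone signal
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(2000):
    opt.zero_grad()
    loss = ((model(x) - y) ** 2).mean() + 1e-4 * frequency_penalty(model)
    loss.backward()
    opt.step()

x_far = torch.linspace(10, 11, 256).unsqueeze(1)  # well beyond the training window
extrapolation = model(x_far)
```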
WO 2020/236976 A1, filed with Interdigital, Palo Alto (CA)
Nov 2020 (WIPO)
Patent for the neural-network compression algorithm developed during the internship with Technicolor AI Lab / Interdigital. Layer-wise weight compression is recast as a sequence of convex activation-reconstruction problems, preserving the output of the deep neural network while reducing its size, both in memory and on disk.
Google Patent page : [ summary ] [ pdf ]
Laboratoire de Mathématiques LMO, Orsay
Oct 2020 - Jun 2021
Research internship with Lénaïc Chizat (CNRS) on the implicit bias induced by the gradient descent algorithm on two-layer neural networks. Characterized the continuous limit point as the Bregman projection, under the hyperbolic entropy potential, of the initialization weights onto the set of zero-loss weights, with linear convergence speed under some technical assumptions.
Internship report : [ pdf ]
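A sketch of this characterization, writing the hyperbolic entropy potential in one common normalization; the report's exact constants and technical assumptions may differ.

```latex
% Hyperbolic entropy potential (one common normalization) and the resulting
% Bregman-projection characterization of the gradient-flow limit point.
\[
\phi_\beta(w) \;=\; \sum_i \Big( w_i \operatorname{arcsinh}\!\big(w_i/\beta\big) \;-\; \sqrt{w_i^2 + \beta^2} \Big),
\qquad
D_{\phi_\beta}(x, y) \;=\; \phi_\beta(x) - \phi_\beta(y) - \langle \nabla \phi_\beta(y),\, x - y \rangle .
\]
\[
\theta_\infty \;=\; \operatorname*{arg\,min}_{\theta \,:\, L(\theta) = 0} \; D_{\phi_\beta}(\theta, \theta_0).
\]
```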
Upstride SAS, Station F, Paris
Feb - Aug 2020
Research internship with Wilder Lopes exploring the computational efficiency of variational auto-encoders defined over Clifford algebras. Experimentally demonstrated the superior reconstruction performance of networks leveraging higher-dimensional algebras on small images.
Technicolor AI Lab (acquired by Interdigital), San Francisco (CA)
Feb - Aug 2019
Research internship with Swayambhoo Jain on compression of neural networks. Developed a fast compression method able to cut up to 90% of weights with no drop in accuracy by casting layerwise compression as a series of convex activation reconstruction problems.
Internship report : [ html ] [ pdf ] [ slides ] [ patent ]
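A minimal sketch of the underlying idea, assuming a single fully-connected layer and an ISTA solver for a lasso-style objective; this illustrates activation reconstruction in general, not the patented algorithm itself.

```python
# Minimal sketch of layer-wise compression as a convex activation-reconstruction
# problem: find sparse weights W_s whose outputs on sample activations X match
# those of the original dense weights W (lasso-style objective, solved by ISTA).
import numpy as np

def compress_layer(X, W, lam=0.1, n_iter=500):
    """X: (n, d_in) sample activations; W: (d_in, d_out) original dense weights."""
    Y = X @ W                                    # reference outputs to preserve
    step = 1.0 / (np.linalg.norm(X, 2) ** 2)     # 1 / Lipschitz constant of the gradient
    W_s = np.zeros_like(W)
    for _ in range(n_iter):
        grad = X.T @ (X @ W_s - Y)               # gradient of 0.5 * ||X W_s - Y||_F^2
        Z = W_s - step * grad
        W_s = np.sign(Z) * np.maximum(np.abs(Z) - step * lam, 0.0)   # soft-thresholding
    return W_s

rng = np.random.default_rng(0)
X = rng.normal(size=(1024, 128))    # sample activations entering the layer
W = rng.normal(size=(128, 64))      # original dense weights
W_sparse = compress_layer(X, W, lam=5.0)
print("fraction of zero weights:", np.mean(W_sparse == 0.0))
```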
Massachusetts Institute of Technology, Boston (MA)
Jun - Aug 2018
Research internship with Philippe Rigollet (MIT) on the reconstruction of cellular trajectories in gene-expression space with optimal transport. The resulting toolkit for single-cell RNA-sequencing time-series analysis is open source and available as a Python package.
Waddington Optimal Transport : broadinstitute/wot (has since diverged)
Internship report (in French) : [ html ] [ pdf ] [ slides ]
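A minimal sketch of the core computation, assuming the POT library and entropy-regularized transport between synthetic expression profiles; the actual wot package exposes a higher-level API on top of this idea.

```python
# Minimal sketch: couple cell populations at consecutive time points with
# entropy-regularized optimal transport. Uses POT for illustration only.
import numpy as np
import ot  # POT: Python Optimal Transport

rng = np.random.default_rng(0)
cells_t0 = rng.normal(size=(200, 50))    # expression profiles at time t0
cells_t1 = rng.normal(size=(300, 50))    # expression profiles at time t1

a = np.full(len(cells_t0), 1.0 / len(cells_t0))   # uniform cell weights
b = np.full(len(cells_t1), 1.0 / len(cells_t1))
M = ot.dist(cells_t0, cells_t1)                   # squared Euclidean cost matrix
M = M / M.max()                                   # normalize for numerical stability
coupling = ot.sinkhorn(a, b, M, reg=0.05)         # entropy-regularized transport plan
# coupling[i, j] is the mass sent from cell i at t0 to cell j at t1,
# read as a (soft) ancestor-descendant relationship
```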
Deep Learning (MAP583) course by Kevin Scaman (INRIA - ENS), École Polytechnique
Practical introduction to deep learning and its implementation details, with broad coverage of data domains and network architectures.
Resources : [ Synapses page ] [ Practicals repository ] [ Custom python package ]
Deep Learning course by Marc Lelarge (INRIA - ENS), ENS Paris
Introduction to neural-network compression concepts and recent results, with a focus on activation reconstruction and an accompanying practical session.
Resources : [ Lecture slides ] [ Practical Session ] [ Practical Session Solution ]
Oct 2021 - present
INRIA - ENS, Paris. DYOGENE Project-team
Advised by Marc Lelarge and Kevin Scaman
Reparameterizations of deep neural networks for structured data with symmetries.
Final year of the ENS curriculum
École Normale Supérieure, Paris, 2020-2021
Additional advanced courses on stochastic processes and algebraic geometry.
Mathématiques, Vision & Apprentissage (MVA)
École Normale Supérieure, Paris, 2018-2020
Advanced mathematics and computer science, focused on Machine Learning
Coursework includes:
École Normale Supérieure, Paris, 2017-2018
Solid foundation in modern mathematics and computer science.
Coursework includes:
Lycée Louis-le-Grand, Paris, 2015-2017
Post-secondary program in advanced mathematics and physics, leading to the nationwide entrance examinations to the Grandes Écoles for scientific studies.
Lycée Hoche, Versailles, 2015
French equivalent of A-levels (Baccalauréat)
Awarded with highest honours
ACM Asia Conference on Computer and Communications Security, Taipei, Taiwan, 2020
Abstract: We provide the first analysis on the feasibility of Return-Oriented Programming (ROP) on RISC-V, a new instruction set architecture targeting embedded systems. We show the existence of a new class of gadgets, using several Linear Code Sequences And Jumps (LCSAJ), undetected by current Galileo-based ROP gadget searching tools. We argue that this class of gadgets is rich enough on RISC-V to mount complex ROP attacks, bypassing traditional mitigations like DEP, ASLR, stack canaries, G-Free and some compiler-based backward-edge CFI, by jumping over any guard inserted by a compiler to protect indirect jump instructions. We provide examples of such gadgets, as well as a proof-of-concept ROP chain, using C code injection to leverage a privilege escalation attack on two standard Linux operating systems. Additionally, we discuss some of the required mitigations to prevent such attacks and provide a new ROP gadget finder algorithm that handles this new class of gadgets.
Full text : [ ACM Link ] [ ArXiv ]
UNIX-like 64-bit micro-kernel for the Raspberry Pi 3, with MMU handling, dynamic memory allocation, hardware interrupts, multi-processing, and a basic filesystem (written before Linux even implemented 64-bit support for this board)
Source code available on github: robindar/sysres-os
Small SMT solver for equality theory decision procedures.
Implements DPLL with two-watched literals, and is fully unit-tested.
Source code available on github: robindar/semver-smt
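A toy sketch of plain DPLL with naive unit propagation on CNF clauses encoded as lists of integer literals; the actual solver additionally implements the two-watched-literals scheme and the equality-theory decision procedures, and is a separate codebase from this illustration.

```python
# Toy DPLL sketch with naive unit propagation. Clauses are lists of non-zero
# integer literals, e.g. [[1, -2], [2, 3]] means (x1 or not x2) and (x2 or x3).
def dpll(clauses, assignment=None):
    assignment = dict(assignment or {})
    # unit propagation
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            if any(assignment.get(abs(l)) == (l > 0) for l in clause):
                continue                          # clause already satisfied
            unassigned = [l for l in clause if abs(l) not in assignment]
            if not unassigned:
                return None                       # conflict: clause falsified
            if len(unassigned) == 1:
                l = unassigned[0]
                assignment[abs(l)] = (l > 0)      # forced assignment
                changed = True
    # branch on an unassigned variable
    variables = {abs(l) for c in clauses for l in c} - set(assignment)
    if not variables:
        return assignment                         # all clauses satisfied
    v = min(variables)
    for value in (True, False):
        result = dpll(clauses, {**assignment, v: value})
        if result is not None:
            return result
    return None

print(dpll([[1, -2], [2, 3], [-1, -3]]))          # e.g. {1: True, 3: False, 2: True}
```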
Compiler for a small (yet Turing-complete) subset of Rust.
Borrow-checked and compiled down to x86 assembly.
Source code available on github: robindar/compil-petitrust
"RISC V"-style basic processor emulator in Minijazz (Netlist superset) and Minijazz-to-C compiler. Supports few instructions but has a good build system and is unit-tested
Source code available on gitlab: alpr-sysdig/processor
School project (TIPE)
Genetic algorithm to find good solutions to the Traveling Salesman Problem, with a testing harness around it to optimize meta-parameters such as population size, mutation probability, or crossover method.
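A toy sketch of the approach, assuming order crossover and swap mutation; the project's actual operators and meta-parameter testing harness differ from this illustration.

```python
# Toy genetic algorithm for the Traveling Salesman Problem: elitist selection,
# order crossover, swap mutation. The meta-parameters (pop_size, mutation_prob,
# crossover method) are the quantities the testing harness was built to tune.
import random

def tour_length(tour, dist):
    return sum(dist[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))

def order_crossover(p1, p2):
    a, b = sorted(random.sample(range(len(p1)), 2))
    child = [None] * len(p1)
    child[a:b] = p1[a:b]                              # keep a slice of parent 1
    fill = [c for c in p2 if c not in child]          # complete with parent 2's order
    for i in range(len(child)):
        if child[i] is None:
            child[i] = fill.pop(0)
    return child

def solve(dist, pop_size=100, generations=500, mutation_prob=0.1):
    n = len(dist)
    population = [random.sample(range(n), n) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda t: tour_length(t, dist))
        parents = population[: pop_size // 2]         # keep the best half
        children = []
        while len(children) < pop_size - len(parents):
            child = order_crossover(*random.sample(parents, 2))
            if random.random() < mutation_prob:       # swap mutation
                i, j = random.sample(range(n), 2)
                child[i], child[j] = child[j], child[i]
            children.append(child)
        population = parents + children
    return min(population, key=lambda t: tour_length(t, dist))
```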
https://www.robindar.com
Headless Debian server used to practice web design and server administration
Also acts as a personal Git server and occasional blog
If you have a project you want to get started, think you need my help with something, or just fancy saying hi, send me a message. I'm always happy to help!
Message Me