Publications | Foundations of Responsible Machine Learning @ UCPH

Language Generation with Replay: A Learning-Theoretic View of Model Collapse

Giorgio Racca, Michal Valko, Amartya Sanyal

Preprint (2026)

arXiv

Less Noise, Same Certificate: Retain Sensitivity for Unlearning

Carolin Heinzler, Kasra Malihi, Amartya Sanyal

Preprint (2026)

arXiv

Adaptive Sampling for Private Worst-Case Group Optimization

Max Cairney-Leeming, Amartya Sanyal, Christoph H. Lampert

Preprint (2026)

arXiv

LoRA and Privacy: When Random Projections Help (and When They Don't)

Yaxi Hu, Johanna Düngler, Bernhard Schölkopf, Amartya Sanyal

Preprint (2026)

arXiv

Delta-Influence: Unlearning Poisons via Influence Functions

Wenjie Li, Jiawei Li, Christian Schroeder de Witt, Ameya Prabhu, Amartya Sanyal

TMLR Transactions on Machine Learning Research (2025)

arXiv

Learning in an Echo Chamber: Online Learning with Replay Adversary

Daniil Dmitriev, Harald Eskelund Franck, Carolin Heinzler, Amartya Sanyal

SODA Symposium on Discrete Algorithms (2025)

arXiv

Fairness for the People, by the People: Minority Collective Action

Omri Ben‑Dov, Samira Samadi, Amartya Sanyal, Alexandru Ţifrea

arXiv Preprint (2025)

arXiv

An Iterative Algorithm for Differentially Private k-PCA with Adaptive Noise

Johanna Düngler, Amartya Sanyal

NeurIPS Conference on Neural Information Processing Systems (2025)

arXiv

Provable Unlearning in Topic Modeling and Downstream Tasks

Stanley Wei, Sadhika Malladi, Sanjeev Arora, Amartya Sanyal

ICLR International Conference on Learning Representations (2025)

arXiv

Differentially Private Steering for Large Language Model Alignment

Anmol Goel, Yaxi Hu, Iryna Gurevych, Amartya Sanyal

ICLR International Conference on Learning Representations (2025)

arXiv

Protecting Against Simultaneous Data Poisoning Attacks

Neel Alex, Shoaib Ahmed Siddiqui, Amartya Sanyal, David Krueger

ICLR International Conference on Learning Representations (2025)

arXiv

Accuracy on the Wrong Line: On the Pitfalls of Noisy Data for Out-of-Distribution Generalisation

Amartya Sanyal, Yaxi Hu, Yaodong Yu, Yian Ma, Yixin Wang, Bernhard Schölkopf

AISTATS Artificial Intelligence and Statistics (2025)

arXiv

Online Learning and Unlearning

Yaxi Hu, Bernhard Schölkopf, Amartya Sanyal

Preprint (2025)

arXiv

Open Problems in Machine Unlearning for AI Safety

Fazl Barez, Tingchen Fu, Ameya Prabhu, Stephen Casper, Amartya Sanyal, Adel Bibi, Aidan O'Gara, Robert Kirk, Ben Bucknall, Tim Fist, et al.

Preprint (2025)

arXiv

Robust Mixture Learning when Outliers Overwhelm Small Groups

Daniil Dmitriev, Rares‑Darius Buhai, Stefan Tiegel, Alexander Wolters, Gleb Novikov, Amartya Sanyal, David Steurer, Fanny Yang

NeurIPS Conference on Neural Information Processing Systems (2024)

arXiv

What Makes and Breaks Safety Fine-tuning? A Mechanistic Study

Samyak Jain, Ekdeep Singh Lubana, Kemal Oksuz, Tom Joy, Philip Torr, Amartya Sanyal, Puneet K. Dokania

NeurIPS Conference on Neural Information Processing Systems (2024)

arXiv

Provable Privacy with Non-Private Pre-Processing

Yaxi Hu, Amartya Sanyal, Bernhard Schölkopf

ICML International Conference on Machine Learning (2024)

arXiv

The Role of Learning Algorithms in Collective Action

Omri Ben‑Dov, Jake Fawkes, Samira Samadi, Amartya Sanyal

ICML International Conference on Machine Learning (2024)

arXiv

On the Growth of Mistakes in Differentially Private Online Learning: A Lower Bound Perspective

Daniil Dmitriev, Kristóf Szabó, Amartya Sanyal

COLT Conference on Learning Theory (2024)

arXiv

Corrective Machine Unlearning

Shashwat Goel, Ameya Prabhu, Philip Torr, Ponnurangam Kumaraguru, Amartya Sanyal

TMLR Transactions on Machine Learning Research (2024)

arXiv