| Time | Monday – 15th June | Tuesday – 16th June |
|---|---|---|
| 9h00-9h45 | Nicola Toschi | |
| 9h45-10h30 | Simone Gallivanone | |
| 10h30-11h00 | coffee break | |
| 11h00-11h45 | Participants arrival | Paolo Zunino |
| 11h45-12h30 | Spotlight talks + Posters | |
| 12h30-14h00 | Light lunch | lunch |
| 14h00-14h30 | Welcome address | Elisabetta Rocchetti |
| 14h30-15h15 | Johannes Hertrich | Davide Badalotti |
| 15h15-16h00 | Enrico Ventura | Fedor Goncharov |
| 16h00-16h30 | coffee break | Lorenzo Baldassari |
| 16h30-16h45 | Davide Evangelista | |
| 16h45-17h15 | Roundtable with coffee break; Closure | |
| 17h15-18h00 | Raffaele Marchesi | participants departure |
| 18h00-18h30 | Georgios Arvanitidis | |
| 20h00 | Social Dinner | |
On the Relation between Rectified Flows and Optimal Transport.
Johannes Hertrich
We investigate the connections between rectified flows, flow matching, and optimal transport. Flow matching is a recent approach to learning generative models by estimating velocity fields that guide transformations from a source to a target distribution. Rectified flow matching aims to straighten the learned transport paths, yielding more direct flows between distributions. In this talk, we examine under which conditions limits of these iterations resemble optimal transport and present explicit counterexamples where this is not true. We pay particular attention to minibatch optimal transport, which reorders the batches of samples before they are used in the optimization.
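As an illustration of the reordering step mentioned in the abstract, here is a minimal sketch, not the speaker's method. It assumes a squared Euclidean cost and equal-size batches, and it finds the optimal pairing by brute force over permutations, which is only feasible for tiny batches. The function names `squared_distance` and `minibatch_ot_pairing` are hypothetical.

```python
import itertools


def squared_distance(x, y):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def minibatch_ot_pairing(source_batch, target_batch):
    """Reorder a target batch so that the total pairing cost is minimal.

    Returns (perm, cost), where source_batch[i] is paired with
    target_batch[perm[i]]. Brute force over all permutations: an
    illustrative assumption, suitable only for very small batches.
    """
    n = len(source_batch)
    if n != len(target_batch):
        raise ValueError("batches must have the same size")
    best_perm, best_cost = None, float("inf")
    for perm in itertools.permutations(range(n)):
        cost = sum(
            squared_distance(source_batch[i], target_batch[perm[i]])
            for i in range(n)
        )
        if cost < best_cost:
            best_perm, best_cost = perm, cost
    return list(best_perm), best_cost
```

The reordered pairs `(source_batch[i], target_batch[perm[i]])` would then replace the independent pairs in the flow matching loss. For realistic batch sizes, a linear assignment solver such as SciPy's `scipy.optimize.linear_sum_assignment` would replace the brute-force search.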
From Generalization to Memorization: Results & Open Questions in Generative Diffusion.
Enrico Ventura
Diffusion models have achieved state-of-the-art performance in generating images, videos, scientific data, and even text. Despite its broad practical success, neural-network-based generative diffusion is, in theory, condemned to memorize the training set rather than learn the underlying data-generating distribution. This talk will first present recent advances in understanding the surprising generalization capabilities of diffusion models. We will argue that, in this framework, generalization can be understood as an intermediate and necessary dynamical stage that precedes memorization. We will then discuss how diffusion models can progressively transition from generalizing to memorizing the data, highlighting a counterintuitive sampling behavior that arises in this still largely unexplored regime. The talk will present both theoretical insights and open questions on the generative capabilities of diffusion models, while fostering the quest for a new definition of generalization in generative modeling.
TBA
Nicola Toschi
Coherent cross-modal generation of biomedical data.
Raffaele Marchesi
Clinical datasets in multimodal healthcare are often sparse, fragmented, and constrained by high acquisition costs. Generative Artificial Intelligence (GenAI) promises to offer practical methods to address these limitations by synthesizing missing modalities, augmenting imbalanced datasets, and generating privacy-preserving synthetic patient profiles. Applying generative models to biological data requires systems capable of handling arbitrary combinations of available inputs while maintaining cross-modal coherence. Synthesizing missing clinical information involves navigating multidimensional health records while managing uncertainty. The objective is to ensure the generated data preserves the structural features and predictive signals present in the original patient profiles. In practice, cross-modal generation could be used to mitigate the performance degradation of downstream predictive models when patient records are incomplete. Additionally, generative inference and counterfactual analysis could support diagnostic planning. This approach could provide a quantitative framework for prioritizing clinical resources and diagnostic workflows, exploring how synthetic data might be applied to precision medicine tasks.
Geometric approximate inference for Bayesian neural networks
Georgios Arvanitidis
Uncertainty quantification in Bayesian deep learning is typically achieved by characterizing the posterior distribution of neural network weights given the observed data; however, exact inference is in general computationally intractable. At the same time, in the overparametrized regime, the parameter space exhibits nonlinear reparametrization invariances, implying that distinct parameter configurations correspond to the same function. Standard approximate inference methods that operate in parameter space do not explicitly account for this nonlinear structure, often leading to poor approximations of the true posterior, particularly along directions associated with these invariances. Recent approaches leverage geometric insights to construct approximations that better align with the intrinsic structure of the parameter space. These methods provide scalable alternatives that can yield better local approximations of the posterior. In this talk, we introduce the role of geometry in approximate Bayesian inference, present representative methods, and discuss future research directions.
Solving Inverse Problems with Diffusion-Based Posterior Sampling Algorithms
Davide Evangelista
Diffusion models have recently emerged as a powerful framework for image generation and restoration, with important applications in image reconstruction from noisy and corrupted measurements. Their success is closely related to the possibility of combining expressive learned priors with principled probabilistic sampling procedures. In this talk, I will first introduce the main ideas underlying diffusion-based generative models, briefly discussing both their discrete formulations, such as DDPM and DDIM, and their continuous-time interpretation through stochastic differential equations and probability-flow ODEs. Then, I will discuss one of the most promising families of methods for the solution of inverse problems using diffusion models: the Diffusion Posterior Sampling methods. These approaches integrate measurement consistency into the reverse diffusion process in order to sample from posterior distributions conditioned on the observed data. I will discuss the general structure of posterior samplers, their interpretation from a probabilistic and numerical viewpoint, and their application to image reconstruction. Finally, I will highlight some practical and theoretical limitations of current methods.
Attacking and Defending Models: A Mathematical Overview
Simone Gallivanone
Adversarial machine learning studies the vulnerability of models to crafted perturbations. In this talk, we present a mathematical overview of adversarial attacks and defences for modern machine learning models, discussing both classifiers and large language models. We discuss how these phenomena are related to decision boundary properties and model structure. We analyse defence mechanisms through the lens of robustness, regularization, and worst-case risk minimization.
The goal is to highlight the geometric and optimization principles that underlie both attacks and defences, providing a mathematical intuition for why such attacks exist and how to mitigate their effects.
Neural networks for dimensionality reduction of numerical approximation for PDEs
Paolo Zunino
Neural networks and learning algorithms have gained substantial attention among researchers engaged in scientific computing. Notably, there are well-established methodologies for employing these tools in solving mathematical models based on partial differential equations. Today, a significant overlap exists between the machine learning and computational science and engineering communities in the realm of data-driven reduced order models. We will review the main trends in this field, with particular focus on the role of autoencoders for nonlinear dimensionality reduction of the numerical approximation of parametrized PDEs.
This is joint work with Nicola Rares Franco and Andrea Manzoni.
Annealed Langevin Dynamics for Multimodal Sampling in Infinite Dimensions
Lorenzo Baldassari
Score-based diffusion generative models rely on two coupled ingredients: learned scores and sampling dynamics capable of transporting simple initial laws toward complex, often multimodal, target distributions. In many applications, these targets arise as finite-dimensional approximations of underlying function-space probability laws, making it crucial to understand whether the sampling dynamics remains stable as the approximation is refined. In this talk, we discuss this question for preconditioned annealed Langevin dynamics.
The goal is to study its stability when the underlying infinite-dimensional target is multimodal, a regime in which classical Langevin diffusion is known to suffer, especially as the dimension of the approximation increases. Annealing is designed to address this issue by starting from a smoothed version of the target and gradually removing the smoothing. For multimodal Gaussian-mixture targets, we identify spectral conditions linking the smoothing covariance, the covariances of the mixture components, and the preconditioner under which the continuous-time dynamics achieves a prescribed Kullback-Leibler accuracy within a dimension-uniform time horizon. We then discuss the effect of errors, including approximate scores, misspecified initialization, and time discretization.
The main message is that dimension-stable annealed sampling can still be guaranteed under suitable spectral conditions, even when the initial law used by the sampler diverges from the target as the dimension increases.
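The annealing idea described above (start from a smoothed target, then gradually remove the smoothing) can be sketched in one dimension as follows. This is a toy illustration under stated assumptions, not the preconditioned infinite-dimensional scheme of the talk: the two-component mixture, the smoothing schedule, the step-size rule, and all names (`MODES`, `COMPONENT_VAR`, `smoothed_score`, `annealed_langevin`) are hypothetical, and the score is computed analytically rather than learned.

```python
import math
import random

MODES = (-4.0, 4.0)    # hypothetical component means (equal weights)
COMPONENT_VAR = 0.25   # hypothetical variance of each component


def smoothed_score(x, smoothing_var):
    """Score of the mixture convolved with N(0, smoothing_var).

    Gaussian smoothing of a Gaussian mixture only adds smoothing_var
    to each component variance, so the score is available in closed form.
    """
    var = COMPONENT_VAR + smoothing_var
    log_w = [-(x - m) ** 2 / (2.0 * var) for m in MODES]
    top = max(log_w)  # subtract the maximum to avoid underflow
    w = [math.exp(v - top) for v in log_w]
    return sum(wk * (m - x) / var for wk, m in zip(w, MODES)) / sum(w)


def annealed_langevin(n_samples=200,
                      smoothing_vars=(16.0, 4.0, 1.0, 0.25, 0.0),
                      steps_per_level=200,
                      step_scale=0.05,
                      seed=0):
    """Unadjusted Langevin steps on a sequence of less and less smoothed targets."""
    rng = random.Random(seed)
    # Assumed initial law: a wide centred Gaussian.
    samples = [rng.gauss(0.0, math.sqrt(smoothing_vars[0]))
               for _ in range(n_samples)]
    for s in smoothing_vars:
        # Step size proportional to the current local variance (a heuristic).
        h = step_scale * (COMPONENT_VAR + s)
        noise_std = math.sqrt(2.0 * h)
        for _ in range(steps_per_level):
            samples = [x + h * smoothed_score(x, s) + rng.gauss(0.0, noise_std)
                       for x in samples]
    return samples
```

Calling `annealed_langevin()` returns a list of samples from the final, unsmoothed level. Running plain Langevin dynamics directly on the unsmoothed target from the same initialization would leave each sample in whichever well it first falls into, which is the difficulty the smoothing schedule is meant to ease.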
Representation Manifolds in Recurrent Neural Dynamics
Davide Badalotti
Recurrent neural networks define dynamical systems whose trajectories may converge to stable internal configurations. In this talk, we discuss a class of binary recurrent networks in which learning is formulated as the stabilization of such configurations under local plasticity rules.
The main object of interest is the representation manifold: a structured set of dynamically accessible fixed points reached by the network when inputs and targets are transiently clamped. Although the ambient configuration space may contain many fixed points, most of them are not relevant to the learned computation. The effective geometry is instead determined by accessibility, stability, and the way local updates organize the recurrent dynamics.
We will present the mathematical structure of this setting and relate it to nearby ideas in energy-based learning, predictive coding, equilibrium propagation, and associative memory. The connection with generative modeling will be discussed at the level of dynamical representations and learned attractor landscapes, rather than through a specific probabilistic generative model.
Functional Autoencoders with Sobolev spaces embeddings
Fedor Goncharov
In recent years, the functional, mesh-free modelling approach has attracted considerable attention in physics-inspired machine learning. Indeed, for many industrial applications and physics-related tasks, the signals of interest need to be sampled on varying grids. An advantage of functional generative models is that they do not require re-training when the sampling grid changes; however, accessing finer resolutions from coarsely sampled training data raises the question of the validity of the produced samples (i.e., the problem of super-resolution).
Though there has been considerable progress in this direction, it is still of interest to propose generative architectures that can be adapted to intrinsic properties of the data, e.g., its smoothness.
In this talk we present a novel autoencoder architecture in which the encoder and decoder are modeled as smoothing and sharpening PDO operators, respectively, between certain Sobolev spaces. I will discuss discretization and approximation errors from the theoretical side and present numerical experiments on various datasets (custom-made and PDE-inspired benchmarks).
Geometric Approaches to Interpretability
Elisabetta Rocchetti
Understanding what Transformer models perceive and why they behave as they do remains a fundamental open problem. Standard interpretability approaches often rely on input perturbations or gradient-based attribution, implicitly assuming that the model’s input space is flat and Euclidean. This assumption, however, breaks down in practice: neural networks progressively deform their representation spaces during training, meaning inputs that differ significantly in Euclidean terms may be treated as equivalent by the model, while small variations can induce large changes in output.
This talk presents a principled framework for exploring Transformer perception through Riemannian geometry. By modelling the network as a sequence of smooth maps between manifolds and pulling back the output metric, equivalence classes can be defined in input space, which are sets of inputs the model perceives as functionally identical. This construction yields two complementary exploration algorithms: one that navigates within an equivalence class, and one that crosses class boundaries toward a different one.
The perspective is then broadened to discuss how geometry is emerging as a unifying lens for interpretability, arguing that the right mathematical language for understanding neural networks may not be linear algebra, but differential geometry.
Spotlight Talks & Posters
Paolo Botta, Politecnico di Milano
Nunzio Dimola, Politecnico di Milano
Gianluca Guadagni, University of Virginia
Nicolai Hermann, USI Lugano
Piermario Vitullo, Politecnico di Milano
Welcome Address
Paola Causin, Department of Mathematics, University of Milan
Elena Morotti, Department of Political and Social Science, University of Bologna
Roundtable and Closure
Elena Loli Piccolomini, Department of Computer Science and Engineering, Alma Mater Studiorum
Andrea Sebastiani, Department of Informatics, Bioengineering, Robotics and Systems Engineering, University of Genova
Giovanni Alberti, Department of Mathematics, University of Genova