Advances in generative modelling have shown that we can generate high-quality samples from complex distributions. A natural next step is to use these generative models as priors to help solve inverse problems.
To solve inverse problems using a prior, we normally require the ability to evaluate the prior density $p(x)$. However, diffusion models (Song et al., 2021) do not provide likelihood estimates; they only support sampling. Inverse problem solvers built on them therefore fall back on sampling from the posterior to generate solutions (Chung et al., 2023). It is best to think of these ‘solutions’ as proposals, as there is no guarantee on their quality or accuracy.
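To make the sampling approach concrete, here is a minimal sketch of one guided reverse-diffusion step in the spirit of Chung et al. (2023). Everything here (`score_model`, `forward_op`, the schedule constants) is a placeholder of my own, not their implementation:

```python
import torch

def guided_reverse_step(x_t, y, t, score_model, forward_op,
                        sigma_t, step_size, guidance_scale):
    """One simplified reverse-diffusion step with posterior guidance.

    Sketch only: noise injection and the exact schedule are omitted, and
    score_model / forward_op are assumed placeholders.
    """
    x_t = x_t.detach().requires_grad_(True)

    # Score of the prior at noise level t, from a pretrained diffusion model.
    score = score_model(x_t, t)

    # Tweedie estimate of the clean signal x0 from the noisy iterate x_t.
    x0_hat = x_t + sigma_t ** 2 * score

    # Data-consistency gradient: how the measurement error changes with x_t.
    err = (y - forward_op(x0_hat)).pow(2).sum()
    grad = torch.autograd.grad(err, x_t)[0]

    # Follow the prior score, then nudge towards consistency with y.
    return (x_t + step_size * score - guidance_scale * grad).detach()
```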
Neural flows (Albergo et al., 2023; Liu et al., 2022; Lipman et al., 2023) have recently achieved state-of-the-art results (Esser et al., 2024) and do support likelihood estimates. They can be used to find a local maximum of the posterior (Ben-Hamu et al., 2024). However, differentiating through a flow is extremely expensive.
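To see why flows admit likelihoods, recall the change-of-variables formula: if $x = f(z)$ with $z \sim p_z$ and $f$ invertible, then $\log p(x) = \log p_z(f^{-1}(x)) + \log \lvert \det J_{f^{-1}}(x) \rvert$. A toy illustration with a single affine flow (my own example, not the cited models):

```python
import torch

# Toy invertible flow: x = f(z) = z * exp(log_scale) + shift.
log_scale = torch.tensor([0.5, -0.3])
shift = torch.tensor([1.0, 2.0])
base = torch.distributions.Normal(torch.zeros(2), torch.ones(2))

def log_prob(x):
    """Exact log p(x) via change of variables."""
    z = (x - shift) * torch.exp(-log_scale)  # f^{-1}(x)
    log_det = -log_scale.sum()               # log |det J_{f^{-1}}|
    return base.log_prob(z).sum(-1) + log_det

print(log_prob(torch.tensor([1.2, 1.7])))  # a density diffusion models don't expose
```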
So it seems that solving inverse problems via a principled approach like MAP is not currently practical with state-of-the-art generative models. Perhaps we can provide a viable alternative.
Inverse problems are a class of problems where we want to find the input to a function given its output. Within generative machine learning, for example, we care about:
- image colorization, where we want to recover the original image given a black-and-white version.
- image inpainting, where we want to recover the original image given a version with a missing region.
- speech enhancement, where we want to recover the clean speech given a noisy recording.
We consider the setting where we have access to a prior $p(x)$ (e.g. clean speech) and a likelihood function $p(y \mid x)$ (e.g. the environment adding background noise and interference). We observe $y$ and want to recover $x$.
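Concretely, each of the problems above pairs a known forward operator with observation noise. A minimal sketch under a linear Gaussian model $y = Ax + \varepsilon$, where the operator and noise level are illustrative choices:

```python
import torch

d = 16
A = torch.eye(d)[::2]  # illustrative operator: observe every other coordinate
sigma = 0.1            # assumed observation-noise level

def log_likelihood(y, x):
    """log p(y | x) for y = A x + N(0, sigma^2 I), up to an additive constant."""
    resid = y - A @ x
    return -0.5 * (resid / sigma).pow(2).sum()

x_true = torch.randn(d)
y = A @ x_true + sigma * torch.randn(d // 2)
print(log_likelihood(y, x_true))
```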
Using Bayes’ rule, we can write the posterior and our goal as:
\[\begin{align*} p(x \mid y) &= \frac{p(y \mid x)\, p(x)}{p(y)} \tag{posterior} \\ x^* &= \arg \max_x p(x \mid y) \tag{the MAP solution} \end{align*}\]

MAP returns the most likely value of $x$ given $y$. However, is the most likely value of $x$ the ‘best’ guess?
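Operationally, MAP is gradient ascent on $\log p(y \mid x) + \log p(x)$. A toy sketch with a standard normal prior and the linear Gaussian likelihood from above (observation, step size, and iteration count are arbitrary stand-ins):

```python
import torch

d = 16
sigma = 0.1
A = torch.eye(d)[::2]        # same illustrative operator as above
y = torch.randn(d // 2)      # stand-in observation

def log_posterior(x):
    """log p(y | x) + log p(x) for a standard normal prior, up to constants."""
    log_lik = -0.5 * ((y - A @ x) / sigma).pow(2).sum()
    log_prior = -0.5 * x.pow(2).sum()
    return log_lik + log_prior

x = torch.zeros(d, requires_grad=True)
opt = torch.optim.Adam([x], lr=0.1)
for _ in range(500):                 # arbitrary iteration budget
    opt.zero_grad()
    (-log_posterior(x)).backward()   # maximize by minimizing the negative
    opt.step()

# The unobserved coordinates collapse towards 0, so the MAP point sits
# closer to the mode than a typical prior sample (norm ~ sqrt(d) = 4).
print(x.detach().norm())
```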
We offer an alternative approach, suggesting that our guess of $x$ should be typical of the prior. We write this as:
\[\begin{align*} x^* &= \arg \max_{x \in \mathcal T(p(x))_\epsilon} p(y \mid x) \tag{PITS} \end{align*}\]

where $\mathcal T(p(x))_\epsilon$ is the $\epsilon$-typical set of $p(x)$. We call this Projection Into the Typical Set (PITS).
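For a standard normal prior in $d$ dimensions, the $\epsilon$-typical set is approximately the thin shell $\lVert x \rVert \approx \sqrt{d}$, so one simple instance of PITS is projected gradient ascent on $\log p(y \mid x)$, renormalizing after each step. This sketches the idea only; the flow-based construction for arbitrary priors comes in the posts below:

```python
import torch

d = 16
sigma = 0.1
A = torch.eye(d)[::2]        # same illustrative operator as above
y = torch.randn(d // 2)      # stand-in observation

def project_typical(x):
    """Project onto the (approximate) typical set of N(0, I): the sqrt(d) shell."""
    return x * d ** 0.5 / x.norm()

x = project_typical(torch.randn(d))
for _ in range(500):                       # arbitrary iteration budget
    x = x.detach().requires_grad_(True)
    log_lik = -0.5 * ((y - A @ x) / sigma).pow(2).sum()
    (grad,) = torch.autograd.grad(log_lik, x)
    x = project_typical(x + 0.1 * grad)    # ascend the likelihood, project back

print(x.detach().norm())  # sqrt(d) = 4 by construction: typical of the prior
```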
I wrote a few posts to help you understand PITS:
1. Background on typicality
2. A simple worked example showing that MAP produces solutions that are not typical.
3. Using neural flows we can approximate the typical set for arbitrary distributions.
4. (WIP) How to combine typicality with flows to solve inverse problems.
5. (WIP) A demonstration of the PITS+flow approach to inverse problems.
6. (WIP) Does it matter if solutions are not typical?
7. A brief review of methods attempting to solve inverse problems using state-of-the-art generative models.
Bibliography
- Song, Y., Sohl-Dickstein, J., Kingma, D. P., Kumar, A., Ermon, S., & Poole, B. (2021). Score-Based Generative Modeling through Stochastic Differential Equations. https://arxiv.org/abs/2011.13456
- Chung, H., Kim, J., Mccann, M. T., Klasky, M. L., & Ye, J. C. (2023). Diffusion Posterior Sampling for General Noisy Inverse Problems. https://arxiv.org/abs/2209.14687
- Albergo, M. S., Boffi, N. M., & Vanden-Eijnden, E. (2023). Stochastic Interpolants: A Unifying Framework for Flows and Diffusions. https://arxiv.org/abs/2303.08797
- Liu, X., Gong, C., & Liu, Q. (2022). Flow Straight and Fast: Learning to Generate and Transfer Data with Rectified Flow. https://arxiv.org/abs/2209.03003
- Lipman, Y., Chen, R. T. Q., Ben-Hamu, H., Nickel, M., & Le, M. (2023). Flow Matching for Generative Modeling. https://arxiv.org/abs/2210.02747
- Esser, P., Kulal, S., Blattmann, A., Entezari, R., Müller, J., Saini, H., Levi, Y., Lorenz, D., Sauer, A., Boesel, F., Podell, D., Dockhorn, T., English, Z., Lacey, K., Goodwin, A., Marek, Y., & Rombach, R. (2024). Scaling Rectified Flow Transformers for High-Resolution Image Synthesis. https://arxiv.org/abs/2403.03206
- Ben-Hamu, H., Puny, O., Gat, I., Karrer, B., Singer, U., & Lipman, Y. (2024). D-Flow: Differentiating through Flows for Controlled Generation. https://arxiv.org/abs/2402.14017
These ideas were developed while studying at VUW with Bastiaan Kleijn and Marcus Frean. I was funded by GN.