Safe reconstruction and guarantees

We want the ability to reconstruct images from fewer samples, as this would allow cheaper and faster imaging. We can achieve this because natural images are structured: they lie in a small subspace of their domain. But how can reconstruction from fewer samples go wrong? Recent work has used learned priors (via a GAN, see reading.md for references) to act as a regulariser on partially reconstructed images. What worries me is, for example, a tumor being removed from a reconstruction because our prior knowledge about images assigns the tumor low probability. If these algorithms are to be used in practice they will need to provide guarantees: for example, provable bounds on the likelihood of introducing a false positive or a false negative classification.

Compressed sensing

Compressed sensing is typically formalised as finding the optimal solution to an $L_1$ minimisation problem, such that the reconstruction explains the measurements. Here $y$ are the measurements (far fewer than the size of $x$) made by the forward process $f$, $\hat x$ is the truth we want to recover, and $\phi(\cdot)$ is the all-important sparsifying transform.

$$
\begin{align}
y &= f(\hat x) \\
\mathop{\text{argmin}}_x &\parallel \phi(x) \parallel_1 \text{ s.t. } \parallel f(x) - y\parallel_2 \le \epsilon \tag{1}\\
\end{align}
$$

Another way to think about this: because we have few measurements, there are many possible images that explain them, so we need external information (regularisers/beliefs) to pick a reconstruction out of the many plausible ones. We can write this set of plausible images, given the measurements, as

$$
\begin{align}
\mathcal S &= \{x_i \in \mathbb R^n : \parallel f(x_i) - y \parallel_2 \le \epsilon\} \tag{2}\\
\end{align}
$$
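To make the constrained formulation concrete, here is a minimal sketch using cvxpy. It assumes, purely for illustration, that the forward process $f$ is a linear map $A$ and that the sparsifying transform $\phi$ is the identity (i.e. $x$ itself is sparse); neither assumption comes from the notes above, and all sizes and values are toy choices.

```python
import cvxpy as cp
import numpy as np

# Toy instance: a k-sparse signal observed through m << n random linear measurements.
rng = np.random.default_rng(0)
n, m, k = 128, 40, 5
x_true = np.zeros(n)
x_true[rng.choice(n, k, replace=False)] = rng.standard_normal(k)

A = rng.standard_normal((m, n)) / np.sqrt(m)  # assumed linear forward process f
y = A @ x_true                                # measurements (far fewer than n)
eps = 1e-3                                    # measurement-noise tolerance

x = cp.Variable(n)
problem = cp.Problem(
    cp.Minimize(cp.norm1(x)),                 # ||phi(x)||_1 with phi = identity (assumption)
    [cp.norm(A @ x - y, 2) <= eps],           # the reconstruction must explain the data
)
problem.solve()
print("relative error:", np.linalg.norm(x.value - x_true) / np.linalg.norm(x_true))
```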

Which priors/regularisers work and why?

Some regularisers and sparse domains have been used for many years and have shown great success. These include the Fourier and wavelet bases and the total variation (TV) regulariser.

$$
\begin{align}
\mathop{\text{argmin}}_x & \parallel f(x) - y \parallel_2 + \lambda \parallel \text{fft}(x) \parallel_1 \tag{3}\\
\end{align}
$$

As mentioned above, recent work has used GANs in place of traditional regularisers.

$$
\begin{align}
\mathop{\text{argmin}}_x & \parallel f(x) - y \parallel_2 + \lambda \cdot \text{discriminator}(x) \tag{4}\\
\end{align}
$$

However, I am worried that there is a fundamental difference between these two kinds of regulariser.
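For concreteness, here is a minimal proximal-gradient (ISTA-style) sketch of the Fourier-$L_1$ regularised problem above. It assumes a linear forward operator $A$ and uses the more common squared data-fit term so the gradient step is simple; the toy signal and all parameter values are illustrative only.

```python
import numpy as np

def soft_threshold(z, thr):
    """Complex soft-thresholding: shrink coefficient magnitudes towards zero."""
    mag = np.abs(z)
    return z * np.maximum(1 - thr / np.maximum(mag, 1e-12), 0)

def fourier_l1_reconstruction(A, y, lam=0.1, step=None, iters=500):
    """Sketch of  argmin_x 0.5*||A x - y||_2^2 + lam*||FFT(x)||_1
    (squared data term is an assumption, made so the gradient is simple)."""
    m, n = A.shape
    if step is None:
        step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1 / Lipschitz constant of the data term
    x = np.zeros(n)
    for _ in range(iters):
        # gradient step on the data-fit term
        x_half = x - step * A.T @ (A @ x - y)
        # proximal step: soft-threshold in the (unitary) Fourier domain
        z = soft_threshold(np.fft.fft(x_half, norm="ortho"), step * lam)
        x = np.real(np.fft.ifft(z, norm="ortho"))
    return x

# toy usage: a signal that is sparse in the Fourier domain
rng = np.random.default_rng(0)
n, m = 256, 64
t = np.arange(n)
x_true = np.sin(2 * np.pi * 5 * t / n) + 0.5 * np.sin(2 * np.pi * 12 * t / n)
A = rng.standard_normal((m, n)) / np.sqrt(m)     # random linear measurements
y = A @ x_true
x_rec = fourier_l1_reconstruction(A, y, lam=0.05)
print("relative error:", np.linalg.norm(x_rec - x_true) / np.linalg.norm(x_true))
```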
Conjecture: compressed sensing regularisers (TV, $L_1$) are 'safer' than learned regularisers.

TV and $L_1$ (in a 'simple' domain) seem to lack the ability to add semantically meaningful information. I want to say something like:

learned priors can add objects with macroscopic structure into the reconstructed image, while hand-crafted regularisers cannot,

but that is not quite true: $L_1$ in the Fourier domain can regularise long-wavelength signals and thus have globally structured effects (a tiny numerical check of this is sketched below). Questions I would like to find answers for: Note: I think there are some important relationships to;
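A tiny check of the point above, assuming nothing beyond numpy and a random toy image: shrinking a single long-wavelength Fourier coefficient (which is, softly, what an $L_1$ penalty in the Fourier domain does) changes every pixel of the image, so the effect is global rather than local.

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.standard_normal((64, 64))

coeffs = np.fft.fft2(img, norm="ortho")
coeffs[1, 0] *= 0.5    # shrink one long-wavelength component
coeffs[-1, 0] *= 0.5   # and its conjugate partner, to keep the image real
img_shrunk = np.real(np.fft.ifft2(coeffs, norm="ortho"))

diff = np.abs(img_shrunk - img)
print("fraction of pixels changed:", np.mean(diff > 1e-12))  # typically prints 1.0
```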

Guarantees

If I get an MRI scan and I am told that 12 samples were used to reconstruct the images that show I have a tumor, I am going to want some guarantees on what information can be added to (or removed from) the reconstruction.
What form could these guarantees take? We could evaluate reconstructions empirically, but empirical tests only apply to the data already gathered and provide no guarantees for the future.

Given an MRI dataset and a classification task (like classifying images with/without tumors), we want to find the minimum number of samples such that images with and without tumors are distinguishable with probability at least $\delta$.
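Here is a minimal sketch of what the empirical version of that search could look like. The callables `reconstruct(image, m)` and `classify(image)` are hypothetical stand-ins for the reconstruction algorithm and the tumor classifier; nothing here is a formal guarantee.

```python
import numpy as np

def min_samples_for_distinguishability(images, labels, reconstruct, classify,
                                        delta, candidate_ms):
    """Return the smallest m in candidate_ms such that classifying the
    m-sample reconstructions matches the reference labels with empirical
    accuracy >= delta, or None if no candidate is sufficient."""
    for m in sorted(candidate_ms):
        preds = [classify(reconstruct(img, m)) for img in images]
        accuracy = np.mean([p == lbl for p, lbl in zip(preds, labels)])
        if accuracy >= delta:
            return m
    return None
```

This is exactly the kind of empirical test noted above: it says nothing about images outside the dataset, so turning it into a guarantee would need assumptions about the image distribution.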

Safe reconstruction

What we want is to balance two competing objectives: a reconstruction that explains the measurements, and a reconstruction that is structured according to our prior beliefs. What we don't want is our prior beliefs out-competing the measurements (given our other beliefs about the accuracy of the measurements). In the original framing of the problem (see Compressed sensing) this is not an issue, because we optimise the image subject to the constraint that the reconstruction must explain the measurements.

But in practice one often uses a Lagrangian relaxation to make the optimisation easier.

$$
\mathop{\text{argmin}}_x \parallel \phi(x) \parallel_1 + \lambda \parallel f(x) - y\parallel_2
$$

We have now lost the nice guarantee that the reconstructed $x$ must explain the data (to within $\epsilon$). Whether it does now depends on the optimisation procedure/dynamics (which is possibly nonconvex) and on the value of the multiplier $\lambda$ (a small check of this is sketched below). So, we want to;
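To make the dependence on $\lambda$ concrete, here is a minimal sketch, under the same illustrative assumptions as before (linear forward map, identity sparsifying transform), that solves the relaxed problem for several values of $\lambda$ and checks whether the data-fit constraint actually holds.

```python
import cvxpy as cp
import numpy as np

# Same toy setup as the earlier constrained-formulation sketch.
rng = np.random.default_rng(0)
n, m, k, eps = 128, 40, 5, 1e-3
x_true = np.zeros(n)
x_true[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
A = rng.standard_normal((m, n)) / np.sqrt(m)
y = A @ x_true

# Solve the Lagrangian form for several multipliers and report the residual;
# for small lambda the solution need not explain the measurements at all.
for lam in [0.01, 0.1, 1.0, 10.0]:
    x = cp.Variable(n)
    cp.Problem(cp.Minimize(cp.norm1(x) + lam * cp.norm(A @ x - y, 2))).solve()
    residual = np.linalg.norm(A @ x.value - y)
    print(f"lambda={lam:5.2f}  residual={residual:.2e}  "
          f"constraint satisfied: {residual <= eps}")
```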