Safe reconstruction and guarantees
We want the ability to reconstruct images from fewer samples, as this would allow cheaper and faster
imaging. We can achieve this because natural images are structured: they lie in a (low-dimensional) subspace of their domain.
But how can this reconstruction from fewer samples go wrong?
- we need to ensure we have sampled the 'right' information,
- we need to make sure that our prior beliefs do not outcompete the measurements (safety).
Recent work has used learned priors (via a GAN, see
reading.md for references)
to act as a regulariser on partially reconstructed images.
What I am worried about is, for example, a tumor being removed from a reconstruction
because our prior knowledge about the images assigns tumors low probability.
If these algorithms are to be used in practice they will need to provide guarantees.
For example, provable bounds on the probability of introducing a false positive
or a false negative classification.
Compressed sensing
Compressed sensing is typically formalised as finding the optimal solution
to an $L_1$ minimisation problem, such that the reconstruction explains
the measurements. Here $y$ are the measurements (far fewer than the size of $x$) made by the forward process $f$,
$\hat x$ is the truth we want to recover, and $\phi(\cdot)$ is the
all-important sparse basis.
$$
\begin{align}
y &= f(\hat x) \\
\mathop{\text{argmin}}_x &\parallel \phi(x) \parallel_1 \text{ s.t. } \parallel f(x) - y\parallel_2 < \epsilon \tag{1}
\end{align}
$$
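As a concrete (if toy) instance, here is a minimal sketch of the constrained problem in eq. (1), assuming the forward process $f$ is a linear map `A` and the sparse basis $\phi$ is a matrix `Phi`; the operator, basis, and signal below are random stand-ins, not any particular imaging setup.

```python
import numpy as np
import cvxpy as cp

n, m = 256, 64                       # signal size, number of measurements
rng = np.random.default_rng(0)
A = rng.standard_normal((m, n))      # stand-in for a linear forward process f
Phi = np.eye(n)                      # stand-in sparse basis (identity here)
x_true = np.zeros(n)
x_true[rng.choice(n, 8, replace=False)] = 1.0
y = A @ x_true                       # the measurements
eps = 1e-3

x = cp.Variable(n)
problem = cp.Problem(cp.Minimize(cp.norm1(Phi @ x)),
                     [cp.norm(A @ x - y, 2) <= eps])
problem.solve()
x_hat = x.value                      # reconstruction consistent with the measurements
```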
Another way to think about this: because we have few measurements,
there are many possible images that explain them.
Thus we need external information, regularisers/beliefs, to pick a
reconstruction out of the many plausible ones.
We can write this set of plausible images, given the measurements, as:
$$
\begin{align}
\mathcal S &= \{x \in \mathbb R^n : \parallel f(x) - y \parallel_2 \le \epsilon\} \tag{2}
\end{align}
$$
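A tiny numerical illustration of why $\mathcal S$ contains many images (assuming, for this sketch, a linear forward process): adding any null-space component to a consistent $x$ gives a very different image with exactly the same measurements.

```python
import numpy as np
from scipy.linalg import null_space

rng = np.random.default_rng(0)
A = rng.standard_normal((10, 50))        # 10 measurements of a 50-pixel 'image'
x = rng.standard_normal(50)
y = A @ x                                # the measurements

# add a null-space component: a very different image, identical measurements
x_alt = x + 5.0 * null_space(A)[:, 0]
print(np.linalg.norm(A @ x - y))         # 0.0
print(np.linalg.norm(A @ x_alt - y))     # ~0.0 (up to numerical error)
```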
Which priors/regularisers work and why?
Some regularisers and sparse domains have been used for many years and have shown great success.
These include the Fourier and wavelet bases, and the total variation (TV) regulariser.
$$
\begin{align}
\mathop{\text{argmin}}_x & \parallel f(x) - y \parallel_2 + \lambda \parallel \text{fft}(x) \parallel_1 \tag{3}
\end{align}
$$
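A rough sketch of how eq. (3) is often minimised in practice: ISTA-style proximal gradient steps, taking a gradient step on the (squared) data term and then soft-thresholding in the Fourier domain. This assumes a linear forward operator `A`; `lam` and `n_iters` are illustrative choices, not values from any particular paper.

```python
import numpy as np

def soft_threshold(z, t):
    # complex soft-thresholding: shrink magnitudes towards zero, keep phases
    mag = np.abs(z)
    return np.where(mag > t, (1.0 - t / np.maximum(mag, 1e-12)) * z, 0.0)

def ista_fourier(A, y, lam=0.1, n_iters=200):
    m, n = A.shape
    step = 1.0 / np.linalg.norm(A, 2) ** 2      # 1 / Lipschitz constant of the data term
    x = np.zeros(n)
    for _ in range(n_iters):
        grad = A.T @ (A @ x - y)                # gradient of 0.5 * ||Ax - y||^2
        z = x - step * grad
        # proximal step for lam * ||fft(x)||_1, using the unitary FFT
        x = np.real(np.fft.ifft(soft_threshold(np.fft.fft(z, norm='ortho'), step * lam),
                                norm='ortho'))
    return x
```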
As mentioned above, recent work has used GANs in place of traditional regularisers.
$$
\begin{align}
\mathop{\text{argmin}}_x & \parallel f(x) - y \parallel_2 + \lambda \cdot \text{discriminator}(x) \tag{4}
\end{align}
$$
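A sketch of what optimising eq. (4) might look like, assuming a linear forward operator `A` (a torch tensor) and a pretrained callable `discriminator` that returns a scalar scoring how far an image is from the learned image manifold (lower = more 'natural'). All names and hyperparameters here are placeholders, not any specific published method.

```python
import torch

def reconstruct_with_learned_prior(A, y, discriminator, lam=0.01, lr=1e-2, n_iters=500):
    n = A.shape[1]
    x = torch.zeros(n, requires_grad=True)
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(n_iters):
        opt.zero_grad()
        data_term = torch.linalg.vector_norm(A @ x - y)   # ||f(x) - y||_2
        prior_term = discriminator(x)                     # low when x looks 'natural'
        (data_term + lam * prior_term).backward()
        opt.step()
    return x.detach()
```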
However, I am worried that there is a fundamental difference between these regularisers.
Conjecture: compressed sensing regularisers (TV, $L_1$) are 'safer' than learned regularisers.
TV and $L_1$ (in a 'simple' domain) seem to lack the ability to add semantically meaningful information. I want to say something like:
learned priors can add objects with macroscopic structure into the reconstructed image,
but that is not quite true: $L_1$ in the Fourier domain can regularise long-wavelength signals and thus have globally structured effects.
Questions I would like to find answers for:
- Under which conditions are macroscopic fantasies added?
- What is the definition of macroscopic? How can it be measured?
Note: I think there are some important relationships to:
- mode collapse. If a mode has been dropped by the learned prior
then it will push the reconstruction away from that location. How can
you verify that all modes have been captured?
- adversarial examples. Imperceptibly small changes to an image
can cause classification errors. So if a learned prior introduces these
artifacts then, while the image looks similar to us, automated classifiers will misclassify it.
- the fact that Fourier/wavelet/TV regularisers are not 'semantically meaningful' to a downstream classification task (???)
Guarantees
If I get an MRI scan and I am told that 12 samples were used to
reconstruct the images that show I have a tumor, I am going to want some
guarantees on what information can have been added to (or removed from) the reconstruction.
What form could these guarantees take?
- bounds on false positives/negatives, $ \text{mistakes} \sim O\left(\log \frac{1}{n_{\text{samples}}}\right) $
- empirical tests, $ \text{class}(x) \approx \text{class}(\text{recon}_k(y)) $
But empirical tests only apply to the data already gathered,
and provide no guarantees for the future.
Given an MRI dataset and a classification task (like classifying images with/without tumors),
we want to find the minimum number of samples such that images with and without tumors are
distinguishable with probability $\delta$ (a sketch of such an empirical test follows).
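Here is a minimal sketch of that test, under hypothetical interfaces `measure(x, k)`, `reconstruct(y)`, and `classify(x)` (none of these are from a specific library): estimate the disagreement rate between labels on full images and on reconstructions from $k$ measurements, then search for the smallest $k$ that keeps the rate below $\delta$.

```python
def disagreement_rate(images, measure, reconstruct, classify, k):
    # fraction of images whose label changes after reconstruction from k measurements
    mismatches = sum(classify(x) != classify(reconstruct(measure(x, k))) for x in images)
    return mismatches / len(images)

def min_samples_for_delta(images, measure, reconstruct, classify,
                          delta=0.05, k_values=range(8, 257, 8)):
    # smallest k whose empirical disagreement rate is at most delta (None if no k qualifies)
    for k in sorted(k_values):
        if disagreement_rate(images, measure, reconstruct, classify, k) <= delta:
            return k
    return None
```

As said above, this only certifies behaviour on the dataset at hand, not on future images.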
Safe reconstruction
What we want is to balance two competing objectives:
a reconstruction that explains the measurements, and a reconstruction
that is structured according to our prior beliefs.
What we don't want is our prior beliefs out-competing the
measurements (given our other beliefs about the accuracy of the measurements).
In the original framing of the problem (see Compressed Sensing) this is not an
issue because we are optimising the image such that the reconstruction
must explain the measurements.
But, in practice, a Lagrangian relaxation (moving the constraint into the objective with a multiplier $\lambda$) is often used to make the optimisation easier.
$$
\mathop{\text{argmin}}_x \parallel \phi(x) \parallel_1 + \lambda \parallel f(x) - y\parallel_2 \\
$$
We have now lost the nice guarantee that the reconstructed $ x $ must explain
the data (within $ \epsilon $). The reconstruction now depends on the
optimisation procedure/dynamics (which is possibly nonconvex) and on the value of the multiplier $\lambda$.
So, we want to:
- be able to calculate $\lambda$, given beliefs about the amount of noise in the measurements,
- project gradients from optimising the prior back into a 'safe' set (possibly
related to Strongly typed agents); a sketch of such a projected scheme is below.
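A sketch of the projected scheme in the last bullet: follow the gradient of the prior, but after every step project back onto the set $\mathcal S$ of reconstructions that still explain the measurements (eq. 2). The closed-form projection below assumes a linear forward operator `A` with orthonormal rows (e.g. a subsampled orthogonal transform); for a general operator the projection would need an inner solve. `prior_grad` is a hypothetical callable returning the gradient of whatever regulariser is used.

```python
import numpy as np

def project_onto_measurements(x, A, y, eps):
    # projection onto S = {x : ||Ax - y||_2 <= eps}, assuming A A^T = I
    residual = y - A @ x
    r = np.linalg.norm(residual)
    if r <= eps:
        return x                                     # already explains the data
    return x + (1.0 - eps / r) * (A.T @ residual)    # move just inside the constraint

def safe_reconstruct(A, y, prior_grad, eps, step=1e-2, n_iters=300):
    x = A.T @ y                                      # start from the adjoint reconstruction
    for _ in range(n_iters):
        x = x - step * prior_grad(x)                 # follow the prior...
        x = project_onto_measurements(x, A, y, eps)  # ...but never leave the safe set
    return x
```

With this scheme the prior can only move the reconstruction around inside $\mathcal S$, so no choice of $\lambda$ or optimisation dynamics can make the result inconsistent with the measurements by more than $\epsilon$.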