Let $G \in \mathbb{R}^{d \times m}$. The goal of compressed sensing is to recover $x \in \mathbb{R}^m$ from the measurements (observations) $y = Gx \in \mathbb{R}^d$, where each
$$y_i = \sum_{j=1}^m G_{ij} x_j$$
is a linear projection/measurement of the vector $x$.
Example
Medical imaging (e.g., MRI). In this case, we get to choose $G$.
Question
How many measurements $d$ are needed to recover $x \in \mathbb{R}^m$?
Answer
In general, we need $G$ to be injective (i.e., $d \ge m$).
BUT if $x$ has some sort of structure, then compressed sensing aims to make $d$ much smaller than $m$.
Question
What kind of structure do we need for x?
We want $x$ to be sparse.
Remark
Note that knowing that I can design $G$ (called the sensing matrix) to recover sparse $x$ is equivalent to knowing that I can recover any $x$ that is sparse in some known basis $A$ (thus, WLOG, we assume $x$ itself is sparse).
Demonstration
Assume that $x$ is sparse in some known basis (that we choose); i.e., there is some known invertible $A$ such that $x = A\tilde{x}$ with $\tilde{x}$ sparse. (It is important that $A$ is known, as it encodes our prior assumptions about $x$.)
Then we have $GA\tilde{x} = Gx$, so measuring $x$ with $G$ is the same as measuring $\tilde{x}$ with $GA$.
Assuming I have designed things so that the effective sensing matrix $GA$ recovers sparse vectors, I can then recover $\tilde{x} = A^{-1}x$, and hence $x = A\tilde{x}$.
Suppose I have some $G$ that recovers sparse vectors. Then I can also recover any $x$ that is sparse in a known basis $A$.
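The change-of-basis trick above can be spot-checked numerically. A minimal numpy sketch, in which the basis $A$, the sensing matrix $G$, and the sparsity pattern are all illustrative choices (a random Gaussian $A$ is invertible almost surely):

```python
import numpy as np

rng = np.random.default_rng(0)
d, m = 8, 20

# Hypothetical setup: x is sparse in a known basis A, i.e. x = A @ x_tilde
# with x_tilde sparse. A is random here purely for illustration.
A = rng.standard_normal((m, m))       # known invertible basis
x_tilde = np.zeros(m)
x_tilde[[3, 7]] = [1.0, -2.0]         # sparse coefficient vector
x = A @ x_tilde

G = rng.standard_normal((d, m))       # sensing matrix
y = G @ x                             # observed measurements

# Measuring x with G is the same as measuring x_tilde with G @ A ...
assert np.allclose(y, (G @ A) @ x_tilde)
# ... and x_tilde is recovered through the known basis: x_tilde = A^{-1} x
assert np.allclose(x_tilde, np.linalg.solve(A, x))
```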
2.5.1 Null Space and Restricted Isometry Properties
Question
When does $Gx$ uniquely determine $x$ for sparse $x$? (I.e., when is $G$ injective on sparse vectors?)
This is an existence question: is there an inverse mapping?
How do we actually recover $x$ (quickly)?
This is an algorithmic question: which algorithm should we use?
(Beyond the scope of this class; we will not discuss it.)
Sparsity
We denote the sparsity of a vector $x$ by
$$\|x\|_0 := \#\{i : x_i \neq 0\} = |\{i : x_i \neq 0\}|.$$
If $\|x\|_0 \le k$, we say that $x$ is $k$-sparse.
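A one-line numpy sketch of this definition (the helper name `l0_norm` is ours, not standard notation):

```python
import numpy as np

def l0_norm(x: np.ndarray) -> int:
    """Sparsity ||x||_0: the number of nonzero entries of x."""
    return int(np.count_nonzero(x))

x = np.array([0.0, 3.0, 0.0, -1.5, 0.0])
assert l0_norm(x) == 2   # x is 2-sparse (hence also k-sparse for any k >= 2)
```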
(⟸) (via contrapositive)
Suppose $G$ does not distinguish $k$-sparse vectors. Then there are some $x \neq x'$ with $\|x\|_0, \|x'\|_0 \le k$ such that $Gx = Gx'$. Note that $\|x - x'\|_0 \le 2k$, but
$$Gx - Gx' = G(x - x') = 0.$$
Thus $G$ does not have the $2k$-NSP.
(⟹)
Now, suppose $G$ does distinguish $k$-sparse vectors. Suppose $\|x\|_0 \le 2k$ and $x \neq 0$. Then there exist $k$-sparse $x' \neq x''$ such that $x = x' - x''$ (split the support of $x$ into two sets of size at most $k$). Then, since $G$ distinguishes the $k$-sparse vectors $x' \neq x''$,
$$0 \neq Gx' - Gx'' = G(x' - x'') = Gx,$$
i.e., $G$ has the $2k$-NSP.
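The bookkeeping in this equivalence (differences of $k$-sparse vectors are $2k$-sparse, and any $2k$-sparse vector splits back into a difference of two $k$-sparse vectors) can be checked with a small numpy sketch; the vectors and supports are illustrative:

```python
import numpy as np

m, k = 12, 2

# Two k-sparse vectors x, x': their difference is at most 2k-sparse,
# so Gx = Gx' would put a nonzero 2k-sparse vector in the null space of G.
x = np.zeros(m);  x[[1, 4]] = [2.0, -1.0]
xp = np.zeros(m); xp[[4, 9]] = [3.0, 0.5]
diff = x - xp

assert np.count_nonzero(x) <= k and np.count_nonzero(xp) <= k
assert np.count_nonzero(diff) <= 2 * k

# Conversely, split a 2k-sparse vector into a difference x1 - x2
# of two k-sparse vectors by dividing its support.
support = np.flatnonzero(diff)
x1 = np.zeros(m); x1[support[:k]] = diff[support[:k]]
x2 = np.zeros(m); x2[support[k:]] = -diff[support[k:]]

assert np.count_nonzero(x1) <= k and np.count_nonzero(x2) <= k
assert np.allclose(diff, x1 - x2)
```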
The RIP is more useful than the NSP for algorithms, especially when the measurements are noisy: $y = Gx$ becomes $y = Gx + \varepsilon$.
2.5.2 Random Sensing Matrices
Theorem
Suppose $k \ge 1$, $\delta \in (0,1)$, and $d \ge \frac{64}{\delta^2}\, k \log m$. Let $G \sim N(0, \tfrac{1}{d})^{\otimes d \times m}$, i.e., the entries of $G$ are i.i.d. $N(0, 1/d)$. Then
$$\mathbb{P}\big[G \text{ has the } (k,\delta)\text{-RIP}\big] \ge 1 - \frac{2}{m^k}.$$
(We want to reduce the number of measurements required)
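The near-isometry behavior the theorem promises is easy to observe empirically. A numpy sketch with illustrative dimensions (smaller than the theorem's bound on $d$); note a true RIP check would range over all $\binom{m}{k}$ supports, whereas this only samples a few random $k$-sparse vectors:

```python
import numpy as np

rng = np.random.default_rng(2)
m, k, delta = 1000, 5, 0.5
d = 400   # illustrative; the theorem asks for d >= (64 / delta**2) * k * log(m)

# Entries i.i.d. N(0, 1/d), so E ||G x||^2 = ||x||^2 for every fixed x.
G = rng.normal(0.0, np.sqrt(1.0 / d), size=(d, m))

# Spot-check the (k, delta)-RIP inequalities on random k-sparse vectors.
ratios = []
for _ in range(100):
    x = np.zeros(m)
    S = rng.choice(m, size=k, replace=False)
    x[S] = rng.standard_normal(k)
    ratios.append(np.linalg.norm(G @ x) ** 2 / np.linalg.norm(x) ** 2)

assert 1 - delta <= min(ratios) and max(ratios) <= 1 + delta
```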
Proof
Suppose $S \subset [m]$ with $|S| = k$, and $x \in \mathbb{R}^m$. Define
$x^{(S)} \in \mathbb{R}^k$ as the restriction of $x$ to the indices in $S$, and
$G^{(S)} \in \mathbb{R}^{d \times k}$ as the restriction of $G$ to the columns with indices in $S$.
If $\|x\|_0 \le k$ and all nonzero indices of $x$ lie in $S$, i.e. $\operatorname{supp}(x) \subset S$, then we have
$$Gx = G^{(S)} x^{(S)} \qquad \text{and} \qquad \|x\| = \|x^{(S)}\|.$$
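The restriction identity is just column selection, which a short numpy sketch makes concrete (the support $S$ and dimensions are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
d, m, k = 5, 10, 3

G = rng.standard_normal((d, m))
S = np.array([1, 4, 8])          # support set, |S| = k

x = np.zeros(m)
x[S] = rng.standard_normal(k)    # supp(x) is contained in S

x_S = x[S]                       # x^(S): restriction of x to indices in S
G_S = G[:, S]                    # G^(S): restriction of G to columns in S

assert np.allclose(G @ x, G_S @ x_S)                       # Gx = G^(S) x^(S)
assert np.isclose(np.linalg.norm(x), np.linalg.norm(x_S))  # ||x|| = ||x^(S)||
```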
In this case, the $(k,\delta)$-RIP is equivalent to requiring that, for all $S \subset [m]$ with $|S| = k$, we have
$$(1-\delta)\,\|x^{(S)}\|^2 \le \|G^{(S)} x^{(S)}\|^2 \le (1+\delta)\,\|x^{(S)}\|^2.$$
Now, let $\hat{x}^{(S)} := \frac{x^{(S)}}{\|x^{(S)}\|}$. Then