


arXiv:1805.07575v1 [stat.ME] 19 May 2018

Sequential adaptive elastic net approach for single-snapshot source localization

Muhammad Naveed Tabassum and Esa Ollila

Aalto University, Dept. of Signal Processing and Acoustics, P.O. Box 15400, FI-00076 Aalto, Finland a)

This paper proposes efficient algorithms for accurate recovery of direction-of-arrivals (DoAs) of sources from single-snapshot measurements using compressed beamforming (CBF). In CBF, the conventional sensor array signal model is cast as an underdetermined complex-valued linear regression model, and sparse signal recovery methods are used for solving the DoA finding problem. We develop a complex-valued pathwise weighted elastic net (c-PW-WEN) algorithm that finds solutions at knots of penalty parameter values over a path (or grid) of EN tuning parameter values. c-PW-WEN also computes Lasso or weighted Lasso in its path. We then propose a sequential adaptive EN (SAEN) method that is based on the c-PW-WEN algorithm with adaptive weights that depend on the previous solution. Extensive simulation studies illustrate that SAEN improves the probability of exact recovery of the true support compared to conventional sparse signal recovery approaches, such as Lasso, elastic net or orthogonal matching pursuit, in several challenging multiple-target scenarios. The effectiveness of SAEN is more pronounced in the presence of high mutual coherence.

© 2018 Acoustical Society of America. [http://dx.doi.org/(DOI number)]

[XYZ] Pages: 1–12

I. INTRODUCTION

Acoustic signal processing problems generally employ a system of linear equations as the data model. For overdetermined linear systems, least squares estimation (LSE) is often applied, but in underdetermined or ill-conditioned problems the LSE is no longer unique and the optimization problem has infinitely many solutions. In these cases, additional constraints, such as those promoting sparse solutions, are commonly used, e.g., the Lasso (Least Absolute Shrinkage and Selection Operator) [1] or the elastic net (EN) penalty [2], which is an extension of the Lasso based on a convex combination of the $\ell_1$ and $\ell_2$ penalties of the Lasso and ridge regression.

It is now common practice in acoustic applications [3,4] to employ grid-based sparse signal recovery methods for finding source parameters, e.g., direction-of-arrival (DoA) and power. This approach, referred to as compressive beamforming (CBF), was originally proposed in [5] and has since emerged as one of the most useful approaches in problems where only few measurements are available. Since the pioneering work of [5], the usefulness of the CBF approach has been shown in a series of papers [6-14]. In this paper, we address the problem of estimating the unknown source parameters when only a single snapshot is available.

Existing approaches to the grid-based single-snapshot CBF problem often use the Lasso for sparse recovery. The Lasso, however, often performs poorly when the sources are closely spaced in the angular domain or when there are large variations in the source powers. The same holds true when the grid used for constructing the array steering matrix is dense, which is the case when one aims for high-resolution DoA finding. The problem is due to the fact that the Lasso performs poorly when the predictors are highly correlated (cf. [2,15]). Another problem is that the Lasso lacks group selection ability. This means that when two sources are closely spaced in the angular domain, the Lasso tends to choose only one of them in the estimation grid and ignores the other. EN often performs better in such cases, as it enforces a sparse solution but has a tendency to select or discard correlated variables as a group, unlike the Lasso. Furthermore, EN enjoys the computational advantages of the Lasso [2], and adaptation using smartly chosen data-dependent weights can further enhance performance.

a) E-mail: (muhammad.tabassum, esa.ollila)@aalto.fi

The main aim of this paper is to improve over the conventional sparse signal recovery methods, especially in the presence of high mutual coherence of the basis vectors or when the non-zero coefficients have largely varying amplitudes. The former case occurs in the CBF problem when a dense grid is used for constructing the steering matrix or when sources arrive at the sensor array from either neighbouring or oblique angles. To this end, we propose a sequential adaptive approach using the weighted elastic net (WEN) framework. To achieve this in a computationally efficient way, we propose a homotopy method [16] that is a complex-valued extension of the least angle regression and shrinkage (LARS) [17] algorithm for the weighted Lasso problem, which we refer to as c-LARS-WLasso. The developed c-LARS-WLasso method is numerically cost effective and avoids an exhaustive grid search over candidate values of the penalty parameter.

In this paper, we assume that the number of non-zero coefficients (i.e., the number of sources K arriving at the sensor array in the CBF problem) is known, and propose a complex-valued pathwise (c-PW-)WEN algorithm that utilizes c-LARS-WLasso along with the PW-LARS-EN algorithm proposed in [18] to compute the WEN path. c-PW-WEN computes the K-sparse WEN solutions over a grid of EN tuning parameter values and then selects the best final WEN solution. We also propose a novel sequential adaptive elastic net (SAEN) approach that applies adaptive c-PW-WEN sequentially, decreasing the sparsity level (order) from 3K to K in three stages. SAEN utilizes smartly chosen adaptive (i.e., data-dependent) weights that are based on the solutions obtained in the previous stage.

Application of the developed algorithms is illustrated in the single-snapshot CBF DoA estimation problem. In CBF, accurate recovery depends heavily on the user-specified angular estimation grid (the look directions of interest), which determines the array steering matrix consisting of the array response vectors to the look directions (estimation grid points) of interest. A dense angular grid implies high mutual coherence, which indicates a poor recovery region for most sparse recovery methods [19]. The effectiveness of SAEN compared to state-of-the-art sparse signal recovery algorithms is illustrated via extensive simulation studies.

The paper is structured as follows. Section II discusses the WEN optimization problem and its benefits over the Lasso and adaptive Lasso. We introduce the c-LARS-WLasso method in Section III. In Section IV, we develop the c-PW-WEN algorithm that finds the K-sparse WEN solutions over a grid of EN tuning parameter values. In Section V, the SAEN approach is proposed. Section VI lays out the DoA estimation problem from single-snapshot measurements using CBF. Simulation studies using a large variety of set-ups are provided in Section VII. Finally, Section VIII concludes the paper.

Notations: Lowercase boldface letters are used for vectors and uppercase for matrices. The $\ell_2$-norm and the $\ell_1$-norm are defined as $\|\mathbf{a}\|_2 = \sqrt{\mathbf{a}^{\mathsf H}\mathbf{a}}$ and $\|\mathbf{a}\|_1 = \sum_{i=1}^n |a_i|$, respectively, where $|a| = \sqrt{a^* a} = \sqrt{a_R^2 + a_I^2}$ denotes the modulus of a complex number $a = a_R + \jmath a_I$. The support $\mathcal{A}$ of a vector $\mathbf{a} \in \mathbb{C}^p$ is the index set of its nonzero elements, i.e., $\mathcal{A} = \mathrm{supp}(\mathbf{a}) = \{j \in \{1,\dots,p\} : a_j \neq 0\}$. The $\ell_0$-(pseudo)norm of $\mathbf{a}$ is defined as $\|\mathbf{a}\|_0 = |\mathrm{supp}(\mathbf{a})|$, which equals the total number of nonzero elements in it. For a vector $\boldsymbol{\beta} \in \mathbb{C}^p$ (resp. matrix $\mathbf{X} \in \mathbb{C}^{n\times p}$) and an index set $\mathcal{A}_K \subseteq \{1,2,\dots,p\}$ of cardinality $|\mathcal{A}_K| = K$, we denote by $\boldsymbol{\beta}_{\mathcal{A}_K}$ (resp. $\mathbf{X}_{\mathcal{A}_K}$) the $K\times 1$ vector (resp. $n\times K$ matrix) restricted to the components of $\boldsymbol{\beta}$ (resp. columns of $\mathbf{X}$) indexed by the set $\mathcal{A}_K$. Let $\mathbf{a}\otimes\mathbf{b}$ denote the Hadamard (i.e., element-wise) product of $\mathbf{a} \in \mathbb{C}^p$ and $\mathbf{b} \in \mathbb{C}^p$, and by $\mathbf{a}\oslash\mathbf{b}$ we denote the element-wise division of vectors. We denote by $\langle \mathbf{a}, \mathbf{b}\rangle = \mathbf{a}^{\mathsf H}\mathbf{b}$ the usual Hermitian inner product of $\mathbb{C}^p$. Finally, $\mathrm{diag}(\mathbf{a})$ denotes a $p\times p$ matrix with the elements of $\mathbf{a}$ as its diagonal elements.

II. WEIGHTED ELASTIC NET FRAMEWORK

We consider the linear model where the $n$ complex-valued measurements are modeled as
$$\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}, \qquad (1)$$
where $\mathbf{X} \in \mathbb{C}^{n\times p}$ is a known complex-valued design or measurement matrix, $\boldsymbol{\beta} \in \mathbb{C}^p$ is the unknown vector of complex-valued regression coefficients, and $\boldsymbol{\varepsilon} \in \mathbb{C}^n$ is the complex noise vector. For ease of exposition we consider the centered linear model (i.e., we assume that the intercept is equal to zero). In this paper we deal with an underdetermined or ill-posed linear model, where $p > n$, and the primary interest is to find a sparse estimate of the unknown parameter vector $\boldsymbol{\beta}$ given $\mathbf{y} \in \mathbb{C}^n$ and $\mathbf{X} \in \mathbb{C}^{n\times p}$. In this paper we assume that the sparsity level $K = \|\boldsymbol{\beta}\|_0$, i.e., the number of non-zero elements of $\boldsymbol{\beta}$, is known. In the DoA finding problem using compressed beamforming, this is equivalent to assuming that the number of sources arriving at the sensor array is known.

The WEN estimator finds the solution $\hat{\boldsymbol{\beta}} \in \mathbb{C}^p$ to the following constrained optimization problem:
$$\underset{\boldsymbol{\beta}\in\mathbb{C}^p}{\text{minimize}}\ \frac12\|\mathbf{y} - \mathbf{X}\boldsymbol{\beta}\|_2^2 \quad \text{subject to} \quad P_\alpha(\boldsymbol{\beta}; \mathbf{w}) \le t, \qquad (2)$$
where $t \ge 0$ is a threshold parameter chosen by the user and
$$P_\alpha(\boldsymbol{\beta}; \mathbf{w}) = \sum_{j=1}^{p} w_j\Big(\alpha|\beta_j| + \frac{1-\alpha}{2}|\beta_j|^2\Big)$$
is the WEN constraint (or penalty) function. The vector $\mathbf{w} = (w_1, \dots, w_p)^\top$, $w_i \ge 0$ for $i = 1,\dots,p$, collects the non-negative weights, and $\alpha \in [0,1]$ is an EN tuning parameter. Both the weights $\mathbf{w}$ and $\alpha$ are chosen by the user. When the weights are data-dependent, we refer to the solution as the adaptive EN (AEN). Note that AEN is an extension of the adaptive Lasso [20].

The constrained optimization problem in (2) can also be written in an equivalent penalized form:
$$\hat{\boldsymbol{\beta}}(\lambda, \alpha) = \arg\min_{\boldsymbol{\beta}\in\mathbb{C}^p} \frac12\|\mathbf{y} - \mathbf{X}\boldsymbol{\beta}\|_2^2 + \lambda P_\alpha(\boldsymbol{\beta}; \mathbf{w}), \qquad (3)$$
where $\lambda \ge 0$ is the penalty (or regularization) parameter. The problems in (2) and (3) are equivalent due to Lagrangian duality and either can be solved; herein we use (3).

Recall that the EN tuning parameter $\alpha \in [0,1]$ offers a blend between the Lasso and ridge regression. The benefit of EN is its ability to select correlated variables as a group, which is illustrated in Figure 1. The EN penalty has singularities at the vertices, like the Lasso, which is a necessary property for sparse estimation. It also has strictly convex edges, which help in selecting variables as a group, a useful property when high correlations exist between predictors. Moreover, for $\mathbf{w} = \mathbf{1}$ (a vector of ones) and $\alpha = 1$, (3) yields a Lasso solution, and for $\mathbf{w} = \mathbf{1}$ and $\alpha = 0$ we obtain the ridge regression estimator [21].

FIG. 1. (Color online) The Lasso solution is often at the vertices (corners), but the EN solution can also occur on edges, depending on the correlations among the variables. In the uncorrelated case of (a), both the Lasso and the EN have a sparse solution. When the predictors are highly correlated, as in (b), the EN performs group selection, in contrast to the Lasso.

The WEN solution can be computed using any algorithm that finds the non-weighted ($\mathbf{w} = \mathbf{1}$) EN solution. To see this, let us write $\tilde{\mathbf{X}} = \mathbf{X}\,\mathrm{diag}(\mathbf{w})^{-1}$ and $\tilde{\boldsymbol{\beta}} = \boldsymbol{\beta} \otimes \mathbf{w}$. Then the WEN solution is found by applying the following steps:

1. Solve the (non-weighted) EN solution on the transformed data $(\mathbf{y}, \tilde{\mathbf{X}})$:
$$\hat{\tilde{\boldsymbol{\beta}}}(\lambda, \alpha) = \arg\min_{\tilde{\boldsymbol{\beta}}\in\mathbb{C}^p} \frac12\|\mathbf{y} - \tilde{\mathbf{X}}\tilde{\boldsymbol{\beta}}\|_2^2 + \lambda P_\alpha(\tilde{\boldsymbol{\beta}}),$$
where $P_\alpha(\tilde{\boldsymbol{\beta}}) = P_\alpha(\tilde{\boldsymbol{\beta}}; \mathbf{1}_p)$ is the EN penalty.

2. The WEN solution for the original data $(\mathbf{y}, \mathbf{X})$ is $\hat{\boldsymbol{\beta}}(\lambda, \alpha) = \hat{\tilde{\boldsymbol{\beta}}}(\lambda, \alpha) \oslash \mathbf{w}$.
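To make the transformation concrete, the following minimal sketch wraps an arbitrary unit-weight EN solver. Here `en_solve` is a hypothetical placeholder (not from the paper's package) for any routine returning the solution of (3) with $\mathbf{w} = \mathbf{1}$; Python/NumPy is used only for illustration:

```python
import numpy as np

def wen_via_en(y, X, w, lam, alpha, en_solve):
    """WEN via a unit-weight EN solver applied to transformed data.

    en_solve(y, X, lam, alpha) is assumed to return the complex-valued
    EN estimate of (3) with w = 1 (hypothetical solver interface).
    """
    X_tilde = X / w[np.newaxis, :]   # step 1: X diag(w)^{-1}, scales columns
    beta_tilde = en_solve(y, X_tilde, lam, alpha)
    return beta_tilde / w            # step 2: beta = beta_tilde / w elementwise
```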

Yet the standard (i.e., non-weighted, $w_j \equiv 1$ for $j = 1,\dots,p$) EN estimator may perform inconsistent variable selection. The EN solution depends largely on $\lambda$ (and $\alpha$), and tuning these parameters optimally is a difficult problem. The adaptive Lasso [20] obtains the oracle variable selection property by using cleverly chosen adaptive weights for the regression coefficients in the $\ell_1$-penalty. We extend this idea to the WEN penalty, coupling it with an active-set approach where WEN is applied only to the nonzero (active) coefficients. Our proposed adaptive EN uses data-dependent weights defined as
$$w_j = \begin{cases} 1/|\hat\beta_{\mathrm{init},j}|, & \hat\beta_{\mathrm{init},j} \neq 0, \\ \infty, & \hat\beta_{\mathrm{init},j} = 0, \end{cases} \qquad j = 1,\dots,p. \qquad (4)$$

Above, $\hat{\boldsymbol{\beta}}_{\mathrm{init}} \in \mathbb{C}^p$ denotes a sparse initial estimator of $\boldsymbol{\beta}$. The idea is that only the nonzero coefficients are exploited, i.e., basis vectors with $\hat\beta_{\mathrm{init},j} = 0$ are omitted from the model, and thus the dimensionality of the linear model is reduced from $p$ to $K = \|\hat{\boldsymbol{\beta}}_{\mathrm{init}}\|_0$. Moreover, a larger weight means that the corresponding variable is penalized more heavily. The vector $\mathbf{w} = (w_1,\dots,w_p)^\top$ can be written compactly as
$$\mathbf{w} = \mathbf{1}_p \oslash |\hat{\boldsymbol{\beta}}_{\mathrm{init}}|, \qquad (5)$$
where the notation $|\boldsymbol{\beta}|$ means element-wise application of the absolute value operator on the vector, i.e., $|\boldsymbol{\beta}| = (|\beta_1|,\dots,|\beta_p|)^\top$.
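A minimal sketch of the weighting rule (4)-(5); the function name is ours, not from the authors' package:

```python
import numpy as np

def adaptive_weights(beta_init):
    """Adaptive weights of Eqs. (4)-(5): w_j = 1/|beta_init_j|, with
    w_j = inf for zero coefficients, so that the corresponding basis
    vectors are effectively dropped from the model."""
    w = np.full(beta_init.shape, np.inf)
    nz = beta_init != 0
    w[nz] = 1.0 / np.abs(beta_init[nz])
    return w
```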

III. COMPLEX-VALUED LARS METHOD FOR WEIGHTED LASSO

In this section we develop the c-LARS-WLasso algorithm, which is a complex-valued extension of the LARS algorithm [17] to the weighted Lasso framework. This is then used to construct the complex-valued pathwise (c-PW-)WEN algorithm. These methods compute the solution (3) at particular penalty parameter values, called knots, at which a new variable enters (or leaves) the active set of nonzero coefficients. Our c-PW-WEN exploits c-LARS-WLasso as its core computational engine.

Let $\hat{\boldsymbol{\beta}}(\lambda)$ denote a solution to (3) for some fixed value $\lambda$ of the penalty parameter in the case that the Lasso penalty is used ($\alpha = 1$) with unit weights ($\mathbf{w} = \mathbf{1}_p$). Also recall that the predictors are normalized so that $\|\mathbf{x}_j\|_2^2 = 1$, $j = 1,\dots,p$. Then the solution $\hat{\boldsymbol{\beta}}(\lambda)$ needs to verify the generalized Karush-Kuhn-Tucker conditions. That is, $\hat{\boldsymbol{\beta}}(\lambda)$ is a solution to (3) if and only if it verifies the zero sub-gradient equations given by
$$\langle \mathbf{x}_j, \mathbf{r}(\lambda)\rangle = \lambda s_j \quad \text{for } j = 1,\dots,p, \qquad (6)$$
where $\mathbf{r}(\lambda) = \mathbf{y} - \mathbf{X}\hat{\boldsymbol{\beta}}(\lambda)$ and $s_j \in \mathrm{sign}\{\hat\beta_j(\lambda)\}$, meaning that $s_j = e^{\jmath\theta}$, where $\theta = \arg\{\hat\beta_j(\lambda)\}$ if $\hat\beta_j(\lambda) \neq 0$, and some number inside the unit circle, $s_j \in \{x \in \mathbb{C} : |x| \le 1\}$, otherwise. Taking absolute values of both sides of equation (6), one notices that at the solution the condition $|\langle \mathbf{x}_j, \mathbf{r}(\lambda)\rangle| = \lambda$ holds for the active predictors, whereas $|\langle \mathbf{x}_j, \mathbf{r}(\lambda)\rangle| \le \lambda$ holds for the non-active predictors. Thus, as $\lambda$ decreases and more predictors join the active set, the active predictors become less correlated with the residual. Moreover, the absolute value of the correlation, or equivalently the angle, between any active predictor and the residual is the same. In the real-valued case, the LARS method exploits this feature and the piecewise linearity of the Lasso path to compute the knot values, i.e., the values of the penalty parameter where there is a change in the active set of predictors.

Let us briefly recall the main principle of the LARS algorithm. LARS starts with a model having no variables (so $\hat{\boldsymbol{\beta}} = \mathbf{0}$) and picks the predictor that has maximal correlation (i.e., smallest angle) with the residual $\mathbf{r} = \mathbf{y}$. Suppose the predictor $\mathbf{x}_1$ is chosen. Then the magnitude of the coefficient of the selected predictor is increased (toward its least-squares value) until a step size is reached at which another predictor (say $\mathbf{x}_2$) has the same absolute value of correlation with the evolving residual $\mathbf{r} = \mathbf{y} - (\text{step size})\,\mathbf{x}_1$, i.e., the updated residual makes equal angles with both predictors, as shown in Fig. 2. Thereafter LARS moves in the new direction, which keeps the evolving residual equally correlated (i.e., equiangular) with the selected predictors, until another predictor becomes equally correlated with the residual. This process is repeated until all predictors are in the model or a specified sparsity level is reached.

FIG. 2. Starting from all zeros, LARS picks the predictor $\mathbf{x}_1$ that makes the least angle (i.e., $\theta_1 < \theta_2$) with the residual $\mathbf{r}$ and moves in its direction until $\theta_1 = \theta_2$, where LARS picks $\mathbf{x}_2$ and changes direction. LARS then repeats this procedure, selecting the next equicorrelated predictor, and so on until some stopping criterion is met.

We first consider the weighted Lasso (WLasso) problem (so $\alpha = 1$) by letting $\tilde{\mathbf{X}} = \mathbf{X}\,\mathrm{diag}(\mathbf{w})^{-1}$ and $\tilde{\boldsymbol{\beta}} = \boldsymbol{\beta} \otimes \mathbf{w}$. We write $\hat{\boldsymbol{\beta}}(\lambda)$ for the solution of the optimization problem (3) in this case. Let $\lambda_0$ denote the smallest value of $\lambda$ such that all coefficients of the WLasso solution are zero, i.e., $\hat{\boldsymbol{\beta}}(\lambda_0) = \mathbf{0}$. It is easy to see that $\lambda_0 = \max_j |\langle \tilde{\mathbf{x}}_j, \mathbf{y}\rangle|$, $j = 1,2,\dots,p$ [15]. Let $\mathcal{A} = \mathrm{supp}\{\hat{\boldsymbol{\beta}}(\lambda)\}$ denote the active set at regularization parameter value $\lambda < \lambda_0$. The knots $\lambda_1 > \lambda_2 > \cdots > \lambda_K$ are defined as the smallest values of the penalty parameter after which there is a change in the set of active predictors, i.e., the order of sparsity changes. The active set at a knot $\lambda_k$ is denoted by $\mathcal{A}_k = \mathrm{supp}\{\hat{\boldsymbol{\beta}}(\lambda_k)\}$. The active set $\mathcal{A}_1$ thus contains a single index, $\mathcal{A}_1 = \{j_1\}$, where $j_1$ is the predictor that becomes active first, i.e.,
$$j_1 = \arg\max_{j\in\{1,\dots,p\}} |\langle \tilde{\mathbf{x}}_j, \mathbf{y}\rangle|.$$
By the definition of the knots, one has that $\mathcal{A}_k = \mathrm{supp}\{\hat{\boldsymbol{\beta}}(\lambda)\}$ for all $\lambda \in [\lambda_k, \lambda_{k-1})$ and $\mathcal{A}_k \neq \mathcal{A}_{k+1}$ for all $k = 1,\dots,K$.

The c-LARS-WLasso, outlined in Algorithm 1, is a straightforward generalization of the LARS-Lasso algorithm to the complex-valued and weighted case. It does not have the same theoretical guarantees as its real-valued counterpart to solve the exact values of the knots. Namely, the LARS algorithm uses the property that (in the real-valued case) the Lasso regularization path is continuous and piecewise linear with respect to $\lambda$; see [17,22-24]. In the complex-valued case, however, the solution path between the knots is not necessarily linear [25]. Hence c-LARS-WLasso may not give precise values of the knots in all cases. However, simulations validate that the algorithm finds the knots with reasonable precision. Future work is needed to provide theoretical guarantees for the algorithm to find the knots.

Algorithm 1: c-LARS-WLasso

input: $\mathbf{y} \in \mathbb{C}^n$, $\mathbf{X} \in \mathbb{C}^{n\times p}$, $\mathbf{w} \in \mathbb{R}^p$ and $K$
output: $\{\mathcal{A}_k, \lambda_k, \hat{\boldsymbol{\beta}}(\lambda_k)\}_{k=0}^{K}$
initialize: $\hat{\boldsymbol{\beta}}^{(0)} = \mathbf{0}_{p\times 1}$, $\mathcal{A}_0 = \emptyset$, $\boldsymbol{\Delta} = \mathbf{0}_{p\times 1}$, the residual $\mathbf{r}_0 = \mathbf{y}$. Set $\mathbf{X} \leftarrow \mathbf{X}\,\mathrm{diag}(\mathbf{w})^{-1}$.

1. Compute $\lambda_0 = \max_j |\langle \mathbf{x}_j, \mathbf{r}_0\rangle|$ and $j_1 = \arg\max_j |\langle \mathbf{x}_j, \mathbf{r}_0\rangle|$, where $j = 1,\dots,p$.

for $k = 1,\dots,K$ do:

2. Find the active set $\mathcal{A}_k = \mathcal{A}_{k-1} \cup \{j_k\}$ and its least-squares direction $\boldsymbol{\delta}$, giving $[\boldsymbol{\Delta}]_{\mathcal{A}_k} = \boldsymbol{\delta}$:
$$\boldsymbol{\delta} = \frac{1}{\lambda_{k-1}}\big(\mathbf{X}_{\mathcal{A}_k}^{\mathsf H}\mathbf{X}_{\mathcal{A}_k}\big)^{-1}\mathbf{X}_{\mathcal{A}_k}^{\mathsf H}\mathbf{r}_{k-1}.$$

3. Define the vector $\hat{\boldsymbol{\beta}}(\lambda) = \hat{\boldsymbol{\beta}}^{(k-1)} + (\lambda_{k-1} - \lambda)\boldsymbol{\Delta}$ for $0 < \lambda \le \lambda_{k-1}$ and the corresponding residual as
$$\mathbf{r}(\lambda) = \mathbf{y} - \mathbf{X}\hat{\boldsymbol{\beta}}(\lambda) = \mathbf{y} - \mathbf{X}\hat{\boldsymbol{\beta}}^{(k-1)} - (\lambda_{k-1} - \lambda)\mathbf{X}\boldsymbol{\Delta} = \mathbf{r}_{k-1} - (\lambda_{k-1} - \lambda)\mathbf{X}_{\mathcal{A}_k}\boldsymbol{\delta}.$$

4. The knot $\lambda_k$ is the largest $\lambda$-value, $0 < \lambda \le \lambda_{k-1}$, such that
$$\langle \mathbf{x}_\ell, \mathbf{r}(\lambda)\rangle = \lambda e^{\jmath\theta}, \quad \ell \notin \mathcal{A}_k, \qquad (7)$$
at which a new predictor (at index $j_{k+1} \notin \mathcal{A}_k$) becomes active, thus verifying $|\langle \mathbf{x}_{j_{k+1}}, \mathbf{r}(\lambda_k)\rangle| = \lambda_k$ from (7).

5. Update the values at the knot $\lambda_k$:
$$\hat{\boldsymbol{\beta}}^{(k)} = \hat{\boldsymbol{\beta}}^{(k-1)} + (\lambda_{k-1} - \lambda_k)\boldsymbol{\Delta}, \qquad \mathbf{r}_k = \mathbf{y} - \mathbf{X}\hat{\boldsymbol{\beta}}^{(k)}.$$
The Lasso solution at the knot is $\hat{\boldsymbol{\beta}}(\lambda_k) = \hat{\boldsymbol{\beta}}^{(k)}$.

end for

6. $\{\hat{\boldsymbol{\beta}}(\lambda_k) \leftarrow \hat{\boldsymbol{\beta}}(\lambda_k) \oslash \mathbf{w}\}_{k=0}^{K}$ (back-transform to the original scale).

Below we discuss how to solve the knot $\lambda_k$ and the index $j_{k+1}$ in step 4 of the c-LARS-WLasso algorithm.

Solving step 4: First we note that
$$\langle \mathbf{x}_\ell, \mathbf{r}(\lambda)\rangle = \langle \mathbf{x}_\ell, \mathbf{r}_{k-1} - (\lambda_{k-1} - \lambda)\mathbf{X}_{\mathcal{A}_k}\boldsymbol{\delta}\rangle = \langle \mathbf{x}_\ell, \mathbf{r}_{k-1}\rangle - (\lambda_{k-1} - \lambda)\langle \mathbf{x}_\ell, \mathbf{X}_{\mathcal{A}_k}\boldsymbol{\delta}\rangle = c_\ell - (\lambda_{k-1} - \lambda)b_\ell, \qquad (8)$$
where we have written $c_\ell = \langle \mathbf{x}_\ell, \mathbf{r}_{k-1}\rangle$ and $b_\ell = \langle \mathbf{x}_\ell, \mathbf{X}_{\mathcal{A}_k}\boldsymbol{\delta}\rangle$. First we need to find $\lambda$ for each $\ell \notin \mathcal{A}_k$ such that $|\langle \mathbf{x}_\ell, \mathbf{r}(\lambda)\rangle| = \lambda$ holds. Due to (7) and (8), this means finding $0 < \lambda \le \lambda_{k-1}$ such that
$$c_\ell - (\lambda_{k-1} - \lambda)b_\ell = \lambda e^{\jmath\theta}. \qquad (9)$$
Let us reparametrize such that $\lambda = \lambda_{k-1} - \gamma_\ell$. Then identifying $0 < \lambda \le \lambda_{k-1}$ is equivalent to identifying the auxiliary variable $\gamma_\ell \ge 0$. Now (9) becomes
$$c_\ell - \gamma_\ell b_\ell = (\lambda_{k-1} - \gamma_\ell)e^{\jmath\theta} \iff |c_\ell - \gamma_\ell b_\ell|^2 = (\lambda_{k-1} - \gamma_\ell)^2 \iff |c_\ell|^2 - 2\gamma_\ell\,\mathrm{Re}(c_\ell b_\ell^*) + \gamma_\ell^2|b_\ell|^2 = \lambda_{k-1}^2 - 2\lambda_{k-1}\gamma_\ell + \gamma_\ell^2.$$
The last equation implies that $\gamma_\ell$ can be found by solving the roots of the second-order polynomial equation $A\gamma_\ell^2 + B\gamma_\ell + C = 0$, where $A = |b_\ell|^2 - 1$, $B = 2\lambda_{k-1} - 2\,\mathrm{Re}(c_\ell b_\ell^*)$ and $C = |c_\ell|^2 - \lambda_{k-1}^2$. The roots are $\gamma_\ell \in \{\gamma_{\ell,1}, \gamma_{\ell,2}\}$ and the final value of $\gamma_\ell$ is
$$\gamma_\ell = \begin{cases} \min(\gamma_{\ell,1}, \gamma_{\ell,2}), & \text{if } \gamma_{\ell,1} > 0 \text{ and } \gamma_{\ell,2} > 0, \\ \big(\max(\gamma_{\ell,1}, \gamma_{\ell,2})\big)_+, & \text{otherwise,} \end{cases}$$
where $(t)_+ = \max(0, t)$ for $t \in \mathbb{R}$. Thus finding the largest $\lambda$ that verifies (7) is equivalent to finding the smallest non-negative $\gamma_\ell$. Hence the variable that enters the active set $\mathcal{A}_{k+1}$ is $j_{k+1} = \arg\min_{\ell\notin\mathcal{A}_k} \gamma_\ell$ and the knot is thus $\lambda_k = \lambda_{k-1} - \gamma_{j_{k+1}}$. Thereafter the solution $\hat{\boldsymbol{\beta}}(\lambda_k)$ at the knot $\lambda_k$ is straightforward to obtain via the update in step 5.
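The knot search can be sketched in code as follows. This is our illustrative rendering of step 4 under the derivation above, not the authors' reference implementation (which is available in their MATLAB package); `active` is assumed to be the ordered index list $\mathcal{A}_k$ matching `delta`:

```python
import numpy as np

def next_knot(X, r_prev, delta, lam_prev, active):
    """For every inactive predictor, solve A*g^2 + B*g + C = 0 from
    Eq. (9) for the step g = lam_prev - lam, and keep the smallest
    positive step; returns (lambda_k, j_{k+1})."""
    p = X.shape[1]
    Xd = X[:, active] @ delta                    # X_{A_k} delta
    best_g, best_j = np.inf, None
    for l in range(p):
        if l in active:
            continue
        c = np.vdot(X[:, l], r_prev)             # c_l = <x_l, r_{k-1}>
        b = np.vdot(X[:, l], Xd)                 # b_l = <x_l, X_{A_k} delta>
        A = np.abs(b) ** 2 - 1.0
        B = 2.0 * lam_prev - 2.0 * np.real(c * np.conj(b))
        C = np.abs(c) ** 2 - lam_prev ** 2
        disc = B * B - 4.0 * A * C
        if disc < 0.0 or abs(A) < 1e-12:         # no admissible crossing (sketch)
            continue
        g1 = (-B - np.sqrt(disc)) / (2.0 * A)
        g2 = (-B + np.sqrt(disc)) / (2.0 * A)
        g = min(g1, g2) if (g1 > 0 and g2 > 0) else max(g1, g2, 0.0)
        if 0.0 < g < best_g:
            best_g, best_j = g, l
    return lam_prev - best_g, best_j
```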

IV. COMPLEX-VALUED PATHWISE WEIGHTED ELASTIC NET

Next we develop a complex-valued and weighted version of the PW-LARS-EN algorithm proposed in [18], referred to as the c-PW-WEN algorithm. The generalization to the complex-valued case is straightforward: the essential difference is that the c-LARS-WLasso Algorithm 1 is used instead of the (real-valued) LARS-Lasso algorithm. The algorithm finds the $K$th knot $\lambda_K$ and the corresponding WEN solutions on a dense grid of EN tuning parameter values $\alpha$, and then picks the final solution for the best $\alpha$ value.

Let $\lambda_0(\alpha)$ denote the smallest value of $\lambda$ such that all coefficients in the WEN estimate are zero, i.e., $\hat{\boldsymbol{\beta}}(\lambda_0, \alpha) = \mathbf{0}$. The value of $\lambda_0(\alpha)$ can be expressed in closed form [15]:
$$\lambda_0(\alpha) = \max_j \frac{1}{\alpha}\left|\frac{\langle \mathbf{x}_j, \mathbf{y}\rangle}{w_j}\right|, \quad j = 1,\dots,p.$$
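As a quick sketch of this closed form (assuming unit-norm columns $\mathbf{x}_j$, as elsewhere in the paper; the function name is ours):

```python
import numpy as np

def lambda0(y, X, w, alpha):
    """Smallest penalty for which the WEN solution is all zeros:
    lambda_0(alpha) = max_j |<x_j, y> / w_j| / alpha."""
    return np.max(np.abs(X.conj().T @ y) / (alpha * w))
```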

The c-PW-WEN algorithm computes $K$-sparse WEN solutions for a set of $\alpha$ values on a dense grid
$$[\alpha] = \{\alpha_i \in (0, 1] : \alpha_1 = 1 > \cdots > \alpha_m > 0\}. \qquad (10)$$
Let $\mathcal{A}(\lambda) = \mathrm{supp}\{\hat{\boldsymbol{\beta}}(\lambda, \alpha)\}$ denote the active set (i.e., the nonzero elements of the WEN solution) for a given fixed regularization parameter value $\lambda \equiv \lambda(\alpha) < \lambda_0(\alpha)$ and a given $\alpha$ value in the grid $[\alpha]$. The knots $\lambda_1(\alpha) > \lambda_2(\alpha) > \cdots > \lambda_K(\alpha)$ are the border values of the regularization parameter after which there is a change in the set of active predictors. Since $\alpha$ is fixed, we drop the dependency of the penalty parameter on $\alpha$ and simply write $\lambda$ or $\lambda_K$ instead of $\lambda(\alpha)$ or $\lambda_K(\alpha)$. The reader should, however, keep in mind that the values of the knots are different for each $\alpha$. The active set at a knot $\lambda_k$ is then denoted shortly by $\mathcal{A}_k \equiv \mathcal{A}(\lambda_k) = \mathrm{supp}\{\hat{\boldsymbol{\beta}}(\lambda_k, \alpha)\}$. Note that $\mathcal{A}_k \neq \mathcal{A}_{k+1}$ for all $k = 1,\dots,K$. Note also that it is assumed that a non-zero coefficient does not leave the active set for any value $\lambda > \lambda_K$, that is, the sparsity level increases from 0 (at $\lambda_0$) to $K$ (at $\lambda_K$).

Recalling that Algorithm 1 is invoked as c-LARS-WLasso$(\mathbf{y}, \mathbf{X}, \mathbf{w}, K)$, we write
$$\{\lambda_k, \hat{\boldsymbol{\beta}}(\lambda_k)\} = \text{c-LARS-WLasso}(\mathbf{y}, \mathbf{X}, \mathbf{w})\big|_k$$
for extracting the $k$th knot (and the corresponding solution) from the sequence of knot-solution pairs found by the c-LARS-WLasso algorithm. Next, note that we can write the EN objective function in augmented form as follows:
$$\frac12\|\mathbf{y} - \mathbf{X}\boldsymbol{\beta}\|_2^2 + \lambda P_\alpha(\boldsymbol{\beta}) = \frac12\|\mathbf{y}_a - \mathbf{X}_a(\eta)\boldsymbol{\beta}\|_2^2 + \gamma\|\boldsymbol{\beta}\|_1, \qquad (11)$$
where
$$\gamma = \lambda\alpha \quad \text{and} \quad \eta = \lambda(1 - \alpha) \qquad (12)$$
are new parameterizations of the tuning and shrinkage parameter pair $(\alpha, \lambda)$, and
$$\mathbf{y}_a = \begin{pmatrix}\mathbf{y}\\ \mathbf{0}\end{pmatrix} \quad \text{and} \quad \mathbf{X}_a(\eta) = \begin{pmatrix}\mathbf{X}\\ \sqrt{\eta}\,\mathbf{I}_p\end{pmatrix}$$
are the augmented forms of the response vector $\mathbf{y}$ and the predictor matrix $\mathbf{X}$, respectively. Note that (11) resembles the Lasso objective function, with $\mathbf{y}_a \in \mathbb{C}^{n+p}$ and $\mathbf{X}_a(\eta)$ an $(n+p)\times p$ matrix.
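The augmentation is mechanical and easy to verify numerically; a minimal sketch (function name ours):

```python
import numpy as np

def en_augment(y, X, lam, alpha):
    """Build (y_a, X_a, gamma) of Eqs. (11)-(12), turning the EN problem
    with pair (lam, alpha) into a Lasso problem with l1 penalty gamma."""
    p = X.shape[1]
    gamma = lam * alpha                  # l1 penalty of the augmented Lasso
    eta = lam * (1.0 - alpha)            # ridge part absorbed into X_a
    y_a = np.concatenate([y, np.zeros(p, dtype=y.dtype)])
    X_a = np.vstack([X, np.sqrt(eta) * np.eye(p, dtype=X.dtype)])
    return y_a, X_a, gamma
```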

It means that we can compute the $K$-sparse WEN solution at the $K$th knot for fixed $\alpha$ using the c-LARS-WLasso algorithm. Our c-PW-WEN method is given in Algorithm 2. It computes the WEN solutions at the knots over the dense grid (10) of $\alpha$ values. After step 6 of the algorithm we have the solution at the $K$th knot for a given $\alpha_i$ value on the grid. Having the solution $\hat{\boldsymbol{\beta}}(\lambda_K, \alpha_i)$ available, we then in steps 7 to 9 compute the residual sum of squares (RSS) of the debiased WEN solution at the $K$th knot (having $K$ nonzeros),
$$\mathrm{RSS}(\alpha_i) = \|\mathbf{y} - \mathbf{X}_{\mathcal{A}_K}\hat{\boldsymbol{\beta}}_{\mathrm{LS}}(\lambda_K, \alpha_i)\|_2^2,$$
where $\mathcal{A}_K = \mathrm{supp}(\hat{\boldsymbol{\beta}}(\lambda_K, \alpha_i))$ is the active set at the $K$th knot and $\hat{\boldsymbol{\beta}}_{\mathrm{LS}}(\lambda_K, \alpha_i)$ is the debiased LSE defined as
$$\hat{\boldsymbol{\beta}}_{\mathrm{LS}}(\lambda_K, \alpha_i) = \mathbf{X}_{\mathcal{A}_K}^{+}\mathbf{y}, \qquad (13)$$
where $\mathbf{X}_{\mathcal{A}_K} \in \mathbb{C}^{n\times K}$ consists of the $K$ active columns of $\mathbf{X}$ associated with the active set $\mathcal{A}_K$ and $\mathbf{X}_{\mathcal{A}_K}^{+}$ denotes its Moore-Penrose pseudo-inverse.

While sweeping through the grid of $\alpha$ values and computing the WEN solutions $\hat{\boldsymbol{\beta}}(\lambda_K, \alpha_i)$, we choose our best candidate solution as the WEN estimate $\hat{\boldsymbol{\beta}}(\lambda_K, \alpha_\imath)$ that has the smallest RSS value, i.e., $\imath = \arg\min_i \mathrm{RSS}(\alpha_i)$, where $i \in \{1,\dots,m\}$.


Algorithm 2: c-PW-WEN

input: $\mathbf{y} \in \mathbb{C}^n$, $\mathbf{X} \in \mathbb{C}^{n\times p}$, $\mathbf{w} \in \mathbb{R}^p$, $[\alpha] \in \mathbb{R}^m$ (recall $\alpha_1 = 1$), $K$ and debias
output: $\hat{\boldsymbol{\beta}}_K \in \mathbb{C}^p$ and $\mathcal{A}_K$

1. $\{\lambda_k(\alpha_1), \hat{\boldsymbol{\beta}}(\lambda_k, \alpha_1)\}_{k=0}^{K} = \text{c-LARS-WLasso}(\mathbf{y}, \mathbf{X}, \mathbf{w}, K)$
2. for $i = 2$ to $m$ do
3.   for $k = 1$ to $K$ do
4.     $\eta_k = \lambda_k(\alpha_{i-1}) \cdot (1 - \alpha_i)$
5.     $\{\gamma_k, \hat{\boldsymbol{\beta}}(\lambda_k, \alpha_i)\} = \text{c-LARS-WLasso}(\mathbf{y}_a, \mathbf{X}_a(\eta_k), \mathbf{w})\big|_k$
6.     $\lambda_k(\alpha_i) = \gamma_k / \alpha_i$
7.   $\mathcal{A}_K = \mathrm{supp}\{\hat{\boldsymbol{\beta}}(\lambda_K, \alpha_i)\}$
8.   $\hat{\boldsymbol{\beta}}_{\mathrm{LS}}(\lambda_K, \alpha_i) = \mathbf{X}_{\mathcal{A}_K}^{+}\mathbf{y}$
9.   $\mathrm{RSS}(\alpha_i) = \|\mathbf{y} - \mathbf{X}_{\mathcal{A}_K}\hat{\boldsymbol{\beta}}_{\mathrm{LS}}(\lambda_K, \alpha_i)\|_2^2$
10. $\imath = \arg\min_i \mathrm{RSS}(\alpha_i)$
11. $\hat{\boldsymbol{\beta}}_K = \hat{\boldsymbol{\beta}}(\lambda_K, \alpha_\imath)$ and $\mathcal{A}_K = \mathrm{supp}\{\hat{\boldsymbol{\beta}}_K\}$
12. if debias then $[\hat{\boldsymbol{\beta}}_K]_{\mathcal{A}_K} = \mathbf{X}_{\mathcal{A}_K}^{+}\mathbf{y}$

V. SEQUENTIALLY ADAPTIVE ELASTIC NET

Next we turn our attention to how to choose the adaptive (i.e., data-dependent) weights in c-PW-WEN. In the adaptive Lasso [20] one ideally uses the LSE, or if $p > n$ the Lasso, as the initial estimator $\hat{\boldsymbol{\beta}}_{\mathrm{init}}$ to construct the weights given in (4). The problem is that both the LSE and the Lasso estimator have very poor accuracy (high variance) when there are high correlations between predictors, which is the condition we are concerned with in this paper. This lowers the probability of exact recovery of the adaptive Lasso significantly.

To overcome this problem, we devise a sequential adaptive elastic net (SAEN) algorithm that obtains the $K$-sparse solution in a sequential manner, decreasing the sparsity level of the solution at each iteration and using the previous solution as adaptive weights for c-PW-WEN. The SAEN is described in Algorithm 3. SAEN runs the c-PW-WEN algorithm three times. In the first step it finds a standard (unit-weight) c-PW-WEN solution with $3K$ nonzero (active) coefficients, which we refer to as the initial EN solution $\hat{\boldsymbol{\beta}}_{\mathrm{init}}$. The obtained solution determines the adaptive weights via (4) (and hence the active set of $3K$ nonzero coefficients), which are used in the second step to compute the c-PW-WEN solution that has $2K$ nonzero coefficients. This again determines the adaptive weights via (4) (and hence the active set of $2K$ nonzero coefficients), which are used in the third step to compute the c-PW-WEN solution that has the desired $K$ nonzero coefficients. It is important to notice that since we start from a solution with $3K$ nonzeros, it is quite likely that the true $K$ non-zero coefficients will be included in the active set of $\hat{\boldsymbol{\beta}}_{\mathrm{init}}$, which is computed in the first step of the SAEN algorithm. Note that the choice of $3K$ is similar to the CoSaMP algorithm [29], which also uses $3K$ as an initial support size. Using $3K$ also usually guarantees that $\mathbf{X}_{\mathcal{A}_{3K}}$ is well conditioned, which may not be the case if a value larger than $3K$ is chosen.

Algorithm 3: SAEN

input: $\mathbf{y} \in \mathbb{C}^n$, $\mathbf{X} \in \mathbb{C}^{n\times p}$, $[\alpha] \in \mathbb{R}^m$ and $K$
output: $\hat{\boldsymbol{\beta}}_K \in \mathbb{C}^p$

1. $\{\hat{\boldsymbol{\beta}}_{\mathrm{init}}, \mathcal{A}_{3K}\} = \text{c-PW-WEN}(\mathbf{y}, \mathbf{X}, \mathbf{1}_p, [\alpha], 3K, 0)$
2. $\{\hat{\boldsymbol{\beta}}, \mathcal{A}_{2K}\} = \text{c-PW-WEN}\big(\mathbf{y}, \mathbf{X}_{\mathcal{A}_{3K}}, \mathbf{1}_{3K} \oslash |[\hat{\boldsymbol{\beta}}_{\mathrm{init}}]_{\mathcal{A}_{3K}}|, [\alpha], 2K, 0\big)$
3. $\{\hat{\boldsymbol{\beta}}_K, \mathcal{A}_K\} = \text{c-PW-WEN}\big(\mathbf{y}, \mathbf{X}_{\mathcal{A}_{2K}}, \mathbf{1}_{2K} \oslash |\hat{\boldsymbol{\beta}}_{\mathcal{A}_{2K}}|, [\alpha], K, 1\big)$

VI. SINGLE-SNAPSHOT COMPRESSIVE BEAMFORMING

Estimating the source location in terms of its DoA plays an important role in many applications. In [5] it was observed that CS algorithms can be applied for DoA estimation (e.g., of sound sources) using sensor arrays when the array output $\mathbf{y}$ can be expressed as a sparse (underdetermined) linear model by discretizing the DoA parameter space. This approach is referred to as compressive beamforming (CBF), and it has been subsequently used in a series of papers (e.g., [7-13,26]).

In CBF, after finding the sparse regression estimate, its support can be mapped to the DoA estimates on the grid. Thus the DoA estimates in CBF are always selected from the resulting finite set of discretized DoA parameters. Hence the resolution of CBF depends on the density of the grid (spacing $\Delta\theta$ or grid size $p$). A denser grid implies larger mutual coherence of the basis vectors $\mathbf{x}_j$ (here equal to the steering vectors for the DoAs on the grid) and thus a poor recovery region for most sparse regression techniques.

The proposed SAEN estimator can effectively mitigate the effect of the high mutual coherence caused by discretization of the DoA space, with significantly better performance than state-of-the-art compressed sensing algorithms. This is illustrated in Section VII via extensive simulation studies using challenging multi-source set-ups with closely spaced sources and large variation of source powers.

We assume narrowband processing and a far-field source wave impinging on an array of sensors with known configuration. The sources are assumed to be located in the far field of the sensor array (i.e., propagation radius ≫ array size). A uniform linear array (ULA) of $n$ sensors (e.g., hydrophones or microphones) is used for estimating the DoA $\theta \in [-90°, 90°)$ of a source with respect to the array axis. The array response (steering or wavefront vector) of the ULA for a source from DoA (in radians) $\theta \in [-\pi/2, \pi/2)$ is given by
$$\mathbf{a}(\theta) \triangleq \frac{1}{\sqrt{n}}\big[1,\ e^{\jmath\pi\sin\theta},\ \dots,\ e^{\jmath\pi(n-1)\sin\theta}\big]^\top,$$
where we assume half-wavelength inter-element spacing between the sensors. We consider the case where $K < n$ sources from distinct DoAs $\boldsymbol{\theta} = (\theta_1,\dots,\theta_K)^\top$ arrive at the array at some time instant $t$. A single snapshot obtained by the ULA can then be modeled as [27]
$$\mathbf{y}(t) = \mathbf{A}(\boldsymbol{\theta})\mathbf{s}(t) + \boldsymbol{\varepsilon}(t), \qquad (14)$$
where $\boldsymbol{\theta} = (\theta_1,\dots,\theta_K)^\top$ collects the DoAs, the matrix $\mathbf{A}(\boldsymbol{\theta}) = [\mathbf{a}(\theta_1)\ \cdots\ \mathbf{a}(\theta_K)] \in \mathbb{C}^{n\times K}$ is the dictionary of replicas, also known as the array steering matrix, $\mathbf{s}(t) \in \mathbb{C}^K$ contains the source waveforms, and $\boldsymbol{\varepsilon}(t)$ is complex noise at time instant $t$.
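A short sketch of the steering vectors and the resulting grid-based measurement matrix (angles in radians, matching the normalization $\|\mathbf{a}(\theta)\|_2 = 1$ above; function names are ours):

```python
import numpy as np

def ula_steering(theta, n):
    """ULA steering vector a(theta) with half-wavelength spacing,
    normalized so that a(theta)^H a(theta) = 1; theta in radians."""
    k = np.arange(n)
    return np.exp(1j * np.pi * k * np.sin(theta)) / np.sqrt(n)

def steering_matrix(grid, n):
    """Measurement matrix X with columns x_i = a(grid_i)."""
    return np.column_stack([ula_steering(th, n) for th in grid])
```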

Consider an angular grid of size $p$ (commonly $p \gg n$) of look directions of interest,
$$[\vartheta] = \{\vartheta_i \in [-\pi/2, \pi/2) : \vartheta_1 < \cdots < \vartheta_p\}.$$
Let the $i$th column of the measurement matrix $\mathbf{X}$ in the model (1) be the array response for look direction $\vartheta_i$, so $\mathbf{x}_i = \mathbf{a}(\vartheta_i)$. Then, if the true source DoAs are contained in the angular grid, i.e., $\theta_i \in [\vartheta]$ for $i = 1,\dots,K$, the snapshot $\mathbf{y}$ in (14) (where we drop the time index $t$) can be equivalently modeled by (1) as
$$\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon},$$
where $\boldsymbol{\beta}$ is exactly $K$-sparse ($\|\boldsymbol{\beta}\|_0 = K$) and the nonzero elements of $\boldsymbol{\beta}$ contain the source waveforms $\mathbf{s}$. Thus identifying the true DoAs is equivalent to identifying the nonzero elements of $\boldsymbol{\beta}$, which we refer to as the CBF principle. Hence sparse regression and CS methods can be utilised for estimating the DoAs based on a single snapshot only. We assume that the number of sources $K$ is known a priori.

Besides the SNR, also the size $p$ or spacing $\Delta\theta$ of the grid greatly affects the performance of CBF methods. The cross-correlation (coherence) between the true steering vector and the steering vectors on the grid depends on both the grid spacing and the obliqueness of the target DoAs with respect to the array. Moreover, the values of the cross-correlations in the Gram matrix $|\mathbf{X}^{\mathsf H}\mathbf{X}|$ also depend on the distance between array elements and the configuration of the sensor array [7]. Let us construct a measure, called the maximal basis coherence (MBC), defined as the maximum absolute value of the cross-correlations between the true steering vectors $\mathbf{a}(\theta_j)$ and the basis $\{\mathbf{a}(\vartheta_i) : \vartheta_i \in [\vartheta]\setminus\theta_j\}$, $j \in \{1,\dots,K\}$:
$$\mathrm{MBC} = \max_j\ \max_{\vartheta\in[\vartheta]\setminus\theta_j} \big|\mathbf{a}(\theta_j)^{\mathsf H}\mathbf{a}(\vartheta)\big|. \qquad (15)$$
Note that the steering vectors $\mathbf{a}(\theta)$, $\theta \in [-\pi/2, \pi/2)$, are assumed to be normalized ($\mathbf{a}(\theta)^{\mathsf H}\mathbf{a}(\theta) = 1$). The MBC value measures the obliqueness of the incoming DoA to the array: the higher the MBC value, the more difficult it is for any CBF method to distinguish the true DoA in the grid. Note that the value of the MBC also depends on the grid spacing $\Delta\theta$.

Fig. 3 shows the geometry of the DoA estimation problem, where the target DoAs have varying basis coherence, which increases with the level of obliqueness (inclination) on either side of the normal to the ULA axis. We define a straight DoA when the angle of incidence of the target DoA is in the shaded sector in Fig. 3; this is the region where the MBC has lower values. In contrast, an oblique DoA is defined when the angle of incidence of the target is oblique with respect to the array axis.

FIG. 3. (Color online) The straight and oblique DoAs exhibit different basis coherence.

Consider the case of a ULA with $n = 40$ elements receiving two sources either at straight DoAs $\theta_1 = -6°$ and $\theta_2 = 2°$ or at oblique DoAs $\theta_1 = 44°$ and $\theta_2 = 52°$. These scenarios correspond to set-up 2 and set-up 3 of Section VII; see also Table II. The angular separation between the DoAs is $8°$ in both scenarios. In Table I we compute the correlation between the true steering vectors, and with a neighboring steering vector $\mathbf{a}(\vartheta)$ on the grid.

TABLE I. Correlation between the true steering vectors at DoAs θ1 and θ2, and between a true steering vector and a steering vector at angle ϑ on the grid, in the two-source scenario set-ups with either straight or oblique DoAs.

                        θ1 [°]   θ2 [°]   ϑ [°]   correlation
True straight DoAs      −6        2        —       0.071
                        −6        —       −7       0.814
                         —        2        3       0.812
True oblique DoAs       44       52        —       0.069
                        44        —       45       0.901
                         —       52       53       0.927

These values validate the fact highlighted in Fig. 3: a target with an oblique DoA with respect to the array has a larger maximal correlation (coherence) with the basis steering vectors. This makes it difficult for a sparse recovery method to distinguish the true steering vector $\mathbf{a}(\theta_i)$ from a spurious steering vector $\mathbf{a}(\vartheta)$ that simply has a very large correlation with the true one. Due to this mutual coherence, it may happen that neither $\mathbf{a}(\theta_i)$ nor $\mathbf{a}(\vartheta)$ is assigned a non-zero coefficient value in $\hat{\boldsymbol{\beta}}$, or perhaps just one of them in a random fashion.

VII. SIMULATION STUDIES

We consider seven simulation set-ups. The first five set-ups use grid spacing $\Delta\theta = 1°$ (leading to $p = 180$ look directions in the grid $[\vartheta]$) and the last two employ a sparser grid, $\Delta\theta = 2°$ (leading to $p = 90$ look directions). The number of sensors in the ULA is $n = 40$ for set-ups 1-5 and $n = 30$ for set-ups 6-7. Each set-up has $K \in \{2,3,4\}$ sources at different (straight or oblique) DoAs $\boldsymbol{\theta} = (\theta_1,\dots,\theta_K)^\top$, and the source waveforms are generated as $s_k = |s_k|\cdot e^{\jmath\,\mathrm{Arg}(s_k)}$, where the source powers $|s_k| \in (0,1]$ are fixed for each set-up but the source phases are randomly generated for each Monte Carlo trial as $\mathrm{Arg}(s_k) \sim \mathrm{Unif}(0, 2\pi)$, $k = 1,\dots,K$. Table II specifies the DoAs and powers of the sources used in the set-ups. The MBC values (15) are also reported for each case.

TABLE II. Details of all the set-ups tested in this paper. The first five set-ups have grid spacing ∆θ = 1° and the last two ∆θ = 2°.

Set-up   |s_i|                   θ [°]                          MBC
1        [0.9, 1, 1]             [−5, 3, 6]                     0.814
2        [0.9, 1]                [−6, 2]                        0.814
3        [0.9, 1]                [44, 52]                       0.927
4        [0.8, 0.7, 1]           [43, 44, 52]                   0.927
5        [0.9, 0.1, 1, 0.4]      [−8.7, −3.8, −3.5, 9.7]        0.990
6        [0.8, 1, 0.9, 0.4]      −[48.5, 46.4, 31.5, 2.2]       0.991
7        [0.7, 1, 0.6, 0.7]      [6, 8, 14, 18]                 0.643

The error terms $\varepsilon_i$ are i.i.d. and generated from a $\mathcal{CN}(0, \sigma^2)$ distribution, where the noise variance $\sigma^2$ depends on the signal-to-noise ratio (SNR) level in decibels (dB), given by $\mathrm{SNR(dB)} = 10\log_{10}(\sigma_s^2/\sigma^2)$, where $\sigma_s^2 = \frac{1}{K}\big(|s_1|^2 + |s_2|^2 + \cdots + |s_K|^2\big)$ denotes the average source power. An SNR level of 20 dB is used in this paper unless specified otherwise, and the number of Monte Carlo trials is $L = 1000$.

In each set-up we evaluate the performance of all methods in recovering exactly the true support and the source powers. Due to the high mutual coherence and the large differences in source powers, DoA estimation is now a challenging task. A key performance measure is the (empirical) probability of exact recovery (PER) of all $K$ sources, defined as
$$\mathrm{PER} = \mathrm{ave}\{I(\mathcal{A}_* = \mathcal{A}_K)\},$$
where $\mathcal{A}_*$ denotes the index set of the true source DoAs on the grid and $\mathcal{A}_K$ the found support set, with $|\mathcal{A}_*| = |\mathcal{A}_K| = K$, and the average is over all Monte Carlo trials. Above, $I(\cdot)$ denotes the indicator function. We also compute the average root mean squared error (RMSE) of the debiased estimate $\hat{\mathbf{s}} = \arg\min_{\mathbf{s}\in\mathbb{C}^K}\|\mathbf{y} - \mathbf{X}_{\mathcal{A}_K}\mathbf{s}\|_2$ of the source vector as $\mathrm{RMSE} = \sqrt{\mathrm{ave}\{\|\hat{\mathbf{s}} - \mathbf{s}\|^2\}}$.

A. Compared methods

This paper compares the SAEN approach to the existing well-known greedy methods, such as orthogonal matching pursuit (OMP) [28] and compressive sampling matching pursuit (CoSaMP) [29]. Moreover, we also draw comparisons to two special cases of the c-PW-WEN algorithm: the Lasso estimate that has $K$ non-zeros (i.e., $\hat{\boldsymbol{\beta}}(\lambda_K, 1)$ computed by c-PW-WEN using $\mathbf{w} = \mathbf{1}$ and $\alpha = 1$) and the EN estimator when cherry-picking the best $\alpha$ in the grid $[\alpha] = \{\alpha_i \in (0,1] : \alpha_1 = 1 > \cdots > \alpha_m > 0.01\}$ (i.e., $\hat{\boldsymbol{\beta}}(\lambda_K, \alpha_{\mathrm{bst}})$).

It is instructive to compare SAEN to a simpler adaptive EN (AEN) approach that simply uses adaptive weights to weight different coefficients differently, in the spirit of the adaptive Lasso [20]. This helps in understanding the effectiveness of the cleverly chosen weights and the usefulness of the three-step procedure used by the SAEN algorithm. Recall that the first step in an AEN approach is to compute the weights using some initial solution. After obtaining the (adaptive) weights, the final $K$-sparse AEN solution is computed. We devise three AEN approaches, each using a different initial solution to compute the weights:

1. AEN(LSE) uses the weights
$$\mathbf{w}_{(\mathrm{LSE})} = \mathbf{1} \oslash |\mathbf{X}^{+}\mathbf{y}|,$$
where $\mathbf{X}^{+}$ is the Moore-Penrose pseudo-inverse of $\mathbf{X}$.

2. AEN(n) employs weights from an initial $n$-sparse EN solution $\hat{\boldsymbol{\beta}}(\lambda_n, \alpha)$ at the $n$th knot, which is found by the c-PW-WEN algorithm with $\mathbf{w} = \mathbf{1}$.

3. AEN(3K) instead uses weights calculated from an initial EN solution $\hat{\boldsymbol{\beta}}(\lambda_{3K}, \alpha)$ having $3K$ nonzeros, as in step 1 of the SAEN algorithm, but the remaining two steps of SAEN are omitted.

The upper bound for the PER rate of SAEN is the (empirical) probability that the initial solution $\hat{\boldsymbol{\beta}}_{\mathrm{init}}$ computed in step 1 of Algorithm 3 contains the true support $\mathcal{A}_*$, i.e., the value
$$\mathrm{UB} = \mathrm{ave}\big\{I\big(\mathcal{A}_* \subset \mathrm{supp}(\hat{\boldsymbol{\beta}}_{\mathrm{init}})\big)\big\}, \qquad (16)$$
where the average is over all Monte Carlo trials. We also compute this upper bound to illustrate the ability of SAEN to pick the true $K$-sparse support from the original $3K$-sparse initial value. For set-up 1 (cf. Table II), the average recovery results for all of the above-mentioned methods are provided in Table III.


TABLE III. Recovery results for set-up 1. The results illustrate the effectiveness of the three-step SAEN approach compared to its competitors. The SNR level was 20 dB, and the upper bound (16) for the PER rate of SAEN is given in parentheses.

         SAEN            AEN(3K)   AEN(n)
PER      (0.957) 0.864   0.689     0.616
RMSE     0.449           0.694     0.889

         AEN(LSE)   EN      Lasso   OMP     CoSaMP
PER      0          0.332   0.332   0.477   0.140
RMSE     1.870      1.163   1.163   1.060   3.558

It can be noted that the proposed SAEN outperforms all other methods and weighting schemes, and recovers the true support and powers of the sources effectively. Note that SAEN's upper bound for the PER rate was 95.7%, and SAEN reached a PER rate of 86.4%. The results of the AEN approaches validate the need for an accurate initial estimate to construct the adaptive weights. For example, AEN(3K) performs better than AEN(n), but much worse than the SAEN method.

B. Straight and oblique DoAs

Set-up 2 and set-up 3 correspond to the cases where the targets are at straight and oblique DoAs, respectively. Performance results of the sparse recovery algorithms are tabulated in Table IV. As can be seen, the upper bound for the PER rate of SAEN is a full 100%, which means that the true support is correctly included in the $3K$-sparse solution computed in step 1 of the algorithm. For set-up 2 (straight DoAs), all methods have almost full PER rates except CoSaMP, with a 67.8% rate. The performance of all estimators except SAEN changes drastically in set-up 3 (oblique DoAs). Here SAEN achieves a nearly perfect (~98%) PER rate, a reduction of only 2% compared to set-up 2. The other methods perform poorly: for example, the PER rate of Lasso drops from nearly 98% to 40%. Similar behavior is observed for EN, OMP and CoSaMP.

Next we discuss the results for set-up 4, which is similar to set-up 3 except that we have introduced a third source that also arrives from an oblique DoA, $\theta = 43°$, and the variation of the source powers is slightly larger. As can be noted from Table IV, the PER rates of the greedy algorithms OMP and CoSaMP have declined to a strikingly low 0%. This is very different from the PER rates they had in set-up 3, which contained only two sources. Indeed, the inclusion of a third source from a DoA similar to the other two completely ruined their accuracy. This is in deep contrast with the SAEN method, which still achieves a PER rate of 75%, more than twice the PER rate achieved by Lasso. SAEN again has the lowest RMSE values.

TABLE IV. Recovery results for set-ups 2-4. Note that for oblique DoAs (set-ups 3 and 4) the SAEN method outperforms the other methods, and it has perfect recovery results for set-up 2 (straight DoAs). SNR level is 20 dB. The upper bound (16) for the PER rate of SAEN is given in parentheses.

Set-up 2 with two straight DoAs
         SAEN            EN      Lasso   OMP     CoSaMP
PER      (1.000) 1.000   0.981   0.981   0.998   0.678
RMSE     0.126           0.145   0.145   0.128   1.436

Set-up 3 with two oblique DoAs
         SAEN            EN      Lasso   OMP     CoSaMP
PER      (1.000) 0.978   0.399   0.399   0.613   0.113
RMSE     0.154           0.916   0.916   0.624   2.296

Set-up 4 with three oblique DoAs
         SAEN            EN      Lasso   OMP     CoSaMP
PER      (0.776) 0.749   0.392   0.378   0       0
RMSE     0.505           0.838   0.827   1.087   5.290

In summary, the recovery results for set-ups 1-4 (which express different degrees of basis coherence, proximity of target DoAs, and variation of source powers) clearly illustrate that the proposed SAEN performs very well in identifying the true support and the powers of the sources, always outperforming the commonly used benchmark sparse recovery methods, namely the Lasso, EN, OMP and CoSaMP, with a significant margin. It is also noteworthy that EN often achieved better PER rates than Lasso, which is mainly due to its group selection ability. As a specific example of this particular feature, Figure 4 shows the solution paths of Lasso and EN for one particular Monte Carlo trial, where EN correctly chooses the true DoAs but Lasso fails to select all correct DoAs. In this particular instance the EN tuning parameter was $\alpha = 0.9$. This is the reason behind the success of our c-PW-WEN algorithm, which is the core computational engine of SAEN.

FIG. 4. (Color online) The Lasso and EN solution paths (upper panels) and the respective DoA solutions at the knot $\lambda_3$ (lower panel). Observe that Lasso fails to recover the true support, but EN successfully picks the true DoAs. The EN tuning parameter was $\alpha = 0.9$ in this example.

C. Off-grid sources

Set-ups 5 and 6 explore the case when the target DoAs are off the grid. Also note that set-up 5 uses a finer grid spacing, $\Delta\theta = 1°$, compared to set-up 6 with $\Delta\theta = 2°$. Both set-ups contain four target sources that have largely varying source powers. In the off-grid case one would like the CBF method to localize the targets to the nearest DoAs in the angular grid $[\vartheta]$ that is used to construct the array steering matrix. Therefore, in the off-grid case, the PER rate refers to the case that the CBF method selects the $K$-sparse support that corresponds to the DoAs on the grid that are closest in distance to the true DoAs. Table V provides the recovery results. As can be seen, SAEN again performs very well, outperforming the Lasso and EN. Note that OMP and CoSaMP completely fail in selecting the nearest grid points.

TABLE V. Performance results of CBF methods for set-ups 5 and 6, where the target DoAs are off the grid. Here the PER rate refers to the case that the CBF method selects the K-sparse support that corresponds to the DoAs on the grid that are closest to the true DoAs. The upper bound (16) for the PER rate of SAEN is given in parentheses. SNR level is 20 dB.

Set-up 5 with four off-grid straight DoAs
         SAEN            EN      Lasso   OMP     CoSaMP
PER      (0.999) 0.649   0.349   0.328   0       0
RMSE     0.899           0.947   0.943   1.137   8.909

Set-up 6 with four off-grid oblique DoAs
         SAEN            EN      Lasso   OMP     CoSaMP
PER      (0.794) 0.683   0.336   0.336   0       0.005
RMSE     0.811           0.913   0.911   1.360   28.919

FIG. 5. (Color online) PER rates of the CBF methods at different SNR levels for set-up 7.

D. More targets and varying SNR levels

Next we consider set-up 7 (cf. Table II), which contains $K = 4$ sources. The first three sources are at straight DoAs and the fourth one at a DoA with modest obliqueness ($\theta_4 = 18°$). We now compute the PER rates of the methods as a function of the SNR. From the PER rates shown in Fig. 5 we again notice that SAEN clearly outperforms all of the other methods. Note that the upper bound (16) of the PER rate of SAEN is also plotted. Both greedy algorithms, OMP and CoSaMP, perform very poorly even at high SNR levels. Lasso and EN attain better recovery results than the greedy algorithms. Again, EN performs better than Lasso due to the additional flexibility offered by the EN tuning parameter and its ability to cope with correlated steering (basis) vectors. SAEN recovers the exact true support in most of the cases due to its step-wise adaptation using cleverly chosen weights. Furthermore, the improvement in PER rates offered by SAEN becomes larger as the SNR level increases. One can also notice that SAEN is close to the theoretical upper bound of the PER rate in the higher SNR regime.

VIII. CONCLUSIONS

We developed the c-PW-WEN algorithm, which computes weighted elastic net solutions at the knots of the penalty parameter over a grid of EN tuning parameter values. c-PW-WEN also computes the weighted Lasso as a special case (i.e., the solution at $\alpha = 1$), and the adaptive EN (AEN) is obtained when adaptive (data-dependent) weights are used. We then proposed a novel SAEN approach that uses the c-PW-WEN method as its core computational engine and employs a three-step adaptive weighting scheme where the sparsity is decreased from $3K$ to $K$ in three steps. Simulations illustrated that SAEN performs better than the adaptive EN approaches. Furthermore, we illustrated that the $3K$-sparse initial solution computed in step 1 of SAEN provides smart weights for the subsequent steps and includes the true $K$-sparse support with high accuracy. The proposed SAEN algorithm thus accurately includes the true support at each step.


Using the $K$-sparse Lasso solution computed directly from the Lasso path at the $K$th knot fails to provide exact support recovery in many cases, especially when we have high basis coherence and lower SNR. Greedy algorithms often fail in the face of high mutual coherence (due to dense grid spacing or oblique target DoAs) or low SNR. This is mainly due to the fact that their performance heavily depends on their ability to accurately detect the maximal correlation between the measurement vector $\mathbf{y}$ and the basis vectors (column vectors of $\mathbf{X}$). Our simulation study also showed that their performance (in terms of PER rate) deteriorates when the number of targets increases. In the off-grid case, the greedy algorithms also failed to find the nearby grid points.

Finally, the SAEN algorithm performed better than all other methods in each set-up, and the improvement was more pronounced in the presence of high mutual coherence. This is due to the ability of SAEN to include the true support correctly at all three steps of the algorithm. Our MATLAB® package that implements the proposed algorithms is freely available at www.github.com/mntabassm/SAEN-LARS. The package also contains a MATLAB® live script demo on how to use the method in the CBF problem, along with an example from simulation set-up 4 presented in the paper.

ACKNOWLEDGMENTS

The research was partially supported by the Academy of Finland, grant no. 298118, which is gratefully acknowledged.

1. R. Tibshirani, "Regression shrinkage and selection via the lasso," Journal of the Royal Statistical Society, Series B (Methodological), 267-288 (1996).
2. H. Zou and T. Hastie, "Regularization and variable selection via the elastic net," Journal of the Royal Statistical Society, Series B (Statistical Methodology) 67(2), 301-320 (2005).
3. T. Yardibi, J. Li, P. Stoica, and L. N. Cattafesta III, "Sparsity constrained deconvolution approaches for acoustic source mapping," The Journal of the Acoustical Society of America 123(5), 2631-2642 (2008), doi: 10.1121/1.2896754.
4. A. Xenaki, E. Fernandez-Grande, and P. Gerstoft, "Block-sparse beamforming for spatially extended sources in a Bayesian formulation," The Journal of the Acoustical Society of America 140(3), 1828-1838 (2016).
5. D. Malioutov, M. Cetin, and A. S. Willsky, "A sparse signal reconstruction perspective for source localization with sensor arrays," IEEE Transactions on Signal Processing 53(8), 3010-3022 (2005), doi: 10.1109/TSP.2005.850882.
6. G. F. Edelmann and C. F. Gaumond, "Beamforming using compressive sensing," The Journal of the Acoustical Society of America 130(4), EL232-EL237 (2011), doi: 10.1121/1.3632046.
7. A. Xenaki, P. Gerstoft, and K. Mosegaard, "Compressive beamforming," The Journal of the Acoustical Society of America 136(1), 260-271 (2014), doi: 10.1121/1.4883360.
8. P. Gerstoft, A. Xenaki, and C. F. Mecklenbräuker, "Multiple and single snapshot compressive beamforming," The Journal of the Acoustical Society of America 138(4), 2003-2014 (2015), doi: 10.1121/1.4929941.
9. Y. Choo and W. Seong, "Compressive spherical beamforming for localization of incipient tip vortex cavitation," The Journal of the Acoustical Society of America 140(6), 4085-4090 (2016), doi: 10.1121/1.4968576.
10. A. Das, W. S. Hodgkiss, and P. Gerstoft, "Coherent multipath direction-of-arrival resolution using compressed sensing," IEEE Journal of Oceanic Engineering 42(2), 494-505 (2017), doi: 10.1109/JOE.2016.2576198.
11. S. Fortunati, R. Grasso, F. Gini, M. S. Greco, and K. LePage, "Single-snapshot DOA estimation by using compressed sensing," EURASIP Journal on Advances in Signal Processing 2014(1), 120 (2014), doi: 10.1186/1687-6180-2014-120.
12. E. Ollila, "Nonparametric simultaneous sparse recovery: An application to source localization," in Proc. European Signal Processing Conference (EUSIPCO'15), Nice, France (2015), pp. 509-513.
13. E. Ollila, "Multichannel sparse recovery of complex-valued signals using Huber's criterion," in Proc. Compressed Sensing Theory and its Applications to Radar, Sonar and Remote Sensing (CoSeRa'15), Pisa, Italy (2015), pp. 32-36.
14. M. N. Tabassum and E. Ollila, "Single-snapshot DoA estimation using adaptive elastic net in the complex domain," in 4th International Workshop on Compressed Sensing Theory and its Applications to Radar, Sonar and Remote Sensing (CoSeRa) (2016), pp. 197-201, doi: 10.1109/CoSeRa.2016.7745728.
15. T. Hastie, R. Tibshirani, and M. Wainwright, Statistical Learning with Sparsity: The Lasso and Generalizations (CRC Press, 2015).
16. M. Osborne, B. Presnell, and B. Turlach, "A new approach to variable selection in least squares problems," IMA Journal of Numerical Analysis 20(3), 389-403 (2000), doi: 10.1093/imanum/20.3.389.
17. B. Efron, T. Hastie, I. Johnstone, and R. Tibshirani, "Least angle regression (with discussion)," The Annals of Statistics 32(2), 407-499 (2004), doi: 10.1214/009053604000000067.
18. M. N. Tabassum and E. Ollila, "Pathwise least angle regression and a significance test for the elastic net," in 25th European Signal Processing Conference (EUSIPCO) (2017), pp. 1309-1313, doi: 10.23919/EUSIPCO.2017.8081420.
19. S. Foucart and H. Rauhut, A Mathematical Introduction to Compressive Sensing, Vol. 1 (Springer, 2013).
20. H. Zou, "The adaptive lasso and its oracle properties," Journal of the American Statistical Association 101(476), 1418-1429 (2006), doi: 10.1198/016214506000000735.
21. A. E. Hoerl and R. W. Kennard, "Ridge regression: biased estimation for nonorthogonal problems," Technometrics 12(1), 55-67 (1970).
22. S. Rosset and J. Zhu, "Piecewise linear regularized solution paths," Ann. Statist. 35(3), 1012-1030 (2007), doi: 10.1214/009053606000001370.
23. R. J. Tibshirani and J. Taylor, "The solution path of the generalized lasso," Ann. Statist. 39(3), 1335-1371 (2011), doi: 10.1214/11-AOS878.
24. R. J. Tibshirani, "The lasso problem and uniqueness," Electron. J. Statist. 7, 1456-1490 (2013), doi: 10.1214/13-EJS815.
25. A. Panahi and M. Viberg, "Fast candidate points selection in the lasso path," IEEE Signal Processing Letters 19(2), 79-82 (2012).
26. K. L. Gemba, W. S. Hodgkiss, and P. Gerstoft, "Adaptive and compressive matched field processing," The Journal of the Acoustical Society of America 141(1), 92-103 (2017), doi: 10.1121/1.4973528.
27. A. B. Gershman, C. F. Mecklenbräuker, and J. F. Böhme, "Matrix fitting approach to direction of arrival estimation with imperfect spatial coherence of wavefronts," IEEE Transactions on Signal Processing 45(7), 1894-1899 (1997), doi: 10.1109/78.599968.
28. J. A. Tropp and A. C. Gilbert, "Signal recovery from random measurements via orthogonal matching pursuit," IEEE Transactions on Information Theory 53(12), 4655-4666 (2007), doi: 10.1109/TIT.2007.909108.
29. D. Needell and J. Tropp, "CoSaMP: Iterative signal recovery from incomplete and inaccurate samples," Applied and Computational Harmonic Analysis 26(3), 301-321 (2009).

12 J Acoust Soc Am 22 May 2018 Sequential adaptive elastic net

Page 2: Sequential adaptive elastic net approach for single …Sequential adaptive elastic net approach for single-snapshot source localization MuhammadNaveedTabassum 1andEsaOllila Aalto University,

In this paper, we assume that the number of non-zero coefficients (i.e., the number of sources K arriving at a sensor array in the CBF problem) is known, and propose a complex-valued pathwise (c-PW-)WEN algorithm that utilizes c-LARS-WLasso along with the PW-LARS-EN algorithm proposed in [18] to compute the WEN path. c-PW-WEN computes the K-sparse WEN solutions over a grid of EN tuning parameter values and then selects the best final WEN solution. We also propose a novel sequential adaptive elastic net (SAEN) approach that applies adaptive c-PW-WEN sequentially by decreasing the sparsity level (order) from 3K to K in three stages. SAEN utilizes smartly chosen adaptive (i.e., data-dependent) weights that are based on the solutions obtained in the previous stage.

Application of the developed algorithms is illustrated in the single-snapshot CBF DoA estimation problem. In CBF, accurate recovery depends heavily on the user-specified angular estimation grid (the look directions of interest), which determines the array steering matrix consisting of array response vectors to the look directions (estimation grid points) of interest. A dense angular grid implies high mutual coherence, which indicates a poor recovery region for most sparse recovery methods [19]. The effectiveness of SAEN compared to state-of-the-art sparse signal recovery algorithms is illustrated via extensive simulation studies.

The paper is structured as follows. Section II discusses the WEN optimization problem and its benefits over the Lasso and adaptive Lasso. We introduce the c-LARS-WLasso method in Section III. In Section IV we develop the c-PW-WEN algorithm that finds the K-sparse WEN solutions over a grid of EN tuning parameter values. In Section V the SAEN approach is proposed. Section VI lays out the problem of estimating DoAs from single-snapshot measurements using CBF. Simulation studies using a large variety of set-ups are provided in Section VII. Finally, Section VIII concludes the paper.

Notation: Lowercase boldface letters are used for vectors and uppercase for matrices. The $\ell_2$- and $\ell_1$-norms are defined as $\|\mathbf{a}\|_2 = \sqrt{\mathbf{a}^{\mathsf H}\mathbf{a}}$ and $\|\mathbf{a}\|_1 = \sum_{i=1}^n |a_i|$, respectively, where $|a| = \sqrt{a a^*} = \sqrt{a_R^2 + a_I^2}$ denotes the modulus of a complex number $a = a_R + \jmath a_I$. The support $\mathcal{A}$ of a vector $\mathbf{a} \in \mathbb{C}^p$ is the index set of its nonzero elements, i.e., $\mathcal{A} = \mathrm{supp}(\mathbf{a}) = \{j \in \{1, \ldots, p\} : a_j \neq 0\}$. The $\ell_0$-(pseudo)norm of $\mathbf{a}$ is defined as $\|\mathbf{a}\|_0 = |\mathrm{supp}(\mathbf{a})|$, which equals the total number of nonzero elements in $\mathbf{a}$. For a vector $\boldsymbol{\beta} \in \mathbb{C}^p$ (resp. matrix $\mathbf{X} \in \mathbb{C}^{n\times p}$) and an index set $\mathcal{A}_K \subseteq \{1, 2, \ldots, p\}$ of cardinality $|\mathcal{A}_K| = K$, we denote by $\boldsymbol{\beta}_{\mathcal{A}_K}$ (resp. $\mathbf{X}_{\mathcal{A}_K}$) the $K \times 1$ vector (resp. $n \times K$ matrix) restricted to the components of $\boldsymbol{\beta}$ (resp. columns of $\mathbf{X}$) indexed by the set $\mathcal{A}_K$. We let $\mathbf{a} \otimes \mathbf{b}$ denote the Hadamard (i.e., element-wise) product of $\mathbf{a} \in \mathbb{C}^p$ and $\mathbf{b} \in \mathbb{C}^p$, and by $\mathbf{a} \oslash \mathbf{b}$ we denote element-wise division of vectors. We denote by $\langle \mathbf{a}, \mathbf{b} \rangle = \mathbf{a}^{\mathsf H}\mathbf{b}$ the usual Hermitian inner product of $\mathbb{C}^p$. Finally, $\mathrm{diag}(\mathbf{a})$ denotes the $p \times p$ matrix with the elements of $\mathbf{a}$ on its diagonal.

II. WEIGHTED ELASTIC NET FRAMEWORK

We consider the linear model where the $n$ complex-valued measurements are modeled as

$$\mathbf{y} = \mathbf{X}\boldsymbol\beta + \boldsymbol\varepsilon, \qquad (1)$$

where $\mathbf{X} \in \mathbb{C}^{n\times p}$ is a known complex-valued design or measurement matrix, $\boldsymbol\beta \in \mathbb{C}^p$ is the unknown vector of complex-valued regression coefficients, and $\boldsymbol\varepsilon \in \mathbb{C}^n$ is the complex noise vector. For ease of exposition we consider the centered linear model (i.e., we assume that the intercept is equal to zero). In this paper we deal with the underdetermined or ill-posed linear model, where $p > n$, and the primary interest is to find a sparse estimate of the unknown parameter vector $\boldsymbol\beta$ given $\mathbf{y} \in \mathbb{C}^n$ and $\mathbf{X} \in \mathbb{C}^{n\times p}$. We assume that the sparsity level $K = \|\boldsymbol\beta\|_0$, i.e., the number of non-zero elements of $\boldsymbol\beta$, is known. In the DoA finding problem using compressed beamforming this is equivalent to assuming that the number of sources arriving at the sensor array is known.

The WEN estimator finds the solution $\hat{\boldsymbol\beta} \in \mathbb{C}^p$ to the following constrained optimization problem:

$$\underset{\boldsymbol\beta \in \mathbb{C}^p}{\text{minimize}} \;\; \frac{1}{2}\|\mathbf{y} - \mathbf{X}\boldsymbol\beta\|_2^2 \quad \text{subject to} \quad P_\alpha(\boldsymbol\beta \mid \mathbf{w}) \le t, \qquad (2)$$

where $t \ge 0$ is a threshold parameter chosen by the user and

$$P_\alpha(\boldsymbol\beta \mid \mathbf{w}) = \sum_{j=1}^{p} w_j \Big( \alpha |\beta_j| + \frac{1-\alpha}{2}\, |\beta_j|^2 \Big)$$

is the WEN constraint (or penalty) function. The vector $\mathbf{w} = (w_1, \ldots, w_p)^\top$, $w_j \ge 0$ for $j = 1, \ldots, p$, collects the non-negative weights, and $\alpha \in [0, 1]$ is an EN tuning parameter. Both the weights $\mathbf{w}$ and $\alpha$ are chosen by the user. When the weights are data-dependent, we refer to the solution as the adaptive EN (AEN). Note that AEN is an extension of the adaptive Lasso [20].

The constrained optimization problem in (2) can also be written in an equivalent penalized form:

$$\hat{\boldsymbol\beta}(\lambda, \alpha) = \underset{\boldsymbol\beta \in \mathbb{C}^p}{\arg\min} \;\; \frac{1}{2}\|\mathbf{y} - \mathbf{X}\boldsymbol\beta\|_2^2 + \lambda P_\alpha(\boldsymbol\beta \mid \mathbf{w}), \qquad (3)$$

where $\lambda \ge 0$ is the penalty (or regularization) parameter. The problems in (2) and (3) are equivalent due to Lagrangian duality and either one can be solved. Herein we use (3).
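To fix ideas, the following NumPy sketch (the helper names are ours, not part of the paper's implementation) evaluates the WEN penalty and the penalized objective in (3) for given complex-valued data:

import numpy as np

def wen_penalty(beta, w, alpha):
    """Weighted elastic net penalty P_alpha(beta | w) of Eqs. (2)-(3)."""
    mod = np.abs(beta)  # moduli |beta_j| of the complex coefficients
    return np.sum(w * (alpha * mod + 0.5 * (1.0 - alpha) * mod**2))

def wen_objective(y, X, beta, w, alpha, lam):
    """Penalized WEN objective of Eq. (3)."""
    resid = y - X @ beta  # complex residual vector
    return 0.5 * np.linalg.norm(resid)**2 + lam * wen_penalty(beta, w, alpha)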

Recall that the EN tuning parameter $\alpha \in [0, 1]$ offers a blend between the Lasso and ridge regression. The benefit of EN is its ability to select correlated variables as a group, which is illustrated in Fig. 1. The EN penalty has singularities at the vertices like the Lasso, which is a necessary property for sparse estimation. It also has strictly convex edges which help in selecting variables as a group, a useful property when high correlations exist between predictors. Moreover, for $\mathbf{w} = \mathbf{1}$ (a vector of ones) and $\alpha = 1$, (3) results in a Lasso solution, and for $\mathbf{w} = \mathbf{1}$ and $\alpha = 0$ we obtain the ridge regression estimator [21].

FIG. 1. (Color online) The Lasso solution is often at the vertices (corners), but the EN solution can occur on edges as well, depending on the correlations among the variables. In the uncorrelated case of (a), both the Lasso and the EN have a sparse solution. When the predictors are highly correlated, as in (b), the EN performs group selection in contrast to the Lasso.

The WEN solution can be computed using any algorithm that finds the non-weighted ($\mathbf{w} = \mathbf{1}$) EN solution. To see this, let us write $\tilde{\mathbf{X}} = \mathbf{X}\,\mathrm{diag}(\mathbf{w})^{-1}$ and $\tilde{\boldsymbol\beta} = \boldsymbol\beta \otimes \mathbf{w}$. The WEN solution is then found by applying the following two steps (a sketch in code is given below):

1. Solve the (non-weighted) EN problem on the transformed data $(\mathbf{y}, \tilde{\mathbf{X}})$:

$$\hat{\tilde{\boldsymbol\beta}}(\lambda, \alpha) = \underset{\tilde{\boldsymbol\beta} \in \mathbb{C}^p}{\arg\min} \;\; \frac{1}{2}\|\mathbf{y} - \tilde{\mathbf{X}}\tilde{\boldsymbol\beta}\|_2^2 + \lambda P_\alpha(\tilde{\boldsymbol\beta}),$$

where $P_\alpha(\tilde{\boldsymbol\beta}) = P_\alpha(\tilde{\boldsymbol\beta} \mid \mathbf{1}_p)$ is the EN penalty.

2. The WEN solution for the original data $(\mathbf{y}, \mathbf{X})$ is $\hat{\boldsymbol\beta}(\lambda, \alpha) = \hat{\tilde{\boldsymbol\beta}}(\lambda, \alpha) \oslash \mathbf{w}$.
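A minimal sketch of this two-step recipe, assuming a generic complex-valued EN solver `solve_en` is available (a hypothetical callable, not part of the paper's implementation):

import numpy as np

def wen_via_en(y, X, w, lam, alpha, solve_en):
    """Compute the WEN solution via a non-weighted EN solver, following the
    transformation X~ = X diag(w)^{-1}, beta-hat = beta~-hat / w."""
    X_tilde = X / w[np.newaxis, :]            # scale column j by 1/w_j
    beta_tilde = solve_en(y, X_tilde, lam, alpha)
    return beta_tilde / w                     # map back to original coordinates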

Yet the standard (i.e., non-weighted, $w_j \equiv 1$ for $j = 1, \ldots, p$) EN estimator may perform inconsistent variable selection. The EN solution depends largely on $\lambda$ (and $\alpha$), and tuning these parameters optimally is a difficult problem. The adaptive Lasso [20] obtains the oracle variable selection property by using cleverly chosen adaptive weights for the regression coefficients in the $\ell_1$-penalty. We extend this idea to the WEN penalty, coupling it with an active-set approach in which WEN is applied only to the nonzero (active) coefficients. Our proposed adaptive EN uses data-dependent weights defined as

$$w_j = \begin{cases} 1/|\hat\beta_{\mathrm{init},j}|, & \hat\beta_{\mathrm{init},j} \neq 0 \\ \infty, & \hat\beta_{\mathrm{init},j} = 0 \end{cases}, \qquad j = 1, \ldots, p, \qquad (4)$$

where $\hat{\boldsymbol\beta}_{\mathrm{init}} \in \mathbb{C}^p$ denotes a sparse initial estimator of $\boldsymbol\beta$. The idea is that only nonzero coefficients are exploited, i.e., basis vectors with $\hat\beta_{\mathrm{init},j} = 0$ are omitted from the model, and thus the dimensionality of the linear model is reduced from $p$ to $K = \|\hat{\boldsymbol\beta}_{\mathrm{init}}\|_0$. Moreover, a larger weight means that the corresponding variable is penalized more heavily. The vector $\mathbf{w} = (w_1, \ldots, w_p)^\top$ can be written compactly as

$$\mathbf{w} = \mathbf{1}_p \oslash |\hat{\boldsymbol\beta}_{\mathrm{init}}|, \qquad (5)$$

where the notation $|\boldsymbol\beta|$ means element-wise application of the absolute value operator on the vector, i.e., $|\boldsymbol\beta| = (|\beta_1|, \ldots, |\beta_p|)^\top$.
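In code, the weight rule (4)-(5) amounts to the following (NumPy sketch, our naming):

import numpy as np

def adaptive_weights(beta_init):
    """Adaptive weights of Eqs. (4)-(5): w_j = 1/|beta_init_j|, with
    w_j = inf for coefficients that are exactly zero (variable dropped)."""
    mod = np.abs(beta_init)
    w = np.full(mod.shape, np.inf)
    nz = mod > 0
    w[nz] = 1.0 / mod[nz]
    return w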

III. COMPLEX-VALUED LARS METHOD FOR WEIGHTED LASSO

In this section we develop the c-LARS-WLasso algorithm, which is a complex-valued extension of the LARS algorithm [17] for the weighted Lasso framework. This is then used to construct the complex-valued pathwise (c-PW-)WEN algorithm. These methods compute the solution of (3) at particular penalty parameter values, called knots, at which a new variable enters (or leaves) the active set of nonzero coefficients. Our c-PW-WEN exploits the c-LARS-WLasso as its core computational engine.

Let $\hat{\boldsymbol\beta}(\lambda)$ denote a solution to (3) for some fixed value $\lambda$ of the penalty parameter in the case that the Lasso penalty is used ($\alpha = 1$) with unit weights ($\mathbf{w} = \mathbf{1}_p$). Also recall that the predictors are normalized so that $\|\mathbf{x}_j\|_2^2 = 1$, $j = 1, \ldots, p$. The solution $\hat{\boldsymbol\beta}(\lambda)$ must then verify the generalized Karush-Kuhn-Tucker (KKT) conditions: $\hat{\boldsymbol\beta}(\lambda)$ is a solution to (3) if and only if it verifies the zero sub-gradient equations

$$\langle \mathbf{x}_j, \hat{\mathbf{r}}(\lambda) \rangle = \lambda \hat{s}_j \quad \text{for } j = 1, \ldots, p, \qquad (6)$$

where $\hat{\mathbf{r}}(\lambda) = \mathbf{y} - \mathbf{X}\hat{\boldsymbol\beta}(\lambda)$ and $\hat{s}_j \in \mathrm{sign}\{\hat\beta_j(\lambda)\}$, meaning that $\hat{s}_j = e^{\jmath\theta}$ with $\theta = \arg\{\hat\beta_j(\lambda)\}$ if $\hat\beta_j(\lambda) \neq 0$, and $\hat{s}_j$ is some number inside the unit circle, $\hat{s}_j \in \{x \in \mathbb{C} : |x| \le 1\}$, otherwise. Taking absolute values of both sides of (6), one notices that at the solution the condition $|\langle \mathbf{x}_j, \hat{\mathbf{r}}(\lambda)\rangle| = \lambda$ holds for the active predictors, whereas $|\langle \mathbf{x}_j, \hat{\mathbf{r}}(\lambda)\rangle| \le \lambda$ holds for the non-active predictors. Thus, as $\lambda$ decreases and more predictors join the active set, the active predictors become less correlated with the residual. Moreover, the absolute value of the correlation, or equivalently the angle, between any active predictor and the residual is the same. In the real-valued case, the LARS method exploits this feature and the piecewise linearity of the Lasso path to compute the knot values, i.e., the values of the penalty parameter at which there is a change in the active set of predictors.
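For instance, the conditions (6) can be checked numerically for a candidate solution; the sketch below (our naming, NumPy assumed, columns of X unit-norm) returns True when the KKT conditions hold up to tolerance:

import numpy as np

def check_lasso_kkt(y, X, beta_hat, lam, tol=1e-6, atol=1e-4):
    """Verify |<x_j, r>| = lam on the active set and |<x_j, r>| <= lam
    elsewhere, i.e., the zero sub-gradient conditions of Eq. (6)."""
    r = y - X @ beta_hat
    corr = np.abs(X.conj().T @ r)          # |<x_j, r(lambda)>| for all j
    active = np.abs(beta_hat) > tol
    ok_active = np.allclose(corr[active], lam, atol=atol)
    ok_inactive = np.all(corr[~active] <= lam + atol)
    return bool(ok_active and ok_inactive)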

Let us briefly recall the main principle of the LARS algorithm. LARS starts with a model having no variables (so $\boldsymbol\beta = \mathbf{0}$) and picks the predictor that has maximal correlation (i.e., smallest angle) with the residual $\mathbf{r} = \mathbf{y}$. Suppose the predictor $\mathbf{x}_1$ is chosen. Then the magnitude of the coefficient of the selected predictor is increased (toward its least-squares value) until a step size is reached at which another predictor (say $\mathbf{x}_2$) has the same absolute value of correlation with the evolving residual $\mathbf{r} = \mathbf{y} - (\text{step size})\,\mathbf{x}_1$, i.e., the updated residual makes equal angles with both predictors, as shown in Fig. 2. Thereafter, LARS moves in the new direction which keeps the evolving residual equally correlated (i.e., equiangular) with the selected predictors, until another predictor becomes equally correlated with the residual. One repeats this process until all predictors are in the model or a specified sparsity level is reached.

FIG. 2. Starting from all zeros, LARS picks the predictor x1 that makes the least angle (i.e., θ1 < θ2) with the residual r and moves in its direction until θ1 = θ2, where LARS picks x2 and changes direction. LARS then repeats this procedure, selecting the next equicorrelated predictor, and so on until some stopping criterion is met.

We first consider the weighted Lasso (WLasso) problem (so $\alpha = 1$) by letting $\tilde{\mathbf{X}} = \mathbf{X}\,\mathrm{diag}(\mathbf{w})^{-1}$ and $\tilde{\boldsymbol\beta} = \boldsymbol\beta \otimes \mathbf{w}$. We write $\hat{\boldsymbol\beta}(\lambda)$ for the solution of the optimization problem (3) in this case. Let $\lambda_0$ denote the smallest value of $\lambda$ such that all coefficients of the WLasso solution are zero, i.e., $\hat{\boldsymbol\beta}(\lambda_0) = \mathbf{0}$. It is easy to see that $\lambda_0 = \max_j |\langle \tilde{\mathbf{x}}_j, \mathbf{y}\rangle|$ for $j = 1, 2, \ldots, p$ [15]. Let $\mathcal{A} = \mathrm{supp}\{\hat{\boldsymbol\beta}(\lambda)\}$ denote the active set at regularization parameter value $\lambda < \lambda_0$. The knots $\lambda_1 > \lambda_2 > \cdots > \lambda_K$ are defined as the smallest values of the penalty parameter after which there is a change in the set of active predictors, i.e., the order of sparsity changes. The active set at a knot $\lambda_k$ is denoted by $\mathcal{A}_k = \mathrm{supp}\{\hat{\boldsymbol\beta}(\lambda_k)\}$. The active set $\mathcal{A}_1$ thus contains a single index, $\mathcal{A}_1 = \{j_1\}$, where $j_1$ is the predictor that becomes active first, i.e.,

$$j_1 = \underset{j \in \{1, \ldots, p\}}{\arg\max} \; |\langle \tilde{\mathbf{x}}_j, \mathbf{y}\rangle|.$$

By definition of the knots, one has that $\mathcal{A}_k = \mathrm{supp}\{\hat{\boldsymbol\beta}(\lambda)\}$ for all $\lambda \in (\lambda_{k+1}, \lambda_k]$, and $\mathcal{A}_k \neq \mathcal{A}_{k+1}$ for all $k = 1, \ldots, K$.

The c-LARS-WLasso outlined in Algorithm 1 is a straightforward generalization of the LARS-Lasso algorithm to the complex-valued and weighted case. It does not have the same theoretical guarantees as its real-valued counterpart to solve the exact values of the knots. Namely, the LARS algorithm uses the property that (in the real-valued case) the Lasso regularization path is continuous and piecewise linear with respect to $\lambda$; see [17,22-24]. In the complex-valued case, however, the solution path between the knots is not necessarily linear [25]. Hence the c-LARS-WLasso may not give precise values of the knots in all cases. However, simulations validate that the algorithm finds the knots with reasonable precision. Future work is needed to provide theoretical guarantees of the algorithm's ability to find the knots.

Algorithm 1: c-LARS-WLasso algorithm

input: $\mathbf{y} \in \mathbb{C}^n$, $\mathbf{X} \in \mathbb{C}^{n\times p}$, $\mathbf{w} \in \mathbb{R}^p$, and $K$
output: $\{\mathcal{A}_k, \lambda_k, \hat{\boldsymbol\beta}(\lambda_k)\}_{k=0}^{K}$
initialize: $\hat{\boldsymbol\beta}^{(0)} = \mathbf{0}_{p\times 1}$, $\mathcal{A}_0 = \emptyset$, $\boldsymbol\Delta = \mathbf{0}_{p\times 1}$, residual $\hat{\mathbf{r}}_0 = \mathbf{y}$. Set $\mathbf{X} \leftarrow \mathbf{X}\,\mathrm{diag}(\mathbf{w})^{-1}$.

1. Compute $\lambda_0 = \max_j |\langle \mathbf{x}_j, \hat{\mathbf{r}}_0\rangle|$ and $j_1 = \arg\max_j |\langle \mathbf{x}_j, \hat{\mathbf{r}}_0\rangle|$, where $j = 1, \ldots, p$.
2. for $k = 1, \ldots, K$ do
3. Find the active set $\mathcal{A}_k = \mathcal{A}_{k-1} \cup \{j_k\}$ and its least-squares direction $\boldsymbol\delta$, so that $[\boldsymbol\Delta]_{\mathcal{A}_k} = \boldsymbol\delta$:
$$\boldsymbol\delta = \frac{1}{\lambda_{k-1}} \big(\mathbf{X}_{\mathcal{A}_k}^{\mathsf H}\mathbf{X}_{\mathcal{A}_k}\big)^{-1}\mathbf{X}_{\mathcal{A}_k}^{\mathsf H}\hat{\mathbf{r}}_{k-1}.$$
4. Define the vector $\boldsymbol\beta(\lambda) = \hat{\boldsymbol\beta}^{(k-1)} + (\lambda_{k-1} - \lambda)\boldsymbol\Delta$ for $0 < \lambda \le \lambda_{k-1}$ and the corresponding residual as
$$\mathbf{r}(\lambda) = \mathbf{y} - \mathbf{X}\boldsymbol\beta(\lambda) = \hat{\mathbf{r}}_{k-1} - (\lambda_{k-1} - \lambda)\,\mathbf{X}_{\mathcal{A}_k}\boldsymbol\delta.$$
5. The knot $\lambda_k$ is the largest $\lambda$-value, $0 < \lambda \le \lambda_{k-1}$, s.t.
$$\langle \mathbf{x}_\ell, \mathbf{r}(\lambda)\rangle = \lambda e^{\jmath\theta}, \quad \ell \notin \mathcal{A}_k, \qquad (7)$$
at which a new predictor (at index $j_{k+1} \notin \mathcal{A}_k$) becomes active, thus verifying $|\langle \mathbf{x}_{j_{k+1}}, \mathbf{r}(\lambda_k)\rangle| = \lambda_k$ from (7).
6. Update the values at the knot $\lambda_k$:
$$\hat{\boldsymbol\beta}^{(k)} = \hat{\boldsymbol\beta}^{(k-1)} + (\lambda_{k-1} - \lambda_k)\boldsymbol\Delta, \qquad \hat{\mathbf{r}}_k = \mathbf{y} - \mathbf{X}\hat{\boldsymbol\beta}^{(k)}.$$
The Lasso solution is $\hat{\boldsymbol\beta}(\lambda_k) = \hat{\boldsymbol\beta}^{(k)}$.
end for
7. $\{\hat{\boldsymbol\beta}(\lambda_k) \leftarrow \hat{\boldsymbol\beta}(\lambda_k) \oslash \mathbf{w}\}_{k=0}^{K}$.

Below we discuss how to solve the knot $\lambda_k$ and the index $j_{k+1}$ in step 5 of the c-LARS-WLasso algorithm.

Solving step 5: First we note that

$$\langle \mathbf{x}_\ell, \mathbf{r}(\lambda)\rangle = \langle \mathbf{x}_\ell, \hat{\mathbf{r}}_{k-1}\rangle - (\lambda_{k-1} - \lambda)\langle \mathbf{x}_\ell, \mathbf{X}_{\mathcal{A}_k}\boldsymbol\delta\rangle = c_\ell - (\lambda_{k-1} - \lambda)\, b_\ell, \qquad (8)$$

where we have written $c_\ell = \langle \mathbf{x}_\ell, \hat{\mathbf{r}}_{k-1}\rangle$ and $b_\ell = \langle \mathbf{x}_\ell, \mathbf{X}_{\mathcal{A}_k}\boldsymbol\delta\rangle$. We need to find, for each $\ell \notin \mathcal{A}_k$, the $\lambda$ such that $|\langle \mathbf{x}_\ell, \mathbf{r}(\lambda)\rangle| = \lambda$ holds. Due to (7) and (8), this means finding $0 < \lambda \le \lambda_{k-1}$ such that

$$c_\ell - (\lambda_{k-1} - \lambda)\, b_\ell = \lambda e^{\jmath\theta}. \qquad (9)$$

Let us reparametrize via $\lambda = \lambda_{k-1} - \gamma_\ell$. Then identifying $0 < \lambda \le \lambda_{k-1}$ is equivalent to identifying the auxiliary variable $\gamma_\ell \ge 0$, and (9) becomes

$$c_\ell - \gamma_\ell b_\ell = (\lambda_{k-1} - \gamma_\ell)\, e^{\jmath\theta} \;\Leftrightarrow\; |c_\ell - \gamma_\ell b_\ell|^2 = (\lambda_{k-1} - \gamma_\ell)^2 \;\Leftrightarrow\; |c_\ell|^2 - 2\gamma_\ell\,\mathrm{Re}(c_\ell b_\ell^*) + \gamma_\ell^2 |b_\ell|^2 = \lambda_{k-1}^2 - 2\lambda_{k-1}\gamma_\ell + \gamma_\ell^2.$$

The last equation implies that $\gamma_\ell$ can be found by solving the roots of the second-order polynomial equation $A\gamma_\ell^2 + B\gamma_\ell + C = 0$, where $A = |b_\ell|^2 - 1$, $B = 2\lambda_{k-1} - 2\,\mathrm{Re}(c_\ell b_\ell^*)$, and $C = |c_\ell|^2 - \lambda_{k-1}^2$. Denote the roots by $\gamma_{\ell,1}$ and $\gamma_{\ell,2}$; the final value of $\gamma_\ell$ is

$$\gamma_\ell = \begin{cases} \min(\gamma_{\ell,1}, \gamma_{\ell,2}), & \text{if } \gamma_{\ell,1} > 0 \text{ and } \gamma_{\ell,2} > 0,\\ \big(\max(\gamma_{\ell,1}, \gamma_{\ell,2})\big)_+, & \text{otherwise}, \end{cases}$$

where $(t)_+ = \max(0, t)$ for $t \in \mathbb{R}$. Thus, finding the largest $\lambda$ that verifies (7) is equivalent to finding the smallest non-negative $\gamma_\ell$. Hence the variable that enters the active set $\mathcal{A}_{k+1}$ is $j_{k+1} = \arg\min_{\ell \notin \mathcal{A}_k} \gamma_\ell$, and the knot is $\lambda_k = \lambda_{k-1} - \gamma_{j_{k+1}}$. Thereafter, the solution $\hat{\boldsymbol\beta}(\lambda_k)$ at the knot $\lambda_k$ is simple to find in step 6.
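As an illustration, a minimal NumPy sketch of this root-finding step is given below (variable names are ours; `delta_full` holds the direction vector with the active-set entries of delta filled in and zeros elsewhere, `active` is a boolean mask of the current active set):

import numpy as np

def next_knot(X, r_prev, delta_full, active, lam_prev):
    """Solve A*g^2 + B*g + C = 0 for each non-active column, keep the
    smallest non-negative root gamma; then lam_k = lam_prev - gamma."""
    p = X.shape[1]
    gammas = np.full(p, np.inf)
    Xd = X[:, active] @ delta_full[active]       # X_{A_k} delta
    for l in np.flatnonzero(~active):
        c = np.vdot(X[:, l], r_prev)             # c_l = <x_l, r_{k-1}>
        b = np.vdot(X[:, l], Xd)                 # b_l = <x_l, X_{A_k} delta>
        A = np.abs(b)**2 - 1.0
        B = 2.0 * lam_prev - 2.0 * np.real(c * np.conj(b))
        C = np.abs(c)**2 - lam_prev**2
        roots = np.roots([A, B, C])
        roots = roots[np.isreal(roots)].real     # keep real-valued roots
        pos = roots[roots > 0]
        if pos.size == 2:
            gammas[l] = pos.min()                # both positive: take smaller
        elif roots.size:
            gammas[l] = max(roots.max(), 0.0)    # (max root)_+ otherwise
    j_next = int(np.argmin(gammas))
    return lam_prev - gammas[j_next], j_next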

IV. COMPLEX-VALUED PATHWISE WEIGHTED ELASTIC NET

Next we develop a complex-valued and weighted version of the PW-LARS-EN algorithm proposed in [18], referred to as the c-PW-WEN algorithm. The generalization to the complex-valued case is straightforward; the essential difference is that the c-LARS-WLasso Algorithm 1 is used instead of the (real-valued) LARS-Lasso algorithm. The algorithm finds the Kth knot $\lambda_K$ and the corresponding WEN solutions on a dense grid of EN tuning parameter values $\alpha$, and then picks the final solution for the best $\alpha$ value.

Let $\lambda_0(\alpha)$ denote the smallest value of $\lambda$ such that all coefficients in the WEN estimate are zero, i.e., $\hat{\boldsymbol\beta}(\lambda_0, \alpha) = \mathbf{0}$. The value of $\lambda_0(\alpha)$ can be expressed in closed form [15]:

$$\lambda_0(\alpha) = \max_j \; \frac{1}{\alpha}\,\frac{|\langle \mathbf{x}_j, \mathbf{y}\rangle|}{w_j}, \qquad j = 1, \ldots, p.$$

The c-PW-WEN algorithm computes K-sparse WEN solutions for a set of $\alpha$ values on a dense grid

$$[\alpha] = \{\alpha_i \in (0, 1] : \alpha_1 = 1 > \alpha_2 > \cdots > \alpha_m > 0\}. \qquad (10)$$

Let $\mathcal{A}(\lambda) = \mathrm{supp}\{\hat{\boldsymbol\beta}(\lambda, \alpha)\}$ denote the active set (i.e., the nonzero elements of the WEN solution) for a given fixed regularization parameter value $\lambda \equiv \lambda(\alpha) < \lambda_0(\alpha)$ and for a given $\alpha$ value in the grid $[\alpha]$. The knots $\lambda_1(\alpha) > \lambda_2(\alpha) > \cdots > \lambda_K(\alpha)$ are the border values of the regularization parameter after which there is a change in the set of active predictors. Since $\alpha$ is fixed, we drop the dependency of the penalty parameter on $\alpha$ and simply write $\lambda$ or $\lambda_K$ instead of $\lambda(\alpha)$ or $\lambda_K(\alpha)$. The reader should, however, keep in mind that the values of the knots are different for any given $\alpha$. The active set at a knot $\lambda_k$ is then denoted shortly by $\mathcal{A}_k \equiv \mathcal{A}(\lambda_k) = \mathrm{supp}\{\hat{\boldsymbol\beta}(\lambda_k, \alpha)\}$. Note that $\mathcal{A}_k \neq \mathcal{A}_{k+1}$ for all $k = 1, \ldots, K$, and that it is assumed that a non-zero coefficient does not leave the active set for any value $\lambda > \lambda_K$, that is, the sparsity level increases from 0 (at $\lambda_0$) to K (at $\lambda_K$).

Let us write Algorithm 1 compactly as c-LARS-WLasso($\mathbf{y}, \mathbf{X}, \mathbf{w}, K$), and let

$$\{\lambda_k, \hat{\boldsymbol\beta}(\lambda_k)\} = \text{c-LARS-WLasso}(\mathbf{y}, \mathbf{X}, \mathbf{w}, K)\big|_k$$

denote the extraction of the kth knot (and the corresponding solution) from the sequence of knot-solution pairs found by the c-LARS-WLasso algorithm. Next, note that we can write the EN objective function in augmented form as follows:

$$\frac{1}{2}\|\mathbf{y} - \mathbf{X}\boldsymbol\beta\|_2^2 + \lambda P_\alpha(\boldsymbol\beta) = \frac{1}{2}\|\mathbf{y}_a - \mathbf{X}_a(\eta)\boldsymbol\beta\|_2^2 + \gamma\|\boldsymbol\beta\|_1, \qquad (11)$$

where

$$\gamma = \lambda\alpha \quad \text{and} \quad \eta = \lambda(1 - \alpha) \qquad (12)$$

are new parameterizations of the tuning and shrinkage parameter pair $(\alpha, \lambda)$, and

$$\mathbf{y}_a = \begin{pmatrix} \mathbf{y} \\ \mathbf{0} \end{pmatrix} \quad \text{and} \quad \mathbf{X}_a(\eta) = \begin{pmatrix} \mathbf{X} \\ \sqrt{\eta}\,\mathbf{I}_p \end{pmatrix}$$

are the augmented forms of the response vector $\mathbf{y}$ and the predictor matrix $\mathbf{X}$, respectively. Note that (11) resembles the Lasso objective function, with $\mathbf{y}_a \in \mathbb{C}^{n+p}$ and $\mathbf{X}_a(\eta)$ an $(n+p) \times p$ matrix.
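A minimal sketch of this augmentation (NumPy, our naming):

import numpy as np

def augment(y, X, lam, alpha):
    """Augmented data of Eqs. (11)-(12): the EN problem on (y, X) equals a
    Lasso problem on (y_a, X_a(eta)) with l1 penalty gamma = lam * alpha."""
    n, p = X.shape
    eta = lam * (1.0 - alpha)
    gamma = lam * alpha
    y_a = np.concatenate([y, np.zeros(p, dtype=y.dtype)])
    X_a = np.vstack([X, np.sqrt(eta) * np.eye(p, dtype=X.dtype)])
    return y_a, X_a, gamma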

This means that we can compute the K-sparse WEN solution at the Kth knot for fixed $\alpha$ using the c-LARS-WLasso algorithm. Our c-PW-WEN method is given in Algorithm 2. It computes the WEN solutions at the knots over the dense grid (10) of $\alpha$ values. After step 6 of the algorithm we have the solution at the Kth knot for a given $\alpha_i$ value on the grid. Having the solution $\hat{\boldsymbol\beta}(\lambda_K, \alpha_i)$ available, we then in steps 7 to 9 compute the residual sum of squares (RSS) of the debiased WEN solution at the Kth knot (having K nonzeros),

$$\mathrm{RSS}(\alpha_i) = \|\mathbf{y} - \mathbf{X}_{\mathcal{A}_K}\hat{\boldsymbol\beta}_{\mathrm{LS}}(\lambda_K, \alpha_i)\|_2^2,$$

where $\mathcal{A}_K = \mathrm{supp}(\hat{\boldsymbol\beta}(\lambda_K, \alpha_i))$ is the active set at the Kth knot and $\hat{\boldsymbol\beta}_{\mathrm{LS}}(\lambda_K, \alpha_i)$ is the debiased LSE, defined as

$$\hat{\boldsymbol\beta}_{\mathrm{LS}}(\lambda_K, \alpha_i) = \mathbf{X}_{\mathcal{A}_K}^{+}\mathbf{y}, \qquad (13)$$

where $\mathbf{X}_{\mathcal{A}_K} \in \mathbb{C}^{n\times K}$ consists of the K active columns of $\mathbf{X}$ associated with the active set $\mathcal{A}_K$ and $\mathbf{X}_{\mathcal{A}_K}^{+}$ denotes its Moore-Penrose pseudo-inverse.

While sweeping through the grid of $\alpha$ values and computing the WEN solutions $\hat{\boldsymbol\beta}(\lambda_K, \alpha_i)$, we choose as our best candidate the WEN estimate $\hat{\boldsymbol\beta}(\lambda_K, \alpha_{\hat\imath})$ with the smallest RSS value, i.e., $\hat\imath = \arg\min_i \mathrm{RSS}(\alpha_i)$, where $i \in \{1, \ldots, m\}$.

Algorithm 2: c-PW-WEN algorithm

input: $\mathbf{y} \in \mathbb{C}^n$, $\mathbf{X} \in \mathbb{C}^{n\times p}$, $\mathbf{w} \in \mathbb{R}^p$, $[\alpha] \in \mathbb{R}^m$ (recall $\alpha_1 = 1$), $K$, and debias
output: $\hat{\boldsymbol\beta}_K \in \mathbb{C}^p$ and $\mathcal{A}_K$

1. $\{\lambda_k(\alpha_1), \hat{\boldsymbol\beta}(\lambda_k, \alpha_1)\}_{k=0}^{K}$ = c-LARS-WLasso($\mathbf{y}, \mathbf{X}, \mathbf{w}, K$)
2. for $i = 2$ to $m$ do
3.  for $k = 1$ to $K$ do
4.   $\eta_k = \lambda_k(\alpha_{i-1}) \cdot (1 - \alpha_i)$
5.   $\{\gamma_k, \hat{\boldsymbol\beta}(\lambda_k, \alpha_i)\}$ = c-LARS-WLasso($\mathbf{y}_a, \mathbf{X}_a(\eta_k), \mathbf{w}, K)\big|_k$
6.   $\lambda_k(\alpha_i) = \gamma_k / \alpha_i$
7.  $\mathcal{A}_K = \mathrm{supp}\{\hat{\boldsymbol\beta}(\lambda_K, \alpha_i)\}$
8.  $\hat{\boldsymbol\beta}_{\mathrm{LS}}(\lambda_K, \alpha_i) = \mathbf{X}_{\mathcal{A}_K}^{+}\mathbf{y}$
9.  $\mathrm{RSS}(\alpha_i) = \|\mathbf{y} - \mathbf{X}_{\mathcal{A}_K}\hat{\boldsymbol\beta}_{\mathrm{LS}}(\lambda_K, \alpha_i)\|_2^2$
10. $\hat\imath = \arg\min_i \mathrm{RSS}(\alpha_i)$
11. $\hat{\boldsymbol\beta}_K = \hat{\boldsymbol\beta}(\lambda_K, \alpha_{\hat\imath})$ and $\mathcal{A}_K = \mathrm{supp}\{\hat{\boldsymbol\beta}_K\}$
12. if debias then $[\hat{\boldsymbol\beta}_K]_{\mathcal{A}_K} = \mathbf{X}_{\mathcal{A}_K}^{+}\mathbf{y}$

V. SEQUENTIALLY ADAPTIVE ELASTIC NET

Next we turn our attention to how to choose the adaptive (i.e., data-dependent) weights in c-PW-WEN. In the adaptive Lasso [20] one ideally uses the LSE, or, if $p > n$, the Lasso, as the initial estimator $\hat{\boldsymbol\beta}_{\mathrm{init}}$ to construct the weights given in (4). The problem is that both the LSE and the Lasso estimator have very poor accuracy (high variance) when there are high correlations between predictors, which is precisely the condition we are concerned with in this paper. This significantly lowers the probability of exact recovery of the adaptive Lasso.

To overcome this problem, we devise a sequential adaptive elastic net (SAEN) algorithm that obtains the K-sparse solution in a sequential manner, decreasing the sparsity level of the solution at each iteration and using the previous solution as adaptive weights for c-PW-WEN. SAEN is described in Algorithm 3 and runs the c-PW-WEN algorithm three times. In the first step it finds a standard (unit-weight) c-PW-WEN solution with 3K nonzero (active) coefficients, which we refer to as the initial EN solution $\hat{\boldsymbol\beta}_{\mathrm{init}}$. The obtained solution determines the adaptive weights via (4) (and hence the active set of 3K nonzero coefficients), which are used in the second step to compute the c-PW-WEN solution that has 2K nonzero coefficients. This solution again determines the adaptive weights via (4) (and hence the active set of 2K nonzero coefficients), which are used in the third step to compute the c-PW-WEN solution that has the desired K nonzero coefficients. It is important to notice that, since we start from a solution with 3K nonzeros, it is quite likely that the true K non-zero coefficients are included in the active set of $\hat{\boldsymbol\beta}_{\mathrm{init}}$ computed in the first step of the SAEN algorithm. Note that the choice of 3K is similar to the CoSaMP algorithm [29], which also uses 3K as the initial support size. Using 3K also usually guarantees that $\mathbf{X}_{\mathcal{A}_{3K}}$ is well conditioned, which may not be the case if a value larger than 3K is chosen.

Algorithm 3: SAEN algorithm

input: $\mathbf{y} \in \mathbb{C}^n$, $\mathbf{X} \in \mathbb{C}^{n\times p}$, $[\alpha] \in \mathbb{R}^m$, and $K$
output: $\hat{\boldsymbol\beta}_K \in \mathbb{C}^p$

1. $\{\hat{\boldsymbol\beta}_{\mathrm{init}}, \mathcal{A}_{3K}\}$ = c-PW-WEN($\mathbf{y}, \mathbf{X}, \mathbf{1}_p, [\alpha], 3K, 0$)
2. $\{\hat{\boldsymbol\beta}, \mathcal{A}_{2K}\}$ = c-PW-WEN($\mathbf{y}, \mathbf{X}_{\mathcal{A}_{3K}}, \mathbf{1}_{3K} \oslash |[\hat{\boldsymbol\beta}_{\mathrm{init}}]_{\mathcal{A}_{3K}}|, [\alpha], 2K, 0$)
3. $\{\hat{\boldsymbol\beta}_K, \mathcal{A}_K\}$ = c-PW-WEN($\mathbf{y}, \mathbf{X}_{\mathcal{A}_{2K}}, \mathbf{1}_{2K} \oslash |[\hat{\boldsymbol\beta}]_{\mathcal{A}_{2K}}|, [\alpha], K, 1$)
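A sketch of the three-stage procedure follows, assuming a hypothetical routine `c_pw_wen(y, X, w, alphas, K, debias)` that mirrors Algorithm 2 and returns a coefficient vector together with its support (support indices taken relative to the columns of the matrix it was given):

import numpy as np

def saen(y, X, alpha_grid, K, c_pw_wen):
    """Three-stage SAEN sketch following Algorithm 3 (our conventions)."""
    p = X.shape[1]
    # Stage 1: unit-weight c-PW-WEN with 3K nonzeros.
    b1, A1 = c_pw_wen(y, X, np.ones(p), alpha_grid, 3 * K, debias=False)
    # Stage 2: restrict to A1; weights 1/|b1| on the active set; 2K nonzeros.
    b2, A2 = c_pw_wen(y, X[:, A1], 1.0 / np.abs(b1[A1]), alpha_grid,
                      2 * K, debias=False)
    # Stage 3: restrict further to A2; K nonzeros, debiased.
    b3, A3 = c_pw_wen(y, X[:, A1][:, A2], 1.0 / np.abs(b2[A2]), alpha_grid,
                      K, debias=True)
    # Map the nested local supports back to indices of the full grid.
    support = np.asarray(A1)[np.asarray(A2)][np.asarray(A3)]
    beta = np.zeros(p, dtype=complex)
    beta[support] = b3[A3]
    return beta, support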

VI. SINGLE-SNAPSHOT COMPRESSIVE BEAMFORMING

Estimating the source location in terms of its DoA plays an important role in many applications. In [5] it was observed that CS algorithms can be applied for DoA estimation (e.g., of sound sources) using sensor arrays when the array output $\mathbf{y}$ can be expressed as a sparse (underdetermined) linear model by discretizing the DoA parameter space. This approach is referred to as compressive beamforming (CBF), and it has been subsequently used in a series of papers (e.g., [7-13, 26]).

In CBF, after finding the sparse regression estimate, its support can be mapped to the DoA estimates on the grid. Thus the DoA estimates in CBF are always selected from the resulting finite set of discretized DoA parameters. Hence the resolution of CBF depends on the density of the grid (spacing Δθ or grid size p). A denser grid implies larger mutual coherence of the basis vectors $\mathbf{x}_j$ (here equal to the steering vectors for the DoAs on the grid) and thus a poor recovery region for most sparse regression techniques.

The proposed SAEN estimator can effectively mitigate the effect of the high mutual coherence caused by discretization of the DoA space, with significantly better performance than state-of-the-art compressed sensing algorithms. This is illustrated in Section VII via extensive simulation studies using challenging multi-source set-ups with closely spaced sources and large variation of source powers.

We assume narrowband processing and a far-field source wave impinging on an array of sensors with known configuration. The sources are assumed to be located in the far-field of the sensor array (i.e., propagation radius ≫ array size). A uniform linear array (ULA) of n sensors (e.g., hydrophones or microphones) is used for estimating the DoA $\theta \in [-90°, 90°)$ of a source with respect to the array axis. The array response (steering or wavefront vector) of the ULA for a source at DoA (in radians) $\theta \in [-\pi/2, \pi/2)$ is given by

$$\mathbf{a}(\theta) = \frac{1}{\sqrt{n}}\,\big[\,1,\; e^{\jmath\pi\sin\theta},\; \ldots,\; e^{\jmath\pi(n-1)\sin\theta}\,\big]^\top,$$

where we assume half a wavelength inter-element spacing between sensors. We consider the case where $K < n$ sources from distinct DoAs arrive at the array at some time instant $t$. A single snapshot obtained by the ULA can then be modeled as [27]

$$\mathbf{y}(t) = \mathbf{A}(\boldsymbol\theta)\mathbf{s}(t) + \boldsymbol\varepsilon(t), \qquad (14)$$

where $\boldsymbol\theta = (\theta_1, \ldots, \theta_K)^\top$ collects the DoAs, $\mathbf{A}(\boldsymbol\theta) = [\mathbf{a}(\theta_1) \cdots \mathbf{a}(\theta_K)] \in \mathbb{C}^{n\times K}$ is the dictionary of replicas, also known as the array steering matrix, $\mathbf{s}(t) \in \mathbb{C}^K$ contains the source waveforms, and $\boldsymbol\varepsilon(t)$ is complex noise at time instant $t$.

Consider an angular grid of size p (commonly $p \gg n$) of look directions of interest,

$$[\vartheta] = \{\vartheta_i \in [-\pi/2, \pi/2) : \vartheta_1 < \cdots < \vartheta_p\}.$$

Let the ith column of the measurement matrix $\mathbf{X}$ in the model (1) be the array response for look direction $\vartheta_i$, so $\mathbf{x}_i = \mathbf{a}(\vartheta_i)$. Then, if the true source DoAs are contained in the angular grid, i.e., $\theta_i \in [\vartheta]$ for $i = 1, \ldots, K$, the snapshot $\mathbf{y}$ in (14) (where we drop the time index t) can be equivalently modeled by (1) as $\mathbf{y} = \mathbf{X}\boldsymbol\beta + \boldsymbol\varepsilon$, where $\boldsymbol\beta$ is exactly K-sparse ($\|\boldsymbol\beta\|_0 = K$) and the nonzero elements of $\boldsymbol\beta$ contain the source waveforms $\mathbf{s}$. Thus identifying the true DoAs is equivalent to identifying the nonzero elements of $\boldsymbol\beta$, which we refer to as the CBF principle. Hence sparse regression and CS methods can be utilized for estimating the DoAs based on a single snapshot only. We assume that the number of sources K is known a priori.
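For concreteness, a minimal NumPy sketch that builds the steering dictionary for a ULA with half-wavelength spacing (the helper names and degree-valued grid are our conventions):

import numpy as np

def ula_steering(theta, n):
    """Unit-norm ULA steering vector a(theta) for DoA theta in radians."""
    k = np.arange(n)
    return np.exp(1j * np.pi * k * np.sin(theta)) / np.sqrt(n)

def steering_matrix(grid_deg, n):
    """Dictionary X whose columns are a(theta_i) over a grid of look
    directions given in degrees."""
    return np.column_stack([ula_steering(np.deg2rad(t), n) for t in grid_deg])

# Example: p = 180 look directions with 1-degree spacing, n = 40 sensors.
grid = np.arange(-90, 90, 1.0)
X = steering_matrix(grid, 40)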

Besides the SNR, the size p or spacing Δθ of the grid also greatly affects the performance of CBF methods. The cross-correlation (coherence) between the true steering vector and the steering vectors on the grid depends on both the grid spacing and the obliqueness of the target DoAs w.r.t. the array. Moreover, the values of the cross-correlations in the Gram matrix $|\mathbf{X}^{\mathsf H}\mathbf{X}|$ also depend on the distance between array elements and the configuration of the sensor array [7]. Let us construct a measure called maximal basis coherence (MBC), defined as the maximum absolute value of the cross-correlations between the true steering vectors $\mathbf{a}(\theta_j)$ and the basis $\{\mathbf{a}(\vartheta_i) : \vartheta_i \in [\vartheta] \setminus \theta_j\}$, $j \in \{1, \ldots, K\}$:

$$\mathrm{MBC} = \max_j \; \max_{\vartheta \in [\vartheta] \setminus \theta_j} \big|\, \mathbf{a}(\theta_j)^{\mathsf H}\, \mathbf{a}(\vartheta) \,\big|. \qquad (15)$$

Note that the steering vectors $\mathbf{a}(\theta)$, $\theta \in [-\pi/2, \pi/2)$, are assumed to be normalized ($\mathbf{a}(\theta)^{\mathsf H}\mathbf{a}(\theta) = 1$). The MBC value measures the obliqueness of the incoming DoA to the array: the higher the MBC value, the more difficult it is for any CBF method to distinguish the true DoA on the grid. Note that the value of the MBC also depends on the grid spacing Δθ.
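A sketch of the MBC computation (15), reusing the steering helpers sketched above (our naming; DoAs and grid in degrees):

import numpy as np

def mbc(true_doas_deg, grid_deg, n):
    """Maximal basis coherence of Eq. (15): largest |a(theta_j)^H a(vartheta)|
    over grid points, excluding theta_j itself when it lies on the grid."""
    X = steering_matrix(grid_deg, n)
    val = 0.0
    for theta in true_doas_deg:
        a = ula_steering(np.deg2rad(theta), n)
        corr = np.abs(X.conj().T @ a)
        corr[np.isclose(grid_deg, theta)] = 0.0  # drop the true DoA itself
        val = max(val, corr.max())
    return val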

Fig. 3 shows the geometry of the DoA estimation problem, where the target DoAs have varying basis coherence, which increases with the level of obliqueness (inclination) on either side of the normal to the ULA axis. We define a straight DoA when the angle of incidence of the target DoA is in the shaded sector in Fig. 3; this is the region where the MBC has lower values. In contrast, an oblique DoA is defined when the angle of incidence of the target is oblique with respect to the array axis.

FIG. 3. (Color online) The straight and oblique DoAs exhibit different basis coherence.

Consider the case of a ULA with n = 40 elements receiving two sources at straight DoAs θ1 = −6° and θ2 = 2°, or at oblique DoAs θ1 = 44° and θ2 = 52°. These scenarios correspond to set-up 2 and set-up 3 of Section VII; see also Table II. The angular separation between the DoAs is 8° in both scenarios. In Table I we compute the correlation between the true steering vectors, and between each true steering vector and a neighboring steering vector a(ϑ) on the grid.

TABLE I. Correlation between the true steering vectors at DoAs θ1 and θ2, and between a true steering vector and a steering vector at a neighboring grid angle ϑ, in the two-source set-ups with either straight or oblique DoAs.

                      angles              correlation
True straight DoAs    θ1 = −6°, θ2 = 2°   0.071
                      θ1 = −6°, ϑ = −7°   0.814
                      θ2 = 2°,  ϑ = 3°    0.812
True oblique DoAs     θ1 = 44°, θ2 = 52°  0.069
                      θ1 = 44°, ϑ = 45°   0.901
                      θ2 = 52°, ϑ = 53°   0.927

These values validate the fact highlighted in Fig. 3: a target with an oblique DoA w.r.t. the array has a larger maximal correlation (coherence) with the basis steering vectors. This makes it difficult for a sparse recovery method to distinguish the true steering vector $\mathbf{a}(\theta_i)$ from a spurious steering vector $\mathbf{a}(\vartheta)$ that merely has a very large correlation with the true one. Due to this mutual coherence it may happen that neither $\mathbf{a}(\theta_i)$ nor $\mathbf{a}(\vartheta)$ is assigned a non-zero coefficient value in $\hat{\boldsymbol\beta}$, or perhaps just one of them in a random fashion.

VII. SIMULATION STUDIES

We consider seven simulation set-ups. The first five set-ups use grid spacing Δθ = 1° (leading to p = 180 look directions in the grid [ϑ]) and the last two employ a sparser grid, Δθ = 2° (leading to p = 90 look directions in the grid). The number of sensors in the ULA is n = 40 for set-ups 1-5 and n = 30 for set-ups 6-7. Each set-up has K ∈ {2, 3, 4} sources at different (straight or oblique) DoAs $\boldsymbol\theta = (\theta_1, \ldots, \theta_K)^\top$, and the source waveforms are generated as $s_k = |s_k| \cdot e^{\jmath \mathrm{Arg}(s_k)}$, where the source powers $|s_k| \in (0, 1]$ are fixed for each set-up but the source phases are randomly generated for each Monte Carlo trial as $\mathrm{Arg}(s_k) \sim \mathrm{Unif}(0, 2\pi)$, $k = 1, \ldots, K$. Table II specifies the DoAs and powers of the sources used in the set-ups. The MBC values (15) are also reported for each case.

TABLE II. Details of all the set-ups tested in this paper. The first five set-ups have grid spacing Δθ = 1° and the last two Δθ = 2°.

Set-up   |s_i|                 θ [°]                       MBC
1        [0.9, 1, 1]           [−5, 3, 6]                  0.814
2        [0.9, 1]              [−6, 2]                     0.814
3        [0.9, 1]              [44, 52]                    0.927
4        [0.8, 0.7, 1]         [43, 44, 52]                0.927
5        [0.9, 0.1, 1, 0.4]    [−8.7, −3.8, −3.5, 9.7]     0.990
6        [0.8, 1, 0.9, 0.4]    −[48.5, 46.4, 31.5, 2.2]    0.991
7        [0.7, 1, 0.6, 0.7]    [6, 8, 14, 18]              0.643

The error terms $\varepsilon_i$ are i.i.d. and generated from a $\mathcal{CN}(0, \sigma^2)$ distribution, where the noise variance $\sigma^2$ depends on the signal-to-noise ratio (SNR) level in decibels (dB), given by $\mathrm{SNR(dB)} = 10 \log_{10}(\sigma_s^2/\sigma^2)$, where $\sigma_s^2 = \frac{1}{K}\big(|s_1|^2 + |s_2|^2 + \cdots + |s_K|^2\big)$ denotes the average source power. An SNR level of 20 dB is used in this paper unless specified otherwise, and the number of Monte Carlo trials is L = 1000.
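A sketch of generating the noisy snapshot at a prescribed SNR (our naming, NumPy assumed):

import numpy as np

def add_noise(clean, s, snr_db, rng=None):
    """Add circular complex Gaussian noise CN(0, sigma^2) so that
    SNR(dB) = 10 log10(sigma_s^2 / sigma^2), sigma_s^2 = mean source power."""
    rng = np.random.default_rng() if rng is None else rng
    sigma_s2 = np.mean(np.abs(s)**2)
    sigma2 = sigma_s2 / 10**(snr_db / 10.0)
    n = clean.shape[0]
    noise = np.sqrt(sigma2 / 2) * (rng.standard_normal(n)
                                   + 1j * rng.standard_normal(n))
    return clean + noise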

In each set-up we evaluate the performance of all methods in recovering exactly the true support and the source powers. Due to the high mutual coherence and the large differences in source powers, DoA estimation is now a challenging task. A key performance measure is the (empirical) probability of exact recovery (PER) of all K sources, defined as

$$\mathrm{PER} = \mathrm{ave}\{\mathbb{I}(\mathcal{A}_* = \mathcal{A}_K)\},$$

where $\mathcal{A}_*$ denotes the index set of the true source DoAs on the grid and $\mathcal{A}_K$ the found support set, with $|\mathcal{A}_*| = |\mathcal{A}_K| = K$, and the average is over all Monte Carlo trials. Above, $\mathbb{I}(\cdot)$ denotes the indicator function. We also compute the average root mean squared error (RMSE) of the debiased estimate $\hat{\mathbf{s}} = \arg\min_{\mathbf{s} \in \mathbb{C}^K} \|\mathbf{y} - \mathbf{X}_{\mathcal{A}_K}\mathbf{s}\|_2$ of the source vector, $\mathrm{RMSE} = \sqrt{\mathrm{ave}\{\|\hat{\mathbf{s}} - \mathbf{s}\|^2\}}$.
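The two performance measures can be computed over the trials as in the following sketch (our naming; each trial stores the true and estimated supports and source vectors):

import numpy as np

def per_and_rmse(trials):
    """Empirical PER and RMSE over Monte Carlo trials. Each trial is a tuple
    (A_true, A_hat, s_true, s_hat) of index sets and source vectors."""
    hits, sq_err = [], []
    for A_true, A_hat, s_true, s_hat in trials:
        hits.append(set(A_true) == set(A_hat))            # exact support recovery
        sq_err.append(np.sum(np.abs(s_hat - s_true)**2))  # squared source error
    return np.mean(hits), np.sqrt(np.mean(sq_err))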

A. Compared methods

This paper compares the SAEN approach to existing well-known greedy methods, namely orthogonal matching pursuit (OMP) [28] and compressive sampling matching pursuit (CoSaMP) [29]. Moreover, we also draw comparisons to two special cases of the c-PW-WEN algorithm: the Lasso estimate that has K non-zeros (i.e., $\hat{\boldsymbol\beta}(\lambda_K, 1)$ computed by c-PW-WEN using $\mathbf{w} = \mathbf{1}$ and $\alpha = 1$) and the EN estimate obtained when cherry-picking the best $\alpha$ in the grid $[\alpha] = \{\alpha_i \in (0, 1] : \alpha_1 = 1 > \cdots > \alpha_m > 0.01\}$ (i.e., $\hat{\boldsymbol\beta}(\lambda_K, \alpha_{\mathrm{bst}})$).

It is instructive to compare SAEN to a simpler adaptive EN (AEN) approach that simply uses adaptive weights to penalize different coefficients differently, in the spirit of the adaptive Lasso [20]. This helps in understanding the effectiveness of the cleverly chosen weights and the usefulness of the three-step procedure used by the SAEN algorithm. Recall that the first step in the AEN approach is to compute the weights using some initial solution. After obtaining the (adaptive) weights, the final K-sparse AEN solution is computed. We devise three AEN approaches, each one using a different initial solution to compute the weights:

1. AEN(LSE) uses the weights $\mathbf{w}_{(\mathrm{LSE})} = \mathbf{1} \oslash |\mathbf{X}^{+}\mathbf{y}|$, where $\mathbf{X}^{+}$ is the Moore-Penrose pseudo-inverse of $\mathbf{X}$.

2. AEN(n) employs weights from an initial n-sparse EN solution $\hat{\boldsymbol\beta}(\lambda_n, \alpha)$ at the nth knot, found by the c-PW-WEN algorithm with $\mathbf{w} = \mathbf{1}$.

3. AEN(3K) instead uses weights calculated from an initial EN solution $\hat{\boldsymbol\beta}(\lambda_{3K}, \alpha)$ having 3K nonzeros, as in step 1 of the SAEN algorithm, but the remaining two steps of SAEN are omitted.

The upper bound for the PER rate of SAEN is the (empirical) probability that the initial solution $\hat{\boldsymbol\beta}_{\mathrm{init}}$ computed in step 1 of Algorithm 3 contains the true support $\mathcal{A}_*$, i.e., the value

$$\mathrm{UB} = \mathrm{ave}\big\{\mathbb{I}\big(\mathcal{A}_* \subset \mathrm{supp}(\hat{\boldsymbol\beta}_{\mathrm{init}})\big)\big\}, \qquad (16)$$

where the average is over all Monte Carlo trials. We compute this upper bound to illustrate the ability of SAEN to pick the true K-sparse support from the original 3K-sparse initial value. For set-up 1 (cf. Table II), the average recovery results for all of the above-mentioned methods are provided in Table III.

TABLE III. The recovery results for set-up 1. The results illustrate the effectiveness of the three-step SAEN approach compared to its competitors. The SNR level was 20 dB and the upper bound (16) for the PER rate of SAEN is given in parentheses.

         SAEN             AEN(3K)   AEN(n)
PER      0.864 (0.957)    0.689     0.616
RMSE     0.449            0.694     0.889

         AEN(LSE)   EN      Lasso   OMP     CoSaMP
PER      0          0.332   0.332   0.477   0.140
RMSE     1.870      1.163   1.163   1.060   3.558

It can be noted that the proposed SAEN outperforms all other methods and weighting schemes, and recovers the true support and powers of the sources effectively. Note that SAEN's upper bound for the PER rate was 95.7%, and SAEN reached a PER rate of 86.4%. The results of the AEN approaches validate the need for an accurate initial estimate to construct the adaptive weights. For example, AEN(3K) performs better than AEN(n), but much worse than the SAEN method.

B. Straight and oblique DoAs

Set-up 2 and set-up 3 correspond to the cases where the targets are at straight and oblique DoAs, respectively. Performance results of the sparse recovery algorithms are tabulated in Table IV. As can be seen, the upper bound for the PER rate of SAEN is the full 100 percent, which means that the true support is correctly included in the 3K-sparse solution computed at step 1 of the algorithm. For set-up 2 (straight DoAs) all methods have almost full PER rates, except CoSaMP with a 67.8% rate. The performance of all estimators except SAEN changes drastically in set-up 3 (oblique DoAs). Here SAEN achieves a nearly perfect (~98%) PER rate, a reduction of only 2% compared to set-up 2. The other methods perform poorly: for example, the PER rate of Lasso drops from nearly 98% to 40%. Similar behavior is observed for EN, OMP and CoSaMP.

Next we discuss the results for set-up 4, which is similar to set-up 3 except that we have introduced a third source that also arrives from an oblique DoA, θ = 43°, and the variation of the source powers is slightly larger. As can be noted from Table IV, the PER rates of the greedy algorithms OMP and CoSaMP have declined to a strikingly low 0%. This is very different from the PER rates they had in set-up 3, which contained only two sources. Indeed, the inclusion of a third source from a DoA similar to the other two completely ruined their accuracy. This is in deep contrast with the SAEN method, which still achieves a PER rate of 75%, more than twice the PER rate achieved by Lasso. SAEN again has the lowest RMSE values.

TABLE IV. Recovery results for set-ups 2-4. Note that for oblique DoAs (set-ups 3 and 4) the SAEN method outperforms the other methods, and it has perfect recovery results for set-up 2 (straight DoAs). SNR level is 20 dB. The upper bound (16) for the PER rate of SAEN is given in parentheses.

Set-up 2 with two straight DoAs
         SAEN             EN      Lasso   OMP     CoSaMP
PER      1.000 (1.000)    0.981   0.981   0.998   0.678
RMSE     0.126            0.145   0.145   0.128   1.436

Set-up 3 with two oblique DoAs
         SAEN             EN      Lasso   OMP     CoSaMP
PER      0.978 (1.000)    0.399   0.399   0.613   0.113
RMSE     0.154            0.916   0.916   0.624   2.296

Set-up 4 with three oblique DoAs
         SAEN             EN      Lasso   OMP     CoSaMP
PER      0.749 (0.776)    0.392   0.378   0       0
RMSE     0.505            0.838   0.827   1.087   5.290

In summary, the recovery results for set-ups 1-4 (which express different degrees of basis coherence, proximity of target DoAs, and variation of source powers) clearly illustrate that the proposed SAEN performs very well in identifying the true support and the powers of the sources, always outperforming the commonly used benchmark sparse recovery methods, namely Lasso, EN, OMP and CoSaMP, with a significant margin. It is also noteworthy that EN often achieved better PER rates than Lasso, which is mainly due to its group selection ability. As a specific example of this particular feature, Fig. 4 shows the solution paths of Lasso and EN for one particular Monte Carlo trial, where EN correctly chooses the true DoAs but Lasso fails to select all correct DoAs. In this particular instance the EN tuning parameter was α = 0.9. This is the reason behind the success of our c-PW-WEN algorithm, which is the core computational engine of SAEN.

C. Off-grid sources

Set-ups 5 and 6 explore the case when the target DoAs are off the grid. Also note that set-up 5 uses a finer grid spacing, Δθ = 1°, compared to set-up 6 with Δθ = 2°. Both set-ups contain four target sources that have largely varying source powers. In the off-grid case, one would like the CBF method to localize the targets to the nearest DoAs in the angular grid [ϑ] used to construct the array steering matrix. Therefore, in the off-grid case, the PER rate refers to the case where the CBF method selects the K-sparse support corresponding to the DoAs on the grid that are closest in distance to the true DoAs. Table V provides the recovery results. As can be seen, SAEN again performs very well, outperforming the Lasso and EN. Note that OMP and CoSaMP completely fail in selecting the nearest grid points.

FIG. 4. (Color online) The Lasso (a) and EN (b) solution paths (power vs. −log(λ), with the knot λ3 and the DoAs 43°, 44°, and 52° marked) and (c) the respective DoA solutions (power vs. DoA in degrees) at the knot λ3 for the true sources, Lasso, and EN. Observe that Lasso fails to recover the true support, but EN successfully picks the true DoAs. The EN tuning parameter was α = 0.9 in this example.

TABLE V. Performance results of CBF methods for set-ups 5 and 6, where the target DoAs are off the grid. Here the PER rate refers to the case where the CBF method selects the K-sparse support corresponding to the DoAs on the grid that are closest to the true DoAs. The upper bound (16) for the PER rate of SAEN is given in parentheses. SNR level is 20 dB.

Set-up 5 with four off-grid straight DoAs
         SAEN             EN      Lasso   OMP     CoSaMP
PER      0.649 (0.999)    0.349   0.328   0       0
RMSE     0.899            0.947   0.943   1.137   8.909

Set-up 6 with four off-grid oblique DoAs
         SAEN             EN      Lasso   OMP     CoSaMP
PER      0.683 (0.794)    0.336   0.336   0       0.005
RMSE     0.811            0.913   0.911   1.360   28.919

FIG. 5. (Color online) PER rates of the CBF methods (PER vs. SNR in dB) at different SNR levels for set-up 7.

D. More targets and varying SNR levels

Next we consider set-up 7 (cf. Table II), which contains K = 4 sources. The first three sources are at straight DoAs and the fourth one at a DoA with modest obliqueness (θ4 = 18°). We now compute the PER rates of the methods as a function of the SNR. From the PER rates shown in Fig. 5 we again notice that SAEN clearly outperforms all of the other methods. Note that the upper bound (16) of the PER rate of SAEN is also plotted. Both greedy algorithms, OMP and CoSaMP, perform very poorly even at high SNR levels. Lasso and EN attain better recovery results than the greedy algorithms. Again, EN performs better than Lasso due to the additional flexibility offered by the EN tuning parameter and its ability to cope with correlated steering (basis) vectors. SAEN recovers the exact true support in most of the cases due to its step-wise adaptation using cleverly chosen weights. Furthermore, the improvement in PER rates offered by SAEN becomes larger as the SNR level increases. One can also notice that SAEN is close to the theoretical upper bound of the PER rate in the higher SNR regime.

VIII. CONCLUSIONS

We developed the c-PW-WEN algorithm, which computes weighted elastic net solutions at the knots of the penalty parameter over a grid of EN tuning parameter values. c-PW-WEN also computes the weighted Lasso as a special case (i.e., the solution at α = 1), and the adaptive EN (AEN) is obtained when adaptive (data-dependent) weights are used. We then proposed a novel SAEN approach that uses the c-PW-WEN method as its core computational engine together with a three-step adaptive weighting scheme in which the sparsity level is decreased from 3K to K. Simulations illustrated that SAEN performs better than the adaptive EN approaches. Furthermore, we illustrated that the 3K-sparse initial solution computed at step 1 of SAEN provides smart weights for the further steps and includes the true K-sparse support with high accuracy; the SAEN algorithm then accurately retains the true support at each step.

Using the K-sparse Lasso solution computed directly from the Lasso path at the Kth knot fails to provide exact support recovery in many cases, especially under high basis coherence and lower SNR. Greedy algorithms often fail in the face of high mutual coherence (due to dense grid spacing or oblique target DoAs) or low SNR. This is mainly due to the fact that their performance depends heavily on their ability to accurately detect the maximal correlation between the measurement vector $\mathbf{y}$ and the basis vectors (column vectors of $\mathbf{X}$). Our simulation study also showed that their performance (in terms of PER rate) deteriorates when the number of targets increases. In the off-grid case the greedy algorithms also failed to find the nearby grid points.

Finally, the SAEN algorithm performed better than all other methods in each set-up, and the improvement was more pronounced in the presence of high mutual coherence. This is due to the ability of SAEN to include the true support correctly at all three steps of the algorithm. Our MATLAB package that implements the proposed algorithms is freely available at www.github.com/mntabassm/SAEN-LARS. The package also contains a MATLAB live script demo on how to use the method in the CBF problem, along with an example from simulation set-up 4 presented in the paper.

ACKNOWLEDGMENTS

The research was partially supported by the Academy of Finland grant no. 298118, which is gratefully acknowledged.

1. R. Tibshirani, "Regression shrinkage and selection via the lasso," Journal of the Royal Statistical Society, Series B (Methodological), 267-288 (1996).
2. H. Zou and T. Hastie, "Regularization and variable selection via the elastic net," Journal of the Royal Statistical Society, Series B (Statistical Methodology) 67(2), 301-320 (2005).
3. T. Yardibi, J. Li, P. Stoica, and L. N. Cattafesta III, "Sparsity constrained deconvolution approaches for acoustic source mapping," The Journal of the Acoustical Society of America 123(5), 2631-2642 (2008). doi: 10.1121/1.2896754
4. A. Xenaki, E. Fernandez-Grande, and P. Gerstoft, "Block-sparse beamforming for spatially extended sources in a Bayesian formulation," The Journal of the Acoustical Society of America 140(3), 1828-1838 (2016).
5. D. Malioutov, M. Cetin, and A. S. Willsky, "A sparse signal reconstruction perspective for source localization with sensor arrays," IEEE Transactions on Signal Processing 53(8), 3010-3022 (2005). doi: 10.1109/TSP.2005.850882
6. G. F. Edelmann and C. F. Gaumond, "Beamforming using compressive sensing," The Journal of the Acoustical Society of America 130(4), EL232-EL237 (2011). doi: 10.1121/1.3632046
7. A. Xenaki, P. Gerstoft, and K. Mosegaard, "Compressive beamforming," The Journal of the Acoustical Society of America 136(1), 260-271 (2014). doi: 10.1121/1.4883360
8. P. Gerstoft, A. Xenaki, and C. F. Mecklenbräuker, "Multiple and single snapshot compressive beamforming," The Journal of the Acoustical Society of America 138(4), 2003-2014 (2015). doi: 10.1121/1.4929941
9. Y. Choo and W. Seong, "Compressive spherical beamforming for localization of incipient tip vortex cavitation," The Journal of the Acoustical Society of America 140(6), 4085-4090 (2016). doi: 10.1121/1.4968576
10. A. Das, W. S. Hodgkiss, and P. Gerstoft, "Coherent multipath direction-of-arrival resolution using compressed sensing," IEEE Journal of Oceanic Engineering 42(2), 494-505 (2017). doi: 10.1109/JOE.2016.2576198
11. S. Fortunati, R. Grasso, F. Gini, M. S. Greco, and K. LePage, "Single-snapshot DOA estimation by using compressed sensing," EURASIP Journal on Advances in Signal Processing 2014(1), 120 (2014). doi: 10.1186/1687-6180-2014-120
12. E. Ollila, "Nonparametric simultaneous sparse recovery: An application to source localization," in Proc. European Signal Processing Conference (EUSIPCO'15), Nice, France (2015), pp. 509-513.
13. E. Ollila, "Multichannel sparse recovery of complex-valued signals using Huber's criterion," in Proc. Compressed Sensing Theory and its Applications to Radar, Sonar and Remote Sensing (CoSeRa'15), Pisa, Italy (2015), pp. 32-36.
14. M. N. Tabassum and E. Ollila, "Single-snapshot DOA estimation using adaptive elastic net in the complex domain," in 4th International Workshop on Compressed Sensing Theory and its Applications to Radar, Sonar and Remote Sensing (CoSeRa) (2016), pp. 197-201. doi: 10.1109/CoSeRa.2016.7745728
15. T. Hastie, R. Tibshirani, and M. Wainwright, Statistical Learning with Sparsity: The Lasso and Generalizations (CRC Press, 2015).
16. M. Osborne, B. Presnell, and B. Turlach, "A new approach to variable selection in least squares problems," IMA Journal of Numerical Analysis 20(3), 389-403 (2000). doi: 10.1093/imanum/20.3.389
17. B. Efron, T. Hastie, I. Johnstone, and R. Tibshirani, "Least angle regression (with discussion)," The Annals of Statistics 32(2), 407-499 (2004). doi: 10.1214/009053604000000067
18. M. N. Tabassum and E. Ollila, "Pathwise least angle regression and a significance test for the elastic net," in 25th European Signal Processing Conference (EUSIPCO) (2017), pp. 1309-1313. doi: 10.23919/EUSIPCO.2017.8081420
19. S. Foucart and H. Rauhut, A Mathematical Introduction to Compressive Sensing, Vol. 1 (Springer, 2013).
20. H. Zou, "The adaptive lasso and its oracle properties," Journal of the American Statistical Association 101(476), 1418-1429 (2006). doi: 10.1198/016214506000000735
21. A. E. Hoerl and R. W. Kennard, "Ridge regression: biased estimation for nonorthogonal problems," Technometrics 12(1), 55-67 (1970).
22. S. Rosset and J. Zhu, "Piecewise linear regularized solution paths," Ann. Statist. 35(3), 1012-1030 (2007). doi: 10.1214/009053606000001370
23. R. J. Tibshirani and J. Taylor, "The solution path of the generalized lasso," Ann. Statist. 39(3), 1335-1371 (2011). doi: 10.1214/11-AOS878
24. R. J. Tibshirani, "The lasso problem and uniqueness," Electron. J. Statist. 7, 1456-1490 (2013). doi: 10.1214/13-EJS815
25. A. Panahi and M. Viberg, "Fast candidate points selection in the lasso path," IEEE Signal Processing Letters 19(2), 79-82 (2012).
26. K. L. Gemba, W. S. Hodgkiss, and P. Gerstoft, "Adaptive and compressive matched field processing," The Journal of the Acoustical Society of America 141(1), 92-103 (2017). doi: 10.1121/1.4973528
27. A. B. Gershman, C. F. Mecklenbräuker, and J. F. Böhme, "Matrix fitting approach to direction of arrival estimation with imperfect spatial coherence of wavefronts," IEEE Transactions on Signal Processing 45(7), 1894-1899 (1997). doi: 10.1109/78.599968
28. J. A. Tropp and A. C. Gilbert, "Signal recovery from random measurements via orthogonal matching pursuit," IEEE Transactions on Information Theory 53(12), 4655-4666 (2007). doi: 10.1109/TIT.2007.909108
29. D. Needell and J. Tropp, "CoSaMP: Iterative signal recovery from incomplete and inaccurate samples," Applied and Computational Harmonic Analysis 26(3), 301-321 (2009).

Page 3: Sequential adaptive elastic net approach for single …Sequential adaptive elastic net approach for single-snapshot source localization MuhammadNaveedTabassum 1andEsaOllila Aalto University,

Lasso

EN

1

2

Least squares estimate

Low-correlation

contours

Sparse selection

(a)

Lasso

EN

1

2

Least squares estimate

High-correlation

contours

Group selection

Sparse

selection

(b)

FIG 1 (Color online) The Lasso solution is often at the

vertices (corners) but the EN solution can occur on edges as

well depending on the correlations among the variables In

the uncorrelated case of (a) both the Lasso and the EN has

sparse solution When the predictors are highly correlated as

in (b) the EN has a group selection in contrast to the Lasso

Lasso solution and for w = 1 and α = 0 we obtain theridge regression21 estimator

Then WEN solution can be computed using any al-gorithm that can find the non-weighted (w = 1) EN

solution To see this let us write X = Xdiag(w)minus1 and

β = β otimes w Then WEN solution is found by applyingthe following steps

1 Solve the (non-weighted) EN solution on trans-

formed data (y X)

β(λ α) = argminβisinCp

1

2yminus Xβ22 + λPα

(

β)

where Pα(β) = Pα(β1p) is the EN penalty

2 WEN solution for the original data (yX) is

β(λ α) = β ⊘w

Yet the standard (ie non-weighted wj equiv 1 forj = 1 p) EN estimator may perform inconsistentvariable selection The EN solution depends largely on λ(and α) and tuning these parameters optimally is a diffi-cult problem The adaptive Lasso20 obtains oracle vari-able selection property by using cleverly chosen adaptiveweights for regression coefficients in the ℓ1-penalty Weextend this idea to WEN-penalty coupling it with activeset approach where WEN is applied to only nonzero (ac-tive) coefficients Our proposed adaptive EN uses datadependent weights defined as

wj =

1|βinitj | βinitj 6= 0

infin βinitj = 0 j = 1 p (4)

Above βinit isin Cp denotes a sparse initial estimator of βThe idea is that only nonzero coefficients are exploited

ie basis vectors with βinitj = 0 are omitted from themodel and thus the dimensionality of the linear model is

reduced from p to K = βinit0 Moreover larger weightmeans that the corresponding variable is penalized moreheavily The vector w = (w1 wp)

⊤ can be writtencompactly as

w = 1p ⊘∣

∣βinit

∣ (5)

where notation |β| means element-wise application ofthe absolute value operator on the vector ie |β| =(|β1| |βp|)⊤

III COMPLEX-VALUED LARS METHOD FOR

WEIGHTED LASSO

In this section we develop the c-LARS-WLasso algo-rithm which is a complex-valued extension of the LARSalgorithm17 for weighted Lasso framework This is thenused to construct a complex-valued pathwise (c-PW-)WEN algorithm These methods compute solution (3)at particular penalty parameter λ values called knotsat which a new variable enters (or leaves) the active setof nonzero coefficients Our c-PW-WEN exploits the c-LARS-WLasso as its core computational engine

Let β(λ) denote a solution to (3) for some fixed valueλ of the penalty parameter in the case that Lasso penaltyis used (α = 1) with unit weights (ie w = 1p) Alsorecall that predictors are normalized so that xj22 = 1

j = 1 p Then note that the solution β(λ) needs toverify the generalized Karush-Kuhn-Tucker conditions

That is β(λ) is a solution to (3) if and only if it verifiesthe zero sub-gradient equations given by

〈xj rrr(λ)〉 = λsj for j = 1 p (6)

where rrr(λ) = y minusXβ(λ) and sj isin signβj(λ) meaning

that sj = eθ where θ = argβj(λ) if βj(λ) 6= 0 andsome number inside the unit circle sj isin x isin C |x| le1 otherwise Taking absolute values of both sides ofequation (6) one notices that at the solution condition|〈xj rrr(λ)〉| = λ holds for the active predictors whereas|〈xj rrr(λ)〉| le λ holds for non-active predictors Thus asλ decreases and more predictors are joined to the activeset the set of active predictors become less correlatedwith the residual Moreover the absolute value of thecorrelation or equivalently the angle between any activepredictor and the residual is the same In the real-valuedcase the LARS method exploits this feature and linearityof the Lasso path to compute the knot-values ie thevalue of the penalty parameters where there is a changein the active set of predictors

Let us briefly recall the main principle of the LARSalgorithm LARS starts with a model having no vari-ables (so β = 0) and picks a predictor that has maximalcorrelation (ie having smallest angle) with the residualrrr = y Suppose the predictor x1 is chosen Then themagnitude of the coefficient of the selected predictor isincreased (toward its least-squares value) until one hasreached a step size such that another predictor (say pre-dictor x2) has the same absolute value of correlation with

J Acoust Soc Am 22 May 2018 Sequential adaptive elastic net 3

the evolving residual rrr = y minus (stepsize)x1 ie the up-dated residual makes equal angles with both predictorsas shown in Fig 2 Thereafter LARS moves in the newdirection which keeps the evolving residual equally cor-related (ie equiangular) with selected predictors untilanother predictor becomes equally correlated with theresidual After that one can repeat this process until allpredictors are in the model or to specified sparsity level

x2x2

x1

r = y

q1

q2

0

x2

x1

r

q1

0

q2

step-size

FIG 2 Starting from all zeros LARS picks predictor x1 that

makes least angle (ie θ1 lt θ2) with residual rrr and moves in

its direction until θ1 = θ2 where LARS picks x2 and changes

direction Then LARS repeats this procedure and selects

next equicorrelated predictor and so on until some stopping

criterion

We first consider the weighted Lasso (WLasso) prob-lem (so α = 1) by letting X = Xdiag(w)minus1 and β =

β otimesw We write β(λ) for the solution of the optimiza-tion problem (3) in this case Let λ0 denotes the small-est value of λ such that all coefficients of the WLassosolution are zero ie β(λ0) = 0 It is easy to seethat λ0 = maxj |〈xj y〉| for j = 1 2 p15 Let A =

suppβ(λ) denote the active set at the regularizationparameter value λ lt λ0 The knots λ1 gt λ2 gt middot middot middot gt λK

are defined as smallest values of the penalty parametersafter which there is a change in the set of active predic-tors ie the order of sparsity changes The active set at

a knot λk is denoted by Ak = suppβ(λk) The activeset A1 thus contains a single index as A1 = j1 wherej1 is predictor that becomes active first ie

j1 = argmaxjisin1p

|〈xj y〉|

By definition of the knots one has that Ak =

suppβ(λk) forallλ isin (λkminus1 λk] and Ak 6= Ak+1 for allk = 1 K

The c-LARS-WLasso outlined in Algorithm 1 is astraightforward generalization of LARS-Lasso algorithmto complex-valued and weighted case It does not havethe same theoretical guarantees as its real-valued coun-terpart to solve the exact values of the knots NamelyLARS algorithm uses the property that (in the real-valued case) the Lasso regularization path is continuousand piecewise linear with respect to λ see1722ndash24 In thecomplex-valued case however the solution path betweenthe knots is not necessarily linear25 Hence the c-LARS-WLasso may not give precise values of the knots in allthe cases However simulations validate that the algo-rithm finds the knots with reasonable precision Future

work is needed to provide theoretical guarantees of thealgorithm to find the knots

Algorithm 1: c-LARS-WLasso algorithm

input: $\mathbf y \in \mathbb C^n$, $\mathbf X \in \mathbb C^{n\times p}$, $\mathbf w \in \mathbb R^p$, and $K$.
output: $\{\mathcal A_k, \lambda_k, \hat{\boldsymbol\beta}(\lambda_k)\}_{k=0}^{K}$.
initialize: $\boldsymbol\beta^{(0)} = \mathbf 0_{p\times 1}$, $\mathcal A_0 = \emptyset$, $\boldsymbol\Delta = \mathbf 0_{p\times 1}$, the residual $\mathbf r_0 = \mathbf y$. Set $\mathbf X \leftarrow \mathbf X \operatorname{diag}(\mathbf w)^{-1}$.

1. Compute $\lambda_0 = \max_j |\langle \mathbf x_j, \mathbf r_0\rangle|$ and $j_1 = \arg\max_j |\langle \mathbf x_j, \mathbf r_0\rangle|$, where $j = 1, \ldots, p$.
for $k = 1, \ldots, K$ do
2. Find the active set $\mathcal A_k = \mathcal A_{k-1} \cup \{j_k\}$ and its least-squares direction $\boldsymbol\delta$, so that $[\boldsymbol\Delta]_{\mathcal A_k} = \boldsymbol\delta$:
$$ \boldsymbol\delta = \frac{1}{\lambda_{k-1}}\,(\mathbf X_{\mathcal A_k}^{\mathsf H}\mathbf X_{\mathcal A_k})^{-1}\mathbf X_{\mathcal A_k}^{\mathsf H}\,\mathbf r_{k-1}. $$
3. Define the vector $\hat{\boldsymbol\beta}(\lambda) = \boldsymbol\beta^{(k-1)} + (\lambda_{k-1} - \lambda)\boldsymbol\Delta$ for $0 < \lambda \le \lambda_{k-1}$ and the corresponding residual as
$$ \mathbf r(\lambda) = \mathbf y - \mathbf X\hat{\boldsymbol\beta}(\lambda) = \mathbf y - \mathbf X\boldsymbol\beta^{(k-1)} - (\lambda_{k-1} - \lambda)\mathbf X\boldsymbol\Delta = \mathbf r_{k-1} - (\lambda_{k-1} - \lambda)\mathbf X_{\mathcal A_k}\boldsymbol\delta. $$
4. The knot $\lambda_k$ is the largest $\lambda$-value, $0 < \lambda \le \lambda_{k-1}$, s.t.
$$ \langle \mathbf x_\ell, \mathbf r(\lambda)\rangle = \lambda e^{\imath\theta}, \quad \ell \notin \mathcal A_k, \qquad (7) $$
where a new predictor (at index $j_{k+1} \notin \mathcal A_k$) becomes active, thus verifying $|\langle \mathbf x_{j_{k+1}}, \mathbf r(\lambda_k)\rangle| = \lambda_k$ from (7).
5. Update the values at the knot $\lambda_k$:
$$ \boldsymbol\beta^{(k)} = \boldsymbol\beta^{(k-1)} + (\lambda_{k-1} - \lambda_k)\boldsymbol\Delta, \qquad \mathbf r_k = \mathbf y - \mathbf X\boldsymbol\beta^{(k)}. $$
The Lasso solution is $\hat{\boldsymbol\beta}(\lambda_k) = \boldsymbol\beta^{(k)}$.
end for
6. $\{\hat{\boldsymbol\beta}(\lambda_k) = \hat{\boldsymbol\beta}(\lambda_k) \oslash \mathbf w\}_{k=0}^{K}$.

Below we discuss how to solve the knot $\lambda_k$ and the index $j_{k+1}$ in step 4 of the c-LARS-WLasso algorithm.

Solving step 4: First we note that
$$ \langle \mathbf x_\ell, \mathbf r(\lambda)\rangle = \langle \mathbf x_\ell, \mathbf r_{k-1} - (\lambda_{k-1} - \lambda)\mathbf X_{\mathcal A_k}\boldsymbol\delta\rangle = \langle \mathbf x_\ell, \mathbf r_{k-1}\rangle - (\lambda_{k-1} - \lambda)\langle \mathbf x_\ell, \mathbf X_{\mathcal A_k}\boldsymbol\delta\rangle = c_\ell - (\lambda_{k-1} - \lambda)\, b_\ell, \qquad (8) $$
where we have written $c_\ell = \langle \mathbf x_\ell, \mathbf r_{k-1}\rangle$ and $b_\ell = \langle \mathbf x_\ell, \mathbf X_{\mathcal A_k}\boldsymbol\delta\rangle$. First we need to find $\lambda$ for each $\ell \notin \mathcal A_k$ such that $|\langle \mathbf x_\ell, \mathbf r(\lambda)\rangle| = \lambda$ holds. Due to (7) and (8), this means finding $0 < \lambda \le \lambda_{k-1}$ such that
$$ c_\ell - (\lambda_{k-1} - \lambda)\, b_\ell = \lambda e^{\imath\theta}. \qquad (9) $$
Let us reparametrize such that $\lambda = \lambda_{k-1} - \gamma_\ell$. Then identifying $0 < \lambda \le \lambda_{k-1}$ is equivalent to identifying the auxiliary variable $\gamma_\ell \ge 0$. Now (9) becomes
$$ c_\ell - (\lambda_{k-1} - \lambda_{k-1} + \gamma_\ell)\, b_\ell = (\lambda_{k-1} - \gamma_\ell)\, e^{\imath\theta} $$
$$ \Leftrightarrow\quad c_\ell - \gamma_\ell b_\ell = (\lambda_{k-1} - \gamma_\ell)\, e^{\imath\theta} $$
$$ \Leftrightarrow\quad |c_\ell - \gamma_\ell b_\ell|^2 = (\lambda_{k-1} - \gamma_\ell)^2 $$
$$ \Leftrightarrow\quad |c_\ell|^2 - 2\gamma_\ell \operatorname{Re}(c_\ell b_\ell^\ast) + \gamma_\ell^2 |b_\ell|^2 = \lambda_{k-1}^2 - 2\lambda_{k-1}\gamma_\ell + \gamma_\ell^2. $$
The last equation implies that $\gamma_\ell$ can be found by solving the roots of the second-order polynomial equation $A\gamma_\ell^2 + B\gamma_\ell + C = 0$, where $A = |b_\ell|^2 - 1$, $B = 2\lambda_{k-1} - 2\operatorname{Re}(c_\ell b_\ell^\ast)$, and $C = |c_\ell|^2 - \lambda_{k-1}^2$. The roots are $\gamma_\ell \in \{\gamma_{\ell,1}, \gamma_{\ell,2}\}$, and the final value of $\gamma_\ell$ will be
$$ \gamma_\ell = \begin{cases} \min(\gamma_{\ell,1}, \gamma_{\ell,2}) & \text{if } \gamma_{\ell,1} > 0 \text{ and } \gamma_{\ell,2} > 0, \\ \big(\max(\gamma_{\ell,1}, \gamma_{\ell,2})\big)_+ & \text{otherwise,} \end{cases} $$
where $(t)_+ = \max(0, t)$ for $t \in \mathbb R$. Thus finding the largest $\lambda$ that verifies (7) is equivalent to finding the smallest non-negative $\gamma_\ell$. Hence the variable that enters the active set $\mathcal A_{k+1}$ is $j_{k+1} = \arg\min_{\ell \notin \mathcal A_k} \gamma_\ell$, and the knot is thus $\lambda_k = \lambda_{k-1} - \gamma_{j_{k+1}}$. Thereafter, the solution $\hat{\boldsymbol\beta}(\lambda_k)$ at the knot $\lambda_k$ is simple to find in step 5.
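To make the knot computation concrete, the following is a minimal Python/NumPy sketch of this step. The function name next_knot and its interface are ours, not from the paper, and the columns of X are assumed to be the rescaled (weighted), unit-norm predictors.

```python
import numpy as np

def next_knot(X, r_prev, delta, active, lam_prev):
    """Sketch: find the entering index j_{k+1} and knot lambda_k from
    c_l, b_l and the quadratic A g^2 + B g + C = 0 derived above."""
    p = X.shape[1]
    Xd = X[:, active] @ delta                 # X_{A_k} delta
    gammas = np.full(p, np.inf)
    for l in range(p):
        if l in active:
            continue
        c = np.vdot(X[:, l], r_prev)          # c_l = <x_l, r_{k-1}>
        b = np.vdot(X[:, l], Xd)              # b_l = <x_l, X_{A_k} delta>
        A = np.abs(b)**2 - 1.0
        B = 2.0 * lam_prev - 2.0 * np.real(c * np.conj(b))
        C = np.abs(c)**2 - lam_prev**2
        roots = np.roots([A, B, C])
        roots = roots[np.abs(roots.imag) < 1e-10].real  # keep real roots
        if roots.size == 0:
            continue
        if np.all(roots > 0):
            gammas[l] = roots.min()           # both positive: take smaller
        else:
            gammas[l] = max(roots.max(), 0.0) # otherwise (max of roots)_+
    j_next = int(np.argmin(gammas))           # smallest non-negative gamma wins
    lam_next = lam_prev - gammas[j_next]      # lambda_k = lambda_{k-1} - gamma
    return j_next, lam_next
```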

IV. COMPLEX-VALUED PATHWISE WEIGHTED ELASTIC NET

Next we develop a complex-valued and weighted version of the PW-LARS-EN algorithm proposed in [18], referred to as the c-PW-WEN algorithm. Generalization to the complex-valued case is straightforward; the essential difference is that the c-LARS-WLasso algorithm (Algorithm 1) is used instead of the (real-valued) LARS-Lasso algorithm. The algorithm finds the Kth knot $\lambda_K$ and the corresponding WEN solutions on a dense grid of EN tuning parameter values $\alpha$, and then picks the final solution at the best $\alpha$-value.

Let $\lambda_0(\alpha)$ denote the smallest value of $\lambda$ such that all coefficients in the WEN estimate are zero, i.e., $\hat{\boldsymbol\beta}(\lambda_0, \alpha) = \mathbf 0$. The value of $\lambda_0(\alpha)$ can be expressed in closed form [15]:
$$ \lambda_0(\alpha) = \max_j\ \frac{1}{\alpha}\,\frac{|\langle \mathbf x_j, \mathbf y\rangle|}{w_j}, \quad j = 1, \ldots, p. $$
The c-PW-WEN algorithm computes K-sparse WEN solutions for a set of $\alpha$ values on a dense grid
$$ [\alpha] = \{\alpha_i \in (0, 1] : \alpha_1 = 1 > \cdots > \alpha_m > 0\}. \qquad (10) $$

Let $\mathcal A(\lambda) = \operatorname{supp}\{\hat{\boldsymbol\beta}(\lambda, \alpha)\}$ denote the active set (i.e., the nonzero elements of the WEN solution) for a given fixed regularization parameter value $\lambda \equiv \lambda(\alpha) < \lambda_0(\alpha)$ and for a given $\alpha$ value on the grid $[\alpha]$. The knots $\lambda_1(\alpha) > \lambda_2(\alpha) > \cdots > \lambda_K(\alpha)$ are the border values of the regularization parameter after which there is a change in the set of active predictors. Since $\alpha$ is fixed, we drop the dependency of the penalty parameter on $\alpha$ and simply write $\lambda$ or $\lambda_K$ instead of $\lambda(\alpha)$ or $\lambda_K(\alpha)$. The reader should, however, keep in mind that the values of the knots are different for any given $\alpha$. The active set at a knot $\lambda_k$ is then denoted shortly by $\mathcal A_k \equiv \mathcal A(\lambda_k) = \operatorname{supp}\{\hat{\boldsymbol\beta}(\lambda_k, \alpha)\}$. Note that $\mathcal A_k \neq \mathcal A_{k+1}$ for all $k = 1, \ldots, K$, and that it is assumed that a non-zero coefficient does not leave the active set for any value $\lambda > \lambda_K$; that is, the sparsity level increases from 0 (at $\lambda_0$) to $K$ (at $\lambda_K$).

Writing Algorithm 1 as c-LARS-WLasso$(\mathbf y, \mathbf X, \mathbf w, K)$, we use the notation
$$ \{\lambda_k, \hat{\boldsymbol\beta}(\lambda_k)\} = \text{c-LARS-WLasso}(\mathbf y, \mathbf X, \mathbf w)\big|_k $$
for extracting the kth knot (and the corresponding solution) from the sequence of knot–solution pairs found by the c-LARS-WLasso algorithm. Next, note that we can write the EN objective function in augmented form as follows:
$$ \frac{1}{2}\|\mathbf y - \mathbf X\boldsymbol\beta\|_2^2 + \lambda P_\alpha(\boldsymbol\beta) = \frac{1}{2}\|\mathbf y_a - \mathbf X_a(\eta)\boldsymbol\beta\|_2^2 + \gamma\|\boldsymbol\beta\|_1, \qquad (11) $$
where
$$ \gamma = \lambda\alpha \quad \text{and} \quad \eta = \lambda(1 - \alpha) \qquad (12) $$
are new parameterizations of the tuning and shrinkage parameter pair $(\alpha, \lambda)$, and
$$ \mathbf y_a = \begin{pmatrix} \mathbf y \\ \mathbf 0 \end{pmatrix} \quad \text{and} \quad \mathbf X_a(\eta) = \begin{pmatrix} \mathbf X \\ \sqrt{\eta}\,\mathbf I_p \end{pmatrix} $$
are the augmented forms of the response vector $\mathbf y$ and the predictor matrix $\mathbf X$, respectively. Note that (11) resembles the Lasso objective function, with $\mathbf y_a \in \mathbb C^{n+p}$ and $\mathbf X_a(\eta)$ an $(n+p) \times p$ matrix.
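As a quick illustration of the augmented form, the sketch below (the helper name augment is ours, not from the paper) builds $\mathbf y_a$, $\mathbf X_a(\eta)$, and the $\ell_1$ level $\gamma$ from a given $(\alpha, \lambda)$ pair, after which any weighted Lasso solver can be applied to the augmented data.

```python
import numpy as np

def augment(X, y, alpha, lam):
    """Map the EN pair (alpha, lam) to (gamma, eta) as in (12) and
    build the augmented data of (11)."""
    n, p = X.shape
    gamma = lam * alpha                       # l1 penalty level
    eta = lam * (1.0 - alpha)                 # ridge (shrinkage) level
    y_a = np.concatenate([y, np.zeros(p, dtype=y.dtype)])
    X_a = np.vstack([X, np.sqrt(eta) * np.eye(p, dtype=X.dtype)])
    return y_a, X_a, gamma
```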

This means that we can compute the K-sparse WEN solution at the Kth knot for fixed $\alpha$ using the c-LARS-WLasso algorithm. Our c-PW-WEN method is given in Algorithm 2. It computes the WEN solutions at the knots over the dense grid (10) of $\alpha$ values. After step 6 of the algorithm we have the solution at the Kth knot for a given $\alpha_i$ value on the grid. Having the solution $\hat{\boldsymbol\beta}(\lambda_K, \alpha_i)$ available, we then, in steps 7 to 9, compute the residual sum of squares (RSS) of the debiased WEN solution at the Kth knot (having K nonzeros),
$$ \mathrm{RSS}(\alpha_i) = \|\mathbf y - \mathbf X_{\mathcal A_K}\hat{\boldsymbol\beta}_{\mathrm{LS}}(\lambda_K, \alpha_i)\|_2^2, $$
where $\mathcal A_K = \operatorname{supp}(\hat{\boldsymbol\beta}(\lambda_K, \alpha_i))$ is the active set at the Kth knot and $\hat{\boldsymbol\beta}_{\mathrm{LS}}(\lambda_K, \alpha_i)$ is the debiased LSE, defined as
$$ \hat{\boldsymbol\beta}_{\mathrm{LS}}(\lambda_K, \alpha_i) = \mathbf X_{\mathcal A_K}^{+}\mathbf y, \qquad (13) $$
where $\mathbf X_{\mathcal A_K} \in \mathbb C^{n\times K}$ consists of the K active columns of $\mathbf X$ associated with the active set $\mathcal A_K$, and $\mathbf X_{\mathcal A_K}^{+}$ denotes its Moore–Penrose pseudo-inverse.

While sweeping through the grid of $\alpha$ values and computing the WEN solutions $\hat{\boldsymbol\beta}(\lambda_K, \alpha_i)$, we choose our best candidate solution as the WEN estimate $\hat{\boldsymbol\beta}(\lambda_K, \alpha_{\hat\imath})$ that has the smallest RSS value, i.e., $\hat\imath = \arg\min_i \mathrm{RSS}(\alpha_i)$, where $i \in \{1, \ldots, m\}$.

Algorithm 2: c-PW-WEN algorithm

input: $\mathbf y \in \mathbb C^n$, $\mathbf X \in \mathbb C^{n\times p}$, $\mathbf w \in \mathbb R^p$, $[\alpha] \in \mathbb R^m$ (recall $\alpha_1 = 1$), $K$, and debias.
output: $\hat{\boldsymbol\beta}_K \in \mathbb C^p$ and the active set $\mathcal A_K$ ($|\mathcal A_K| = K$).

1. $\{\lambda_k(\alpha_1), \hat{\boldsymbol\beta}(\lambda_k, \alpha_1)\}_{k=0}^{K}$ = c-LARS-WLasso$(\mathbf y, \mathbf X, \mathbf w, K)$
2. for $i = 2$ to $m$ do
3.   for $k = 1$ to $K$ do
4.     $\eta_k = \lambda_k(\alpha_{i-1}) \cdot (1 - \alpha_i)$
5.     $\{\gamma_k, \hat{\boldsymbol\beta}(\lambda_k, \alpha_i)\}$ = c-LARS-WLasso$(\mathbf y_a, \mathbf X_a(\eta_k), \mathbf w)\big|_k$
6.     $\lambda_k(\alpha_i) = \gamma_k/\alpha_i$
7.   $\mathcal A_K = \operatorname{supp}\{\hat{\boldsymbol\beta}(\lambda_K, \alpha_i)\}$
8.   $\hat{\boldsymbol\beta}_{\mathrm{LS}}(\lambda_K, \alpha_i) = \mathbf X_{\mathcal A_K}^{+}\mathbf y$
9.   $\mathrm{RSS}(\alpha_i) = \|\mathbf y - \mathbf X_{\mathcal A_K}\hat{\boldsymbol\beta}_{\mathrm{LS}}(\lambda_K, \alpha_i)\|_2^2$
10. $\hat\imath = \arg\min_i \mathrm{RSS}(\alpha_i)$
11. $\hat{\boldsymbol\beta}_K = \hat{\boldsymbol\beta}(\lambda_K, \alpha_{\hat\imath})$ and $\mathcal A_K = \operatorname{supp}\{\hat{\boldsymbol\beta}_K\}$
12. if debias then $[\hat{\boldsymbol\beta}_K]_{\mathcal A_K} = \mathbf X_{\mathcal A_K}^{+}\mathbf y$
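The $\alpha$ sweep and the RSS-based selection of steps 7–10 can be sketched as follows. Here wlasso_knots is a hypothetical stand-in for the c-LARS-WLasso call on the (augmented) data, assumed to return the support and solution at the Kth knot for a given $\alpha$.

```python
import numpy as np

def best_alpha_solution(y, X, w, alphas, K, wlasso_knots):
    """Sketch of the alpha sweep: debias each K-sparse WEN solution on
    its support and keep the candidate with the smallest RSS."""
    best = None
    for alpha in alphas:
        A_K, beta = wlasso_knots(y, X, w, alpha, K)       # K-th knot solution
        X_A = X[:, A_K]
        beta_ls = np.linalg.lstsq(X_A, y, rcond=None)[0]  # debiased LSE, eq. (13)
        rss = np.linalg.norm(y - X_A @ beta_ls)**2
        if best is None or rss < best[0]:
            best = (rss, alpha, A_K, beta)
    _, alpha_best, A_best, beta_best = best
    return alpha_best, A_best, beta_best
```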

V. SEQUENTIALLY ADAPTIVE ELASTIC NET

Next we turn our attention to how to choose the adaptive (i.e., data-dependent) weights in c-PW-WEN. In the adaptive Lasso [20], one ideally uses the LSE, or, if $p > n$, the Lasso, as an initial estimator $\hat{\boldsymbol\beta}_{\mathrm{init}}$ to construct the weights given in (4). The problem is that both the LSE and the Lasso estimator have very poor accuracy (high variance) when there are high correlations between predictors, which is precisely the condition of concern in this paper. This lowers the probability of exact recovery of the adaptive Lasso significantly.

To overcome this problem, we devise a sequential adaptive elastic net (SAEN) algorithm that obtains the K-sparse solution in a sequential manner, decreasing the sparsity level of the solution at each iteration and using the previous solution as adaptive weights for c-PW-WEN. SAEN is described in Algorithm 3; it runs the c-PW-WEN algorithm three times. In the first step it finds a standard (unit-weight) c-PW-WEN solution with 3K nonzero (active) coefficients, which we refer to as the initial EN solution $\hat{\boldsymbol\beta}_{\mathrm{init}}$. The obtained solution determines the adaptive weights via (4) (and hence the active set of 3K nonzero coefficients), which are used in the second step to compute the c-PW-WEN solution that has 2K nonzero coefficients. This solution again determines the adaptive weights via (4) (and hence the active set of 2K nonzero coefficients), which are used in the third step to compute the c-PW-WEN solution that has the desired K nonzero coefficients. It is important to notice that, since we start from a solution with 3K nonzeros, it is quite likely that the true K non-zero coefficients will be included in the active set of $\hat{\boldsymbol\beta}_{\mathrm{init}}$ computed in the first step of the SAEN algorithm. Note that the choice of 3K is similar to the CoSaMP algorithm [29], which also uses 3K as an initial support size. Using 3K also usually guarantees that $\mathbf X_{\mathcal A_{3K}}$ is well conditioned, which may not be the case if a value larger than 3K is chosen.

Algorithm 3: SAEN algorithm

input: $\mathbf y \in \mathbb C^n$, $\mathbf X \in \mathbb C^{n\times p}$, $[\alpha] \in \mathbb R^m$, and $K$.
output: $\hat{\boldsymbol\beta}_K \in \mathbb C^p$.

1. $\{\hat{\boldsymbol\beta}_{\mathrm{init}}, \mathcal A_{3K}\}$ = c-PW-WEN$(\mathbf y, \mathbf X, \mathbf 1_p, [\alpha], 3K, 0)$
2. $\{\hat{\boldsymbol\beta}, \mathcal A_{2K}\}$ = c-PW-WEN$(\mathbf y, \mathbf X_{\mathcal A_{3K}}, \mathbf 1_{3K} \oslash |[\hat{\boldsymbol\beta}_{\mathrm{init}}]_{\mathcal A_{3K}}|, [\alpha], 2K, 0)$
3. $\{\hat{\boldsymbol\beta}_K, \mathcal A_K\}$ = c-PW-WEN$(\mathbf y, \mathbf X_{\mathcal A_{2K}}, \mathbf 1_{2K} \oslash |[\hat{\boldsymbol\beta}]_{\mathcal A_{2K}}|, [\alpha], K, 1)$
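A compact sketch of the three SAEN steps is given below. The driver name saen is ours, and c_pw_wen is assumed to implement Algorithm 2, returning the solution restricted to its support together with the support indices (local to the columns it was given).

```python
import numpy as np

def saen(y, X, alphas, K, c_pw_wen):
    """Sketch of Algorithm 3: three reweighted c-PW-WEN passes,
    shrinking the support from 3K to 2K to K."""
    n, p = X.shape
    # Step 1: unit weights, 3K nonzeros (initial EN solution).
    b1, A1 = c_pw_wen(y, X, np.ones(p), alphas, 3 * K, debias=False)
    # Step 2: weights 1/|b1| on the 3K active columns, shrink to 2K.
    b2, A2 = c_pw_wen(y, X[:, A1], 1.0 / np.abs(b1), alphas, 2 * K, debias=False)
    # Step 3: weights 1/|b2| on the 2K active columns, final K-sparse fit.
    b3, A3 = c_pw_wen(y, X[:, A1][:, A2], 1.0 / np.abs(b2), alphas, K, debias=True)
    # Map the nested local supports back to indices of the full grid.
    support = np.asarray(A1)[np.asarray(A2)][np.asarray(A3)]
    beta = np.zeros(p, dtype=complex)
    beta[support] = b3
    return beta, support
```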

VI. SINGLE-SNAPSHOT COMPRESSIVE BEAMFORMING

Estimating the source location in terms of its DoA plays an important role in many applications. In [5] it was observed that CS algorithms can be applied for DoA estimation (e.g., of sound sources) using sensor arrays when the array output $\mathbf y$ can be expressed as a sparse (underdetermined) linear model by discretizing the DoA parameter space. This approach is referred to as compressive beamforming (CBF), and it has been subsequently used in a series of papers (e.g., [7–13,26]).

In CBF, after finding the sparse regression estimator, its support can be mapped to the DoA estimates on the grid. Thus the DoA estimates in CBF are always selected from the resulting finite set of discretized DoA parameters. Hence the resolution of CBF depends on the density of the grid (spacing ∆θ or grid size p). A denser grid implies larger mutual coherence of the basis vectors $\mathbf x_j$ (here equal to the steering vectors for the DoAs on the grid) and thus a poor recovery region for most sparse regression techniques.

The proposed SAEN estimator can effectively mitigate the effect of high mutual coherence caused by discretization of the DoA space, with significantly better performance than state-of-the-art compressed sensing algorithms. This is illustrated in Section VII via extensive simulation studies using challenging multi-source set-ups with closely spaced sources and large variation of source powers.

We assume narrowband processing and a far-field source wave impinging on an array of sensors with known configuration; the sources are assumed to be located in the far-field of the sensor array (i.e., propagation radius ≫ array size). A uniform linear array (ULA) of n sensors (e.g., hydrophones or microphones) is used for estimating the DoA $\theta \in [-90°, 90°)$ of the source with respect to the array axis. The array response (steering or wavefront vector) of the ULA for a source from DoA (in radians) $\theta \in [-\pi/2, \pi/2)$ is given by
$$ \mathbf a(\theta) = \frac{1}{\sqrt n}\,\big[1,\ e^{\imath\pi\sin\theta},\ \ldots,\ e^{\imath\pi(n-1)\sin\theta}\big]^\top, $$

where we assume half-wavelength inter-element spacing between sensors. We consider the case that $K < n$ sources from distinct DoAs arrive at the array at some time instant t. A single snapshot obtained by the ULA can then be modeled as [27]
$$ \mathbf y(t) = \mathbf A(\boldsymbol\theta)\,\mathbf s(t) + \boldsymbol\varepsilon(t), \qquad (14) $$
where $\boldsymbol\theta = (\theta_1, \ldots, \theta_K)^\top$ collects the DoAs, the matrix $\mathbf A(\boldsymbol\theta) = [\mathbf a(\theta_1)\ \cdots\ \mathbf a(\theta_K)] \in \mathbb C^{n\times K}$ is the dictionary of replicas, also known as the array steering matrix, $\mathbf s(t) \in \mathbb C^K$ contains the source waveforms, and $\boldsymbol\varepsilon(t)$ is complex noise at time instant t.

Consider an angular grid of size p (commonly $p \gg n$) of look directions of interest,
$$ [\vartheta] = \{\vartheta_i \in [-\pi/2, \pi/2) : \vartheta_1 < \cdots < \vartheta_p\}. $$
Let the ith column of the measurement matrix $\mathbf X$ in the model (1) be the array response for look direction $\vartheta_i$, so $\mathbf x_i = \mathbf a(\vartheta_i)$. Then, if the true source DoAs are contained in the angular grid, i.e., $\theta_i \in [\vartheta]$ for $i = 1, \ldots, K$, the snapshot $\mathbf y$ in (14) (where we drop the time index t) can be equivalently modeled by (1) as
$$ \mathbf y = \mathbf X\boldsymbol\beta + \boldsymbol\varepsilon, $$
where $\boldsymbol\beta$ is exactly K-sparse ($\|\boldsymbol\beta\|_0 = K$) and the nonzero elements of $\boldsymbol\beta$ contain the source waveforms $\mathbf s$. Thus identifying the true DoAs is equivalent to identifying the nonzero elements of $\boldsymbol\beta$, which we refer to as the CBF principle. Hence sparse regression and CS methods can be utilized for estimating the DoAs based on a single snapshot only. We assume that the number of sources K is known a priori.
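For illustration, a minimal sketch of the dictionary construction and the single-snapshot model is given below. The helper name steering_matrix is ours, and the two-source example values are borrowed from set-up 2 of Section VII.

```python
import numpy as np

def steering_matrix(grid_deg, n):
    """Unit-norm ULA steering vectors a(theta) for half-wavelength
    spacing; one column per look direction on the grid."""
    theta = np.deg2rad(np.asarray(grid_deg, dtype=float))
    k = np.arange(n)[:, None]                 # sensor indices 0..n-1
    return np.exp(1j * np.pi * k * np.sin(theta)) / np.sqrt(n)

# Hypothetical on-grid snapshot: two sources at -6 and 2 degrees.
rng = np.random.default_rng(0)
grid = np.arange(-90, 90)                     # Delta-theta = 1 deg, p = 180
X = steering_matrix(grid, n=40)
beta = np.zeros(grid.size, dtype=complex)
beta[np.searchsorted(grid, [-6, 2])] = [0.9, 1.0]
y = X @ beta + 0.01 * (rng.standard_normal(40) + 1j * rng.standard_normal(40))
```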

Besides SNR, also the size p or spacing ∆θ of the grid greatly affects the performance of CBF methods. The cross-correlation (coherence) between the true steering vector and the steering vectors on the grid depends on both the grid spacing and the obliqueness of the target DoAs w.r.t. the array. Moreover, the values of the cross-correlations in the Gram matrix $|\mathbf X^{\mathsf H}\mathbf X|$ also depend on the distance between array elements and the configuration of the sensor array [7]. Let us construct a measure, called the maximal basis coherence (MBC), defined as the maximum absolute value of the cross-correlations between the true steering vectors $\mathbf a(\theta_j)$ and the basis $\{\mathbf a(\vartheta_i),\ \vartheta_i \in [\vartheta]\setminus\theta_j\}$, $j \in \{1, \ldots, K\}$:
$$ \mathrm{MBC} = \max_j\ \max_{\vartheta\in[\vartheta]\setminus\theta_j}\ \big|\mathbf a(\theta_j)^{\mathsf H}\,\mathbf a(\vartheta)\big|. \qquad (15) $$
Note that the steering vectors $\mathbf a(\theta)$, $\theta \in [-\pi/2, \pi/2)$, are assumed to be normalized ($\mathbf a(\theta)^{\mathsf H}\mathbf a(\theta) = 1$). The MBC value measures the obliqueness of the incoming DoA to the array: the higher the MBC value, the more difficult it is for any CBF method to distinguish the true DoA in the grid. Note that the value of MBC also depends on the grid spacing ∆θ.
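A direct transcription of (15), reusing the steering_matrix helper sketched above, could look as follows.

```python
import numpy as np

def mbc(true_doas_deg, grid_deg, n):
    """Maximal basis coherence, eq. (15): for each true DoA theta_j,
    the largest |a(theta_j)^H a(v)| over grid angles v != theta_j."""
    A_true = steering_matrix(true_doas_deg, n)
    A_grid = steering_matrix(grid_deg, n)
    G = np.abs(A_true.conj().T @ A_grid)      # K x p cross-correlations
    for j, th in enumerate(true_doas_deg):
        G[j, np.isclose(grid_deg, th)] = 0.0  # exclude theta_j itself
    return float(G.max())
```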

Fig. 3 shows the geometry of the DoA estimation problem, where the target DoAs have varying basis coherence that increases with the level of obliqueness (inclination) on either side of the normal to the ULA axis. We define a straight DoA as one whose angle of incidence is in the shaded sector in Fig. 3; this is the region where the MBC has lower values. In contrast, an oblique DoA is one whose angle of incidence is oblique with respect to the array axis.

FIG. 3. (Color online) The straight and oblique DoAs exhibit different basis coherence.

Consider the case of a ULA with n = 40 elements receiving two sources at straight DoAs θ1 = −6° and θ2 = 2°, or at oblique DoAs θ1 = 44° and θ2 = 52°. These scenarios correspond to set-up 2 and set-up 3 of Section VII; see also Table II. The angular separation between the DoAs is 8° in both scenarios. In Table I we compute the correlation between the true steering vectors as well as with a neighboring steering vector $\mathbf a(\vartheta)$ on the grid.

TABLE I. Correlation between the true steering vectors at DoAs θ1 and θ2 and a steering vector at angle ϑ on the grid, in the two-source scenario set-ups with either straight or oblique DoAs.

                         θ1      θ2      ϑ       correlation
True straight DoAs      −6°      2°      —       0.071
                        −6°      —      −7°      0.814
                         —       2°      3°      0.812
True oblique DoAs       44°     52°      —       0.069
                        44°      —      45°      0.901
                         —      52°     53°      0.927

These values validate the fact highlighted in Fig. 3: a target with an oblique DoA w.r.t. the array has a larger maximal correlation (coherence) with the basis steering vectors. This makes it difficult for a sparse recovery method to distinguish the true steering vector $\mathbf a(\theta_i)$ from a spurious steering vector $\mathbf a(\vartheta)$ that simply has a very large correlation with the true one. Due to this mutual coherence, it may happen that neither $\mathbf a(\theta_i)$ nor $\mathbf a(\vartheta)$ is assigned a non-zero coefficient value in $\hat{\boldsymbol\beta}$, or that only one of them is, essentially at random.

VII. SIMULATION STUDIES

We consider seven simulation set-ups. The first five set-ups use grid spacing ∆θ = 1° (leading to p = 180 look directions in the grid [ϑ]) and the last two employ a sparser grid, ∆θ = 2° (leading to p = 90 look directions in the grid). The number of sensors in the ULA is n = 40 for set-ups 1–5 and n = 30 for set-ups 6–7. Each set-up has $K \in \{2, 3, 4\}$ sources at different (straight or oblique) DoAs $\boldsymbol\theta = (\theta_1, \ldots, \theta_K)^\top$, and the source waveforms are generated as $s_k = |s_k|\cdot e^{\imath\operatorname{Arg}(s_k)}$, where the source powers $|s_k| \in (0, 1]$ are fixed for each set-up but the source phases are randomly generated for each Monte-Carlo trial as $\operatorname{Arg}(s_k) \sim \mathrm{Unif}(0, 2\pi)$ for $k = 1, \ldots, K$. Table II specifies the DoAs and powers of the sources used in the set-ups; the MBC values (15) are also reported for each case.

TABLE II. Details of all the set-ups tested in this paper. The first five set-ups have grid spacing ∆θ = 1° and the last two ∆θ = 2°.

Set-up   |s_i|                    θ [°]                          MBC
1        [0.9, 1, 1]              [−5, 3, 6]                     0.814
2        [0.9, 1]                 [−6, 2]                        0.814
3        [0.9, 1]                 [44, 52]                       0.927
4        [0.8, 0.7, 1]            [43, 44, 52]                   0.927
5        [0.9, 0.1, 1, 0.4]       [−8.7, −3.8, −3.5, 9.7]        0.990
6        [0.8, 1, 0.9, 0.4]       −[48.5, 46.4, 31.5, 2.2]       0.991
7        [0.7, 1, 0.6, 0.7]       [6, 8, 14, 18]                 0.643

The error terms $\varepsilon_i$ are i.i.d. and generated from a $\mathcal{CN}(0, \sigma^2)$ distribution, where the noise variance $\sigma^2$ depends on the signal-to-noise ratio (SNR) level in decibels (dB), given by $\mathrm{SNR}(\mathrm{dB}) = 10\log_{10}(\sigma_s^2/\sigma^2)$, where $\sigma_s^2 = \frac{1}{K}\big(|s_1|^2 + |s_2|^2 + \cdots + |s_K|^2\big)$ denotes the average source power. An SNR level of 20 dB is used in this paper unless specified otherwise, and the number of Monte-Carlo trials is L = 1000.
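A sketch of the noise generation implied by these definitions (the helper name add_noise is ours; a circular complex Gaussian is drawn at the requested SNR):

```python
import numpy as np

def add_noise(y_clean, s, snr_db, rng):
    """Add CN(0, sigma^2) noise with sigma^2 set from the SNR and the
    average source power sigma_s^2 = (1/K) sum_k |s_k|^2."""
    sigma_s2 = np.mean(np.abs(s)**2)
    sigma2 = sigma_s2 / 10.0**(snr_db / 10.0)
    n = y_clean.size
    noise = np.sqrt(sigma2 / 2.0) * (rng.standard_normal(n)
                                     + 1j * rng.standard_normal(n))
    return y_clean + noise
```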

In each set-up we evaluate the performance of all methods in recovering exactly the true support and the source powers. Due to the high mutual coherence and the large differences in source powers, DoA estimation is now a challenging task. A key performance measure is the (empirical) probability of exact recovery (PER) of all K sources, defined as
$$ \mathrm{PER} = \operatorname{ave}\{\mathrm I(\mathcal A^\ast = \mathcal A_K)\}, $$
where $\mathcal A^\ast$ denotes the index set of the true source DoAs on the grid and $\mathcal A_K$ the found support set, with $|\mathcal A^\ast| = |\mathcal A_K| = K$, and the average is over all Monte-Carlo trials. Above, $\mathrm I(\cdot)$ denotes the indicator function. We also compute the average root mean squared error (RMSE) of the debiased estimate $\hat{\mathbf s} = \arg\min_{\mathbf s\in\mathbb C^K}\|\mathbf y - \mathbf X_{\mathcal A_K}\mathbf s\|_2$ of the source vector as $\mathrm{RMSE} = \sqrt{\operatorname{ave}\{\|\hat{\mathbf s} - \mathbf s\|^2\}}$.
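The two performance measures can be computed over Monte-Carlo trials as in the following sketch (the trial bookkeeping is ours):

```python
import numpy as np

def per_and_rmse(trials):
    """trials: list of (A_true, A_found, s_true, s_hat) tuples, one per
    Monte-Carlo run; s_hat is the debiased source estimate."""
    per = np.mean([set(A_t) == set(A_f) for A_t, A_f, _, _ in trials])
    mse = np.mean([np.sum(np.abs(s_t - s_h)**2) for _, _, s_t, s_h in trials])
    return per, np.sqrt(mse)
```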

A. Compared methods

This paper compares the SAEN approach to well-known greedy methods, namely orthogonal matching pursuit (OMP) [28] and compressive sampling matching pursuit (CoSaMP) [29]. Moreover, we also draw comparisons to two special cases of the c-PW-WEN algorithm: the Lasso estimate that has K non-zeros (i.e., $\hat{\boldsymbol\beta}(\lambda_K, 1)$, computed by c-PW-WEN using $\mathbf w = \mathbf 1$ and $\alpha = 1$) and the EN estimator that cherry-picks the best $\alpha$ on the grid $[\alpha] = \{\alpha_i \in (0, 1] : \alpha_1 = 1 > \cdots > \alpha_m > 0.01\}$ (i.e., $\hat{\boldsymbol\beta}(\lambda_K, \alpha_{\mathrm{bst}})$).

It is instructive to compare SAEN to a simpler adaptive EN (AEN) approach that uses adaptive weights to weight different coefficients differently, in the spirit of the adaptive Lasso [20]. This helps in understanding the effectiveness of the cleverly chosen weights and the usefulness of the three-step procedure used by the SAEN algorithm. Recall that the first step in an AEN approach is to compute the weights using some initial solution. After obtaining the (adaptive) weights, the final K-sparse AEN solution is computed. We devise three AEN approaches, each one using a different initial solution to compute the weights:

1. AEN(LSE) uses the weights $\mathbf w^{(\mathrm{LSE})} = \mathbf 1 \oslash |\mathbf X^{+}\mathbf y|$, where $\mathbf X^{+}$ is the Moore–Penrose pseudo-inverse of $\mathbf X$.

2. AEN(n) employs weights from an initial n-sparse EN solution $\hat{\boldsymbol\beta}(\lambda_n, \alpha)$ at the nth knot, found by the c-PW-WEN algorithm with $\mathbf w = \mathbf 1$.

3. AEN(3K) instead uses weights calculated from an initial EN solution $\hat{\boldsymbol\beta}(\lambda_{3K}, \alpha)$ having 3K nonzeros, as in step 1 of the SAEN algorithm, but the remaining two steps of SAEN are omitted.

The upper bound for the PER rate of SAEN is the (empirical) probability that the initial solution $\hat{\boldsymbol\beta}_{\mathrm{init}}$, computed in step 1 of Algorithm 3, contains the true support $\mathcal A^\ast$, i.e., the value
$$ \mathrm{UB} = \operatorname{ave}\big\{\mathrm I\big(\mathcal A^\ast \subset \operatorname{supp}(\hat{\boldsymbol\beta}_{\mathrm{init}})\big)\big\}, \qquad (16) $$
where the average is over all Monte-Carlo trials. We compute this upper bound to illustrate the ability of SAEN to pick the true K-sparse support from the original 3K-sparse initial value. For set-up 1 (cf. Table II), the average recovery results for all of the above-mentioned methods are provided in Table III.

TABLE III. The recovery results for set-up 1, illustrating the effectiveness of the three-step SAEN approach compared to its competitors. The SNR level was 20 dB, and the upper bound (16) for the PER rate of SAEN is given in parentheses.

         SAEN              AEN(3K)    AEN(n)
PER      (0.957) 0.864     0.689      0.616
RMSE     0.449             0.694      0.889

         AEN(LSE)    EN       Lasso    OMP      CoSaMP
PER      0           0.332    0.332    0.477    0.140
RMSE     1.870       1.163    1.163    1.060    3.558

It can be noted that the proposed SAEN outperforms all other methods and weighting schemes and recovers the true support and powers of the sources effectively. Note that SAEN's upper bound for the PER rate was 95.7%, and SAEN reached a PER rate of 86.4%. The results of the AEN approaches validate the need for an accurate initial estimate to construct the adaptive weights. For example, AEN(3K) performs better than AEN(n), but much worse than the SAEN method.

B. Straight and oblique DoAs

Set-up 2 and set-up 3 correspond to the cases where the targets are at straight and oblique DoAs, respectively. Performance results of the sparse recovery algorithms are tabulated in Table IV. As can be seen, the upper bound for the PER rate of SAEN is a full 100%, which means that the true support is correctly included in the 3K-sparse solution computed at step 1 of the algorithm. For set-up 2 (straight DoAs), all methods have almost full PER rates except CoSaMP, with a 67.8% rate. The performance of all estimators except SAEN changes drastically in set-up 3 (oblique DoAs). Here SAEN achieves a nearly perfect (~98%) PER rate, a reduction of only 2% compared to set-up 2. The other methods perform poorly; for example, the PER rate of Lasso drops from nearly 98% to 40%. Similar behavior is observed for EN, OMP, and CoSaMP.

Next we discuss the results for set-up 4, which is similar to set-up 3 except that we have introduced a third source that also arrives from an oblique DoA, θ = 43°, and the variation of the source powers is slightly larger. As can be noted from Table IV, the PER rates of the greedy algorithms OMP and CoSaMP have declined to a strikingly low 0%. This is very different from the PER rates they had in set-up 3, which contained only two sources. Indeed, the inclusion of the third source from a DoA similar to the other two completely ruined their accuracy. This is in deep contrast with the SAEN method, which still achieves a PER rate of 75%, more than twice the PER rate achieved by Lasso. SAEN again has the lowest RMSE values.

TABLE IV. Recovery results for set-ups 2–4. Note that for oblique DoAs (set-ups 3 and 4) the SAEN method outperforms the other methods, and it has perfect recovery results for set-up 2 (straight DoAs). The SNR level is 20 dB. The upper bound (16) for the PER rate of SAEN is given in parentheses.

Set-up 2 with two straight DoAs
        SAEN             EN       Lasso    OMP      CoSaMP
PER     (1.000) 1.000    0.981    0.981    0.998    0.678
RMSE    0.126            0.145    0.145    0.128    1.436

Set-up 3 with two oblique DoAs
        SAEN             EN       Lasso    OMP      CoSaMP
PER     (1.000) 0.978    0.399    0.399    0.613    0.113
RMSE    0.154            0.916    0.916    0.624    2.296

Set-up 4 with three oblique DoAs
        SAEN             EN       Lasso    OMP      CoSaMP
PER     (0.776) 0.749    0.392    0.378    0        0
RMSE    0.505            0.838    0.827    1.087    5.290

In summary, the recovery results for set-ups 1–4 (which express different degrees of basis coherence, proximity of target DoAs, and variation of source powers) clearly illustrate that the proposed SAEN performs very well in identifying the true support and the powers of the sources, and it always outperforms the commonly used benchmark sparse recovery methods, namely Lasso, EN, OMP, and CoSaMP, by a significant margin. It is also noteworthy that EN often achieved better PER rates than Lasso, which is mainly due to its group selection ability. As a specific example of this particular feature, Fig. 4 shows the solution paths of Lasso and EN for one particular Monte-Carlo trial, where EN correctly chooses the true DoAs but Lasso fails to select all correct DoAs. In this particular instance the EN tuning parameter was α = 0.9. This is the reason behind the success of our c-PW-WEN algorithm, which is the core computational engine of SAEN.

C. Off-grid sources

Set-ups 5 and 6 explore the case when the target DoAs are off the grid. Note that set-up 5 uses a finer grid spacing, ∆θ = 1°, compared to set-up 6 with ∆θ = 2°. Both set-ups contain four target sources with largely varying source powers. In the off-grid case, one would like the CBF method to localize the targets to the nearest DoAs in the angular grid [ϑ] that is used to construct the array steering matrix.

FIG. 4. (Color online) The Lasso and EN solution paths (upper panels) and the respective DoA solutions at the knot $\lambda_3$ (lower panel). Observe that Lasso fails to recover the true support, whereas EN successfully picks the true DoAs. The EN tuning parameter was α = 0.9 in this example.

Therefore, in the off-grid case, the PER rate refers to the case where the CBF method selects the K-sparse support corresponding to the DoAs on the grid that are closest in distance to the true DoAs. Table V provides the recovery results. As can be seen, SAEN again performs very well, outperforming Lasso and EN. Note that OMP and CoSaMP completely fail in selecting the nearest grid points.

TABLE V. Performance results of CBF methods for set-ups 5 and 6, where the target DoAs are off the grid. Here the PER rate refers to the case where the CBF method selects the K-sparse support corresponding to the DoAs on the grid that are closest to the true DoAs. The upper bound (16) for the PER rate of SAEN is given in parentheses. The SNR level is 20 dB.

Set-up 5 with four off-grid straight DoAs
        SAEN             EN       Lasso    OMP      CoSaMP
PER     (0.999) 0.649    0.349    0.328    0        0
RMSE    0.899            0.947    0.943    1.137    8.909

Set-up 6 with four off-grid oblique DoAs
        SAEN             EN       Lasso    OMP      CoSaMP
PER     (0.794) 0.683    0.336    0.336    0        0.005
RMSE    0.811            0.913    0.911    1.360    28.919

FIG. 5. (Color online) PER rates of CBF methods at different SNR levels for set-up 7.

D. More targets and varying SNR levels

Next we consider set-up 7 (cf. Table II), which contains K = 4 sources. The first three sources are at straight DoAs and the fourth one is at a DoA with modest obliqueness (θ4 = 18°). We now compute the PER rates of the methods as a function of SNR. From the PER rates shown in Fig. 5 we again notice that SAEN clearly outperforms all of the other methods; the upper bound (16) of the PER rate of SAEN is also plotted. Both greedy algorithms, OMP and CoSaMP, perform very poorly even at high SNR levels. Lasso and EN attain better recovery results than the greedy algorithms. Again, EN performs better than Lasso due to the additional flexibility offered by the EN tuning parameter and its ability to cope with correlated steering (basis) vectors. SAEN recovers the exact true support in most of the cases due to its step-wise adaptation using cleverly chosen weights. Furthermore, the improvement in PER rates offered by SAEN becomes larger as the SNR level increases. One can also notice that SAEN is close to the theoretical upper bound of the PER rate in the higher SNR regime.

VIII. CONCLUSIONS

We developed the c-PW-WEN algorithm, which computes weighted elastic net solutions at the knots of the penalty parameter over a grid of EN tuning parameter values. c-PW-WEN also computes the weighted Lasso as a special case (i.e., the solution at α = 1), and the adaptive EN (AEN) is obtained when adaptive (data-dependent) weights are used. We then proposed a novel SAEN approach that uses the c-PW-WEN method as its core computational engine and a three-step adaptive weighting scheme in which the sparsity level is decreased from 3K to K. Simulations illustrated that SAEN performs better than the adaptive EN approaches. Furthermore, we illustrated that the 3K-sparse initial solution computed at step 1 of SAEN provides smart weights for the further steps and includes the true K-sparse support with high accuracy; the proposed SAEN algorithm then accurately retains the true support at each step.


Using the K-sparse Lasso solution computed directly from the Lasso path at the Kth knot fails to provide exact support recovery in many cases, especially under high basis coherence and at lower SNR. Greedy algorithms often fail in the face of high mutual coherence (due to dense grid spacing or oblique target DoAs) or low SNR. This is mainly due to the fact that their performance heavily depends on accurately detecting the maximal correlation between the measurement vector $\mathbf y$ and the basis vectors (column vectors of $\mathbf X$). Our simulation study also showed that their performance (in terms of PER rate) deteriorates when the number of targets increases. In the off-grid case the greedy algorithms also failed to find the nearby grid points.

Finally, the SAEN algorithm performed better than all other methods in each set-up, and the improvement was more pronounced in the presence of high mutual coherence. This is due to the ability of SAEN to retain the true support at all three steps of the algorithm. Our MATLAB® package that implements the proposed algorithms is freely available at www.github.com/mntabassm/SAEN-LARS. The package also contains a MATLAB® live script demo on how to use the method in the CBF problem, along with an example from simulation set-up 4 presented in the paper.

ACKNOWLEDGMENTS

The research was partially supported by the Academy of Finland, grant no. 298118, which is gratefully acknowledged.

1. R. Tibshirani, "Regression shrinkage and selection via the lasso," Journal of the Royal Statistical Society, Series B (Methodological), 267–288 (1996).
2. H. Zou and T. Hastie, "Regularization and variable selection via the elastic net," Journal of the Royal Statistical Society, Series B (Statistical Methodology) 67(2), 301–320 (2005).
3. T. Yardibi, J. Li, P. Stoica, and L. N. Cattafesta III, "Sparsity constrained deconvolution approaches for acoustic source mapping," The Journal of the Acoustical Society of America 123(5), 2631–2642 (2008). doi: 10.1121/1.2896754
4. A. Xenaki, E. Fernandez-Grande, and P. Gerstoft, "Block-sparse beamforming for spatially extended sources in a Bayesian formulation," The Journal of the Acoustical Society of America 140(3), 1828–1838 (2016).
5. D. Malioutov, M. Cetin, and A. S. Willsky, "A sparse signal reconstruction perspective for source localization with sensor arrays," IEEE Transactions on Signal Processing 53(8), 3010–3022 (2005). doi: 10.1109/TSP.2005.850882
6. G. F. Edelmann and C. F. Gaumond, "Beamforming using compressive sensing," The Journal of the Acoustical Society of America 130(4), EL232–EL237 (2011). doi: 10.1121/1.3632046
7. A. Xenaki, P. Gerstoft, and K. Mosegaard, "Compressive beamforming," The Journal of the Acoustical Society of America 136(1), 260–271 (2014). doi: 10.1121/1.4883360
8. P. Gerstoft, A. Xenaki, and C. F. Mecklenbräuker, "Multiple and single snapshot compressive beamforming," The Journal of the Acoustical Society of America 138(4), 2003–2014 (2015). doi: 10.1121/1.4929941
9. Y. Choo and W. Seong, "Compressive spherical beamforming for localization of incipient tip vortex cavitation," The Journal of the Acoustical Society of America 140(6), 4085–4090 (2016). doi: 10.1121/1.4968576
10. A. Das, W. S. Hodgkiss, and P. Gerstoft, "Coherent multipath direction-of-arrival resolution using compressed sensing," IEEE Journal of Oceanic Engineering 42(2), 494–505 (2017). doi: 10.1109/JOE.2016.2576198
11. S. Fortunati, R. Grasso, F. Gini, M. S. Greco, and K. LePage, "Single-snapshot DOA estimation by using compressed sensing," EURASIP Journal on Advances in Signal Processing 2014(1), 120 (2014). doi: 10.1186/1687-6180-2014-120
12. E. Ollila, "Nonparametric simultaneous sparse recovery: An application to source localization," in Proc. European Signal Processing Conference (EUSIPCO'15), Nice, France (2015), pp. 509–513.
13. E. Ollila, "Multichannel sparse recovery of complex-valued signals using Huber's criterion," in Proc. Compressed Sensing Theory and its Applications to Radar, Sonar and Remote Sensing (CoSeRa'15), Pisa, Italy (2015), pp. 32–36.
14. M. N. Tabassum and E. Ollila, "Single-snapshot DOA estimation using adaptive elastic net in the complex domain," in 4th International Workshop on Compressed Sensing Theory and its Applications to Radar, Sonar and Remote Sensing (CoSeRa) (2016), pp. 197–201. doi: 10.1109/CoSeRa.2016.7745728
15. T. Hastie, R. Tibshirani, and M. Wainwright, Statistical Learning with Sparsity: The Lasso and Generalizations (CRC Press, 2015).
16. M. Osborne, B. Presnell, and B. Turlach, "A new approach to variable selection in least squares problems," IMA Journal of Numerical Analysis 20(3), 389–403 (2000). doi: 10.1093/imanum/20.3.389
17. B. Efron, T. Hastie, I. Johnstone, and R. Tibshirani, "Least angle regression (with discussion)," The Annals of Statistics 32(2), 407–499 (2004). doi: 10.1214/009053604000000067
18. M. N. Tabassum and E. Ollila, "Pathwise least angle regression and a significance test for the elastic net," in 25th European Signal Processing Conference (EUSIPCO) (2017), pp. 1309–1313. doi: 10.23919/EUSIPCO.2017.8081420
19. S. Foucart and H. Rauhut, A Mathematical Introduction to Compressive Sensing, Vol. 1 (Springer, 2013).
20. H. Zou, "The adaptive lasso and its oracle properties," Journal of the American Statistical Association 101(476), 1418–1429 (2006). doi: 10.1198/016214506000000735
21. A. E. Hoerl and R. W. Kennard, "Ridge regression: biased estimation for nonorthogonal problems," Technometrics 12(1), 55–67 (1970).
22. S. Rosset and J. Zhu, "Piecewise linear regularized solution paths," Annals of Statistics 35(3), 1012–1030 (2007). doi: 10.1214/009053606000001370
23. R. J. Tibshirani and J. Taylor, "The solution path of the generalized lasso," Annals of Statistics 39(3), 1335–1371 (2011). doi: 10.1214/11-AOS878
24. R. J. Tibshirani, "The lasso problem and uniqueness," Electronic Journal of Statistics 7, 1456–1490 (2013). doi: 10.1214/13-EJS815
25. A. Panahi and M. Viberg, "Fast candidate points selection in the lasso path," IEEE Signal Processing Letters 19(2), 79–82 (2012).
26. K. L. Gemba, W. S. Hodgkiss, and P. Gerstoft, "Adaptive and compressive matched field processing," The Journal of the Acoustical Society of America 141(1), 92–103 (2017). doi: 10.1121/1.4973528
27. A. B. Gershman, C. F. Mecklenbräuker, and J. F. Böhme, "Matrix fitting approach to direction of arrival estimation with imperfect spatial coherence of wavefronts," IEEE Transactions on Signal Processing 45(7), 1894–1899 (1997). doi: 10.1109/78.599968
28. J. A. Tropp and A. C. Gilbert, "Signal recovery from random measurements via orthogonal matching pursuit," IEEE Transactions on Information Theory 53(12), 4655–4666 (2007). doi: 10.1109/TIT.2007.909108
29. D. Needell and J. Tropp, "CoSaMP: Iterative signal recovery from incomplete and inaccurate samples," Applied and Computational Harmonic Analysis 26(3), 301–321 (2009).

12 J Acoust Soc Am 22 May 2018 Sequential adaptive elastic net

Page 4: Sequential adaptive elastic net approach for single …Sequential adaptive elastic net approach for single-snapshot source localization MuhammadNaveedTabassum 1andEsaOllila Aalto University,

the evolving residual rrr = y minus (stepsize)x1 ie the up-dated residual makes equal angles with both predictorsas shown in Fig 2 Thereafter LARS moves in the newdirection which keeps the evolving residual equally cor-related (ie equiangular) with selected predictors untilanother predictor becomes equally correlated with theresidual After that one can repeat this process until allpredictors are in the model or to specified sparsity level

x2x2

x1

r = y

q1

q2

0

x2

x1

r

q1

0

q2

step-size

FIG 2 Starting from all zeros LARS picks predictor x1 that

makes least angle (ie θ1 lt θ2) with residual rrr and moves in

its direction until θ1 = θ2 where LARS picks x2 and changes

direction Then LARS repeats this procedure and selects

next equicorrelated predictor and so on until some stopping

criterion

We first consider the weighted Lasso (WLasso) prob-lem (so α = 1) by letting X = Xdiag(w)minus1 and β =

β otimesw We write β(λ) for the solution of the optimiza-tion problem (3) in this case Let λ0 denotes the small-est value of λ such that all coefficients of the WLassosolution are zero ie β(λ0) = 0 It is easy to seethat λ0 = maxj |〈xj y〉| for j = 1 2 p15 Let A =

suppβ(λ) denote the active set at the regularizationparameter value λ lt λ0 The knots λ1 gt λ2 gt middot middot middot gt λK

are defined as smallest values of the penalty parametersafter which there is a change in the set of active predic-tors ie the order of sparsity changes The active set at

a knot λk is denoted by Ak = suppβ(λk) The activeset A1 thus contains a single index as A1 = j1 wherej1 is predictor that becomes active first ie

j1 = argmaxjisin1p

|〈xj y〉|

By definition of the knots one has that Ak =

suppβ(λk) forallλ isin (λkminus1 λk] and Ak 6= Ak+1 for allk = 1 K

The c-LARS-WLasso outlined in Algorithm 1 is astraightforward generalization of LARS-Lasso algorithmto complex-valued and weighted case It does not havethe same theoretical guarantees as its real-valued coun-terpart to solve the exact values of the knots NamelyLARS algorithm uses the property that (in the real-valued case) the Lasso regularization path is continuousand piecewise linear with respect to λ see1722ndash24 In thecomplex-valued case however the solution path betweenthe knots is not necessarily linear25 Hence the c-LARS-WLasso may not give precise values of the knots in allthe cases However simulations validate that the algo-rithm finds the knots with reasonable precision Future

work is needed to provide theoretical guarantees of thealgorithm to find the knots

Algorithm 1 c-LARS-WLasso algorithm

input y isin Cn X isin C

ntimesp w isin Rp and K

output Ak λk β(λk)Kk=0

initialize β(0) = 0ptimes1 A0 = empty ∆ = 0ptimes1 the

residual rrr0 = y Set Xlarr Xdiag(w)minus1

1 Compute λ0 = maxj |〈xj rrr0〉| and

j1 = argmaxj |〈xj rrr0〉| where j = 1 p for

k = 1 K do

2 Find the active set Ak = Akminus1 cup jk and its

least- squares direction δ to have [∆]Ak= δ

δ =1

λkminus1(XH

AkXAk

)minus1X

HAk

rrrkminus1

3 Define vector β(λ) = β(kminus1) + (λkminus1 minus λ)∆ for

0 lt λ le λkminus1 and the corresponding residual as

rrr(λ) = y minusXβ(λ)

= y minusXβ(kminus1) minus (λkminus1 minus λ)X∆

= rrrkminus1 minus (λkminus1 minus λ)XAkδ

4 The knot λk is the largest λ-value 0 lt λ le λkminus1

st

〈xℓ rrr(λ)〉 = λeθ ℓ 6isin Ak (7)

where a new predictor (at index jk+1 6isin Ak)

becomes active thus verifying

|〈xjk+1 rrr(λk)〉| = λk from (7)

5 Update the values at a knot λk

β(k) = β

(kminus1) + (λkminus1 minus λk)∆

rrrk = y minusXβ(k)

The Lasso solution is β(λk) = β(k)

6 β(λk) = β(λk)⊘wKk=0

Below we discuss how to solve the knot λk and theindex jk+1 in step 5 of the c-LARS-WLasso algorithm

Solving step 5 First we note that

〈xℓ rrr(λ)〉 = 〈xℓ rrrkminus1 minus (λkminus1 minus λ)XAkδ〉

= 〈xℓ rrrkminus1〉 minus (λkminus1 minus λ)〈xℓXAkδ〉

= cℓ minus (λkminus1 minus λ)bℓ (8)

where we have written cℓ = 〈xℓ rrrkminus1〉 and bℓ =〈xℓXAk

δ〉 First we need to find λ for each ℓ 6isin Aksuch that |〈xℓ rrr(λ)〉| = λ holds Due to (7) and (8) thismeans finding 0 lt λ le λkminus1 such that

cℓ minus (λkminus1 minus λ)bℓ = λeθ (9)

Let us reparametrize such that λ = λkminus1 minus γℓ Thenidentifying 0 lt λ le λkminus1 is equivalent to identifying the

4 J Acoust Soc Am 22 May 2018 Sequential adaptive elastic net

auxiliary variable γℓ ge 0 Now (9) becomes

cℓ minus (λkminus1 minus λkminus1 + γℓ)bℓ = (λkminus1 minus γℓ) eθ

hArr cℓ minus γℓbℓ = (λkminus1 minus γℓ) eθ

hArr |cℓ minus γℓbℓ|2 = (λkminus1 minus γℓ)2

hArr |cℓ|2 minus 2γℓRe(cℓblowastℓ ) + γ2

ℓ |bℓ|2 = λ2kminus1 minus 2λkminus1γℓ + γ2

The last equation implies that γℓ can be found by solvingthe roots of the second-order polynomial equation Aγ2

ℓ +Bγℓ+C = 0 where A = |bℓ|2minus1 B = 2λkminus1minus2Re(cℓb

lowastℓ )

and C = |cℓ|2 minus λ2kminus1 The roots are γℓ = γℓ1 γℓ2 and

the final value of γℓ will be

γℓ =

min(γℓ1 γℓ2) if γℓ1 gt 0 and γℓ2 gt 0

max(γℓ1 γℓ2)+ otherwise

where (t)+ = max(0 t) for t isin R Thus finding largestλ that verifies (7) is equivalent to finding smallest non-negative γℓ Hence the variable that enters to active setAk+1 is jk+1 = argminℓ 6isinAk

γℓ and the knot is thus λk =

λkminus1 minus γjk+1 Thereafter solution β(λk) at the knot λk

is simple to find in step 6

IV COMPLEX-VALUED PATHWISE WEIGHTED ELAS-

TIC NET

Next we develop a complex-valued and weighted ver-sion of PW-LARS-EN algorithm proposed in18 and re-ferred to as c-PW-WEN algorithm Generalization tocomplex-valued case is straightforward The essentialdifference is that c-LARS-WLasso Algorithm 1 is usedinstead of the (real-valued) LARS-Lasso algorithm Thealgorithm finds the Kth knot λK and the correspondingWEN solutions at a dense grid of EN tuning parametervalues α and then picks final solution for best α-value

Let λ0(α) denotes the smallest value of λ such that all

coefficients in the WEN estimate are zero ie β(λ0 α) =0 The value of λ0(α) can be expressed in closed-form15

λ0(α) = maxj

1

α

〈xj y〉wj

j = 1 p

The c-PW-WEN algorithm computes K-sparse WEN so-lutions for a set of α values in a dense grid

[α] = αi isin [1 0) α1 = 1 lt middot middot middot lt αm lt 0 (10)

Let A(λ) = suppβ(λ α) denote the active set (ienonzero elements of WEN solution) for a given fixed reg-ularization parameter value λ equiv λ(α) lt λ0(α) and forgiven α value in the grid [α] The knots λ1(α) gt λ2(α) gtmiddot middot middot gt λK(α) are the border values of the regularizationparameter after which there is a change in the set of ac-tive predictors Since α is fixed we drop the dependencyof the penalty parameter on α and simply write λ or λK

instead of λ(α) or λK(α) The reader should howeverkeep in mind that the value of the knots are different forany given α The active set at a knot λk is then denoted

shortly by Ak equiv A(λk) = suppβ(λk α) Note thatAk 6= Ak+1 for all k = 1 K Note that it is assumedthat a non-zero coefficient does not leave the active set forany value λ gt λK that is the sparsity level is increasingfrom 0 (at λ0) to K (at λK)

First we let that Algorithm 1 can be written asc-LARS-WLasso(yXwK) then we can write it as let

λk β(λk) = c-LARS-WLasso(yXw)∣

k

for extracting the kth knot (and the corresponding solu-tion) from a sequence of the knot-solution pairs found bythe c-LARS-WLasso algorithm Next note that we canwrite the EN objective function in augmented form asfollows

1

2yminusXβ22+λPα(β) =

1

2yaminusXa(η)β22+γ

∥β∥

1(11)

where

γ = λα and η = λ(1 minus α) (12)

are new parameterizations of the tuning and shrinkageparameter pair (α λ) and

ya =

(

y

0

)

and Xa(η) =

(

Xradicη Ip

)

are the augmented forms of the response vector y andthe predictor matrix X respectively Note that (11) re-sembles the Lasso objective function with ya isin Rn+p andthat Xa(η) is an (n+ p)times p matrix

It means that we can compute the K-sparse WENsolution at Kth knot for fixed α using the c-LARS-WLasso algorithm Our c-PW-WEN method is givenin algorithm 2 It computes the WEN solutions at theknots over a dense grid (10) of α values After Step 6of the algorithm we have solution at the Kth knot for a

given αi value on the grid Having the solution β(λK αi)available we then in steps 7 to 9 compute the residualsum of squares (RSS) of the debiased WEN solution atthe Kth knot (having K nonzeros)

RSS(αi) = y minusXAKβLS(λK αi)22

where AK = supp(β(λK αi)) is the active set at the Kth

knot and βLS(λK αi) is the debiased LSE defined as

βLS(λK αi) = X+AK

y (13)

where XAKisin CntimesK consists of the K active columns of

X associated with the active set AK and X+AK

denotesits Moore-Penrose pseudo inverse

While sweeping through the grid of α values and

computing the WEN solutions β(λK αi) we chooseour best candidate solution as the WEN estimateβ(λK αı) that had the smallest RSS value ie ı =argminiRSS(αi) where i isin 1 m

J Acoust Soc Am 22 May 2018 Sequential adaptive elastic net 5

Algorithm 2 c-PW-WEN algorithm

input y isin Cn X isin C

ntimesp w isin Rp [α] isin R

m

(recall α1 = 1) K and debias

output βK isin Cp and AK isin R

K

1 λk(α1) β(λk α1)Kk=0 =

c-LARS-WLasso(

yXw K)

2 for i = 2 to m do

3 for k = 1 to K do

4 ηk = λk(αiminus1) middot (1minus αi)

5

γk β(λk αi)

=

c-LARS-WLasso(

yaXa(ηk)w)∣

k

6 λk(αi) = γkαi

7 AK = suppβ(λK αi)

8 βLS(λK αi) = X+AK

y

9 RSS(αi) = y minusXAKβLS(λK αi)

22

10 ı = argmini RSS(αi)

11 βK = β(λK αı) and AK = suppβK

12 if debias then βAK= X+

AKy

V SEQUENTIALLY ADAPTIVE ELASTIC NET

Next we turn our attention on how to choose theadaptive (ie data dependent) weights in c-PW-WENIn adaptive Lasso20 one ideally uses the LSE or if p gt n

the Lasso as an initial estimator βinit to construct theweights given in (4) The problem is that both the LSEand the Lasso estimator have very poor accuracy (highvariance) when there exists high correlations betweenpredictors which is the condition we are concerned inthis paper This lowers the probability of exact recoveryof the adaptive Lasso significantly

To overcome the problem above we devise a sequen-tial adaptive elastic net (SAEN) algorithm that obtainsthe K-sparse solution in a sequential manner decreasingthe sparsity level of the solution at each iteration and us-ing the previous solution as adaptive weights for c-PW-WEN The SAEN is described in algorithm 3 SAENruns the c-PW-WEN algorithm three times At first stepit finds a standard (unit weight) c-PW-WEN solution for3K nonzero (active) coefficients which we refer to as ini-

tial EN solution βinit The obtained solution determinesthe adaptive weights via (4) (and hence the active setof 3K nonzero coefficients) which is used in the secondstep to computes the c-PW-WEN solution that has 2Knonzero coefficients This again determines the adaptiveweights via (4) (and hence the active set of 2K nonzerocoefficients) which is used in the second step to computethe c-PW-WEN solution that has the desired K nonzerocoefficients It is important to notice that since we startfrom a solution with 3K nonzeros it is quite likely thatthe true K non-zero coefficients will be included in theactive set of βinit which is computed in the first step ofthe SAEN algorithm Note that the choice of 3K is sim-

ilar to CoSaMP algorithm29 which also uses 3K as aninitial support size Using 3K also usually guaranteesthat XA3K

is well conditioned which may not be the caseif larger value than 3K is chosen

Algorithm 3 SAEN algorithm

input y isin Cn X isin C

ntimesp [α] isin Rm and K

output βK isin Cp

1

βinitA3K

= c-PW-WEN(

yX1p [α] 3K 0)

2

βA2K

=

c-PW-WEN(

yXA3K 13K ⊘

∣βinitA3K

∣ [α] 2K 0)

3

βK AK

=

c-PW-WEN(

yXA2K 12K ⊘

∣βA2K

∣ [α] K 1)

VI SINGLE-SNAPSHOT COMPRESSIVE BEAMFORM-

ING

Estimating the source location in terms of its DoAplays an important role in many applications In5 it wasobserved that CS algorithms can be applied for DoA esti-mation (eg of sound sources) using sensor arrays whenthe array output y is be expressed as sparse (underdeter-mined) linear model by discretizing the DoA parameterspace This approach is referred to as compressive beam-forming (CBF) and it has been subsequently used in aseries of papers (eg7ndash1326)

In CBF after finding the sparse regression estimatorits support can be mapped to the DoA estimates on thegrid Thus the DoA estimates in CBF are always selectedfrom the resulting finite set of discretized DoA parame-ters Hence the resolution of CBF is dependent on thedensity of the grid (spacing ∆θ or grid size p) Densergrid implies large mutual coherence of the basis vectorsxj (here equal to the steering vectors for the DoAs onthe grid) and thus a poor recovery region for most sparseregression techniques

The proposed SAEN estimator can effectively miti-gate the effect of high mutual coherence caused by dis-cretization of the DoA space with significantly better per-formance than state-of-the-art compressed sensing algo-rithms This is illustrated in Section VII via extensivesimulation studies using challenging multi-source set-upsof closely-spaced sources and large variation of sourcepowers

We assume narrowband processing and a far-fieldsource wave impinging on an array of sensors with knownconfiguration The sources are assumed to be located inthe far-field of the sensor array (ie propagation radius≫ array size) A uniform linear array (ULA) of n sensors(eg hydrophones or microphones) is used for estimat-ing the DoA θ isin [minus90 90) of the source with respectto the array axis The array response (steering or wave-front vector) of ULA for a source from DoA (in radians)

6 J Acoust Soc Am 22 May 2018 Sequential adaptive elastic net

θ isin [minusπ2 π2) is given by

a(

θ)

1radicn

[

1 eπ sin θ eπ(nminus1) sin θ]⊤

where we assume half a wavelength inter-element spac-ing between sensors We consider the case that K lt nsources from distinct DoAs θ = (θ1 θK) arrive at asensor at some time instant t A single-snapshot obtainedby ULA can then be modeled as27

y(t) = A(θ)s(t) + ε(t) (14)

where θ = (θ1 θK)⊤ collects the DoAs the matrix

A(θ) = [a(θ1) middot middot middot a(θK)] A isin CntimesK is the dictionaryof replicas also known as the array steering matrix s(t) isinCK contains the source waveforms and ε(t) is complexnoise at time instant t

Consider an angular grid of size p (commonly p ≫ n)of look directions of interest

[ϑ] = ϑi isin [minusπ2 π2) ϑ1 lt middot middot middot lt ϑp

Let the ith column of the measurement matrix X in themodel (1) be the array response for look direction ϑi soxi = a(ϑi) Then if the true source DoAs are containedin the angular grid ie θi isin [ϑ] for i = 1 K thenthe snapshot y in (14) (where we drop the time index t)can be equivalently modeled by (1) as

y = Xβ + ε

where β is exactly K-sparse (β0 = K) and nonzeroselements of β maintain the source waveforms s Thusidentifying the true DoAs is equivalent to identifyingthe nonzero elements of β which we refer to as CBF-principle Hence sparse regression and CS methods canbe utilised for estimating the DoAs based on a singlesnapshot only We assume that the number of sources Kis known a priori

Besides SNR also the size p or spacing ∆θ of the gridgreatly affect the performance of CBF methods Thecross-correlation (coherence) between the true steeringvector and steering vectors on the grid depends on boththe grid spacing and obliqueness of the target DoAs wrtto the array Moreover the values of cross-correlationsin the gram matrix |XHX| also depend on the distancebetween array elements and configuration of the sen-sors array7 Let us construct a measure called max-imal basis coherence (MBC) defined as the maximumabsolute value of the cross-correlations among the truesteering vectors a(θj) and the basis a(ϑi) ϑi isin [ϑ]θjj isin 1 K

MBC = maxj

maxϑisin[ϑ]θj

∣a(θj)H a(ϑ)

∣ (15)

Note that steering vectors a(θ) θ isin [minusπ2 π2) are as-sumed to be normalized (a(θ)H a(θ) = 1) MBC valuemeasures the obliqueness of the incoming DoA to the ar-ray The higher the MBC value the more difficult it is

for any CBF method to distinguish the true DoA in thegrid Note that the value of MBC also depends on thegrid spacing ∆θ

Fig 3 shows the geometry of DoA estimation prob-lem where the target DoArsquos have varying basis coherenceand it increases with the level of obliqueness (inclination)on either side of the normal to the ULA axis We definea straight DoA when the angle of incidence of the targetDoAs is in the shaded sector in Fig 3 This the regionwhere the MBC has lower values In contrast an obliqueDoA is defined when the angle of incidence of the targetis oblique with respect to the array axis

θ

nn-12

s(t)

1d

Source

d sinθ

frac14

Straight DoAs

FIG 3 (Color online) The straight and oblique DoAs exhibit

different basis coherence

Consider the case of ULA with n = 40 elementsreceiving two sources at straight DoAs θ1 = minus6 andθ2 = 2 or at oblique DoAs θ1 = 44 and θ2 = 52The above scenarios correspond to set-up 2 and set-up 3of Section VII see also Table II Angular separation be-tween the DoArsquos is 8 in both the scenarios In Table I wecompute the correlation between the true steering vectorsand with a neighboring steering vector a(ϑ) in the grid

TABLE I Correlation between true steering vector at DoAs

θ1 and θ2 and a steering vector at angle ϑ on the grid in a two

source scenario set-ups with either straight or oblique DoAs

θ1 θ2 ϑ correlation

True straight DoAs minus6 2 0071

minus6 minus7 0814

2 3 0812

True oblique DoAs 44 52 0069

44 45 0901

52 53 0927

These values validate the fact highlighted in Fig 3Namely a target with an oblique DoA wrt the arrayhas a larger maximal correlation (coherence) with the ba-sis steering vectors This makes it difficult for the sparse

J Acoust Soc Am 22 May 2018 Sequential adaptive elastic net 7

recovery method to identify the true steering vector a(θi)from the spurious steering vector a(ϑ) that simply has avery large correlation with the true one Due to this mu-tual coherence it may happen that neither a(θi) or a(ϑ)

are assigned a non-zero coefficient value in β or perhapsjust one of them in random fashion

VII SIMULATION STUDIES

We consider seven simulation set-ups First five set-ups use grid spacing ∆θ = 1 (leading to p = 180 lookdirections in the grid [ϑ]) and the last two employ sparsergrid ∆θ = 2 (leading to p = 90 look directions in thegrid) The number of sensors n in the ULA are n = 40 forset-ups 1-5 and n = 30 for set-ups 6-7 Each set-up hasK isin 2 3 4 sources at different (straight or oblique)DoAs θ = (θ1 θK)⊤ and the source waveforms aregenerated as sk = |sk| middot eArg(sk) where source powers|sk| isin (0 1] are fixed for each set-up but the source phasesare randomly generated for each Monte-Carlo trial asArg(sk) sim Unif(0 2π) for k = 1 K Table II speci-fies the values of DoA-s and power of the sources used inthe set-ups Also the MBC values (15) are reported foreach case

TABLE II Details of all the set-ups tested in this paper First

five set-ups have grid spacing ∆θ = 1 and last two ∆θ = 2

Set-ups |si| θ [] MBC

1 [09 1 1] [minus5 3 6] 0814

2 [09 1] [minus6 2] 0814

3 [09 1] [44 52] 0927

4 [08 07 1] [43 44 52] 0927

5 [09 01 1 04] [minus87minus38minus35 97] 0990

6 [08 1 09 04] minus[485 464 315 22] 0991

7 [07 1 06 07] [6 8 14 18] 0643

The error terms εi are iid and generated fromCN (0 σ2) distribution where the noise variance σ2 de-pends on the signal-to-noise ratio (SNR) level in deci-bel (dB) given by SNR(dB) = 10 log10(σ

2sσ

2) whereσ2s = 1

K

|s1|2 + |s2|2 + middot middot middot + |sK |2

denotes the averagesource power An SNR level of 20 dB is used in thispaper unless its specified otherwise and the number ofMonte-Carlo trials is L = 1000

In each set-up we evaluate the performance of allmethods in recovering exactly the true support and thesource powers Due to high mutual coherence and due thelarge differences in source powers the DoA estimation isnow a challenging task A key performance measure isthe (empirical) probability of exact recovery (PER) of allK sources defined as

PER = aveI(Alowast = AK)

where Alowast denotes the index set of true source DoAs onthe grid and set AK the found support set where |Alowast| =|AK | = K and the average is over all Monte-Carlo tri-als Above I(middot) denotes the indicator function We alsocompute the average root mean squared error (RMSE)of the debiased estimate s = argminsisinCK y minus XAK

s2of the source vector as RMSE =

radic

avesminus s2

A Compared methods

This paper compares the SAEN approach to the ex-isting well-known greedy methods such as orthogonalmatching pursuit (OMP)28 and compressive samplingmatching pursuit (CoSaMP)29 Moreover we also drawcomparisons for two special cases of the c-PW-WEN al-gorithm to Lasso estimate that has K-non-zeros (ie

β(λK 1) computed by c-PW-WEN using w = 1 andα = 1) and EN estimator when cherry-picking the bestα in the grid [α] = αi isin [1 0) α1 = 1 lt middot middot middot lt αm lt

001 (ie β(λK αbst))It is instructive to compare the SAEN to simpler

adaptive EN (AEN) approach that simply uses adaptiveweights to weight different coefficients differently in thespirit of adaptive Lasso20 This helps in understandingthe effectiveness of the cleverly chosen weights and theusefulness of the three-step procedure used by the SAENalgorithm Recall that the first step in AEN approach iscompute the weights using some initial solution Afterobtaining the (adaptive) weights the final K-sparse AENsolution is computed We devise three AEN approacheseach one using a different initial solution to compute theweights

1 AEN(LSE) uses the weights found as

w(LSE) = 1⊘ |X+y|

where X+ is Moore-Penrose pseudo inverse of X

2 AEN(n) employs weights from an initial n-sparse

EN solution β(λn α) at nth knot which is found by

c-PW-WEN algorithm with w = 1

3 AEN(3K) instead uses weights calculated from an

initial EN solution β(λ3K α) having 3K nonzerosas in step 1 of SAEN algorithm but the remainingtwo steps of SAEN algorithm are omitted

The upper bound for the PER rate of SAEN is the (empirical) probability that the initial solution β̂_init computed in step 1 of Algorithm 3 contains the true support A*, i.e., the value

UB = ave{ I( A* ⊂ supp(β̂_init) ) },    (16)

where the average is over all Monte-Carlo trials. We also compute this upper bound to illustrate the ability of SAEN to pick the true K-sparse support from the original 3K-sparse initial value. For set-up 1 (cf. Table II), the average recovery results for all of the above-mentioned methods are provided in Table III.
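As a sketch, the empirical upper bound (16) is a one-liner; init_supports (the per-trial supports of β̂_init) and A_star are hypothetical names, not part of the authors' package.

```python
# Empirical upper bound (16): fraction of trials in which the 3K-sparse initial
# support supp(beta_init) contains the true support A*
ub = np.mean([set(A_star) <= set(supp_init) for supp_init in init_supports])
```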


TABLE III. Recovery results for set-up 1, illustrating the effectiveness of the three-step SAEN approach compared to its competitors. The SNR level was 20 dB and the upper bound (16) for the PER rate of SAEN is given in parentheses.

         SAEN            AEN(3K)   AEN(n)
PER      (0.957) 0.864   0.689     0.616
RMSE     0.449           0.694     0.889

         AEN(LSE)   EN      Lasso   OMP     CoSaMP
PER      0          0.332   0.332   0.477   0.140
RMSE     1.870      1.163   1.163   1.060   3.558

It can be noted that the proposed SAEN outperforms all other methods and weighting schemes and recovers the true support and powers of the sources effectively. Note that SAEN's upper bound for the PER rate was 95.7% and SAEN reached a PER rate of 86.4%. The results of the AEN approaches validate the need for an accurate initial estimate to construct the adaptive weights. For example, AEN(3K) performs better than AEN(n), but much worse than the SAEN method.

B. Straight and oblique DoAs

Set-up 2 and set-up 3 correspond to the cases where the targets are at straight and oblique DoAs, respectively. Performance results of the sparse recovery algorithms are tabulated in Table IV. As can be seen, the upper bound for the PER rate of SAEN is a full 100%, which means that the true support is always included in the 3K-sparse solution computed at step 1 of the algorithm. For set-up 2 (straight DoAs), all methods have almost full PER rates except CoSaMP, with a 67.8% rate. The performance of all estimators except SAEN changes drastically in set-up 3 (oblique DoAs). Here SAEN achieves a nearly perfect (∼98%) PER rate, a reduction of only 2% compared to set-up 2. The other methods perform poorly; for example, the PER rate of Lasso drops from nearly 98% to 40%. Similar behavior is observed for EN, OMP and CoSaMP.

Next we discuss the results for set-up 4, which is similar to set-up 3 except that we have introduced a third source that also arrives from an oblique DoA, θ = 43°, and the variation of the source powers is slightly larger. As can be noted from Table IV, the PER rates of the greedy algorithms OMP and CoSaMP have declined to a strikingly low 0%. This is very different from the PER rates they had in set-up 3, which contained only two sources. Indeed, the inclusion of a third source from a DoA close to the other two completely ruined their accuracy. This is in stark contrast to the SAEN method, which still achieves a PER rate of 75%, more than twice the PER rate achieved by Lasso. SAEN again has the lowest RMSE values.

TABLE IV. Recovery results for set-ups 2-4. Note that for oblique DoAs (set-ups 3 and 4) the SAEN method outperforms the other methods, and it has perfect recovery results for set-up 2 (straight DoAs). SNR level is 20 dB. The upper bound (16) for the PER rate of SAEN is given in parentheses.

Set-up 2 with two straight DoAs
         SAEN            EN      Lasso   OMP     CoSaMP
PER      (1.000) 1.000   0.981   0.981   0.998   0.678
RMSE     0.126           0.145   0.145   0.128   1.436

Set-up 3 with two oblique DoAs
         SAEN            EN      Lasso   OMP     CoSaMP
PER      (1.000) 0.978   0.399   0.399   0.613   0.113
RMSE     0.154           0.916   0.916   0.624   2.296

Set-up 4 with three oblique DoAs
         SAEN            EN      Lasso   OMP     CoSaMP
PER      (0.776) 0.749   0.392   0.378   0       0
RMSE     0.505           0.838   0.827   1.087   5.290

In summary, the recovery results for set-ups 1-4 (which express different degrees of basis coherence, proximity of target DoAs, and variation of source powers) clearly illustrate that the proposed SAEN performs very well in identifying the true support and the powers of the sources, and it always outperforms the commonly used benchmark sparse recovery methods, namely Lasso, EN, OMP and CoSaMP, by a significant margin. It is also noteworthy that EN often achieved better PER rates than Lasso, which is mainly due to its group selection ability. As a specific example of this feature, Figure 4 shows the solution paths of Lasso and EN for one particular Monte-Carlo trial, where EN correctly chooses the true DoAs but Lasso fails to select all correct DoAs. In this particular instance the EN tuning parameter was α = 0.9. This ability is the reason behind the success of our c-PW-WEN algorithm, which is the core computational engine of SAEN.

C. Off-grid sources

Set-ups 5 and 6 explore the case where the target DoAs are off the grid. Note that set-up 5 uses a finer grid spacing ∆θ = 1° compared to set-up 6 with ∆θ = 2°. Both set-ups contain four target sources that have largely varying source powers. In the off-grid case one would like the CBF method to localize the targets to the nearest DoAs in the angular grid [ϑ] that is used to construct the array steering matrix.


[Figure 4 panels: (a) Lasso and (b) EN solution paths (power vs. −log(λ), with the knot λ₃ and the DoAs 43°, 44°, 52° marked); (c) recovered power vs. DoA in degrees, with legend Sources / Lasso / EN.]

FIG. 4. (Color online) The Lasso and EN solution paths (upper panel) and the respective DoA solutions at the knot λ₃. Observe that Lasso fails to recover the true support but EN successfully picks the true DoAs. The EN tuning parameter was α = 0.9 in this example.

Therefore, in the off-grid case, the PER rate refers to the case where the CBF method selects the K-sparse support corresponding to the grid DoAs closest in distance to the true DoAs. Table V provides the recovery results. As can be seen, SAEN again performs very well, outperforming Lasso and EN. Note that OMP and CoSaMP completely fail in selecting the nearest grid points.
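For this evaluation, the nearest-grid support is easy to form; a minimal sketch, assuming the grid array from the earlier snippet and a hypothetical doas_true vector of off-grid DoAs in degrees:

```python
# Off-grid evaluation: map each true DoA to the index of the nearest grid point;
# recovery counts as exact when the found support equals this nearest-grid support.
A_star = [int(np.argmin(np.abs(grid - t))) for t in doas_true]
```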

TABLE V. Performance results of CBF methods for set-ups 5 and 6, where the target DoAs are off the grid. Here the PER rate refers to the case where the CBF method selects the K-sparse support corresponding to the grid DoAs closest to the true DoAs. The upper bound (16) for the PER rate of SAEN is given in parentheses. SNR level is 20 dB.

Set-up 5 with four off-grid straight DoAs
         SAEN            EN      Lasso   OMP     CoSaMP
PER      (0.999) 0.649   0.349   0.328   0       0
RMSE     0.899           0.947   0.943   1.137   8.909

Set-up 6 with four off-grid oblique DoAs
         SAEN            EN      Lasso   OMP     CoSaMP
PER      (0.794) 0.683   0.336   0.336   0       0.005
RMSE     0.811           0.913   0.911   1.360   28.919

FIG. 5. (Color online) PER rates of CBF methods at different SNR levels (0-30 dB) for set-up 7.

D. More targets and varying SNR levels

Next we consider set-up 7 (cf. Table II), which contains K = 4 sources. The first three sources are at straight DoAs and the fourth one is at a DoA with modest obliqueness (θ₄ = 18°). We now compute the PER rates of the methods as a function of the SNR. From the PER rates shown in Fig. 5 we again notice that SAEN clearly outperforms all of the other methods; the upper bound (16) of the PER rate of SAEN is also plotted. Both greedy algorithms, OMP and CoSaMP, perform very poorly even at high SNR levels. Lasso and EN attain better recovery results than the greedy algorithms. Again, EN performs better than Lasso due to the additional flexibility offered by the EN tuning parameter and its ability to cope with correlated steering (basis) vectors. SAEN recovers the exact true support in most of the cases due to its step-wise adaptation using cleverly chosen weights. Furthermore, the improvement in PER rates offered by SAEN becomes larger as the SNR level increases. One can also notice that SAEN is close to the theoretical upper bound of the PER rate in the higher SNR regime.

VIII. CONCLUSIONS

We developed the c-PW-WEN algorithm, which computes weighted elastic net solutions at the knots of the penalty parameter over a grid of EN tuning parameter values. c-PW-WEN also computes the weighted Lasso as a special case (i.e., the solution at α = 1), and the adaptive EN (AEN) is obtained when adaptive (data-dependent) weights are used. We then proposed a novel SAEN approach that uses the c-PW-WEN method as its core computational engine and employs a three-step adaptive weighting scheme in which the sparsity level is decreased from 3K to K in three steps. Simulations illustrated that SAEN performs better than the adaptive EN approaches. Furthermore, we illustrated that the 3K-sparse initial solution computed at step 1 of SAEN provides smart weights for the further steps and includes the true K-sparse support with high accuracy; the SAEN algorithm thereby retains the true support accurately at each step.


Using the K-sparse Lasso solution computed directly from the Lasso path at the Kth knot fails to provide exact support recovery in many cases, especially under high basis coherence and lower SNR. Greedy algorithms often fail in the face of high mutual coherence (due to dense grid spacing or oblique target DoAs) or low SNR. This is mainly due to the fact that their performance heavily depends on their ability to accurately detect the maximal correlation between the measurement vector y and the basis vectors (column vectors of X). Our simulation study also showed that their performance (in terms of the PER rate) deteriorates when the number of targets increases. In the off-grid case the greedy algorithms also failed to find the nearby grid points.

Finally, the SAEN algorithm performed better than all other methods in each set-up, and the improvement was more pronounced in the presence of high mutual coherence. This is due to the ability of SAEN to include the true support correctly at all three steps of the algorithm. Our MATLAB® package that implements the proposed algorithms is freely available at www.github.com/mntabassm/SAEN-LARS. The package also contains a MATLAB® live script demo on how to use the method in the CBF problem, along with an example from simulation set-up 4 presented in the paper.

ACKNOWLEDGMENTS

The research was partially supported by the Academy of Finland (grant no. 298118), which is gratefully acknowledged.

1. R. Tibshirani, "Regression shrinkage and selection via the lasso," Journal of the Royal Statistical Society, Series B (Methodological), 267-288 (1996).

2. H. Zou and T. Hastie, "Regularization and variable selection via the elastic net," Journal of the Royal Statistical Society, Series B (Statistical Methodology) 67(2), 301-320 (2005).

3. T. Yardibi, J. Li, P. Stoica, and L. N. Cattafesta III, "Sparsity constrained deconvolution approaches for acoustic source mapping," The Journal of the Acoustical Society of America 123(5), 2631-2642 (2008), doi: 10.1121/1.2896754.

4. A. Xenaki, E. Fernandez-Grande, and P. Gerstoft, "Block-sparse beamforming for spatially extended sources in a Bayesian formulation," The Journal of the Acoustical Society of America 140(3), 1828-1838 (2016).

5. D. Malioutov, M. Cetin, and A. S. Willsky, "A sparse signal reconstruction perspective for source localization with sensor arrays," IEEE Transactions on Signal Processing 53(8), 3010-3022 (2005), doi: 10.1109/TSP.2005.850882.

6. G. F. Edelmann and C. F. Gaumond, "Beamforming using compressive sensing," The Journal of the Acoustical Society of America 130(4), EL232-EL237 (2011), doi: 10.1121/1.3632046.

7. A. Xenaki, P. Gerstoft, and K. Mosegaard, "Compressive beamforming," The Journal of the Acoustical Society of America 136(1), 260-271 (2014), doi: 10.1121/1.4883360.

8. P. Gerstoft, A. Xenaki, and C. F. Mecklenbräuker, "Multiple and single snapshot compressive beamforming," The Journal of the Acoustical Society of America 138(4), 2003-2014 (2015), doi: 10.1121/1.4929941.

9. Y. Choo and W. Seong, "Compressive spherical beamforming for localization of incipient tip vortex cavitation," The Journal of the Acoustical Society of America 140(6), 4085-4090 (2016), doi: 10.1121/1.4968576.

10. A. Das, W. S. Hodgkiss, and P. Gerstoft, "Coherent multipath direction-of-arrival resolution using compressed sensing," IEEE Journal of Oceanic Engineering 42(2), 494-505 (2017), doi: 10.1109/JOE.2016.2576198.

11. S. Fortunati, R. Grasso, F. Gini, M. S. Greco, and K. LePage, "Single-snapshot DOA estimation by using compressed sensing," EURASIP Journal on Advances in Signal Processing 2014(1), 120 (2014), doi: 10.1186/1687-6180-2014-120.

12. E. Ollila, "Nonparametric simultaneous sparse recovery: an application to source localization," in Proc. European Signal Processing Conference (EUSIPCO'15), Nice, France (2015), pp. 509-513.

13. E. Ollila, "Multichannel sparse recovery of complex-valued signals using Huber's criterion," in Proc. Compressed Sensing Theory and its Applications to Radar, Sonar and Remote Sensing (CoSeRa'15), Pisa, Italy (2015), pp. 32-36.

14. M. N. Tabassum and E. Ollila, "Single-snapshot DOA estimation using adaptive elastic net in the complex domain," in 4th International Workshop on Compressed Sensing Theory and its Applications to Radar, Sonar and Remote Sensing (CoSeRa) (2016), pp. 197-201, doi: 10.1109/CoSeRa.2016.7745728.

15. T. Hastie, R. Tibshirani, and M. Wainwright, Statistical Learning with Sparsity: The Lasso and Generalizations (CRC Press, 2015).

16. M. Osborne, B. Presnell, and B. Turlach, "A new approach to variable selection in least squares problems," IMA Journal of Numerical Analysis 20(3), 389-403 (2000), doi: 10.1093/imanum/20.3.389.

17. B. Efron, T. Hastie, I. Johnstone, and R. Tibshirani, "Least angle regression (with discussion)," The Annals of Statistics 32(2), 407-499 (2004), doi: 10.1214/009053604000000067.

18. M. N. Tabassum and E. Ollila, "Pathwise least angle regression and a significance test for the elastic net," in 25th European Signal Processing Conference (EUSIPCO) (2017), pp. 1309-1313, doi: 10.23919/EUSIPCO.2017.8081420.

19. S. Foucart and H. Rauhut, A Mathematical Introduction to Compressive Sensing, Vol. 1 (Springer, 2013).

20. H. Zou, "The adaptive lasso and its oracle properties," Journal of the American Statistical Association 101(476), 1418-1429 (2006), doi: 10.1198/016214506000000735.

21. A. E. Hoerl and R. W. Kennard, "Ridge regression: biased estimation for nonorthogonal problems," Technometrics 12(1), 55-67 (1970).

22. S. Rosset and J. Zhu, "Piecewise linear regularized solution paths," Ann. Statist. 35(3), 1012-1030 (2007), doi: 10.1214/009053606000001370.

23. R. J. Tibshirani and J. Taylor, "The solution path of the generalized lasso," Ann. Statist. 39(3), 1335-1371 (2011), doi: 10.1214/11-AOS878.

24. R. J. Tibshirani, "The lasso problem and uniqueness," Electron. J. Statist. 7, 1456-1490 (2013), doi: 10.1214/13-EJS815.

25. A. Panahi and M. Viberg, "Fast candidate points selection in the lasso path," IEEE Signal Processing Letters 19(2), 79-82 (2012).

26. K. L. Gemba, W. S. Hodgkiss, and P. Gerstoft, "Adaptive and compressive matched field processing," The Journal of the Acoustical Society of America 141(1), 92-103 (2017), doi: 10.1121/1.4973528.

27. A. B. Gershman, C. F. Mecklenbrauker, and J. F. Bohme, "Matrix fitting approach to direction of arrival estimation with imperfect spatial coherence of wavefronts," IEEE Transactions on Signal Processing 45(7), 1894-1899 (1997), doi: 10.1109/78.599968.

28. J. A. Tropp and A. C. Gilbert, "Signal recovery from random measurements via orthogonal matching pursuit," IEEE Transactions on Information Theory 53(12), 4655-4666 (2007), doi: 10.1109/TIT.2007.909108.

29. D. Needell and J. Tropp, "CoSaMP: iterative signal recovery from incomplete and inaccurate samples," Applied and Computational Harmonic Analysis 26(3), 301-321 (2009).


Page 5: Sequential adaptive elastic net approach for single …Sequential adaptive elastic net approach for single-snapshot source localization MuhammadNaveedTabassum 1andEsaOllila Aalto University,

auxiliary variable γℓ ge 0 Now (9) becomes

cℓ minus (λkminus1 minus λkminus1 + γℓ)bℓ = (λkminus1 minus γℓ) eθ

hArr cℓ minus γℓbℓ = (λkminus1 minus γℓ) eθ

hArr |cℓ minus γℓbℓ|2 = (λkminus1 minus γℓ)2

hArr |cℓ|2 minus 2γℓRe(cℓblowastℓ ) + γ2

ℓ |bℓ|2 = λ2kminus1 minus 2λkminus1γℓ + γ2

The last equation implies that γℓ can be found by solvingthe roots of the second-order polynomial equation Aγ2

ℓ +Bγℓ+C = 0 where A = |bℓ|2minus1 B = 2λkminus1minus2Re(cℓb

lowastℓ )

and C = |cℓ|2 minus λ2kminus1 The roots are γℓ = γℓ1 γℓ2 and

the final value of γℓ will be

γℓ =

min(γℓ1 γℓ2) if γℓ1 gt 0 and γℓ2 gt 0

max(γℓ1 γℓ2)+ otherwise

where (t)+ = max(0 t) for t isin R Thus finding largestλ that verifies (7) is equivalent to finding smallest non-negative γℓ Hence the variable that enters to active setAk+1 is jk+1 = argminℓ 6isinAk

γℓ and the knot is thus λk =

λkminus1 minus γjk+1 Thereafter solution β(λk) at the knot λk

is simple to find in step 6

IV COMPLEX-VALUED PATHWISE WEIGHTED ELAS-

TIC NET

Next we develop a complex-valued and weighted ver-sion of PW-LARS-EN algorithm proposed in18 and re-ferred to as c-PW-WEN algorithm Generalization tocomplex-valued case is straightforward The essentialdifference is that c-LARS-WLasso Algorithm 1 is usedinstead of the (real-valued) LARS-Lasso algorithm Thealgorithm finds the Kth knot λK and the correspondingWEN solutions at a dense grid of EN tuning parametervalues α and then picks final solution for best α-value

Let λ0(α) denotes the smallest value of λ such that all

coefficients in the WEN estimate are zero ie β(λ0 α) =0 The value of λ0(α) can be expressed in closed-form15

λ0(α) = maxj

1

α

〈xj y〉wj

j = 1 p

The c-PW-WEN algorithm computes K-sparse WEN so-lutions for a set of α values in a dense grid

[α] = αi isin [1 0) α1 = 1 lt middot middot middot lt αm lt 0 (10)

Let A(λ) = suppβ(λ α) denote the active set (ienonzero elements of WEN solution) for a given fixed reg-ularization parameter value λ equiv λ(α) lt λ0(α) and forgiven α value in the grid [α] The knots λ1(α) gt λ2(α) gtmiddot middot middot gt λK(α) are the border values of the regularizationparameter after which there is a change in the set of ac-tive predictors Since α is fixed we drop the dependencyof the penalty parameter on α and simply write λ or λK

instead of λ(α) or λK(α) The reader should howeverkeep in mind that the value of the knots are different forany given α The active set at a knot λk is then denoted

shortly by Ak equiv A(λk) = suppβ(λk α) Note thatAk 6= Ak+1 for all k = 1 K Note that it is assumedthat a non-zero coefficient does not leave the active set forany value λ gt λK that is the sparsity level is increasingfrom 0 (at λ0) to K (at λK)

First we let that Algorithm 1 can be written asc-LARS-WLasso(yXwK) then we can write it as let

λk β(λk) = c-LARS-WLasso(yXw)∣

k

for extracting the kth knot (and the corresponding solu-tion) from a sequence of the knot-solution pairs found bythe c-LARS-WLasso algorithm Next note that we canwrite the EN objective function in augmented form asfollows

1

2yminusXβ22+λPα(β) =

1

2yaminusXa(η)β22+γ

∥β∥

1(11)

where

γ = λα and η = λ(1 minus α) (12)

are new parameterizations of the tuning and shrinkageparameter pair (α λ) and

ya =

(

y

0

)

and Xa(η) =

(

Xradicη Ip

)

are the augmented forms of the response vector y andthe predictor matrix X respectively Note that (11) re-sembles the Lasso objective function with ya isin Rn+p andthat Xa(η) is an (n+ p)times p matrix

It means that we can compute the K-sparse WENsolution at Kth knot for fixed α using the c-LARS-WLasso algorithm Our c-PW-WEN method is givenin algorithm 2 It computes the WEN solutions at theknots over a dense grid (10) of α values After Step 6of the algorithm we have solution at the Kth knot for a

given αi value on the grid Having the solution β(λK αi)available we then in steps 7 to 9 compute the residualsum of squares (RSS) of the debiased WEN solution atthe Kth knot (having K nonzeros)

RSS(αi) = y minusXAKβLS(λK αi)22

where AK = supp(β(λK αi)) is the active set at the Kth

knot and βLS(λK αi) is the debiased LSE defined as

βLS(λK αi) = X+AK

y (13)

where XAKisin CntimesK consists of the K active columns of

X associated with the active set AK and X+AK

denotesits Moore-Penrose pseudo inverse

While sweeping through the grid of α values and

computing the WEN solutions β(λK αi) we chooseour best candidate solution as the WEN estimateβ(λK αı) that had the smallest RSS value ie ı =argminiRSS(αi) where i isin 1 m

J Acoust Soc Am 22 May 2018 Sequential adaptive elastic net 5

Algorithm 2 c-PW-WEN algorithm

input y isin Cn X isin C

ntimesp w isin Rp [α] isin R

m

(recall α1 = 1) K and debias

output βK isin Cp and AK isin R

K

1 λk(α1) β(λk α1)Kk=0 =

c-LARS-WLasso(

yXw K)

2 for i = 2 to m do

3 for k = 1 to K do

4 ηk = λk(αiminus1) middot (1minus αi)

5

γk β(λk αi)

=

c-LARS-WLasso(

yaXa(ηk)w)∣

k

6 λk(αi) = γkαi

7 AK = suppβ(λK αi)

8 βLS(λK αi) = X+AK

y

9 RSS(αi) = y minusXAKβLS(λK αi)

22

10 ı = argmini RSS(αi)

11 βK = β(λK αı) and AK = suppβK

12 if debias then βAK= X+

AKy

V SEQUENTIALLY ADAPTIVE ELASTIC NET

Next we turn our attention on how to choose theadaptive (ie data dependent) weights in c-PW-WENIn adaptive Lasso20 one ideally uses the LSE or if p gt n

the Lasso as an initial estimator βinit to construct theweights given in (4) The problem is that both the LSEand the Lasso estimator have very poor accuracy (highvariance) when there exists high correlations betweenpredictors which is the condition we are concerned inthis paper This lowers the probability of exact recoveryof the adaptive Lasso significantly

To overcome the problem above we devise a sequen-tial adaptive elastic net (SAEN) algorithm that obtainsthe K-sparse solution in a sequential manner decreasingthe sparsity level of the solution at each iteration and us-ing the previous solution as adaptive weights for c-PW-WEN The SAEN is described in algorithm 3 SAENruns the c-PW-WEN algorithm three times At first stepit finds a standard (unit weight) c-PW-WEN solution for3K nonzero (active) coefficients which we refer to as ini-

tial EN solution βinit The obtained solution determinesthe adaptive weights via (4) (and hence the active setof 3K nonzero coefficients) which is used in the secondstep to computes the c-PW-WEN solution that has 2Knonzero coefficients This again determines the adaptiveweights via (4) (and hence the active set of 2K nonzerocoefficients) which is used in the second step to computethe c-PW-WEN solution that has the desired K nonzerocoefficients It is important to notice that since we startfrom a solution with 3K nonzeros it is quite likely thatthe true K non-zero coefficients will be included in theactive set of βinit which is computed in the first step ofthe SAEN algorithm Note that the choice of 3K is sim-

ilar to CoSaMP algorithm29 which also uses 3K as aninitial support size Using 3K also usually guaranteesthat XA3K

is well conditioned which may not be the caseif larger value than 3K is chosen

Algorithm 3 SAEN algorithm

input y isin Cn X isin C

ntimesp [α] isin Rm and K

output βK isin Cp

1

βinitA3K

= c-PW-WEN(

yX1p [α] 3K 0)

2

βA2K

=

c-PW-WEN(

yXA3K 13K ⊘

∣βinitA3K

∣ [α] 2K 0)

3

βK AK

=

c-PW-WEN(

yXA2K 12K ⊘

∣βA2K

∣ [α] K 1)

VI SINGLE-SNAPSHOT COMPRESSIVE BEAMFORM-

ING

Estimating the source location in terms of its DoAplays an important role in many applications In5 it wasobserved that CS algorithms can be applied for DoA esti-mation (eg of sound sources) using sensor arrays whenthe array output y is be expressed as sparse (underdeter-mined) linear model by discretizing the DoA parameterspace This approach is referred to as compressive beam-forming (CBF) and it has been subsequently used in aseries of papers (eg7ndash1326)

In CBF after finding the sparse regression estimatorits support can be mapped to the DoA estimates on thegrid Thus the DoA estimates in CBF are always selectedfrom the resulting finite set of discretized DoA parame-ters Hence the resolution of CBF is dependent on thedensity of the grid (spacing ∆θ or grid size p) Densergrid implies large mutual coherence of the basis vectorsxj (here equal to the steering vectors for the DoAs onthe grid) and thus a poor recovery region for most sparseregression techniques

The proposed SAEN estimator can effectively miti-gate the effect of high mutual coherence caused by dis-cretization of the DoA space with significantly better per-formance than state-of-the-art compressed sensing algo-rithms This is illustrated in Section VII via extensivesimulation studies using challenging multi-source set-upsof closely-spaced sources and large variation of sourcepowers

We assume narrowband processing and a far-fieldsource wave impinging on an array of sensors with knownconfiguration The sources are assumed to be located inthe far-field of the sensor array (ie propagation radius≫ array size) A uniform linear array (ULA) of n sensors(eg hydrophones or microphones) is used for estimat-ing the DoA θ isin [minus90 90) of the source with respectto the array axis The array response (steering or wave-front vector) of ULA for a source from DoA (in radians)

6 J Acoust Soc Am 22 May 2018 Sequential adaptive elastic net

θ isin [minusπ2 π2) is given by

a(

θ)

1radicn

[

1 eπ sin θ eπ(nminus1) sin θ]⊤

where we assume half a wavelength inter-element spac-ing between sensors We consider the case that K lt nsources from distinct DoAs θ = (θ1 θK) arrive at asensor at some time instant t A single-snapshot obtainedby ULA can then be modeled as27

y(t) = A(θ)s(t) + ε(t) (14)

where θ = (θ1 θK)⊤ collects the DoAs the matrix

A(θ) = [a(θ1) middot middot middot a(θK)] A isin CntimesK is the dictionaryof replicas also known as the array steering matrix s(t) isinCK contains the source waveforms and ε(t) is complexnoise at time instant t

Consider an angular grid of size p (commonly p ≫ n)of look directions of interest

[ϑ] = ϑi isin [minusπ2 π2) ϑ1 lt middot middot middot lt ϑp

Let the ith column of the measurement matrix X in themodel (1) be the array response for look direction ϑi soxi = a(ϑi) Then if the true source DoAs are containedin the angular grid ie θi isin [ϑ] for i = 1 K thenthe snapshot y in (14) (where we drop the time index t)can be equivalently modeled by (1) as

y = Xβ + ε

where β is exactly K-sparse (β0 = K) and nonzeroselements of β maintain the source waveforms s Thusidentifying the true DoAs is equivalent to identifyingthe nonzero elements of β which we refer to as CBF-principle Hence sparse regression and CS methods canbe utilised for estimating the DoAs based on a singlesnapshot only We assume that the number of sources Kis known a priori

Besides SNR also the size p or spacing ∆θ of the gridgreatly affect the performance of CBF methods Thecross-correlation (coherence) between the true steeringvector and steering vectors on the grid depends on boththe grid spacing and obliqueness of the target DoAs wrtto the array Moreover the values of cross-correlationsin the gram matrix |XHX| also depend on the distancebetween array elements and configuration of the sen-sors array7 Let us construct a measure called max-imal basis coherence (MBC) defined as the maximumabsolute value of the cross-correlations among the truesteering vectors a(θj) and the basis a(ϑi) ϑi isin [ϑ]θjj isin 1 K

MBC = maxj

maxϑisin[ϑ]θj

∣a(θj)H a(ϑ)

∣ (15)

Note that steering vectors a(θ) θ isin [minusπ2 π2) are as-sumed to be normalized (a(θ)H a(θ) = 1) MBC valuemeasures the obliqueness of the incoming DoA to the ar-ray The higher the MBC value the more difficult it is

for any CBF method to distinguish the true DoA in thegrid Note that the value of MBC also depends on thegrid spacing ∆θ

Fig 3 shows the geometry of DoA estimation prob-lem where the target DoArsquos have varying basis coherenceand it increases with the level of obliqueness (inclination)on either side of the normal to the ULA axis We definea straight DoA when the angle of incidence of the targetDoAs is in the shaded sector in Fig 3 This the regionwhere the MBC has lower values In contrast an obliqueDoA is defined when the angle of incidence of the targetis oblique with respect to the array axis

θ

nn-12

s(t)

1d

Source

d sinθ

frac14

Straight DoAs

FIG 3 (Color online) The straight and oblique DoAs exhibit

different basis coherence

Consider the case of ULA with n = 40 elementsreceiving two sources at straight DoAs θ1 = minus6 andθ2 = 2 or at oblique DoAs θ1 = 44 and θ2 = 52The above scenarios correspond to set-up 2 and set-up 3of Section VII see also Table II Angular separation be-tween the DoArsquos is 8 in both the scenarios In Table I wecompute the correlation between the true steering vectorsand with a neighboring steering vector a(ϑ) in the grid

TABLE I Correlation between true steering vector at DoAs

θ1 and θ2 and a steering vector at angle ϑ on the grid in a two

source scenario set-ups with either straight or oblique DoAs

θ1 θ2 ϑ correlation

True straight DoAs minus6 2 0071

minus6 minus7 0814

2 3 0812

True oblique DoAs 44 52 0069

44 45 0901

52 53 0927

These values validate the fact highlighted in Fig 3Namely a target with an oblique DoA wrt the arrayhas a larger maximal correlation (coherence) with the ba-sis steering vectors This makes it difficult for the sparse

J Acoust Soc Am 22 May 2018 Sequential adaptive elastic net 7

recovery method to identify the true steering vector a(θi)from the spurious steering vector a(ϑ) that simply has avery large correlation with the true one Due to this mu-tual coherence it may happen that neither a(θi) or a(ϑ)

are assigned a non-zero coefficient value in β or perhapsjust one of them in random fashion

VII SIMULATION STUDIES

We consider seven simulation set-ups First five set-ups use grid spacing ∆θ = 1 (leading to p = 180 lookdirections in the grid [ϑ]) and the last two employ sparsergrid ∆θ = 2 (leading to p = 90 look directions in thegrid) The number of sensors n in the ULA are n = 40 forset-ups 1-5 and n = 30 for set-ups 6-7 Each set-up hasK isin 2 3 4 sources at different (straight or oblique)DoAs θ = (θ1 θK)⊤ and the source waveforms aregenerated as sk = |sk| middot eArg(sk) where source powers|sk| isin (0 1] are fixed for each set-up but the source phasesare randomly generated for each Monte-Carlo trial asArg(sk) sim Unif(0 2π) for k = 1 K Table II speci-fies the values of DoA-s and power of the sources used inthe set-ups Also the MBC values (15) are reported foreach case

TABLE II Details of all the set-ups tested in this paper First

five set-ups have grid spacing ∆θ = 1 and last two ∆θ = 2

Set-ups |si| θ [] MBC

1 [09 1 1] [minus5 3 6] 0814

2 [09 1] [minus6 2] 0814

3 [09 1] [44 52] 0927

4 [08 07 1] [43 44 52] 0927

5 [09 01 1 04] [minus87minus38minus35 97] 0990

6 [08 1 09 04] minus[485 464 315 22] 0991

7 [07 1 06 07] [6 8 14 18] 0643

The error terms εi are iid and generated fromCN (0 σ2) distribution where the noise variance σ2 de-pends on the signal-to-noise ratio (SNR) level in deci-bel (dB) given by SNR(dB) = 10 log10(σ

2sσ

2) whereσ2s = 1

K

|s1|2 + |s2|2 + middot middot middot + |sK |2

denotes the averagesource power An SNR level of 20 dB is used in thispaper unless its specified otherwise and the number ofMonte-Carlo trials is L = 1000

In each set-up we evaluate the performance of allmethods in recovering exactly the true support and thesource powers Due to high mutual coherence and due thelarge differences in source powers the DoA estimation isnow a challenging task A key performance measure isthe (empirical) probability of exact recovery (PER) of allK sources defined as

PER = aveI(Alowast = AK)

where Alowast denotes the index set of true source DoAs onthe grid and set AK the found support set where |Alowast| =|AK | = K and the average is over all Monte-Carlo tri-als Above I(middot) denotes the indicator function We alsocompute the average root mean squared error (RMSE)of the debiased estimate s = argminsisinCK y minus XAK

s2of the source vector as RMSE =

radic

avesminus s2

A Compared methods

This paper compares the SAEN approach to the ex-isting well-known greedy methods such as orthogonalmatching pursuit (OMP)28 and compressive samplingmatching pursuit (CoSaMP)29 Moreover we also drawcomparisons for two special cases of the c-PW-WEN al-gorithm to Lasso estimate that has K-non-zeros (ie

β(λK 1) computed by c-PW-WEN using w = 1 andα = 1) and EN estimator when cherry-picking the bestα in the grid [α] = αi isin [1 0) α1 = 1 lt middot middot middot lt αm lt

001 (ie β(λK αbst))It is instructive to compare the SAEN to simpler

adaptive EN (AEN) approach that simply uses adaptiveweights to weight different coefficients differently in thespirit of adaptive Lasso20 This helps in understandingthe effectiveness of the cleverly chosen weights and theusefulness of the three-step procedure used by the SAENalgorithm Recall that the first step in AEN approach iscompute the weights using some initial solution Afterobtaining the (adaptive) weights the final K-sparse AENsolution is computed We devise three AEN approacheseach one using a different initial solution to compute theweights

1 AEN(LSE) uses the weights found as

w(LSE) = 1⊘ |X+y|

where X+ is Moore-Penrose pseudo inverse of X

2 AEN(n) employs weights from an initial n-sparse

EN solution β(λn α) at nth knot which is found by

c-PW-WEN algorithm with w = 1

3 AEN(3K) instead uses weights calculated from an

initial EN solution β(λ3K α) having 3K nonzerosas in step 1 of SAEN algorithm but the remainingtwo steps of SAEN algorithm are omitted

The upper bound for PER rate for SAEN is the (em-

pirical) probability that the initial solution βinit com-puted in step 1 of algorithm 3 contains the true supportAlowast ie the value

UB = ave

I(

Alowast sub supp(βinit)

(16)

where the average is over all Monte-Carlo trials Wealso compute this upper bound to illustrate the abilityof SAEN to pick the true K-sparse support from theoriginal 3K-sparse initial value For set-up 1 (cf Ta-ble II) the average recovery results for all of the abovementioned methods are provided in Table III

8 J Acoust Soc Am 22 May 2018 Sequential adaptive elastic net

TABLE III The recovery results for set-up 1 Results illus-

trate the effectiveness of three step SAEN approach compared

to its competitors The SNR level was 20 dB and the upper

bound (16) for the PER rate of SAEN is given in parentheses

SAEN AEN(3K) AEN(n)

PER (0957) 0864 0689 0616

RMSE - 0449 0694 0889

AEN(LSE) EN Lasso OMP CoSaMP

PER 0 0332 0332 0477 0140

RMSE 1870 1163 1163 1060 3558

It can be noted that the proposed SAEN outperformsall other methods and weighting schemes and recovers hetrue support and powers of the sources effectively Notethat the SAENrsquos upper bound for PER rate was 957and SAEN reached the PER rate 864 The results ofthe AEN approaches validate the need for accurate initialestimate to construct the adaptive weights For exampleAEN(3K) performs better than AEN(n) but much worsethan the SAEN method

B Straight and oblique DoAs

Set-up 2 and set-up 3 correspond to the case wherethe targets are at straight and oblique DoAs respectivelyPerformance results of the sparse recovery algorithms aretabulated in Table IV As can be seen the upper boundfor PER rate of SAEN is full 100 percentage whichmeans that the true support is correctly included in 3K-sparse solution computed at Step 1 of the algorithmFor set-up 2 (straight DoAs) all methods have almostfull PER rates except CoSaMP with 678 rate Per-formance of other estimators expect of SAEN changesdrastically in set-up 3 (oblique DoAs) Here SAEN isachieving nearly perfect (sim98) PER rate which meansreduction of 2 compared to set-up 2 Other methodsperform poorly For example PER rate of Lasso dropsfrom near 98 to 40 Similar behavior is observed forEN OMP and CoSaMP

Next we discuss the results for set-up 4 which is sim-ilar to set-up 3 except that we have introduced a thirdsource that also arrives from an oblique DoA θ = 43 andthe variation of the source powers is slightly larger Ascan be noted from Table IV the PER rates of greedy al-gorithms OMP and CoSaMP have declined to outstand-ingly low 0 This is very different with the PER ratesthey had in set-up 3 which contained only two sourcesIndeed inclusion of the 3rd source from an DoA similarwith the other two sources completely ruined their accu-racy This is in deep contrast with the SAEN methodthat still achieves PER rate of 75 which is more thantwice the PER rate achieved by Lasso SAEN is againhaving the lowest RMSE values

TABLE IV Recovery results for set-ups 2 - 4 Note that for

oblique DoArsquos (set-ups 3 and 4) the SAEN method outper-

form the other methods and has a perfect recovery results

for set-up 2 (straight DoAs) SNR level is 20 dB The upper

bound (16) for the PER rate of SAEN is given in parentheses

Set-up 2 with two straight DoAs

SAEN EN Lasso OMP CoSaMP

PER (1000) 1000 0981 0981 0998 0678

RMSE 0126 0145 0145 0128 1436

Set-up 3 with two oblique DoAs

SAEN EN Lasso OMP CoSaMP

PER (1000) 0978 0399 0399 0613 0113

RMSE 0154 0916 0916 0624 2296

Set-up 4 with three oblique DoAs

SAEN EN Lasso OMP CoSaMP

PER (0776) 0749 0392 0378 0 0

RMSE 0505 0838 0827 1087 5290

In summary the recovery results for set-ups 1-4(which express different degrees of basis coherence prox-imity of target DoArsquos as well as variation of source pow-ers) clearly illustrate that the proposed SAEN performsvery well in identifying the true support and the power ofthe sources and is always outperforming the commonlyused benchmarks sparse recovery methods namely theLasso EN OMP or CoSaMP with a significant marginIt is also noteworthy that EN often achieved better PERrates than Lasso which is mainly due to its group se-lection ability As a specific example of this particularfeature Figure 4 shows the solution paths for Lasso andEN for one particular Monte-Carlo trial where EN cor-rectly chooses the true DoAs but Lasso fails to select allcorrect DoAs In this particular instance the EN tun-ing parameter was α = 09 This is reason behind thesuccess of our c-PW-WEN algorithm which is the corecomputational engine of the SAEN

C Off-grid sources

Set-ups 5 and 6 explore the case when the targetDoAs are off the grid Also note that set-up 5 uses finergrid spacing ∆θ = 1 compared to set-up 6 with ∆θ = 2Both set-ups contain four target sources that have largelyvarying source powers In the off-grid case one would likethe CBF method to localize the targets to the nearestDoA in the angular grid [ϑ] that is used to construct thearray steering matrix Therefore in the off the grid case

J Acoust Soc Am 22 May 2018 Sequential adaptive elastic net 9

0 05 1 15

-log( )

0

1P

ow

er (43)deg

(52)deg

3

(a)

0 05 1 15

-log( )

(43)deg

(44)deg

(52)deg

3

(b)

0

1

Po

wer

40 50 60 70

DoA [degrees]

Sources

Lasso

EN

(c)

FIG 4 (Color online) The Lasso and EN solution paths

(upper panel) and respective DoA solutions at the knot λ3

Observe that Lasso fails to recover the true support but EN

successfully picks the true DoAs The EN tuning parameter

was α = 09 in this example

the PER rate refers to the case that CBF method selectsthe K-sparse support that corresponds to DoArsquos on thegrid that are closest in distance to the true DoAs TableV provides the recovery results As can be seen again theSAEN is performing very well outperforming the Lassoand EN Note that OMP and CoSaMP completely fail inselecting the nearest grid-points

TABLE V Performance results of CBF methods for set-ups

5 and 6 where target DoA-s are off the grid Here PER

rate refers to the case that CBF method selects the K-sparse

support that corresponds to DoArsquos on the grid that are closest

to the true DoAs The upper bound (16) for the PER rate of

SAEN is given in parentheses SNR level is 20 dB

Set-up 5 with four off-grid straight DoAs

SAEN EN Lasso OMP CoSaMP

PER (0999) 0649 0349 0328 0 0

RMSE 0899 0947 0943 1137 8909

Set-up 6 with four off-grid oblique DoAs

SAEN EN Lasso OMP CoSaMP

PER (0794) 0683 0336 0336 0 0005

RMSE 0811 0913 0911 1360 28919

0 6 12 18 24 30

SNR [dB]

0

02

04

06

08

1

PE

R

FIG 5 (Color online) PER rates of CBF methods at different

SNR levels for set-up 7

D More targets and varying SNR levels

Next we consider the set-up 7 (cf Table II) whichcontains K = 4 sources The first three of the sourcesare at straight DoAs and the fourth one at a DoA withmodest obliqueness (θ4 = 18o) We now compute thePER rates of the methods as a function of SNR From thePER rates shown in Fig 5 we again notice that SAENclearly outperforms all of the other methods Note thatthe upper bound (16) of the PER rate of the SAEN is alsoplotted Both greedy algorithms OMP and CoSaMP areperforming very poorly even at high SNR levels Lassoand EN are attaining better recovery results than thegreedy algorithms Again EN is performing better thanLasso due to additional flexibility offered by EN tuningparameter and its ability to cope with correlated steering(basis) vectors SAEN recovers the exact true support inmost of the cases due to its step-wise adaptation usingcleverly chosen weights Furthermore the improvementin PER rates offered by SAEN becomes larger as the SNRlevel increases One can also notice that SAEN is close tothe theoretical upper bound of PER rate at higher SNRregime

VIII CONCLUSIONS

We developed c-PW-WEN algorithm that computesweighted elastic net solutions at the knots of penalty pa-rameter over a grid of EN tuning parameter values c-PW-WEN also computes weighted Lasso as a special case(ie solution at α = 1) and adaptive EN (AEN) is ob-tained when adaptive (data dependent) weights are usedWe then proposed a novel SAEN approach that uses c-PW-WEN method as its core computational engine anduses three-step adaptive weighting scheme where sparsityis decreased from 3K to K in three steps Simulationsillustrated that SAEN performs better than the adap-tive EN approaches Furthermore we illustrated that the3K-sparse initial solution computed at step 1 of SAENprovide smart weights for further steps and includes thetrue K-sparse support with high accuracy The proposedSAEN algorithm is then accurately including the truesupport at each step

10 J Acoust Soc Am 22 May 2018 Sequential adaptive elastic net

Using the K-sparse Lasso solution computed directlyfrom Lasso path at the kth knot fails to provide exactsupport recovery in many cases especially when we havehigh basis coherence and lower SNR Greedy algorithmsoften fail in the face of high mutual coherence (due todense grid spacing or oblique target DoArsquos) or low SNRThis is mainly due the fact that their performance heav-ily depends on their ability to accurately detecting max-imal correlation between the measurement vector y andthe basis vectors (column vectors of X) Our simulationstudy also showed that their performance (in terms ofPER rate) deteriorates when the number of targets in-creases In the off-grid case the greedy algorithms alsofailed to find the nearby grid-points

Finally the SAEN algorithm performed better thanall other methods in each set-up and the improvementwas more pronounced in the presence of high mutualcoherence This is due to ability of SAEN to in-clude the true support correctly at all three steps ofthe algorithm Our MATLAB Rcopy package that imple-ments the proposed algorithms is freely available atwwwgithubcommntabassmSAEN-LARS The packagealso contains a MATLAB Rcopy live script demo on how touse the method in CBF problem along with an examplefrom simulation set-up 4 presented in the paper

ACKNOWLEDGMENTS

The research was partially supported by theAcademy of Finland grant no 298118 which is grate-fully acknowledged

1R Tibshirani ldquoRegression shrinkage and selection via the lassordquoJournal of the Royal Statistical Society Series B (Methodologi-cal) 267ndash288 (1996)

2H Zou and T Hastie ldquoRegularization and variable selection viathe elastic netrdquo Journal of the Royal Statistical Society SeriesB (Statistical Methodology) 67(2) 301ndash320 (2005)

3T Yardibi J Li P Stoica and L N C III ldquoSparsity con-strained deconvolution approaches for acoustic source mappingrdquoThe Journal of the Acoustical Society of America 123(5) 2631ndash2642 (2008) doi 10112112896754

4A Xenaki E Fernandez-Grande and P Gerstoft ldquoBlock-sparsebeamforming for spatially extended sources in a bayesian formu-lationrdquo The Journal of the Acoustical Society of America 140(3)1828ndash1838 (2016)

5D Malioutov M Cetin and A S Willsky ldquoA sparse signalreconstruction perspective for source localization with sensor ar-raysrdquo IEEE Transactions on Signal Processing 53(8) 3010ndash3022(2005) doi 101109TSP2005850882

6G F Edelmann and C F Gaumond ldquoBeamforming using com-pressive sensingrdquo The Journal of the Acoustical Society of Amer-ica 130(4) EL232ndashEL237 (2011) doi 10112113632046

7A Xenaki P Gerstoft and K Mosegaard ldquoCompressive beam-formingrdquo The Journal of the Acoustical Society of America136(1) 260ndash271 (2014) doi 10112114883360

8P Gerstoft A Xenaki and C F Mecklenbruker ldquoMultiple andsingle snapshot compressive beamformingrdquo The Journal of theAcoustical Society of America 138(4) 2003ndash2014 (2015) doi10112114929941

9Y Choo and W Seong ldquoCompressive spherical beamformingfor localization of incipient tip vortex cavitationrdquo The Journal

of the Acoustical Society of America 140(6) 4085ndash4090 (2016)doi 10112114968576

10A Das W S Hodgkiss and P Gerstoft ldquoCoherent multi-path direction-of-arrival resolution using compressed sensingrdquoIEEE Journal of Oceanic Engineering 42(2) 494ndash505 (2017) doi101109JOE20162576198

11S Fortunati R Grasso F Gini M S Greco and K LePageldquoSingle-snapshot doa estimation by using compressed sensingrdquoEURASIP Journal on Advances in Signal Processing 2014(1)120 (2014) doi 1011861687-6180-2014-120

12E Ollila ldquoNonparametric simultaneous sparse recovery An ap-plication to source localizationrdquo in Proc European Signal Pro-cessing Conference (EUSIPCOrsquo15) Nice France (2015) pp509ndash513

13E Ollila ldquoMultichannel sparse recovery of complex-valued sig-nals using Huberrsquos criterionrdquo in Proc Compressed Sensing The-ory and its Applications to Radar Sonar and Remote Sensing(CoSeRarsquo15) Pisa Italy (2015) pp 32ndash36

14M N Tabassum and E Ollila ldquoSingle-snapshot doa estimationusing adaptive elastic net in the complex domainrdquo in 4th In-ternational Workshop on Compressed Sensing Theory and itsApplications to Radar Sonar and Remote Sensing (CoSeRa)(2016) pp 197ndash201 doi 101109CoSeRa20167745728

15T Hastie R Tibshirani and M Wainwright Statistical learningwith sparsity the lasso and generalizations (CRC Press 2015)

16M Osborne B Presnell and B Turlach ldquoA new ap-proach to variable selection in least squares problemsrdquo IMAJournal of Numerical Analysis 20(3) 389ndash403 (2000) doi101093imanum203389

17B Efron T Hastie I Johnstone and R Tibshirani ldquoLeast an-gle regression (with discussion)rdquo The Annals of statistics 32(2)407ndash499 (2004) doi 101214009053604000000067

18M N Tabassum and E Ollila ldquoPathwise least angle regressionand a significance test for the elastic netrdquo in 25th European Sig-nal Processing Conference (EUSIPCO) (2017) pp 1309ndash1313doi 1023919EUSIPCO20178081420

19S Foucart and H Rauhut A mathematical introduction to com-pressive sensing Vol 1 (Springer 2013)

20H Zou ldquoThe adaptive lasso and its oracle propertiesrdquo Journal ofthe American Statistical Association 101(476) 1418ndash1429 (2006)doi 101198016214506000000735

21A E Hoerl and R W Kennard ldquoRidge regression biased esti-mation for nonorthogonal problemsrdquo Technometrics 12(1) 55ndash67 (1970)

22S Rosset and J Zhu ldquoPiecewise linear regularized so-lution pathsrdquo Ann Statist 35(3) 1012ndash1030 (2007) doi101214009053606000001370

23R J Tibshirani and J Taylor ldquoThe solution path of thegeneralized lassordquo Ann Statist 39(3) 1335ndash1371 (2011) doi10121411-AOS878

24R J Tibshirani ldquoThe lasso problem and uniquenessrdquo ElectronJ Statist 7 1456ndash1490 (2013) doi 10121413-EJS815

25A Panahi and M Viberg ldquoFast candidate points selection in thelasso pathrdquo IEEE Signal Processing Letters 19(2) 79ndash82 (2012)

26K L Gemba W S Hodgkiss and P Gerstoft ldquoAdaptiveand compressive matched field processingrdquo The Journal ofthe Acoustical Society of America 141(1) 92ndash103 (2017) doi10112114973528

27A B Gershman C F Mecklenbrauker and J F Bohme ldquoMa-trix fitting approach to direction of arrival estimation with imper-fect spatial coherence of wavefrontsrdquo IEEE Transactions on Sig-nal Processing 45(7) 1894ndash1899 (1997) doi 10110978599968

28J A Tropp and A C Gilbert ldquoSignal recovery from randommeasurements via orthogonal matching pursuitrdquo IEEE Trans-actions on Information Theory 53(12) 4655ndash4666 (2007) doi101109TIT2007909108

29D Needell and J Tropp ldquoCosamp Iterative signal recovery fromincomplete and inaccurate samplesrdquo Applied and Computational

J Acoust Soc Am 22 May 2018 Sequential adaptive elastic net 11

Harmonic Analysis 26(3) 301 ndash 321 (2009)

12 J Acoust Soc Am 22 May 2018 Sequential adaptive elastic net

Page 6: Sequential adaptive elastic net approach for single …Sequential adaptive elastic net approach for single-snapshot source localization MuhammadNaveedTabassum 1andEsaOllila Aalto University,

Algorithm 2 c-PW-WEN algorithm

input y isin Cn X isin C

ntimesp w isin Rp [α] isin R

m

(recall α1 = 1) K and debias

output βK isin Cp and AK isin R

K

1 λk(α1) β(λk α1)Kk=0 =

c-LARS-WLasso(

yXw K)

2 for i = 2 to m do

3 for k = 1 to K do

4 ηk = λk(αiminus1) middot (1minus αi)

5

γk β(λk αi)

=

c-LARS-WLasso(

yaXa(ηk)w)∣

k

6 λk(αi) = γkαi

7 AK = suppβ(λK αi)

8 βLS(λK αi) = X+AK

y

9 RSS(αi) = y minusXAKβLS(λK αi)

22

10 ı = argmini RSS(αi)

11 βK = β(λK αı) and AK = suppβK

12 if debias then βAK= X+

AKy

V SEQUENTIALLY ADAPTIVE ELASTIC NET

Next we turn our attention on how to choose theadaptive (ie data dependent) weights in c-PW-WENIn adaptive Lasso20 one ideally uses the LSE or if p gt n

the Lasso as an initial estimator βinit to construct theweights given in (4) The problem is that both the LSEand the Lasso estimator have very poor accuracy (highvariance) when there exists high correlations betweenpredictors which is the condition we are concerned inthis paper This lowers the probability of exact recoveryof the adaptive Lasso significantly

To overcome the problem above we devise a sequen-tial adaptive elastic net (SAEN) algorithm that obtainsthe K-sparse solution in a sequential manner decreasingthe sparsity level of the solution at each iteration and us-ing the previous solution as adaptive weights for c-PW-WEN The SAEN is described in algorithm 3 SAENruns the c-PW-WEN algorithm three times At first stepit finds a standard (unit weight) c-PW-WEN solution for3K nonzero (active) coefficients which we refer to as ini-

tial EN solution βinit The obtained solution determinesthe adaptive weights via (4) (and hence the active setof 3K nonzero coefficients) which is used in the secondstep to computes the c-PW-WEN solution that has 2Knonzero coefficients This again determines the adaptiveweights via (4) (and hence the active set of 2K nonzerocoefficients) which is used in the second step to computethe c-PW-WEN solution that has the desired K nonzerocoefficients It is important to notice that since we startfrom a solution with 3K nonzeros it is quite likely thatthe true K non-zero coefficients will be included in theactive set of βinit which is computed in the first step ofthe SAEN algorithm Note that the choice of 3K is sim-

ilar to CoSaMP algorithm29 which also uses 3K as aninitial support size Using 3K also usually guaranteesthat XA3K

is well conditioned which may not be the caseif larger value than 3K is chosen

Algorithm 3 SAEN algorithm

input y isin Cn X isin C

ntimesp [α] isin Rm and K

output βK isin Cp

1

βinitA3K

= c-PW-WEN(

yX1p [α] 3K 0)

2

βA2K

=

c-PW-WEN(

yXA3K 13K ⊘

∣βinitA3K

∣ [α] 2K 0)

3

βK AK

=

c-PW-WEN(

yXA2K 12K ⊘

∣βA2K

∣ [α] K 1)

VI SINGLE-SNAPSHOT COMPRESSIVE BEAMFORM-

ING

Estimating the source location in terms of its DoAplays an important role in many applications In5 it wasobserved that CS algorithms can be applied for DoA esti-mation (eg of sound sources) using sensor arrays whenthe array output y is be expressed as sparse (underdeter-mined) linear model by discretizing the DoA parameterspace This approach is referred to as compressive beam-forming (CBF) and it has been subsequently used in aseries of papers (eg7ndash1326)

In CBF after finding the sparse regression estimatorits support can be mapped to the DoA estimates on thegrid Thus the DoA estimates in CBF are always selectedfrom the resulting finite set of discretized DoA parame-ters Hence the resolution of CBF is dependent on thedensity of the grid (spacing ∆θ or grid size p) Densergrid implies large mutual coherence of the basis vectorsxj (here equal to the steering vectors for the DoAs onthe grid) and thus a poor recovery region for most sparseregression techniques

The proposed SAEN estimator can effectively miti-gate the effect of high mutual coherence caused by dis-cretization of the DoA space with significantly better per-formance than state-of-the-art compressed sensing algo-rithms This is illustrated in Section VII via extensivesimulation studies using challenging multi-source set-upsof closely-spaced sources and large variation of sourcepowers

We assume narrowband processing and a far-fieldsource wave impinging on an array of sensors with knownconfiguration The sources are assumed to be located inthe far-field of the sensor array (ie propagation radius≫ array size) A uniform linear array (ULA) of n sensors(eg hydrophones or microphones) is used for estimat-ing the DoA θ isin [minus90 90) of the source with respectto the array axis The array response (steering or wave-front vector) of ULA for a source from DoA (in radians)

6 J Acoust Soc Am 22 May 2018 Sequential adaptive elastic net

θ isin [minusπ2 π2) is given by

a(

θ)

1radicn

[

1 eπ sin θ eπ(nminus1) sin θ]⊤

where we assume half a wavelength inter-element spac-ing between sensors We consider the case that K lt nsources from distinct DoAs θ = (θ1 θK) arrive at asensor at some time instant t A single-snapshot obtainedby ULA can then be modeled as27

y(t) = A(θ)s(t) + ε(t) (14)

where θ = (θ1 θK)⊤ collects the DoAs the matrix

A(θ) = [a(θ1) middot middot middot a(θK)] A isin CntimesK is the dictionaryof replicas also known as the array steering matrix s(t) isinCK contains the source waveforms and ε(t) is complexnoise at time instant t

Consider an angular grid of size p (commonly p ≫ n)of look directions of interest

[ϑ] = ϑi isin [minusπ2 π2) ϑ1 lt middot middot middot lt ϑp

Let the ith column of the measurement matrix X in themodel (1) be the array response for look direction ϑi soxi = a(ϑi) Then if the true source DoAs are containedin the angular grid ie θi isin [ϑ] for i = 1 K thenthe snapshot y in (14) (where we drop the time index t)can be equivalently modeled by (1) as

y = Xβ + ε

where β is exactly K-sparse (β0 = K) and nonzeroselements of β maintain the source waveforms s Thusidentifying the true DoAs is equivalent to identifyingthe nonzero elements of β which we refer to as CBF-principle Hence sparse regression and CS methods canbe utilised for estimating the DoAs based on a singlesnapshot only We assume that the number of sources Kis known a priori

Besides SNR also the size p or spacing ∆θ of the gridgreatly affect the performance of CBF methods Thecross-correlation (coherence) between the true steeringvector and steering vectors on the grid depends on boththe grid spacing and obliqueness of the target DoAs wrtto the array Moreover the values of cross-correlationsin the gram matrix |XHX| also depend on the distancebetween array elements and configuration of the sen-sors array7 Let us construct a measure called max-imal basis coherence (MBC) defined as the maximumabsolute value of the cross-correlations among the truesteering vectors a(θj) and the basis a(ϑi) ϑi isin [ϑ]θjj isin 1 K

MBC = maxj

maxϑisin[ϑ]θj

∣a(θj)H a(ϑ)

∣ (15)

Note that steering vectors a(θ) θ isin [minusπ2 π2) are as-sumed to be normalized (a(θ)H a(θ) = 1) MBC valuemeasures the obliqueness of the incoming DoA to the ar-ray The higher the MBC value the more difficult it is

for any CBF method to distinguish the true DoA in thegrid Note that the value of MBC also depends on thegrid spacing ∆θ

Fig 3 shows the geometry of DoA estimation prob-lem where the target DoArsquos have varying basis coherenceand it increases with the level of obliqueness (inclination)on either side of the normal to the ULA axis We definea straight DoA when the angle of incidence of the targetDoAs is in the shaded sector in Fig 3 This the regionwhere the MBC has lower values In contrast an obliqueDoA is defined when the angle of incidence of the targetis oblique with respect to the array axis

FIG. 3. (Color online) The straight and oblique DoAs exhibit different basis coherence. [Figure: ULA geometry with sensors 1, 2, …, n−1, n at element spacing d, a source s(t) at DoA θ with path difference d sin θ, and the shaded sector of straight DoAs.]

Consider the case of a ULA with n = 40 elements receiving two sources either at straight DoAs θ₁ = −6° and θ₂ = 2° or at oblique DoAs θ₁ = 44° and θ₂ = 52°. These scenarios correspond to set-up 2 and set-up 3 of Section VII; see also Table II. The angular separation between the DoAs is 8° in both scenarios. In Table I we compute the correlation between the true steering vectors, as well as with a neighboring steering vector a(ϑ) in the grid.

TABLE I. Correlation between the true steering vectors at DoAs θ₁ and θ₂ and a steering vector at angle ϑ on the grid, in two-source scenario set-ups with either straight or oblique DoAs.

                        θ₁      θ₂      ϑ       correlation
True straight DoAs      −6°     2°              0.071
                        −6°             −7°     0.814
                                2°      3°      0.812
True oblique DoAs       44°     52°             0.069
                        44°             45°     0.901
                                52°     53°     0.927

These values validate the fact highlighted in Fig. 3: a target with an oblique DoA w.r.t. the array has a larger maximal correlation (coherence) with the basis steering vectors. This makes it difficult for a sparse recovery method to distinguish the true steering vector a(θᵢ) from a spurious steering vector a(ϑ) that simply has a very large correlation with the true one. Due to this mutual coherence, it may happen that neither a(θᵢ) nor a(ϑ) is assigned a non-zero coefficient value in β̂, or perhaps just one of them, in random fashion.

VII. SIMULATION STUDIES

We consider seven simulation set-ups. The first five set-ups use grid spacing ∆θ = 1° (leading to p = 180 look directions in the grid [ϑ]) and the last two employ a sparser grid with ∆θ = 2° (leading to p = 90 look directions). The number of sensors in the ULA is n = 40 for set-ups 1–5 and n = 30 for set-ups 6–7. Each set-up has K ∈ {2, 3, 4} sources at different (straight or oblique) DoAs θ = (θ₁, …, θ_K)ᵀ, and the source waveforms are generated as s_k = |s_k| · e^{j Arg(s_k)}, where the source moduli |s_k| ∈ (0, 1] are fixed for each set-up but the source phases are randomly generated for each Monte Carlo trial as Arg(s_k) ∼ Unif(0, 2π) for k = 1, …, K. Table II specifies the DoAs and powers of the sources used in the set-ups; the MBC values (15) are also reported for each case.

TABLE II. Details of all the set-ups tested in this paper. The first five set-ups have grid spacing ∆θ = 1° and the last two ∆θ = 2°.

Set-up   |sᵢ|                   θ [°]                       MBC
1        [0.9, 1, 1]            [−5, 3, 6]                  0.814
2        [0.9, 1]               [−6, 2]                     0.814
3        [0.9, 1]               [44, 52]                    0.927
4        [0.8, 0.7, 1]          [43, 44, 52]                0.927
5        [0.9, 0.1, 1, 0.4]     [−8.7, −3.8, −3.5, 9.7]     0.990
6        [0.8, 1, 0.9, 0.4]     −[48.5, 46.4, 31.5, 2.2]    0.991
7        [0.7, 1, 0.6, 0.7]     [6, 8, 14, 18]              0.643

The error terms εᵢ are i.i.d. and generated from a CN(0, σ²) distribution, where the noise variance σ² depends on the signal-to-noise ratio (SNR) level in decibels (dB), given by SNR(dB) = 10 log₁₀(σ²_s/σ²), where σ²_s = (1/K)(|s₁|² + |s₂|² + ⋯ + |s_K|²) denotes the average source power. An SNR level of 20 dB is used in this paper unless specified otherwise, and the number of Monte Carlo trials is L = 1000.
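For concreteness, one Monte Carlo trial can be generated along the following lines (a sketch under our own assumptions, using the set-up 2 values; the circular complex Gaussian noise splits its variance equally over the real and imaginary parts):

# Sketch (ours) of the data generation for one Monte-Carlo trial.
import numpy as np

rng = np.random.default_rng(0)
n, snr_db = 40, 20.0
s_abs = np.array([0.9, 1.0])                        # fixed source moduli |s_k|
phases = rng.uniform(0.0, 2.0 * np.pi, len(s_abs))  # Arg(s_k) ~ Unif(0, 2pi)
s = s_abs * np.exp(1j * phases)                     # source waveforms

sigma2_s = np.mean(s_abs ** 2)                      # average source power
sigma2 = sigma2_s / 10.0 ** (snr_db / 10.0)         # noise variance from SNR(dB)
eps = np.sqrt(sigma2 / 2.0) * (rng.standard_normal(n)
                               + 1j * rng.standard_normal(n))  # CN(0, sigma^2)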

In each set-up we evaluate the performance of all methods in recovering exactly the true support and the source powers. Due to the high mutual coherence and the large differences in source powers, DoA estimation is now a challenging task. A key performance measure is the (empirical) probability of exact recovery (PER) of all K sources, defined as

PER = ave{ I(A* = A_K) },

where A* denotes the index set of the true source DoAs on the grid and A_K the found support set, with |A*| = |A_K| = K, and the average is over all Monte Carlo trials. Above, I(·) denotes the indicator function. We also compute the average root mean squared error (RMSE) of the debiased estimate ŝ = arg min_{s ∈ ℂᴷ} ‖y − X_{A_K} s‖₂ of the source vector as RMSE = √( ave ‖ŝ − s‖² ).
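The two performance measures can be computed along the following lines (a sketch with hypothetical helper names, ours rather than the paper's code):

# Sketch (ours) of the performance metrics for one trial.
import numpy as np

def exact_recovery(found, true):
    # I(A* = A_K): both arguments are index sets of size K on the grid
    return float(set(found) == set(true))

def debiased_source_estimate(y, X, found):
    # s_hat = argmin_s || y - X_{A_K} s ||_2: least squares on the found support
    XA = X[:, sorted(found)]
    s_hat, *_ = np.linalg.lstsq(XA, y, rcond=None)
    return s_hat

# PER is the average of exact_recovery over the L trials, and
# RMSE = sqrt( ave || s_hat - s ||^2 ) with s the true source vector.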

A. Compared methods

This paper compares the SAEN approach to the existing well-known greedy methods, orthogonal matching pursuit (OMP)²⁸ and compressive sampling matching pursuit (CoSaMP)²⁹. Moreover, we also draw comparisons to two special cases of the c-PW-WEN algorithm: the Lasso estimate that has K non-zeros (i.e., β̂(λ_K, 1) computed by c-PW-WEN using w = 1 and α = 1), and the EN estimator obtained when cherry-picking the best α in the grid [α] = {αᵢ : 1 = α₁ > ⋯ > α_m > 0.01} (i.e., β̂(λ_K, α_best)).

It is instructive to compare the SAEN to a simpler adaptive EN (AEN) approach that simply uses adaptive weights to weight different coefficients differently, in the spirit of the adaptive Lasso²⁰. This helps in understanding the effectiveness of the cleverly chosen weights and the usefulness of the three-step procedure used by the SAEN algorithm. Recall that the first step in the AEN approach is to compute the weights using some initial solution; after obtaining the (adaptive) weights, the final K-sparse AEN solution is computed. We devise three AEN approaches, each using a different initial solution to compute the weights (a code sketch of the first weighting scheme follows the list):

1. AEN(LSE) uses the weights

   w(LSE) = 1 ⊘ |X⁺y|,

   where X⁺ is the Moore–Penrose pseudo-inverse of X.

2. AEN(n) employs weights from an initial n-sparse EN solution β̂(λ_n, α) at the nth knot, which is found by the c-PW-WEN algorithm with w = 1.

3. AEN(3K) instead uses weights calculated from an initial EN solution β̂(λ_{3K}, α) having 3K non-zeros, as in step 1 of the SAEN algorithm, but the remaining two steps of SAEN are omitted.
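Of the three schemes, only AEN(LSE) is independent of the c-PW-WEN machinery, so it can be sketched directly (our own illustration; the element-wise division and the small floor guarding against division by zero are our assumptions):

# Sketch (ours) of the AEN(LSE) weights w = 1 ./ |X^+ y|.
import numpy as np

def aen_lse_weights(X, y, floor=1e-12):
    beta0 = np.linalg.pinv(X) @ y                  # minimum-norm LSE, X^+ y
    return 1.0 / np.maximum(np.abs(beta0), floor)  # large |beta0_i| -> small weight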

The upper bound for the PER rate of SAEN is the (empirical) probability that the initial solution β̂_init computed in step 1 of Algorithm 3 contains the true support A*, i.e., the value

UB = ave{ I( A* ⊂ supp(β̂_init) ) },    (16)

where the average is over all Monte Carlo trials. We compute this upper bound to illustrate the ability of SAEN to pick the true K-sparse support from the original 3K-sparse initial value. For set-up 1 (cf. Table II), the average recovery results for all of the above-mentioned methods are provided in Table III.
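Given the supports recorded over the Monte Carlo trials, the bound (16) is simply the fraction of trials in which the 3K-sparse initial support covers A*, e.g. (a sketch, ours):

# Sketch (ours) of the upper bound (16) on the PER rate of SAEN.
import numpy as np

def per_upper_bound(initial_supports, true_support):
    # initial_supports: one index set of size 3K per Monte-Carlo trial
    hits = [set(true_support) <= set(sup) for sup in initial_supports]
    return float(np.mean(hits))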


TABLE III. Recovery results for set-up 1, illustrating the effectiveness of the three-step SAEN approach compared to its competitors. The SNR level was 20 dB, and the upper bound (16) for the PER rate of SAEN is given in parentheses.

         SAEN              AEN(3K)    AEN(n)
PER      (0.957) 0.864     0.689      0.616
RMSE     0.449             0.694      0.889

         AEN(LSE)   EN       Lasso    OMP      CoSaMP
PER      0          0.332    0.332    0.477    0.140
RMSE     1.870      1.163    1.163    1.060    3.558

It can be noted that the proposed SAEN outperforms all other methods and weighting schemes, and it recovers the true support and powers of the sources effectively. Note that SAEN's upper bound for the PER rate was 95.7%, and SAEN reached a PER rate of 86.4%. The results of the AEN approaches validate the need for an accurate initial estimate when constructing the adaptive weights: for example, AEN(3K) performs better than AEN(n), but much worse than the SAEN method.

B. Straight and oblique DoAs

Set-up 2 and set-up 3 correspond to the cases where the targets are at straight and oblique DoAs, respectively. Performance results of the sparse recovery algorithms are tabulated in Table IV. As can be seen, the upper bound for the PER rate of SAEN is a full 100%, which means that the true support is correctly included in the 3K-sparse solution computed at step 1 of the algorithm. For set-up 2 (straight DoAs), all methods have almost full PER rates, except CoSaMP with a 67.8% rate. The performance of all estimators except SAEN changes drastically in set-up 3 (oblique DoAs). Here SAEN achieves a nearly perfect (∼98%) PER rate, a reduction of only 2% compared to set-up 2, while the other methods perform poorly: for example, the PER rate of Lasso drops from nearly 98% to 40%. Similar behavior is observed for EN, OMP and CoSaMP.

Next we discuss the results for set-up 4, which is similar to set-up 3 except that we have introduced a third source that also arrives from an oblique DoA, θ = 43°, and the variation of the source powers is slightly larger. As can be noted from Table IV, the PER rates of the greedy algorithms OMP and CoSaMP have collapsed to a strikingly low 0%. This is very different from the PER rates they had in set-up 3, which contained only two sources: the inclusion of a third source from a DoA similar to the other two completely ruined their accuracy. This is in stark contrast with the SAEN method, which still achieves a PER rate of 75%, more than twice the PER rate achieved by Lasso. SAEN again has the lowest RMSE values.

TABLE IV. Recovery results for set-ups 2–4. Note that for oblique DoAs (set-ups 3 and 4) the SAEN method outperforms the other methods, and it has perfect recovery results for set-up 2 (straight DoAs). The SNR level is 20 dB. The upper bound (16) for the PER rate of SAEN is given in parentheses.

Set-up 2 with two straight DoAs
        SAEN              EN       Lasso    OMP      CoSaMP
PER     (1.000) 1.000     0.981    0.981    0.998    0.678
RMSE    0.126             0.145    0.145    0.128    1.436

Set-up 3 with two oblique DoAs
        SAEN              EN       Lasso    OMP      CoSaMP
PER     (1.000) 0.978     0.399    0.399    0.613    0.113
RMSE    0.154             0.916    0.916    0.624    2.296

Set-up 4 with three oblique DoAs
        SAEN              EN       Lasso    OMP      CoSaMP
PER     (0.776) 0.749     0.392    0.378    0        0
RMSE    0.505             0.838    0.827    1.087    5.290

In summary, the recovery results for set-ups 1–4 (which exhibit different degrees of basis coherence, proximity of target DoAs, and variation of source powers) clearly illustrate that the proposed SAEN performs very well in identifying the true support and the powers of the sources, always outperforming the commonly used benchmark sparse recovery methods, namely Lasso, EN, OMP and CoSaMP, by a significant margin. It is also noteworthy that EN often achieved better PER rates than Lasso, which is mainly due to its group selection ability. As a specific example of this feature, Fig. 4 shows the solution paths of Lasso and EN for one particular Monte Carlo trial, where EN correctly chooses the true DoAs but Lasso fails to select all correct DoAs. In this particular instance the EN tuning parameter was α = 0.9. This is the reason behind the success of our c-PW-WEN algorithm, which is the core computational engine of SAEN.

C. Off-grid sources

FIG. 4. (Color online) The (a) Lasso and (b) EN solution paths (power versus −log(λ), with the knot λ₃ marked) and (c) the respective DoA solutions at the knot λ₃ for the true sources at 43°, 44°, and 52°. Observe that Lasso fails to recover the true support, whereas EN successfully picks the true DoAs. The EN tuning parameter was α = 0.9 in this example.

Set-ups 5 and 6 explore the case when the target DoAs are off the grid. Note that set-up 5 uses a finer grid spacing, ∆θ = 1°, compared to set-up 6 with ∆θ = 2°. Both set-ups contain four target sources with largely varying source powers. In the off-grid case one would like the CBF method to localize the targets to the nearest DoAs in the angular grid [ϑ] that is used to construct the array steering matrix. Therefore, in the off-grid case, the PER rate refers to the case where the CBF method selects the K-sparse support corresponding to the grid DoAs closest in distance to the true DoAs. Table V provides the recovery results. As can be seen, SAEN again performs very well, outperforming Lasso and EN, whereas OMP and CoSaMP completely fail to select the nearest grid points.

TABLE V. Performance results of the CBF methods for set-ups 5 and 6, where the target DoAs are off the grid. Here, the PER rate refers to the case where the CBF method selects the K-sparse support corresponding to the grid DoAs closest to the true DoAs. The upper bound (16) for the PER rate of SAEN is given in parentheses. The SNR level is 20 dB.

Set-up 5 with four off-grid straight DoAs
        SAEN              EN       Lasso    OMP      CoSaMP
PER     (0.999) 0.649     0.349    0.328    0        0
RMSE    0.899             0.947    0.943    1.137    8.909

Set-up 6 with four off-grid oblique DoAs
        SAEN              EN       Lasso    OMP      CoSaMP
PER     (0.794) 0.683     0.336    0.336    0        0.005
RMSE    0.811             0.913    0.911    1.360    28.919

FIG. 5. (Color online) PER rates of the CBF methods at different SNR levels (0–30 dB) for set-up 7.

D. More targets and varying SNR levels

Next we consider set-up 7 (cf. Table II), which contains K = 4 sources. The first three sources are at straight DoAs, and the fourth is at a DoA with modest obliqueness (θ₄ = 18°). We now compute the PER rates of the methods as a function of the SNR. From the PER rates shown in Fig. 5 we again notice that SAEN clearly outperforms all of the other methods; the upper bound (16) of the PER rate of SAEN is also plotted. Both greedy algorithms, OMP and CoSaMP, perform very poorly even at high SNR levels. Lasso and EN attain better recovery results than the greedy algorithms. Again, EN performs better than Lasso, due to the additional flexibility offered by the EN tuning parameter and its ability to cope with correlated steering (basis) vectors. SAEN recovers the exact true support in most of the cases, thanks to its step-wise adaptation using cleverly chosen weights. Furthermore, the improvement in PER rates offered by SAEN becomes larger as the SNR level increases. One can also notice that SAEN is close to the theoretical upper bound of the PER rate in the higher SNR regime.

VIII. CONCLUSIONS

We developed the c-PW-WEN algorithm, which computes weighted elastic net solutions at the knots of the penalty parameter over a grid of EN tuning parameter values. c-PW-WEN also computes the weighted Lasso as a special case (i.e., the solution at α = 1), and the adaptive EN (AEN) is obtained when adaptive (data-dependent) weights are used. We then proposed a novel SAEN approach that uses the c-PW-WEN method as its core computational engine and employs a three-step adaptive weighting scheme in which the sparsity is decreased from 3K to K in three steps. Simulations illustrated that SAEN performs better than the adaptive EN approaches. Furthermore, we illustrated that the 3K-sparse initial solution computed at step 1 of SAEN provides smart weights for the further steps and includes the true K-sparse support with high accuracy, so the SAEN algorithm accurately retains the true support at each step.


Using the K-sparse Lasso solution computed directly from the Lasso path at the Kth knot fails to provide exact support recovery in many cases, especially under high basis coherence or at lower SNR. Greedy algorithms often fail in the face of high mutual coherence (due to dense grid spacing or oblique target DoAs) or low SNR. This is mainly due to the fact that their performance depends heavily on their ability to accurately detect the maximal correlation between the measurement vector y and the basis vectors (the column vectors of X). Our simulation study also showed that their performance (in terms of the PER rate) deteriorates when the number of targets increases. In the off-grid case the greedy algorithms also failed to find the nearby grid points.

Finally, the SAEN algorithm performed better than all other methods in every set-up, and the improvement was more pronounced in the presence of high mutual coherence. This is due to the ability of SAEN to include the true support correctly at all three steps of the algorithm. Our MATLAB® package that implements the proposed algorithms is freely available at www.github.com/mntabassm/SAEN-LARS. The package also contains a MATLAB® live script demo on how to use the method in the CBF problem, along with an example from simulation set-up 4 presented in the paper.

ACKNOWLEDGMENTS

The research was partially supported by the Academy of Finland under grant no. 298118, which is gratefully acknowledged.

1. R. Tibshirani, "Regression shrinkage and selection via the lasso," Journal of the Royal Statistical Society, Series B (Methodological), 267–288 (1996).

2. H. Zou and T. Hastie, "Regularization and variable selection via the elastic net," Journal of the Royal Statistical Society, Series B (Statistical Methodology) 67(2), 301–320 (2005).

3. T. Yardibi, J. Li, P. Stoica, and L. N. Cattafesta III, "Sparsity constrained deconvolution approaches for acoustic source mapping," The Journal of the Acoustical Society of America 123(5), 2631–2642 (2008). doi: 10.1121/1.2896754

4. A. Xenaki, E. Fernandez-Grande, and P. Gerstoft, "Block-sparse beamforming for spatially extended sources in a Bayesian formulation," The Journal of the Acoustical Society of America 140(3), 1828–1838 (2016).

5. D. Malioutov, M. Cetin, and A. S. Willsky, "A sparse signal reconstruction perspective for source localization with sensor arrays," IEEE Transactions on Signal Processing 53(8), 3010–3022 (2005). doi: 10.1109/TSP.2005.850882

6. G. F. Edelmann and C. F. Gaumond, "Beamforming using compressive sensing," The Journal of the Acoustical Society of America 130(4), EL232–EL237 (2011). doi: 10.1121/1.3632046

7. A. Xenaki, P. Gerstoft, and K. Mosegaard, "Compressive beamforming," The Journal of the Acoustical Society of America 136(1), 260–271 (2014). doi: 10.1121/1.4883360

8. P. Gerstoft, A. Xenaki, and C. F. Mecklenbräuker, "Multiple and single snapshot compressive beamforming," The Journal of the Acoustical Society of America 138(4), 2003–2014 (2015). doi: 10.1121/1.4929941

9. Y. Choo and W. Seong, "Compressive spherical beamforming for localization of incipient tip vortex cavitation," The Journal of the Acoustical Society of America 140(6), 4085–4090 (2016). doi: 10.1121/1.4968576

10. A. Das, W. S. Hodgkiss, and P. Gerstoft, "Coherent multipath direction-of-arrival resolution using compressed sensing," IEEE Journal of Oceanic Engineering 42(2), 494–505 (2017). doi: 10.1109/JOE.2016.2576198

11. S. Fortunati, R. Grasso, F. Gini, M. S. Greco, and K. LePage, "Single-snapshot DOA estimation by using compressed sensing," EURASIP Journal on Advances in Signal Processing 2014(1), 120 (2014). doi: 10.1186/1687-6180-2014-120

12. E. Ollila, "Nonparametric simultaneous sparse recovery: An application to source localization," in Proc. European Signal Processing Conference (EUSIPCO'15), Nice, France (2015), pp. 509–513.

13. E. Ollila, "Multichannel sparse recovery of complex-valued signals using Huber's criterion," in Proc. Compressed Sensing Theory and its Applications to Radar, Sonar and Remote Sensing (CoSeRa'15), Pisa, Italy (2015), pp. 32–36.

14. M. N. Tabassum and E. Ollila, "Single-snapshot DOA estimation using adaptive elastic net in the complex domain," in 4th International Workshop on Compressed Sensing Theory and its Applications to Radar, Sonar and Remote Sensing (CoSeRa) (2016), pp. 197–201. doi: 10.1109/CoSeRa.2016.7745728

15. T. Hastie, R. Tibshirani, and M. Wainwright, Statistical Learning with Sparsity: The Lasso and Generalizations (CRC Press, 2015).

16. M. Osborne, B. Presnell, and B. Turlach, "A new approach to variable selection in least squares problems," IMA Journal of Numerical Analysis 20(3), 389–403 (2000). doi: 10.1093/imanum/20.3.389

17. B. Efron, T. Hastie, I. Johnstone, and R. Tibshirani, "Least angle regression (with discussion)," The Annals of Statistics 32(2), 407–499 (2004). doi: 10.1214/009053604000000067

18. M. N. Tabassum and E. Ollila, "Pathwise least angle regression and a significance test for the elastic net," in 25th European Signal Processing Conference (EUSIPCO) (2017), pp. 1309–1313. doi: 10.23919/EUSIPCO.2017.8081420

19. S. Foucart and H. Rauhut, A Mathematical Introduction to Compressive Sensing, Vol. 1 (Springer, 2013).

20. H. Zou, "The adaptive lasso and its oracle properties," Journal of the American Statistical Association 101(476), 1418–1429 (2006). doi: 10.1198/016214506000000735

21. A. E. Hoerl and R. W. Kennard, "Ridge regression: biased estimation for nonorthogonal problems," Technometrics 12(1), 55–67 (1970).

22. S. Rosset and J. Zhu, "Piecewise linear regularized solution paths," Ann. Statist. 35(3), 1012–1030 (2007). doi: 10.1214/009053606000001370

23. R. J. Tibshirani and J. Taylor, "The solution path of the generalized lasso," Ann. Statist. 39(3), 1335–1371 (2011). doi: 10.1214/11-AOS878

24. R. J. Tibshirani, "The lasso problem and uniqueness," Electron. J. Statist. 7, 1456–1490 (2013). doi: 10.1214/13-EJS815

25. A. Panahi and M. Viberg, "Fast candidate points selection in the lasso path," IEEE Signal Processing Letters 19(2), 79–82 (2012).

26. K. L. Gemba, W. S. Hodgkiss, and P. Gerstoft, "Adaptive and compressive matched field processing," The Journal of the Acoustical Society of America 141(1), 92–103 (2017). doi: 10.1121/1.4973528

27. A. B. Gershman, C. F. Mecklenbrauker, and J. F. Bohme, "Matrix fitting approach to direction of arrival estimation with imperfect spatial coherence of wavefronts," IEEE Transactions on Signal Processing 45(7), 1894–1899 (1997). doi: 10.1109/78.599968

28. J. A. Tropp and A. C. Gilbert, "Signal recovery from random measurements via orthogonal matching pursuit," IEEE Transactions on Information Theory 53(12), 4655–4666 (2007). doi: 10.1109/TIT.2007.909108

29. D. Needell and J. Tropp, "CoSaMP: Iterative signal recovery from incomplete and inaccurate samples," Applied and Computational Harmonic Analysis 26(3), 301–321 (2009).

Page 7: Sequential adaptive elastic net approach for single …Sequential adaptive elastic net approach for single-snapshot source localization MuhammadNaveedTabassum 1andEsaOllila Aalto University,

θ isin [minusπ2 π2) is given by

a(

θ)

1radicn

[

1 eπ sin θ eπ(nminus1) sin θ]⊤

where we assume half a wavelength inter-element spac-ing between sensors We consider the case that K lt nsources from distinct DoAs θ = (θ1 θK) arrive at asensor at some time instant t A single-snapshot obtainedby ULA can then be modeled as27

y(t) = A(θ)s(t) + ε(t) (14)

where θ = (θ1 θK)⊤ collects the DoAs the matrix

A(θ) = [a(θ1) middot middot middot a(θK)] A isin CntimesK is the dictionaryof replicas also known as the array steering matrix s(t) isinCK contains the source waveforms and ε(t) is complexnoise at time instant t

Consider an angular grid of size p (commonly p ≫ n)of look directions of interest

[ϑ] = ϑi isin [minusπ2 π2) ϑ1 lt middot middot middot lt ϑp

Let the ith column of the measurement matrix X in themodel (1) be the array response for look direction ϑi soxi = a(ϑi) Then if the true source DoAs are containedin the angular grid ie θi isin [ϑ] for i = 1 K thenthe snapshot y in (14) (where we drop the time index t)can be equivalently modeled by (1) as

y = Xβ + ε

where β is exactly K-sparse (β0 = K) and nonzeroselements of β maintain the source waveforms s Thusidentifying the true DoAs is equivalent to identifyingthe nonzero elements of β which we refer to as CBF-principle Hence sparse regression and CS methods canbe utilised for estimating the DoAs based on a singlesnapshot only We assume that the number of sources Kis known a priori

Besides SNR also the size p or spacing ∆θ of the gridgreatly affect the performance of CBF methods Thecross-correlation (coherence) between the true steeringvector and steering vectors on the grid depends on boththe grid spacing and obliqueness of the target DoAs wrtto the array Moreover the values of cross-correlationsin the gram matrix |XHX| also depend on the distancebetween array elements and configuration of the sen-sors array7 Let us construct a measure called max-imal basis coherence (MBC) defined as the maximumabsolute value of the cross-correlations among the truesteering vectors a(θj) and the basis a(ϑi) ϑi isin [ϑ]θjj isin 1 K

MBC = maxj

maxϑisin[ϑ]θj

∣a(θj)H a(ϑ)

∣ (15)

Note that steering vectors a(θ) θ isin [minusπ2 π2) are as-sumed to be normalized (a(θ)H a(θ) = 1) MBC valuemeasures the obliqueness of the incoming DoA to the ar-ray The higher the MBC value the more difficult it is

for any CBF method to distinguish the true DoA in thegrid Note that the value of MBC also depends on thegrid spacing ∆θ

Fig 3 shows the geometry of DoA estimation prob-lem where the target DoArsquos have varying basis coherenceand it increases with the level of obliqueness (inclination)on either side of the normal to the ULA axis We definea straight DoA when the angle of incidence of the targetDoAs is in the shaded sector in Fig 3 This the regionwhere the MBC has lower values In contrast an obliqueDoA is defined when the angle of incidence of the targetis oblique with respect to the array axis

θ

nn-12

s(t)

1d

Source

d sinθ

frac14

Straight DoAs

FIG 3 (Color online) The straight and oblique DoAs exhibit

different basis coherence

Consider the case of ULA with n = 40 elementsreceiving two sources at straight DoAs θ1 = minus6 andθ2 = 2 or at oblique DoAs θ1 = 44 and θ2 = 52The above scenarios correspond to set-up 2 and set-up 3of Section VII see also Table II Angular separation be-tween the DoArsquos is 8 in both the scenarios In Table I wecompute the correlation between the true steering vectorsand with a neighboring steering vector a(ϑ) in the grid

TABLE I Correlation between true steering vector at DoAs

θ1 and θ2 and a steering vector at angle ϑ on the grid in a two

source scenario set-ups with either straight or oblique DoAs

θ1 θ2 ϑ correlation

True straight DoAs minus6 2 0071

minus6 minus7 0814

2 3 0812

True oblique DoAs 44 52 0069

44 45 0901

52 53 0927

These values validate the fact highlighted in Fig 3Namely a target with an oblique DoA wrt the arrayhas a larger maximal correlation (coherence) with the ba-sis steering vectors This makes it difficult for the sparse

J Acoust Soc Am 22 May 2018 Sequential adaptive elastic net 7

recovery method to identify the true steering vector a(θi)from the spurious steering vector a(ϑ) that simply has avery large correlation with the true one Due to this mu-tual coherence it may happen that neither a(θi) or a(ϑ)

are assigned a non-zero coefficient value in β or perhapsjust one of them in random fashion

VII SIMULATION STUDIES

We consider seven simulation set-ups First five set-ups use grid spacing ∆θ = 1 (leading to p = 180 lookdirections in the grid [ϑ]) and the last two employ sparsergrid ∆θ = 2 (leading to p = 90 look directions in thegrid) The number of sensors n in the ULA are n = 40 forset-ups 1-5 and n = 30 for set-ups 6-7 Each set-up hasK isin 2 3 4 sources at different (straight or oblique)DoAs θ = (θ1 θK)⊤ and the source waveforms aregenerated as sk = |sk| middot eArg(sk) where source powers|sk| isin (0 1] are fixed for each set-up but the source phasesare randomly generated for each Monte-Carlo trial asArg(sk) sim Unif(0 2π) for k = 1 K Table II speci-fies the values of DoA-s and power of the sources used inthe set-ups Also the MBC values (15) are reported foreach case

TABLE II Details of all the set-ups tested in this paper First

five set-ups have grid spacing ∆θ = 1 and last two ∆θ = 2

Set-ups |si| θ [] MBC

1 [09 1 1] [minus5 3 6] 0814

2 [09 1] [minus6 2] 0814

3 [09 1] [44 52] 0927

4 [08 07 1] [43 44 52] 0927

5 [09 01 1 04] [minus87minus38minus35 97] 0990

6 [08 1 09 04] minus[485 464 315 22] 0991

7 [07 1 06 07] [6 8 14 18] 0643

The error terms εi are iid and generated fromCN (0 σ2) distribution where the noise variance σ2 de-pends on the signal-to-noise ratio (SNR) level in deci-bel (dB) given by SNR(dB) = 10 log10(σ

2sσ

2) whereσ2s = 1

K

|s1|2 + |s2|2 + middot middot middot + |sK |2

denotes the averagesource power An SNR level of 20 dB is used in thispaper unless its specified otherwise and the number ofMonte-Carlo trials is L = 1000

In each set-up we evaluate the performance of allmethods in recovering exactly the true support and thesource powers Due to high mutual coherence and due thelarge differences in source powers the DoA estimation isnow a challenging task A key performance measure isthe (empirical) probability of exact recovery (PER) of allK sources defined as

PER = aveI(Alowast = AK)

where Alowast denotes the index set of true source DoAs onthe grid and set AK the found support set where |Alowast| =|AK | = K and the average is over all Monte-Carlo tri-als Above I(middot) denotes the indicator function We alsocompute the average root mean squared error (RMSE)of the debiased estimate s = argminsisinCK y minus XAK

s2of the source vector as RMSE =

radic

avesminus s2

A Compared methods

This paper compares the SAEN approach to the ex-isting well-known greedy methods such as orthogonalmatching pursuit (OMP)28 and compressive samplingmatching pursuit (CoSaMP)29 Moreover we also drawcomparisons for two special cases of the c-PW-WEN al-gorithm to Lasso estimate that has K-non-zeros (ie

β(λK 1) computed by c-PW-WEN using w = 1 andα = 1) and EN estimator when cherry-picking the bestα in the grid [α] = αi isin [1 0) α1 = 1 lt middot middot middot lt αm lt

001 (ie β(λK αbst))It is instructive to compare the SAEN to simpler

adaptive EN (AEN) approach that simply uses adaptiveweights to weight different coefficients differently in thespirit of adaptive Lasso20 This helps in understandingthe effectiveness of the cleverly chosen weights and theusefulness of the three-step procedure used by the SAENalgorithm Recall that the first step in AEN approach iscompute the weights using some initial solution Afterobtaining the (adaptive) weights the final K-sparse AENsolution is computed We devise three AEN approacheseach one using a different initial solution to compute theweights

1 AEN(LSE) uses the weights found as

w(LSE) = 1⊘ |X+y|

where X+ is Moore-Penrose pseudo inverse of X

2 AEN(n) employs weights from an initial n-sparse

EN solution β(λn α) at nth knot which is found by

c-PW-WEN algorithm with w = 1

3 AEN(3K) instead uses weights calculated from an

initial EN solution β(λ3K α) having 3K nonzerosas in step 1 of SAEN algorithm but the remainingtwo steps of SAEN algorithm are omitted

The upper bound for PER rate for SAEN is the (em-

pirical) probability that the initial solution βinit com-puted in step 1 of algorithm 3 contains the true supportAlowast ie the value

UB = ave

I(

Alowast sub supp(βinit)

(16)

where the average is over all Monte-Carlo trials Wealso compute this upper bound to illustrate the abilityof SAEN to pick the true K-sparse support from theoriginal 3K-sparse initial value For set-up 1 (cf Ta-ble II) the average recovery results for all of the abovementioned methods are provided in Table III

8 J Acoust Soc Am 22 May 2018 Sequential adaptive elastic net

TABLE III The recovery results for set-up 1 Results illus-

trate the effectiveness of three step SAEN approach compared

to its competitors The SNR level was 20 dB and the upper

bound (16) for the PER rate of SAEN is given in parentheses

SAEN AEN(3K) AEN(n)

PER (0957) 0864 0689 0616

RMSE - 0449 0694 0889

AEN(LSE) EN Lasso OMP CoSaMP

PER 0 0332 0332 0477 0140

RMSE 1870 1163 1163 1060 3558

It can be noted that the proposed SAEN outperformsall other methods and weighting schemes and recovers hetrue support and powers of the sources effectively Notethat the SAENrsquos upper bound for PER rate was 957and SAEN reached the PER rate 864 The results ofthe AEN approaches validate the need for accurate initialestimate to construct the adaptive weights For exampleAEN(3K) performs better than AEN(n) but much worsethan the SAEN method

B Straight and oblique DoAs

Set-up 2 and set-up 3 correspond to the case wherethe targets are at straight and oblique DoAs respectivelyPerformance results of the sparse recovery algorithms aretabulated in Table IV As can be seen the upper boundfor PER rate of SAEN is full 100 percentage whichmeans that the true support is correctly included in 3K-sparse solution computed at Step 1 of the algorithmFor set-up 2 (straight DoAs) all methods have almostfull PER rates except CoSaMP with 678 rate Per-formance of other estimators expect of SAEN changesdrastically in set-up 3 (oblique DoAs) Here SAEN isachieving nearly perfect (sim98) PER rate which meansreduction of 2 compared to set-up 2 Other methodsperform poorly For example PER rate of Lasso dropsfrom near 98 to 40 Similar behavior is observed forEN OMP and CoSaMP

Next we discuss the results for set-up 4 which is sim-ilar to set-up 3 except that we have introduced a thirdsource that also arrives from an oblique DoA θ = 43 andthe variation of the source powers is slightly larger Ascan be noted from Table IV the PER rates of greedy al-gorithms OMP and CoSaMP have declined to outstand-ingly low 0 This is very different with the PER ratesthey had in set-up 3 which contained only two sourcesIndeed inclusion of the 3rd source from an DoA similarwith the other two sources completely ruined their accu-racy This is in deep contrast with the SAEN methodthat still achieves PER rate of 75 which is more thantwice the PER rate achieved by Lasso SAEN is againhaving the lowest RMSE values

TABLE IV Recovery results for set-ups 2 - 4 Note that for

oblique DoArsquos (set-ups 3 and 4) the SAEN method outper-

form the other methods and has a perfect recovery results

for set-up 2 (straight DoAs) SNR level is 20 dB The upper

bound (16) for the PER rate of SAEN is given in parentheses

Set-up 2 with two straight DoAs

SAEN EN Lasso OMP CoSaMP

PER (1000) 1000 0981 0981 0998 0678

RMSE 0126 0145 0145 0128 1436

Set-up 3 with two oblique DoAs

SAEN EN Lasso OMP CoSaMP

PER (1000) 0978 0399 0399 0613 0113

RMSE 0154 0916 0916 0624 2296

Set-up 4 with three oblique DoAs

SAEN EN Lasso OMP CoSaMP

PER (0776) 0749 0392 0378 0 0

RMSE 0505 0838 0827 1087 5290

In summary the recovery results for set-ups 1-4(which express different degrees of basis coherence prox-imity of target DoArsquos as well as variation of source pow-ers) clearly illustrate that the proposed SAEN performsvery well in identifying the true support and the power ofthe sources and is always outperforming the commonlyused benchmarks sparse recovery methods namely theLasso EN OMP or CoSaMP with a significant marginIt is also noteworthy that EN often achieved better PERrates than Lasso which is mainly due to its group se-lection ability As a specific example of this particularfeature Figure 4 shows the solution paths for Lasso andEN for one particular Monte-Carlo trial where EN cor-rectly chooses the true DoAs but Lasso fails to select allcorrect DoAs In this particular instance the EN tun-ing parameter was α = 09 This is reason behind thesuccess of our c-PW-WEN algorithm which is the corecomputational engine of the SAEN

C Off-grid sources

Set-ups 5 and 6 explore the case when the targetDoAs are off the grid Also note that set-up 5 uses finergrid spacing ∆θ = 1 compared to set-up 6 with ∆θ = 2Both set-ups contain four target sources that have largelyvarying source powers In the off-grid case one would likethe CBF method to localize the targets to the nearestDoA in the angular grid [ϑ] that is used to construct thearray steering matrix Therefore in the off the grid case

J Acoust Soc Am 22 May 2018 Sequential adaptive elastic net 9

0 05 1 15

-log( )

0

1P

ow

er (43)deg

(52)deg

3

(a)

0 05 1 15

-log( )

(43)deg

(44)deg

(52)deg

3

(b)

0

1

Po

wer

40 50 60 70

DoA [degrees]

Sources

Lasso

EN

(c)

FIG 4 (Color online) The Lasso and EN solution paths

(upper panel) and respective DoA solutions at the knot λ3

Observe that Lasso fails to recover the true support but EN

successfully picks the true DoAs The EN tuning parameter

was α = 09 in this example

the PER rate refers to the case that CBF method selectsthe K-sparse support that corresponds to DoArsquos on thegrid that are closest in distance to the true DoAs TableV provides the recovery results As can be seen again theSAEN is performing very well outperforming the Lassoand EN Note that OMP and CoSaMP completely fail inselecting the nearest grid-points

TABLE V Performance results of CBF methods for set-ups

5 and 6 where target DoA-s are off the grid Here PER

rate refers to the case that CBF method selects the K-sparse

support that corresponds to DoArsquos on the grid that are closest

to the true DoAs The upper bound (16) for the PER rate of

SAEN is given in parentheses SNR level is 20 dB

Set-up 5 with four off-grid straight DoAs

SAEN EN Lasso OMP CoSaMP

PER (0999) 0649 0349 0328 0 0

RMSE 0899 0947 0943 1137 8909

Set-up 6 with four off-grid oblique DoAs

SAEN EN Lasso OMP CoSaMP

PER (0794) 0683 0336 0336 0 0005

RMSE 0811 0913 0911 1360 28919

0 6 12 18 24 30

SNR [dB]

0

02

04

06

08

1

PE

R

FIG 5 (Color online) PER rates of CBF methods at different

SNR levels for set-up 7

D More targets and varying SNR levels

Next we consider the set-up 7 (cf Table II) whichcontains K = 4 sources The first three of the sourcesare at straight DoAs and the fourth one at a DoA withmodest obliqueness (θ4 = 18o) We now compute thePER rates of the methods as a function of SNR From thePER rates shown in Fig 5 we again notice that SAENclearly outperforms all of the other methods Note thatthe upper bound (16) of the PER rate of the SAEN is alsoplotted Both greedy algorithms OMP and CoSaMP areperforming very poorly even at high SNR levels Lassoand EN are attaining better recovery results than thegreedy algorithms Again EN is performing better thanLasso due to additional flexibility offered by EN tuningparameter and its ability to cope with correlated steering(basis) vectors SAEN recovers the exact true support inmost of the cases due to its step-wise adaptation usingcleverly chosen weights Furthermore the improvementin PER rates offered by SAEN becomes larger as the SNRlevel increases One can also notice that SAEN is close tothe theoretical upper bound of PER rate at higher SNRregime

VIII CONCLUSIONS

We developed c-PW-WEN algorithm that computesweighted elastic net solutions at the knots of penalty pa-rameter over a grid of EN tuning parameter values c-PW-WEN also computes weighted Lasso as a special case(ie solution at α = 1) and adaptive EN (AEN) is ob-tained when adaptive (data dependent) weights are usedWe then proposed a novel SAEN approach that uses c-PW-WEN method as its core computational engine anduses three-step adaptive weighting scheme where sparsityis decreased from 3K to K in three steps Simulationsillustrated that SAEN performs better than the adap-tive EN approaches Furthermore we illustrated that the3K-sparse initial solution computed at step 1 of SAENprovide smart weights for further steps and includes thetrue K-sparse support with high accuracy The proposedSAEN algorithm is then accurately including the truesupport at each step

10 J Acoust Soc Am 22 May 2018 Sequential adaptive elastic net

Using the K-sparse Lasso solution computed directlyfrom Lasso path at the kth knot fails to provide exactsupport recovery in many cases especially when we havehigh basis coherence and lower SNR Greedy algorithmsoften fail in the face of high mutual coherence (due todense grid spacing or oblique target DoArsquos) or low SNRThis is mainly due the fact that their performance heav-ily depends on their ability to accurately detecting max-imal correlation between the measurement vector y andthe basis vectors (column vectors of X) Our simulationstudy also showed that their performance (in terms ofPER rate) deteriorates when the number of targets in-creases In the off-grid case the greedy algorithms alsofailed to find the nearby grid-points

Finally the SAEN algorithm performed better thanall other methods in each set-up and the improvementwas more pronounced in the presence of high mutualcoherence This is due to ability of SAEN to in-clude the true support correctly at all three steps ofthe algorithm Our MATLAB Rcopy package that imple-ments the proposed algorithms is freely available atwwwgithubcommntabassmSAEN-LARS The packagealso contains a MATLAB Rcopy live script demo on how touse the method in CBF problem along with an examplefrom simulation set-up 4 presented in the paper

ACKNOWLEDGMENTS

The research was partially supported by theAcademy of Finland grant no 298118 which is grate-fully acknowledged

1R Tibshirani ldquoRegression shrinkage and selection via the lassordquoJournal of the Royal Statistical Society Series B (Methodologi-cal) 267ndash288 (1996)

2H Zou and T Hastie ldquoRegularization and variable selection viathe elastic netrdquo Journal of the Royal Statistical Society SeriesB (Statistical Methodology) 67(2) 301ndash320 (2005)

3T Yardibi J Li P Stoica and L N C III ldquoSparsity con-strained deconvolution approaches for acoustic source mappingrdquoThe Journal of the Acoustical Society of America 123(5) 2631ndash2642 (2008) doi 10112112896754

4A Xenaki E Fernandez-Grande and P Gerstoft ldquoBlock-sparsebeamforming for spatially extended sources in a bayesian formu-lationrdquo The Journal of the Acoustical Society of America 140(3)1828ndash1838 (2016)

5D Malioutov M Cetin and A S Willsky ldquoA sparse signalreconstruction perspective for source localization with sensor ar-raysrdquo IEEE Transactions on Signal Processing 53(8) 3010ndash3022(2005) doi 101109TSP2005850882

6G F Edelmann and C F Gaumond ldquoBeamforming using com-pressive sensingrdquo The Journal of the Acoustical Society of Amer-ica 130(4) EL232ndashEL237 (2011) doi 10112113632046

7A Xenaki P Gerstoft and K Mosegaard ldquoCompressive beam-formingrdquo The Journal of the Acoustical Society of America136(1) 260ndash271 (2014) doi 10112114883360

8P Gerstoft A Xenaki and C F Mecklenbruker ldquoMultiple andsingle snapshot compressive beamformingrdquo The Journal of theAcoustical Society of America 138(4) 2003ndash2014 (2015) doi10112114929941

9Y Choo and W Seong ldquoCompressive spherical beamformingfor localization of incipient tip vortex cavitationrdquo The Journal

of the Acoustical Society of America 140(6) 4085ndash4090 (2016)doi 10112114968576

10A Das W S Hodgkiss and P Gerstoft ldquoCoherent multi-path direction-of-arrival resolution using compressed sensingrdquoIEEE Journal of Oceanic Engineering 42(2) 494ndash505 (2017) doi101109JOE20162576198

11S Fortunati R Grasso F Gini M S Greco and K LePageldquoSingle-snapshot doa estimation by using compressed sensingrdquoEURASIP Journal on Advances in Signal Processing 2014(1)120 (2014) doi 1011861687-6180-2014-120

12E Ollila ldquoNonparametric simultaneous sparse recovery An ap-plication to source localizationrdquo in Proc European Signal Pro-cessing Conference (EUSIPCOrsquo15) Nice France (2015) pp509ndash513

13E Ollila ldquoMultichannel sparse recovery of complex-valued sig-nals using Huberrsquos criterionrdquo in Proc Compressed Sensing The-ory and its Applications to Radar Sonar and Remote Sensing(CoSeRarsquo15) Pisa Italy (2015) pp 32ndash36

14M N Tabassum and E Ollila ldquoSingle-snapshot doa estimationusing adaptive elastic net in the complex domainrdquo in 4th In-ternational Workshop on Compressed Sensing Theory and itsApplications to Radar Sonar and Remote Sensing (CoSeRa)(2016) pp 197ndash201 doi 101109CoSeRa20167745728

15T Hastie R Tibshirani and M Wainwright Statistical learningwith sparsity the lasso and generalizations (CRC Press 2015)

16M Osborne B Presnell and B Turlach ldquoA new ap-proach to variable selection in least squares problemsrdquo IMAJournal of Numerical Analysis 20(3) 389ndash403 (2000) doi101093imanum203389

17B Efron T Hastie I Johnstone and R Tibshirani ldquoLeast an-gle regression (with discussion)rdquo The Annals of statistics 32(2)407ndash499 (2004) doi 101214009053604000000067

18M N Tabassum and E Ollila ldquoPathwise least angle regressionand a significance test for the elastic netrdquo in 25th European Sig-nal Processing Conference (EUSIPCO) (2017) pp 1309ndash1313doi 1023919EUSIPCO20178081420

19S Foucart and H Rauhut A mathematical introduction to com-pressive sensing Vol 1 (Springer 2013)

20H Zou ldquoThe adaptive lasso and its oracle propertiesrdquo Journal ofthe American Statistical Association 101(476) 1418ndash1429 (2006)doi 101198016214506000000735

21A E Hoerl and R W Kennard ldquoRidge regression biased esti-mation for nonorthogonal problemsrdquo Technometrics 12(1) 55ndash67 (1970)

22S Rosset and J Zhu ldquoPiecewise linear regularized so-lution pathsrdquo Ann Statist 35(3) 1012ndash1030 (2007) doi101214009053606000001370

23R J Tibshirani and J Taylor ldquoThe solution path of thegeneralized lassordquo Ann Statist 39(3) 1335ndash1371 (2011) doi10121411-AOS878

24R J Tibshirani ldquoThe lasso problem and uniquenessrdquo ElectronJ Statist 7 1456ndash1490 (2013) doi 10121413-EJS815

25A Panahi and M Viberg ldquoFast candidate points selection in thelasso pathrdquo IEEE Signal Processing Letters 19(2) 79ndash82 (2012)

26K L Gemba W S Hodgkiss and P Gerstoft ldquoAdaptiveand compressive matched field processingrdquo The Journal ofthe Acoustical Society of America 141(1) 92ndash103 (2017) doi10112114973528

27A B Gershman C F Mecklenbrauker and J F Bohme ldquoMa-trix fitting approach to direction of arrival estimation with imper-fect spatial coherence of wavefrontsrdquo IEEE Transactions on Sig-nal Processing 45(7) 1894ndash1899 (1997) doi 10110978599968

28J A Tropp and A C Gilbert ldquoSignal recovery from randommeasurements via orthogonal matching pursuitrdquo IEEE Trans-actions on Information Theory 53(12) 4655ndash4666 (2007) doi101109TIT2007909108

29D Needell and J Tropp ldquoCosamp Iterative signal recovery fromincomplete and inaccurate samplesrdquo Applied and Computational

J Acoust Soc Am 22 May 2018 Sequential adaptive elastic net 11

Harmonic Analysis 26(3) 301 ndash 321 (2009)

12 J Acoust Soc Am 22 May 2018 Sequential adaptive elastic net

Page 8: Sequential adaptive elastic net approach for single …Sequential adaptive elastic net approach for single-snapshot source localization MuhammadNaveedTabassum 1andEsaOllila Aalto University,

recovery method to identify the true steering vector a(θi)from the spurious steering vector a(ϑ) that simply has avery large correlation with the true one Due to this mu-tual coherence it may happen that neither a(θi) or a(ϑ)

are assigned a non-zero coefficient value in β or perhapsjust one of them in random fashion

VII SIMULATION STUDIES

We consider seven simulation set-ups First five set-ups use grid spacing ∆θ = 1 (leading to p = 180 lookdirections in the grid [ϑ]) and the last two employ sparsergrid ∆θ = 2 (leading to p = 90 look directions in thegrid) The number of sensors n in the ULA are n = 40 forset-ups 1-5 and n = 30 for set-ups 6-7 Each set-up hasK isin 2 3 4 sources at different (straight or oblique)DoAs θ = (θ1 θK)⊤ and the source waveforms aregenerated as sk = |sk| middot eArg(sk) where source powers|sk| isin (0 1] are fixed for each set-up but the source phasesare randomly generated for each Monte-Carlo trial asArg(sk) sim Unif(0 2π) for k = 1 K Table II speci-fies the values of DoA-s and power of the sources used inthe set-ups Also the MBC values (15) are reported foreach case

TABLE II Details of all the set-ups tested in this paper First

five set-ups have grid spacing ∆θ = 1 and last two ∆θ = 2

Set-ups |si| θ [] MBC

1 [09 1 1] [minus5 3 6] 0814

2 [09 1] [minus6 2] 0814

3 [09 1] [44 52] 0927

4 [08 07 1] [43 44 52] 0927

5 [09 01 1 04] [minus87minus38minus35 97] 0990

6 [08 1 09 04] minus[485 464 315 22] 0991

7 [07 1 06 07] [6 8 14 18] 0643

The error terms εi are iid and generated fromCN (0 σ2) distribution where the noise variance σ2 de-pends on the signal-to-noise ratio (SNR) level in deci-bel (dB) given by SNR(dB) = 10 log10(σ

2sσ

2) whereσ2s = 1

K

|s1|2 + |s2|2 + middot middot middot + |sK |2

denotes the averagesource power An SNR level of 20 dB is used in thispaper unless its specified otherwise and the number ofMonte-Carlo trials is L = 1000

In each set-up we evaluate the performance of allmethods in recovering exactly the true support and thesource powers Due to high mutual coherence and due thelarge differences in source powers the DoA estimation isnow a challenging task A key performance measure isthe (empirical) probability of exact recovery (PER) of allK sources defined as

PER = aveI(Alowast = AK)

where Alowast denotes the index set of true source DoAs onthe grid and set AK the found support set where |Alowast| =|AK | = K and the average is over all Monte-Carlo tri-als Above I(middot) denotes the indicator function We alsocompute the average root mean squared error (RMSE)of the debiased estimate s = argminsisinCK y minus XAK

s2of the source vector as RMSE =

radic

avesminus s2

A Compared methods

This paper compares the SAEN approach to the ex-isting well-known greedy methods such as orthogonalmatching pursuit (OMP)28 and compressive samplingmatching pursuit (CoSaMP)29 Moreover we also drawcomparisons for two special cases of the c-PW-WEN al-gorithm to Lasso estimate that has K-non-zeros (ie

β(λK 1) computed by c-PW-WEN using w = 1 andα = 1) and EN estimator when cherry-picking the bestα in the grid [α] = αi isin [1 0) α1 = 1 lt middot middot middot lt αm lt

001 (ie β(λK αbst))It is instructive to compare the SAEN to simpler

adaptive EN (AEN) approach that simply uses adaptiveweights to weight different coefficients differently in thespirit of adaptive Lasso20 This helps in understandingthe effectiveness of the cleverly chosen weights and theusefulness of the three-step procedure used by the SAENalgorithm Recall that the first step in AEN approach iscompute the weights using some initial solution Afterobtaining the (adaptive) weights the final K-sparse AENsolution is computed We devise three AEN approacheseach one using a different initial solution to compute theweights

1 AEN(LSE) uses the weights found as

w(LSE) = 1⊘ |X+y|

where X+ is Moore-Penrose pseudo inverse of X

2 AEN(n) employs weights from an initial n-sparse

EN solution β(λn α) at nth knot which is found by

c-PW-WEN algorithm with w = 1

3 AEN(3K) instead uses weights calculated from an

initial EN solution β(λ3K α) having 3K nonzerosas in step 1 of SAEN algorithm but the remainingtwo steps of SAEN algorithm are omitted

The upper bound for PER rate for SAEN is the (em-

pirical) probability that the initial solution βinit com-puted in step 1 of algorithm 3 contains the true supportAlowast ie the value

UB = ave

I(

Alowast sub supp(βinit)

(16)

where the average is over all Monte-Carlo trials Wealso compute this upper bound to illustrate the abilityof SAEN to pick the true K-sparse support from theoriginal 3K-sparse initial value For set-up 1 (cf Ta-ble II) the average recovery results for all of the abovementioned methods are provided in Table III

8 J Acoust Soc Am 22 May 2018 Sequential adaptive elastic net

TABLE III The recovery results for set-up 1 Results illus-

trate the effectiveness of three step SAEN approach compared

to its competitors The SNR level was 20 dB and the upper

bound (16) for the PER rate of SAEN is given in parentheses

SAEN AEN(3K) AEN(n)

PER (0957) 0864 0689 0616

RMSE - 0449 0694 0889

AEN(LSE) EN Lasso OMP CoSaMP

PER 0 0332 0332 0477 0140

RMSE 1870 1163 1163 1060 3558

It can be noted that the proposed SAEN outperformsall other methods and weighting schemes and recovers hetrue support and powers of the sources effectively Notethat the SAENrsquos upper bound for PER rate was 957and SAEN reached the PER rate 864 The results ofthe AEN approaches validate the need for accurate initialestimate to construct the adaptive weights For exampleAEN(3K) performs better than AEN(n) but much worsethan the SAEN method

B Straight and oblique DoAs

Set-up 2 and set-up 3 correspond to the case wherethe targets are at straight and oblique DoAs respectivelyPerformance results of the sparse recovery algorithms aretabulated in Table IV As can be seen the upper boundfor PER rate of SAEN is full 100 percentage whichmeans that the true support is correctly included in 3K-sparse solution computed at Step 1 of the algorithmFor set-up 2 (straight DoAs) all methods have almostfull PER rates except CoSaMP with 678 rate Per-formance of other estimators expect of SAEN changesdrastically in set-up 3 (oblique DoAs) Here SAEN isachieving nearly perfect (sim98) PER rate which meansreduction of 2 compared to set-up 2 Other methodsperform poorly For example PER rate of Lasso dropsfrom near 98 to 40 Similar behavior is observed forEN OMP and CoSaMP

Next we discuss the results for set-up 4 which is sim-ilar to set-up 3 except that we have introduced a thirdsource that also arrives from an oblique DoA θ = 43 andthe variation of the source powers is slightly larger Ascan be noted from Table IV the PER rates of greedy al-gorithms OMP and CoSaMP have declined to outstand-ingly low 0 This is very different with the PER ratesthey had in set-up 3 which contained only two sourcesIndeed inclusion of the 3rd source from an DoA similarwith the other two sources completely ruined their accu-racy This is in deep contrast with the SAEN methodthat still achieves PER rate of 75 which is more thantwice the PER rate achieved by Lasso SAEN is againhaving the lowest RMSE values

TABLE IV Recovery results for set-ups 2 - 4 Note that for

oblique DoArsquos (set-ups 3 and 4) the SAEN method outper-

form the other methods and has a perfect recovery results

for set-up 2 (straight DoAs) SNR level is 20 dB The upper

bound (16) for the PER rate of SAEN is given in parentheses

Set-up 2 with two straight DoAs

SAEN EN Lasso OMP CoSaMP

PER (1000) 1000 0981 0981 0998 0678

RMSE 0126 0145 0145 0128 1436

Set-up 3 with two oblique DoAs

SAEN EN Lasso OMP CoSaMP

PER (1000) 0978 0399 0399 0613 0113

RMSE 0154 0916 0916 0624 2296

Set-up 4 with three oblique DoAs

SAEN EN Lasso OMP CoSaMP

PER (0776) 0749 0392 0378 0 0

RMSE 0505 0838 0827 1087 5290

In summary the recovery results for set-ups 1-4(which express different degrees of basis coherence prox-imity of target DoArsquos as well as variation of source pow-ers) clearly illustrate that the proposed SAEN performsvery well in identifying the true support and the power ofthe sources and is always outperforming the commonlyused benchmarks sparse recovery methods namely theLasso EN OMP or CoSaMP with a significant marginIt is also noteworthy that EN often achieved better PERrates than Lasso which is mainly due to its group se-lection ability As a specific example of this particularfeature Figure 4 shows the solution paths for Lasso andEN for one particular Monte-Carlo trial where EN cor-rectly chooses the true DoAs but Lasso fails to select allcorrect DoAs In this particular instance the EN tun-ing parameter was α = 09 This is reason behind thesuccess of our c-PW-WEN algorithm which is the corecomputational engine of the SAEN

C Off-grid sources

Set-ups 5 and 6 explore the case when the targetDoAs are off the grid Also note that set-up 5 uses finergrid spacing ∆θ = 1 compared to set-up 6 with ∆θ = 2Both set-ups contain four target sources that have largelyvarying source powers In the off-grid case one would likethe CBF method to localize the targets to the nearestDoA in the angular grid [ϑ] that is used to construct thearray steering matrix Therefore in the off the grid case

J Acoust Soc Am 22 May 2018 Sequential adaptive elastic net 9

0 05 1 15

-log( )

0

1P

ow

er (43)deg

(52)deg

3

(a)

0 05 1 15

-log( )

(43)deg

(44)deg

(52)deg

3

(b)

0

1

Po

wer

40 50 60 70

DoA [degrees]

Sources

Lasso

EN

(c)

FIG 4 (Color online) The Lasso and EN solution paths

(upper panel) and respective DoA solutions at the knot λ3

Observe that Lasso fails to recover the true support but EN

successfully picks the true DoAs The EN tuning parameter

was α = 09 in this example

the PER rate refers to the case that CBF method selectsthe K-sparse support that corresponds to DoArsquos on thegrid that are closest in distance to the true DoAs TableV provides the recovery results As can be seen again theSAEN is performing very well outperforming the Lassoand EN Note that OMP and CoSaMP completely fail inselecting the nearest grid-points

TABLE V Performance results of CBF methods for set-ups

5 and 6 where target DoA-s are off the grid Here PER

rate refers to the case that CBF method selects the K-sparse

support that corresponds to DoArsquos on the grid that are closest

to the true DoAs The upper bound (16) for the PER rate of

SAEN is given in parentheses SNR level is 20 dB

Set-up 5 with four off-grid straight DoAs

SAEN EN Lasso OMP CoSaMP

PER (0999) 0649 0349 0328 0 0

RMSE 0899 0947 0943 1137 8909

Set-up 6 with four off-grid oblique DoAs

SAEN EN Lasso OMP CoSaMP

PER (0794) 0683 0336 0336 0 0005

RMSE 0811 0913 0911 1360 28919

0 6 12 18 24 30

SNR [dB]

0

02

04

06

08

1

PE

R

FIG 5 (Color online) PER rates of CBF methods at different

SNR levels for set-up 7

D More targets and varying SNR levels

Next we consider the set-up 7 (cf Table II) whichcontains K = 4 sources The first three of the sourcesare at straight DoAs and the fourth one at a DoA withmodest obliqueness (θ4 = 18o) We now compute thePER rates of the methods as a function of SNR From thePER rates shown in Fig 5 we again notice that SAENclearly outperforms all of the other methods Note thatthe upper bound (16) of the PER rate of the SAEN is alsoplotted Both greedy algorithms OMP and CoSaMP areperforming very poorly even at high SNR levels Lassoand EN are attaining better recovery results than thegreedy algorithms Again EN is performing better thanLasso due to additional flexibility offered by EN tuningparameter and its ability to cope with correlated steering(basis) vectors SAEN recovers the exact true support inmost of the cases due to its step-wise adaptation usingcleverly chosen weights Furthermore the improvementin PER rates offered by SAEN becomes larger as the SNRlevel increases One can also notice that SAEN is close tothe theoretical upper bound of PER rate at higher SNRregime

VIII. CONCLUSIONS

We developed the c-PW-WEN algorithm, which computes weighted elastic net solutions at the knots of the penalty parameter over a grid of EN tuning parameter values. c-PW-WEN also computes the weighted Lasso as a special case (i.e., the solution at α = 1), and the adaptive EN (AEN) is obtained when adaptive (data-dependent) weights are used. We then proposed a novel SAEN approach that uses the c-PW-WEN method as its core computational engine and employs a three-step adaptive weighting scheme in which the sparsity level is decreased from 3K to K in three steps. Simulations illustrated that SAEN performs better than the adaptive EN approaches. Furthermore, we illustrated that the 3K-sparse initial solution computed at step 1 of SAEN provides smart weights for the subsequent steps and includes the true K-sparse support with high accuracy. The proposed SAEN algorithm thus accurately retains the true support at each step.
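A schematic sketch of this three-step weighting scheme follows. Here `wen_solve` is a placeholder for the c-PW-WEN solver (not reproduced here), assumed to return a k-sparse weighted-EN solution for given weights, and the reciprocal-magnitude weights are one standard adaptive choice rather than a verbatim transcription of the paper's algorithm.

```python
# Schematic sketch of sequential adaptive weighting (sparsity 3K -> 2K -> K),
# under the stated assumptions about the interface of `wen_solve`.
import numpy as np

def saen(X, y, K, wen_solve):
    p = X.shape[1]
    w = np.ones(p)                          # step 1: uniform weights, 3K-sparse fit
    beta = wen_solve(X, y, k=3 * K, weights=w)
    for k in (2 * K, K):                    # steps 2 and 3: adaptive re-weighting
        with np.errstate(divide="ignore"):
            w = 1.0 / np.abs(beta)          # small coefficients get large penalties;
        beta = wen_solve(X, y, k=k, weights=w)  # zeros (infinite weight) stay excluded
    return beta
```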


Using the K-sparse Lasso solution computed directly from the Lasso path at the Kth knot fails to provide exact support recovery in many cases, especially under high basis coherence and at lower SNR. Greedy algorithms often fail in the face of high mutual coherence (due to dense grid spacing or oblique target DoAs) or low SNR. This is mainly due to the fact that their performance depends heavily on their ability to accurately detect maximal correlation between the measurement vector y and the basis vectors (column vectors of X). Our simulation study also showed that their performance (in terms of PER rate) deteriorates when the number of targets increases. In the off-grid case, the greedy algorithms also failed to find the nearby grid points.
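The mutual coherence referred to here is the largest absolute inner product between distinct l2-normalized columns of X; a small helper to compute it:

```python
# Mutual coherence of the steering (basis) matrix X.
import numpy as np

def mutual_coherence(X):
    Xn = X / np.linalg.norm(X, axis=0, keepdims=True)  # normalize columns
    G = np.abs(Xn.conj().T @ Xn)                       # absolute column correlations
    np.fill_diagonal(G, 0.0)                           # ignore self-correlations
    return G.max()   # approaches 1 for dense grids or closely spaced/oblique DoAs
```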

Finally, the SAEN algorithm performed better than all other methods in each set-up, and the improvement was more pronounced in the presence of high mutual coherence. This is due to the ability of SAEN to include the true support correctly at all three steps of the algorithm. Our MATLAB® package implementing the proposed algorithms is freely available at www.github.com/mntabassm/SAEN-LARS. The package also contains a MATLAB® live script demo on how to use the method in the CBF problem, along with an example from simulation set-up 4 presented in the paper.

ACKNOWLEDGMENTS

The research was partially supported by the Academy of Finland (grant no. 298118), which is gratefully acknowledged.

1. R. Tibshirani, "Regression shrinkage and selection via the lasso," Journal of the Royal Statistical Society, Series B (Methodological), 267-288 (1996).

2. H. Zou and T. Hastie, "Regularization and variable selection via the elastic net," Journal of the Royal Statistical Society, Series B (Statistical Methodology) 67(2), 301-320 (2005).

3. T. Yardibi, J. Li, P. Stoica, and L. N. Cattafesta III, "Sparsity constrained deconvolution approaches for acoustic source mapping," The Journal of the Acoustical Society of America 123(5), 2631-2642 (2008). doi: 10.1121/1.2896754

4. A. Xenaki, E. Fernandez-Grande, and P. Gerstoft, "Block-sparse beamforming for spatially extended sources in a Bayesian formulation," The Journal of the Acoustical Society of America 140(3), 1828-1838 (2016).

5. D. Malioutov, M. Cetin, and A. S. Willsky, "A sparse signal reconstruction perspective for source localization with sensor arrays," IEEE Transactions on Signal Processing 53(8), 3010-3022 (2005). doi: 10.1109/TSP.2005.850882

6. G. F. Edelmann and C. F. Gaumond, "Beamforming using compressive sensing," The Journal of the Acoustical Society of America 130(4), EL232-EL237 (2011). doi: 10.1121/1.3632046

7. A. Xenaki, P. Gerstoft, and K. Mosegaard, "Compressive beamforming," The Journal of the Acoustical Society of America 136(1), 260-271 (2014). doi: 10.1121/1.4883360

8. P. Gerstoft, A. Xenaki, and C. F. Mecklenbräuker, "Multiple and single snapshot compressive beamforming," The Journal of the Acoustical Society of America 138(4), 2003-2014 (2015). doi: 10.1121/1.4929941

9. Y. Choo and W. Seong, "Compressive spherical beamforming for localization of incipient tip vortex cavitation," The Journal of the Acoustical Society of America 140(6), 4085-4090 (2016). doi: 10.1121/1.4968576

10. A. Das, W. S. Hodgkiss, and P. Gerstoft, "Coherent multipath direction-of-arrival resolution using compressed sensing," IEEE Journal of Oceanic Engineering 42(2), 494-505 (2017). doi: 10.1109/JOE.2016.2576198

11. S. Fortunati, R. Grasso, F. Gini, M. S. Greco, and K. LePage, "Single-snapshot DOA estimation by using compressed sensing," EURASIP Journal on Advances in Signal Processing 2014(1), 120 (2014). doi: 10.1186/1687-6180-2014-120

12. E. Ollila, "Nonparametric simultaneous sparse recovery: An application to source localization," in Proc. European Signal Processing Conference (EUSIPCO'15), Nice, France (2015), pp. 509-513.

13. E. Ollila, "Multichannel sparse recovery of complex-valued signals using Huber's criterion," in Proc. Compressed Sensing Theory and its Applications to Radar, Sonar and Remote Sensing (CoSeRa'15), Pisa, Italy (2015), pp. 32-36.

14. M. N. Tabassum and E. Ollila, "Single-snapshot DOA estimation using adaptive elastic net in the complex domain," in 4th International Workshop on Compressed Sensing Theory and its Applications to Radar, Sonar and Remote Sensing (CoSeRa) (2016), pp. 197-201. doi: 10.1109/CoSeRa.2016.7745728

15. T. Hastie, R. Tibshirani, and M. Wainwright, Statistical Learning with Sparsity: The Lasso and Generalizations (CRC Press, 2015).

16. M. Osborne, B. Presnell, and B. Turlach, "A new approach to variable selection in least squares problems," IMA Journal of Numerical Analysis 20(3), 389-403 (2000). doi: 10.1093/imanum/20.3.389

17. B. Efron, T. Hastie, I. Johnstone, and R. Tibshirani, "Least angle regression (with discussion)," The Annals of Statistics 32(2), 407-499 (2004). doi: 10.1214/009053604000000067

18. M. N. Tabassum and E. Ollila, "Pathwise least angle regression and a significance test for the elastic net," in 25th European Signal Processing Conference (EUSIPCO) (2017), pp. 1309-1313. doi: 10.23919/EUSIPCO.2017.8081420

19. S. Foucart and H. Rauhut, A Mathematical Introduction to Compressive Sensing, Vol. 1 (Springer, 2013).

20. H. Zou, "The adaptive lasso and its oracle properties," Journal of the American Statistical Association 101(476), 1418-1429 (2006). doi: 10.1198/016214506000000735

21. A. E. Hoerl and R. W. Kennard, "Ridge regression: biased estimation for nonorthogonal problems," Technometrics 12(1), 55-67 (1970).

22. S. Rosset and J. Zhu, "Piecewise linear regularized solution paths," Ann. Statist. 35(3), 1012-1030 (2007). doi: 10.1214/009053606000001370

23. R. J. Tibshirani and J. Taylor, "The solution path of the generalized lasso," Ann. Statist. 39(3), 1335-1371 (2011). doi: 10.1214/11-AOS878

24. R. J. Tibshirani, "The lasso problem and uniqueness," Electron. J. Statist. 7, 1456-1490 (2013). doi: 10.1214/13-EJS815

25. A. Panahi and M. Viberg, "Fast candidate points selection in the lasso path," IEEE Signal Processing Letters 19(2), 79-82 (2012).

26. K. L. Gemba, W. S. Hodgkiss, and P. Gerstoft, "Adaptive and compressive matched field processing," The Journal of the Acoustical Society of America 141(1), 92-103 (2017). doi: 10.1121/1.4973528

27. A. B. Gershman, C. F. Mecklenbräuker, and J. F. Böhme, "Matrix fitting approach to direction of arrival estimation with imperfect spatial coherence of wavefronts," IEEE Transactions on Signal Processing 45(7), 1894-1899 (1997). doi: 10.1109/78.599968

28. J. A. Tropp and A. C. Gilbert, "Signal recovery from random measurements via orthogonal matching pursuit," IEEE Transactions on Information Theory 53(12), 4655-4666 (2007). doi: 10.1109/TIT.2007.909108

29. D. Needell and J. Tropp, "CoSaMP: Iterative signal recovery from incomplete and inaccurate samples," Applied and Computational Harmonic Analysis 26(3), 301-321 (2009).
