Phát hiện dấu hiệu phi Gaussian vũ trụ: So sánh hiệu suất của các kiểm tra thống kê khác nhau

EURASIP Journal on Applied Signal Processing 2005:15, 2470–2485

2005 Hindawi Publishing Corporation

Cosmological Non-Gaussian Signature Detection:

Comparing Performance of Different Statistical Tests

J. Jin

Department of Statistics, Purdue University, 150 N. University Street, West Lafayette, IN 47907, USA

Email: jinj@stat.purdue.edu

J.-L. Starck

DAPNIA/SEDI-SAP, Service d’Astrophysique, CEA-Saclay, 91191 Gif-sur-Yvette Cedex, France

Email: jstarck@cea.fr

D. L. Donoho

Department of Statistics, Stanford University, Sequoia Hall, Stanford, CA 94305, USA

Email: donoho@stat.stanford.edu

N. Aghanim

IAS-CNRS, Universit´

e Paris Sud, Bˆ

atiment 121, 91405 Orsay Cedex, France

Email: aghanim@ias.fr

Division of Theoretical Astronomy, National Astronomical Observatory of Japan, Osawa 2-21-1,

Mitaka, Tokyo 181-8588, Japan

O. Forni

IAS-CNRS, Universit´

e Paris Sud, Bˆ

atiment 121, 91405 Orsay Cedex, France

Email: olivier.forni@ias.fr

Received 30 June 2004

Currently, it appears that the best method for non-Gaussianity detection in the cosmic microwave background (CMB) consists

in calculating the kurtosis of the wavelet coefficients. We know that wavelet-kurtosis outperforms other methods such as the bis-

pectrum, the genus, ridgelet-kurtosis, and curvelet-kurtosis on an empirical basis, but relatively few studies have compared other

transform-based statistics, such as extreme values, or more recent tools such as higher criticism (HC), or proposed “best possible”

choices for such statistics. In this paper, we consider two models for transform-domain coefficients: (a) a power-law model, which

seems suited to the wavelet coefficients of simulated cosmic strings, and (b) a sparse mixture model, which seems suitable for the

curvelet coefficients of filamentary structure. For model (a), if power-law behavior holds with finite 8th moment, excess kurtosis

is an asymptotically optimal detector, but if the 8th moment is not finite, a test based on extreme values is asymptotically optimal.

For model (b), if the transform coefficients are very sparse, a recent test, higher criticism, is an optimal detector, but if they are

dense, kurtosis is an optimal detector. Empirical wavelet coefficients of simulated cosmic strings have power-law character, infinite

8th moment, while curvelet coefficients of the simulated cosmic strings are not very sparse. In all cases, excess kurtosis seems to

be an effective test in moderate-resolution imagery.

Keywords and phrases: cosmology, cosmological microwave background, non-Gaussianity detection, multiscale method, wavelet,

curvelet.

1. INTRODUCTION

The cosmic microwave background (CMB), discovered in

1965 by Penzias and Wilson [1], is a relic of radiation emit-

ted some 13 billion years ago, when the universe was about

370 000 years old. This radiation exhibits characteristic of an

almost perfect blackbody at a temperature of 2.726 kelvins

as measured by the FIRAS experiment on board COBE satel-

lite [2]. The DMR experiment, again on board COBE, de-

tected and measured angular small fluctuations of this tem-

perature, at the level of a few tens of microkelvins, and at

angular scale of about 10 degrees [3]. These so-called tem-

perature anisotropies were predicted as the imprints of the

initial density perturbations which gave rise to present large

Comparing Different Statistics in Non-Gaussian 2471

Figure 1: Courtesy of the WMAP team (reference to the website).

All sky map of the CMB anisotropies measured by the WMAP satel-

lite.

scale structures as galaxies and clusters of galaxies. This rela-

tion between the present-day universe and its initial condi-

tions has made the CMB radiation one of the preferred tools

of cosmologists to understand the history of the universe,

the formation and evolution of the cosmic structures, and

physical processes responsible for them and for their cluster-

ing.

As a consequence, the last several years have been a par-

ticularly exciting period for observational cosmology focus-

ing on the CMB. With CMB balloon-borne and ground-

based experiments such as TOCO [4], BOOMERanG [5],

MAXIMA [6], DASI [7], and Archeops [8],afirmdetection

of the so-called “first peak” in the CMB anisotropy angular

power spectrum at the degree scale was obtained. This detec-

tion has been very recently confirmed by the WMAP satel-

lite [9,10] (see Figure 1), which detected also the second and

third peaks. WMAP satellite mapped the CMB temperature

fluctuations with a resolution better than 15 arcminutes and

a very good accuracy marking the starting point of a new

era of precision cosmology that enables us to use the CMB

anisotropy measurements to constrain the cosmological pa-

rameters and the underlying theoretical models.

In the framework of adiabatic cold dark matter models,

the position, amplitude, and width of the first peak indeed

provide strong evidence for the inflationary predictions of a

flat universe and a scale-invariant primordial spectrum for

the density perturbations. Furthermore, the presence of sec-

ond and third peaks confirms the theoretical prediction of

acoustic oscillations in the primeval plasma and sheds new

light on various cosmological and inflationary parameters,

in particular, the baryonic content of the universe. The accu-

rate measurements of both the temperature anisotropies and

polarised emission of the CMB will enable us in the very near

future to break some of the degeneracies that are still affect-

ing parameter estimation. It will also allow us to probe more

directly the inflationary paradigm favored by the present ob-

servations.

Testing the inflationary paradigm can also be achieved

through detailed study of the statistical nature of the CMB

anisotropy distribution. In the simplest inflation models,

the distribution of CMB temperature fluctuations should be

Gaussian, and this Gaussian field is completely determined

by its power spectrum. However, many models such as mul-

tifield inflation (e.g., [11] and the references therein), su-

per strings, or topological defects predict non-Gaussian con-

tributions to the initial fluctuations [12,13,14]. The sta-

tistical properties of the CMB should discriminate mod-

els of the early universe. Nevertheless, secondary effects like

the inverse Compton scattering, the Doppler effect, lensing,

and others add their own contributions to the total non-

Gaussianity.

All these sources of non-Gaussian signatures might have

different origins and thus different statistical and morpho-

logical characteristics. It is therefore not surprising that a

large number of studies have recently been devoted to the

subject of the detection of non-Gaussian signatures. Many

approaches have been investigated: Minkowski functionals

and the morphological statistics [15,16], the bispectrum

(3-point estimator in the Fourier domain) [17,18,19],

the trispectrum (4-point estimator in the Fourier domain)

[20], wavelet transforms [21,22,23,24,25,26,27], and

the curvelet transform [27]. Different wavelet methods have

been studied, such as the isotropic `

a trous algorithm [28]

and the biorthogonal wavelet transform [29]. (The biorthog-

onal wavelet transform was found to be the most sensitive

to non-Gaussianity [27].) In [27,30], it was shown that the

wavelet transform was a very powerful tool to detect the non-

Gaussian signatures. Indeed, the excess kurtosis (4th mo-

ment) of the wavelet coefficients outperformed all the other

methods (when the signal is characterised by a nonzero 4th

moment).

Nevertheless, a major issue of the non-Gaussian studies

in CMB remains our ability to disentangle all the sources of

non-Gaussianity from one another. Recent progress has been

made on the discrimination between different possible ori-

gins of non-Gaussianity. Namely, it was possible to separate

the non-Gaussian signatures associated with topological de-

fects (cosmic strings (CS)) from those due to Doppler effect

of moving clusters of galaxies (both dominated by a Gaussian

CMB field) by combining the excess kurtosis derived from

both the wavelet and the curvelet transforms [27].

This success argues for us to construct a “toolkit” of well-

understood and sensitive methods for probing different as-

pects of the non-Gaussian signatures.

In that spirit, the goal of the present study is to consider

the advantages and limitations of detectors which apply kur-

tosis to transform coefficients of image data. We will study

plausible models for transform coefficients of image data and

compare the performance of tests based on kurtosis of trans-

form coefficients to other types of statistical diagnostics.

At the center of our analysis are the following two facts:

(A) the wavelet/curvelet coefficients of CMB are Gaussian

(we implicitly assume the most simple inflationary

scenario);

(B) the wavelet/curvelet coefficients of topological defect

and Doppler effect simulations are non-Gaussian.

We develop tests for non-Gaussianity for two models of

statistical behavior of transform coefficients. The first, better

suited for wavelet analysis, models transform coefficients of

cosmic strings as following a power law. The second, theoret-

ically better suited for curvelet coefficients, assumes that the

2472 EURASIP Journal on Applied Signal Processing

salient features of interest are actually filamentary (it can be

residual strips due to a nonperfect calibration), which gives

the curvelet coefficients a sparse structure.

We review some basic ideas from detection theory, such

as likelihood ratio detectors, and explain why we prefer non-

parametric detectors, valid across a broad range of assump-

tions.

In the power-law setting, we consider two kinds of non-

parametric detectors. The first, based on kurtosis, is asymp-

totically optimal in the class of weakly dependent symmetric

non-Gaussian contamination with finite 8th moments. The

second, the Max, is shown to be asymptotically optimal in the

class of weakly dependent symmetric non-Gaussian contam-

ination with infinite 8th moment. While the evidence seems

to be that wavelet coefficients of CS have about 6 existing

moments, indicating a decisive advantage for extreme-value

statistics, the performance of kurtosis-based tests and Max-

based tests on moderate sample sizes (e.g., 64 K transform

coefficients) does not follow the asymptotic theory; excess

kurtosis works better at these sample sizes.

In the sparse-coefficients setting, we consider kurtosis,

the Max, and a recent statistic called higher criticism (HC)

[31]. Theoretical analysis suggests that curvelet coefficients

of filamentary features should be sparse, with about n1/4sub-

stantial nonzero coefficients out of ncoefficients in a sub-

band; this level of sparsity would argue in favor of Max/HC.

However, empirically, the curvelet coefficients of actual CS

simulations are not very sparse. It turns out that kurtosis out-

performs Max/HC in simulation.

Summarizing, the work reported here seems to show

that for all transforms considered, the excess kurtosis out-

performs alternative methods despite their strong theoreti-

cal motivation. A reanalysis of the theory supporting those

methods shows that the case for kurtosis can also be justi-

fied theoretically based on observed statistical properties of

the transform coefficients not used in the original theoretic

analysis.

2. DETECTING FAINT NON-GAUSSIAN SIGNALS

SUPERPOSED ON A GAUSSIAN SIGNAL

The superposition of a non-Gaussian signal with a Gaussian

signal can be modeled as Y=N+G,whereYis the ob-

served image, Nis the non-Gaussian component, and Gis

the Gaussian component. We are interested in using trans-

form coefficients to test whether N≡0ornot.

2.1. Hypothesis testing and likelihood ratio test

Transfor m coefficients of various kinds (Fourier, wavelet,

etc.) have been used for detecting non-Gaussian behavior in

numerous studies. Let X1,X2,...,Xnbe the transform coeffi-

cients of Y; we model these as

Xi=1−λ·zi+λ·wi,0<λ<1, (1)

where λ>0isaparameter,ziiid

∼N(0, 1) are the trans-

form coefficients of the Gaussian component G,wiiid

∼Ware

the transform coefficients of the non-Gaussian component

N,andWis some unknown symmetrical distribution. Here

without loss of generality, we assume the standard deviation

for both ziand wiis 1.

Phrased in statistical terms, the problem of detecting the

existence of a non-Gaussian component is equivalent to dis-

criminating between the hypotheses

H0:Xi=zi,(2)

H1:Xi=1−λ·zi+λ·wi,0<λ<1, (3)

and N≡0isequivalenttoλ≡0. We call H0the null hypoth-

esis and H1the alternative hypothesis.

When both Wand λare known, then the optimal test for

problem (2)-(3) is simply the Neyman-Pearson likelihood

ratio test (LRT) [32,page74].Thesizeofλ=λnfor which

reliable discrimination between H0and H1is possible can be

derived using asymptotics. If we assume that the tail proba-

bility of Wdecays algebraically,

lim

x→∞ xαP{|W|>x}=Cα,Cαis a constant (4)

(we say Whas a power-law tail), and we calibrate λto de-

cay with n, so that increasing amounts of data are offset by

increasingly hard challenges:

λ=λn=n−r,(5)

then there is a threshold effect for the detection problem (2)-

(3). In fact, define

ρ∗

1(α)=









α,α≤8,

4,α>8,

(6)

then, as n→∞, LRT is able to reliably detect for large nwhen

r<ρ

∗

1(α), and is unable to detect when r>ρ

∗

1(α); this is

proved in [33]. Since LRT is optimal, it is not possible for

any statistic to reliably detect when r>ρ

∗

1(α). We call the

curve r=ρ∗

1(α) in the α-rplane the detection boundary;see

Figure 2.

In fact, when r<1/4, asymptotically LRT is able to reli-

ably detect whenever Whas a finite 8th moment, even with-

out the assumption that Whas a power-law tail. Of course,

the case that Whas an infinite 8th moment is more compli-

cated, but if Whas a power-law tail, then LRT is also able to

reliably detect if r<2/α.

Despite its optimality, LRT is not a practical procedure.

To apply LRT, one needs to specify the value of λand the

distribution of W, which seems unlikely to be available. We

need nonparametric detectors, which can be implemented

without any knowledge of λor W, and depend on Xi’s

only. In the section below, we are going to introduce two

nonparametric detectors: excess kurtosis and Max; later in

Section 4.3, we will introduce a third nonparametric detec-

tor: higher criticism (HC).

Comparing Different Statistics in Non-Gaussian 2473

0.9

0.8

0.7

0.6

0.5

0.4

0.3

0.2

0.1

2 4 6 8 10 12 14 16 18

Undetectable

Detectable

for Max/HC

not for kurtosis

Detectable for kurtosis

not for Max/HC

Detectable for both kurtosis and Max/HC

Figure 2: Detectable regions in the α-rplane. With (α,r)inthe

white region on the top or in the undetectable region, all methods

completely fail for detection. With (α,r) in the white region on the

bottom, both excess kurtosis and Max/HC are able to detect reliably.

In the shaded region to the left, Max/HC is able to detect reliably,

but excess kurtosis completely fails, and in the shaded region to the

right, excess kurtosis is able to detect reliably, but Max/HC com-

pletely fails.

2.2. Excess kurtosis and Max

We pause to review the concept of p-value briefly. For a

statistic Tn, the p-value is the probability of seeing equally

extreme results under the null hypothesis:

p=PH0Tn≥tnX1,X2,...,Xn;(7)

here PH0refers to probability under H0,and

tn(X1,X2,...,Xn) is the observed value of statistic Tn.

Notice that the smaller the p-value, the stronger the evidence

against the null hypothesis. A natural decision rule based

on p-values rejects the null when p<αfor some selected

level α, and a convenient choice is α=5%. When the null

hypothesis is indeed true, the p-values for any statistic

are distributed as uniform U(0, 1). This implies that the

p-values provide a common scale for comparing different

statistics.

We now introduce two statistics for comparison.

Excess kurtosis (κn)

Excess kurtosis is a widely used statistic, based on the 4th mo-

ment. For any (symmetrical) random variable X, the kurtosis

κ(X)=EX4

EX22−3.(8)

The kurtosis measures a kind of departure of Xfrom Gaus-

sianity, as κ(z)=0.

Empirically, given nrealizations of X, the excess kurtosis

statistic is defined as

κnX1,X2,...,Xn=n

24 (1/n)iX4

(1/n)iX2

i2−3.(9)

When the null is true, the excess kurtosis statistic is asymp-

totically normal:

κnX1,X2,...,Xn−→ wN(0, 1), n−→ ∞ , (10)

thus for large n, the p-value of the excess kurtosis is approxi-

mately

p=¯

Φ−1κnX1,X2,...,Xn, (11)

where ¯

Φ(·) is the survival function (upper-tail probability)

of N(0, 1).

It is proved in [33] that the excess kurtosis is asymptoti-

cally optimal for the hypothesis testing of (2)-(3)if

EW8<∞.(12)

However, when E[W8]=∞, even though kurtosis is well

defined (E[W4]<∞), there are situations in which LRT is

able to reliably detect but excess kurtosis completely fails. In

fact, by assuming (4)-(5)withanα<8, if (α,r) falls into

the shaded region to the left of Figure 2, then LRT is able to

reliably detect, however, excess kurtosis completely fails. This

shows that in such cases, excess kurtosis is not optimal; see

[33].

Max (Mn)

The largest (absolute) observation is a classical and fre-

quently used nonparametric statistic:

Mn=max 

X1

,

X2

,...,

Xn

, (13)

under the null hypothesis,

Mn≈2logn, (14)

and, moreover, by normalizing Mnwith constants cnand dn,

the resulting statistic converges to the Gumbel distribution

Ev, whose cdf is e−e−x:

Mn−cn

dn−→ wEv, (15)

where approximately

dn=√6Sn

π,cn=¯

X−0.5772dn; (16)

here ¯

Xand Snare the sample mean and sample standard de-

viation of {Xi}n

i=1, respectively. Thus a good approximation

of the p-value for Mnis

p=exp −exp −Mn−cn

dn.(17)

We have tried the above experiment for n=2442,andfound

that taking cn=4.2627, dn=0.2125 gives a good approxi-

mation.

2474 EURASIP Journal on Applied Signal Processing

Assuming (4)-(5)andα<8, or λ=n−r, and that Whas

a power-law tail with α<8, it is proved in [33] that Max

is optimal for hypothesis testing (2)-(3). Recall if we further

assume 1/4<r<2/α, then asymptotically excess kurtosis

completely fails; however, Max is able to reliably detect and is

competitive to LRT.

On the other hand, recall that excess kurtosis is optimal

for the case α>8. In comparison, in this case, Max is not op-

timal. In fact, if we further assume 2/α < r < 1/4, then excess

kurtosis is able to reliably detect, but Max will completely fail.

In Figure 2, we compared the detectable regions of the

excess kurtosis and Max in the α-rplane.

To conclude this section, we mention an alternative way

to approximate the p-values for any statistic Tn. This alterna-

tive way is important in case that an asymptotic (theoretic)

approximation is poor for moderate large n, an example is

the statistic HC∗

nwe will introduce in Section 4.3; this alter-

native way is helpful even when the asymptotic approxima-

tion is accurate. Now the idea is that, under the null hypoth-

esis, we simulate a large number (N=104or more) of Tn:

T(1)

n,T(2)

n,...,T(N)

n, we then tabulate them. For the observed

value tn(X1,X2,...,Xn), the p-value will then be well approx-

imated by

N·#k:T(k)

n≥tnX1,X2,...,Xn, (18)

and the larger the N, the better the approximation.

2.3. Heuristic approach

We have exhibited a phase-change phenomenon, where the

asymptotically optimal test changes depending on power-law

index α. In this section, we develop a heuristic analysis of

detectability and phase change.

The detection property of Max follows from compar-

ing the ranges of data. Recall that Xi=1−λn·zi+

λn·wi, the range of {zi}n

i=1is roughly (−2logn,2logn),

and the range of {λn·wi}n

i=1is λn·(−n1/α,n1/α)=

(−n1/α−r/2,n1/α−r/2); so, heuristically,

Mn≈max 2logn,n1/α−r/2; (19)

for large n, notice that

n1/α−r/2≫2logn,ifr<2

α,

n1/α−r/2≪2logn,ifr>2

α,

(20)

thus if and only if r<2/α,Mnfor the alternative will differ

significantly from Mnfor the null, and so the criterion for

detectability by Max is r<2/α.

Now we study detection by excess kurtosis. Heuristically,

κn≈1

√24 ·κ1−λn·zi+λn·wi

=1

√24 ·√n·λ2

n·κ(W)=On1/2−2r,

(21)

thus if and only if r<1/4, κnfor the alternative will differ

significantly from κnunder the null, and so the criterion for

detectability by excess kurtosis is r<1/4.

This analysis shows the reason for the phase change. In

Figure 2, when the parameter (α,r) is in the shaded region to

the left, for sufficiently large n,n1/α−r/2≫2lognand the

strongest evidence against the null is in the tails of the data

set, which Mnis indeed using. However, when (α,r)moves

from the shaded region to the left to the shaded region to

the right, n1/α−r/2≪2logn, the tails no longer contain any

important evidence against the null, instead, the central part

of the data set contains the evidence. By symmetry, the 1st

and the 3rd moments vanish, and the 2nd moment is 1 by

the normalization; so the excess kurtosis is in fact the most

promising candidate of detectors based on moments.

The heuristic analysis is the essence for theoretic proof

as well as empirical experiment. Later in Section 3.4,wewill

have more discussions for comparing the excess kurtosis with

Max down this vein.

3. WAVELET COEFFICIENTS OF COSMIC STRINGS

3.1. Simulated astrophysical signals

The temperature anisotropies of the CMB contain the con-

tributions of both the primary cosmological signal, directly

related to the initial density perturbations, and the secondary

anisotropies. The latter are generated after matter-radiation

decoupling [34]. They arise from the interaction of the CMB

photons with the neutral or ionised matter along their path

[35,36,37].

In the present study, we assume that the primary CMB

anisotropies are dominated by the fluctuations generated

in the simple single field inflationary cold-dark-matter

model with a nonzero cosmological constant. The CMB

anisotropies have therefore a Gaussian distribution. We al-

low for a contribution to the primary signal from topological

defects, namely, cosmic strings (CS), as suggested in [38,39].

See Figures 3and 4.

We use for our simulations the cosmological parame-

ters obtained from the WMAP satellite [10]andanormal-

ization parameter σ8=0.9. Finally, we obtain the so-called

“simulated observed map,” D, that contains the two pre-

vious astrophysical components. It is obtained from Dλ=

√1−λCMB + √λCS, where CMB and CS are, respectively,

the CMB and the cosmic string simulated maps. λ=0.18

is an upper limit constant derived by [38]. All the simulated

maps have 500 ×500 pixels with a resolution of 1.5 arcmin-

utes per pixel.

3.2. Evidence for E[W8]=∞

For the wavelet coefficients on the finest scale of the cosmic

string map in Figure 3b, by throwing away all the coefficients

related to pixels on the edge of the map, we have n=2442

coefficients; we then normalize these coefficients so that the

empirical mean and standard deviation are 0 and 1, respec-

tively; we denote the resulting dataset by {wi}n

i=1.

Báo cáo hóa học: " Cosmological Non-Gaussian Signature Detection: Comparing Performance of Different Statistical Tests"

Tuyển tập báo cáo các nghiên cứu khoa học quốc tế ngành hóa học dành cho các bạn yêu hóa học tham khảo đề tài: Cosmological Non-Gaussian Signature Detection: Comparing Performance of Different Statistical Tests

Chủ đề:

Tài liệu liên quan

Tài liêu mới

AI tóm tắt

Giới thiệu tài liệu

Đối tượng sử dụng

Từ khoá chính

Nội dung tóm tắt

Hỗ trợ

Phương thức thanh toán

Theo dõi chúng tôi