1. Introduction
The forthcoming generation of experiments targeting the large-scale cosmic structure will provide us with data of exquisite quality, from which it will be possible to extract cosmological information to test our current cosmological model (ΛCDM), for instance by investigating the nature of dark energy and dark matter. The two main probes envisaged for such experiments are weak gravitational lensing and galaxy clustering. In this paper, we shall focus on the latter.
Forthcoming galaxy surveys, such as the Euclid satellite (Amendola et al., 2013, 2018; Laureijs et al., 2011), the Legacy Survey of Space and Time (LSST Science Collaboration et al., 2009), and the Square Kilometre Array (Bacon et al., 2020), will be characterised by a high computational cost in their analysis, motivating the search for new optimised methods. For this reason, this work aims at developing an improved analysis technique, taking inspiration from Camera et al. (2018). In particular, we adopt the philosophy presented there and implement it in a likelihood-based approach, simulating a synthetic data set that we then fit against the theoretical model predictions.
This paper is outlined as follows. In Section 2, we introduce the survey assumptions considered throughout our analysis, present the harmonic-space angular power spectrum for galaxy clustering, describe in detail the optimised method, and show the likelihood and the scale cuts applied for the analysis. In Section 3, we discuss the results obtained with the standard and the optimised method. Finally, conclusions are presented in Section 4.
2. Methods
2.1. Survey assumptions
We adopt the same survey specifications as Camera et al. (2018; see their Section 2.2 and references therein for details), who first proposed the method and tested it via a Fisher matrix analysis. Specifically, we consider a spectroscopic galaxy survey targeting Hα emitters in the redshift range between 0.6 and 2, whose redshift accuracy can be modelled as a redshift-dependent Gaussian uncertainty on the measured redshift, with width $ {\sigma}_z=0.001\left(1+z\right) $ . The linear galaxy bias is modelled as $ b(z)=\sqrt{1+z} $ .
2.2. The harmonic-space galaxy power spectrum
The harmonic-space (also, angular) power spectrum is the natural tool to probe fluctuations in the observed galaxy distribution as measured from our point of view as observers. For large multipole values, $ \ell \gg 1 $ , it is possible to employ the Limber approximation (Kaiser, 1992; Limber, 1953), which reduces the computational effort by collapsing a three-dimensional integral into a one-dimensional one. Under this assumption, the theoretical power spectrum of galaxy number counts for the redshift bin pair i − j and on linear scales reads
$$ {C}_{\ell}^{ij}=\int \mathrm{d}\chi \frac{W^i\left(\chi \right){W}^j\left(\chi \right)}{\chi^2}{P}_{\mathrm{lin}}\left(k=\frac{\ell +1/2}{\chi}\right),\qquad (1) $$
where $ \chi (z) $ is the comoving distance to redshift z,
$$ {W}^i\left(\chi \right)={n}^i\left(\chi \right)b\left(\chi \right)D\left(\chi \right) $$
is the window function in the ith redshift bin, with $ {n}^i\left(\chi \right) $ its normalised galaxy distribution, $ b\left(\chi \right) $ the linear galaxy bias, and $ D\left(\chi \right) $ the linear growth factor. Finally, $ {P}_{\mathrm{lin}}(k) $ is the linear matter power spectrum at z = 0, which is here provided by the Boltzmann solver CAMB (Lewis et al., 2000).
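As a rough illustration of the Limber integral described above, the following Python sketch evaluates it on a grid of comoving distances with a trapezoidal rule. The Gaussian window shapes, the toy spectrum `toy_p_lin`, and the distance grid are placeholders chosen for illustration only; they are not the survey windows or the CAMB spectrum used in the analysis, and the mapping k = (ℓ + 1/2)/χ is one common Limber convention assumed here.

```python
import numpy as np

def trapezoid(y, x):
    """Trapezoidal rule on a (possibly non-uniform) grid."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def limber_cl(ell, chi, w_i, w_j, p_lin):
    """Limber integral: int dchi W_i W_j / chi^2 * P_lin(k = (ell + 1/2) / chi)."""
    k = (ell + 0.5) / chi
    return trapezoid(w_i * w_j / chi**2 * p_lin(k), chi)

def gaussian_window(chi, centre, width):
    """Hypothetical window, normalised to unit integral (bias and growth set to 1)."""
    w = np.exp(-0.5 * ((chi - centre) / width) ** 2)
    return w / trapezoid(w, chi)

def toy_p_lin(k):
    """Toy spectrum with a turnover; a stand-in for a Boltzmann-code output."""
    return 1.0e4 * (k / 0.02) / (1.0 + (k / 0.02) ** 2) ** 1.5

chi = np.linspace(1500.0, 3500.0, 4001)   # comoving distance grid [Mpc], illustrative
w1 = gaussian_window(chi, 2400.0, 30.0)
w2 = gaussian_window(chi, 2600.0, 30.0)

cl_11 = limber_cl(200, chi, w1, w1, toy_p_lin)   # auto-bin spectrum
cl_12 = limber_cl(200, chi, w1, w2, toy_p_lin)   # cross-bin spectrum
```

In this toy set-up the two windows barely overlap, so the cross-bin spectrum is strongly suppressed relative to the auto-bin one, which is the generic behaviour of Limber spectra for well-separated bins.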
It is worth noting that the observed clustering of galaxies contains other terms on top of the one described above, which is due to perturbations in the underlying matter density distribution (Bonvin & Durrer, 2011; Challinor & Lewis, 2011). The most notable of such terms are redshift-space distortions (RSD) and lensing magnification. However, these terms are suppressed on the scales of interest in this analysis and for the bin sizes we adopt, so we can safely neglect them.
2.3. The traditional approach
On the one hand, data from spectroscopic galaxy surveys have customarily been analysed in terms of the Fourier-space power spectrum and its decomposition into Legendre multipoles. Whilst this approach has worked perfectly for the redshift and sky coverages of the data collected hitherto, it is arguable that some of its underlying assumptions will no longer be met with the next generation of cosmological experiments (see e.g. Blake et al., 2018; Ruggeri et al., 2018). Moreover, the fluctuations in the observed galaxy number counts contain terms that cannot be decomposed in Fourier modes, like the nonlocal contribution from gravitational lensing, which will be all the more important for deeper surveys (Camera et al., 2015; Cardona et al., 2016).
On the other hand, the standard tomographic approach for the computation of the galaxy angular power spectrum $ {C}_{\ell}^{ij} $ is based on all correlations among bin pairs i – j across the whole redshift range. Now, the benchmark survey described in Section 2.1 will easily be able to slice the observed galaxy distribution in bins of width ~0.01, which, for the redshift range considered, results in about $ {10}^4 $ auto- and cross-bin correlations. This number has to be further multiplied by the number of multipole bins into which the data are binned. This is clearly computationally unfeasible for a likelihood-based analysis scanning, at the very least, the six-dimensional parameter space of the ‘vanilla’ ΛCDM model.
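The size of the data vector quoted above follows from simple counting, sketched here in Python. The number of multipole bins `n_ell` is an illustrative assumption, and the helper name `n_spectra` is hypothetical.

```python
def n_spectra(n_bins):
    """Number of distinct auto- plus cross-bin spectra for n_bins tomographic bins."""
    return n_bins * (n_bins + 1) // 2

z_min, z_max, dz = 0.6, 2.0, 0.01
n_bins = round((z_max - z_min) / dz)   # 140 bins of width 0.01
n_pairs = n_spectra(n_bins)            # 9870, i.e. about 10^4
n_ell = 5                              # assumed number of multipole bins
n_data = n_pairs * n_ell               # length of the full data vector
```

The associated covariance matrix has `n_data` squared entries, which is what makes the full tomographic analysis so expensive inside a sampler.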
2.4. The new method
This conundrum motivates the search for new strategies to analyse the data sets of forthcoming surveys. Among the different proposals, we follow Camera et al. (2018), who proposed to combine relevant aspects of the two standard techniques described above. Fourier-space analyses usually employ a thick redshift binning, e.g. with width $ \Delta z\approx 0.1 $ ; all Fourier modes inside the bin are then considered, but the correlations among adjacent z-bins are not taken into account. However, applying this approach at face value to the harmonic-space $ {C}_{\ell}^{ij} $ means losing information by squashing all the galaxies within the relatively large ∆z bin onto a single redshift slice.
Hence, the idea is to combine the two approaches in a ‘hybrid’ method. This method is characterised by two binning tiers: the galaxy distribution is binned by adopting a set of top-hat thick bins; each of these is further divided into equi-spaced top-hat thin bins, convolved with a Gaussian in order to account for the small although non-negligible errors in the spectroscopic redshift estimation. Each thick bin is considered as an independent survey: cross-correlations between different thick bins are not computed, whereas those among the thin bins inside each thick bin are. The resulting tomographic matrix $ {C}_{\ell}^{ij} $ is thus block diagonal by construction.
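A top-hat bin convolved with a Gaussian has a closed form in terms of error functions, which the Python sketch below uses to build the selection function of one thin bin. It assumes a constant underlying galaxy density across the bin and a constant smoothing width; the function name `smoothed_top_hat` and the grid are illustrative, not taken from the analysis code.

```python
import math
import numpy as np

def smoothed_top_hat(z, z_lo, z_hi, sigma_z):
    """Top-hat on [z_lo, z_hi] convolved with a unit-area Gaussian of width sigma_z."""
    s = math.sqrt(2.0) * sigma_z
    erf = np.vectorize(math.erf)
    return 0.5 * (erf((z - z_lo) / s) - erf((z - z_hi) / s))

z = np.linspace(0.59, 0.65, 6001)                        # illustrative redshift grid
window = smoothed_top_hat(z, 0.60, 0.64, sigma_z=0.001)  # first thin bin of width 0.04
```

The smoothing leaves the area of the bin unchanged and only softens its edges over a few times `sigma_z`, so neighbouring thin bins overlap slightly.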
In this paper, we include two hybrid binning configurations in the same redshift range $ z\in \left[0.6,2.0\right] $ , both smoothed by a Gaussian with σz = 0.001:
1. 7 equi-spaced thick bins of redshift width ∆z = 0.2, each having 5 equi-spaced thin bins of width δz = 0.04. This case is represented by black and coloured bins, respectively, in the left panel of Figure 1;
2. 10 equi-spaced thick bins of redshift width ∆z = 0.14, each having 7 equi-spaced thin bins of width δz = 0.02, as shown in the right panel of Figure 1.
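The two configurations listed above can be reproduced with a few lines of Python, which also count how many spectra the block-diagonal layout requires compared with correlating all thin bins with one another. The helper names are hypothetical and the sketch ignores the Gaussian smoothing of the bin edges.

```python
import numpy as np

def hybrid_edges(z_min, z_max, n_thick, n_thin):
    """Return thick-bin edges and, for each thick bin, its thin-bin edges."""
    thick = np.linspace(z_min, z_max, n_thick + 1)
    thin = [np.linspace(lo, hi, n_thin + 1) for lo, hi in zip(thick[:-1], thick[1:])]
    return thick, thin

def n_spectra_hybrid(n_thick, n_thin):
    """Block-diagonal layout: thin-bin pairs are formed within each thick bin only."""
    return n_thick * n_thin * (n_thin + 1) // 2

def n_spectra_full(n_bins):
    """All auto- and cross-bin pairs among n_bins bins."""
    return n_bins * (n_bins + 1) // 2

for n_thick, n_thin in [(7, 5), (10, 7)]:
    thick, thin = hybrid_edges(0.6, 2.0, n_thick, n_thin)
    print(n_thick, n_thin,
          round(thick[1] - thick[0], 3),        # thick-bin width
          round(thin[0][1] - thin[0][0], 3),    # thin-bin width
          n_spectra_hybrid(n_thick, n_thin),
          n_spectra_full(n_thick * n_thin))
```

With these numbers the hybrid layout needs 105 and 280 spectra per multipole for configurations 1 and 2, against 630 and 2485 if all 35 or 70 thin bins were cross-correlated.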
2.5. Set-up of statistical analysis
To construct the likelihood and forecast constraints on the cosmological parameters of interest, we employ the publicly available suite CosmoSIS (Zuntz et al., 2015), which we modify to reproduce the hybrid method described above. For our synthetic data, we choose as a reference a flat ΛCDM model with the cosmological parameter set $ \theta =\left\{{\Omega}_{\mathrm{m}},h,{\Omega}_{\mathrm{b}},{n}_{\mathrm{s}},\ln \left({10}^{10}{A}_{\mathrm{s}}\right)\right\} $ , with fiducial values $ {\theta}_{\mathrm{fid}}=\left\{0.31,0.6774,0.05,0.9667,3.06\right\} $ , using the angular power spectra as given in Equation 1. For details on the samplers and analysis employed to explore the parameter space, see Section 3.
For the data, we assume a Gaussian likelihood, and we focus on minimising the chi-squared. In other words, we do not include the likelihood normalisation in the parameter estimation. This assumption does not affect our results, as the data covariance is assumed to be independent of the cosmological parameters.
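A minimal sketch of such a chi-squared evaluation is given below, assuming the data and model spectra have been flattened into one-dimensional vectors and the covariance is fixed. The function names and the numerical values are illustrative and are not part of CosmoSIS.

```python
import numpy as np

def chi_squared(data, model, cov):
    """chi^2 = (d - m)^T C^{-1} (d - m), via a linear solve rather than an explicit inverse."""
    r = np.asarray(data, dtype=float) - np.asarray(model, dtype=float)
    return float(r @ np.linalg.solve(cov, r))

def log_likelihood(data, model, cov):
    """Gaussian log-likelihood up to an additive constant (normalisation dropped)."""
    return -0.5 * chi_squared(data, model, cov)

data = np.array([1.0, 2.0, 3.0])          # illustrative data vector
model = np.array([1.1, 1.9, 3.2])         # illustrative model prediction
cov = np.diag([0.01, 0.04, 0.04])         # illustrative, parameter-independent covariance

chi2 = chi_squared(data, model, cov)
```

Dropping the normalisation is harmless here because the determinant of a parameter-independent covariance only shifts the log-likelihood by a constant.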
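Concerning the covariance of the galaxy clustering signal given in Equation 1, we adopt the Gaussian approximation, namely
$$ \mathrm{Cov}\left[{C}_{\ell}^{ij},{C}_{\ell^{\prime}}^{kl}\right]=\frac{\delta_{\ell {\ell}^{\prime}}^{\mathrm{K}}}{\left(2\ell +1\right)\Delta \ell {f}_{\mathrm{sky}}}\left[{\tilde{C}}_{\ell}^{ik}{\tilde{C}}_{\ell}^{jl}+{\tilde{C}}_{\ell}^{il}{\tilde{C}}_{\ell}^{jk}\right],\qquad {\tilde{C}}_{\ell}^{ij}={C}_{\ell}^{ij}+\frac{\delta_{ij}^{\mathrm{K}}}{{\overline{n}}_i}, $$
where $ \Delta \ell $ is the multipole binning width, $ {f}_{\mathrm{sky}} $ the sky fraction covered by the survey, $ {\delta}^{\mathrm{K}} $ the Kronecker symbol, and $ {\overline{n}}_i $ the surface galaxy density in bin i.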
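For concreteness, the standard Gaussian covariance of angular power spectra can be assembled as in the Python sketch below for a single multipole. It assumes shot noise 1/n̄ added to the auto-bin spectra only; the function name, the input spectra, the densities, and the sky fraction are illustrative values and not those of the survey considered here.

```python
import itertools
import numpy as np

def gaussian_covariance(cl, nbar, ell, delta_ell, f_sky):
    """Covariance of the unique spectra C^{ij} (i <= j) at one multipole.

    cl   : (n, n) symmetric matrix of signal spectra at this multipole
    nbar : (n,) galaxy surface densities per steradian (shot noise 1 / nbar)
    """
    c_obs = np.asarray(cl, dtype=float) + np.diag(1.0 / np.asarray(nbar, dtype=float))
    n = c_obs.shape[0]
    pairs = list(itertools.combinations_with_replacement(range(n), 2))
    norm = (2.0 * ell + 1.0) * delta_ell * f_sky
    cov = np.empty((len(pairs), len(pairs)))
    for a, (i, j) in enumerate(pairs):
        for b, (k, l) in enumerate(pairs):
            cov[a, b] = (c_obs[i, k] * c_obs[j, l] + c_obs[i, l] * c_obs[j, k]) / norm
    return pairs, cov

cl = np.array([[2.0e-5, 0.5e-5], [0.5e-5, 3.0e-5]])   # illustrative spectra
pairs, cov = gaussian_covariance(cl, nbar=[1.0e6, 1.0e6], ell=100,
                                 delta_ell=10, f_sky=0.36)
```

In the hybrid method each thick bin would get its own such matrix, built from its thin bins only, so the full covariance inherits the block-diagonal structure of the spectra.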
The angular power spectra are computed with the Limber approximation and in the linear regime; we therefore focus on the multipole range $ \ell \in \left[100,800\right] $ as a reasonable interval. It is possible that for a few bin-pair configurations either the lower or the upper multipole limit falls outside the range of validity of the Limber approximation or of linear theory, respectively. However, we do not aim to make forecasts for a specific experiment but rather to compare the performance of the standard and hybrid methods in a realistic setting, and thus this choice does not affect our conclusions. In both binning scenarios we consider $ {n}_{\ell}=5 $ log-spaced multipole values in the aforementioned range.
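Log-spaced multipoles of this kind can be generated as in the short Python sketch below. Whether the five values in the analysis are bin centres or bin edges is not specified, so the sketch simply assumes five sample points that include both ends of the range.

```python
import numpy as np

ell_min, ell_max, n_ell = 100, 800, 5
ells = np.geomspace(ell_min, ell_max, n_ell)   # equal steps in log(ell)
ratio = ells[1:] / ells[:-1]                   # constant ratio between neighbours
```

Logarithmic spacing samples the low multipoles more densely than a linear grid would, at the price of wide bins at the high end of the range.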
3. Results
Here, we present and compare the results obtained with the standard and the hybrid methods. As already mentioned in Subsection 2.4, we applied two hybrid binning configurations in the redshift range $ z\in \left[0.6,2.0\right] $ . We can summarise our findings as follows. All cosmological parameter reconstructed mean values and inferred 68% confidence level intervals are summarised in Table 1.
1. For the standard approach we use 20 equi-populated bins in the redshift range $ z\in \left[0.6,2.0\right] $ , and $ {n}_{\ell}=5 $ multipole values in the considered ℓ range. We use the MultiNest sampler (Feroz et al., 2009) to forecast constraints.
2. For the first configuration of the hybrid binning we use equi-spaced thick bins with width ∆z = 0.2, in the same redshift range, while for the thin bins we use a width δz = 0.04. This means that we have seven thick bins, considered as seven independent surveys, each of them containing five thin bins. Again, we use $ {n}_{\ell}=5 $ multipole values, while for the sampling we choose the emcee sampler (Foreman-Mackey et al., 2013), which is better suited to the tomographic matrix configuration of the hybrid method.
3. In the second hybrid binning configuration the thick bin width is ∆z = 0.14 and the thin bin width is δz = 0.02, giving a finer binning of 10 thick bins each containing 7 thin bins. The sampler employed is, again, emcee.
For a more thorough comparison of the two methods, in Table 2 we also show the computation times of the standard and the hybrid method for a run at fixed cosmology, i.e. for a single set of parameter values. For the sake of this running-time comparison, we also consider a third hybrid binning with 14 thick bins each containing 10 thin bins, keeping the same smearing as in the previous cases. It can be clearly seen from Table 2 that the larger the number of bins, the more time we save by using the hybrid method with respect to the standard one.
Another major advantage of the hybrid method over the standard approach is that it yields tighter constraints on the parameters of interest. This is because the finer binning of the thin bins allows us to partly recover the three-dimensional information encoded in the correlation of galaxies within the thick bin. To better appreciate this enhancement in constraining power, in Figure 2 we show the ratio between the 68% marginal error intervals on each parameter from the hybrid method and those obtained with the standard approach, for the two binning configurations of Subsection 2.4 (green and red candlesticks, for configurations 1 and 2, respectively). Note that the blue candlesticks are the ratio of the 68% marginal error intervals of the standard approach with themselves, simply to guide the reader’s eye. This clearly shows that the finer the binning, the tighter the constraints, both because we can better track the cosmic growth and the redshift evolution of the source distribution (thick binning), and because we can recover radial information (thin binning). Indeed, the fact that even the seven thick bins of hybrid binning configuration no. 1 perform better than the 20 bins of the standard method indicates that radial information within the bin is crucial for accurate cosmological parameter estimation.
To have a deeper understanding of the impact of the radial information retrieved by the hybrid approach, it is useful to look not only at the constraints on a single parameter, but rather at the cross-talks among different parameters, which tell us about intrinsic parameter degeneracies. Figure 3 shows the 68% and 95% joint marginal error contours on the two-dimensional parameter planes of the parameter set θ for the three cases under investigation, i.e. the standard approach (blue contours), and the two binning configurations of the hybrid method (green and red contours for configuration 1 and 2 respectively). Looking at these plots it is evident that the new method is capable of constraining cosmological parameters better than the standard one, giving relative errors which are of the same order of magnitude but smaller. It is worth noting that the parameter Ωm is particularly better constrained with the hybrid procedure, having a relative error half of the one given by the standard method.
4. Conclusions
In this work we make forecasts to compare the constraining power and reliability of a new hybrid tomographic method with the standard tomographic approach generally applied in the studies of spectroscopic galaxy clustering. We perform this comparison in a likelihood-based Bayesian approach going beyond the Fisher matrix analysis. We confirm that the standard and hybrid methods give comparable results, but the latter appears to be more constraining. On top of that, it saves computational time, as shown in Table 2. However, several approximations are made: we do not consider the RSD or correction due to lensing magnification (which nonetheless should be subdominant for fine redshift slicing). Also, we do not account for nuisance parameters, which are considered in the original paper. Finally, we work in the Limber approximation and calculate the angular power spectra at an exiguous number of multipole values to speed up the analysis computation. This, in principle, is not an issue, but in future works the hybrid method should be tested with finer binning, both in angular and in redshift space. Consequently, these approximations do not allow for a face-value comparison with the original paper results. In a forthcoming work we plan to reproduce the same analysis using finer binning, and introducing nuisance parameters as well as the contributions from RSD and magnification on the galaxy density field.
Author contributions
SC conceived the methodology. KT and GF created the algorithm. GF performed the analysis. SC and KT supervised the analysis. GF, KT, and SC wrote the article.
Funding information
The authors wish to thank Roy Maartens and José Fonseca for useful feedback on an early draft of this paper, as well as the referees who helped us improve the presentation of our results. The authors acknowledge support from the ‘Departments of Excellence 2018–2022’ Grant (L. 232/2016) awarded by the Italian Ministry of Education, University and Research (MIUR). SC also acknowledges support by MIUR through the Rita Levi Montalcini project ‘PROMETHEUS – Probing and Relating Observables with Multi-wavelength Experiments To Help Enlightening the Universe’s Structure’ in the early stages of this project.
Conflicts of interest
GF, KT, and SC declare none.
Data availability
Data sharing not applicable - no new data generated.
Comments
Comments to the Author: There are two avenues that the authors presumably wish to take, the first being a clear comparison with SC18 when using the more robust Bayesian analyses. This is made difficult from the get-go because RSD is omitted. The second possible avenue is to provide a second test of the standard and hybrid approaches in the context of a Bayesian analysis. In this case, how do the authors justify not using a non-linear power spectrum in equation 1, such as halofit? If nuisance (e.g. bias) parameters are not considered, as hinted at in Section 4, then it seems there will be no consistency issues in using the pure dark matter halofit formula in equation 1.
In summary, it is not clear what additional information they are providing over the Fisher analysis of SC18 by using linear theory for model and data as well as a Gaussian covariance. I feel the authors should either revise the analysis or make it very clear what the goal is, and argue clearly against extensions, such as (given their methodology) using non-linear spectra. I also have some minor comments:
Further, the authors should describe explicitly how the synthetic data is created (presumably using linear theory) and they should comment explicitly that bias is neglected (if that is the case).
The label font of figure 1 could do with increasing.
Could figures 3 and 4 be combined?
Bacon et al 2018 needs updating and Mon.Not.Roy.Astron.Soc to be used consistently.