Paper The following article is Open access

Quick recipes for gravitational-wave selection effects

and

Published 14 May 2024 © 2024 The Author(s). Published by IOP Publishing Ltd
, , Citation Davide Gerosa and Malvina Bellotti 2024 Class. Quantum Grav. 41 125002 DOI 10.1088/1361-6382/ad4509

0264-9381/41/12/125002

Abstract

Accurate modeling of selection effects is a key ingredient to the success of gravitational-wave astronomy. The detection probability plays a crucial role in both statistical population studies, where it enters the hierarchical Bayesian likelihood, and astrophysical modeling, where it is used to convert predictions from population-synthesis codes into observable distributions. We review the most commonly used approximations, extend them, and present some recipes for a straightforward implementation. These include a closed-form expression capturing both multiple detectors and noise realizations written in terms of the so-called Marcum Q-function and a ready-to-use mapping between signal-to-noise ratio (SNR) thresholds and false-alarm rates from state-of-the-art detection pipelines. The bias introduced by approximating the matched filter SNR with the optimal SNR is not symmetric: sources that are nominally below threshold are more likely to be detected than sources above threshold are to be missed. Using both analytical considerations and software injections in detection pipelines, we confirm that including noise realizations when estimating the selection function introduces an average variation of a few %. This effect is most relevant for large catalogs and specific subpopulations of sources at the edge of detectability (e.g. high redshifts).

Export citation and abstract BibTeX RIS

Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 license. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

1. Introduction

Gravitation-wave (GW) surveys are affected by selection biases. GW selection effects are relatively clean to model compared to those of conventional (i.e. electromagnetic) astronomy because, unlike photons, GWs travel unaffected across the Universe from emission to detection. That said, our detectors are not equally sensitive to compact binaries with different parameters (masses, spins, distance, inclination, etc), which implies their observational coverage is not uniform. Selection-effect modeling is crucial in GW population studies [1, 2] and it is not an exaggeration to say that accurately estimating the probability of detection—commonly referred to as $p_{\mathrm{det}}$—is a key ingredient to the success of GW astronomy as a whole. Indeed, it was shown [3] that modeling errors in the selection function will be (and perhaps already are) the leading limiting factor in our astrophysical inference, or at least that which scales more severely with the number of observed events.

The modern and more accurate approach to estimating selection biases is that of performing software injections using the same pipelines that are used for detection [4]. While accurate, this procedure is computationally expensive, requiring reweighting schemes [5], calibration factors [6], and an effective number of samples that is at least a factor of a few greater than the number of events in the catalog [7]. In practice, many astrophysical predictions in the field relies on (semi-)analytical approximations to $p_{\mathrm{det}}$. Seminal work in this direction was presented by Finn and Chernoff [8], who approximated the detection statistics using the optimal signal-to-noise ratio (SNR) and factored out the dependence on the binary extrinsic parameters. Their approach is still widely adopted, with SNR thresholds $\rho_{\mathrm{t}}$ ranging from ∼8 to ∼12 [9]. More recent advances include those by Essick [10], who presented a semi-analytical model of GW detectability capturing the finite size of template banks, with all the associated complexities. Additional attempts leveraging both analytical [11] and machine-learning [12, 13] techniques have also been explored. One can also calibrate SNR thresholds a posteriori, i.e. using the events that are considered detected [14].

In this paper, we review and extend some of the most commonly used approximations to GW selection effects, notably including noise realizations (section 2). For a single detector, we further develop the marginalization over the extrinsic parameters first presented in [8]. For multiple detectors, we show that thresholding the matched filter SNR results in an expression for $p_{\mathrm{det}}$ that can be written down in closed form using special functions. In short, it is sufficient to substitute a sharp step function with a so-called Marcum Q-function [1517]. We apply our findings to both controlled distributions and LIGO/Virgo injections (section 3). Thresholding the matched filter SNR instead of the optimal SNR results in values of $p_{\mathrm{det}}$ that are systematically higher, and thus merger rates that are systematically lower. While this effect is of a few %, its impact grows dramatically with the size of the GW catalog [3] and disproportionally affects those specific regions of the parameters space where sources are close to the detection horizon [1820]. Our expressions are straightforward to implement and can be used to quickly post-process large samples of simulated sources such as the outputs of astrophysical population-synthesis codes (section 4). To facilitate the exploitation of our findings, we also describe a straightforward implementation of the Marcum Q-function for the Python programming language (appendix).

2. SNR thresholds

We organize our calculation in three steps of increasing complexity. In what follows, the symbol θ collectively denotes the intrinsic parameters of a compact binary (e.g. masses, spins) as well as the distance to the sources, while ξ indicates the remaining extrinsic parameters (inclination, sky location, and polarization angle).

2.1. Single detector & optimal SNR

Let us first consider the case of a single detector. The optimal SNR is given by

Equation (1)

where f indicates frequency, h(f) is the GW strain, and S(f) is the one-sided power spectral density of the detector. The word 'optimal' has sometimes been used in the literature (including by some of the authors [12]) to indicate the SNR of an optimally oriented source. In this paper, we refer to the more common meaning of optimality with respect to noise realizations.

A common approximation [8] is to take $\rho_{\mathrm{opt}} $ as a detection statistics, i.e. assume that a source is detectable if the optimal SNR is greater than a given threshold $\rho_{\mathrm{t}}$. That is, we write

Equation (2)

where $\mathcal{I}$ is an indicator function equal to one if the condition inside the brackets is true and zero otherwise 4 . For a given distribution of extrinsic parameters $p(\xi)$, one can compute (with an abuse of notation)

Equation (3)

For a single detector, the optimal SNR factorizes as follows [8, 12]

Equation (4)

where $0\unicode{x2A7D} \omega\unicode{x2A7D} 1$ is a projection parameter and $\rho_{\mathrm{max}}(\theta)$ is the optimal SNR of an optimally oriented source (i.e. a binary with face-on inclination located overhead the detector). In particular, one has $\xi = \{\iota,\vartheta,\phi,\psi\}$ and

Equation (5)

where ι is the binary inclination, ϑ and φ are a polar and an azimuthal angle for the sky location, and ψ is the polarization angle. Note that the factorization of equation (4) breaks down for precessing sources because the inclination of the orbital plane depends on the emitted frequency. That said, this effect is likely to be mild because current GW observations cover ${\lesssim} 1$ precession cycle.

With this factorization, the integral in equation (3) becomes [8]

Equation (6)

Assuming that the distribution $p(\xi)$ of the extrinsic parameters is known, this implies that selection effects can be estimated by evaluating the complementary cumulative distribution function of ω [8]

Equation (7)

The simplicity of this approximation is appealing: estimating selection effects reduces to evaluating a single waveform h(f) for an optimally oriented source with intrinsic parameters θ, calculating $\rho_{\mathrm{max}}$ from equation (1), and evaluating the universal function $p({\gt}\omega)$ at $\omega = \rho_{\mathrm{t}}/ \rho_{\mathrm{max}}$.

Figure 1 shows the outcome of this procedure assuming that sources are distributed isotropically in the sky (i.e. we take uniform distributions in $\cos\iota$, $\cos\vartheta$, φ, and ψ). To facilitate comparison with the rest of the paper, we show the averaged detectability $p_{\mathrm{det}}(\theta)$ as a function of $\rho_{\mathrm{max}}/ \rho_{\mathrm{t}}$ instead of its inverse, even though the latter enters equation (6) directly. For instance, the black curve in the top panel of figure 1 indicates that at least ${\sim} 80\%$ of the binaries with a given set of intrinsic parameters θ will be detectable if at least one of these sources has an SNR that is ${\sim} 5$ times above the detection threshold.

Figure 1.

Figure 1. Detection probability $p_{\mathrm{det}}(\theta)$ averaged over the extrinsic parameters as a function of the optimal, single-detector SNR $\rho_{\mathrm{max}}(\theta)$ of an optimally oriented source with intrinsic parameters θ. The black-dashed line is obtained by thresholding the optimal SNR. Colored solid lines are obtained by thresholding the matched filter SNR with various thresholds $\rho_{\mathrm{t}}$. The top panel shows the detection probability itself, while the bottom panel shows the difference between $p_{\mathrm{det}}$ obtained using the matched filter SNR as a detection statistics and $p_{\mathrm{det}}$ obtained using the optimal SNR instead. The vertical dotted lines indicate $\rho_{\mathrm{max}}(\theta) = \rho_{\mathrm{t}}$.

Standard image High-resolution image

2.2. Single detector & matched filter SNR

Because noise realizations, any specific GW source will not be observed with SNR $\rho_{\mathrm{opt}}$ but rather with some other value $\rho_{\mathrm{obs}}$. In the standard matched-filtering approach to GW detection, $\rho_{\mathrm{obs}}$ is given by the filtered signal (made of both GWs and noise) normalized by its own root-mean-square; see [21, 22]. One can thus include the effect of noise realizations in the estimate of $p_{\mathrm{det}}$ by thresholding the observed SNR $\rho_{\mathrm{obs}}$ instead of $\rho_{\mathrm{opt}}$ and computing the expectation value over noise realizations n, i.e.

Equation (8)

Assuming the noise in the detector is Gaussian and stationary, the observed SNR is distributed normally around the optimal SNR with unit variance (see [21, 22] but also [10] for caveats). Because of this property, computing $\rho_{\mathrm{opt}}$ reduces to evaluating in equation (1) and adding a variance of one. One has

Equation (9)

and thus [23]

Equation (10)

where $\mathrm{erf}$ is the error function. Marginalizing over the extrinsic parameters as above yields

Equation (11)

Our results are shown in figure 1 assuming isotropic sources. Note that, unlike equation (6), thresholding the observed SNR does not result in a universal function of $\rho_{\mathrm{max}}(\theta)/\rho_{\mathrm{t}}$ but rather a one-parameter family of functions, where the additional parameter is ρt itself.

In practice, introducing the SNR variance due to noise realizations causes variations in $p_{\mathrm{det}}$ that are of $\mathcal{O}(1\%)$. The effect decreases as $\rho_{\mathrm{t}}$ increases: if the threshold is large, it is less likely that introducing a variance of one might turn a detectable event into a non-detectable event, or vice versa. Broadly speaking, we find that the GW detectability computed by thresholding the matched filter SNR is larger (smaller) than that obtained by thresholding the optimal SNR when signals are weak (loud). From figure 1, the transition between these two behaviors takes place at $\rho_{\mathrm{max}}\simeq 4 \rho_{\mathrm{t}}$. The largest deviations are found at SNRs $\rho_{\mathrm{max}}\simeq 2 \rho_{\mathrm{t}}$.

The luminosity distance dL of astrophysical objects in the nearby Universe is distributed geometrically, $p(d_L)\propto d_L^{-2}$, which implies that $\rho_{\mathrm{max}}\propto 1/d_L$ is distributed as $p(\rho_{\mathrm{max}}) \propto {\rho_{\mathrm{max}}^{-4}}$ [24] (but note this will not be true for third-generation detectors [25]). This implies that the left part of the curve in figure 1 where $\rho_{\mathrm{t}}\simeq \rho_{\mathrm{max}}$ has a disproportionally larger impact on the overall population of detected sources. Therefore, thresholding the optimal SNR instead of the observed SNR has the net effect of underestimating $p_{\mathrm{det}}$ (i.e. the residuals in figure 1 are positive) and thus overestimating the astrophysical merger rate.

One caveat here is that we did not truncate the Gaussian distribution in equation (9) to impose $\rho_{\mathrm{obs}}\unicode{x2A7E} 0$. This is appropriate as long as the threshold value is much greater than the SNR variance, i.e. $\rho_{\mathrm{t}}\gg 1$.

2.3. Multiple detectors & matched filter SNR

Let us now consider a network of multiple detectors. The network SNR is the root sum square of the individual SNRs, i.e.

Equation (12)

where N is the number of interferometers. Each of the $\rho_{\mathrm{obs,i}}$'s is distributed normally around optimal values $ \rho_{\mathrm{opt, i}}$ with unit variance. The square root of the sum of Gaussian variates with different means and unit variances is distributed according to the non-central χ distribution (which is also known as the generalized Rayleigh distribution) [26]; see also [10]. The probability density function of $\rho_{\mathrm{obs}}$ is

Equation (13)

where $I_\nu(z)$ is the modified Bessel function of the first kind and order ν [27]. The parameter $\rho_{\mathrm{opt}}$ in equation (13) is root sum square of the individual optimal SNRs calculated as in equation (1), i.e.

Equation (14)

Note how the detection probability in equation (13) only depends on the combined quantity $\rho_{\mathrm{opt}} $ and not on how this SNR is distributed among the various instruments in the network.

We can now threshold the matched filter SNR $\rho_{\mathrm{obs}}$ and compute the expectation value over noise realizations as in equation (8). We find this can also be written down using special functions. In particular, one has

Equation (15)

where

Equation (16)

is the generalized Marcum Q-function of order ν [1517], which is used in the field of digital communications (but see e.g. [28, 29] for some previous appearances in GW astronomy). In words, the generalized Marcum Q-function is the complementary cumulative distribution function of the non-central χ distribution. A two-line Python code to evaluate equation (16) is described in appendix and made available at [30].

In figure 2, we evaluate equation (15) for the case of N = 3 interferometers. As expected, thresholding the observed SNR smoothens the sharp step one would instead obtain when using the optimal SNR as detection statistics, allowing for some finite probability for events below (above) threshold to be detected (missed). It is interesting to note that this effect is not symmetric: binaries with optimal network SNR that is below threshold are more likely to be detected than binaries with optimal network SNR above threshold are to be missed. That is, thresholding the optimal SNR $\rho_{\mathrm{opt}}$ instead of the matched filter SNR $\rho_{\mathrm{obs}}$ underestimates $p_{\mathrm{det}}$, hence overestimates the intrinsic merger rate. This effect is enhanced by the expected astrophysical SNR probability $p(\rho_{\mathrm{opt}})\propto \rho_{\mathrm{opt}}^{-4}$ [24], which implies binaries are more likely to be found below than above threshold.

Figure 2.

Figure 2. Detectability $p_{\mathrm{det}}(\theta,\xi)$ of GW signals with given intrinsic and extrinsic parameters for a network of N = 3 detectors as a function of the optimal SNR $\rho_{\mathrm{opt}}(\theta,\xi)$. The black-dashed line is obtained by thresholding the optimal SNR itself, resulting in a step function centered on the threshold $\rho_{\mathrm{t}}$. Colored solid curves are obtained by thresholding the observed network SNR according to equation (15) assuming various thresholds $\rho_{\mathrm{t}}$.

Standard image High-resolution image

We further quantify this detection/non-detection asymmetry as follows, borrowing terminology from that of a classification problem [31] where the actual outcome is given by $\rho_{\mathrm{obs}}\gt\rho_{\mathrm{t}}$ and the predicted outcome is given by the test $\rho_{\mathrm{opt}}\gt\rho_{\mathrm{t}}$. For a set of sources with SNRs distributed according to $p(\rho_{\mathrm{opt}})$, there are four mutually exclusive cases:

  • $\rho_{\mathrm{t}}\lt\rho_{\mathrm{obs}}, \rho_{\mathrm{opt}}$ or 'true positives'. These events are flagged as detectable irrespectively of the thresholding strategy. The fraction of sources in this class is
    Equation (17)
  • $\rho_{\mathrm{obs}}, \rho_{\mathrm{opt}}\lt\rho_{\mathrm{t}}$ or 'true negatives'. These events are flagged as non detectable irrespectively of the thresholding strategy. The fraction of sources in this class is
    Equation (18)
  • $\rho_{\mathrm{opt}}\lt \rho_{\mathrm{t}}\lt\rho_{\mathrm{obs}}$ or 'false negatives'. These events are detectable but would be classified as non detectable if one neglects the SNR variance. The fraction of sources in this class is
    Equation (19)
  • $\rho_{\mathrm{obs}}\lt \rho_{\mathrm{t}}\lt\rho_{\mathrm{opt}}$ or 'false positives'. These events are non detectable but would be classified as detectable if one neglects the SNR variance. The fraction of sources in this class is
    Equation (20)

Figure 3 shows the fractions FN and FP as a function of the threshold $\rho_{\mathrm{t}}$ and the number of detectors N. We consider a population with $\rho_{\mathrm{opt}}\in [1,100]$ distributed according to $p(\rho_{\mathrm{opt}})\propto \rho_{\mathrm{opt}}^{-4}$. Both fractions go to zero as $\rho_{\mathrm{t}}$ increase, corresponding to the curves of figure 2 approaching a step function: imposing a high detectability threshold makes it less likely for sources with a given optimal SNR to be scattered above or below threshold by a specific noise realization. The rate of false negatives is about an order of magnitude larger than that of false positives, confirming our point above. This difference increases with the number of detectors in the network, which is a consequence of equation (16).

Figure 3.

Figure 3. Fraction of false negative (blue, FN) and false positive (red, FP) detections one would get by thresholding the optimal SNR $\rho_{\mathrm{opt}} $ instead of the matched filter SNR $\rho_{\mathrm{obs}}$. Both fractions decrease with the SNR threshold $\rho_{\mathrm{t}}$ and their difference increases with the number of detectors N. Dashed, solid, and dotted curves refers to networks of N = 1, 3, and 5 interferometers, respectively.

Standard image High-resolution image

If needed, one can marginalize equation (15) over the extrinsic parameters as in equation (3). Note however that the factorization of equation (4) is not useful when considering multiple detectors because sources cannot be optimally oriented for all interferometers at the same time.

3. Applications to LIGO/Virgo

In population studies, the quantity that enters the hierarchical likelihood [1, 2] is the expectation value (abusing notation once more)

Equation (21)

where λ indicates the population parameters and $p_{\mathrm{pop}}(\theta, \xi | \lambda)$ is the chosen population model. In case one is only trying to measure the population properties of the intrinsic parameters [4], then $p_{\mathrm{pop}}(\theta, \xi | \lambda) = p_{\mathrm{pop}}(\theta | \lambda) p(\xi)$ and $p_{\mathrm{det}}(\lambda) = \int p_{\mathrm{det}}(\theta) p_{\mathrm{pop}}(\theta | \lambda) \mathrm{d} \theta$. In this section, we first compute population-averaged detectabilities on a controlled experiment and then use software injections in real LIGO noise.

3.1. Toy population

We estimate the impact of our findings on a toy population of GW events observable by the LIGO/Virgo network. We take a simple population $p_{\mathrm{pop}}(\theta, \xi | \lambda)$ where black-hole binaries have source-frame primary masses $m_1\in[5 M_\odot, 50 M_\odot]$ distributed according to $p(m_1)\propto m_1^{-2.3}$, source-frame secondary masses $m_2\in[5 M_\odot,m_1]$ distributed uniformly, redshifts $z\in[0,1]$ distributed uniformly in comoving volume and source-frame time $p(z)\propto (\mathrm{d} V_c / \mathrm{d} z)/ (1+z)$, spins magnitudes $\chi_{1,2}\in[0,1]$ distributed uniformly, and spin directions distributed isotropically. We assume a flat ΛCDM cosmological model with parameters from [32]. For the extrinsic parameters, we take uniform distributions in $\cos\iota$, $\cos\vartheta$, φ, and ψ as above. We consider a three-instrument network made of LIGO Livingston, LIGO Hanford, and Virgo at their design sensitivities [9] and compute optimal SNRs $\rho_{\mathrm{opt, i}}$ using the the IMRPhenomX approximant [33]. For this example, our threshold is set to $\rho_{\mathrm{t}} = 12$.

Figure 4 shows the resulting distributions of optimal network SNRs $\rho_{\mathrm{opt}}(\theta,\xi)$ and probabilities of detection $p_{\mathrm{det}}(\theta,\xi)$. A Monte Carlo estimate of the integral in equation (21) computed over the entire population returns $p_{\mathrm{det}}(\lambda)\sim 0.027$ (we used 106 samples, so the error on this number is ∼10−3). If one only selects events with $|\rho_{\mathrm{opt}}(\theta,\xi) - \rho_{\mathrm{t}}|\lt1$ (i.e. those close to threshold), we find $p_{\mathrm{det}}(\lambda)\sim 0.496$. This should be compared with the analogous value ∼0.435 one would instead obtain with a simple cut on the optimal SNR. Marginalizing over noise realizations when estimating selection effects results in a higher $p_{\mathrm{det}}$ and mostly affects subpopulations of sources that are close to detection threshold.

Figure 4.

Figure 4. Distribution of optimal SNR $\rho_{\mathrm{opt}}(\theta,\xi)$ (top) and detectabilities $p_{\mathrm{det}}(\theta,\xi)$ (bottom) for a toy population of black-hole binaries observable by the LIGO/Virgo network at design sensitivity, assuming a threshold $\rho_{\mathrm{t}} = 12$. In the top panel, the green histogram shows the optimal SNRs. The vertical solid (dotted) black lines show the median (90% inter-quantile range) of the probability of detection from equation (15), indicating the region where the transition between non-detectability to the left and detectability to the right takes place. In the bottom panel, the green histogram shows the detectabilities obtained by thresholding the matched filter SNR $\rho_{\mathrm{obs}}$ and the two dashed black lines show the fractions of binaries that would be marked as detectable ($p_{\mathrm{det}} = 1$) and non-detectable ($p_{\mathrm{det}} = 0$) if one were to instead threshold the optimal SNR $\rho_{\mathrm{opt}}$.

Standard image High-resolution image

3.2. Pipeline injections

It is important to remember that the SNR (either optimal or observed) provides approximate information on the GW detectability. The quantity returned by GW detection pipelines is the false-alarm rate (FAR), which is indeed the statistics in used state-of-the-art population analyses [4] to both compile the list of events and estimate selection effects (other selection strategies are sometime used, see e.g. $p_{\mathrm{astro}}$ in [34]). Using the population average of equation (21), we now present a calibration of the SNR threshold that enters our $p_{\mathrm{det}}$ approximation to reproduce the response of current detection pipelines as a function of their FARs.

We use software injections performed in real noise from the third LIGO/Virgo/KAGRA observing run [35]. They report FAR values for five different detection pipelines and LIGO-only (i.e. N = 2) optimal SNRs for a fiducial population of sources (some of the injections also include Virgo; we neglect this difference and have verified it has a negligible impact on our results). Note their adopted population is different from that of our toy case above (see [35] for details). We use the injection dataset labeled as 'mixture,' which is their recommended default. We consider all the injections provided, including cases where the FAR could not be quantified (and is reported as $\infty$ in the dataset). Note the injections were performed uniformly in time, thus taking into account the duty cycle of the detectors.

We compute the population-average detection probability by thresholding their FARs and match it with the population-average detection probability obtained with our $p_{\mathrm{det}}$ approach. Figure 5 shows the resulting mapping between the two quantities. We repeat our calculation for each of the five detection pipelines provided in the available dataset [35] as well as by selecting the minimum FAR for each injected source. The latter is in line with the criterion used in [4] for selecting compact binaries of astrophysical origin.

Figure 5.

Figure 5. Mapping between SNR threshold $\rho_{\mathrm{t}}$ and FAR threshold obtained by matching the population-averaged detectability $p_{\mathrm{det}}(\lambda)$. We use software injections into five different GW detection pipelines (colored curves) as well as the minimum FAR across all pipelines (black curves). Solid (dashed) curves are computed using the observed (optimal) SNR when thresholding for detectability. Curves are restricted to the regime of validity of our analysis (see text). To guide the eye, the grey dotted lines indicate ${\mathrm{FAR}}_{\mathrm{t}} = 1{\mathrm{yr}}^{-1}$ and $\rho_{\mathrm{t}} = 11$.

Standard image High-resolution image

This calculation is not reliable for SNRs that are ${\lesssim} 6$ because those injections were deemed 'hopeless' to save computational time [35]. For those values of $\rho_{\mathrm{t}}$, there is a (potentially large) number of missing sources with optimal SNRs below threshold that could have been scattered above threshold by the SNR variance. We thus restrict our analysis to $\rho_{\mathrm{t}}\unicode{x2A7E} 8$, which is a few standard deviations away from the 'hopeless' cut. Additionally, some pipelines have minimum and maximum FAR values they attempt to quantify. We estimated these limits from the provided datasets and truncated the curves in figure 5 accordingly. These limitations do not significantly impact the minimum FAR calculation across pipelines, at least in the region of interest shown in figure 5.

Figure 5 shows that selection effects computed using a minimum FAR threshold of ${\sim} 1$ yr−1 are reproduced by an SNR-based cut with threshold $\rho_{\mathrm{t}}\simeq 11$. We find once more that the bias induced by thresholding $\rho_{\mathrm{opt}}$ instead of $\rho_{\mathrm{obs}}$ is of a few % and underestimates $p_{\mathrm{det}}$ across the entire range: that is, the $\rho_{\mathrm{t}}$ threshold for a given FAR obtained when marginalizing over noise realization is larger compared to the case where noise is neglected. The difference between the two treatments is smaller than the typical difference between the various detection pipelines (and indeed current LIGO/Virgo selection criteria combine information from multiple pipelines). The population used in [35] is deliberately broad, while our results above indicate that the subpopulation of sources that are close to threshold will be affected more significantly by the SNR variance. Confirming this expectation with dedicated pipeline injections is left to future work.

As shown in figure 5, we empirically find that the mapping between $\rho_{\mathrm{t}}$ and the minimum FAR threshold across the available pipelines is well described by a power law:

Equation (22)

A simple least-square regression returns $p_1 = -1.80$ and $p_0 = 20.1$ when thresholding using $\rho_{\mathrm{obs}}$ (i.e. this is a fit to the solid black curve in figure 5), and $p_1 = -1.72$ and $p_0 = 18.7$ when thresholding using $\rho_{\mathrm{opt}}$ (i.e. this is a fit to the dashed black curve in figure 5). These fits can be used to quickly filter synthetic distributions according to the desired purity of the resulting GW simulated catalog, though with the important caveat that this relationship was calibrated on a specific population of sources. We stress that figure 5 and equation (22) present a calibration between thresholds, not a mapping of SNR to FAR for a given trigger and pipeline.

4. Summary

We summarized and extended the most common treatment of GW selection effects, namely that of thresholding the SNR. We focused in particular on the marginalization over extrinsic parameters and noise realizations, considering both single and multiple detectors.

When modeling the filter imposed to observations by the detectors, the simplest strategy is to consider a source 'detectable' if its optimal SNR $\rho_{\mathrm{opt}}$ is sufficiently large [8]. This approach does not take into account the SNR variance induced by noise realizations. As presented here, incorporating such an effect is straightforward and results in a closed-form expression of $p_{\mathrm{det}}$. All one needs to do is substitute the sharp step

Equation (23)

with the smooth transition

Equation (24)

Including noise realizations when estimating $p_{\mathrm{det}}$ takes care of the conceptual point recently raised in [20]. There the authors argue that a consistent detectability estimate should never make use of the true signal parameters, which are not accessible, but only of the data, and these are inevitably affected by noise.

Our expression is appealing for its simplicity but still approximate. Notable simplifications include (i) assuming that noise is stationary and Gaussian, which is never perfectly the case, (ii) thresholding events using the SNR and not the FAR, and (iii) neglecting errors due to the finite size of our template banks [10]. Further improvements include considering the phase-maximized SNR, which can also be written down analytically using special functions [36].

We find that a SNR threshold of ∼11 reproduces the selection criterion used in current analyses (FAR $\lt$ 1 yr−1 for binary black holes [4]) and that the inclusion of noise realizations increases the average detection probability $p_{\mathrm{det}}$ by a few %. While nominally modest, this effect becomes increasingly important as the number of detections grows because systematic errors related to selection effects scale faster than linearly with the number of events in the catalog [3]. Furthermore, the projected bias is not uniform across the parameter space and disproportionally affects sources at the edge of detectability. For instance, this will be relevant to analyses targeting compact binaries at high redshift [1820] and attempting to discriminate their origin as either astrophysical or cosmological [37, 38].

Accurate modeling of selection effects is prominent in both (i) GW population studies, where selection effects enter the hierarchical likelihood, and (ii) the development of astrophysical predictions, where outputs of population-synthesis codes are post-processed to obtain detectable distributions. We hope the 'collection of recipes' we presented here will provide a useful companion to researchers working in either of these two contexts, facilitating the treatment of selection biases in GW astronomy.

Acknowledgments

In memory of Chris Belczynski. I first got interested in selection effects while post-processing your simulations. I am sure you are on a beautiful mountain; farewell. (D G)

We thank Matt Mould, Costantino Pacilio, Michele Mancarella, Viola De Renzis, Francesco Iacovelli, Nick Loutrel, Chris Moore, Riccardo Buscicchio, Reed Essick and the participants of the 'Gravitational-wave populations: what's next?' conference (Milan, 2023) for discussions. We thank the referees for their inputs on section 3.2. D G and M B are supported by ERC Starting Grant No. 945155–GWmining, Cariplo Foundation Grant No. 2021-0555, MUR PRIN Grant No. 2022-Z9X4XS, and the ICSC National Research Centre funded by NextGenerationEU. D G is supported by MSCA Fellowships No. 101064542–StochRewind and No. 101149270–ProtoBH. Computational work was performed at CINECA with allocations through INFN and Bicocca.

Data availability statement

No new data were created or analysed in this study.

Appendix: Marcum Q-functions with Python

The numerical implementation of the Marcum Q-function for the Python programming language is somewhat hidden in the popular module scipy [39]. In particular, object scipy.stats.ncx2 provides tools to characterize the non-central χ2 distribution (which is different than the non-central χ distribution used in section 2.3). The probability density function of a random variate x distributed according to a non-central χ2 distribution with k degrees of freedom and non-centrality parameter λ is

Equation (A.1)

Integrating this, one obtains the complementary cumulative distribution function (or survival function)

Equation (A.2)

The expression $Q_\nu(a,b)$ from equation (16) can therefore be evaluated as the survival function of a non-central χ2 distribution with $x = b^2$, $k = 2\nu$ and $\lambda = a^2$. In Python, this is

def marcumq(nu,a,b):

   return scipy.stats.ncx2.sf(b**2, 2*nu, a**2)

We have implemented this function in a standalone package named marcumq [30]. This can be installed with

pip install marcumq and used with e.g.

import marcumq

marcumq.marcumq(nu,a,b)

We have tested our implementation against those provided in the symbolic manipulation tools sympy and mathematica.

Footnotes

  • Note that $p_{\mathrm{det}}(\theta,\xi)\in[0,1]$ is a probability and not a probability density, i.e. it does not integrate to unity over θ and ξ. One should more carefully write $p({\mathrm{det}}| \theta,\xi)$.

Please wait… references are loading.