A publishing partnership

The following article is Open access

Testing the LSST Difference Image Analysis Pipeline Using Synthetic Source Injection Analysis

, , , , , and

Published 2024 May 13 © 2024. The Author(s). Published by the American Astronomical Society.
, , Citation S. Liu et al 2024 ApJ 967 10 DOI 10.3847/1538-4357/ad3635

Download Article PDF
DownloadArticle ePub

You need an eReader or compatible software to experience the benefits of the ePub3 file format.

0004-637X/967/1/10

Abstract

We evaluate the performance of the Legacy Survey of Space and Time Science Pipelines Difference Image Analysis (DIA) on simulated images. By adding synthetic sources to galaxies on images, we trace the recovery of injected synthetic sources to evaluate the pipeline on images from the Dark Energy Science Collaboration Data Challenge 2. The pipeline performs well, with efficiency and flux accuracy consistent with the signal-to-noise ratio of the input images. We explore different spatial degrees of freedom for the Alard–Lupton polynomial-Gaussian image subtraction kernel and analyze for trade-offs in efficiency versus artifact rate. Increasing the kernel spatial degrees of freedom reduces the artifact rate without loss of efficiency. The flux measurements with different kernel spatial degrees of freedom are consistent. We also here provide a set of DIA flags that substantially filter out artifacts from the DIA source table. We explore the morphology and possible origins of the observed remaining subtraction artifacts and suggest that given the complexity of these artifact origins, a convolution kernel with a set of flexible bases with spatial variation may be needed to yield further improvements.

Export citation and abstract BibTeX RIS

Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

1. Introduction

The Vera C. Rubin Observatory Legacy Survey of Space and Time (LSST; Ivezić et al. 2019) will survey the southern sky from 2025 to 2035. This wide-field ground-based system uses an 8.4 m (6.5 m effective) telescope with a 3.2 gigapixel camera and a 9.6 deg2 field of view. The LSST survey will repeatedly observe ∼18,000 deg2 of the sky in the ugrizy filters during its 10 yr of operations.

The LSST data will be useful for a wide variety of science (LSST Science Collaboration et al. 2009), with guiding areas of probing dark energy and dark matter, studying the transient optical sky, producing an inventory of the solar system, and mapping the Milky Way. Achieving these scientific goals requires the discovery of transient sources and accurate photometric measurements. The LSST is expected to discover about 106 transient sources per night (Ridgway et al. 2014; Graham et al. 2019; Ivezić et al. 2019), which requires a detection pipeline with high efficiency and a low rate of nonastrophysical artifacts.

The Rubin Observatory LSST Dark Energy Science Collaboration (DESC) developed the dia_pipe 6 package to run the LSST Science Pipelines 7 Difference Image Analysis (DIA) algorithms. The dia_pipe combines different parts of the LSST Science Pipelines difference imaging code into a single pipeline, which enables the user to perform image difference, source association, and forced photometry. The DESC dia_pipe has been integrated with the LSST Data Release Production pipeline.

The LSST DESC Data Challenge 2 (DC2; LSST Dark Energy Science Collaboration (LSST DESC) et al. 2021) was a modeling and simulation effort that generated simulated LSST images. From a simulated Universe with galaxies, stars, and Type Ia supernovae (SNe Ia), DC2 used GalSim (Rowe et al. 2015) to render astronomical objects into a set of images for 5 yr at 300 deg2 of the sky. These images were processed through the LSST Science Pipelines into data products modeled on those to be delivered by the Rubin Observatory. Sánchez et al. (2022) compared the DIA performance metrics to the Dark Energy Survey DIA performance (Kessler et al. 2015) using a 15 deg2 subset of DC2 data, based on the SNe Ia that had been included in the DC2 simulation, and found similar performance to the DESC survey. Here, we perform a more in-depth evaluation of DIA using additional synthetic source injection on a subset of the DC2 data set. Our goal is to connect and explore the DIA algorithmic concepts and code as implemented in the LSST Science Pipelines.

In this work, we inject a controlled set of synthetic sources onto DC2 images to (1) evaluate the pipeline at different source magnitude levels and host magnitude levels, (2) compare the pipeline performance with different kernel spatial orders, and (3) determine a set of flag-based cuts to removing false-positive detections ("artifacts").

The essential strategy of the synthetic source injection technique is to add synthetic sources onto astronomical images based on the measured point-spread function (PSF) and monitor the detection status and flux measurements of these synthetic sources.

Traditionally, the use of DIA algorithms in surveys has suffered from significant subtraction artifacts, leading to the need for human-based or machine-learning-based classification (Kaiser et al. 2002; Frieman et al. 2008; Sako et al. 2008; Bernstein et al. 2012; Kessler et al. 2015). We explore the possible origins of these artifacts using difference images produced by the DIA pipeline. We study both the statistical properties of artifact measurements and the morphology of artifacts. A few possible causes are discussed, and we give suggestions for future improvements.

The outline of this work is as follows. Section 2 introduces image difference algorithms, Section 3 introduces the analysis framework of the synthetic source injection technique, Section 4 presents the analysis results of the DIA pipeline, Section 5 discusses the morphology and potential origin of subtraction artifacts, Section 6 discusses the current status of the DIA pipeline and possible future improvements, and we provide conclusions and outline future directions in Section 7.

2. Difference Image Algorithms

To begin DIA, we prepare two images that cover the same region of the sky with a time separation longer than a typical light-curve duration. One image, which usually was taken in good seeing conditions and has a high signal-to-noise ratio (S/N) for the magnitude range of interest, is designated as the template image. The other image is called the science image. Typically, the template image was taken months to years before the science image and may be a coaddition of many images to reduce the photon noise. In this analysis, we use coadded template images. The science image is typically from the current night, where we search for new transient sources or variations in the brightness or position of existing sources.

The next step is to match both images to the same seeing condition and the same astrometric reference so that we can subtract one image from the other to obtain a difference image that ideally represents the astrophysical differences between them. Source detection and photometric measurements are then performed on the difference image to produce a detection catalog. The LSST Science Pipelines version used in this work provides both the Alard–Lupton (AL; Alard & Lupton 1998) and the Zackay–Ofek–Gal–Yam (ZOGY; Zackay et al. 2016) algorithms to perform image differencing. The AL algorithm assumes that the PSFs of the science image (S) and template image (T) can be matched using a kernel κ. This process is done empirically by matching common sources in both images and does not explicitly use calculated PSFs for the images. The kernel κ is solved by minimizing the square sum of pixels from the difference image. The ZOGY algorithm assumes explicit knowledge of the PSF in each image and uses the Fourier transform between them to achieve the same seeing conditions of T and S and produce a detection image highlighting the astrophysical sources. A decorrelation method is also introduced in the ZOGY algorithm to perform decorrelation of the difference image.

2.1. AL Algorithm

The AL algorithm solves for the matching kernel κ, which uses stars in both images to match the PSF of the template image to the PSF of the science image. Galaxies can be used in addition to stars, although a galaxy provides less information than a star of the same flux level. Thus, this process is not dependent on reliable star–galaxy separation. In the spatially varying (Alard & Lupton 1998; Alard 2000) situation, the matching kernel can be expressed as

Equation (1)

Here Nκ is the number of kernel bases, aq is a set of fitted spatially varying coefficients, and kq is a set of kernel bases. The most commonly used kernel bases are composed with a set of Gaussian functions modulated with a set of polynomials (Alard & Lupton 1998). Another type of kernel basis is composed with delta bases (Bramich 2008; Miller et al. 2008), which are "shape-free" delta functions defined for each kernel pixel coordinate.

Using the matching kernel κ, the difference image of the AL algorithm can be expressed as

Equation (2)

where B(x, y) is the fitted sky background.

2.2. ZOGY Algorithm

The ZOGY algorithm was designed for the case where PSFs and flux scaling factors for T and S are properly estimated, and the sky backgrounds of both images have been properly subtracted. 8 The difference image from the ZOGY algorithm in Fourier space can be expressed as

Equation (3)

The equivalent PSF of the ZOGY difference image is

Equation (4)

Here the hat symbol represents the Fourier transform of each quantity, Pt and Ps are the PSFs of T and S, Ft and Fs are the flux scaling factors of both images, σt and σs are background noise levels, and FD is the normalization constant. The expression in the denominator of $\hat{D}$ plays the role of decorrelating the noise.

However, ZOGY provides no mechanism to include information about the spatial variation of PSFs. One approach to using a spatially varying ZOGY would be to split T and S into subregions within which the spatial variation of PSFs is much smaller (Zackay et al. 2016; Reiss 2017; Hu et al. 2022). Performing the ZOGY algorithm on each subregion is achievable at the expense of more CPU and more complicated bookkeeping for detections near and across the edges of subregions. Such a "cell-based" approach to templates and subtraction could fail if the PSFs are varying rapidly across the images.

3. The DC2 Data Set and Analysis Framework

3.1. The DC2 Data Set

To prepare for analysis of LSST data, LSST DESC created the DC2 data set with an end-to-end approach (LSST Dark Energy Science Collaboration (LSST DESC) et al. 2021) to simulate a subset of the full LSST survey. This approach started with a large N-body simulation of galaxies in the Universe, applied LSST-like observations, simulated each exposure, ran the LSST Data Release Processing pipeline, 9 and produced data products analogous to those that will be delivered by the Rubin Observatory. The DC2 observations include six optical bands (ugrizy) with coverage of 300 deg2 in a wide-fast-deep area and 1 deg2 in a deep drilling field. It simulates 5 yr of the LSST survey.

The DC2 data set provides two classes of imaging data. For single-frame processing, it provides calibrated exposures (calexps) with sky background subtraction. Each calexp image is a 4000 × 4072 pixel image array with 0farcs2 pixel−1 scale. It also has a source catalog (src) that contains measurements of sources in that visit. After producing calexps, multiple visits are coadded together to create deepCoadd exposures. The resampling grid of deepCoadds is defined with tracts and patches. Each tract contains 7 × 7 patches, and each patch contains 4100 × 4100 pixels with a scale of 0farcs2 pixel−1. A visual representation of the tract-and-patch structure can be found in Sánchez et al. (2022) or LSST Dark Energy Science Collaboration (LSST DESC) et al. (2021).

3.2. Template Exposure

We focus on i-band exposures of the DC2 data set for our analysis. We choose "deepCoadd" exposures with seven patches as template exposures. The total area analyzed in this paper is about 0.36 deg2.

3.3. Host Selection Criteria and Science Exposure

We undertook this work from the perspective of SN cosmology, and so we are interested in the detection and photometric accuracy of new point sources (SNe) on top of extended sources (galaxies). GCRCatalogs (Mao et al. 2018) is a DESC package that provides users with a consistent interface to access both the input truth catalogs and the output processed catalogs for the DC2 simulations. We use GCRCatalogs to extract a list of potential host galaxies from the input truth table with a range of apparent host magnitudes. Next, we measure the local surface brightness (SB) with 1'' aperture photometry at the center of each galaxy in the simulated images. We sort the galaxy sample into 1 mag arcsec−2 bins distributed from 20 to 25 mag arcsec−2 with 1 mag arcsec−2 bin widths. While in the real Universe, SNe occur throughout galaxies, what we really want to explore here is the dependence of SN detection on underlying SB. Thus, we place the SN at the centers of galaxies with different central SB.

Within each patch and each SB magnitude bin, we select a set of 20 hosts. Galaxies are chosen to satisfy two criteria: (1) each host center is at least 200 pixels away from the image boundaries in both the x- and y-directions, and (2) each host center is at least 100 pixels away from another host center. These criteria help us avoid subtractions near image edges and injecting overlapping synthetic sources. Locations of hosts on a patch are shown in Figure 1.

Figure 1.

Figure 1. Locations of host galaxies on a patch (4100 × 4100 pixels). Hosts with different SB are indicated with different colors.

Standard image High-resolution image

After selecting the host set within each patch and each host SB magnitude bin, we match each host set to 10 calexps using the tracts_mapping.sqlite3 database file of the DC2 data set. We inject synthetic sources onto each calexp and then use those as the science exposures for running the image difference algorithm. We define a calexp and its corresponding deepCoadd exposure as an "image pair." In total, we have 70 image pairs. The LSST survey takes images at a range of spatial and rotational dithers; thus, a single calexp often does not fully cover all 20 hosts in each host set. We restrict our sample to calexps that cover at least 15 host galaxies.

3.4. Synthetic Source Injection Framework

Our synthetic source injection procedure is based on the insertFakes module of the pipe_tasks 10 package of the LSST Science Pipelines. We convert the R.A., decl. of the host to the x, y coordinate on the image. We then simulate point sources based on the calexp PSF model at that location on the image. The synthetic source injection framework converts the magnitude to the instrumental flux and scales the normalized synthetic source model by the instrumental flux. In the single-frame processing, the instrumental flux to magnitude calibration of the images was calculated with respect to a 12 pixel radius aperture. While the PSF model is finite in size, it is always larger than the calibration radius. The PSF model is normalized to sum to 1. We calculate the aperture correction between the full PSF and the smaller calibration radius and scale the model of the PSF by this correction. We then multiply the scaled PSF by the instrumental flux from the calibration to inject a source that has the correct photoelectron contribution for its magnitude. The scaled model is added to the image plane of the calexp without making any changes to its variance plane, which means we ignore the Poisson noise of the synthetic source.

After finishing the injection stage, we run the DIA pipeline on the injected data set using the imageDifferenceDriver.py of the dia_pipe package with v23.0.0 11 of the LSST Science Pipelines. This stage produces difference images and then runs photometry measurement algorithms on that difference image to produce the DIA source (diaSrc) table, which contains source measurements of the detected source. We obtain positions of variable stars and active galactic nuclei (AGNs) from the truth tables in GCRCatalogs and match them to the diaSrc tables. All variables and AGNs that get matched are removed from the diaSrc tables. As a result, the detections from the diaSrc tables only have synthetic sources and artifacts. 12 Next, we match the known injection positions to the diaSrc tables to determine whether the injected synthetic sources are detected in the difference exposures. The diaSrc fluxes and the flag information of the detected sources are also extracted from the diaSrc tables for further analysis.

3.5. LSST Software

In this work, we use the v23.0.0 version of the LSST stack package to provide basic functionalities for synthetic source injection and image difference. We have documented the proper software versions in the GitHub repo LSSTDESC/dia_improvement, 13 which includes scripts to reproduce our work.

4. DIA Pipeline Analysis

In this section, we describe performance metrics, injection strategy, and analysis results of the DIA pipeline. Different kernel spatial degree of freedoms have been implemented in the pipeline to reduce subtraction artifacts. We compare their performance here and identify the photometric flags that effectively select true positives with high efficiency and a low artifact rate.

4.1. Metrics

We evaluate the subtraction quality of the DIA pipeline using the following metrics. These metrics should reflect the detection ability and the photometric measurement accuracy of the pipeline.

  • 1.  
    Efficiency 14 (Eff): the number of detected synthetic sources divided by the number of injected synthetic sources. We expect it to approach unity when synthetic sources are bright and decrease to 0 at the faint magnitude end. The efficiency curve can be fitted using a sigmoid model. In this work, we use the function
    Equation (5)
    to fit our data, with a and b as fitting parameters and m as the synthetic source magnitude. 15 Using this equation, we can solve the magnitude depth m1/2, which corresponds to the synthetic source magnitude, where the efficiency drops to 50%. The magnitude depth m1/2 is mathematically equivalent to the fitting parameter b.
  • 2.  
    Artifact rate (AR): the number of subtraction artifacts per square degree.
  • 3.  
    Flux pullF/σ):
    This flux pull should follow a normal distribution with a mean of 0 and a standard deviation of 1. In this analysis, we calculate the pull statistic of the diaSrc PSF flux.

4.2. Analysis Strategies

We apply two analysis strategies to sets of DC2 calexps. The first strategy is to focus on measuring efficiency as a function of source magnitude and background SB; we inject synthetic sources from 20–23 mag in 0.5 mag steps and 23–24 mag in 0.1 mag steps. These finer steps focus on the magnitude range where the efficiency drops rapidly. We use this strategy to determine the magnitude depth m1/2 at which 50% of the sources are recovered.

For the second strategy, we undertook a related but separate campaign of studying efficiency and artifact rate for high signal-to-noise synthetic sources and low signal-to-noise synthetic sources with different subtraction configurations with two consistent sets of 20 mag (MAG20) sources and 23 mag (MAG23) sources.

4.3. Performance of the Default Configuration of the DIA Pipeline

To get a baseline evaluation of the performance of the DIA pipeline, we run dia_pipe using the default configuration of the pipeline. The default difference imaging algorithm is AL with a set of Gauss-polynomial bases. The AL kernel has three Gaussians with sigma values of 0.7, 1.5, and 3.0 pixels, with respective order 4, 2, and 2 polynomials in kernel (u, v) space. These kernel terms are allowed to spatially vary at first order in x, y across the image. The background level is allowed to vary spatially at second order in x, y.

The upper plot of Figure 2 shows our recovered detection efficiency as a function of synthetic source magnitude. The efficiency follows predicted smooth curve efficiency functions, with a decrease in efficiency as a function of both (1) decreasing injected flux and (2) increasing underlying SB (and thus background noise) at the location the synthetic source is placed. The effect of the underlying host galaxy light is most visible for the brightest hosts, SB = 20–21 mag arcsec−2, which shows a substantial 0.25 mag m1/2 shift in the efficiency curve. The DIA pipeline successfully recovers all 16 synthetic sources brighter than 22 mag. The efficiency starts to drop below 1 around 22 mag, and it drops below 0.5 between 23 and 24 mag. In the lower plot of Figure 2, we focus on the detection efficiency at the faint magnitude (low S/N) limit. The predicted values of the m1/2 of the fitted function are between 23.40 and 23.71 mag.

Figure 2.

Figure 2. Detection efficiency vs. source mag (dots). Solid lines are sigmoid models fitted to the data. Different SB magnitudes are indicated using the colors shown in the legend. The upper plot corresponds to 20–24 synthetic source magnitudes with a 0.5 mag interval. The lower plot corresponds to 23–24 synthetic source magnitudes with a 0.1 mag interval. The m1/2 for different host SBs are 23.40 (20–21 mag arcsec−2), 23.55 (21–22 mag arcsec−2), 23.64 (22–23 mag arcsec−2), 23.66 (23–24 mag arcsec−2), and 23.71 (24–25 mag arcsec−2). They are shown as vertical lines in the lower plot.

Standard image High-resolution image

Figure 3 shows the S/N distribution of the detected synthetic sources when injected at 21, 22, 23, and 24 mag. At the synthetic source magnitude of 21, the S/N distribution is far above the detection limit, which means the pipeline should be able to recover all synthetic sources. At the injected magnitude of 23, the S/N distribution overlaps the detection threshold, and the efficiency begins to drop. When the synthetic source magnitude is 24, most injections are below the detection threshold, which explains why the efficiency is low at this magnitude.

Figure 3.

Figure 3. S/N distribution of detected sources with true mag listed above each panel. Injections from different host SB are binned together. The efficiency at each source magnitude is shown on each plot. The vertical line in each plot shows the S/N = 5.5 detection threshold. The S/N distribution is closer to the threshold for fainter synthetic sources.

Standard image High-resolution image

Figure 4 shows the distributions of the detected magnitude and injected magnitude of synthetic sources. At the injected magnitude of (20, 21, 22, 23, 24), more than (99%, 99%, 90%, 57%, 8%) of the flux measurements are within 0.1 mag of the injected magnitudes. At 24 mag, the distribution is significantly biased toward the bright side due to the Malmquist bias (Malmquist 1922, 1925).

Figure 4.

Figure 4. Distributions of recovered magnitudes (solid lines) of synthetic sources at different injected magnitudes (dashed lines). For injected magnitudes of (20, 21, 22, 23, 24), more than (99%, 99%, 90%, 57%, 8%) of the recovered flux measurements are within 0.1 mag of the injected magnitudes. At 24 mag, the distribution of recovered magnitudes is significantly biased toward the bright side due to Malmquist bias.

Standard image High-resolution image

4.4. Comparison with Previous DESC DC2 Analysis

Here we give a comparison of the i-band m1/2 to the previous DESC DC2 DIA (Sánchez et al. 2022). That work evaluated the m1/2 of the LSST DIA pipeline using about 15 deg2 of the DC2 data set with artificial SN Ia light curves overlaid onto exposures. The estimated m1/2 in the i band is 23.50 mag for all sources.

In this work, we estimate the m1/2 in the i band using about 0.36 deg2 of the DC2 data set with carefully controlled synthetic point sources distributed across different host brightness. The fitted m1/2 is between 23.40 mag and 23.71 mag, depending on the underlying SB.

4.5. Kernel Spatial Degree of Freedom

Alard & Lupton (1998) and Alard (2000) suggest expanding the AL kernel coefficients aq (x, y) (Section 2.1) as a function of a power series of positions. The highest-order ds of the power series defines the kernel spatial degree of freedom. The implementation of the AL algorithm in the DIA pipeline can be configured to run with different values of ds . In this section, we explore the pipeline performance with ds varying from 1 to 4. The pipeline default value of ds is 1. Our analysis was constrained to a maximum spatial order of ds = 4 due to limitations in v23.0.0 of the LSST Science Pipelines, which have since been lifted. Raising ds beyond 4 offers increased kernel flexibility, thereby potentially reducing the artifact rate. However, this approach carries the risk of overfitting, since the number of spatial polynomials scales proportionally with ${d}_{s}^{2}$ (Alard 2000; Bramich et al. 2013). Additionally, it would result in an increase in computational time.

The primary goal of exploring the spatial degrees of freedom is to reduce artifacts while maintaining high efficiency and flux accuracy. We focus here on high-S/N sources at 20 mag (MAG20), which is far above the detection limit, and low-S/N sources at 23 mag (MAG23), which is close to the m1/2. Since the comparison focuses on the region that is away from the image edge, 17 it is possible that the high ds s are performing worse near the image edge. More evaluations near the image boundary are still needed in future work.

In Table 1, we show the efficiency, artifact rate, 18 mean, and standard deviation of the flux pull distribution for each spatial degree of freedom at MAG20 and MAG23. We find that the results from high S/N and low S/N are consistent. Up to ds = 4, the artifact rate is a monotonically decreasing function of ds . The fluctuation of the recovered mean pull of the flux of the injected sources is less than 0.020. The fluctuation of the standard deviation is less than 0.010. The fluctuation of the efficiency is less than 0.013.

Table 1. Flux Difference Pull Distribution, Efficiency, and Artifact Rate with Different Kernel Spatial Degree of Freedom

 MAG20MAG23
ds MeanStd.Eff.Artifact RateMeanStd.Eff.Artifact Rate
 ΔF/σ  (deg−2)ΔF/σ  (deg−2)
10.0901.2321.0006950.0961.0000.847700
20.0871.2281.0006890.0900.9990.840693
30.0921.2371.0006500.0881.0110.843651
40.1061.2321.0006120.0921.0060.852608

Note. The mean and standard deviations are of the pull distribution of the difference between detected flux and injected flux (ΔF/σ). We apply a 5σ clip to the pull distribution to remove outliers. At MAG20, the percentages of outliers are (0.249%, 0.232%, 0.182%, 0.265%) at (ds = 1, 2, 3, 4). At MAG23, the percentages of outliers are (0.054%, 0.036%, 0.054%, 0.036%) at (ds = 1, 2, 3, 4). The default spatial degree of freedom is ds = 1.

Download table as:  ASCIITypeset image

In Figure 5, we show the postage stamps of difference images from different spatial degrees of freedom. Each column of stamps is made at the same position, while each row corresponds to a specific ds . By comparing the stamps, we find that high ds s, especially ds = 4, can produce cleaner difference images. As a result, the artifact rate decreases as the ds increases.

Figure 5.

Figure 5. Postage stamps of difference images from different kernel spatial degrees of freedom (ds ). Each column shows one location on the subtraction image. Each row corresponds to a specific ds .

Standard image High-resolution image

4.6. Detection Flags

The LSST Science Pipelines are configured to detect any possible source and characterize it with flags to indicate detections that are not real astrophysical sources. These 109 flags identify pixel-level features such as saturation, bad pixels, and cosmic rays as well as characteristics of the source in the subtraction such as moment and dipole fits. We use these flags to reject artifacts, which we here define as detections that are not in circular regions with 0farcs8 radii of known synthetic sources. The data are from MAG20 and MAG23, with host SB between 20 and 21 mag arcsec−2. In this synthetic source injection analysis, we split the detected sources from each diaSrc table into two tables: one for the detected synthetic sources (MAG20: 1184 entries; MAG23: 1003 entries) and the other for detections that are artifacts (MAG20: 2446 entries; MAG23: 2464 entries).

We use the efficiency and artifact rate to examine the power of each flag to classify sources into real and artifact. We calculate these metrics for all flags and show the full table in Table 2. From this information in this table, we can select a small set of flags where (1) the efficiency of each flag is close to 1 at MAG20 and (2) each flag has a clear physical interpretation related to artifact classification. We document this "selected flags" set in Table 3. To evaluate the efficacy of this flag set, we compare the efficiency and artifact rate before and after applying the union of these flags to our detection. At MAG20, we find that the efficiency drops from 1.000 to 0.999, and the artifact rate drops from 695 deg−2 to 69 deg−2. At MAG23, the efficiency drops from 0.847 to 0.791, and the artifact rate drops from 700 deg−2 to 68 deg−2. These results suggest that using the "selected flags" to sift artifacts significantly improves the artifact rate with only a small decline in efficiency. We strongly recommend the use of these flags when looking to identify a set of real astrophysical detections in diaSrc tables.

Table 2. Efficiency and Artifact Rate of All Flags

  MAG20MAG23
FlagDescriptionEff.Artifact RateEff.Artifact Rate
   (deg−2) (deg−2)
No flag applied1.0006950.847700
flags_negativeSet if source was detected as significantly negative1.0006950.847700
base_NaiveCentroid_flagGeneral failure flag1.0006850.847689
base_NaiveCentroid_flag_noCountsObject to be centroided has no counts1.0006950.847700
base_NaiveCentroid_flag_edgeObject too close to edge1.0006950.847700
base_NaiveCentroid_flag_resetToPeakSet if CentroidChecker reset the centroid1.0006850.847689
base_PeakCentroid_flagCentroiding failed1.0006950.847700
base_SdssCentroid_flagGeneral failure flag1.0002300.846220
base_SdssCentroid_flag_edgeObject too close to edge1.0006830.847689
base_SdssCentroid_flag_noSecondDerivativeVanishing second derivative1.0006950.847700
base_SdssCentroid_flag_almostNoSecondDerivativeAlmost vanishing second derivative1.0006760.847681
base_SdssCentroid_flag_notAtMaximumObject is not at a maximum1.0002750.846266
base_SdssCentroid_flag_resetToPeakSet if CentroidChecker reset the centroid1.0006790.847683
base_SdssCentroid_flag_badErrorError on x and/or y position is NaN1.0006810.847685
ip_diffim_NaiveDipoleCentroid_flagGeneral failure flag, set if anything went wrong1.0006950.847700
base_CircularApertureFlux_flag_badCentroidGeneral failure flag, set if anything went wrong1.0006950.847700
base_GaussianFlux_flag_badCentroidGeneral failure flag, set if anything went wrong1.0006950.847700
base_NaiveCentroid_flag_badInitialCentroidGeneral failure flag, set if anything went wrong1.0006950.847700
base_PeakLikelihoodFlux_flag_badCentroidGeneral failure flag, set if anything went wrong1.0006950.847700
base_PsfFlux_flag_badCentroidGeneral failure flag, set if anything went wrong1.0006950.847700
base_SdssCentroid_flag_badInitialCentroidGeneral failure flag, set if anything went wrong1.0006950.847700
base_SdssShape_flag_badCentroidGeneral failure flag, set if anything went wrong1.0006950.847700
slot_Centroid_flagGeneral failure flag, set if anything went wrong1.0006950.847700
ip_diffim_NaiveDipoleCentroid_pos_flagFailure flag for positive, set if anything went wrong1.0006950.847700
slot_Centroid_pos_flagFailure flag for positive, set if anything went wrong1.0006950.847700
ip_diffim_NaiveDipoleCentroid_neg_flagFailure flag for negative, set if anything went wrong1.0006950.847700
slot_Centroid_neg_flagFailure flag for negative, set if anything went wrong1.0006950.847700
base_SdssShape_flagGeneral failure flag1.0001610.791157
base_GaussianFlux_flag_badShapeGeneral failure flag1.0001610.791157
slot_Shape_flagGeneral failure flag1.0001610.791157
base_SdssShape_flag_unweightedBadBoth weighted and unweighted moments were invalid1.0002290.826225
base_GaussianFlux_flag_badShape_unweightedBadBoth weighted and unweighted moments were invalid1.0002290.826225
slot_Shape_flag_unweightedBadBoth weighted and unweighted moments were invalid1.0002290.826225
base_SdssShape_flag_unweightedWeighted moments converged to an invalid value; using unweighted moments1.0006330.812639
base_GaussianFlux_flag_badShape_unweightedWeighted moments converged to an invalid value; using unweighted moments1.0006330.812639
slot_Shape_flag_unweightedWeighted moments converged to an invalid value; using unweighted moments1.0006330.812639
base_SdssShape_flag_shiftCentroid shifted by more than the maximum allowed amount1.0006650.847673
base_GaussianFlux_flag_badShape_shiftCentroid shifted by more than the maximum allowed amount1.0006650.847673
slot_Shape_flag_shiftCentroid shifted by more than the maximum allowed amount1.0006650.847673
base_SdssShape_flag_maxIterToo many iterations in adaptive moments1.0006810.825684
base_GaussianFlux_flag_badShape_maxIterToo many iterations in adaptive moments1.0006810.825684
slot_Shape_flag_maxIterToo many iterations in adaptive moments1.0006810.825684
base_SdssShape_flag_psfFailure in measuring PSF model shape1.0006950.847700
base_GaussianFlux_flag_badShape_psfFailure in measuring PSF model shape1.0006950.847700
slot_Shape_flag_psfFailure in measuring PSF model shape1.0006950.847700
base_CircularApertureFlux_3_0_flagGeneral failure flag1.0006950.847700
base_CircularApertureFlux_3_0_flag_apertureTruncatedAperture did not fit within measurement image1.0006950.847700
base_CircularApertureFlux_3_0_flag_sincCoeffsTruncatedFull sinc coefficient image did not fit within measurement image1.0006950.847700
base_CircularApertureFlux_4_5_flagGeneral failure flag1.0006950.847700
base_CircularApertureFlux_4_5_flag_apertureTruncatedAperture did not fit within measurement image1.0006950.847700
base_CircularApertureFlux_4_5_flag_sincCoeffsTruncatedFull sinc coefficient image did not fit within measurement image1.0006940.847699
base_CircularApertureFlux_6_0_flagGeneral failure flag1.0006950.847700
base_CircularApertureFlux_6_0_flag_apertureTruncatedAperture did not fit within measurement image1.0006950.847700
base_CircularApertureFlux_6_0_flag_sincCoeffsTruncatedFull sinc coefficient image did not fit within measurement image1.0006940.847699
base_CircularApertureFlux_9_0_flagGeneral failure flag1.0006950.847700
base_CircularApertureFlux_9_0_flag_apertureTruncatedAperture did not fit within measurement image1.0006950.847700
base_CircularApertureFlux_9_0_flag_sincCoeffsTruncatedFull sinc coefficient image did not fit within measurement image1.0006760.847681
base_CircularApertureFlux_12_0_flagGeneral failure flag1.0006950.847700
slot_ApFlux_flagGeneral failure flag1.0006950.847700
base_CircularApertureFlux_12_0_flag_apertureTruncatedAperture did not fit within measurement image1.0006950.847700
slot_ApFlux_flag_apertureTruncatedAperture did not fit within measurement image1.0006950.847700
base_CircularApertureFlux_17_0_flagGeneral failure flag1.0006910.847696
base_CircularApertureFlux_17_0_flag_apertureTruncatedAperture did not fit within measurement image1.0006910.847696
base_CircularApertureFlux_25_0_flagGeneral failure flag1.0006820.847687
base_CircularApertureFlux_25_0_flag_apertureTruncatedAperture did not fit within measurement image1.0006820.847687
base_CircularApertureFlux_35_0_flagGeneral failure flag1.0006680.847673
base_CircularApertureFlux_35_0_flag_apertureTruncatedAperture did not fit within measurement image1.0006680.847673
base_CircularApertureFlux_50_0_flagGeneral failure flag1.0006510.847656
base_CircularApertureFlux_50_0_flag_apertureTruncatedAperture did not fit within measurement image1.0006510.847656
base_CircularApertureFlux_70_0_flagGeneral failure flag1.0006370.847641
base_CircularApertureFlux_70_0_flag_apertureTruncatedAperture did not fit within measurement image1.0006370.847641
base_GaussianFlux_flagGeneral failure flag1.0002670.837260
base_LocalPhotoCalib_flagSet for any fatal failure1.0006950.847700
base_LocalWcs_flagSet for any fatal failure1.0006950.847700
base_PeakLikelihoodFlux_flagGeneral failure flag1.0006950.847700
base_PixelFlags_flagGeneral failure flag, set if anything went wrong1.0006950.847700
base_PixelFlags_flag_offimageSource center is off image1.0006950.847700
base_PixelFlags_flag_edgeSource is outside usable exposure region (masked EDGE or NO_DATA)1.0005560.847561
base_PixelFlags_flag_interpolatedInterpolated pixel in the source footprint0.7963220.747328
base_PixelFlags_flag_saturatedSaturated pixel in the source footprint1.0003370.847343
base_PixelFlags_flag_crCosmic ray in the source footprint0.7965620.747567
base_PixelFlags_flag_badBad pixel in the source footprint1.0006950.847700
base_PixelFlags_flag_suspectSource's footprint includes suspect pixels1.0006950.847700
base_PixelFlags_flag_interpolatedCenterInterpolated pixel in the source center0.9353620.795369
base_PixelFlags_flag_saturatedCenterSaturated pixel in the source center1.0003400.847345
base_PixelFlags_flag_crCenterCosmic ray in the source center0.9356600.795665
base_PixelFlags_flag_suspectCenterSource's center is close to suspect pixels1.0006950.847700
base_PsfFlux_flagGeneral failure flag1.0006730.847678
slot_PsfFlux_flagGeneral failure flag1.0006730.847678
base_PsfFlux_flag_noGoodPixelsNot enough nonrejected pixels in data to attempt the fit1.0006950.847700
slot_PsfFlux_flag_noGoodPixelsNot enough nonrejected pixels in data to attempt the fit1.0006950.847700
base_PsfFlux_flag_edgeObject was too close to the edge of the image to use the full PSF model1.0006890.847694
slot_PsfFlux_flag_edgeObject was too close to the edge of the image to use the full PSF model1.0006890.847694
ip_diffim_NaiveDipoleFlux_flagGeneral failure flag, set if anything went wrong1.0006950.847700
ip_diffim_NaiveDipoleFlux_pos_flagFailure flag for positive, set if anything went wrong1.0006950.847700
ip_diffim_NaiveDipoleFlux_neg_flagFailure flag for negative, set if anything went wrong1.0006950.847700
ip_diffim_PsfDipoleFlux_flagGeneral failure flag, set if anything went wrong1.0006470.847650
ip_diffim_PsfDipoleFlux_pos_flagFailure flag for positive, set if anything went wrong1.0006950.847700
ip_diffim_PsfDipoleFlux_neg_flagFailure flag for negative, set if anything went wrong1.0006950.847700
ip_diffim_ClassificationDipole_flagSet to 1 for any fatal failure1.0006950.847700
ip_diffim_DipoleFit_flag_classificationFlag indicating diaSrc is classified as a dipole0.9994740.846480
ip_diffim_DipoleFit_flag_classificationAttemptedFlag indicating attempt to classify diaSrc as a dipole0.9994720.846478
ip_diffim_DipoleFit_flagGeneral failure flag for dipole fit0.0012220.001221
ip_diffim_DipoleFit_flag_edgeFlag set when dipole is too close to edge of image1.0006930.847699
base_GaussianFlux_flag_apCorrSet if unable to aperture correct base_GaussianFlux1.0006950.847700
base_PsfFlux_flag_apCorrSet if unable to aperture correct base_PsfFlux1.0006950.847700
slot_PsfFlux_flag_apCorrSet if unable to aperture correct base_PsfFlux1.0006950.847700
ip_diffim_forced_PsfFlux_flagForced PSF flux general failure flag1.0006890.847694
ip_diffim_forced_PsfFlux_flag_noGoodPixelsForced PSF flux not enough nonrejected pixels in data to attempt the fit1.0006950.847700
ip_diffim_forced_PsfFlux_flag_edgeForced PSF flux object was too close to the edge of the image1.0006890.847694
 to use the full PSF model    
Union of flagsApply all above flags for source selection0.00000.0000

Note. Flags can be used for removing artifacts. For a specific flag, detected sources are classified as artifacts if the flag is set to True. The first row shows the results without applying flags. The last row shows the results of applying all of the above flags.

Download table as:  ASCIITypeset images: 1 2 3

Table 3. Efficiency and Artifact Rate of Selected Flags

  MAG20MAG23
FlagDescriptionEff.Artifact RateEff.Artifact Rate
   (deg−2) (deg−2)
No flag applied1.0006950.847700
base_PixelFlags_flag_saturatedSaturated pixel in the source footprint1.0003370.847343
base_PixelFlags_flag_saturatedCenterSaturated pixel in the source center1.0003400.847345
base_PixelFlags_flag_suspectSource's footprint includes suspect pixels1.0006950.847700
base_PixelFlags_flag_suspectCenterSource's center is close to suspect pixels1.0006950.847700
base_PixelFlags_flag_offimageSource center is off image1.0006950.847700
base_PixelFlags_flag_edgeSource is outside usable exposure region (masked EDGE or NO_DATA)1.0005560.847561
base_PixelFlags_flag_badBad pixel in the source footprint1.0006950.847700
ip_diffim_DipoleFit_flag_classificationFlag indicating diaSrc is classified as a dipole0.9994740.846480
ip_diffim_DipoleFit_flag_classificationAttemptedFlag indicating classification of diaSrc as a dipole was attempted0.9994720.846478
base_SdssShape_flagGeneral failure flag1.0001610.791157
base_GaussianFlux_flag_badShapeGeneral failure flag1.0001610.791157
slot_Shape_flagGeneral failure flag1.0001610.791157
Union of flagsApply all above flags for source selection0.999690.79168

Note. Flags can be used for removing artifacts. For a specific flag, detected sources are classified as artifacts if the flag is set to True. The first row shows the results without applying flags. The last row shows the results of applying all of the above flags.

Download table as:  ASCIITypeset image

5. Artifact Morphology

5.1. Artifact Definition

Now that we can identify artifacts using the "selected flags" discussed in Section 4.6, we examine their physical origins using artifacts from AL-based subtractions on the MAG20 data set. We define four types of artifacts that are commonly seen in difference images. The morphology of these artifacts is shown in Figure 6.

  • 1.  
    Saturation artifact: caused by saturated pixels, where the PSF shape is distorted. As a result, an optimal kernel solution does not exist.
  • 2.  
    Clustering-pixels artifact: a clustering-pixels artifact appears as clusters of positive pixels and negative pixels. It comes from subtraction residuals of background sources or background noise.
  • 3.  
    Dipole artifact: offset positive and negative lobes. Inaccurate image registration or nonoptimal image resampling can cause dipole artifacts. It is a subclass of the clustering-pixels artifact.
  • 4.  
    Deconvolution artifact: a ring pattern. It is found in AL difference images when the full width at half-maximum (FWHM) of the science image is sharper than the FWHM of the template image. In this scenario, the matching kernel has to "deconvolve" the PSF of the template but is not able to do so because the necessary information is not available.

Figure 6.

Figure 6. Artifacts commonly seen in difference images. Sources with saturated pixels (left) are easy to identify and marked with saturation flags in the diaSrc table. Clustering pixels (second from left) come from subtraction residuals of background sources or background noise. Dipoles (second from right) require measurements and fitting; many are successfully identified, but there is a continuum of less-offset and lower signal-to-noise dipoles that are hard to identify. The dipole artifact is a subclass of the clustering-pixels artifact. Deconvolution (right) leaves characteristic rings from trying to invert information that is not available in the larger PSF image.

Standard image High-resolution image

5.2. Artifact Morphology and Flags

To help us understand the relationship between flags and artifact morphology, we split the "selected flags" defined in Section 4.6 into three groups: saturation flags, dipole flags, and shape flags. As their names suggest, saturation flags are set to True if the detected source contains saturated pixels. We name detected artifacts with the saturation flag set to True saturation-flag artifacts and name the rest non-saturation-flag artifacts. Dipole flags are set to True for dipole artifacts. It is possible that a detected source visually appears as a dipole but has no dipole flags set to True. This could be related to the trigger mechanism and flux separation of dipole measurements. When the DIA pipeline is making a dipole classification for a detected source, it fits the fluxes of the negative and positive lobes and their corresponding centroids simultaneously. A detected source is classified as a dipole if the quadrature sum of the fitted fluxes is above the corresponding minimum S/N threshold and the fitted fluxes weighted by the total flux are both smaller than the maximum flux ratio threshold. Any detection that visually appears as a dipole but does not satisfy these criteria is not classified as a dipole artifact. Also, the fitting algorithm itself may have degeneracy in estimating dipole fluxes. Dipoles from faint and highly separated sources are almost indistinguishable from dipoles from bright and closely separated sources (Reiss 2016). The inaccuracy of dipole flux estimation could further affect dipole classification. Possible improvements such as using presubtraction images in dipole fitting are discussed in Reiss (2016). Shape flags are set to True if the detected source has second moments that are inconsistent with a Gaussian shape. More detailed definitions of these flag sets are given in Table 4.

Table 4. Definitions of Saturation Flags, Dipole Flags, and Shape Flags

Flag SetFlagsNotes
Saturation flagsbase_PixelFlags_flag_saturated, base_PixelFlags_flag_saturatedCenter, base_PixelFlags_flag_suspect, base_PixelFlags_flag_suspectCenter, base_PixelFlags_flag_offimage, base_PixelFlags_flag_edge, base_PixelFlags_flag_badSome of these flags are not actually for saturation but rather for other types of pixel-level issues. We include them here because they also indicate pixel issues. Visually inspecting detected sources shows that the majority of artifacts that have these flags set to True are saturation artifacts.
Dipole flagsip_diffim_DipoleFit_flag_classification, ip_diffim_DipoleFit_flag_classificationAttempted
Shape flagsbase_SdssShape_flag, base_GaussianFlux_flag_badShapeThe algorithm is based on the elliptical shape (Lupton et al. 2001; Bosch et al. 2017) of the detected sources. The implementation (v23.0.0) used in our analysis flags negative pixel values, which is not what we want for time-variable objects that can be brighter and fainter than in the template image. This bug has been fixed in v25.0.0 of the LSST Science Pipelines.

Download table as:  ASCIITypeset image

5.3. Seeing Conditions

Artifact morphology can be affected by the relative seeing conditions of template images and science images. We split our image pairs defined in Section 3.3 into three classes based on the FWHM of their coadd (template)/calexp (science) image PSFs. These classes are defined as

  • 1.  
    broad: calexp FWHM ≥ coadd FWHM;
  • 2.  
    near: calexp FWHM < coadd FWHM and (calexp FWHM − coadd FWHM) / (coadd FWHM) > −0.05; and
  • 3.  
    sharp: (calexp FWHM − coadd FWHM) / (coadd FWHM) ≤ −0.05.

In the broad class, calexps have larger PSFs than coadd exposures, which is the condition that the AL algorithm is capable of matching the coadd PSF to the calexp PSF. In the near class, calexp PSFs are smaller than coadd PSFs. However, the relative difference of FWHMs is still close. In the sharp class, calexp PSFs are much smaller than coadd PSFs. The AL matching kernel has to "deconvolve" the coadd PSF to match the calexp PSF.

5.4. Artifact Morphology and Artifact Fraction

Using the above definitions, we can study artifact morphology in each FWHM class. We start with calculating the proportions of artifacts in each flag group. The results are shown in Table 5. The fraction of artifacts in each class with saturation flags set to True is nearly 50%. Since they are subtraction residuals of bright background sources (starts, galaxies, etc.), it is important to understand at what background source S/N this issue occurs most frequently. We match background sources from the calexp src (Section 3.1) to artifacts using a one-direction-matching 19 method with a 2'' search radius. Figure 7 shows the artifact fraction with respect to the calexp source S/N. 20 There is an increasing trend between the artifact fraction and the source S/N. This comparison shows that using saturation flags to identify pixel issues can help us to remove saturation artifacts, especially when the source S/N is greater than 600.

Figure 7.

Figure 7. Artifacts come from the mis-subtraction of bright sources. Here we show the fraction of detections that are artifacts as a function of the S/N of the nearest background source within 2''. The dashed line shows the fraction of all artifacts. The solid line shows the fraction of artifacts that have no saturation flags set to True.

Standard image High-resolution image

Next, we study the artifacts that have no saturation flags set to True. These non-saturation-flag artifacts are detected under different seeing conditions with different artifact rates. For our analysis, we find that the rates of non-saturation-flag artifacts per square degree are 286 (broad), 312 (near), and 414 (sharp), which suggests that as the calexp PSFs become sharper, the artifact rate increases (Table 5).

The morphologies of non-saturation-flag artifacts are shown in Figures 8, 9, and 10. By inspecting postage stamps of these artifacts, we find that the majority of artifacts in the broad class are dipole artifacts and clustering-pixels artifacts. In the near class, the artifact morphology is similar to the broad class. However, in the sharp class, there are more ring artifacts, which indicates the PSF matching kernel is deconvolving the seeing condition of the template image.

Figure 8.

Figure 8. Morphology of non-saturation-flag artifacts from the broad class. The top row shows artifacts with both dipole flags and shape flags set to True. The second row from the top shows artifacts with dipole flags set to True and shape flags set to False. The second row from the bottom shows artifacts with dipole flags set to False and shape flags set to True. The bottom row shows artifacts with both dipole flags and shape flags set to False.

Standard image High-resolution image
Figure 9.

Figure 9. Morphology of non-saturation-flag artifacts from the near class. The top row shows artifacts with both dipole flags and shape flags set to True. The second row from the top shows artifacts with dipole flags set to True and shape flags set to False. The second row from the bottom shows artifacts with dipole flags set to False and shape flags set to True. The bottom row shows artifacts with both dipole flags and shape flags set to False.

Standard image High-resolution image
Figure 10.

Figure 10. Morphology of non-saturation-flag artifacts from the sharp class. The top row shows artifacts with both dipole flags and shape flags set to True. The second row from the top shows artifacts with dipole flags set to True and shape flags set to False. The second row from the bottom shows artifacts with dipole flags set to False and shape flags set to True. The bottom row shows artifacts with both dipole flags and shape flags set to False.

Standard image High-resolution image

To understand the source properties of the non-saturation-flag artifacts, we match them to the src using the one-direction-matching method with a 2'' matching radius. 21 Of the artifacts, 80.2% have matches in the src, which means most of these artifacts are also subtraction residuals of background sources. In Figure 11, S/N distributions of matched sources in each FWHM class are plotted. Most sources have S/Ns close to or higher than 200, which means bright sources are the main contribution to detection artifacts. This is reasonable because we can only observe artifacts from bright sources where subtraction residuals can exceed sky background fluctuations. It is hard for faint sources to produce visible artifacts because their subtraction residuals are overwhelmed by sky background noise. Other than bright sources, we also find clusters of faint sources in the S/N distributions of each class. In Figure 12, we plot a few stamps of faint sources from coadd exposures, calexps, and their corresponding artifacts from difference exposures. These plots show that peak pixels are the main causes of these artifacts. As to the artifacts that have no matches from the src, we show their morphology in Figure 13. Clearly, background noise is the origin of these artifacts. The artifact rate of this type is 62 deg–2 in the broad class, 57 deg–2 in the near class, and 79 deg–2 in the sharp class.

Figure 11.

Figure 11. Signal-to-noise distribution of the sources in different FWHM classes.

Standard image High-resolution image
Figure 12.

Figure 12. Morphology of artifacts from faint background sources. (Top row) Postage stamps of coadd images. (Second row from top) Postage stamps of calexp images. (Bottom row) Postage stamps of artifacts.

Standard image High-resolution image
Figure 13.

Figure 13. Morphology of artifacts that have no matches from the src. These artifacts are visually similar in each class. Inspecting their stamps shows that they are from correlated noise.

Standard image High-resolution image

5.5. Dipole Artifacts

Because dipole artifacts are prevalent in all classes with the AL algorithm, we investigate whether the ZOGY algorithm also produces dipole artifacts at the same locations. Figure 14 shows postage stamps of AL dipoles and postage stamps of the ZOGY algorithm at the same positions. Both the kernel matching method (AL) and the cross-filtering method (ZOGY) have dipole artifacts with the same orientations at the same locations.

Figure 14.

Figure 14. (Top row) Postage stamps of coadd images. (Second row from top) Postage stamps of calexp images. (Second row from bottom) Postage stamps of dipole artifacts in subtraction using the AL Gauss-polynomial algorithm. (Bottom row) Postage stamps of subtractions using the ZOGY algorithm made at the same positions as the AL dipoles. The artifacts are consistent between the two methods.

Standard image High-resolution image

Here, we give a brief discussion of several different possible causes of dipole artifacts.

(1) Image registration. It is possible that dipoles are caused by small errors in the relative astrometric registration of the images. Applying astrometric calibration may reduce the registration error. However, even subpixel errors could be magnified by bright sources, thereby producing visible dipole artifacts that have positive–negative separations of multiple pixels. (2) Sampling. Another possible origin of dipoles is the sampling process used for observing or warping exposures. Errors introduced in this process can be magnified by the brightness of background stars, which would result in dipole artifacts. (3) Moving object. An object with a small motion between the images can create a dipole that represents real astrophysical change. However, there are no stars with proper motion in the DC2 simulation, so this particular effect is not applicable to our testing here. (4) PSF matching. It is also possible that the AL algorithm finds an approximate kernel solution. An asymmetric kernel convolving a symmetric coadd source and subtracted by a symmetric calexp source could produce a dipole artifact. However, we also find dipole artifacts using the ZOGY cross-filtering method, which has no kernel fitting involved (Figure 14). We do not entirely exclude the kernel fitting explanation because the PSF models used for cross-filtering are measured from images. It is thus also possible that the approximate PSF used for cross-filtering caused the dipole artifact, which is mathematically similar to a nonoptimal AL matching kernel. (5) Poisson fluctuation. In the bright region, the Poisson fluctuation of background sources can exceed the background sky fluctuation, which leads to local bright/faint residuals. These residuals can spread to nearby pixels after convolution operation, which may appear as dipole artifacts in difference images. More accurate estimations of variance and covariance may increase the detection thresholds and thereby remove these artifacts from the diaSrc table. (6) Differential chromatic refraction (DCR). The atmosphere of the Earth refracts incident light, thereby deflecting the observed positions of sources (FILIPPENKO 1982). This effect could result in dipole artifacts in difference images. Since the DC2 images do not include this effect, we exclude DCR as the cause of the artifacts in our analysis.

From this discussion, we conclude that the origin of the dipole artifacts is still undetermined. To thoroughly understand the cause will require control of the image resampling in more detail and carefully considering the resulting PSFs of the calexps and coadd templates.

6. Possible Improvements

We here summarize some potential improvements to the performance of the LSST Science Pipelines image subtraction algorithms.

(1) Implement spatially varying delta bases. In our work, we find that the default kernel bases used in the DIA pipeline do not fully correct sampling or registration issues. Instead of using Gauss-polynomial functions as the bases, we suggest using delta bases (Bramich 2008; Miller et al. 2008), which are the most flexible bases for image difference. We also recommend including spatial variation in the delta kernel because noise properties and PSF shapes may vary across a CCD image. To avoid overfitting the model to noise using these more flexible bases, different regularization techniques (Becker et al. 2012; Bramich et al. 2016) should also be tested.

(2) Correct the kernel solution when using preconvolution or choosing calexps as templates. When the PSF of a calexp is sharper than the PSF of a coadd exposure, the image subtraction kernel needs to perform the deconvolution to the coadd exposure, which usually results in ring artifacts in the difference image. To avoid ring artifacts, one approach is to "preconvolve" the calexp using either its own PSF or a more generic Gaussian kernel. The other approach is to use the calexp as the template image and match its PSF to the PSF of the coadd exposure. Both approaches involve convolving the calexp, which adds correlations between pixels. The default normal equation of the image difference solution assumes the template is noiseless. To account for the effect of correlations, we suggest including the pixel covariance matrix Σ in the kernel solution of the AL algorithm. Due to the huge size of Σ, how to properly estimate the correlations of neighboring pixels and store Σ are worthy questions for consideration in implementation.

(3) Improve the numerical stability of the ZOGY decorrelation kernel and its AL equivalent. The ZOGY algorithm provides a source-detection optimal difference image with the advantage of noise decorrelation. However, the numerical instability caused by the division of small values at high frequencies can affect the quality of its difference image and equivalent PSF (Reiss 2017; Kovacs 2021). Kovacs (2021) suggests that if we express the ZOGY $\hat{D}$ image as $\hat{D}=\hat{{k}_{s}}\hat{S}-\hat{{k}_{t}}\hat{T}$, with $\hat{{k}_{s}}\equiv \tfrac{{F}_{t}\hat{{P}_{t}}}{\sqrt{{\sigma }_{s}^{2}{F}_{t}^{2}| \hat{{P}_{t}}{| }^{2}+{\sigma }_{t}^{2}{F}_{s}^{2}| \hat{{P}_{s}}{| }^{2}}}$ and $\hat{{k}_{t}}\equiv \tfrac{{F}_{s}\hat{{P}_{s}}}{\sqrt{{\sigma }_{s}^{2}{F}_{t}^{2}| \hat{{P}_{t}}{| }^{2}+{\sigma }_{t}^{2}{F}_{s}^{2}| \hat{{P}_{s}}{| }^{2}}}$, the asymptotic behaviors of $\hat{{k}_{s}}$ and $\hat{{k}_{t}}$ can be determined by examining the ratio of $\tfrac{| \hat{{P}_{t}}| }{| \hat{{P}_{s}}| }$, which is either 0 or . In practice, the fast Fourier transform applied to obtain frequency images cannot maintain this ratio as a smooth function of frequency due to numerical accuracy. To correct this issue, Kovacs (2021) proposes three possible work-arounds, which are (a) using Gaussian models to create PSFs in Fourier space directly, (b) replacing the high-frequency noise using Gaussian limit values, and (c) using the score image of the ZOGY algorithm for detection because the score image is further convolved by PD , which can suppress the noisy part in D. The validity of these work-arounds has not been thoroughly studied and would benefit from careful thought and testing.

(4) Decorrelate the Poisson noise for bright sources. The ideal difference image is expected to have white noise, which means the covariance matrix of the pixels of the difference is diagonal. The ZOGY algorithm proposes dividing the cross-filtering difference image by $\tfrac{1}{\sqrt{{\sigma }_{s}^{2}{F}_{t}^{2}| \hat{{P}_{t}}{| }^{2}+{\sigma }_{t}^{2}{F}_{s}^{2}| \hat{{P}_{s}}{| }^{2}}}$ for image noise whitening. Mathematically, this is equivalent to the zero-phase component analysis (ZCA; Kessy et al. 2015) in complex space. The proof of this equivalence is given in the Appendix. In reality, however, the white-noise assumption for the original template image and science image is not proper for bright sources, where the Poisson noise of bright pixels is dominant. To obtain white noise for bright sources, it would be possible to apply a standard whitening process for the region around each bright source as long as we have its covariance matrix. However, given the large number of bright sources, calculating the inverse matrix of the covariance matrix may be time-consuming. A work-around method would be to treat each row or column of the stamp as a random vector and require the correlation between rows or columns to be white (Shi et al. 2016; Seghouane & Shokouhi 2018). This simplification significantly reduces the dimensions of the covariance matrix; however, the reliability of this method for image difference has not been tested.

7. Conclusions

We have evaluated the LSST Science Pipelines DIA pipeline using synthetic source injection. We selected seven coadd template exposures along with 10 calexps overlapping each coadd in the i band from the DC2 data set. By tracing the recovery of injected synthetic sources, we predict the range of the magnitude depth (m1/2) of the pipeline for realistic simulated LSST images to be 23.40–23.71 mag when the host SB ranges between 20–21 mag arcsec−2 and 24–25 mag arcsec−2. We explored the effects of changing the AL kernel spatial degree of freedom on reducing artifacts with synthetic injections at MAG20 and MAG23.

We find that the default setup of the AL algorithm is good in terms of transient source detection and photometric measurements. At the injected magnitudes of 20 and 21, the efficiency is above 0.989, and more than 99% of the flux measurements are within 0.1 mag of the injected magnitudes. In the region away from the image edge, increasing the spatial degrees of freedom up to 4 can reduce the artifact rate without hurting efficiency and flux measurement. We calculate the efficiency and artifact rate of the flags from the diaSrc table. We find that by applying a proper combination of saturation flags, dipole flags, and shape flags to the detection from the diaSrc table, we can reduce the artifact rate from 695 deg−2 to 69 deg−2 at MAG20, and the efficiency only drops from 1.000 to 0.999. At MAG23, the artifact rate drops from 700 deg−2 to 68 deg−2, while the efficiency drops from 0.847 to 0.791. To study the morphology of the artifacts and understand their origins, we split the artifacts into broad, near, and sharp classes, which correspond to the seeing condition where the calexp PSF is larger than the coadd PSF, the calexp PSF is smaller but less than 5% smaller than the coadd PSF, and the calexp PSF is more than 5% smaller than the coadd PSF. After removing saturation artifacts from the analysis, the artifact per square degree rate is 286 in the broad class, 312 in the near class, and 414 in the sharp class. We find dipole artifacts and correlated-pixel artifacts in subtractions from all classes. Ring artifacts are seen only in the sharp class, which is consistent with rings being decorrelation artifacts. By mapping these artifacts to the src of the original calexp, we find that most of the artifacts are subtraction residuals from regions around bright (S/N ≥ 200) stars. Possible causes of these artifacts include inexact PSF matching, registration issues, sampling errors, and Poisson fluctuation of bright pixels. To remove these artifacts from difference images, we suggest possible improvements, such as applying a set of more flexible kernel bases with spatial variation and improving the treatment of correlated noise in the images.

Acknowledgments

This paper has undergone internal review in the LSST Dark Energy Science Collaboration. The primary authors would like to thank the internal reviewers Bob Armstrong, David Jones, and Ian Sullivan for their comments.

We also further thank Richard Kessler, who contributed to the discussion of the difference image analysis and made many suggestions that improved the paper.

Author contributions are listed below. S.L.: project lead, tested the DESC DIA pipeline using synthetic source injection method, wrote the paper. W.M.W.-V.: project lead, provided feedback to the difference image analysis, wrote the paper. R.A.: contributed to the development of the DESC DIA pipeline, provided technical advice for running image difference software. G.N.: contributed to the discussion of the difference image analysis. B.S.: contributed to the discussion of the difference image analysis, draft reviewer.

The DESC acknowledges ongoing support from the Institut National de Physique Nucléaire et de Physique des Particules in France; the Science & Technology Facilities Council in the United Kingdom; and the Department of Energy, the National Science Foundation, and the LSST Corporation in the United States. DESC uses resources of the IN2P3 Computing Center (CC-IN2P3–Lyon/Villeurbanne - France) funded by the Centre National de la Recherche Scientifique; the National Energy Research Scientific Computing Center, a DOE Office of Science User Facility supported by the Office of Science of the U.S. Department of Energy under contract No. DE-AC02-05CH11231; STFC DiRAC HPC Facilities, funded by UK BEIS National E-infrastructure capital grants; and the UK particle physics grid, supported by the GridPP Collaboration. This work was performed in part under DOE contract DE-AC02-76SF00515.

W.M.W.-V. and S.L. were supported in part by DOE Office of Science HEP grant DE-SC0007914.

Software: Difference Image Analysis pipeline using LSST DM Science Pipelines developed for DESC Data Challenges (https://github.com/LSSTDESC/dia_pipe), LSST Science Pipelines (v23.0.0; https://pipelines.lsst.io/releases/v23_0_0.html), The obs model for the LSST cameras (w.2022.48; https://github.com/lsst/obs_lsst), Astropy (Astropy Collaboration et al. 2022), Matplotlib (Hunter 2007), Numpy (Harris et al. 2020), pandas (McKinney 2010; The pandas development team 2023), Scipy (Virtanen et al. 2020), SQLite (Hipp 2020).

Appendix: Connections between the ZOGY Algorithm and ZCA Whitening

In signal processing, whitening is defined as a linear transformation that maps a random vector to a new vector with an identity covariance matrix. The ZCA is a whitening process that minimizes the squared distance between the original vector and the whitened vector (Kessy et al. 2015) in real space. In this section, we prove that the ZOGY decorrelation is equivalent to the ZCA whitening in complex space. For simplicity, we assume that the expectation of the original random vector is 0.

In ZCA whitening, a random vector X with covariance ΣX can be transformed to a white vector Z via matrix W as ZWX. The solution of W is

Equation (A1)

with ${{\rm{\Sigma }}}_{X}^{-\tfrac{1}{2}T}{{\rm{\Sigma }}}_{X}^{-\tfrac{1}{2}}={{\rm{\Sigma }}}_{X}^{-1}$. U is an orthogonal transformation, and Λ is a diagonal matrix where each element is an eigenvalue of ΣX .

The covariance matrix of Z is the identity matrix I:

Equation (A2)

In complex space, we generalize the ZCA whitening as

Equation (A3)

with U as a unitary matrix and U as its conjugate transpose.

Suppose we want to decorrelate a 1D cross-filtering difference vector M, defined as

Equation (A4)

The decorrelation matrix WM should equal ${{\rm{\Sigma }}}_{M}^{-\tfrac{1}{2}}$. Define Ps and Pt as circulant matrices of 1D vectors ps and pt . M can be expressed as M = Ps TPt S. Ps and Pt can be diagonalized by the Fourier transform matrix F and its conjugate transpose F as ${P}_{s}={F}^{\dagger }\hat{{P}_{s}}F$ and ${P}_{t}={F}^{\dagger }\hat{{P}_{t}}F$, with $\hat{{P}_{s}}$ and $\hat{{P}_{t}}$ being the diagonal eigenvalue matrices of Ps and Pt , which are also the Fourier transforms of ps and pt . The Fourier transform matrix F is defined as

Equation (A5)

Now we can express ΣM as

Equation (A6)

If T and S are white with covariance matrices ${\sigma }_{t}^{2}I$ and ${\sigma }_{s}^{2}I$, ΣM can be simplified as

Equation (A7)

The inverse square root of ΣM is

Equation (A8)

We can interpret ${W}_{M}={{\rm{\Sigma }}}_{M}^{-\tfrac{1}{2}}$ as a rotation of M to $\hat{M}$ by the Fourier transform. It then divides each component of $\hat{M}$ by the corresponding component of ${({\sigma }_{t}^{2}{\hat{{P}_{s}}}^{2}-{\sigma }_{s}^{2}{\hat{{P}_{t}}}^{2})}^{-\tfrac{1}{2}}$. Finally, it rotates the ${({\sigma }_{t}^{2}| {\hat{{P}_{s}}}^{2}| +{\sigma }_{s}^{2}| {\hat{{P}_{t}}}^{2}| )}^{-\tfrac{1}{2}}\hat{M}$ back to the original coordinate system. This is exactly the same whitening procedure as the ZOGY deconvolution solution in one dimension.

In 2D, images can be represented as 1D row-ordered vectors, and PSFs can be represented as doubly circulant matrices. The Fourier transform can be represented as a Kronecker product of 1D Fourier transform matrices. The inverse of the 2D Fourier transform matrix can diagonalize circulant matrices with their Fourier transforms as eigenvalues (Jain 1989). Thus, the proof in 1D can be generalized to 2D.

Table 5. Artifact Morphology Fraction

 BroadNearSharp
Saturation flags51.90%47.60%53.31%
Dipole flags only1.45%2.92%3.08%
Shape flags only25.11%24.84%21.76%
Both dipole flags and shape flags12.28%12.74%12.14%
Remaining9.26%11.90%9.71%
Non-saturation-flag artifacts per square degree286312414

Note. Percentage of artifacts in each flag set of the broad class, near class, and sharp class. The first row shows the fraction of saturation-flag artifacts. The second row shows the fraction of non-saturation-flag artifacts that have dipole flags set to True and shape flags set to False. The third row shows the fraction of non-saturation-flag artifacts that have dipole flags set to False and shape flags set to True. The fourth row shows the fraction of non-saturation-flag artifacts that have both dipole flags set to True and shape flags set to True. The fifth row shows the fraction of non-saturation-flag artifacts that have both dipole flags and shape flags set to False. The last row shows the artifact rate (deg−2) of non-saturation-flag artifacts.

Download table as:  ASCIITypeset image

Footnotes

  • 6  
  • 7  
  • 8  

    The ZOGY paper uses N to denote the science image and R for the template image. We here use T and S consistently for both AL and ZOGY.

  • 9  
  • 10  
  • 11  
  • 12  

    Strictly speaking, diaSrc tables also have simulated SNe. Given the rarity of the true SNe, classifying them as artifacts does not affect our statistics.

  • 13  
  • 14  

    In this work, we estimate the detection efficiency using synthetic sources that are injected on host galaxies. A method that untangles the effects of Poisson noise versus subtraction artifacts can be found in Doctor et al. (2017).

  • 15  

    This equation is correct for constant noise and PSF within an image. Our use across a set of images with some variation in noise and PSF is an approximation. A more robust relationship would be efficiency versus injected S/N.

  • 16  

    Efficiency of 0.989.

  • 17  

    Host galaxies are selected to be 200 pixels away from the image boundary.

  • 18  

    To be consistent with Section 4.6, we use only detections from SB = 20–21 mag arcsec−2 to calculate efficiency and artifact rate.

  • 19  

    One background source may cause subtraction artifacts in many science images.

  • 20  

    We calculate the S/Ns of the original contributing sources by dividing base_PsfFlux_instFlux by base_PsfFlux_instFluxErr from the src.

  • 21  

    We exclude the saturation artifacts because the origin of these artifacts can be inferred by the name of the flag, which is set to True.

Please wait… references are loading.
10.3847/1538-4357/ad3635