3.3 IR Data Calibration Steps

The IR data reduction process begins with the raw IR image file. This contains all the non-destructive readouts from an exposure, stored in reverse time order, i.e., the first extension corresponds to the last array read. Most of the calibration steps are applied independently to each readout. For example, the DQICORR, NLINCORR, and FLATCORR steps apply the same bad pixel flags, non-linearity correction coefficients, and flat-field image, respectively, to each readout. On the other hand, the CRCORR step, which performs the up-the-ramp fit and removes the effects of cosmic-ray hits, uses the values from all readouts of an individual pixel simultaneously. Detailed descriptions of each step are provided in the following sections.

All steps up through UNITCORR are applied to an in-memory image stack that contains all the readouts. The CRCORR step produces an additional single image that gives the best-fit count rate for each pixel. The remaining steps in the process - FLATCORR and image statistics - are then applied to the full stack of readouts and to the single image produced by CRCORR.

Upon completion of the IR data calibration process, two output files are produced. The Intermediate MultiAccum (ima) file contains the full stack of calibrated readouts while the final calibrated image (flt) file is the single image produced by CRCORR with subsequent flat fielding applied.

Figure 3.3 shows a schematic representation of all the IR calibration steps, which are also briefly summarized below, in the order they are performed, with the corresponding calibration switch keyword in parentheses:

  • Initialize data quality, DQ, array (DQICORR)
  • Estimate amount of signal in zeroth-read (ZSIGCORR)
  • Subtract bias level from reference pixels (BLEVCORR)
  • Subtract zeroth read image (ZOFFCORR)
  • Initialize error array, ERR (NOISCORR)
  • Correct for detector non-linear response (NLINCORR)
  • Subtract dark current image (DARKCORR)
  • Compute photometric keyword values for header (PHOTCORR)
  • Convert to units of count rate (UNITCORR)
  • Fit accumulating signal and identify CR hits (CRCORR)
  • Divide by flat-field image(s) and apply gain conversion (FLATCORR)
  • Compute image statistics (No switch)

Figure 3.3: Flow diagram for IR processing; each step, as well as the calibration switch that controls it, is indicated.


3.3.1 Data Quality Initialization

  • Header Switch: DQICORR
  • Reference File: BPIXTAB

This step populates the data quality (DQ) array in all IR readouts by reading a table of known bad pixels for the detector, stored in the ‘Bad Pixel’ reference table BPIXTAB; the appropriate BPIXTAB is selected based on the value of the DETECTOR keyword and USEAFTER.

The types of bad pixels that can be flagged are listed in Table 3.3.

The DQ array is no longer updated to reflect any Take Data Flag (TDF) transition during the sample (see this issue of the STAN). Other DQ values will only be marked during further processing (such as cosmic-ray rejection).

If users wish to update the DQ array themselves before running further processing, they should first complete the DQ initialization step, and keep in mind that the data in the DQ extension are always stored as unsigned integers.

Table 3.3: Data quality array flags for IR files

NAME             VALUE   DESCRIPTION
GOODPIXEL            0   OK
SOFTERR              1   Reed-Solomon decoding error
DATALOST             2   data replaced by fill value
DETECTORPROB         4   bad detector pixel
BADZERO              8   unstable IR zero-read pixel
HOTPIX              16   hot pixel
UNSTABLE            32   IR unstable pixel
WARMPIX             64   unused
BADBIAS            128   bad reference pixel value
SATPIXEL           256   full-well or a-to-d saturated pixel
BADFLAT            512   bad flat-field value
SPIKE             1024   CR spike detected during ramp fitting
ZEROSIG           2048   signal in zeroth-read
CR hit            4096   cosmic ray detected by Astrodrizzle
DATAREJECT        8192   rejected during up-the-ramp fitting
HIGH_CURVATURE   16384   not used
RESERVED2        32768   can’t use
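
Because each flag in Table 3.3 is a distinct power of two, a single DQ pixel value can carry several flags at once as a bitwise sum. The following is a minimal sketch of how such a value could be decoded; the dictionary is copied from Table 3.3, while the helper name `decode_dq` is hypothetical and not part of calwf3.

```python
# Flag values copied from Table 3.3; decode_dq is a hypothetical helper.
DQ_FLAGS = {
    1: "SOFTERR",
    2: "DATALOST",
    4: "DETECTORPROB",
    8: "BADZERO",
    16: "HOTPIX",
    32: "UNSTABLE",
    64: "WARMPIX",
    128: "BADBIAS",
    256: "SATPIXEL",
    512: "BADFLAT",
    1024: "SPIKE",
    2048: "ZEROSIG",
    4096: "CR hit",
    8192: "DATAREJECT",
    16384: "HIGH_CURVATURE",
    32768: "RESERVED2",
}


def decode_dq(value):
    """Return the names of all flags set in a single DQ pixel value."""
    if value == 0:
        return ["GOODPIXEL"]
    return [name for bit, name in DQ_FLAGS.items() if value & bit]


# A sample that is both saturated (256) and rejected by the ramp fit (8192).
print(decode_dq(256 + 8192))  # ['SATPIXEL', 'DATAREJECT']
```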

3.3.2 IR Zero-Read Signal Correction

  • Header Switch: ZSIGCORR
  • Reference Files: DARKFILE, NLINFILE

At the beginning of an IR observation, the detector pixels are reset and then read out to record the bias level. An interval of approximately 2.9 seconds elapses between the time each pixel is reset and then read. Because the IR channel does not have a shutter, signal from the field of view under observation, as well as persistent signal from previous observations, accumulates during that 2.9 second interval. When the initial (or ‘zeroth’) read is later subtracted from subsequent readouts, any signal in the zeroth read will also be subtracted. Linearity correction and saturation checking, performed in the NLINCORR step described in Section 3.3.6, both depend on the absolute signal level in a pixel at the time it was read. For bright sources, the signal in the zeroth read can be large enough that, if neglected in the NLINCORR calibration step, it leads to inaccurate linearity corrections, as well as the failure to detect saturation conditions. The ZSIGCORR step is used to estimate the amount of source signal in the zeroth read and to supply this estimate to the NLINCORR step.

Such an estimate is given by the difference between the super zero read in the linearity reference file (NLINFILE) and the science zero read exposure. The ZSIGCORR step is executed roughly as follows:

  • Copy the zero signal image from the linearity reference file (super zero).
  • Compute any subarray offsets.
  • Subtract the super zero read reference image from the zero read science image.
  • Compute the noise in the zero image.
  • Pixels that contain more signal than ZTHRESH*noise are flagged (flag value = 2048) and the estimated zeroth read signal is passed to the NLINCORR step, which accounts for that signal when applying linearity corrections and saturation checking on the zeroth-read subtracted images. Pixels with signal below ZTHRESH*noise are ignored.
  • The linearity correction file (NLINFILE) has an extension with saturation values. Pixels that are saturated in the zeroth or first reads are flagged in the DQ and the number of saturated pixels is reported.
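
The thresholding at the core of the steps above can be sketched as follows. This is an illustration under stated assumptions, not the calwf3 implementation: the function name, the way the noise image is supplied, the ZTHRESH value of 5, and the choice to zero out sub-threshold pixels are all assumptions made for the example.

```python
import numpy as np


def estimate_zero_signal(sci_zero, super_zero, noise, zthresh):
    """Sketch of the zeroth-read signal estimate (hypothetical helper).

    sci_zero   : zeroth read of the science exposure (DN)
    super_zero : super zero read taken from the NLINFILE (DN)
    noise      : per-pixel noise of the zero image (DN), supplied by the caller
    zthresh    : detection threshold in units of the noise
    """
    zsig = sci_zero - super_zero
    detected = zsig > zthresh * noise
    # Assumption: pixels below the threshold are "ignored" by setting
    # their estimated zeroth-read signal to zero.
    zsig = np.where(detected, zsig, 0.0)
    # Flag detected pixels with ZEROSIG (2048).
    dq = np.where(detected, 2048, 0).astype(np.uint16)
    return zsig, dq


sci_zero = np.array([[1000.0, 1500.0], [1010.0, 990.0]])
super_zero = np.full((2, 2), 1000.0)
noise = np.full((2, 2), 20.0)
zsig, dq = estimate_zero_signal(sci_zero, super_zero, noise, zthresh=5.0)
print(zsig)
print(dq)
```

Only the pixel with 500 DN of excess signal exceeds 5 times the noise in this toy example, so it is the only one flagged and passed on.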

The ZSIGCORR step estimates the source signal in the science zero read by subtracting the super zero read from the science zero read instead of calculating an estimated signal based on the first read and zero read + estimated exposure time between them (which was the case prior to Mar-2011). This way the difference in readout time for subarrays is not an issue, and also dark current subtraction is no longer necessary for the signal estimate (the DARKFILE is no longer used by this step).

Note that this technique will not work well for pixels covered by targets that are so bright that the signal is already beginning to saturate in either the zeroth or first readouts, because then it is difficult to accurately estimate the zeroth-read signal. The ZSIGCORR step thus checks for saturation in the zeroth and first read images and flags the saturated pixels with a 256 flag in the DQ extension of the corresponding ima imset (zeroth or first).

Pixels that are determined to have detectable signal in the zeroth read are flagged in the DQ arrays of the first imset of the output ima file with a data quality value of 2048; in this case, this flag appears also in the final flt file. The NLINFILE is chosen based on the values of the DETECTOR and USEAFTER keywords, while the DARKFILES are selected based on the values of the DETECTOR, CCDAMP, CCDGAIN, SAMP_SEQ, SUBTYPE and USEAFTER keywords.

3.3.3 Bias Correction

  • Header Switch: BLEVCORR
  • Reference Files: OSCNTAB

The BLEVCORR step uses the reference pixels located around the perimeter of the IR detector to track and remove changes in the bias level that occur during an exposure. For each raw readout, the average signal level of the reference pixels is computed (via a resistant mean algorithm), subtracted from the image, and recorded in the MEANBLEV keyword in the SCI header.

The reference pixels located at the ends of each image row are used in this computation. Reference pixels are also located along the bottom and top of the detector, at the ends of each column, but those have been found to be less reliable and thus are not used. There are 5 reference pixels on each side of the detector, but the outermost pixel on each side is ignored, for a total of 8 reference pixels per row used in the BLEVCORR step.

As with the UVIS overscan correction, the boundaries of the reference pixel regions that are used in the computation are defined in the OSCNTAB reference table, in the BIASSECT* columns. The BIASSECTA[1,2] values indicate the starting and ending column numbers for the reference pixels on the left edge of the image, and the BIASSECTB[1,2] give the values for the right side of the image.
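
A minimal sketch of this reference-pixel subtraction is shown below. It assumes 5 reference columns on each side with the outermost one skipped, and it substitutes a simple iterative sigma-clipped mean for the pipeline's resistant mean, whose exact algorithm is not described here. The function names and the hard-coded column ranges are hypothetical; in calwf3 the boundaries come from the OSCNTAB.

```python
import numpy as np


def clipped_mean(values, nsigma=3.0, iterations=5):
    """Iterative sigma-clipped mean; a stand-in for the resistant mean."""
    data = np.asarray(values, dtype=float).ravel()
    for _ in range(iterations):
        mean, std = data.mean(), data.std()
        if std == 0:
            break
        keep = np.abs(data - mean) <= nsigma * std
        if keep.all():
            break
        data = data[keep]
    return data.mean()


def subtract_bias_level(readout, nref=5, skip=1):
    """Subtract the mean reference-pixel level from one readout.

    Uses the reference columns at both ends of each row, skipping the
    outermost `skip` columns on each side (4 + 4 = 8 pixels per row).
    """
    left = readout[:, skip:nref]
    right = readout[:, -nref:-skip]
    level = clipped_mean(np.concatenate([left.ravel(), right.ravel()]))
    return readout - level, level


readout = np.full((12, 14), 100.0)
readout[:, 5:-5] += 50.0   # toy signal on the light-sensitive pixels
readout[3, 2] = 5000.0     # one deviant reference pixel
corrected, level = subtract_bias_level(readout)
print(level)               # the value that would be recorded as MEANBLEV
```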

The reference pixel regions are retained in the FITS file throughout the remainder of processing but are usually ignored or skipped over in the actual application of calibration algorithms. They are left in place in the calibrated data stored in the ima file at the end of processing but are trimmed from the flt image file.

The reference file for bias level correction, OSCNTAB, is selected based on the value of the DETECTOR keyword only.

3.3.4 IR Zero-read Image Subtraction

  • Header Switch: ZOFFCORR
  • Reference Files: None

The ZOFFCORR step subtracts the zeroth read from all readouts in the exposure, including the zeroth read itself, resulting in a zero-read image that is exactly zero in the remainder of processing. The zeroth-read image is propagated through the remaining processing steps and included in the output products, so that a complete history of error estimates and data quality (DQ) flags is preserved.

Note: When interpreting the IR intermediate MultiAccum (ima) file, it is important to remember the file does NOT represent differences in adjacent reads, but always the difference between a given readout and the zero read. The signal rate recorded in each SCI extension of the ima file represents the average flux between that particular readout and the zero read.
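
To recover the signal that accumulated between adjacent reads, the cumulative values have to be differenced explicitly. The sketch below assumes the SCI arrays are in count-rate units (UNITCORR performed) and have been loaded into a stack ordered as in the ima file, with the last read first. The function name and the toy sample times are assumptions for illustration.

```python
import numpy as np


def interval_rates(rates, times):
    """Count rate within each interval between adjacent reads.

    rates : array (nreads, ny, nx); mean rate since the zeroth read,
            in ima order (last read first, zeroth read last)
    times : array (nreads,); elapsed time of each read, same order
    """
    rates = np.asarray(rates, dtype=float)[::-1]   # reorder to time order
    times = np.asarray(times, dtype=float)[::-1]
    counts = rates * times[:, None, None]          # cumulative signal
    dcounts = np.diff(counts, axis=0)              # signal per interval
    dtimes = np.diff(times)
    return dcounts / dtimes[:, None, None]


# One pixel whose cumulative signal is 0, 100, 200, 600 at t = 0, 10, 20, 30.
times = np.array([30.0, 20.0, 10.0, 0.0])          # ima (reverse) order
rates = np.array([20.0, 10.0, 10.0, 0.0]).reshape(4, 1, 1)
print(interval_rates(rates, times)[:, 0, 0])       # rate in each interval
```

In this toy ramp the average rate recorded in the last read is 20 per second, while the rate within the final interval alone is 40 per second, which is the kind of jump that is hidden when reads are compared only to the zero read.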

3.3.5 Error Array Initialization

  • Header Switch: NOISCORR (not listed explicitly in image header, see text)
  • Reference Files: CCDTAB

This step computes an estimate of the errors associated with the raw science data based on a noise model for the detector. The NOISCORR keyword is not user-accessible and always set to PERFORM. Currently, the noise model (in DN) is a simple combination of detector read noise (RN) and Poisson noise in the signal, such that:

$$\sigma_{\mathrm{IR}} = \frac{\sqrt{\mathrm{RN}^{2} + \mathrm{counts} \times \mathrm{gain}}}{\mathrm{gain}}$$

where the read noise (RN) is in units of electrons, gain is the analog-to-digital conversion gain factor (in electrons DN⁻¹), and counts is the signal in a science image pixel in units of DN. The detector read noise and gain are read from the CCDTAB reference file, and separate values are used for the particular amplifier quadrant with which each pixel is associated.
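
As a worked illustration, the noise model described here (read noise in electrons added in quadrature to the Poisson noise of the signal, then converted back to DN) can be written as a short function. The function name and the numerical values are illustrative only, and clipping negative counts to zero before taking the square root is an assumption of this sketch.

```python
import math


def initial_error_dn(counts_dn, read_noise_e, gain):
    """Initial ERR estimate in DN from read noise and Poisson noise.

    counts_dn    : signal in DN
    read_noise_e : read noise in electrons
    gain         : gain in electrons per DN
    """
    # Poisson variance in electrons equals the signal in electrons.
    # Assumption: negative signal contributes no Poisson term.
    signal_e = max(counts_dn, 0.0) * gain
    return math.sqrt(read_noise_e ** 2 + signal_e) / gain


# Illustrative values, not taken from any CCDTAB file.
print(initial_error_dn(1000.0, read_noise_e=20.0, gain=2.5))
print(initial_error_dn(0.0, read_noise_e=20.0, gain=2.5))     # read noise only
```

With zero signal the result reduces to the read noise expressed in DN (20 / 2.5 = 8 DN in this example).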



Throughout the remaining calibration steps the ERR image is processed in lock-step with the science (SCI) image, getting updated as appropriate. Errors are propagated through combination in quadrature. The ERR array for the final calibrated flt image is populated by the CRCORR step, based on the calculated uncertainty of the count rate fit to the MultiAccum samples (see Section 3.3.10 for details).

The CCDTAB reference file used in this step is selected based on the value of the DETECTOR keyword only.

3.3.6 Detector Non-linearity Correction

  • Header Switch: NLINCORR
  • Reference Files: NLINFILE

In this step, the integrated counts in the IR science images are corrected for the non-linear response of the detector, and pixels that have saturated (as defined in the saturation extension of the NLINFILE reference image) are flagged. The observed response of the detector can be represented by two regimes:

  • At low and intermediate signal levels the detector response deviates from the incident flux in a way that is correctable using the following expression


$$F_c = (1 + c_1 + c_2 F + c_3 F^2 + c_4 F^3) \times F$$


where $c_1$, $c_2$, $c_3$, and $c_4$ are the correction coefficients, $F$ is the uncorrected flux in DN, and $F_c$ is the corrected flux. The current form of the correction uses a third-order polynomial, but the algorithm can handle an arbitrary number of coefficients. The number of coefficients and error terms are given by the values of the NCOEFF and NERR keywords in the header of the NLINFILE.

  • At high signal levels, as saturation sets in, the response becomes highly non-linear and is not correctable to a scientifically useful degree.
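
The polynomial correction for the low and intermediate signal regime can be sketched per pixel as below. The function name is hypothetical and the coefficient values in the example are made up; real coefficients come from the NLINFILE. The loop accepts any number of coefficients, mirroring the statement that the algorithm is not tied to a third-order polynomial.

```python
def linearity_correct(f, coeffs):
    """Apply Fc = (1 + c1 + c2*F + c3*F**2 + c4*F**3 + ...) * F.

    f      : uncorrected flux in DN
    coeffs : sequence (c1, c2, ...) of correction coefficients
    """
    poly = 0.0
    for c in reversed(coeffs):      # Horner's scheme for c1 + c2*F + ...
        poly = poly * f + c
    return (1.0 + poly) * f


# Made-up coefficients: a 1% correction at 10000 DN from the c2 term alone.
print(linearity_correct(10000.0, (0.0, 1.0e-6, 0.0, 0.0)))
```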

The signal in the zero read is temporarily added back to the zeroth read image of the science data before the linearity correction is applied and before the saturation is judged. Once the correction has been applied the zero read signal is once again removed. This only occurs if the ZSIGCORR step is set to PERFORM. Saturation values for each pixel are stored in the NODE extension of the NLINFILE. After each group is corrected, the routine also sets saturation flags in the next group for those pixels that are flagged as saturated in the current group. This is necessary because the SCI image value of saturated pixels will sometimes start to go back down in the subsequent reads after saturation occurs, which means they could go unflagged by normal checking techniques. The SAMP and TIME arrays are not modified during this step. The NLINFILE reference file is selected based on the value of the DETECTOR keyword only.

3.3.7 Dark Current Subtraction

  • Header Switch: DARKCORR
  • Reference Files: DARKFILE

The DARKCORR step subtracts the detector dark current from the science data. The reference file listed under the DARKFILE header keyword is used to subtract the dark current from each sample. The DARKFILE reference file must have the same values for the DETECTOR, CCDAMP, CCDGAIN, SAMP_SEQ, and SUBTYPE keywords as the science image. Due to potential non-linearities in some of the signal components, such as reset-related effects in the first one or two reads of an exposure, the dark current subtraction is not applied by simply scaling a generic reference dark image by the exposure time and then subtracting it. Instead, a library of dark current images is maintained that includes darks taken in each of the available predefined MULTIACCUM sample sequences, in full-frame as well as sub-array readout modes. The dark reference file is subtracted read-by-read from the stack of science image readouts so that there is an exact match in the timings and other characteristics of the dark image and the science image. The subtraction does not include the reference pixels. The ERR and DQ arrays from the reference dark file are combined with the ERR and DQ arrays from the science image, but the SAMP and TIME arrays are unchanged. The mean of the dark image is saved to the MEANDARK keyword in the output science image header.

3.3.8 Photometry Keywords

  • Header Switch: PHOTCORR
  • Reference Files: IMPHTTAB

The PHOTCORR step updates the image header with keywords that allow users to convert their data from count rates to absolute fluxes and to perform calibrated photometry. The step is performed using tables of precomputed values stored in the IMPHTTAB file. The appropriate entry in IMPHTTAB is selected according to the observing mode, whose value is stored in the image header keyword PHOTMODE. The updated keywords are:

  • PHOTFLAM: the inverse sensitivity in units of erg cm⁻² Å⁻¹ electron⁻¹
  • PHOTFNU: the inverse sensitivity in units of Jy sec electron⁻¹
  • PHOTZPT: the STMAG zero point
  • PHOTPLAM: the bandpass pivot wavelength in Å
  • PHOTBW: the bandpass RMS width in Å
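
As an example of how these keywords are typically used, the sketch below converts a count rate in electrons per second to a flux density and an STMAG magnitude. The function name and the PHOTFLAM value are illustrative assumptions; the default zero point of -21.1 is the standard STMAG zero point, and in practice both values should be read from the image header.

```python
import math


def stmag(count_rate, photflam, photzpt=-21.1):
    """STMAG magnitude from a count rate in electrons per second.

    photflam : inverse sensitivity (erg cm^-2 A^-1 electron^-1)
    photzpt  : STMAG zero point (PHOTZPT keyword)
    """
    flux = photflam * count_rate      # erg cm^-2 s^-1 A^-1
    return -2.5 * math.log10(flux) + photzpt


# Illustrative PHOTFLAM value, not taken from a real header.
print(round(stmag(100.0, photflam=1.0e-20), 3))
```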

3.3.9 Conversion To Signal Rate

  • Header Switch: UNITCORR
  • Reference Files: None

This step converts the science data from a time-integrated signal to a signal rate by dividing the SCI and ERR arrays for each readout by the TIME array. No reference file is needed. The BUNIT keyword in the output data header reflects the appropriate data units. This step is skipped if the BUNIT value is already COUNTS/S. The flat-fielding process (performed if FLATCORR is set to PERFORM) further changes BUNIT by multiplying the image by the gain; the final BUNIT value therefore depends on the values of both UNITCORR and FLATCORR, as illustrated in Table 3.4.

Table 3.4: Possible values of BUNIT after calibrating IR data with calwf3.

UNITCORR   FLATCORR   BUNIT AFTER CALIBRATION
OMIT       OMIT       COUNTS
OMIT       COMPLETE   ELECTRONS
COMPLETE   OMIT       COUNTS/S
COMPLETE   COMPLETE   ELECTRONS/S
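
The logic of Table 3.4 can be captured in a few lines, which may be handy when checking the units of a calibrated file from its header switches. The function name is hypothetical; the switch values and resulting units are those listed in the table.

```python
def bunit_after_calibration(unitcorr, flatcorr):
    """Expected BUNIT given the final UNITCORR and FLATCORR switch values."""
    # FLATCORR multiplies by the gain, turning counts (DN) into electrons.
    units = "ELECTRONS" if flatcorr == "COMPLETE" else "COUNTS"
    # UNITCORR divides by the TIME array, turning a signal into a rate.
    if unitcorr == "COMPLETE":
        units += "/S"
    return units


for unitcorr in ("OMIT", "COMPLETE"):
    for flatcorr in ("OMIT", "COMPLETE"):
        print(unitcorr, flatcorr, bunit_after_calibration(unitcorr, flatcorr))
```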


3.3.10 Up-the-ramp Fitting and Cosmic Ray Identification

  • Header Switch: CRCORR
  • Reference Files: CRREJTAB

CRCORR combines the data from all readouts into a single image and, in the process, identifies and flags pixels suspected of containing cosmic-ray (CR) hits. The method is extensively described in Fixsen et al. (2000).

The data from all readouts are analyzed pixel-by-pixel, iteratively computing a linear fit to the accumulating counts-versus-exposure time relation. Samples flagged as bad in the DQ arrays, such as when saturation occurs midway through the exposure, are rejected from the fitting process. CR hits are identified by searching for outliers from the fit results. The rejection threshold is set by the value in the CRSIGMAS column of the CR rejection parameters reference table CRREJTAB, and has a default value of 4. When a CR hit is detected, a linear fit is then performed independently for the sets of readouts before and after the hit; if a CR hit is identified to have occurred during a sample, the value measured for that sample is included in the ‘after’ ramp segment. Those fitting results are then again checked for outliers. This process is iterated until no new CRs are detected.

Pixel samples identified as containing a CR hit are flagged in the DQ arrays of the intermediate MultiAccum (ima) file, with DATAREJECT DQ value of 8192.

The DATAREJECT DQ flag is also set for all samples following the CR in order to indicate that the absolute value of the pixel is wrong after the first hit. However, this flagging smears the location of any hits which might occur after the first CR.

In addition to CRs, “negative CR hits” in the accumulated counts vs. time relation have occasionally been observed in WFC3 data. They appear as sudden “drops” in the accumulated counts vs. time plots for individual pixels (e.g., Figure 1 in WFC3 ISR 2010-13). These “negative CR hits” are also identified in the CRCORR step and flagged with the SPIKE DQ flag, value = 1024. Appendix B of WFC3 ISR 2009-40 gives a possible explanation for a sub-class of such events: normal cosmic rays that traverse the detector but, instead of hitting the photo-sensitive HgCdTe pixel bulk, strike other parts of the pixel (e.g., the electronics) that are sensitive to charged particles. These events are sometimes clearly associated with CRs by the physical trail visible in raw images, e.g., as shown in Figure 3.4. Along the trail, the signal ranges from lower than in the neighboring pixels (when the CR goes through the electronics), to undetectable (when the CR travels through layers that are unaffected by its passage), to higher than in the neighboring pixels (when charge is released in the active HgCdTe region of the pixel). Other negative spikes are sometimes observed in isolated pixels and are attributed to “burst noise”, also known as popcorn noise or random telegraph signal.

Figure 3.4: A negative cosmic ray trail associated with a positive one in a single read of a raw image.


The CR and SPIKE DQ flags are only present in the ima file and are not carried over into flt products, which combine data from all readouts, since the affected pixels are not included in the up-the-ramp fit.

Once all outliers have been identified, the slope of each segment of non-flagged samples is computed via a linear fit to the counts vs. time data. This fit includes optimal weighting with individual data point uncertainties as well as contributions from the read noise and the Poisson noise for the source and the dark current. The linear fit reports the best-fitting slope and its uncertainty. The final count rate value for a pixel (and its uncertainty) is determined by computing the weighted mean (and its uncertainty) of the individual segment slopes. The result of this operation is stored in the output flt file, where the SCI array contains the final slope computed for each pixel, the ERR array contains the estimated uncertainty in the slope, the SAMP array contains the total number of non-flagged samples used to compute the slope, and the TIME array contains the total exposure time of those samples.
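
The final combination of slopes into one count rate can be illustrated with an inverse-variance weighted mean. This is a sketch under the assumption that the weights are the inverse variances of the individual slopes; the exact weighting used by calwf3 is not spelled out here, and the function name and input values are hypothetical.

```python
import math


def combine_slopes(slopes, sigmas):
    """Inverse-variance weighted mean of slopes and its uncertainty.

    slopes : best-fit slope of each ramp segment
    sigmas : 1-sigma uncertainty of each slope
    """
    weights = [1.0 / s ** 2 for s in sigmas]
    total = sum(weights)
    mean = sum(w * m for w, m in zip(weights, slopes)) / total
    return mean, 1.0 / math.sqrt(total)


# Two segments separated by a CR hit; the first is better constrained.
rate, rate_err = combine_slopes([10.0, 12.0], [1.0, 2.0])
print(rate, rate_err)
```

The better-constrained segment dominates the result, and the combined uncertainty is smaller than that of either segment alone.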

Pixels for which there are no unflagged samples, e.g., permanently hot or cold pixels, still have a slope computed (recorded in the SCI array of the output flt file), but they are flagged in the DQ array of the flt file. Users should therefore always check the flt file DQ arrays to help determine whether a given SCI image pixel value is trustworthy for subsequent analysis.

The basic rule of thumb is that in order for a DQ value to propagate into the flt, it must be present in all the reads of the ima. The 8192 flag is not propagated into the flt because calwf3 has already accounted for the effects of the CR when performing the up-the-ramp fit. The DQ arrays in the ima files contain the complete record of when and where exactly each cosmic ray hit the detector.
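
The rule of thumb amounts to a bitwise AND of the DQ arrays over all reads. The sketch below illustrates only that rule, not the special cases for saturation, UNSTABLE pixels, or pixels with no good samples; the function name and the toy DQ stack are assumptions.

```python
import numpy as np


def dq_common_to_all_reads(ima_dq):
    """Flags present in every read of a DQ stack of shape (nreads, ny, nx)."""
    stack = np.asarray(ima_dq, dtype=np.uint16)
    return np.bitwise_and.reduce(stack, axis=0)


# One hot pixel (16 in every read) that is hit by a CR partway up the ramp,
# so 8192 is set in the last two reads only.
ima_dq = np.array([16, 16, 16 | 8192, 16 | 8192]).reshape(4, 1, 1)
print(dq_common_to_all_reads(ima_dq)[0, 0])   # only the hot-pixel flag survives
```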

A similar propagation scheme occurs for the saturation flag (DQ = 256). If, e.g., a pixel is saturated in the last two reads of a ramp, then those two reads are flagged with 256 in the ima file, and calwf3 ignores them during line-fitting. The resulting DQ value in the flt file is 0 because calwf3 has dealt with the saturation and the effects are not present in the flt. If saturation occurs in the first read of a ramp, the SCI extension of the flt file for that pixel contains an estimate of the flux equal to the value in the input zeroth read image, but the DQ extension of the flt does not have a 256 value added to it. If the zeroth read is also saturated, the flt file still contains the same flux estimate as in the first-read saturation case, but in this case, the DQ flag 256 is propagated into the output flt DQ extension.

Pixels where calwf3 finds 4 or more CRs up the ramp are flagged as UNSTABLE (32), which does propagate to the flt, since that many signal jumps for a given pixel in a single ramp indicate that the pixel likely should not be trusted. DQ values from any sample are carried through to the flt file if a pixel has no good samples.

Note for SCAN data: With the release of calwf3 v3.3, IR scan data is processed with CRCORR set to OMIT, as up-the-ramp fits to scan data do not produce meaningful results. By setting CRCORR=OMIT, the ramp fit is not performed and instead an flt output image is produced that contains the first-minus-last read result. Note that since the SCAN flt output image is not a fit up-the-ramp, the output image units will not be a rate but instead be in counts (if UNITCORR=OMIT) or electrons (if FLATCORR=COMPLETE).

3.3.11 Flat-field Correction

  • Header Switch: FLATCORR
  • Reference Files: PFLTFILE, LFLTFILE, DFLTFILE

The FLATCORR step corrects for pixel-to-pixel and large-scale sensitivity variations across the detector by dividing the science images by one or more flat-field images. A combined flat is created within calwf3 using up to three flat-field reference files: the pixel-to-pixel flat (PFLTFILE), the low-order flat (LFLTFILE), and the delta flat (DFLTFILE). FLATCORR also multiplies the science data by the detector gain, using the mean gain from all the amplifiers. Therefore the calibrated data will be in units of electrons per second (or electrons if UNITCORR = OMIT).

The PFLTFILE is a pixel-to-pixel flat-field correction file containing the small-scale flat-field variations. The PFLTFILE is always used in the calibration pipeline, while the other two flats are optional. The LFLTFILE is a low-order flat that corrects for any large-scale sensitivity variations across the detector. This file can be stored as a binned image, which is then expanded when being applied by calwf3. Finally, the DFLTFILE is a delta-flat containing any needed changes to the small-scale PFLTFILE. If the LFLTFILE and DFLTFILE are not specified in the SCI header, only the PFLTFILE is used for the flat-field correction. If two or more reference files are specified, they are read in and multiplied together to form a combined flat-field correction image.
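
A minimal sketch of the combined flat-field correction follows. It assumes that any binned LFLTFILE has already been expanded to the full image size, that a single mean gain is applied, and that the ERR array is scaled the same way as SCI; the function name and the example values are hypothetical.

```python
import numpy as np


def apply_flat(sci, err, gain, pflt, lflt=None, dflt=None):
    """Divide SCI and ERR by the combined flat and multiply by the gain.

    pflt is always required; lflt and dflt are optional and, when given,
    are multiplied into the combined flat.
    """
    flat = np.array(pflt, dtype=float)
    for extra in (lflt, dflt):
        if extra is not None:
            flat = flat * np.asarray(extra, dtype=float)
    return sci / flat * gain, err / flat * gain


sci = np.array([[10.0, 20.0]])
err = np.array([[1.0, 2.0]])
pflt = np.array([[1.0, 2.0]])
lflt = np.array([[0.5, 0.5]])
flat_sci, flat_err = apply_flat(sci, err, gain=2.0, pflt=pflt, lflt=lflt)
print(flat_sci, flat_err)
```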

The flat-field correction is applied to all readouts of the calibrated IR MultiAccum stack, as well as the single image produced by the CRCORR function. Due to geometric distortion effects, the area of the sky seen by different pixels is not constant and therefore observations of a constant surface brightness object will have counts per pixel that vary over the detector. In order to produce images that appear uniform for uniform illumination, the same counts per pixel variation across the field is left in place in the flat-field images, so that when a science image is divided by the flat it makes an implicit correction for the distortion effects on photometry. A consequence of this procedure is that two point-source objects of equal brightness will not have the same total counts after the flat-fielding step, thus point source photometry requires the application of a pixel area map (PAM) correction.

Note: All WFC3 observations, not just dithered images, processed with Astrodrizzle (drz files) will be corrected for geometric distortion and pixel area effects. However, when using flt files to extract point-source photometry, the pixel area map file must be applied manually (calwf3 does not perform a PAM correction).

All flat-field reference images are selected based on the DETECTOR, CCDAMP, and FILTER used for the observation. A sub-array science image uses the same reference file(s) as a full-size image; calwf3 extracts the appropriate region from the reference file(s) and applies it to the sub-array input image.

3.3.12 Image Statistics Calculation

  • Header Switch: None
  • Reference Files: None

This step computes image statistics using the “good pixels”, i.e. with DQ value equal to 0, and records them in image header keywords. The operation is performed for every readout in the calibrated MultiAccum stack (ima), as well as the final (CRCORR-produced) calibrated image (flt). The values computed and captured in keywords are: the minimum, mean, and maximum values (GOODMIN, GOODMEAN, GOODMAX, respectively), the number of good pixels (NGOODPIX), as well as the minimum, mean, and maximum signal-to-noise ratio (i.e. the ratio of the SCI and ERR pixel values) which are SNRMIN, SNRMEAN, SNRMAX, respectively. The minimum, mean, and maximum statistics are computed for the ERR arrays as well.
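
The statistics described above can be reproduced from the SCI, ERR, and DQ arrays of any readout. The sketch below is an illustration only: the function name is hypothetical, it assumes at least one good pixel with a non-zero ERR value, and it omits the ERR-array statistics for brevity.

```python
import numpy as np


def good_pixel_stats(sci, err, dq):
    """Statistics over pixels with DQ == 0, keyed by header keyword name."""
    good = dq == 0
    values = sci[good]
    snr = values / err[good]        # assumes ERR > 0 for good pixels
    return {
        "NGOODPIX": int(good.sum()),
        "GOODMIN": float(values.min()),
        "GOODMEAN": float(values.mean()),
        "GOODMAX": float(values.max()),
        "SNRMIN": float(snr.min()),
        "SNRMEAN": float(snr.mean()),
        "SNRMAX": float(snr.max()),
    }


sci = np.array([[1.0, 2.0], [3.0, 100.0]])
err = np.array([[0.5, 0.5], [1.0, 1.0]])
dq = np.array([[0, 0], [0, 16]])    # the bright pixel is flagged as hot
stats = good_pixel_stats(sci, err, dq)
print(stats)
```

The flagged pixel is excluded, so the statistics reflect only the three unflagged values.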

3.3.13 Cosmic-ray rejection

  • Header Switch: RPTCORR
  • Reference Files: CRREJTAB

Associations with more than one member, i.e., those that have been associated using REPEAT-OBS, are combined using wf3rej (see Section 3.4.5 for more details). CR-SPLIT is not available for the IR channel. The task uses the same statistical detection algorithm developed for ACS (acsrej), STIS (ocrrj) and WFPC2 (crrej), providing a well-tested and robust procedure. For all associations (including dithered observations), the DRZ products will be created by Astrodrizzle, which both performs cosmic ray detection (in addition to wf3rej, for REPEAT-OBS observations) and corrects for geometric distortion.