2.7 Error and Data Quality Arrays

The calcos pipeline propagates data quality flags throughout the calibration process. The flux error estimates are computed by propagating the individual uncertainty components related to the flat-field, gross count rate, and smoothed background count rate in each wavelength bin. Note that starting with calcos 3.3.10, the Poisson errors are calculated using the asymmetric "Frequentist-Confidence" method in astropy rather than the Gehrels (1986) upper confidence limit equation. See COS ISR 2021-03 for more information.

2.7.1 Upper and Lower Flux Error Arrays

The ERROR and ERROR_LOWER arrays contain estimates of the 1-sigma upper and lower flux uncertainties, respectively, in each wavelength bin. For high S/N observations the ERROR and ERROR_LOWER values will be symmetric; however, the flux errors become increasingly asymmetric when count levels fall below ~30 counts. Information about the three general terms that contribute to the flux errors is provided in each wavelength bin via the VARIANCE_FLAT, VARIANCE_COUNTS, and VARIANCE_BKG columns. For X1D and X1DSUM[N] files composed of single exposures, the general equation for calculating net count rate uncertainties is:



where the first term under the radical is VARIANCE_FLAT and the f_{u,l} function is the "Frequentist-Confidence" method in calcos, which can be replaced by a function of the user's choice. For example, users wishing to use symmetric root-N errors could calculate the net count rate uncertainties as:


where the net count rate uncertainty can be converted into a flux uncertainty using the relation:


In contrast, X1DSUM and X1DSUM[N] files composed of two or more exposures are interpolated onto a common wavelength grid. As a result, they are affected by covariance between adjacent wavelength bins, and a given bin may include measurements with different data quality flags. The upper and lower flux errors in X1DSUM and X1DSUM[N] files are thus approximations based on treating all contributions to a given wavelength bin as "effective counts." The summed contributions for the three general error terms are provided in the same VARIANCE_FLAT, VARIANCE_COUNTS, and VARIANCE_BKG columns as in the X1D files. Further information is provided in Section 4.2 of COS ISR 2021-03.

2.7.2 Data Quality Flags

Every photon event in a COS corrtag file has a Data Quality (DQ) flag (Table 2.19). Each flagged condition sets a specific bit in the 16-bit DQ word, thus allowing each event during an exposure to be flagged with multiple conditions using the bitwise logical OR operation. DQ flags can be divided into five types:

1. Spatial flags mark events which fall on a detector region which may be questionable. The BPIXTAB reference file marks the corners of each region on the detector which falls into each of these categories. Separate BPIXTAB files are used for the FUV and NUV detectors. These regions were determined by visual inspection of a set of science data files. For FUV data, the GSAGTAB is applied along with the BPIXTAB and SPOTTAB. The GSAGTAB is used to flag regions that are severely gain sagged (with a median PHA of less than 3). The SPOTTAB is the hot spot reference table.

The DQICORR step of calcos maps these spatial regions to the individual photon events, and the x1dcorr module uses these flags and the value of SDQFLAGS to create the DQ and DQ_WGT arrays, and ultimately to determine which events to include in the final (x1dsum) spectrum (Section 3.4.18).

The spatial flags include:

  • Detector shadows (4) include the locations of the grid wires for the FUV detector, and the vignetted region on the NUV detector.
  • Poorly calibrated regions (8) include areas near the edge of the detector which may be suspect.
  • Very low response regions (16) are areas on the detector where the response presents a >80% depression.
  • Background features (32) correspond to regions on the detector where the background count rate has been observed to be higher than the surrounding region and/or unstable.
  • The pixel out-of-bounds flag (128) marks regions outside of the calibrated region of the detector.
  • Low response regions (1024) are areas on the detector where the response presents a >50% depression.
  • Low PHA features (4096) are regions in which unusual features have been identified in long background exposures. These features may have an effect on very low count rate observations.
  • Gain-Sag holes (8192) are regions on the FUV detector where the gain is low enough that the calibration may be affected (see Section 3.7.15 describing the GSAGTAB reference file).  Gain-Sag holes differ from low-response regions in being time-dependent and so are updated in the GSAGTAB instead of BPIXTAB.

2. Temporal flags mark photons that occur during time spans in which the data quality is suspect. Events flagged in this way will be removed from the data products, and the exposure time will be adjusted accordingly. Two types of temporal flags are used:

  • FUV event bursts (64), which are flagged by the BRSTCORR module of calcos. As of this writing, no bursts have been seen on orbit, so the BRSTCORR step is set to OMIT by default. If bursts are seen at some point, it is likely that the parameters in the BRSTTAB reference table will have to be adjusted before using BRSTCORR.
  • Other Bad Time Intervals (2048) can be defined in the BADTTAB reference file, for time ranges that are known to be problematic. At present, STScI has not defined any bad time intervals, but users running calcos on their own may wish to define their own intervals in order to exclude times with high background, etc.

3. Spatial and Temporal flags mark events that fall on a specific part of the detector, but also during specific time spans in which the data quality is suspect. Currently only the hotspot flag falls into this category.

  • Hotspot flag (2) only applies to FUV data. If a hotspot overlaps any of the good time intervals, the region is added to the set of regions that are applied to create the DQ mask and against which each event is tested to assign a DQ value. The hot-spot regions are flagged in the two-zone extraction module, even if they fall only in the outer zone, and they do not contribute to the summed spectra in the x1dsum file.

4. Event flags are set by calcos if a photon event falls outside defined thresholds. Currently, only the FUV Pulse Height flag (512) falls into this category. All FUV events with pulse heights falling outside the range specified in the PHATAB reference file will have this flag set, and the data will be excluded by the DQICORR module. This flag does not apply to NUV data. The default value of SDQFLAGS does not include 512, but pulse height thresholding is still conducted by default.

5. Lost Data flags occur if data are missing for some reason, such as errors in transmitting the data from the instrument to the ground. Data marked with these flags are always excluded from the final products. There are two flags in this category:

  • Reed-Solomon errors (1)
  • Fill Data (256)

Screening for temporal and event flags is done by turning calibration switches on or off, or by altering reference files. Once a photon has been determined to have a bad temporal or event flag, it will never appear in a final data product (i.e., x1dsum or x1dsumN) unless the modules which screen for it are turned off or the reference files which define them are changed. Events with a spatial DQ flag are included in the calibrated product, and flagged in the final DQ array. The screening for the spatial flags can be easily altered by changing the SDQFLAGS keyword in the header of the raw data file.  The default value for SDQFLAGS for the FUV is 8346, and it is 152 for the NUV.

The DQ extension of raw ACCUM files will be filled only when there are missing (data lost) or dubious (software error) data. If no such errors exist, initialization will produce an empty data quality extension whose header has NAXIS=0. These flags are set and used during the course of calibration, and may likewise be interpreted and used by downstream analysis applications. See Section 3.4.16 for more information on the data quality initialization calibration module.

Table 2.19: COS Data Quality Flags

FLAG Value  Bit Setting          Quality Condition                                   Type                  FUV/NUV
0           0000 0000 0000 0000  No anomalies                                        N/A                   Both
1           0000 0000 0000 0001  Reed-Solomon error                                  Lost data             Both
2           0000 0000 0000 0010  Hot Spot                                            Spatial and Temporal  FUV
4           0000 0000 0000 0100  Detector shadow                                     Spatial               Both
8           0000 0000 0000 1000  Poorly calibrated region (including detector edge)  Spatial               Both
16          0000 0000 0001 0000  Very low response region (>80% depression)          Spatial               Both
32          0000 0000 0010 0000  Background feature                                  Spatial               FUV
64          0000 0000 0100 0000  Burst                                               Temporal              FUV
128         0000 0000 1000 0000  Pixel out-of-bounds                                 Spatial               Both
256         0000 0001 0000 0000  Fill data                                           Lost data             Both
512         0000 0010 0000 0000  Pulse Height out of bounds                          Event                 FUV
1024        0000 0100 0000 0000  Low response region (>50% depression)               Spatial               Both
2048        0000 1000 0000 0000  Bad time interval                                   Temporal              Both
4096        0001 0000 0000 0000  Low PHA feature                                     Spatial               Both
8192        0010 0000 0000 0000  Gain-Sag Hole                                       Spatial               FUV
16384       0100 0000 0000 0000  FUV detector edge dark rates                        N/A                   N/A

Note 1: The flag values used in the default SDQFLAGS are 2, 8, 16, 128, and 8192 for the FUV, and 8, 16, and 128 for the NUV.

Note 2: Additional information on detector edge dark rates may be found in COS ISR 2019-11.

2.7.3 Explanation of DQ flags

The DQ flags listed in Table 2.19 each represent a particular data quality condition. Each flag is assigned its own bit, so the flag values are unique and can be combined and later disentangled. Values in the _x1d DQ array that are not listed in the table represent multiple DQ flags set at that pixel. To determine which flags are combined in a given DQ value, perform bitwise math or inspect the value in binary. For example, a DQ value of 1040 in binary is 0000 0100 0001 0000. Using Table 2.19, we can see that the 1s align with DQ flag 16 (0000 0000 0001 0000), a very low response region, and DQ flag 1024 (0000 0100 0000 0000), a low response region. To confirm, 1024 + 16 = 1040, so those are indeed the flags that make up the DQ value at that pixel.
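The decoding described above can be sketched in a few lines of Python. This is only an illustration: the flag descriptions are transcribed from Table 2.19, and the names COS_DQ_FLAGS and decode_dq are hypothetical helpers, not part of calcos.

```python
# Flag values and descriptions transcribed from Table 2.19.
# COS_DQ_FLAGS and decode_dq are illustrative names, not calcos functions.
COS_DQ_FLAGS = {
    1: "Reed-Solomon error",
    2: "Hot Spot",
    4: "Detector shadow",
    8: "Poorly calibrated region (including detector edge)",
    16: "Very low response region (>80% depression)",
    32: "Background feature",
    64: "Burst",
    128: "Pixel out-of-bounds",
    256: "Fill data",
    512: "Pulse Height out of bounds",
    1024: "Low response region (>50% depression)",
    2048: "Bad time interval",
    4096: "Low PHA feature",
    8192: "Gain-Sag Hole",
    16384: "FUV detector edge dark rates",
}


def decode_dq(value):
    """Return the (flag, description) pairs whose bits are set in value."""
    return [(flag, name) for flag, name in COS_DQ_FLAGS.items() if value & flag]


# Decode the example value from the text: 1040 = 1024 + 16.
for flag, name in decode_dq(1040):
    print(f"{flag:>5}  {name}")
```

A DQ value of 0 decodes to an empty list, which corresponds to the "No anomalies" row of the table.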

Alternatively, to isolate a single flag such as the gain-sag holes (DQ flag = 8192), one can select all of the values in the DQ array that contain 8192 using bitwise math. The syntax for this in Python is: (dq_array & 8192) == 8192. The syntax differs slightly between programming languages, but is generally similar. For Python users, NumPy also provides a function called bitwise_and, which does the same thing.
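As a minimal sketch of this selection, the example below applies both forms to a small made-up DQ array. The array contents are invented for illustration; in practice dq_array would be the DQ column read from an _x1d file.

```python
import numpy as np

# Made-up DQ values for illustration only:
# 8208 = 8192 + 16 and 9216 = 8192 + 1024 both contain the gain-sag bit.
dq_array = np.array([0, 16, 8192, 8208, 1040, 9216], dtype=np.uint16)
GAIN_SAG_HOLE = 8192

# Operator form; the parentheses make the order of evaluation explicit.
mask_operator = (dq_array & GAIN_SAG_HOLE) == GAIN_SAG_HOLE

# Equivalent function form.
mask_function = np.bitwise_and(dq_array, GAIN_SAG_HOLE) == GAIN_SAG_HOLE

# Indices of the pixels that carry the gain-sag flag.
flagged_indices = np.flatnonzero(mask_operator)
print(flagged_indices)
```

Both masks are boolean arrays of the same length as dq_array, so they can be used directly to index the wavelength or flux arrays of the same spectrum.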

The DQ_WGT column takes into account only those DQ flags that are contained within the SDQFLAGS value. SDQFLAGS is a header keyword whose value is set in the header of the first science extension. Pixels that contain any of the DQ values within the SDQFLAGS value will have a DQ_WGT of 0. Everywhere else, the DQ_WGT will be 1 in an _x1d file. Note that the DQ_WGT can be greater than 1 in the x1dsum files, because it combines the DQ_WGT arrays from the individual files.
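The relationship between DQ, SDQFLAGS, and DQ_WGT in a single _x1d file can be illustrated as follows. This is a sketch of the rule stated above, not the calcos implementation, and the DQ values are invented for the example.

```python
import numpy as np

# Invented DQ values for illustration; 8346 is the default FUV SDQFLAGS.
dq = np.array([0, 16, 32, 1040, 8192, 4096], dtype=np.uint16)
sdqflags = 8346

# A pixel gets weight 0 if it shares any bit with SDQFLAGS, otherwise 1.
dq_wgt = np.where((dq & sdqflags) != 0, 0, 1)
print(dq_wgt)
```

Here the pixels with DQ values 32 and 4096 keep a weight of 1 because neither flag is part of the default FUV SDQFLAGS, while 1040 is rejected because it contains flag 16.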

For the FUV, the following "serious" DQs (SDQs) are by default flagged for removal from data using SDQFLAGS=8346: hot spots (2), poorly calibrated regions (8), very low response regions (16), pixel out-of-bounds (128), and gain-sag holes (8192). For the NUV, the following SDQs are by default flagged for removal from data using SDQFLAGS=152: poorly calibrated regions (8), very low response regions (16), and pixel out-of-bounds (128). The SDQFLAGS value differs between the NUV and the FUV simply because of differences between the detectors.

We recommend that users not remove any of the aforementioned SDQs from the SDQFLAGS value. Doing so will result in these bad regions not being removed from data, which significantly affects data quality. To flag additional DQs for removal beyond the default SDQs, users should add the DQ value in question to the current SDQFLAGS value. For example, to also remove all background features (DQ = 32) from FUV data, one would add 32 + 8346, and update the headers in the _rawtag or _rawaccum file to SDQFLAGS = 8378. The file should then be calibrated through calcos again. As a result, the DQ_WGT will equal 0 wherever the DQ 32 flag is set (as found with a bitwise AND with 32, described above), in addition to wherever it was 0 before.
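The SDQFLAGS arithmetic above can be checked with a short Python sketch. The constant and function names are illustrative only. The sketch uses a bitwise OR rather than addition, which is a choice made here and not something calcos requires: the two give the same result when the flag is not yet set, but OR stays correct if the flag is already present.

```python
# Default serious DQ flags as listed in the text (names are illustrative).
FUV_DEFAULT_FLAGS = (2, 8, 16, 128, 8192)
NUV_DEFAULT_FLAGS = (8, 16, 128)
BACKGROUND_FEATURE = 32


def combine_flags(flags):
    """Combine individual DQ flag values into one SDQFLAGS value."""
    value = 0
    for flag in flags:
        value |= flag
    return value


fuv_sdqflags = combine_flags(FUV_DEFAULT_FLAGS)  # 8346
nuv_sdqflags = combine_flags(NUV_DEFAULT_FLAGS)  # 152

# Add the background-feature flag to the FUV default: 8346 + 32 = 8378.
new_sdqflags = fuv_sdqflags | BACKGROUND_FEATURE

# Applying the same flag a second time with OR leaves the value unchanged,
# whereas plain addition would give 8410, which clears bit 32 and sets
# bit 64 (the burst flag) instead.
repeated_or = new_sdqflags | BACKGROUND_FEATURE
repeated_add = new_sdqflags + BACKGROUND_FEATURE

print(fuv_sdqflags, nuv_sdqflags, new_sdqflags, repeated_or, repeated_add)
```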