3.6 Customizing COS Data Calibration

Sometimes the pipeline calibration performed shortly after the data were obtained from the telescope is not the best possible calibration for your science program. There are a number of reasons why it may be desirable to recalibrate your data. The most likely reasons include:

  • More appropriate reference files have become available since the data were originally obtained.
  • Some steps need to be repeated with different input parameters. For example, you may wish to re-perform the 1-D spectral extraction with a smaller BOXCAR height in order to minimize the background, or you may wish to cut a TIME-TAG exposure into sub-exposures, in order to study time variability.

In the first case, we recommend simply re-requesting the data from the archive, which will deliver data reprocessed with the latest reference files. In the second case, to tailor the calibration to your individual preferences, it may be beneficial to run calcos yourself on your local machine, or to use tasks that improve the reference files or allow customized treatment of the data. Calcos is imported and executed within Python.

Be sure you are using the latest versions of calcos, the COS reference files, and the raw data files (which list the latest reference files in their headers). Calcos release information can be found at http://www.stsci.edu/hst/instrumentation/cos/documentation/calcos-release-notes/.

Calcos contains provisions for recalibrating raw data. Users can specify the pipeline processing steps to be performed and select the associated reference files. However, calcos was not designed to run its various modules independently, i.e., it is not re-entrant. The pipeline flow is modified by setting calibration switches or reference file names and then rerunning the entire pipeline. The calibration switches in the headers of the calibrated data files will reflect the operations performed on the calibrated data and the reference files used. 

Users are encouraged to use the Jupyter notebooks for COS. A list of those available is given in Section 5.1, Data Reduction and Analysis Applications.

3.6.1 Mechanics of Tailored Recalibration

If you choose to recalibrate your COS data on your local machine, a certain amount of setup is required for calcos to run properly. The operations in the checklist below are described in detail in the following subsections:

  • Set up a directory structure for the required reference files.
  • Determine which reference files are needed and retrieve them from the Archive.
  • Set the environment variable lref to point to your reference file directory.
  • Update the input data file headers (including reference file names).
  • Set the calibration switches in the headers of the raw data files to perform the needed steps. The default calibration switches are listed in Table 2.16 and Table 2.17.
  • Update the input association files if changing files to be included.
  • Run calcos.

Set up the Directory Structure for Running calcos

Before running calcos, you will need to define an environment variable to indicate the location of the directory containing the needed calibration reference files. The names of the calibration files are preceded with the logical path name "lref$" in the COS science headers. You will need to define an environment variable from the host command line (see below) that is appropriate to your host machine. For Unix/Linux/Mac systems, the appropriate command for a .bashrc file and the directory "/data/vega3/cos/cal_ref/," for example, would be:

     % export lref=/data/vega3/cos/cal_ref/

Note that an alternative to using the lref$ variable is to specify the full pathnames of the reference files in the science headers. However, there is a limit on the length of these pathnames.

When running calcos or any of its modules, you must define environment variables (such as lref$).
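For users who start calcos from a Python session rather than from a shell, the same variable can be set through os.environ before calcos is run. The following is a minimal sketch only: set_lref is a hypothetical helper name (it is not part of calcos), and it assumes, as in the export example above, that the value should end with a path separator.

```python
import os


def set_lref(ref_dir):
    """Point the lref environment variable at ref_dir (hypothetical helper)."""
    ref_dir = os.path.abspath(os.path.expanduser(ref_dir))
    if not os.path.isdir(ref_dir):
        raise FileNotFoundError(f"reference directory not found: {ref_dir}")
    # Keep the trailing separator, as in the export example above, so that
    # a header value such as "lref$name.fits" can expand to "<dir>/name.fits".
    os.environ["lref"] = os.path.join(ref_dir, "")
    return os.environ["lref"]
```

Calling set_lref('/data/vega3/cos/cal_ref') at the top of a script affects only that Python process and any processes it launches; it does not change the login shell.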

Retrieve Reference Files

To recalibrate your data, you will need to retrieve the reference files used by the different calibration steps to be performed. The names of the reference files to be used during calibration must be specified in the primary header of the input files, under the section "CALIBRATION REFERENCE FILES." Note that the data headers will be populated already with the names of the reference files used during pipeline calibration at STScI.

The COS reference files are all in FITS format, and can be in either IMAGE or BINTABLE extensions. The names of these files along with descriptions of their contents are given in Section 3.7. The rootname of a reference file is based on the time that the file was delivered to the Calibration Reference Data System (CRDS).

Chapter 1 of the Introduction to HST Data Handbooks describes how to retrieve data and reference files via the World Wide Web. To retrieve the best reference files via MAST (generally meaning the most recent reference files), check "Best Reference Files" in the "Reference Files" section of the Retrieval Options form. The reference files can also be downloaded from CRDS at https://hst-crds.stsci.edu/.
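Before recalibrating, it can help to confirm that every reference file named in a header is actually present in the local reference directory. The sketch below is illustrative only: referenced_files and missing_files are hypothetical helper names, and a plain dictionary stands in for the primary header. An astropy Header object offers the same items() interface, so it could be passed in the same way.

```python
import os


def referenced_files(header):
    """Return {keyword: file name} for header values that start with 'lref$'."""
    refs = {}
    for key, value in header.items():
        if isinstance(value, str) and value.lower().startswith("lref$"):
            # Drop the logical path prefix and any padding.
            refs[key] = value.split("$", 1)[1].strip()
    return refs


def missing_files(header, ref_dir):
    """Return the sorted reference file names not found in ref_dir."""
    names = referenced_files(header).values()
    return sorted(n for n in names if not os.path.isfile(os.path.join(ref_dir, n)))


# Stand-in for a primary header (values are illustrative).
example_header = {
    "FLATFILE": "lref$n9n201821_flat.fits",
    "BRSTCORR": "OMIT",
    "BRSTTAB": "N/A",
}
```

Any names returned by missing_files(example_header, ref_dir) would still need to be retrieved from MAST or CRDS as described above.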

Edit the Calibration Header Keywords

To edit file headers in preparation for recalibration, use the astropy.io.fits convenience function setval(). The setval() function takes several input parameters: the name(s) of the raw data files to be edited, the header field to be edited, and the new value of the header field. It can be used to change the values of any calibration switches, reference files or tables to the values you wish to use for recalibrating your data. To edit the calibration keyword values:

  1. Run the setval() function in the astropy.io.fits module. This can be done on a single file, or in a for-loop to update several files. For example, you could change the flat reference file in the following way:  

    > from astropy.io import fits
    > fits.setval('filename_rawtag_a.fits', 'flatfile', \
          value='lref$n9n201821_flat.fits', ext=0)

  2. To update several files at once, use the glob() function in the glob module to select all the raw files using wildcards, and then use a for-loop as shown below: 

    > from astropy.io import fits
    > import glob
    > rawfiles = glob.glob('*raw*.fits')
    > for myfile in rawfiles:
    >     fits.setval(myfile, 'flatfile', \
          value='lref$n9n201821_flat.fits', ext=0)

    Similarly, to turn off the FUV burst calibration switch use the command: 

    > from astropy.io import fits
    > fits.setval('filename_rawtag_a.fits', 'brstcorr', \
          value='OMIT', ext=0)

Edit the Input Association File

Users may find it necessary to edit the input association file for calcos. Reasons for editing an association file might include the use of a different wavecal or the removal of a compromised exposure from an association. For this option, the full file name (but not the directory) must be given, and the case must be correct. One way to update an association file is to use the astropy.io.fits module and the astropy.table.Table class. For example, use the Table.read() function to open and look at the contents of the association table ldel05050_asn.fits.

> from astropy.table import Table
> t = Table.read('ldel05050_asn.fits')
> print(t)
<Table length=5>
 MEMNAME  MEMTYPE MEMPRSNT
  str14    str14    bool  
--------- ------- --------
LDEL05JYQ  EXP-FP     True
LDEL05K0Q  EXP-FP     True
LDEL05K2Q  EXP-FP     True
LDEL05K4Q  EXP-FP     True
LDEL05050 PROD-FP     True

To quickly see basic exposure information for a list of exposures, such as those in the association, use the glob package, a for-loop, and the getheader() convenience function in the astropy.io.fits module:

> from astropy.io import fits
> import glob
> raw_files = glob.glob('ldel05*raw*.fits')
> for myfile in raw_files:
>     hdr0 = fits.getheader(myfile)
>     print(hdr0['filename'], \
>     hdr0['detector'], hdr0['aperture'], \
>     hdr0['opt_elem'], hdr0['cenwave'], \
>     hdr0['obsmode'], hdr0['fppos'])
ldel05jyq_rawtag_a.fits FUV PSA G160M 1611 TIME-TAG 1
ldel05k0q_rawtag_a.fits FUV PSA G160M 1611 TIME-TAG 2
ldel05k2q_rawtag_a.fits FUV PSA G160M 1611 TIME-TAG 3
ldel05k4q_rawtag_a.fits FUV PSA G160M 1611 TIME-TAG 4

To remove a member from the association ldel05050_asn.fits, use astropy.io.fits to open the association in update mode, set MEMPRSNT to False for the unwanted rows (here, the first two exposures), and close the file so that the changes are written to disk:

> from astropy.io import fits
> from astropy.table import Table
> hdulist = fits.open('ldel05050_asn.fits', mode='update')
> tbdata = hdulist[1].data
> print(Table(tbdata))
 MEMNAME  MEMTYPE MEMPRSNT
--------- ------- --------
LDEL05JYQ  EXP-FP     True
LDEL05K0Q  EXP-FP     True
LDEL05K2Q  EXP-FP     True
LDEL05K4Q  EXP-FP     True
LDEL05050 PROD-FP     True
> tbdata['MEMPRSNT'][0] = False
> tbdata['MEMPRSNT'][1] = False
> print(Table(tbdata))
 MEMNAME  MEMTYPE MEMPRSNT
--------- ------- --------
LDEL05JYQ  EXP-FP    False
LDEL05K0Q  EXP-FP    False
LDEL05K2Q  EXP-FP     True
LDEL05K4Q  EXP-FP     True
LDEL05050 PROD-FP     True
> hdulist.close()

Finally, reprocess the data by running calcos on the updated association file.

Run calcos

In stenv, users may choose between two methods of running calcos: from within Python or from the Unix/Linux/Mac command line. The input arguments and examples for each case are as follows:

  1. To run calcos in Python:   

    > import calcos
    > calcos.calcos('filename_asn.fits', verbosity=2, outdir="new")
  2. To run calcos from the Unix/Linux/Mac command line: 

    % calcos -o new --stim stim.txt filename_asn.fits

Table 3.2: Arguments for Running calcos in Python.

Argument                Values          Description
asntable                " "             Association table (asn) or individual raw file (rawtag, rawaccum) to be processed
outdir                  directory name  The name of the output directory
verbosity               0, 1, 2         0=quiet, 1=verbose, 2=very verbose
find_target             True or False   Have calcos find the spectrum location and center the extraction box on that location
create_csum_image       True or False   If True, write an image that reflects the counts detected at each pixel (includes deadcorr but not flatcorr), for OPUS to add to the cumulative image.
shift_file                              File containing wavecal shifts (will override shifts calculated by calcos)
save_temp_files         True or False   Save temporary files: x1d_a, x1d_b, lampflash_a, and lampflash_b
stimfile                                If specified, the stim pulse positions will be written to (or appended to) this text file
livetimefile                            If specified, the livetime factors will be written to (or appended to) this text file
burstfile                               If specified, burst information will be written to (or appended to) this text file
raw_csum_coords         True or False   If True, use raw pixel coordinates (rather than thermally and geometrically corrected) to create the csum image.
only_csum               True or False   If True, create a csum image, but most other files will not be written.
binx, biny              int or None     Binning factor for the X and Y axes, or None, which means that the default binning (currently 1) should be used.
compress_csum           True or False   If True, compress the "calcos sum" image.
compression_parameters                  Two values separated by a comma; the first is the compression type (rice, gzip or hcompress), and the second is the quantization level.

Table 3.3: Command-line Options for Running calcos in Unix/Linux/Mac.

Option                 Description
--version              Print the version number and exit
-r                     Print the full version string and exit
-v                     Very verbose
-s                     Save temporary file
-o outdir              Output directory
--find yes             Have calcos find Y location of spectrum
--find no              Extract spectrum at default location
--find cutoff          Find Y location if sigma <= cutoff
--shift filename       File to specify shift values
--stim filename        Append stim pulse locations to filename
--live filename        Append livetime factors to filename
--burst filename       Append burst information to filename
--csum                 Create 'calcos sum' image
--only_csum            Do little else but create csum
--raw                  Use raw coordinates for csum image
--compress parameters  Compress csum image
--binx X_bin_factor    csum bin factor in X
--biny Y_bin_factor    csum bin factor in Y

To redirect the calcos STDOUT to a file use the following command:

% calcos -v -o new filename_asn.fits > log.txt
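The same command-line run can be launched from a Python script with the standard subprocess module, which is convenient when several associations are processed in a loop and each needs its own log. This is a sketch only: calcos_command and run_calcos are hypothetical helper names, the sketch uses only the -v, -o, and --stim options listed in Table 3.3, and it assumes the calcos executable is on the PATH.

```python
import subprocess


def calcos_command(asn, outdir=None, very_verbose=False, stim=None):
    """Build the argument list for a command-line calcos run."""
    cmd = ["calcos"]
    if very_verbose:
        cmd.append("-v")
    if outdir:
        cmd += ["-o", outdir]
    if stim:
        cmd += ["--stim", stim]
    cmd.append(asn)
    return cmd


def run_calcos(cmd, logfile="log.txt"):
    """Run the command and write its output to logfile."""
    with open(logfile, "w") as log:
        # Unlike the plain "> log.txt" redirect, this also captures STDERR.
        return subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT, check=True)
```

For example, run_calcos(calcos_command('filename_asn.fits', outdir='new', very_verbose=True)) corresponds to the shell command shown above.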

While we recommend that users run calcos on association files, it is possible to run calcos with a single raw or corrtag file as the input. In this mode, calcos will always automatically process both segment files for FUV data if they both exist. For example, if rootname_rawtag_a.fits is the input for calcos, then rootname_rawtag_b.fits will automatically be processed as well. The data from both segments will be calibrated and combined to create the final product, rootname_x1d.fits.

Running calcos on rawtag or corrtag files instead of the asn file will cause the FUVB-only blue modes (G130M cenwaves 1055 and 1096) to be calibrated without the associated segment A EXP-IWAVE file contained in the asn.

3.6.2 Using GO Wavecals

Through the use of associations, calcos also contains a provision to select wavecals other than the default for calibration of the science exposures. To use an exposure other than or in addition to the default wavecal, the user can add a row to the association table. The rootname (case insensitive) should be given in the MEMNAME column, the string EXP-GWAVE in the MEMTYPE column, and the value in the boolean MEMPRSNT column set to True. Make sure that the WAVECORR keyword in the primary header of the raw science file is set to PERFORM, and then run calcos as normal. Note that GO wavecals can only be used with non-TAGFLASH data.

3.6.3 Customizing the TWOZONE extraction

The TWOZONE extraction algorithm and the associated pipeline reference files are optimized for the spectral extraction of bright point source spectra. There are, however, a number of circumstances under which a customized extraction might yield better results.

  • For extended sources the use of the TWOZONE algorithm may lead to an underestimate of the measured flux and/or poor alignment of the extraction region with the source.
  • For very faint point or extended sources, the ALGNCORR step may not always be able to reliably measure the position of the source, and may default to assuming that the target is already aligned with the reference profile.
  • For some very faint point sources it may be possible to significantly improve the signal-to-noise by reducing the extraction height to minimize the included detector background.

If the user simply wishes to use the BOXCAR extraction in place of the TWOZONE algorithm, XTRCTALG should be set to "BOXCAR," and TRCECORR and ALGNCORR should be set to "OMIT." This will use the larger extraction regions defined for that algorithm. However, for observations at LP3 there may be significant overlap with the gain-sagged regions near LP1, and this may affect the accuracy of the calibration or even create artificial spectral features. This is similarly true of LP4 overlapping LP3, although to a lesser extent as LP3 is not as severely gain-sagged as LP1.
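The three keyword changes described above can be collected in one place and applied together, which makes it harder to change XTRCTALG while forgetting TRCECORR or ALGNCORR. The sketch below is illustrative: apply_keywords is a hypothetical helper name, and it works on any mapping that supports get() and item assignment, which is assumed here to include an astropy Header opened in update mode.

```python
# Keyword values taken from the text above.
BOXCAR_SWITCHES = {
    "XTRCTALG": "BOXCAR",
    "TRCECORR": "OMIT",
    "ALGNCORR": "OMIT",
}


def apply_keywords(header, updates):
    """Set each keyword in updates on header and return the previous values."""
    previous = {}
    for key, value in updates.items():
        previous[key] = header.get(key)  # None if the keyword was absent
        header[key] = value
    return previous
```

The returned dictionary records the original settings, so the TWOZONE configuration can be restored later by passing it back to apply_keywords.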

Both the XTRACTAB, which is used with the BOXCAR algorithm, and the TWOZXTAB, which is used with the TWOZONE algorithm, contain columns named HEIGHT and B_SPEC. For the BOXCAR algorithm, these parameters together with the SLOPE column directly control the size and location of the extraction region. For the TWOZONE algorithm, the HEIGHT and B_SPEC values instead control the size and initial location of the region used for the ALGNCORR step. The HEIGHT column in the TWOZONE algorithm is also used to define the cross-dispersion width of the reference profile that is assumed to include 100% of the enclosed energy. The actual extraction region at each wavelength is adjusted so that the enclosed energy fraction of the reference profile matches the values given in the LOWER_OUTER and UPPER_OUTER columns of the TWOZXTAB. For example, if LOWER_OUTER=0.005 and UPPER_OUTER=0.995, at each wavelength the extraction region will be adjusted so that the central 99% of the encircled energy as measured from the reference profile is included. Fractional pixel locations are rounded outwards, and the final extracted flux will be scaled for the exact encircled energy fraction in each column.

To force a spectral extraction using the TWOZONE algorithm to sum over a region that contains only the central 80% of the reference profile's encircled energy, the user would just need to change LOWER_OUTER to 0.1 and UPPER_OUTER to 0.9 in the appropriate row of the TWOZXTAB prior to recalibration of the data.

Values of 0 or 1 for the enclosed energy boundaries have a special meaning. Setting the lower boundary to 0 forces the extraction to start at the bottom of the region defined by a rectangular box of size HEIGHT, while setting the upper boundary to 1 forces it to end at the top of that box. This can be used to give a rectangular extraction box rather than the wavelength-dependent extraction region normally used for the TWOZONE algorithm. For a very extended target, it might be useful to force the use of the full-height box and also to increase HEIGHT, allowing a further expansion of the extraction region.
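To make the effect of the enclosed energy boundaries concrete, the following sketch computes the rows summed in a single detector column for a toy cross-dispersion profile. It is a simplified illustration of the behavior described above, not the calcos implementation: outer_bounds is a hypothetical function, the profile values are invented, and the cumulative profile is interpolated linearly within a pixel, which is an assumption.

```python
import math
from itertools import accumulate


def outer_bounds(profile, lower=0.005, upper=0.995):
    """Return (first_row, end_row) such that rows first_row..end_row-1 are summed."""
    total = sum(profile)
    if total <= 0:
        raise ValueError("profile must have positive total counts")
    # cum[i] is the enclosed fraction below the lower edge of row i.
    cum = [0.0] + [c / total for c in accumulate(profile)]

    def crossing(frac):
        """Fractional row position where the enclosed fraction reaches frac."""
        for i in range(len(profile)):
            if cum[i + 1] >= frac:
                step = cum[i + 1] - cum[i]
                return i + ((frac - cum[i]) / step if step > 0 else 0.0)
        return float(len(profile))

    # Values of 0 and 1 select the full box; otherwise round outwards.
    first = 0 if lower <= 0 else math.floor(crossing(lower))
    end = len(profile) if upper >= 1 else math.ceil(crossing(upper))
    return first, end


# Invented 7-row reference profile spanning the full HEIGHT of the box.
toy_profile = [0, 1, 2, 4, 2, 1, 0]
```

With this toy profile, outer_bounds(toy_profile, 0.1, 0.9) returns (2, 5), that is, the three central rows holding 80% of the counts, while outer_bounds(toy_profile, 0, 1) returns (0, 7), the full rectangular box.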

The LOWER_INNER and UPPER_INNER columns in the TWOZXTAB behave very similarly to the "OUTER" boundaries, except that they are used to control the region over which data quality flags are combined rather than the region over which counts are summed. The user can also adjust these values.

The background regions in the TWOZONE algorithm are handled in a simpler fashion. To change where the background regions are located or the height of the background regions, edit the background centers (B_BKG1 and B_BKG2) and the background height (BHEIGHT). The background regions should not be placed directly above the spectrum at LP3, as that is where LP1 is located, and the detector is therefore very gain-sagged in that location. Similarly, the background regions should not be placed directly above the spectrum at LP4, as that is where LP3 is located. Also ensure that the background regions do not overlap the WCA (location found in XTRACTAB). See Figure A.1 in Appendix A for regions with low levels of gain sag.
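When moving the background regions, a quick overlap check against the spectrum box and the WCA can catch mistakes before calcos is rerun. The sketch below is a simplified illustration under stated assumptions: region and overlaps are hypothetical helpers, each region is treated as a box of the given height centered on its B_* value, the tilt described by the SLOPE column is ignored, and all numbers are invented rather than taken from any reference file.

```python
def region(center, height):
    """Rows (low, high) covered by a box of the given height centered on center."""
    half = height / 2.0
    return center - half, center + half


def overlaps(a, b):
    """True if the open intervals a and b share any rows."""
    return a[0] < b[1] and b[0] < a[1]


# Invented example values (not from any XTRACTAB or TWOZXTAB).
spectrum = region(400.0, 35)    # B_SPEC, HEIGHT
background1 = region(350.0, 20)  # B_BKG1, BHEIGHT
background2 = region(450.0, 20)  # B_BKG2, BHEIGHT
wca = region(520.0, 30)          # wavecal aperture location and height

clean = not any(
    overlaps(bkg, other)
    for bkg in (background1, background2)
    for other in (spectrum, wca)
)
```

A check of this kind does not account for gain-sagged regions; those still need to be avoided by consulting Figure A.1 as noted above.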

The user can also override the shifts calculated by ALGNCORR. This can be useful if the automatic algorithm failed to properly center the target. To do this, the user should set the keyword SP_SET_A (for detector segment FUVA) or SP_SET_B (for FUVB) to the desired offset value, which will be used in place of the SP_OFF_A or SP_OFF_B value calculated by the ALGNCORR algorithm. These keywords should be set in the extension header of the rawtag or corrtag file used as input for calcos.