5.3 Working with Extracted Spectra
Here we discuss how to customize the extraction of spectra and modify reference files that control the extraction process with calcos. Please note that STScI supports the FITS module located in the Python astropy.io package. This module provides the same core functionality as the earlier module PyFITS. Any script that is uses PyFITS can be modified by importing FITS into Python as follows:
from astropy.io import fits as pyfits
Information on the FITS module can be found on the astropy webpage.
5.3.1 Working with
x1d Files in Python
When calibrating a single spectroscopic exposure, the calcos pipeline creates a one-dimensional extracted spectrum file, with suffix
x1d and a filename such as "
x1d files are MEF format files and their data contents and extension formats are discussed in Section 2.4.3. As with other COS data files, the primary  header will contain only header information and no data. The extracted spectra are stored in a single [SCI] extension as a multi-dimensional binary table. A standard FITS table consists of columns and rows forming a two-dimensional grid of cells; however, each of these cells can contain a data array, effectively creating a table of higher dimensionality.
Using the "Selectors Syntax" to work with tables
In order to analyze COS tabular spectral data with most Python modules, one can use the selectors syntax to specify the desired row and column (e.g., the wavelength or flux). The general syntax for selecting a particular cell is to first select the desired segment, and then select the column:
Similarly, one could use the where() module in numpy to select the segment:
Dumping x1d data to an ASCII File
It is possible to dump the arrays of an
x1d file to an
ASCII file. For example, to extract the WAVELENGTH, FLUX, ERROR, and DQ_WGT columns of FUV
This will create a new 2-D
ASCII text file,
Each row of the science extensions in an
x1d file will contain the columns listed in Table 2.5; a similar table, including array dimensions, can be displayed by using the Table module in astropy.table.
When using many Python routines with
x1d files as input, it will be necessary to specify the extension number of the file. For example, to plot flux vs. wavelength for both segments in an
x1d file using the plot module in matplotlib.pyplot:
In this example, we explicitly reference extension 0 and 1 to obtain segment FUVB and FUVA respectively. One can verify this by printing the segment names:
For FUV data, the
x1d files contain both1 segments A and B. To plot flux vs. wavelength in an FUV
x1d file for segment A or segment B, using the plot module:
For NUV data, the
x1d files contain the three stripes A, B, and C. To plot flux vs. wavelength in an NUV
x1d file for all three stripes sequentially:
5.3.2 Redoing Spectral Extraction
The x1dcorr module in calcos is designed to extract flux calibrated 1-D spectra from corrected COS event lists (
corrtag files). This module is called by calcos as part of standard pipeline processing; its functioning in that role is described in Section 3.4.18.
Correcting for Shifts Along the Dispersion Direction
Properly aligning the spectrum along the dispersion direction is important not only for obtaining the correct wavelength solution, but also for properly applying the flux calibration. Incorrect registration of the spectrum will result in the wrong sensitivity being applied at each wavelength. This is especially important for low resolution spectra, since at some wavelengths the sensitivity changes rapidly with wavelength. The throughput also changes rapidly for the blue modes.
rawtag exposures the wavecal lamp exposures are taken either concurrently with the science
rawtag spectra (
TAGFLASH) or they are acquired as separate
rawtag spectra (
GO wavecals). For all science
rawaccum exposures the wavecals are acquired as separate
The wavecal exposures are used by calcos to determine the location of both the wavecal image and the corresponding science image on the detector. The locations may vary in a non-repeatable manner due to non-repeatability of the COS grating positions, but are always at a fixed position from one another. When auto-wavecals are acquired as separate exposures they are taken close in time to the science exposures, with the grating in the same position as during the science exposure.
After processing data through calcos, you may decide that you need to shift the spectrum along the dispersion direction to correct offsets in the wavelength calibration. For example, wavelength calibration offsets may occur due to offsets of the target from the center of the PSA aperture (which can occur if the target acquisition was imperfect), or from drift of the grating due to thermal flexure. Assuming that calcos has been run on the data and a residual wavelength offset has been found in the calibrated spectrum, the offset can be corrected by first calculating the number of pixels corresponding to the offset, then subtracting it from the CORR or FULL position of coordinates along the dispersion direction. In the example below, the shift is applied to the XFULL column of the
a,b) file. In Python, the corrtag can be modified to apply the shift as follows:
In the example above, the XFULL positions of the science spectrum in an FUV A
corrtag file have been moved by "SHIFT" pixels. Also, the header was updated such that
X1DCORR will be performed again. The
corrtag can then be run through calcos again if desired.
There is also a terminal command line option to apply a SHIFT1 value that is different than the one calcos finds internally. The values need to be in a separate file with the following columns included in the file: Dataset rootname, fpoffset, flash number, stripe or segment, desired SHIFT1 value, and desired SHIFT2 value. Calcos will recognize "any" as a value for fpoffset if the rootname is unique (not the association rootname). It will also recognize "any" for the flash number and stripe and segment. While SHIFT1 values (dispersion direction shift) must be specified in the file, specifying SHIFT2 values (cross-dispersion direction shift) is optional. This option can be called in the terminal as:
calcos –shift shift_filename.txt rawtag.fits
or in Python after importing calcos as:
Adjusting the Background Subtraction
For spectra, background regions offset from the extraction region are used to determine the background. You can adjust the default parameters for this background region by first copying the
_1dx reference file listed under the
XTRACTAB keyword in the primary header (or
_2zx reference file if performing a two-zone extraction) to a local directory, then adjusting the background parameters within the local version of the
_2zx reference file. Once you have adjusted the parameters to your satisfaction, edit the primary header of the
rawtag file to indicate the path to the local version of the
_2zx file. You can then run calcos with the updated background subtraction parameters.
Examples of customizing extraction box heights are provided in Appendix A of COS ISR 2017-03. The background parameters available for editing in the
_2dx files are described in Section 3.4.19.
The header keywords in the data files to be processed need to point to the updated
_2zx reference files, and the headers scan be edited with the setval() function, part of the astropy.io.fits module in stenv. Below is an example modifying the header to point to a custom
5.3.3 Splicing Extracted Spectra
The IRAF task splice can be applied to combine overlapping extracted COS spectra (e.g., spectra taken with different central wavelengths). Information about using IRAF and the STSDAS analysis software can be found in Chapter 3 of the Introduction to HST Data Handbooks. Users should be aware of differences in spectral resolution at a given wavelength between different cenwaves. Splice takes into account the error (ERR) array as well as the data quality (DQ) array. Handling of the DQ array is important as it helps splice perform the combination properly and avoid bad or noisy data in the output file arising from the large changes in throughput at the edges of the detector.
cl> splice obs1_x1d.fits,obs2_x1d.fits output_splice.fits
Please refer to the splice task help file for more useful information. If a multispec format spectrum is preferred for further analysis, the task tomultispec can be run on the output file of the splice task.
Running splice as mentioned above (rather than transforming individual
x1d fits tables into multispec format before combining them) has important advantages: it keeps the science data, error, and DQ arrays intact allowing for easier error analysis, and it does not have a limitation on the number of segments or wavelengths to include, which is the case with the multispec format. This limitation is caused by the size limit of the FITS header, which requires the wavelength scale to be fit with a function. We note that a Python replacement for the IRAF task splice is under development, and IRAF is no longer actively maintained.
1 For FUV
x1d files, both segments A and B will be present as long as the individual raw data from both segments were available at the time of processing. If only one segment was present during processing, then a row selector of row=0 will point to the data from that segment. Similarly, a row selector of row=1 will result in an error.
COS Data Handbook
- Chapter 1: COS Overview
- Chapter 2: COS Data Files
- Chapter 3: COS Calibration
- Chapter 4: COS Error Sources
- Chapter 5: COS Data Analysis
- Appendix A: COS Lifetime Positions
- Appendix B: COS High Voltage History
- • List of Figures
- • List of Tables