5.5 Working with Extracted Spectra

Here we discuss ways of customizing the extraction of spectra from _flt, _sfl, or _crj images. It is assumed that you have already performed any of the recommended recalibration described in Section 3.5.

5.5.1 Working With _x1d Files

When used to calculate one-dimensional extracted spectra, the calstis pipeline and the stistools task x1d task puts the output into an _x1d file with a name like “o5jj01010_x1d.fits”. Note that when the input spectral image is a combination of multiple subexposures, such as a combined cosmic ray rejected image, the file name will contain the string “sx1” in place of “x1d”, e.g., “o5ja06050_sx1.fits”. When x1d is used as a standalone task, the default output file’s name will always end in “_x1d.fits”.

The _x1d file will be a multi-extension FITS file. As with other STIS data files, the primary [0] extension will contain only header information, but no data. The extracted spectra will be stored in the [SCI] extension(s). There will be one [SCI] extension in the _x1d file for each separate [SCI] extension in the input image. The science extensions of the _x1d file are multidimensional FITS tables. STIS spectra are stored as binary arrays in FITS table cells. Section 2.3.2 discusses this format and describes the selectors syntax used to specify these data arrays. Each row of the table contains the extracted spectrum for a single spectral order. For first order spectra, there is only a single row.

Each row of each of the science extensions in an _x1d file will contain the columns listed in Table 5.2; a similar table, including array dimensions, can be displayed by using the astropy.table (see Section 5.1.2). The SPORDER column lists the spectral order number. For first order spectra, this is of course always 1. The NELEM column lists the number of elements in each of the data arrays. This will most commonly be 1024, although it may be less in the case of CCD data binned along the dispersion direction, or more in the rare situations when the user opts not to bin MAMA high-res pixels down to the native pixel scale. The WAVELENGTH column will be an array of NELEM values giving the calibrated wavelength vector for the spectral extraction in units of Angstroms. The GROSS column gives the summed counts within the extraction region for each pixel along the dispersion direction. The BACKGROUND gives the background vector that was found during the BACKCORR step of the processing, and NET=GROSSBACKGROUND.

The FLUX column is an array that gives the extracted flux for the point source spectrum. The units are erg/s/cm2/Å. The ERROR array gives the propagated error vector, including Poisson noise, dark noise and readnoise, and includes the propagated error due to the noise in the background. It normally has the same units as the flux array; however, if for some reason, the FLUXCORR step of the processing did not run, the FLUX values will all be zero, and the ERROR array will be in units of counts/s rather than erg/s/cm2/Å.

The A2CENTER value gives the row number in the y direction at which the spectral trace is centered. The EXTRLOCY column is an array that gives the location of the center of the spectral trace for each pixel along the dispersion direction. The BK1SIZE, BK2SIZE, BK1OFFST, and BK2OFFST columns give the sizes of the background extraction regions, and the amount by which the centers of the background regions were offset from the spectral trace. EXTRSIZE give the height of the extraction box in the cross dispersion direction. MAXSRCH gives the range that was allowed for the cross correlation search that located the spectral location, while OFFSET gives the difference between the starting position of the search and the A2CENTER value where the spectrum was actually found.

The nominal size of a pixel in the cross dispersion direction in units of arcsec can be found in the PLATESC keyword in the primary header of any STIS data file. For example, for the file o5jj01010_x1d.fits, typing the following command, will give the result 0.024600023:

>>> from astropy.io import fits
>>> hdu = fits.open("o5jj01010_x1d.fits")
>>> print(hdu[0].header['PLATESC'])
>>> hdu.close()

Note that the plate scale of a geometrically corrected output image is given in the SCALE column of the IDCTAB.

For most first order STIS modes, the plate scale in the dispersion direction is very close to that in the cross dispersion direction. The differences are largest for the medium resolution CCD first order modes. Results for these are detailed in STIS ISR 98-23.

Table 5.2: Science Extension Binary Table Columns

Column name

Format

Units

SPORDER

short integer

dimensionless

NELEM

short integer

dimensionless

WAVELENGTH

double array

Angstroms

GROSS

real array

Counts/s

BACKGROUND

real array

Counts/s

NET

real array

Counts/s

FLUX

real array

erg/s/cm2/Angstrom

ERROR

real array

erg/s/cm2/Angstrom

DQ

short integer array

dimensionless

A2CENTER

real

pixel

EXTRSIZE

real

pixel

MAXSRCH

short integer

pixel

BK1SIZE

real

pixel

BK2SIZE

real

pixel

BK1OFFST

real

pixel

BK2OFFST

real

pixel

EXTRLOCY

real array

pixel

OFFSET

real

pixel


When working with the _x1d files in the Python environment, it is important to be aware that the spectral data for a given column (e.g., wavelength) is contained in rows, even when the number of rows is only one, as it is for first order spectra. For example, the command len(data['WAVELENGTH'returns 1 for first order spectra indicating the number of rows, whereas len(data['WAVELENGTH'][0]returns 1024 which is the length of the wavelength array for that row. Below is an example for plotting flux vs. wavelength for the first spectral order included in an _x1d file with commonly used Python libraries.

>>> from astropy.io import fits
>>> from matplotlib import pyplot as plt

# Grab data from the science extension of a fits file
>>> hdulist = fits.open('obc410010_x1d.fits')
>>> sci_data = hdulist['SCI'].data

# For an x1d file, extract the wavelength and flux separately using keywords
>>> wavelength = sci_data['Wavelength'][0]
>>> flux = sci_data['Flux'][0]

# Plot Wavelength vs Flux
>>> plt.plot(wavelength,flux)

>>> hdulist.close


The following example shows one way of plotting flux vs. wavelength for an echelle spectrum in an _x1d file:

>>> from astropy.io import fits
>>> from matplotlib import pyplot as plt

>>> d = fits.getdata('odpce4050_x1d.fits', ext=1)

>>> fig, ax = plt.subplots(1,1)
>>> fig.set_size_inches(15,5)

>>> for i, order in enumerate(d):
        ax.plot(order['WAVELENGTH'], order['FLUX'], alpha=0.5, color='C{}'.format(i % 2))
        ax.set_xlabel('Wavelength (Å)')
        ax.set_ylabel('Flux (ergs / s / cm$^{2}$ / Å)')
        ax.set_ylim(0, 0.6e-10)

5.5.2 Using the stistools Task x1d to Recalibrate Data

The stistools task x1d task is designed to extract flux calibrated 1-D spectra from unrectified STIS spectral images (_flt, _crj, or _sfl files). It serves as a front end to the calstis6 routine used in pipeline extraction. The original implementation of this code and the algorithms used are described in STIS ISR 99-03.


The x1d task is not intended for use with rectified files (_x2d or _sx2 files), or files that have been processed by the wx2d task. It can be used for either echelle or first order STIS spectra; x1d will extract a single spectrum when used with first order spectral images and a spectrum for each order when used with echelle spectral data. This task is called by calstis as part of standard pipeline processing; its functioning in that role is described in Section 3.4.24.

Users should remember that the default parameters for the x1d task will not always match the default calibration flags set in the header keywords. When running x1d as a stand alone task, the task parameters will control which options execute, while when running calstis, the header keywords will control which steps execute. This may result in different results when using the same file as input for x1d vs. calstis. For example, STIS echelle files retrieved from the OTFR pipeline have SC2DCORR set to ’PERFORM’, but the x1d default of algorithm = unweighted is equivalent to setting SC2DCORR to ’OMIT’. It would be necessary to set algorithm = sc2d in stistools.x1d to reproduce the pipeline default.

The x1d task can be run to recalibrate spectra by using the following command:

>>> from stistools import x1d
>>> x1d.x1d('o5ja03030_flt.fits')

The output spectrum will be written in the file "o5ja03030_x1d.fits". For echelle data, the user will usually want to set the algorithm parameter of the x1d task to sc2d to select the deconvolution algorithm for computing and subtracting scattered light in echelle observations. Recall that calstis instead uses the value of the keyword SC2DCORR (PERFORM or OMIT) to turn the scattered light correction on or off. (see Section 3.4.20.)

We will not attempt to describe all of the options of the x1d task here. Instead we will concentrate on the options that users are likely to want to adjust to customize the extraction of their data.

Correcting for Shifts Along the Dispersion Direction

Properly aligning the spectrum along the dispersion direction is important not only for obtaining the correct wavelength solution, but also for properly applying the flux calibration. Incorrect registration of the spectrum will result in the wrong sensitivity being applied at each wavelength. This is especially important for low resolution & echelle spectra, for which the sensitivity changes rapidly with wavelength.

Auto-wavecal exposures are generally used to determine the location of both the wavecal image and the corresponding science image on the detector. The location varies somewhat because of non-repeatability of the STIS grating positions and flexures of the STIS optical bench. Science exposures and their auto-wavecals are taken close in time and without intervening changes in the grating position to keep them at approximately the same location. If either an auto-wavecal or a user-inserted wavecal has been taken, the stistools task wavecal should be run to measure the shifts in the wavecal spectrum. This populates the SHIFTA1 and SHIFTA2 keywords in each SCI extension header of the science image. The x1d task then takes these shifts into account when calculating the dispersion relation. If there is no wavecal exposure, or if the shift calculated from the wavecal has not been applied, the spectrum may be offset in wavelength by as much as several pixels.

Shifts of the spectrum along the dispersion direction can result from spatial offsets of the target from the center line of the aperture, repositioning the grating, and thermal flexures. Users can change the adopted shift along the dispersion direction by either editing the value of the SHIFTA1 keyword in the SCI extension of the file used as input to the x1d task, or by using the xoffset parameter in the x1d task to add an additional shift. Increasing the SHIFTA1 value or specifying a positive value for the xoffset parameter, will result in a smaller wavelength value being assigned to each pixel. Applying a positive correction will cause the spectrum plotted as a function of wavelength to shift to the left. So if the target is offset by two pixels to the right of the aperture center line, then specifying xoffset=2 would correct for this offset, i.e.,

>>> from stistools import x1d
>>> x1d.x1d('o5ja03030_flt.fits', xoffset=2.)

Note that when using a non-zero xoffset parameter, the value of the SHIFTA1 keyword in the output table is not changed, although a text message is printed to standard output, and placed in a HISTORY keyword in the primary header of the output x1d file. e.g., “Offset of 10 low-res pixels added in dispersion direction”. SHIFTA1 and xoffset are always specified in units of unbinned pixels for the STIS CCD, and low-res pixels for the MAMA detectors.

Locating the Spectrum in the Cross-Dispersion (Spatial) Direction

The default procedure for finding the location in the cross-dispersion direction at which a spectrum will be extracted is described in the “locate the spectrum” part of Section 3.4.24. Normally this procedure will correctly locate the brightest spectrum in the desired search range. However, occasionally this cross-correlation procedure will find a noise spike instead of the true spectrum, or the true spectrum may be faint enough that the cross correlation does not find it. In addition, some first order spectral images may contain point-source spectra of multiple objects; the default cross correlation will only find the brightest one. Also, for echelle data the spectrum of the wrong order can be extracted if multiple orders end up falling within the search region of a given order. The region searched for a spectrum can be controlled by using the a2center and maxsrch parameters of the x1d task. When both these parameters are specified, the cross-correlation search will extend from row a2centermaxsrch to row a2center+maxsrch. If maxsrch is set to zero, the cross correlation will be turned off and a spectral extraction will be done at a2center. In the case of echelle spectral images, it is also possible to extract a single order at a time. In this case, it is possible to control the order that will be extracted at a certain position by using the sporder parameter together with the a2center and maxsrch parameters.

If several individual first order spectra are to be extracted from a single long slit STIS image, the x1d task should be run once for each extraction, with a2center set each time to approximately the mean row at which that spectrum appears, and maxsrch set to a value that is large enough to allow the cross-correlation algorithm to find the best centroid but small enough that the algorithm does not find the wrong spectrum. If individual targets are also offset from the aperture center in the dispersion direction, it will be useful to specify the appropriate xoffset values to obtain the correct dispersion relations. A different output file name should also be specified for each extraction. For example to extract the spectra of two closely spaced stars, one might do:

>>> from stistools import x1d
>>> x1d.x1d('o5ja06030_flt.fits', output='ngc6681_e.fits', xoffset=12.43, a2center=424.5, maxsrch=3)
>>> x1d.x1d('o5ja06030_flt.fits', output='ngc6681_f.fits', xoffset=16.16, a2center=436.2, maxsrch=3)

Changing the Default Extraction Box Height

For each pixel in the dispersion direction, calstis sums the values over the height of the spectral extraction box. Each endpoint of the extraction box may include a fractional part of a pixel; calstis scales the counts in the given pixel by the fraction of the pixel extracted. The default full height of the extraction box is taken from the EXTRSIZE column of the XTRACTAB table. For CCD first order spectra and echelle spectra, this default height is 7 pixels, while for MAMA first order and PRISM spectra, the default height is 11 pixels. This default value can be overridden by specifying a value for the extrsize parameter of the x1d task.

The default extraction box heights were chosen so that roughly 85% of the total flux at a given wavelength is contained in the default extraction box. For observations of point source targets with good signal-to-noise, there will normally be little reason to change these default values. However, for very faint spectra, the signal-to-noise may sometimes be improved by using a smaller extraction box that includes less background noise. Similarly, when working with first order spectral images of a crowded field, a smaller than default extraction box might be needed to minimize contamination from adjacent sources. If the total flux of an extended object is being measured, it may be desirable to use an extrsize larger than the default, in order to include flux from the entire object.

There are some issues the user should take into account when using a non-standard extraction box height:

  1. STIS CCD spectral modes are significantly undersampled in the spatial direction. This, combined with the tilt of the spectral trace on the detector, can lead to substantial undulations of the extracted spectrum if a single row is extracted. The undulations are out of phase from one row to the next, so they become increasingly insignificant as more rows are combined in the extraction. (See Section 5.4.3.)
  2. Throughputs for each mode are based on measurements made using the standard extraction box height and a standard aperture (52X2 for first order modes and 0.2X0.2 for echelle modes) for each mode. Fluxes derived for different apertures or extraction box heights also depend on the accuracy of the PCTAB and aperture throughput corrections.
  3. When converting from counts/s in the extraction box to total point source flux, a correction for the fraction of encircled energy in the extraction box at each wavelength is applied using information tabulated in the PCTAB. This correction depends on the grating, the aperture, and the central wavelength. However, for each grating, there are values in the PCTAB for only selected extraction box sizes. If the selected extrsize does not match one of these pre-tabulated values, the correction that most nearly matches the specified extrsize is applied; no attempt is made to interpolate the PCTAB correction to an intermediate extraction box size. It is therefore best to only choose extraction box sizes that match the pre-tabulated correction vectors in the PCTAB. For all gratings, PCTAB vectors are supplied for the eight extraction height values of 3, 5, 7, 9, 11, 15, 21, and 600 pixels, where the 600 pixel high extraction box is intended to represent the estimated “infinite aperture” encircled energy. For most gratings, an additional ninth PCTAB row is included, the height of which depends on the grating (see Table 5.3 and STIS ISR 98-01). These second-largest extraction boxes contain 99% of the signal and in some instances provide a more robust estimate of the total signal. Note that these encircled energy fractions were, for the most part, measured using only low dispersion spectra, which were then used to estimate the corrections for other modes.

    Table 5.3: 9th PCTAB Row Extraction Box Size

    Grating

    Extraction Box Height

    FUV MAMA modes

    140

    NUV MAMA modes 
    (except PRISM)

    110

    G230LB/MB

    80

    G430L/M

    64

    G750L/M

    200

  4. The spectral purity declines significantly at increasing distances from the central trace of the spectrum. Using a larger than standard extraction box may therefore significantly degrade the spectral resolution.
  5. The observed flux is corrected to the total flux assuming that all of the observed flux originates in a point source. If the aperture width and/or the extraction box height are large enough to include a significant amount of extended flux after background subtraction, the correction will be inaccurate.

Adjusting the Background Subtraction

For first order spectra, two background regions offset from the extraction region are used to determine the background. A number of the parameters defining the way the background is derived can be adjusted by changing parameters of the x1d task. Note that for echelle data, when the background subtraction is done using the sd2c scattered light correction algorithm, most of these parameters do not affect the final background (see Section 3.4.20).

Changing the Background Smoothing Algorithm

The algorithm used for smoothing the background can be changed by adjusting the bksmode and bksorder parameters. The default value for the bksorder parameter is 3. Allowed values for bksmode are median (the default), average, and off. If set to off, no smoothing takes place. If set to either median or average, then background smoothing will be done. The detailed behavior depends on the mode for which the background smoothing is being done:

  • For CCD data, the raw background, after 1-D extraction, is smoothed by a running window low-pass filter (i.e. boxcar smoothing) with a 9 pixel window, then fitted by an nth degree polynomial. The value of the bksmode parameter defines the boxcar smoothing function to be either a median or average within the running window. The polynomial degree is defined by the bksorder task parameter. The default value for the bksorder parameter is 3.
  • For MAMA first order data, the 1-D extracted background is fitted by an nth degree polynomial, and the background is replaced by the fitted curve. The data is not filtered before the polynomial fit, and so for MAMA first order data, the result is the same whether bksmode is set to median or average. The polynomial degree is defined by the bksorder task parameter. For the G140L/M first order gratings, the regions around Lyman α and the 1300 Angstrom O I line are not smoothed.
  • For MAMA echelle when SC2DCORR is set to OMIT, the 1-D extracted background is smoothed twice by a running average with a 31 pixel window. This is equivalent to smoothing with a triangular filter.

Changing the Default Background Regions

The details of how the background regions are specified are described in Section 3.4.2. Two background regions are offset by amounts that are tabulated in the 1-D extraction table. This file also specifies the angle by which regions of constant wavelength are shifted as a function of offset. These values can be changed by adjusting parameters of the x1d task. The bk1size and bk2size parameters give the width of each background region, bk1offst and bk2offst give the offset from the center of the extraction region, and the bktilt parameter specifies the angle by which the background region at a given wavelength is offset. Leaving any of these parameters at the default INDEF value will cause the task to use the appropriate defaults from the 1-D extraction table. Note that users should rarely if ever have a reason to change the bktilt parameter.

Changing the Interpolation of the Background Values

The default of backord=0 which is implemented in the XTRACTAB for all modes, averages the values measured in the two background regions at each wavelength. In some cases, (e.g., other objects or detector artifacts contaminate one or both background regions), it might be desirable to chose asymmetrical offsets for the two background regions (bk1offst  bk2offst). Setting backord=1 would then interpolate between those two regions rather than taking a simple average. However, if both background regions were offset in the same direction, (this is the default for the E1 and E2 aperture positions where bk1offst=-300 and bk2offstT=-320), setting backord=1 would extrapolate the measured values. In such a case it may usually be better to leave backord=0, and just average the two regions.

TDS, CTE, and BZS Corrections

The FLUXCORR step, which converts net counts to flux units, now includes three additional corrections not envisaged in the original version of x1d. These are the corrections for time and temperature dependent sensitivity (TDS) changes applied to all spectroscopic modes, the correction for charge transfer efficiency (CTE) losses applied to CCD spectroscopic observations, and the blaze shift (BZS) correction applied to echelle spectroscopic data.

The TDS correction is calculated as a function of wavelength using coefficients taken from the TDSTAB. There is no flag for turning off the TDS correction, although if the TDSTAB keyword in the dataset header is set to a null value, the TDS correction will be skipped.

For the CTE correction, there is a parameter of the x1d task, ctecorr, which can be used to turn the CTE correction on or off. Note that the empirical CTE correction is not valid for data taken using sub-arrays, and so the correction is turned off by default for data processed with the full calstis pipeline, although it will be turned on by default when running x1d as a stand-alone task.

The CTE algorithm used for correcting STIS CCD spectra is described in STIS ISR 2006-01, STIS ISR 2006-03, as well as in the paper Goudfrooij et al. (2006, PASP, 118,1455). The BZS correction is calculated as a function of grating, order, and side of STIS operations (Side-1 or Side-2). It is based on the values of SHIFTA1 and SHIFTA2, as well as the day of observation as expressed in Modified Julian Days (MJD). Coefficients taken from the PHOTTAB are used for this purpose. There is no flag for turning off the BZS correction. However, x1d can be run with the value of the blazesh parameter equal to zero in order to skip this correction. In addition, the BZS correction can be customized (and the default BZS correction value applied can be overwritten) by setting blazesh to a certain number of pixels when running x1d as a stand-alone task. Only one value of blazesh can be specified at one time. In this case, all extracted orders will have the same BZS correction applied. We recommend the extraction of a single order at a time if a different value of blazesh is desired for every echelle spectral order.

5.5.3 Splicing Extracted Spectra

Users wishing to combine multiple spectra together to create a single one-dimensional spectrum (e.g., combining first-order spectra from observations using different gratings of combining echelle orders from a single observation) may do so with the IRAF task splice. Such task currently does not exist in Python, but work is being done to create a task similar to IRAF splice.

The task splice can be applied to both first order and echelle data. It takes into account the error (ERR) array as well as the data quality (DQ) array. Handling of the DQ array is important as it helps splice perform the combination properly and avoid bad or noisy data in the output file arising from the large changes in throughput at the edges of the detector. In order to use splice correctly, one has to manually mark these edges in the DQ array as “bad pixels” and tell splice to ignore these by setting the sdqflags parameter to the appropriate value (refer to Table 2.9for the data quality values). A flag value of 4 or 8 will work fine. The reason this does not work by default in splice at the moment is that the vignetted regions of the MAMA detectors change over time, and new bad pixel maps showing these regions as a function of time have not been produced. So at the moment, users need to modify the DQ arrays themselves. A simple way to do this is to edit the DQ array in the _flt image, flagging the edges as bad pixels, then re-extracting the 1-D spectrum using the x1d task in the stis package. For example, to set a 10-pixel border to a value of 4 (“bad detector pixel”) in the 1024 × 1024 image:

>>> import numpy as np
>>> from astropy.io import fits

>>> hdu = fits.open('obmj01050_crj.fits') # Pick your file
>>> dqdata = hdu['DQ',1].data # Pick your header layer

# Set a 10-pixel border to a value of 4
>>> dqdata[:10,:] = dqdata[-10:,:] = dqdata[:,:10] = dqdata[:,-10:] = 4

# To save these changes in a new fits file
>>> hdu.writeto('temp_nsp.fits',overwrite=True)
>>> hdu.close
cl> imcalc o4qx04010_flt.fits[dq,1] o4qx04010_flt.fits[dq,1] \
"if x <= 10 || x >= (1024-10) || y <= 10 || y >= (1024-10) then 4 else im1"

Users are encouraged to visually compare the 2-D science spectrum and the DQ arrays to determine the optimal size of the order for other data.Once you have followed the steps above, you can rerun x1d on the _flt file and get a new _x1d spectrum, which will have the updated DQ values. Then you can run splice, specifying that you want to ignore those vignetted regions that you just marked with DQ values when you do the splicing together of the spectrum, i.e., by making sure the sdqflags parameter in the splice task is set to an appropriate value.

cl> splice obs1_x1d.fits,obs2_x1d.fits output_splice.fits sdqflags=4

Splice will then only use data that has not been marked by the specified DQ flags when splicing together the spectrum. Please refer to the splice task help file for more useful information.