Data Set Technical Details for the S5 and S6 datasets, 2005-2010

(For O1, 2015-2016, see here)

This page contains technical specifications of the released data, covering several different kinds of information. In many cases, the information on this page is not needed to use the released data sets for scientific investigations.

Much of the information on this page is primarily of interest to LIGO Scientific Collaboration members.

Many of the links here are to internal wikis and web sites that are password-protected; these are indicated by a black diamond (◆).

If you are interested in password-protected information, please contact the GWOSC team. In many cases, we will be able to provide the requested documentation.

GWOSC data downsampling and repackaging

GWOSC builds files from standard LIGO h(t) frames using this code for S5/S6. We have chosen to repackage our data to make it more accessible to casual users both within the LVC and outside.

  • We start with the frame files (e.g., for S6, from the files in /archive/frames/S6/LDAShoftC02 on the CIT cluster). However, the frame format is unfamiliar to people outside the GW community, a "lightweight" frame reader is not readily available, and we do not want to have to support one. We therefore convert to HDF5, which eliminates the need for a frame reader. HDF5 is a popular format (easily readable in Python, MATLAB, Mathematica, C, ...) and will be readable for many years. We also release frame files (repackaged as described below), in case the user already has frame-reading software.
  • We re-sample the strain data from 16384 Hz to 4096 Hz. Almost all LVC searches already do this in pre-processing, because of the increased shot noise and the dearth of astrophysical source targets at higher frequencies. The data quality is less well studied above 2 kHz, and the strain calibration is valid only up to 5 kHz. This resampling reduces the size of our data by a factor of 4 (to 4 TB for S5, all three detectors), making downloading easier and easing disk space needs for our users.
  • Advanced LIGO data are not calibrated or valid below 10 Hz or above 5 kHz, and the data sampled at 4096 Hz are not valid above 2 kHz. In most searches for astrophysical sources, data below 20 Hz are not used because the noise is too high.
  • We use a Python wrapping of the LAL routine used by the CBC group, ResampleREAL8TimeSeries, which applies an acausal downsampling filter. This resampling has been carefully studied and reviewed by the GWOSC review team. The filter "leaks" into the 60 ms before and after each data block. Assuming the data block in question has valid (passing CAT1 veto) data, this leakage has only a tiny effect on the amplitude of the strain data for those short periods.
  • Our HDF5/frame files have a fixed duration (4096 s) and fixed boundaries. This effectively eliminates the need for users to employ gw-data-find to "find" the data. Tutorial 4 presents a user API to get the data and load it into Python, giving users access to a list of data segments. This approach is now also adopted for aLIGO frames.
  • We have Timelines and My Sources to aid the user in finding data (including DQ and HWinj info) from a particular time, effectively eliminating the need for segDB queries. From Timeline, you can see multiple DQ and Injection flags, zoom in, and download segments.
  • The DQ and HW Injections are summarized in 1 Hz DQ vectors, in both the HDF5 and frame files. This approach is now also adopted for aLIGO frames.
  • This repackaged data is also on our LDG clusters (in /archive/losc) for LVC use.
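The fixed file duration, the 4096 Hz sample rate, and the 1 Hz DQ vectors described above lend themselves to simple arithmetic. The following is a minimal sketch, not the GWOSC implementation: the function names are hypothetical, the code assumes that file boundaries fall on integer multiples of 4096 GPS seconds, and it assumes that each entry of a DQ vector is an integer bitmask. The bit assignments are not given on this page, so any bit index passed in is purely illustrative.

```python
FILE_DURATION = 4096    # seconds per file (from the text above)
SAMPLE_RATE = 4096      # Hz, after downsampling
ORIGINAL_RATE = 16384   # Hz, in the original h(t) frames


def file_start_gps(gps_time, duration=FILE_DURATION):
    """Return the start GPS second of the file that contains gps_time.

    Assumption: file boundaries are integer multiples of `duration`.
    """
    whole_second = int(gps_time)
    return whole_second - whole_second % duration


def sample_index(gps_time, duration=FILE_DURATION, rate=SAMPLE_RATE):
    """Index, within its file, of the strain sample at or just before gps_time."""
    offset = gps_time - file_start_gps(gps_time, duration)
    return int(offset * rate)


def segments_from_dq(dq_vector, bit, start_gps):
    """Convert a 1 Hz DQ bitmask vector into [start, stop) GPS segments.

    `bit` is the index of the flag to test. Bit assignments are not stated
    in the text, so the caller's choice of `bit` is hypothetical.
    """
    segments = []
    seg_start = None
    for i, word in enumerate(dq_vector):
        passing = (word >> bit) & 1
        if passing and seg_start is None:
            seg_start = start_gps + i            # a segment opens here
        elif not passing and seg_start is not None:
            segments.append((seg_start, start_gps + i))  # segment closes
            seg_start = None
    if seg_start is not None:                    # segment runs to end of vector
        segments.append((seg_start, start_gps + len(dq_vector)))
    return segments
```

Under these assumptions, each file holds FILE_DURATION * SAMPLE_RATE strain samples and FILE_DURATION DQ entries, and ORIGINAL_RATE // SAMPLE_RATE gives the factor of 4 reduction in data volume quoted above.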

Notes about the DATA flag

See the Defining the DATA Flag page.

S6 DQ flags

Note that during S6, the CBC group internally used modified DQ category definitions that were not consistent with the definitions shown on these GWOSC pages. In particular, the CBC group internally and temporarily redefined "passing CAT3 checks" to mean "passing CAT2 checks and not a HW injection"; the old "CAT3" was called "CAT4", and the old "CAT4" was called "CAT5".

However, S6 publications used the unmodified definitions, consistent with other LSC publications and consistent with what was used for S5 DQ flag definitions. The categories as applied on the GWOSC site and in GWOSC data files (described in the S5 and S6 data release pages) reflect the system described in S5 and in publications, not the above-described modified DQ category definitions.
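For readers comparing internal CBC documents from S6 with the GWOSC pages, the renaming described above can be written as a small lookup table. This is only an illustrative sketch: the dictionary and function names are hypothetical, and the fallback for names not mentioned above (for example CAT1 and CAT2) assumes they were left unchanged, which the text implies but does not state.

```python
# Temporary CBC-internal S6 category name -> meaning in the standard
# definitions used on the GWOSC pages and in publications.
CBC_INTERNAL_TO_STANDARD = {
    "CAT3": "CAT2 and not a HW injection",
    "CAT4": "CAT3",
    "CAT5": "CAT4",
}


def to_standard_category(cbc_internal_name):
    """Translate a CBC-internal S6 category name to the standard definition.

    Assumption: names not listed in the table were not redefined.
    """
    return CBC_INTERNAL_TO_STANDARD.get(cbc_internal_name, cbc_internal_name)
```

For example, a time described internally as "passing CAT4 checks" corresponds to "passing CAT3 checks" in the definitions used in GWOSC data files.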

Notes about Instrumental Lines

Below we provide links (mostly internal) that document the provenance of the information summarized in the GWOSC S5 Spectral Lines page and S6 Spectral Lines page.

Hardware Injections

Below we provide links that document the provenance of the information on Hardware Injections summarized in the GWOSC S5 and S6 data release pages.

S5 Hardware Injections

  • H1 CBC Hardware Injections: HTML CSV
  • H2 CBC Hardware Injections: HTML CSV
  • L1 CBC Hardware Injections: HTML CSV
  • Plots of injections with a "successful" log message:   H1 H2 L1
  • Plots of injections without a "successful" log message: H1 H2 L1

Automatic Log Files (ASCII Text)

Annotated Log Files

S5 Burst Plots

Injections are sorted as "successful" or "failed" based on automatically generated log messages. Cases where the recovered SNR is not as expected are marked in the "Flag" and "Note" columns of the annotated log files.

S6 Hardware Injections

S6 Burst Injections

The ASCII lists of injections linked from the S6 Burst HW injection page are derived from the more detailed lists shown here:

Plots of recovered SNR may be seen at these links:

  • Successful Injections: H1 L1
  • Failed Injections: H1 L1

The S6 Burst hardware injections were mostly "coherent", meaning they simulated a signal from a particular sky position. The detailed parameters of the injections may be found in the S6 Burst Injection Parameter File. (◆)

A small number of differences exist between the Burst injection lists provided by Timeline and the injection log files. These are detailed on this Wiki page used for the review of this data set.

S6 CBC Injections

S5 Data Set Review