Full-Day Autocorrelation Checking

by Josh Dillon and Steven Murray, last updated April 11, 2023

This notebook is designed to assess per-day data quality from just autocorrelations, enabling a quick assessment of whether the day is worth pushing through further analysis. In particular, it is designed to look for times that are particularly contaminated by broadband RFI (e.g. from lightning), picking out the fraction of each day worth analyzing. Its output is an a priori flag YAML file readable by the hera_qm.metrics_io functions read_a_priori_chan_flags(), read_a_priori_int_flags(), and read_a_priori_ant_flags().

Here's a set of links to skip to particular figures:

• Figure 1: Preliminary Array Flag Fraction Summary

• Figure 2: z-Score of DPSS-Filtered, Averaged Good Autocorrelation and Initial Flags

• Figure 3: Proposed A Priori Time Flags Based on Frequency-Averaged and Convolved z-Score Magnitude

Parse input and output files

Parse settings

Classify Antennas and Find RFI Per-File

Figure 1: Preliminary Array Flag Fraction Summary

Per-antenna flagging fraction of data based purely on metrics that only use autocorrelations. This is likely an underestimate of flags, since it ignores low correlation, cross-polarized antennas, and high redcal $\chi^2$, among other factors.

Load and Average Unflagged Autocorrelations

DPSS Filter Average Autocorrelations

Figure 2: z-Score of DPSS-Filtered, Averaged Good Autocorrelation and Initial Flags

This plot shows the z-score of a DPSS-filtered, deeply averaged autocorrelation, where the noise is inferred from the integration time, channel width, and DPSS model. The DPSS filtering was performed using per-file RFI flagging analogous to that used in the file_calibration notebook, which is generally insensitive to broadband RFI.
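As a rough illustration of how such a z-score can be formed, the sketch below assumes the standard radiometer equation for an autocorrelation, where the noise standard deviation is the smooth model divided by the square root of the number of independent samples. The function name and the n_avg parameter (the number of autocorrelations averaged together) are hypothetical and not taken from the notebook.

```python
import math


def autocorr_zscore(data, model, int_time_s, chan_width_hz, n_avg=1):
    """z-score of an averaged autocorrelation relative to a smooth model.

    Assumption: the noise follows the radiometer equation for autocorrelations,
    sigma = |model| / sqrt(int_time_s * chan_width_hz * n_avg).
    """
    sigma = abs(model) / math.sqrt(int_time_s * chan_width_hz * n_avg)
    return (data - model) / sigma


# Illustrative numbers only: 10 s integrations, 1 kHz channels, 4 spectra averaged.
z = autocorr_zscore(data=101.0, model=100.0, int_time_s=10.0,
                    chan_width_hz=1000.0, n_avg=4)
```

Because the noise estimate scales with the model itself, a fixed fractional excess over the model produces a larger z-score the more deeply the data are averaged.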

Find Bad Time Ranges

Figure 3: Proposed A Priori Time Flags Based on Frequency-Averaged and Convolved z-Score Magnitude

This plot shows the average (over frequency) magnitude of z-scores as a function of time. This metric is smoothed to pick out ranges of times where the DPSS residual reveals persistent temporal structure. Flags due to the sun being above the horizon are also shown. The unflagged range of times is required to be contiguous.
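The procedure described above can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the notebook's implementation: the boxcar kernel, the window width, the threshold value, and the rule of keeping the longest unflagged run to enforce contiguity are all illustrative choices, and every function name is hypothetical.

```python
def mean_abs_z(zscores):
    """Average |z| over frequency for each time; zscores[t][f]."""
    return [sum(abs(z) for z in row) / len(row) for row in zscores]


def boxcar_smooth(values, width):
    """Running mean over a centered window, truncated at the edges."""
    half = width // 2
    out = []
    for i in range(len(values)):
        window = values[max(0, i - half):min(len(values), i + half + 1)]
        out.append(sum(window) / len(window))
    return out


def longest_good_run(flags):
    """Return (start, stop) of the longest run of False values, stop exclusive."""
    best = (0, 0)
    start = None
    # A trailing True sentinel closes any run that reaches the end.
    for i, flagged in enumerate(list(flags) + [True]):
        if not flagged and start is None:
            start = i
        elif flagged and start is not None:
            if i - start > best[1] - best[0]:
                best = (start, i)
            start = None
    return best


def propose_time_flags(zscores, width=3, threshold=1.5):
    """Flag every time outside the longest run where the smoothed metric is low."""
    metric = boxcar_smooth(mean_abs_z(zscores), width)
    bad = [m > threshold for m in metric]
    start, stop = longest_good_run(bad)
    return [not (start <= i < stop) for i in range(len(bad))]


# Eight integrations, two channels; integration 2 is a broadband outlier.
example = [[1, -1], [1, 1], [5, -5], [1, 1], [1, -1], [1, 1], [1, 1], [1, -1]]
time_flags = propose_time_flags(example)
```

In this example the smoothing spreads the outlier onto its neighbors, and the isolated good integration at the start is flagged as well, because only one contiguous unflagged range is kept.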

Write a priori flags to a yaml

Channels and antennas that are 100% flagged are also written out as a priori flags.
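A minimal sketch of this step follows. It assumes the YAML keys JD_flags, channel_flags, and ex_ants, which is my understanding of what the hera_qm.metrics_io readers look for and should be checked against the hera_qm documentation. To stay self-contained it formats each value with json.dumps, relying on JSON flow syntax being valid YAML, rather than using a YAML library. All function names here are hypothetical.

```python
import json


def fully_flagged_channels(flags):
    """Indices of channels flagged at every time; flags[t][f] is True when flagged."""
    n_chan = len(flags[0])
    return [f for f in range(n_chan) if all(row[f] for row in flags)]


def format_a_priori_yaml(jd_ranges, bad_channels, bad_ants):
    """Build YAML text with one key per line.

    Assumption: the key names below match what the hera_qm readers expect.
    """
    entries = {
        "JD_flags": jd_ranges,          # list of [start_jd, stop_jd] pairs
        "channel_flags": bad_channels,  # list of channel indices
        "ex_ants": bad_ants,            # list of antenna numbers
    }
    return "\n".join(f"{key}: {json.dumps(val)}" for key, val in entries.items()) + "\n"


# Illustrative inputs only: two times, three channels, one fully flagged antenna.
flags = [[True, False, True], [True, True, True]]
text = format_a_priori_yaml([[2459861.25, 2459861.5]],
                            fully_flagged_channels(flags), [12])
```

The resulting text can be written to disk with an ordinary file write and then checked by reading it back with the hera_qm readers named at the top of this notebook.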

Metadata