Converting SMA Data Format
Here we are primarily concerned with converting raw SMA data to a CASA-compatible format: either UVFITS or directly to a measurement set.
Method | Details | Public Availability | Status |
---|---|---|---|
pyuvdata | Python interface to convert between interferometric data formats | Conda | Working + ongoing development |
mir2ms | MIR/IDL command to convert raw SMA data to CASA MS | MIR Github | Working |
autofits | MIR/IDL command to convert calibrated SMA data to UVFITS | MIR Github | Working |
pyuvdata is a Python interface to interferometric datasets. It converts datasets from one format to another and supports multiple data formats. Here we give the example of converting SMA data from raw MIR/IDL format to UVFITS and to a CASA measurement set.
- Retrieval
The package can be installed using pip or conda. If you are working on the RTDC it is already installed.
$ conda install -c conda-forge pyuvdata
- Instructions
-
Follow the template below to create an executable script. The read_mir command allows you to extract a subset of the original data; you can find examples below.
#! /opt/anaconda3/envs/pyuvdata/bin/python
import os
from pyuvdata import UVData
# Get path to current working directory
cwd = os.getcwd()
UV = UVData()
####### SELECT WHICH DATA YOU WANT TO EXTRACT #######
# Keep only one of the read_mir calls below.
# Read in all sources, receivers, sidebands and chunks.
UV.read_mir("/sma/data/flux/mir_data/210808_13:37:36")
# Read in only the LSB 230GHz receiver data. All sources. All chunks.
UV.read_mir("/sma/data/flux/mir_data/210808_13:37:36", receivers='230', sidebands='l')
# Read in the Mars 345GHz LSB chunk 2 data.
# On the RTDC you can quickly list the sources using whatishere
UV.read_mir("/sma/data/flux/mir_data/210808_13:37:36", catalog_names=['Mars'], receivers='345', sidebands='l', corrchunk=[2])
####### SELECT THE OUTPUT FORMAT #######
# Write out to measurement set
UV.write_ms(cwd+"/210808_133736.ms")
# Write out to uvfits
UV.write_uvfits(cwd+"/210808_133736.uvfits", spoof_nonessential=True)
The script above writes the output file/directory to your current working directory. Omit the cwd elements if you want to define an explicit path for the output.
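The output names used in the template are the MIR directory name with the colons removed. As a minimal sketch, assuming you want to derive the name automatically rather than type it, the helper below builds the output path from the input directory. The function name `output_path` and its arguments are hypothetical and are not part of pyuvdata.

```python
import os


def output_path(mir_dir, extension, out_dir=None):
    """Build an output path from a MIR directory name (hypothetical helper).

    "210808_13:37:36" becomes "210808_133736.<extension>" inside out_dir,
    which defaults to the current working directory.
    """
    # normpath drops any trailing slash so basename returns the directory name
    base = os.path.basename(os.path.normpath(mir_dir))
    # Remove the colons from the timestamp part of the name
    stem = base.replace(":", "")
    if out_dir is None:
        out_dir = os.getcwd()
    return os.path.join(out_dir, stem + "." + extension.lstrip("."))


if __name__ == "__main__":
    mir_dir = "/sma/data/flux/mir_data/210808_13:37:36"
    print(output_path(mir_dir, "ms"))
    print(output_path(mir_dir, "uvfits"))
```

The returned strings can be passed to write_ms and write_uvfits in place of the hand-written names.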
The links below will direct you to the pyuvdata documentation describing all available arguments.
read_mir
write_ms
write_uvfits
- Read the output into CASA
UVFITS
CASA: importuvfits(fitsfile='210808_133736.uvfits', vis='210808_133736.vis')
CASA: ms.open('210808_133736.vis')
Measurement set
CASA: vis='210808_133736.ms'
CASA: ms.open(vis)
mir2ms is a MIR task that converts a raw, uncalibrated SMA dataset to a CASA measurement set. It uses the autofits MIR task to write the data out as UVFITS files, then launches CASA and runs an SMA package to import and concatenate the UVFITS files into a single measurement set.
We encourage users to report any issues to Holly Thomas (holly.thomas@cfa.harvard.edu).
- Retrieval
The script comes packaged with the June 2021 version of MIR. This is available on the RTDC, or you can find it on the MIR GitHub page, sma-mir.
- Status [Oct 2022]
Header information on the gunnLO is currently incompatible with mir2ms; there is a workaround described below.
See the instructions below for how to specify a non-default CASA installation.
- Instructions
-
Create a .pro script in your current working directory; in this example it has been named mymir.pro (remember that the filename must match the name given to the 'pro' in the first line of the file). There are two required tasks in this file - applying the Tsys correction and fixing the problem data header. You can optionally provide instructions to perform some basic data cleaning (e.g. flagging pointing scans & spikes). The mymir.pro script should follow this template.
As part of the routine, mir2ms uses the autofits MIR command (see 'Using MIR autofits' below) to generate UVFITS files per source, sideband, and chunk. Unlike autofits, mir2ms opens your default, or provided, version of CASA, then imports and concatenates the UVFITS files automatically.
WARNING: This routine creates over 200 temporary files and directories in your cwd, which will require ~5x the disk space of the input data directory.
-
Run mir2ms in MIR.
IDL> mir2ms, casa='/opt/casa-release-5.8.0-109.el6/bin/casa', dir='210808_13:37:36', rx=230, /mymir, outname='210808_133736'
If a receiver selection is not provided, mir2ms produces two output measurement sets, one for each receiver.
Depending on the size of the data this may take multiple hours (e.g. ~1 hour per 10 GB). When the script has finished, exit IDL and you will find the temporary files have been deleted. You will be left with an outname_rx.ms directory in your cwd along with IDL and CASA log files.
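Because mir2ms needs roughly 5x the input size in temporary disk space and takes about an hour per 10 GB, it can be worth checking both before starting. The following is a minimal sketch using only the Python standard library. The function names are hypothetical, and the factor of 5 and the rate of 1 hour per 10 GB are simply the rough figures quoted above, not measured values.

```python
import os
import shutil


def directory_size(path):
    """Return the total size in bytes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def enough_space(data_dir, work_dir=".", factor=5.0):
    """Compare free space in work_dir with factor x the size of data_dir.

    Returns (ok, bytes_needed, bytes_free).
    """
    needed = factor * directory_size(data_dir)
    free = shutil.disk_usage(work_dir).free
    return free >= needed, needed, free


def estimated_hours(data_dir, hours_per_10gb=1.0):
    """Rough run time, assuming about hours_per_10gb hours per 10 GB of input."""
    return directory_size(data_dir) / (10 * 1e9) * hours_per_10gb


if __name__ == "__main__":
    data_dir = "210808_13:37:36"  # the dir= value passed to mir2ms
    ok, needed, free = enough_space(data_dir)
    print("need %.1f GB, free %.1f GB, ok=%s" % (needed / 1e9, free / 1e9, ok))
    print("estimated run time: %.1f hours" % estimated_hours(data_dir))
```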
This option converts data already calibrated in MIR to CASA measurement set format. The data are written out in UVFITS format from MIR, then the provided script (see link below) must be used to import it to CASA.
- Retrieval
You can find the CASA import script at MIRFITStoCASA_casa5.py OR MIRFITStoCASA_casa6.py. This script is used in place of CASA's importuvfits procedure, which does not propagate the weights correctly from MIR.
- Status [July 2022]
A bug in fits_out (found inside the MIR autofits routine) has been fixed. The bug caused each chunk and sideband to have slightly different uv coordinates (meters) in CASA. As a result, users saw a narrower u-v coverage per baseline, and u-v coordinates were wrong by a few percent.
- Instructions
-
Create the UVFITS file in MIR by using the autofits routine. This will loop over all chunks, sidebands and receivers on a source-by-source basis. In this example our source is named orion.
IDL> select,/p,/re
IDL> autofits, source='orion'
After you provide your source name, autofits creates a separate file for each SWARM chunk (s1-s4), sideband, and receiver. The files are named with the convention SOURCE_SB_CHUNK_RX.UVFITS (e.g. ORION_L_S1_RX345.UVFITS, ORION_U_S4_RX230.UVFITS).
For a typical SWARM data set you will get 16 spectral files, along with 4 extra files for the pseudo-continuum chunks, C1, which can be ignored.
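The naming convention above can be reproduced in a few lines of Python, which is handy for checking that autofits wrote everything you expect. This is a minimal sketch: the function name `autofits_names` is hypothetical, and the receiver labels are passed in explicitly because they depend on your track.

```python
import os
from itertools import product


def autofits_names(sources, receivers, sidebands=("L", "U"), chunks=(1, 2, 3, 4)):
    """List the spectral UVFITS names following SOURCE_SB_CHUNK_RX.UVFITS.

    The pseudo-continuum C1 files are deliberately left out.
    """
    names = []
    for sou, rx, sb, chunk in product(sources, receivers, sidebands, chunks):
        names.append("%s_%s_S%d_RX%s.UVFITS" % (sou.upper(), sb, chunk, rx))
    return names


if __name__ == "__main__":
    names = autofits_names(["orion"], receivers=["230", "345"])
    print(len(names), "spectral files expected")
    # Report any expected file that is not in the current working directory
    missing = [n for n in names if not os.path.exists(n)]
    print(len(missing), "of them not found in", os.getcwd())
```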
-
Next switch to CASA and use the provided script MIRFITStoCASA_casaX.py to import your data (ensure the version matches the release of CASA you are using). This script loops over all the .UVFITS files and converts each of them to the measurement set (.ms) format.
CASA: import sys
CASA: sys.path.append('/path/to/MIRFITStoCASA_casaX.py/')
CASA: import MIRFITStoCASA
CASA: fullvis='allorion.ms'
CASA: allNames = []
CASA: for sou in ['ORION']:
CASA:     for rx in ['345','240']:
CASA:         for sb in ['L','U']:
CASA:             for i in ['1','2','3','4']:
CASA:                 name = sou+"_"+sb+"_S"+i+"_RX"+rx
CASA:                 print("------converting "+name+" ....")
CASA:                 MIRFITStoCASA.MIRFITStoCASA(UVFITSname=name+'.UVFITS', MSname=name+'.ms')
CASA:                 allNames.append(name+'.ms')
-
Next concatenate all the newly created measurement sets into a single file. The input here is allNames which is the python list of .ms files created in step 2. The name of the concatenated output file (fullvis) was also defined in step 2.
CASA: concat(vis=allNames,concatvis=fullvis,timesort=True)
- Check the content of the final, concatenated measurement set:
CASA: listobs(fullvis)
- Flag the noisy edge channels.
CASA: flagdata(vis=fullvis, mode='manual', spw='*:0~nflagedge;(ntotal-nflagedge)~(ntotal-1)')
where nflagedge should be substituted with the integer number of channels you want to flag from the edges, and ntotal should be substituted with the integer number of channels per chunk (or spw in CASA). ntotal can be found from the previous listobs command.
The choice of how many edge channels to flag can be made by looking at the amplitude behavior as a function of frequency in each chunk, e.g. by using CASA's plotms function. Note that this can be slow if a lot of channels are present and/or a lot of tracks have previously been combined:
CASA: plotms(vis=fullvis, xaxis='channel', yaxis='amp', avgtime='1e20', avgscan=True, iteraxis='spw')
This will show increased noise in the edge channels. We advise a conservative trim of about 8% (so that nflagedge ~ 0.08 ntotal).
- Repeat the loop to create a new measurement set for another source. Alternatively, add a second source name to the for sou loop in step 2.
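For the edge-channel flagging step, the spw selection string can be built from ntotal and the trim fraction instead of being substituted by hand. This is a minimal sketch: the function name `edge_flag_spw` is hypothetical, the string follows the same pattern as the flagdata call above, and the channel count in the usage lines is only an illustrative value. Read the real ntotal from your listobs output.

```python
def edge_flag_spw(ntotal, fraction=0.08):
    """Return an spw string flagging the edge channels of every spw.

    ntotal   : number of channels per chunk (spw), taken from listobs
    fraction : fraction of the chunk to trim at each edge (about 8% advised)
    """
    nflagedge = int(round(fraction * ntotal))
    if nflagedge < 1 or 2 * nflagedge >= ntotal:
        raise ValueError("fraction leaves no usable edge or no unflagged channels")
    # Same pattern as in the text: 0~nflagedge;(ntotal-nflagedge)~(ntotal-1)
    return "*:0~%d;%d~%d" % (nflagedge, ntotal - nflagedge, ntotal - 1)


if __name__ == "__main__":
    ntotal = 4096  # illustrative value only; use the number reported by listobs
    print(edge_flag_spw(ntotal))
```

The resulting string can be passed as the spw argument of the flagdata call.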