Convert SMA to CASA

Converting SMA Data Format

Here we are primarily concerned with converting the raw SMA data format to a CASA-compatible format: either UVFITS or directly to a measurement set.

Method	Details	Public Availability	Status
pyuvdata	Python interface to convert between interferometric data formats	Conda	Working
mir2ms	MIR/IDL command to convert raw SMA data to CASA MS	MIR Github	Working but unsupported
autofits	MIR/IDL command to convert calibrated SMA data to UVFITS	MIR Github	Working but unsupported

◼ pyuvdata [recommended]

Visit the pyuvdata SMA issues page to see current known issues and report any new SMA related ones.

pyuvdata is a Python interface to interferometric datasets. It converts datasets from one format to another and supports multiple data formats. Here we give the example of converting SMA data from raw MIR/IDL format to UVFITS and to a CASA measurement set.

Retrieval
The package can be downloaded using pip or conda. If you are working on the RTDC this is already installed.
$ conda install -c conda-forge pyuvdata
Instructions
1. Follow the template below to create an executable script. The read_mir command allows you to extract a subset of the original data - you can find examples below.
  
  #! /opt/anaconda3/envs/pyuvdata/bin/python
  import os
  from pyuvdata import UVData

  # Get path to current working directory
  cwd = os.getcwd()
  UV = UVData()

  ####### SELECT WHICH DATA YOU WANT TO EXTRACT #######
  # Keep only ONE of the read_mir calls below; the others are left commented out.

  # Read in all sources, receivers, sidebands and chunks.
  UV.read_mir("/sma/data/flux/mir_data/210808_13:37:36")

  # Read in only the LSB 230GHz receiver data. All sources. All chunks.
  # UV.read_mir("/sma/data/flux/mir_data/210808_13:37:36", receivers='230', sidebands='l')

  # Read in the Mars 345GHz LSB chunk 2 data.
  # On the RTDC you can quickly list the sources using whatishere
  # UV.read_mir("/sma/data/flux/mir_data/210808_13:37:36", catalog_names=['Mars'], receivers='345', sidebands='l', corrchunk=[2])

  ####### SELECT THE OUTPUT FORMAT #######
  # Write out to measurement set
  UV.write_ms(cwd+"/210808_133736.ms")

  # Write out to uvfits
  UV.write_uvfits(cwd+"/210808_133736.uvfits", spoof_nonessential=True)
  
  As written, the script writes the output file/directory to your current working directory. Omit the cwd elements if you want to define an explicit path for the output.
  The links below will direct you to the pyuvdata documentation describing all available arguments.
  read_mir
  write_ms
  write_uvfits
2. Read the output into CASA
  UVFITS
  CASA: vis='210808_133736.vis'
  CASA: importuvfits(fitsfile='210808_133736.uvfits', vis=vis)
  CASA: ms.open(vis)
  Measurement set
  CASA: vis='210808_133736.ms'
  CASA: ms.open(vis)
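The template in step 1 can also be wrapped in a small reusable function. The following is a minimal sketch, not part of the official instructions: the helper names `output_name` and `convert` are hypothetical, and it assumes the output name is simply the track directory name with the colons removed (as in the 210808_133736 example above). It uses only the `read_mir`, `write_ms` and `write_uvfits` calls shown in the template.

```python
import os


def output_name(mir_path, ext):
    """Build an output name such as '210808_133736.ms' from '.../210808_13:37:36'."""
    # Assumption: the output name is the track directory name without colons.
    track = os.path.basename(os.path.normpath(mir_path))
    return track.replace(":", "") + "." + ext


def convert(mir_path, fmt="ms", out_dir=None):
    """Read a raw SMA track with pyuvdata and write it out as 'ms' or 'uvfits'."""
    # Imported here so that output_name() can be used without pyuvdata installed.
    from pyuvdata import UVData

    if fmt not in ("ms", "uvfits"):
        raise ValueError("fmt must be 'ms' or 'uvfits'")
    out_dir = os.getcwd() if out_dir is None else out_dir
    out_path = os.path.join(out_dir, output_name(mir_path, fmt))

    uv = UVData()
    # Same call as in the template; add receivers=, sidebands=, etc. as needed.
    uv.read_mir(mir_path)
    if fmt == "ms":
        uv.write_ms(out_path)
    else:
        # The template passes spoof_nonessential=True here; add it back if
        # your installed pyuvdata version expects it.
        uv.write_uvfits(out_path)
    return out_path
```

For example, `convert("/sma/data/flux/mir_data/210808_13:37:36", fmt="uvfits")` would write `210808_133736.uvfits` to the current working directory.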

◼ mir2ms

mir2ms is a MIR task that converts a raw, uncalibrated SMA dataset to a CASA measurement set. It uses the autofits MIR task to write the data out as UVFITS files, then launches CASA and runs an SMA package to import and concatenate the UVFITS files into a single measurement set.

We encourage users to report any issues to Holly Thomas (holly.thomas@cfa.harvard.edu).

Retrieval
The script comes packaged with the June 2021 version of MIR. This is available on the RTDC or you can find it on the MIR github page sma-mir.
Status [Oct 2022]
Header information on the gunnLO is currently incompatible with mir2ms; there is a work-around described below.
The instructions below show how to specify a non-default CASA installation.
Instructions
1. Create a .pro script in your current working directory; in this example it has been named mymir.pro (remember that the filename must match the name given to the 'pro' in the first line of the file). There are two required tasks in this file - applying the Tsys correction and fixing the problem data header. You can optionally provide instructions to perform some basic data cleaning (e.g. flagging pointing scans & spikes). The mymir.pro script should follow this template.
  As part of the routine, mir2ms uses the autofits MIR command (see the autofits section below) to generate UVFITS files per source, sideband, and chunk. Unlike autofits, mir2ms then opens your default, or provided, version of CASA and imports and concatenates the UVFITS files automatically.
  WARNING This routine creates over 200 temporary files and directories in your cwd which will require ~ 5x the disk space of the input data directory.
2. Run mir2ms in MIR.
  IDL> mir2ms, casa='/opt/casa-release-5.8.0-109.el6/bin/casa', dir='210808_13:37:36', rx=230, /mymir, outname='210808_133736'
  If a receiver selection is not provided, mir2ms will produce two output measurement sets, one for each receiver.
  Depending on the size of the data this may take multiple hours (roughly 1 hour per 10 GB). When the script has finished, exit IDL and you will find the temporary files have been deleted. You will be left with an outname_rx.ms directory in your cwd along with IDL and CASA log files.
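Before starting a long mir2ms run it can help to check the disk space and rough run time implied by the figures above. The following is a minimal sketch that is not part of MIR: the function names are hypothetical, and the factor of 5 and the 1 hour per 10 GB rate are simply the approximate values quoted in this section.

```python
import os
import shutil

SPACE_FACTOR = 5.0     # temporary files need ~5x the input size (see WARNING above)
HOURS_PER_GB = 1 / 10  # rough rate quoted above: ~1 hour per 10 GB


def dir_size_bytes(path):
    """Sum the sizes of all regular files below path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def mir2ms_estimate(data_dir, work_dir="."):
    """Estimate scratch space and run time for converting data_dir in work_dir."""
    size = dir_size_bytes(data_dir)
    needed = SPACE_FACTOR * size
    free = shutil.disk_usage(work_dir).free
    return {
        "input_gb": size / 1e9,
        "needed_gb": needed / 1e9,
        "free_gb": free / 1e9,
        "enough_space": free >= needed,
        "hours": (size / 1e9) * HOURS_PER_GB,
    }
```

Calling `mir2ms_estimate('210808_13:37:36')` from the directory where you plan to run mir2ms returns the estimates as a dictionary; treat the numbers as order-of-magnitude guidance only.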

◼ autofits

This option converts data already calibrated in MIR to CASA measurement set format. The data are written out in UVFITS format from MIR, then the provided script (see link below) must be used to import it to CASA.

Retrieval
You can find the CASA import script at MIRFITStoCASA_casa5 OR MIRFITStoCASA_casa6. This script is used in place of CASA's importuvfits procedure which does not propagate the weights correctly from MIR.
Status [July 2022]
A bug in fits_out (called inside the MIR autofits routine) has been fixed. The bug gave each chunk and sideband slightly different u-v coordinates (in meters) in CASA, so users saw narrower u-v coverage per baseline and u-v coordinates that were wrong by a few percent.
Instructions
1. Create the UVFITS file in MIR by using the autofits routine. This will loop over all chunks, sidebands and receivers on a source-by-source basis. In this example our source is named orion.
  IDL> select,/p,/re
  IDL> autofits, source='orion'
  After you provide your source name, autofits creates a separate file for each SWARM chunk (s1-s4), sideband, and receiver. The files are named with the convention SOURCE_SB_CHUNK_RX.UVFITS (e.g. ORION_L_S1_RX345.UVFITS, ORION_U_S4_RX230.UVFITS).
  For a typical SWARM data set you will get 16 spectral files, along with 4 extra files for the pseudo-continuum chunks, C1, which can be ignored.
2. Next switch to CASA and use the provided script MIRFITStoCASA_casaX.py to import your data (ensure the version matches the release of CASA you are using). This script loops over all the .UVFITS files and converts each of them to the measurement set (.ms) format.
```
CASA: import sys
CASA: sys.path.append('/path/to/')   # the directory that contains MIRFITStoCASA_casaX.py
CASA: import MIRFITStoCASA_casaX as MIRFITStoCASA
CASA: fullvis='allorion.ms'
CASA: allNames = []
CASA: for sou in ['ORION']:
CASA:    for rx in ['345','240']:
CASA:       for sb in ['L','U']:
CASA:          for i in ['1','2','3','4']:
CASA:             name = sou+"_"+sb+"_S"+i+"_RX"+rx
CASA:             print("------converting "+name+" ....")
CASA:             MIRFITStoCASA.MIRFITStoCASA(UVFITSname=name+'.UVFITS', MSname=name+'.ms')  
CASA:             allNames.append(name+'.ms')
```
3. Next concatenate all the newly created measurement sets into a single file. The input here is allNames which is the python list of .ms files created in step 2. The name of the concatenated output file (fullvis) was also defined in step 2.
  CASA: concat(vis=allNames,concatvis=fullvis,timesort=True)
4. Check the content of the final, concatenated measurement set:
  CASA: listobs(fullvis)
5. Flag the noisy edge channels.
  CASA: flagdata(vis=fullvis, mode='manual', spw='*:0~nflagedge;(ntotal-nflagedge)~(ntotal-1)')
  where nflagedge should be substituted with the integer number of channels you want to flag from the edges, and ntotal should be substituted with the integer number of channels per chunk (or spw in CASA). ntotal can be found from the previous listobs command.
  The choice of how many edge channels to flag can be made by looking at the amplitude behavior as a function of frequency in each chunk, e.g. by using CASA's plotms function. Note that this can be slow if a lot of channels are present and/or a lot of tracks have previously been combined:
  CASA: plotms(vis=fullvis, xaxis='channel', yaxis='amp', avgtime='1e20', avgscan=True, iteraxis='spw')
  This will show increased noise in the edge channels. We advise a conservative trim of about 8% at each edge (so that nflagedge ≈ 0.08 × ntotal).
6. Repeat the loop to create a new measurement set for another source. Alternatively, add a second source name to the for sou loop in step 2.
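Two small pieces of bookkeeping in the steps above can be scripted: listing the UVFITS files that autofits is expected to produce, and building the spw string for the edge-channel flagging. The following is a minimal sketch with hypothetical helper names. It assumes the SOURCE_SB_CHUNK_RX.UVFITS naming convention from step 1 (with the source name in upper case, as in the examples), and it reproduces the spw expression from step 5 with nflagedge taken as 8% of ntotal.

```python
def expected_uvfits(source, receivers, sidebands=("L", "U"), chunks=(1, 2, 3, 4)):
    """List the spectral UVFITS file names expected for one source."""
    # Assumption: autofits upper-cases the source name, as in the examples above.
    return [
        f"{source.upper()}_{sb}_S{chunk}_RX{rx}.UVFITS"
        for rx in receivers
        for sb in sidebands
        for chunk in chunks
    ]


def edge_flag_spw(ntotal, fraction=0.08):
    """Build the spw selection string used in step 5 for flagging edge channels."""
    nflagedge = round(fraction * ntotal)
    # Same expression as in step 5: 0~nflagedge and (ntotal-nflagedge)~(ntotal-1).
    return f"*:0~{nflagedge};{ntotal - nflagedge}~{ntotal - 1}"
```

For instance, `expected_uvfits('orion', ['345', '230'])` gives the 16 spectral file names to check for before running the import loop, and the string returned by `edge_flag_spw(ntotal)` can be passed as the spw argument of flagdata once ntotal has been read from the listobs output.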