SFXC User's Manual

Mark Kettenis (JIVE)

Chapter 1. Running the SFXC correlator

SFXC is an MPI application. This means that running it is somewhat dependent on the MPI implementation installed on your system. The instructions here are for OpenMPI, which currently seems to be the most popular Open Source MPI implementation for Linux systems.

$ mpirun --machinefile machinefile --rankfile rankfile --np np sfxc controlfile vexfile

where controlfile is the name of the correlator control file that describes the correlation parameters, vexfile is the name of the VEX file that describes the experiment, and np is the number of MPI processes to start, as laid out by the machine file machinefile and the rank file rankfile.

When creating the rank file, there are a few things that need to be taken into account. The process with rank 0 becomes the manager process. Since the manager process doesn't really do a lot, there is no point in assigning more than a single slot to it. The process with rank 1 becomes the log process. As with the manager process, there is no point in assigning more than a single slot. The process with rank 2 becomes the output process. This process will be able to take advantage of multiple cores, so assigning two slots is a good idea if you expect a significant output data rate. At JIVE we usually run all these processes on the cluster head node.

The processes starting at rank 3 become input processes. There will be one input process for each station in the correlation. When correlating directly from Mark5 disk packs, these processes will need to run on the Mark5s containing the disk packs for those stations. When correlating from files, these processes will need to run on machines that have access to the data files for these stations. The process with rank 3 will be assigned to the station that comes first when the stations are ordered alphabetically by station code. The process with rank 4 will be assigned to the station that comes second, etc. The input processes do the unpacking and corner turning of the input data, which can be CPU intensive, so assigning multiple slots is a good idea. By default the unpacking happens in two separate threads, so using two or three slots makes sense.

The remainder of the processes will be assigned to correlation processes. A single slot is sufficient for these processes.

Example 1.1. Rank file example

rank 0=head slot=0
rank 1=head slot=1
rank 2=head slot=2,3
rank 3=sfxc-d2 slot=0,1
rank 4=sfxc-d2 slot=2,3
rank 5=sfxc-d3 slot=0,1
rank 6=sfxc-a0 slot=0
rank 7=sfxc-a1 slot=0
...
rank 36=sfxc-a2 slot=7
rank 37=sfxc-a3 slot=7
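
The layout rules above can be turned into a small generator. The following Python sketch is illustrative only and not part of SFXC: the function name make_rankfile and its parameters are hypothetical, and it assumes that the input hosts and the correlation hosts are different machines, so their slot numbers never collide.

```python
def make_rankfile(head, input_hosts, corr_hosts, corr_slots_per_host):
    """Return rank file text following the layout described above.

    head                -- host for the manager, log and output processes
    input_hosts         -- one host per station, in alphabetical station-code order
    corr_hosts          -- hosts that run the correlation processes
    corr_slots_per_host -- number of single-slot correlation processes per host
    """
    lines = [
        f"rank 0={head} slot=0",    # manager process: one slot
        f"rank 1={head} slot=1",    # log process: one slot
        f"rank 2={head} slot=2,3",  # output process: two slots
    ]
    rank = 3
    next_slot = {}  # first free slot on each input host
    for host in input_hosts:
        first = next_slot.get(host, 0)
        # Two slots per input process, matching the two unpacking threads.
        lines.append(f"rank {rank}={host} slot={first},{first + 1}")
        next_slot[host] = first + 2
        rank += 1
    # Correlation processes: one slot each, cycling over the hosts per slot.
    for slot in range(corr_slots_per_host):
        for host in corr_hosts:
            lines.append(f"rank {rank}={host} slot={slot}")
            rank += 1
    return "\n".join(lines) + "\n"
```

Calling make_rankfile("head", ["sfxc-d2", "sfxc-d2", "sfxc-d3"], ["sfxc-a0", "sfxc-a1", "sfxc-a2", "sfxc-a3"], 8) reproduces the layout of Example 1.1. The number of lines in the result is the value to pass as np.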

SFXC will automatically generate delay tables using the CALC10 code that's included in the distribution. CALC10 needs some additional input files to do its work. These are the JPL Solar System Ephemeris (DE405_le.jpl), ocean loading information (ocean.dat) and antenna tilt (tilt.dat). It expects to find these in a directory pointed to by the CALC_DIR environment variable. A copy of DE405_le.jpl as well as ocean loading and antenna tilt information for many antennas that co-observe with the European VLBI Network (EVN) can be found in sfxc/lib/calc10/data in the source distribution.
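
Before starting a correlation it can be useful to check that CALC_DIR is set up as described. The following Python sketch is a hypothetical helper, not something shipped with SFXC; it only tests for the three file names mentioned above.

```python
import os

# Input files that CALC10 expects to find in the CALC_DIR directory.
REQUIRED_FILES = ("DE405_le.jpl", "ocean.dat", "tilt.dat")

def missing_calc_files(environ=os.environ):
    """Return the CALC10 input files that cannot be found via CALC_DIR."""
    calc_dir = environ.get("CALC_DIR")
    if calc_dir is None:
        # CALC_DIR is not set at all, so every file counts as missing.
        return list(REQUIRED_FILES)
    return [name for name in REQUIRED_FILES
            if not os.path.isfile(os.path.join(calc_dir, name))]
```

An empty list from missing_calc_files() means all three files were found; otherwise the list names the files to copy from sfxc/lib/calc10/data.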

Chapter 2. The correlator control file

The correlator control file uses the JavaScript Object Notation (JSON) format. It is customary to give these files a .ctrl extension.

* output_file
A string specifying the name of the file to write the correlator output to. It is customary to give this file a .cor extension.

* number_channels
An integer specifying the number of desired spectral channels in the correlator output. Must be a power of two.

* integr_time
A floating-point number specifying the integration time in seconds. Will be rounded to the nearest integral microsecond.

* cross_polarize
A boolean indicating whether cross hands should be calculated or not.

* stations
A list of strings specifying the stations that are to be correlated.

* data_sources
An object that maps each station to a list of strings specifying the data source locations for that station. Each data source location is specified in the form of a Uniform Resource Identifier (URI). To correlate data from plain files, the standard file scheme can be used. Correlating data directly from Mark5 disk packs is achieved by specifying an appropriate mk5: URI. All URIs for a single station must use the same scheme. Specifying multiple URIs for a single station is currently only supported for the file scheme.

* start
A string specifying the start time of the correlation. The time should be specified in VEX (####y###d##h##m##s) format representing UTC. For real-time correlation the string “now” can be used, which will instruct the correlator to use the current wall clock time (in UTC) as the start time.

* stop
A string specifying the end time of the correlation. The time should be specified in VEX (####y###d##h##m##s) format representing UTC.

* exper_name
A string specifying the experiment name. Used for generating and referencing the appropriate delay tables.

* delay_directory
A string specifying the directory in which to store the delay tables.
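
The VEX time format used by start and stop consists of the year, the day of the year, and the hour, minute and second in UTC. A minimal Python sketch for converting to and from this format follows; the helper names to_vex_time and from_vex_time are illustrative and not part of SFXC, and to_vex_time assumes it is given a timezone-aware datetime.

```python
from datetime import datetime, timezone

# ####y###d##h##m##s: year, day of year, hour, minute, second.
VEX_TIME_FORMAT = "%Yy%jd%Hh%Mm%Ss"

def to_vex_time(t):
    """Format a timezone-aware datetime as a VEX time string in UTC."""
    return t.astimezone(timezone.utc).strftime(VEX_TIME_FORMAT)

def from_vex_time(s):
    """Parse a VEX time string into a timezone-aware UTC datetime."""
    return datetime.strptime(s, VEX_TIME_FORMAT).replace(tzinfo=timezone.utc)
```

For example, from_vex_time("2013y148d10h29m26s") corresponds to 28 May 2013, 10:29:26 UTC.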

An example of a control file is given below:

Example 2.1. Control file example

{
    "exper_name": "F13C4", 
    "cross_polarize": true, 
    "number_channels": 256, 
    "integr_time": 1, 
    "output_file": "file:///home/kettenis/test/f13c4/f13c4_no0023.cor", 
    "stations": [
        "Eb", 
        "Fd", 
        "Nl"
    ], 
    "data_sources": {
        "Nl": [
            "file:///scratch/kettenis/f13c4/f13c4_nl_no0023.m5b"
        ], 
        "Eb": [
            "file:///scratch/kettenis/f13c4/f13c4_eb_no0023.m5b"
        ], 
        "Fd": [
            "file:///scratch/kettenis/f13c4/f13c4_fd_no0023.m5b"
        ]
    }, 
    "start": "2013y148d10h29m26s", 
    "stop": "2013y148d10h34m06s", 
    "delay_directory": "file:///home/kettenis/test/f13c4/delays"
}
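
Some of the constraints listed in this chapter can be checked before a job is submitted. The Python sketch below is a hypothetical pre-flight check and not SFXC's own validation; it covers only the power-of-two rule for number_channels and the data_sources rules described above.

```python
import json

def load_control(path):
    """Read a control file and return its contents as a dictionary."""
    with open(path) as f:
        return json.load(f)

def check_control(ctrl):
    """Return a list of problems found in a control-file dictionary."""
    problems = []
    n = ctrl.get("number_channels")
    # A positive integer is a power of two if it has exactly one bit set.
    if not isinstance(n, int) or n < 1 or (n & (n - 1)) != 0:
        problems.append("number_channels must be a power of two")
    sources = ctrl.get("data_sources", {})
    for station in ctrl.get("stations", []):
        uris = sources.get(station, [])
        if not uris:
            problems.append(f"no data source for station {station}")
        # The scheme is the part of the URI before the first colon.
        schemes = {uri.split(":", 1)[0] for uri in uris}
        if len(schemes) > 1:
            problems.append(f"mixed URI schemes for station {station}")
        elif len(uris) > 1 and schemes != {"file"}:
            problems.append(
                f"multiple URIs for station {station} require the file scheme")
    return problems
```

For the control file in Example 2.1, check_control(load_control(path)) should return an empty list.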

Chapter 3. Preparing your VEX file

Some information needs to be provided in the VEX file that is typically not emitted by the scheduling software. It is essential that you have $CLOCK and $EOP blocks. Some of the tools distributed with SFXC also use the $TAPELOG_OBS block. We recommend that in the $EOP block you provide entries at 24-hour intervals and have an additional entry for the day before and the day after the observation. All these blocks need to be properly referenced: from the $GLOBAL block for $EOP, and from the $STATION block for the $CLOCK and $TAPELOG_OBS blocks.
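
The recommendation for the $EOP block amounts to one entry per day, starting one day before the observation and ending one day after it. A short Python sketch of that arithmetic follows; the function name eop_days is hypothetical and the code does not produce the $EOP block itself.

```python
from datetime import date, timedelta

def eop_days(first_day, last_day):
    """Days needing an $EOP entry: the observation plus one day either side."""
    one_day = timedelta(days=1)
    day = first_day - one_day
    days = []
    while day <= last_day + one_day:
        days.append(day)
        day += one_day
    return days
```

For an observation that takes place entirely on 28 May 2013, eop_days(date(2013, 5, 28), date(2013, 5, 28)) lists 27, 28 and 29 May.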

It is important that the description of the data format in the VEX file is correct. SFXC currently supports the Mark4, VLBA, Mark5B and VDIF data formats and includes some heuristics to determine the correct data format from the VEX file. If SFXC crashes, seems to hang or complains it cannot find any valid data, please check that the data format description in your VEX file matches reality.

* Mark4
record_transport_type should be set to Mark5A and electronics_rack_type should be set to Mark4 or VLBA4 in the $DAS block; track_frame_format should be set to Mark4 in the $TRACKS section.

* VLBA
record_transport_type should be set to Mark5A and electronics_rack_type should be set to VLBA in the $DAS block; track_frame_format should be set to VLBA in the $TRACKS section.

* Mark5B
record_transport_type should be set to Mark5B in the $DAS block, and either a $TRACKS section should be present and have its track_frame_format keyword set to Mark5B [1], or a $BITSTREAMS block must be present as proposed for the upcoming VEX 2 standard.

* VDIF
VEX 1.5 does not provide the means to properly specify VDIF as the recording format. Current versions of SFXC recognize the $THREADS block as proposed for the new VEX 2 standard by Walter Brisken from NRAO. [2] However, this proposal has been withdrawn in favour of a new $DATASTREAMS block. The intention is to have SFXC recognize $DATASTREAMS blocks once the final VEX 2 standard arrives. In the meantime a $THREADS block will need to be added, as SCHED doesn't do this.

record_transport_type should be set to Mark5C or VDIF in the $DAS block. If the record_transport_type is set to Mark5C, electronics_rack_type must be WIDAR. A $THREADS block must be present.
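
The rules in the list above can be summarised as a lookup. The Python sketch below restates only those rules so that a VEX file can be checked by hand; it is not the heuristic that SFXC itself uses, and the function name and parameters are hypothetical.

```python
def expected_format(transport, rack=None, frame_format=None,
                    has_bitstreams=False, has_threads=False):
    """Map VEX settings to a data format name per the list above, or None.

    transport      -- record_transport_type from the $DAS block
    rack           -- electronics_rack_type from the $DAS block
    frame_format   -- track_frame_format from $TRACKS, if present
    has_bitstreams -- True if a $BITSTREAMS block is present
    has_threads    -- True if a $THREADS block is present
    """
    if transport == "Mark5A":
        if rack in ("Mark4", "VLBA4") and frame_format == "Mark4":
            return "Mark4"
        if rack == "VLBA" and frame_format == "VLBA":
            return "VLBA"
    elif transport == "Mark5B":
        # The MARK5B spelling written by SCHED is accepted as well.
        if frame_format in ("Mark5B", "MARK5B") or has_bitstreams:
            return "Mark5B"
    elif transport in ("Mark5C", "VDIF") and has_threads:
        # Mark5C is only valid in combination with a WIDAR rack.
        if transport == "VDIF" or rack == "WIDAR":
            return "VDIF"
    return None
```

A result of None means the combination of settings matches none of the four descriptions, which is a hint that the VEX file needs fixing.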

SFXC has been tested extensively with VEX output from (NRAO) SCHED. Your mileage may vary with output from other VLBI scheduling software.

[1] SCHED spells this as MARK5B, which is tolerated by SFXC.

[2] https://safe.nrao.edu/wiki/bin/view/VLBA/Vex2doc#A_61_36THREADS_61_Block

Chapter 4. Post-processing

As with the Mark4 hardware correlator, SFXC output is converted into an AIPS++/CASA MeasurementSet using the j2ms2 program. Create a directory with the name of the experiment as given in the VEX file. Copy the VEX file into this directory and rename it to experiment.vix, where experiment is again the name of the experiment as given in the VEX file. Then run:

$ j2ms2 file …

where file is the name of the correlator output file. This will produce a MeasurementSet named experiment.ms. It is possible to specify multiple correlator output files on the j2ms2 command line. The visibilities in these files are simply concatenated and written out into a single MeasurementSet.

To convert data into FITS-IDI such that it can be read into AIPS, the tConvert program can be used.

$ tConvert experiment.ms experiment.IDI

The resulting FITS-IDI can be read directly into AIPS using FITLD.
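
The conversion steps above can be scripted. The following Python sketch is a hypothetical wrapper and not a tool distributed with SFXC. It uses only the j2ms2 and tConvert invocations shown above, and it assumes that both programs are on the PATH and are meant to be run from inside the experiment directory.

```python
import shutil
import subprocess
from pathlib import Path

def conversion_commands(experiment, cor_files):
    """Return the j2ms2 and tConvert command lines for one experiment."""
    return [
        ["j2ms2", *map(str, cor_files)],
        ["tConvert", f"{experiment}.ms", f"{experiment}.IDI"],
    ]

def convert(experiment, vex_file, cor_files, workdir="."):
    """Set up the experiment directory and run both conversion steps."""
    exp_dir = Path(workdir) / experiment
    exp_dir.mkdir(parents=True, exist_ok=True)
    # The VEX file has to be available as experiment.vix in the directory.
    shutil.copy(vex_file, exp_dir / f"{experiment}.vix")
    # Use absolute paths, because the commands run inside exp_dir.
    cor_paths = [Path(f).resolve() for f in cor_files]
    for command in conversion_commands(experiment, cor_paths):
        # check=True stops the sequence as soon as one step fails.
        subprocess.run(command, cwd=exp_dir, check=True)
```

Keeping the command construction separate from the execution makes it possible to inspect the command lines before anything is run.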

Note that at JIVE we run some additional post-processing tools on the MeasurementSet before converting data into FITS-IDI. The most important things are:

* Amplitude correction for 1-bit data; currently j2ms2 assumes all data is 2-bit.

* Flagging of delay-rate zero events.

* Flagging of data with low weights.

If you correlate and convert your own data, you may have to take care of these things when reducing the data in AIPS.
