
Software Development

Logistics software

Bauke Kramer

Kramer took care of the maintenance of the proposal tool NorthStar. Validation of the data in an uploaded source list was implemented.

Creation of a data-mining program for the proposal data is in progress. This program uses the XML output of the proposal tool for JIVE, which is stored in a local database. It contains the name of the PI, contact author or potential observer for each proposal. Other stored items are the country of origin, affiliation, email address, etc. The program is written in PHP and will offer many selection possibilities.
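As a sketch of the kind of selection such a program enables (rendered here in Python with a hypothetical XML layout, rather than the actual NorthStar export schema or the PHP implementation):

```python
import xml.etree.ElementTree as ET

# Hypothetical proposal XML layout -- the real NorthStar export differs.
SAMPLE = """<proposals>
  <proposal id="EVN-001">
    <pi name="A. Observer" country="NL" affiliation="JIVE" email="a@jive.nl"/>
  </proposal>
  <proposal id="EVN-002">
    <pi name="B. Observer" country="DE" affiliation="MPIfR" email="b@mpifr.de"/>
  </proposal>
</proposals>"""

def pis_by_country(xml_text, country):
    """Return PI names for proposals whose PI is from the given country."""
    root = ET.fromstring(xml_text)
    return [p.find("pi").get("name")
            for p in root.findall("proposal")
            if p.find("pi").get("country") == country]

print(pis_by_country(SAMPLE, "NL"))  # ['A. Observer']
```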

The backup program for the EVN Data Archive was improved. The backup procedure was changed to back up by experiment subdirectory instead of by whole experiment; this avoids re-archiving an entire experiment when only small changes to its data have been made. A local web page was created which keeps track of the backups.
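The per-subdirectory decision can be sketched as follows (a hypothetical illustration of the strategy; the actual backup scripts are not reproduced here):

```python
import os

def changed_subdirs(experiment_dir, last_backup_time):
    """List experiment subdirectories containing files modified since the
    previous backup; only these need to be re-archived.  (A hypothetical
    sketch of the per-subdirectory backup strategy.)"""
    changed = []
    for name in sorted(os.listdir(experiment_dir)):
        sub = os.path.join(experiment_dir, name)
        if not os.path.isdir(sub):
            continue
        for root, _, files in os.walk(sub):
            if any(os.path.getmtime(os.path.join(root, f)) > last_backup_time
                   for f in files):
                changed.append(name)
                break
    return changed
```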

Other programs which needed some maintenance were log2vex (a program to create VEX files), the Mark5 interface program, and updateftp (a program to mirror station log files and GPS files from Bologna and NRAO).

Harro Verkouter

Verkouter spent time investigating and documenting “disaster recovery” for the JIVE offline-software toolchain. This is a precautionary measure, as building a working offline-software toolchain is not a turnkey project, mainly due to the lack of documentation on how to deal with the various external dependencies and the intricacies of building those dependencies from scratch. This work was still ongoing at the time of writing.

ALBUS

ALBUS (Advanced Long Baseline User Software) is a Joint Research Activity within the RadioNet program in the EC 6th framework. In this project JIVE has the overall management responsibility and also carries out a large fraction of the work-packages, including the development and distribution of ParselTongue, which established a reasonably large user base during this period. There were a number of personnel changes during the period, with James Anderson leaving JIVE and the project, while Mike Sipior and Stephen Bourke joined the team.

Ionospheric calibration

The aim of the ALBUS ionospheric calibration project (carried out at JIVE) was to develop improved methods for calibrating the ionosphere for phase-referenced VLBI observations using current data and model products. Two different ionospheric calibration approaches were developed for the ALBUS project.

The first type uses a global ionosphere model to predict the three-dimensional electron density distribution of the ionosphere using a few simple input data. Two such models were implemented using external public software packages, the Parameterized Ionosphere Model (PIM) and the International Reference Ionosphere (IRI) model. Both of these packages are software models which attempt to predict the electron distribution based on past observations and ionospheric physics. They use only a few input data, such as the Solar flux and the strength of the Solar wind, and the date and time of the observation to generate their predictions.

The second approach uses measurements from Global Positioning System (GPS) receivers. By comparing the difference in arrival times of the signals on the two different broadcast frequencies of the GPS satellites (1575.42 MHz and 1227.60 MHz), the path delay of the ionosphere can be measured. Typically, a given GPS receiver can see 5 to 8 GPS satellites at any given time, and various combinations of satellite positions and ionospheric delay measurements can be made to provide calibration information for a specific source direction. Furthermore, by combining measurements from many GPS receivers spread over a surrounding area, a more accurate model of the ionosphere can be formed to make ionospheric calibration predictions. The model used by this ALBUS software is a variation on the Minimum Ionosphere Model.
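The dual-frequency measurement rests on the dispersive nature of the ionosphere: to first order the path delay is 40.3·TEC/f² metres. A small sketch of recovering slant TEC from the pseudorange difference (the frequencies are the GPS values given above; the 5 m example difference is illustrative):

```python
F1 = 1575.42e6   # GPS L1 frequency (Hz)
F2 = 1227.60e6   # GPS L2 frequency (Hz)

def slant_tec(p1_m, p2_m):
    """Slant TEC (electrons/m^2) from the L1/L2 pseudorange difference,
    using the first-order ionospheric delay 40.3 * TEC / f**2 (metres)."""
    return (p2_m - p1_m) * F1**2 * F2**2 / (40.3 * (F1**2 - F2**2))

def iono_delay_m(tec, freq_hz):
    """First-order ionospheric path delay (metres) at an arbitrary
    observing frequency, e.g. an L-band VLBI frequency."""
    return 40.3 * tec / freq_hz**2

tec = slant_tec(0.0, 5.0)          # a 5 m difference: ~4.8e17 el/m^2 (~48 TECU)
delay = iono_delay_m(tec, 1.6e9)   # corresponding delay at 1.6 GHz, ~7.5 m
```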

By combining a physically-based software model (PIM or IRI in this case) with the GPS data modeling, a more refined ionospheric calibration model can be formed. This essentially forms a third type of ionospheric model developed for ALBUS. Most of the underlying software which performs the ionospheric physical modelling or model fitting was written by James Anderson in C/C++/FORTRAN, and a Python layer allows these routines to be run from ParselTongue. In addition to the ionospheric physics and GPS models, the ALBUS ionospheric software also has options to use the standard AIPS task TECOR (which uses publicly available Ionosphere Map Exchange, IONEX, models of the ionosphere), or to apply no ionosphere model at all. When data files for GPS measurements, or IONEX data, are required, the ALBUS software automatically handles downloading the public files from the Internet, and uncompressing and preprocessing them as necessary.

These calibration models were tested using EVN observations made during the 2005 May EVN observing session. They targeted several bright calibration sources, using standard phase-referencing techniques at L-band. As all target objects were themselves bright calibrators, the coherence loss from phase referencing and ionospheric calibration could be measured by comparing the final measured brightnesses of the targets when using phase referencing and various ionospheric calibration schemes, as compared with direct fringe fitting of the target source. The results of the tests were not very encouraging. On average, all of the ionospheric calibration algorithms from ALBUS and the standard AIPS TECOR task showed no net improvement in the ionospheric calibration. The main reason for this is that the residual ionospheric effects tend to be small-scale features (traveling ionospheric disturbances, TIDs, and other disturbances), which none of the models can accommodate at this point in time.

However, the ALBUS ionospheric calibration routines do seem to make improvements for polarization measurements. In addition to modelling the ionospheric path delay, the ALBUS ionospheric software can also model the ionospheric Faraday rotation. Tests with observations using Parkes and Westerbork (individually) show that the modelled Faraday rotation predictions do improve on applying no corrections.
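The Faraday rotation correction follows χ = RM·λ², with the ionospheric rotation measure proportional to the TEC and the line-of-sight magnetic field. A sketch with illustrative values (the actual ALBUS code models the field and electron density in three dimensions rather than assuming constants):

```python
import math

C = 299792458.0    # speed of light (m/s)
K_RM = 2.63e-13    # e^3 / (8 pi^2 eps0 m_e^2 c^3) in SI units (rad m^-2 per T per m^-2)

def ionospheric_rm(tec, b_los_tesla):
    """Rotation measure (rad/m^2), assuming a constant line-of-sight field."""
    return K_RM * b_los_tesla * tec

def rotation_deg(rm, freq_hz):
    """Polarisation angle rotation chi = RM * lambda^2, in degrees."""
    lam = C / freq_hz
    return math.degrees(rm * lam ** 2)

# Illustrative values: 48 TECU of slant TEC, a 50 microtesla field, 1.4 GHz
chi = rotation_deg(ionospheric_rm(4.8e17, 5.0e-5), 1.4e9)   # roughly 17 degrees
```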

Future ionospheric calibration developments should concentrate on measuring the small-scale structure of the ionosphere. Tests show that this requires a dense network of GPS stations surrounding individual VLBI telescopes, and further software development to properly model wave features in the ionosphere. However, in the near future, the dense GPS receiver networks required for this calibration are unlikely to be present for most VLBI stations. Some of these findings fed into the discussions on the calibration of LOFAR in which James Anderson participated.

Wide Field Imaging

Wide Field VLBI imaging requires observed data to be correlated with high time and frequency resolution. While the EVN correlator has these capabilities, handling the resulting large datasets is not well accounted for in existing software. The work carried out focused on providing software that facilitates the distributed processing of data to enable wide field imaging.

Existing software such as AIPS provides excellent imaging algorithms. The problems that a user faces when processing their data for the purpose of wide field imaging are largely related to data management, hardware utilisation, processing time, and data inspection and analysis. Software components were developed to address these issues.

Interacting with high performance computing systems

The target environments for this wide field imaging software are commodity clusters. To attain good utilisation of the underlying hardware, it is essential that the user have control over the allocation of computation and data resources. A Python interface to the PBS batch system (used on many clusters) was developed to allow sub-jobs to be allocated to compute nodes at run time. A module (AIPSLite) was developed to allow Python/ParselTongue code to be run in a lightweight environment without the need for a full AIPS install, as one is often not present on shared-use clusters. The AIPSLite module also allows any directory to be set up and utilised as an AIPS data area at run time. Together, these modules provide a level of control over CPU and disk usage on commodity clusters. Furthermore, software modules were developed by Stephen Bourke that provide functions to determine the properties of datasets and catalogues, perform astronomy-related calculations, and carry out various conversions and utility functions. These libraries, when loaded in Python, complement the ParselTongue infrastructure and facilitate application development.
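A minimal illustration of what a PBS interface of this kind involves (the function name and layout are hypothetical, not the actual module's API; the directives are standard PBS):

```python
def pbs_script(job_name, nodes, commands, walltime="01:00:00"):
    """Render a PBS batch script for one sub-job.  A hypothetical miniature
    analogue of the ParselTongue PBS interface, which would hand the script
    to qsub and track sub-job completion at run time."""
    lines = ["#!/bin/sh",
             "#PBS -N %s" % job_name,
             "#PBS -l nodes=%d" % nodes,
             "#PBS -l walltime=%s" % walltime]
    lines += commands
    return "\n".join(lines) + "\n"

script = pbs_script("wfimg-chan-0", 1, ["parseltongue image_subband.py 0"])
```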

Parallelisation

To accomplish parallelisation, data can be split up in many ways. For imaging, decomposing in the frequency domain is particularly useful for spectral line data. Decomposing by frequency is also useful for continuum imaging, as the resulting dirty images can be combined prior to performing a so-called Clark Clean. The Cotton-Schwab Clean is prohibitively slow on the large datasets that wide field imaging typically necessitates, so its algorithmic advantages are traded for the speed of imaging in parallel followed by a Clark Clean. This approach was implemented and used to image regions of massive star formation observed with the EVN. Fringe fitting is also time consuming on large datasets and is suitable for parallelisation, as separate time slices can be processed independently. This method of parallel calibration was implemented; however, for currently typical observations the overhead associated with decomposing the task is not compensated by the resulting speed-up.
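The frequency-decomposition strategy can be sketched as follows (the per-channel imaging step is a trivial placeholder; in the real pipeline each sub-job runs AIPS imaging on its own frequency chunk, and the summed dirty images then feed a single Clark Clean):

```python
from concurrent.futures import ProcessPoolExecutor

def dirty_image(channel_vis):
    """Placeholder for per-channel gridding and FFT: here just a toy
    transform standing in for imaging one frequency chunk."""
    return [v * 2.0 for v in channel_vis]

def combine(images):
    """Sum the per-channel dirty images pixel-by-pixel prior to a single
    Clark Clean on the combined image."""
    return [sum(px) for px in zip(*images)]

if __name__ == "__main__":
    channels = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]   # toy per-channel data
    with ProcessPoolExecutor() as pool:
        images = list(pool.map(dirty_image, channels))
    print(combine(images))   # [18.0, 24.0]
```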

Automated analysis and visualisation

A follow-on problem from VLBI wide field imaging is analysing the resulting images, which can be extremely large. Stephen Bourke developed automated detection routines, which search the field for sources and create a text file detailing possible sources of emission. The software can also create plots to give an overview of the field.
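A toy version of such a detection routine, searching for local maxima above a threshold and emitting a text listing (the actual routines operate on AIPS images and are considerably more sophisticated):

```python
def find_sources(image, threshold):
    """Return (x, y, value) for pixels above threshold that are strict
    maxima of their 8-neighbourhood -- a toy automated source search."""
    ny, nx = len(image), len(image[0])
    sources = []
    for y in range(ny):
        for x in range(nx):
            v = image[y][x]
            if v < threshold:
                continue
            neigh = [image[j][i]
                     for j in range(max(0, y - 1), min(ny, y + 2))
                     for i in range(max(0, x - 1), min(nx, x + 2))
                     if (i, j) != (x, y)]
            if all(v > n for n in neigh):
                sources.append((x, y, v))
    return sources

def report(sources):
    """Text-file style listing of candidate detections."""
    return "\n".join("%4d %4d  %.3f" % s for s in sources)
```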

This approach was extensively tried on a large EVN dataset observed at the methanol line, in collaboration with Kalle Torstensson and Huib van Langevelde. The data was sampled at 1024 (2 kHz) channels and 0.25 s intervals, allowing an area of 5 square arcminutes to be mapped at 4 mas resolution, producing imaged datasets of over 10^12 cells (Figure 7). The processing was done utilising 128 CPUs of the Walton (Opteron) cluster at the Irish Centre for High End Computing (ICHEC). The method proved successful, with the software performing well and in a very stable manner. However, no maser sources other than the known central object were found. In the future more targets can be tried.

Figure 7: Facet layout for a processed data set. Each of the 4669 boxes corresponds to a data cube of dimensions 2048 × 2048 cells with 1024 frequency channels, giving a total image size of over 18 tera-pixels.

Infrastructure Software

ParselTongue was developed in the context of ALBUS to define a common interface for developing new algorithms and disseminating the results to the user community. By providing a Python binding for classic AIPS it allows high level scripting for a large range of radio astronomy applications. It provides enhanced interface functions between AIPS and the outside world, which has been an important feature for implementing the EVN and MERLIN pipelines.

During this period, ParselTongue has seen a number of features added, both to the code base proper and to the user support infrastructure around it. Version 1.0.6 of ParselTongue was released in July 2007, and includes support for directly modifying image pixel values, as well as adding arbitrary data keywords to an image for purposes of organisation or reference. Other improvements since include the ability to modify and extend data headers, directly access the AIPS catalogue, and attach and manipulate tables. In 2007 the switch was made from the deprecated numarray Python module to the new NumPy framework. In addition, per-task log files have been added, allowing users more control over their output, especially when running a large number of simultaneous processes. Also, a basic infrastructure to simplify transferring data sets between AIPS installations has been implemented. As with per-task log files, this infrastructure is aimed at facilitating the use of ParselTongue in large-scale distributed pipelines.

Recent development by Mike Sipior has focused on exploring some new facilities for making simple parallel data reduction possible. This would be a great help for dealing with the very large datasets coming out of wide-field VLBI work, along with the obvious applications for data from large radio surveys. ParselTongue development in 2008 has concentrated on establishing a framework for easily running multiple AIPS tasks simultaneously, from a single control script, using user-specified queues of AIPS tasks. Task queues are simply containers of AIPSTask objects which, once populated, are executed in a fashion analogous to simple AIPSTasks. Because the data objects supplied as arguments to the individual AIPSTasks contain all of the information required to access the data itself, no extra work is required to run jobs on a remote machine.
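A miniature pure-Python analogue of the task-queue idea (the class names are hypothetical, not the actual ParselTongue API):

```python
class Task:
    """Stand-in for an AIPSTask: a callable with bound arguments.
    (Hypothetical miniature analogue, not the ParselTongue API.)"""
    def __init__(self, fn, *args):
        self.fn, self.args = fn, args
    def go(self):
        return self.fn(*self.args)

class TaskQueue:
    """A container of tasks executed like a single task, mirroring how
    ParselTongue queues of AIPSTask objects are populated and then run."""
    def __init__(self):
        self.tasks = []
    def append(self, task):
        self.tasks.append(task)
    def go(self):
        # Sequential here for simplicity; the real framework dispatches
        # tasks concurrently, and each task's data arguments carry enough
        # information to run on a remote machine.
        return [t.go() for t in self.tasks]

queue = TaskQueue()
for chan in range(3):
    queue.append(Task(lambda c: "imaged channel %d" % c, chan))
results = queue.go()   # ['imaged channel 0', 'imaged channel 1', 'imaged channel 2']
```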

The ParselTongue user community has grown substantially to approximately a hundred users, also implying an increasing number of users filing bug reports and feature requests. A great deal of work has been done to streamline the installation of ParselTongue and its dependencies, lowering the bar for new users to get started. In addition, the ParselTongue wiki at the JIVE web site (http://www.jive.nl/dokuwiki/doku.php) was enhanced and now includes a facility for users to share their own scripts and snippets. The ParselTongue mailing list serves both to announce new releases and as a forum for users to share experiences and ask questions about the software.
