Thursday, December 6, 2012

Code release: pyrmsynth - Fast RM synthesis in Python

Another in what will be a series of code release posts. It's not that I've been some kind of coding superhero lately, but that I've got a backlog of code to release in the wild and am only now finding time to do it.

Last week I released pyrmsynth, a fast, Python-based RM synthesis package that includes a CLEAN algorithm for simple deconvolution. This is the RM synthesis software that I wrote for the LOFAR collaboration, and it has been in use on their compute cluster for some time. 

The pyrmsynth package is both an application for performing RM synthesis on multi-frequency polarized imaging data and a library of sorts for writing your own applications. If your imaging data is in FITS format, or can be converted to FITS, then the included rmsynthesis.py application will probably do the trick for you out of the box. If your data is in another format, you can write your own file I/O and data parsing script and use the included RMSynth and RMClean classes to perform the RM synthesis on your data.

The advantage of using this package is its flexibility and relative speed. The interface is written in Python, so writing scripts for I/O, data manipulation, and visualization is easy. It's also easy to modify and extend the core functionality, if you need to. Despite being mainly written in Python, the code is still fast for two reasons: 
  1. We use FFTs to perform the imaging
  2. We have written gridding code in Cython (which generates C code) to enable usage of FFTs. 
FFTs require that the data be defined at regular intervals, but because the sampling of λ² space is generally uneven, this condition is not met in RM synthesis. We use a fancy binning method known as gridding to resample the data at regular intervals, thereby enabling use of the speedy FFT routines. The gridding itself would be slow if written in pure Python, so instead we have implemented it in C (via Cython), and the result is very fast code. On my machine, with the current version, I can process (with CLEAN and including file I/O) 10,000 lines of sight in ~6 seconds.
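To make the idea concrete, here is a minimal, self-contained sketch of the gridding trick in NumPy. This is not pyrmsynth's implementation: it uses the crudest possible scheme (each sample is dumped into the nearest grid cell, with no convolution kernel or correction function), and the toy signal, grid sizes, and variable names are all made up for illustration. But it shows why binning onto a regular grid lets a single FFT stand in for the slow direct sum.

```python
import numpy as np

rng = np.random.default_rng(0)

# Irregularly sampled positions (think: lambda^2 values set by the channels)
x = np.sort(rng.uniform(0.0, 1.0, 200))
# A toy complex "polarization" signal from a single Faraday depth phi0
phi0 = 5.0
p = np.exp(2j * np.pi * phi0 * x)

# --- Direct (slow) transform: O(N*M), evaluate the sum at every output point ---
n_out = 64
phi = np.arange(n_out) - n_out // 2   # output "Faraday depth" axis
direct = np.array([np.mean(p * np.exp(-2j * np.pi * f * x)) for f in phi])

# --- Gridded (fast) transform: bin onto a regular grid, then one FFT ---
grid = np.zeros(n_out, dtype=complex)
idx = (x * n_out).astype(int)         # nearest-cell assignment (no kernel!)
np.add.at(grid, idx, p)               # accumulate samples sharing a cell
grid /= len(x)
gridded = np.fft.fftshift(np.fft.fft(grid))
```

Both spectra peak at phi0; the gridded version just gets there in O(M log M) instead of O(N·M), which is the whole point when you have many lines of sight.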

pyrmsynth is released under GPLv3. Contributions are welcome for feature additions, bug fixes, documentation, etc. Just contact me or fork the project on github and get cracking.

http://mrbell.github.com/pyrmsynth/

Sunday, November 25, 2012

Code release: GFFT

This week I released a new Python package to the world. It's called GFFT (General Fast Fourier Transformation), and it makes applying (relatively) fast Fourier transforms to a wider class of data much simpler.

Fast Fourier transformation in Python is pretty simple in many circumstances already. Just use the Numpy FFT and you're all set, right? Well, that's true if you have data that is regularly sampled, e.g. you've repeatedly made some measurement exactly once every second. But what if your measurements are taken at irregular intervals? Or what if you need to do a lot of phase shifts because your data doesn't start at the origin of the coordinate system on which it is defined?
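That second case is simple bookkeeping, but it's easy to get wrong by hand. Here's a sketch of the kind of correction GFFT automates, written with plain NumPy (the variable names and numbers are my own, not from the package): by the shift theorem, an FFT computed as if the axis started at the origin only needs a linear phase factor to account for the true offset.

```python
import numpy as np

# Samples on an axis that does NOT start at the origin
n, dx, x0 = 128, 0.5, 7.25
x = x0 + dx * np.arange(n)
f0 = 0.25                               # signal frequency, cycles per unit
signal = np.exp(2j * np.pi * f0 * x)

k = np.fft.fftfreq(n, d=dx)             # output frequency axis
raw = np.fft.fft(signal)                # implicitly assumes x starts at 0
corrected = raw * np.exp(-2j * np.pi * k * x0)   # undo the coordinate offset

# Reference: the Fourier sum evaluated directly on the true coordinates
direct = np.array([np.sum(signal * np.exp(-2j * np.pi * kk * x)) for kk in k])
```

The corrected spectrum matches the direct evaluation exactly; forget the phase factor and every output value carries a frequency-dependent phase error.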

Well, these are both situations that I and others in my group face quite often, and we found that we were using similar pieces of code again and again in a variety of projects. So Henrik Junklewitz and I developed GFFT to make life a bit simpler. The features of GFFT include:

  • Fast Fourier transformations of data defined on, and being transformed to, arbitrarily defined coordinate spaces including:
    • regular to regular
    • irregular to regular
    • regular to irregular
    • irregular to irregular
  • Fast transformations, even when working on irregular spaces, thanks to the included gridding algorithm.
  • The gridding code is implemented in C (via Cython), so it's fast.
  • When working with irregular to regular space transforms, you can store irregularly sampled Hermitian symmetric data in an efficient way... just store half of the data and the symmetric parts will be generated internally.
  • Automatically handles arbitrary phase shifts of the input and output data if they are not defined on axes that start at the origin.
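The Hermitian point is easy to demonstrate with plain NumPy. This shows only the underlying symmetry, not GFFT's actual storage layout: real-valued data has a spectrum satisfying F(-k) = conj(F(k)), so half of it is redundant and can be regenerated on the fly.

```python
import numpy as np

n = 16
rng = np.random.default_rng(1)
signal = rng.standard_normal(n)     # real data -> Hermitian spectrum

full = np.fft.fft(signal)           # n complex values
half = np.fft.rfft(signal)          # only n//2 + 1 values need storing

# Rebuild the negative-frequency half from the stored positive half
rebuilt = np.concatenate([half, np.conj(half[-2:0:-1])])
```

The rebuilt spectrum is identical to the full one, at roughly half the storage cost.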
Interested? Check out the project on GitHub: http://mrbell.github.com/gfft/

Please note that as of now the package is in beta, and a major revision is already underway (see the rewrite branch in the git repository). The revision will clean up the code base significantly (it's almost embarrassingly sloppy right now, but you know how it goes when you develop code for scientific applications...) and will make the interface more intuitive. So if you start using GFFT immediately, just be aware that we are planning to fundamentally change the interface soon and future updates may break your code (we'll put in a deprecation warning first...). Things will stabilize soon. Until then, try it out and let us know what you think!

Friday, November 23, 2012

New submission: Improving CLEAN for rotation measure synthesis

This week we submitted a new paper to A&A where we suggest a method for improving CLEAN images in the context of RM synthesis. The method allows one to make lower resolution images while obtaining better results than when using CLEAN alone and moreover it makes the results much less dependent on the choice of pixelization.

What we and many others have found is that RMCLEAN doesn't do such a great job at reconstructing the locations or fluxes of sources, especially when there are several of them close together. When writing the 3D CLEAN algorithm for Faraday synthesis we noticed that RMCLEAN doesn't even do that great with a single source unless it is located directly in the middle of an image pixel. The dynamic range in a CLEANed image is well-known to be limited due to the fact that you can't exactly model the location of a source in a pixelized image. You can partially overcome the issue by making the image have very high resolution, but especially for 3D imaging this becomes expensive both computationally and in terms of storage.

So we looked into this in a bit more detail, and devised a method for improving the CLEAN-generated model using maximum likelihood (ML) estimation. The method is similar to others that have been suggested for aperture synthesis imaging, but it seems to have even more impact in the case of RM synthesis. In our testing, we found that the ML method dramatically reduces the error in measurements of both source location and flux. Somewhat surprisingly, we also found that increasing resolution doesn't reduce the errors in normal CLEAN images when two sources are close to one another. The ML algorithm, however, was able to get accurate results in such a case even in a low resolution image.
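As a toy illustration of the idea, and emphatically not the algorithm from the paper: the pixelized estimate snaps a source to the nearest pixel center, while a fit in which position and flux are continuous parameters has no such quantization error. Here the beam is a made-up Gaussian stand-in for the RMSF, and the fit uses SciPy's least-squares curve_fit (least squares being the ML estimate under Gaussian noise).

```python
import numpy as np
from scipy.optimize import curve_fit

def beam(x, x0, flux, width=2.0):
    # Hypothetical Gaussian stand-in for the RMSF / dirty beam
    return flux * np.exp(-0.5 * ((x - x0) / width) ** 2)

pixels = np.arange(64.0)
true_x0, true_flux = 30.37, 1.5            # source between pixel centers
dirty = beam(pixels, true_x0, true_flux)   # noiseless dirty "spectrum"

# Pixelized estimate: the brightest pixel center (off by up to half a pixel)
clean_x0 = pixels[np.argmax(dirty)]

# ML estimate: position and flux fit as continuous parameters
(ml_x0, ml_flux), _ = curve_fit(lambda x, x0, f: beam(x, x0, f),
                                pixels, dirty, p0=[clean_x0, dirty.max()])
```

In this noiseless toy the continuous fit recovers the sub-pixel position essentially exactly, while the pixelized estimate is stuck with an error of ~0.37 pixels no matter how deeply you CLEAN.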

Ultimately we don't think that CLEAN (or any method that assumes the signal to be diagonal in pixel space) is the right approach in all cases, and we're actively working on new methods that take advantage of correlated structures in the data to help constrain reconstructions. Nevertheless, CLEAN is undoubtedly a useful algorithm in some circumstances, and together with this new algorithm it can give rather good results while being easy to implement and (relatively) computationally inexpensive.

The pre-print is available now on the arXiv. Give it a read if you want to learn more. And if you're interested in implementing the method for your own analysis, then be sure to stay tuned because I plan to release the code for this relatively soon.