IDL vs. Python

by Kelle on May 4, 2009


Here is a really nice listing of the pros and cons of IDL and Python for astronomers. It’s an Appendix of the Python tutorial Using Python for Interactive Data Analysis by Greenfield and Jedrzejewski at STScI. Data and scripts for the examples and exercises in the tutorial are available at scipy.org.

Update 28 Jan 2010: We’ve transferred this article to the wiki so that it can be updated as Python evolves.

Why would I switch from IDL to Python (or not)?

by Greenfield and Jedrzejewski

We do not claim that all, or even most, current IDL users should switch to using Python now. IDL suits many people’s needs very well and we recognize that there must be a strong motivation for starting to use Python over IDL. This appendix will present the pros and cons of each so that users can make a better informed decision about whether they should consider using Python. At the end we give a few cases where we feel users should give serious consideration to using Python over IDL.

Pros and Cons are addressed below in a comparative sense. Attributes that both share, e.g., that they are interpreted and relatively slow for very simple operations, are not listed.

Pros of IDL

  • Mature; many numerical and astronomical libraries available
  • Wide astronomical user base
  • Numerical aspect well integrated with language itself
  • Many local users with deep experience
  • Faster for small arrays
  • Easier installation
  • Good, unified documentation
  • Standard GUI run/debug tool (IDLDE)
  • Single widget system (no angst about which to choose or learn)
  • SAVE/RESTORE capability
  • Use of keyword arguments as flags more convenient
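To make the last item concrete: IDL switches a flag keyword on with the slash shorthand, while Python passes an explicit boolean keyword argument. The sketch below is only an illustration; `summarize` and its `verbose` and `normalize` flags are hypothetical names, not part of any library.

```python
def summarize(values, verbose=False, normalize=False):
    """Return the mean of values; both keyword flags default to off."""
    if normalize:
        peak = max(values)
        values = [v / peak for v in values]
    mean = sum(values) / len(values)
    if verbose:
        print(f"mean of {len(values)} values: {mean:.3f}")
    return mean

# IDL would write something like:  result = summarize(data, /normalize)
# Python spells the flag out as an explicit boolean keyword argument.
data = [1.0, 2.0, 4.0]
plain = summarize(data)
scaled = summarize(data, normalize=True)
```

The Python call is a few characters longer, which is the inconvenience the list refers to, but the flag is an ordinary argument and can be passed along as a variable.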

Cons of IDL

  • Narrow applicability, not well suited to general programming
  • Slower for large arrays
  • Array functionality less powerful
  • Table support poor
  • Limited ability to extend using C or Fortran, such extensions hard to distribute and support
  • Expensive; sometimes a problem when collaborating with others who don’t have or can’t afford licenses
  • Closed source (only RSI can fix bugs)
  • Very awkward to integrate with IRAF tasks
  • Memory management more awkward
  • Single widget system (useless if working within another framework)
  • Plotting:
    • Awkward support for symbols and math text
    • Many font systems, portability issues (v5.1 alleviates somewhat)
    • not as flexible or as extensible
    • plot windows not intrinsically interactive (e.g., pan & zoom)

Pros of Python

  • Very general and powerful programming language, yet easy to learn. Strong, but optional, Object Oriented programming support
  • Very large user and developer community, very extensive and broad library base
  • Very extensible with C, C++, or Fortran, portable distribution mechanisms available
  • Free; non-restrictive license; Open Source
  • Becoming the standard scripting language for astronomy
  • Easy to use with IRAF tasks
  • Basis of STScI application efforts
  • More general array capabilities
  • Faster for large arrays, better support for memory mapping
  • Many books and on-line documentation resources available (for the language and its libraries)
  • Better support for table structures
  • Plotting
    • framework (matplotlib) more extensible and general
    • Better font support and portability (only one way to do it too)
    • Usable within many windowing frameworks (GTK, Tk, WX, Qt…)
    • Standard plotting functionality independent of framework used
    • plots are embeddable within other GUIs
    • more powerful image handling (multiple simultaneous LUTS, optional resampling/rescaling, alpha blending, etc)
  • Support for many widget systems
  • Strong local influence over capabilities being developed for Python

Cons of Python

  • More items to install separately
  • Not as well accepted in astronomical community (but support clearly growing)
  • Scientific libraries not as mature:
    • Documentation not as complete, not as unified
    • Not as deep in astronomical libraries and utilities
    • Not all IDL numerical library functions have corresponding functionality in Python
  • Some numeric constructs not quite as consistent with language (or slightly less convenient than IDL)
  • Array indexing convention “backwards”
  • Small array performance slower
  • No standard GUI run/debug tool
  • Support for many widget systems (angst regarding which to choose)
  • Current lack of function equivalent to SAVE/RESTORE in IDL
  • matplotlib does not yet have equivalents for all IDL 2-D plotting capability (e.g., surface plots)
  • Use of keyword arguments as flags less convenient
  • Plotting:
    • comparatively immature, still much development going on
    • missing some plot types (e.g., surface)
    • 3-d capability requires VTK (though matplotlib has some basic 3-d capability)
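The “backwards” indexing item is easiest to see with a small image. IDL addresses a 2-D array as `image[x, y]` (column first), while NumPy addresses it as `image[y, x]` (row first). A short sketch, assuming NumPy is installed; the array contents are made up for illustration.

```python
import numpy as np

# A tiny "image" with 2 rows (y) and 3 columns (x).
image = np.arange(6).reshape(2, 3)
# [[0 1 2]
#  [3 4 5]]

n_rows, n_cols = image.shape      # NumPy reports (ny, nx)

x, y = 2, 1                       # pixel at column 2, row 1
value = image[y, x]               # NumPy: row index first, then column

# Transposing gives an array that can be indexed in x, y order.
image_xy = image.T
same_value = image_xy[x, y]
```

Code ported from IDL usually needs its index order swapped, or the array transposed once on input as in the last two lines.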

Specific cases where using Python provides strong advantages over IDL

  • Your processing depends on running a few hard-to-replicate IRAF tasks, but you would rather do most of your data manipulation in your own IDL-style programs than in IRAF (and soon other systems will be accessible from Python, e.g., MIDAS, ALMA, slang, etc.)
  • You have algorithms that cannot be efficiently coded in IDL. They likely won’t be efficiently coded in Python either, but you will find interfacing the needed C or Fortran code easier, more flexible, more portable, and distributable. (Question: how many distributed IDL libraries developed by 3rd parties include C or Fortran code?) Or you need to wrap existing C libraries (Python has many tools to make this easier to do).
  • You do work on algorithms that may migrate into STSDAS packages. Using Python means that your work will be more easily adapted as a distributed and supported tool.
  • You wish to integrate data processing with other significant non-numerical processing such as databases, web page generation, web services, text processing, process control, etc.
  • You want to learn object-oriented programming and use it with your data analysis. (But you don’t need to learn object-oriented programming to do data analysis in Python.)
  • You want to be able to use the same language you use for data analysis for most of your other scripting and programming tasks.
  • Your boss makes you.
  • You want to be a cool, with-it person.
  • You are honked off at ITT Space Systems/RSI.
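As an illustration of the item about combining data processing with databases and web page generation, the sketch below stores a few measurements in an in-memory SQLite database and renders the selected rows as an HTML table, using only the standard library. The table name `photometry`, its columns, and the values are invented for the example.

```python
import sqlite3
from html import escape

# Hypothetical measurements: (object name, magnitude).
rows = [("obj_a", 14.2), ("obj_b", 15.7), ("obj_c", 13.9)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE photometry (name TEXT, mag REAL)")
conn.executemany("INSERT INTO photometry VALUES (?, ?)", rows)

# Let the database do the selection and sorting (brightest first).
bright = conn.execute(
    "SELECT name, mag FROM photometry WHERE mag < ? ORDER BY mag", (15.0,)
).fetchall()
conn.close()

# Turn the query result into a small HTML table.
cells = "\n".join(
    f"  <tr><td>{escape(name)}</td><td>{mag:.1f}</td></tr>" for name, mag in bright
)
html_table = f"<table>\n{cells}\n</table>"
```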

Obviously using a new language and libraries entails time spent learning. Despite what people say, it’s never that easy, especially if one has a lot of experience and code invested in an existing language. If you don’t have any strong motivations to switch, you should probably wait.

Reproduced with permission. Copyright 2007, Association of Universities for Research in Astronomy, Inc (AURA).


{ 23 comments }

1 bjswift May 5, 2009 at 12:09 am

Kelle, quit reading my mind!! (Did I mention this to you last week?) I’ve been thinking about switching for a while now and just this week I’ve had 3 or 4 conversations about it and almost installed all the libraries.

Honked off at RSI lately? 🙂

2 kelle May 5, 2009 at 7:23 am

@bjswift What are your primary reasons for wanting to switch?

3 Jessica May 5, 2009 at 7:55 am

I made the switch a couple of years ago and I don’t regret it at all. If you are interested in making the switch, I strongly suggest you install scisoft rather than the individual python packages. This gives you python, matplotlib, numpy, scipy, pyraf, etc. in one easy install.

Here are several helpful links on Marcos Huerta’s wiki. Feel free to edit the wiki, or leave comments and I can make the changes:

http://macsingularity.org/astrowiki/tiki-index.php?page=Mac Astronomers Wiki

http://macsingularity.org/astrowiki/tiki-index.php?page=python&PHPSESSID=f0a572a20043a3c9efdf87e97c88ebf1

http://macsingularity.org/astrowiki/tiki-index.php?page=pyraf&PHPSESSID=f0a572a20043a3c9efdf87e97c88ebf1

jessica

4 Jessica May 5, 2009 at 8:09 am

Also, if you are making the switch, I would recommend writing new code in python… don’t try porting all your old code over all at once.

Here are a few additional links for plotting, fitting, and stats tutorials:

http://www.scipy.org/Cookbook

http://matplotlib.sourceforge.net/examples/index.html

5 J.S. May 5, 2009 at 11:17 am

freedom is the big one. it’s one thing to use a proprietary program that manipulates data in open formats (ie, iTunes): you can always take your data elsewhere. it’s entirely another thing to create new works (ie, code) that REQUIRE a license to be useful. if i want python on a machine, i just put it on there. however, the technical advantages of python are almost overwhelming at this point. finally, python is a general purpose programming language you can use to do just about anything with your data once created. for example, python runs my wiki, creates the plots that go in it, and automates the process of getting them there. simply put, IDL blows. its interactive novelty is long gone, and its design flaws are overwhelming.

6 Marshall May 5, 2009 at 1:04 pm

You asked (elsewhere) why I’d switch to Python, so I figured I’d post the answer here. The truth is I’m only halfway switched right now, and suspect I’ll keep a foot in both camps for years. But I *enjoy* Python more.

Far and away the main driver for me is that Python has all the features of a modern programming language: powerful types like dicts and lists, real object orientation, list comprehensions, lambda functions, all that. IDL’s language has been stuck in stasis for _years_, apart from minor additions like = with version 6.0. Supposedly they’re going to be adding dictionaries and other features in version 7.2 next year (see http://michaelgalloy.com/2009/04/20/idl-roadmap.html ), but I’ll believe it when I see it. Python is getting better *fast*; there’s a huge community out there now, and a lot of developer energy behind numpy and matplotlib and so on. Meanwhile programming in IDL today basically feels like it did when I started grad school: I have more routines in my library, sure, but the language itself hasn’t evolved with the times. IDL itself is just ITT, and they can’t possibly compete with the Python community.
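For readers who have not used them, the language features named in this comment look roughly like this in practice. The catalog entries and the `Source` class are hypothetical, made up purely for illustration.

```python
# Hypothetical catalog: a list of dicts, one per source.
catalog = [
    {"name": "src1", "flux": 3.2, "band": "J"},
    {"name": "src2", "flux": 0.4, "band": "K"},
    {"name": "src3", "flux": 1.9, "band": "J"},
]

# List comprehension: names of the sources observed in the J band.
j_names = [src["name"] for src in catalog if src["band"] == "J"]

# Lambda function used as a sort key: brightest source first.
by_flux = sorted(catalog, key=lambda src: src["flux"], reverse=True)


# A small class bundling data with behaviour.
class Source:
    def __init__(self, name, flux):
        self.name = name
        self.flux = flux

    def scaled(self, factor):
        """Return a new Source with the flux multiplied by factor."""
        return Source(self.name, self.flux * factor)


brightest = Source(by_flux[0]["name"], by_flux[0]["flux"]).scaled(2.0)
```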

Lately I’ve gotten increasingly frustrated with programming in IDL: as the scope of what I’m trying to do has gotten larger, paradoxically I find myself spending more and more time on “stupid stuff” like wrestling with the ancient and limited plotting system, building very ugly GUIs which nonetheless take vast amounts of cumbersome code to build, and dealing with namespace conflicts between routines with identical names in different libraries. Python is not perfect, but it’s a heck of a lot better than IDL in all of these aspects. Like I said, I’m only halfway switched (and certain collaborations are going to keep me in IDL for years, as will all my legacy code) but for new stuff Python seems like it’s got the wind behind its sails.

7 Jane July 30, 2009 at 4:39 pm

Thanks everyone, for discussing this!

By habit I code in Perl (or PDL, the IDL-like add-on), but it’s become clear to me that the scientific modules of perl are not being maintained. I’m trying to figure out what to switch to. I wrote an IDL program yesterday, and half of a Python script the day before. It’s obvious there’s a steep learning curve with either.

Quick question for the hive mind: How well is python supporting FITS formats (including weird multidimensions, iraf multispec, WCSes), coordinate transfers, and other astronomy-specific issues?

8 Jessica July 30, 2009 at 5:56 pm

Preface: I am an ex-IDL, current-python user that made the switch about 2 years ago… statements below are some justification for why I switched; but are probably not completely objective.

Python and IDL are roughly equivalent in terms of reading FITS files to get the raw data into array format and then performing tasks. Long-term, python is likely to have better support for the more complex and newer aspects of FITS (including WCS, weird extensions, etc.) as the python FITS modules are being developed and maintained by the Space Telescope Science Institute for HST and JWST. Multi-beam FITS data (e.g. ALMA, IRAM, APEX) will be developed/supported for python by ALMA, etc. The efforts for supporting FITS improvements in IDL tend to be more individualized and based around CFITSIO, etc.

Support for plotting FITS with WCS in python is probably lagging behind IDL right now in terms of high-end functionality. BUT, one of my favorite features is python/PyRAF’s ability to link up to ds9. And APLpy will probably go a long way to help this as well.

Finally, with regard to coordinate transformations, etc., there are a lot of handy IDL functions that haven’t been ported; however, python has access to the very powerful slalib (written in fortran, but I use f2py to access it via python). There are many advanced astronomy tools in there that have never been ported to IDL, and most of the algorithms are more accurate in slalib than in IDL. There is an extra step though.

9 Eli July 30, 2009 at 10:30 pm

Jessica has covered most of the details concerning the pros of Python for astronomers. Python has another facet that IDL does not have: an extensive base of scientists and programmers. IDL is a closed system and its user base is limited, whereas Python is used by a broader community. The Python user in turn has access to a large library of utilities.

Take PyCUDA for example, which is a Python module that allows easy access to the GPU for processing data. It can turn a GHz desktop to a gigaflop number cruncher (assuming you have a better than average NVIDIA graphics card).

10 Marcos July 31, 2009 at 6:47 am

At the same time, there is an awful lot of IDL code out there – it’s definitely a smaller group than Python, but which has more code relevant to astronomers? Thinking back to my thesis coding and the sheer amount of code I used (JHU/APL, NASA IDL library, MPFITS, etc), the idea of doing it all in python is unpleasant. Though, I’m sure it could be done.

Has anyone tried GDL (the free, IDL clone) for anything serious? I tried to use my code in it… but it didn’t work, some routines aren’t supported.

11 Nicholas Chapman August 1, 2009 at 7:09 am

In an emoticon, I <3 python. As for plotting (especially FITS images) check out pywip. I wrote a python wrapper around WIP. I think it is great, but then I am slightly biased. My website link goes to the proper page.

12 WesTF August 18, 2009 at 2:49 pm

I have been a long time user of python, and swear by it. But there is one flaw not mentioned here that should be. Python appears to handle implicit type conversion very nicely. But it has a few flaws in its memory handling. The two combined together can be disastrous. You can delete an item, whether it be an object, module, variable, doesn’t matter. But you can never recover the memory used by that deleted item. In addition, under certain scenarios (which I have run into a few) python will take up a new memory section, and not use the old stuff already allocated to previously deleted objects. It goes something like this:

a = "String"
del a
a = […large array of floats…]
for loop over that a while

And blam! Memory is all gone.

Now with good code, or with some awareness of the memory management issues, one can avoid these things. But let’s be honest, not all astronomers (myself included) are good coders. And when you crash a processor bank because you used every last bit of memory, usually people are angry.

Now I love Python, and do everything not super calc. intensive in it. But this is something people should be aware of.
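One way to see how a given interpreter behaves in a situation like this is to measure it. The sketch below uses the standard-library `tracemalloc` module (Python 3.4 and later, so newer than this discussion) to record how much memory Python has allocated while a large list is held and after it is deleted. It reports the interpreter's own allocations, not the process size seen by the operating system, so it is only a partial check. The function name `measure_release` is made up for the example.

```python
import tracemalloc


def measure_release(n=200_000):
    """Return traced bytes while a big list is held, after del, and at peak."""
    tracemalloc.start()
    data = [float(i) for i in range(n)]
    held, _ = tracemalloc.get_traced_memory()
    del data                                  # drop the only reference
    after, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return held, after, peak


held, after, peak = measure_release()
print(f"held: {held} B, after del: {after} B, peak: {peak} B")
```

Whether the freed memory is handed back to the operating system depends on the platform's allocator, which this sketch does not measure.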

13 Mike McKerns August 19, 2009 at 4:30 pm

Or you could use both at the same time…
http://pypi.python.org/pypi/pyIDL

14 Adam Ginsburg October 21, 2009 at 9:54 am

Tom Robitaille and the AstroPython/APLpy group released IDLsave about two weeks ago. That should alleviate some of the concerns about saving, especially if they come up with a ‘write’ package to match.

15 WesTF October 29, 2009 at 6:22 pm

If saving is really a concern to python users, they could always adopt the pickle library. It can do most of what idl save does, but is more specific.
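A minimal sketch of the pickle approach suggested here, using only the standard library. The helper names `save_session` and `restore_session`, the file name, and the sample variables are hypothetical; unlike IDL's SAVE, the variables to store are listed explicitly.

```python
import os
import pickle
import tempfile


def save_session(path, **variables):
    """Write the named variables to path as one pickled dict."""
    with open(path, "wb") as fh:
        pickle.dump(variables, fh)


def restore_session(path):
    """Read back the dict of variables written by save_session."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


wavelengths = [1.25, 1.65, 2.2]
header = {"object": "example", "exptime": 30.0}

path = os.path.join(tempfile.mkdtemp(), "session.pkl")
save_session(path, wavelengths=wavelengths, header=header)
restored = restore_session(path)
```

Pickle files can execute code when loaded, so this is best kept to files you wrote yourself.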

16 Eli October 29, 2009 at 7:13 pm

WesTF and Adam are both onto something. I wrote a program earlier this year that replicates the IDL save/load capability. The program looks through all of your global environment variables in a Python session and selects the numpy arrays automatically when you choose to save. After seeing the comments on this topic and a push from several colleagues I’ll clean up the code and post it for download so others can use it.
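The idea described in this comment can be sketched in a few lines. This is not the commenter's program, only an illustration of the approach, assuming NumPy is installed: it picks out the NumPy arrays in a namespace dict and stores them with `numpy.savez`. The function names and the sample `session` dict are hypothetical; in an interactive session one would pass `globals()` instead.

```python
import os
import tempfile

import numpy as np


def save_arrays(path, namespace):
    """Save every NumPy array found in namespace to a .npz file."""
    arrays = {
        name: value
        for name, value in namespace.items()
        if isinstance(value, np.ndarray) and not name.startswith("_")
    }
    np.savez(path, **arrays)
    return sorted(arrays)


def load_arrays(path):
    """Return a dict mapping variable names to the saved arrays."""
    with np.load(path) as data:
        return {name: data[name] for name in data.files}


# Stand-in for globals() in an interactive session.
session = {
    "flux": np.array([1.0, 2.0, 3.0]),
    "label": "not an array",
    "mask": np.array([True, False, True]),
}

path = os.path.join(tempfile.mkdtemp(), "session.npz")
saved_names = save_arrays(path, session)
restored = load_arrays(path)
```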

17 JDS November 24, 2009 at 5:43 pm

I wonder if those ex-IDLers now using Python (+ associated scientific computing packages, hereafter for convenience called “Iython”) can comment on the following:

– How easy is it for end users to install the various packages needed for a “core” Iython install, including plotting, data handling, widgets, etc., for Win/Linux/Mac? And a related question…

– A basic set of instructions for an end user of distributed IDL routines/packages might be “use IDL > X.X; NASA Lib, MPFIT on IDL search path,” with a very high chance the code will run out of the box if those minimal dependencies are satisfied. How short or long would the equivalent install instructions for an Iython package of comparable complexity be?

– Does Iython come with its own C compiler, so that you can distribute platform agnostic C code and have it auto-compiled? If not, what advantages does Python offer in plugging in C or other code, compared to, say, MAKE_DLM?

– Is there a preferred editor which understands the operator constructs in NumPy, etc.?

– Does Iython offer interfaces to array libraries, polygon clipping libraries, map projection libraries, etc.

– Does Iython’s plotting framework use a consistent set of fonts for display/vector output/hardcopy? Can you predict in advance the size (horizontal and vertical) of displayed strings for precise positioning?

Thanks.

18 Marshall November 24, 2009 at 8:25 pm

There is not yet any equivalent to “use NASA lib & MPFIT” for Python. A reasonable starting point would be “use Numpy+Scipy+Matplotlib”, but there is as yet no standardized set of astronomy specific packages equivalent to Goddard. It’s early, things are in flux and there are still multiple competing alternatives (for instance the several implementations of WCS). Which is in some ways good, and in other ways bad. Establishing a common set of starting packages for beginners is definitely an area which could use some improvement. (And the ease of install of packages varies wildly. Many are as simple as typing ‘easy_install packagename’ but others require wrestling with compilers. I still can’t get the MySQL interface to build properly on Snow Leopard for some reason.)

My impression is that the Matplotlib plotting framework is very solid and consistent. The ability to hit ‘save’ and produce a PDF file that looks pretty much exactly like any given window is a huge plus. But I’ve not tried specifically to predict the size of display strings like you asked. And live resizable windows! It’s like a breath of fresh air.

Don’t get me wrong, I still spend more time in IDL than in Python right now, particularly given the three humongous applications I’m working on that I really don’t want to port. And there are some things each language feels more smooth at. But I consider myself a relative neophyte in Python in a lot of ways (maybe green belt level, hopefully on my way up to a black belt eventually to match my ones in taekwondo and IDL. 🙂 and I’m trying to produce more new code there right now.

19 Tom November 24, 2009 at 11:31 pm

Marshall already answered most of your questions, but just wanted to add a few things. First, about the ‘core’ package. The short answer is that if you don’t mind not having the absolute latest and greatest versions of every python module, you can use something like the Enthought Python Distribution which includes many pre-installed libraries (including numpy, scipy, matplotlib, and the MySQL-python Marshall was mentioning). This provides a very complete set of (non-astronomy) python modules. Most astronomy packages are fairly simple and will install on top of EPD without much problem. EPD is available for Mac/Linux/Windows if I am not mistaken, and is free for Academics.

As to the question about C compilers, quite a few python modules include C code (e.g. numpy) that is compiled on the fly when the package is installed, using whatever C compilers are available on the host machine. Some packages also include Fortran code that is compiled using whatever Fortran compiler is available. Finally, it is actually possible to write C code inside python code using the scipy.weave package, and this code is compiled at runtime.

Array library: NumPy and Scipy have tons of array manipulation routines.

Polygon clipping: I think SymPy has a good geometry package to do that kind of thing

Map projection: Not sure if you mean WCS (pywcs) or something like basemap.

On a side note, I see the diversity of Python modules as a good thing, even if it means spending a little more time installing things, because it allows anyone to contribute small easily installable modules to do a specific thing well. I do think Astronomy modules in general need more coordination, but the fact that there are multiple FITS or WCS modules out there is good, as this drives competition and ultimately improves the quality of the packages.

I hope this helps!

20 Joshua Bloom December 24, 2009 at 1:42 pm

The pro/con list seems a bit outdated. For instance, there are now surface plots in matplotlib. Check out the surface demo. Your readers might be interested in seeing a bunch of matplotlib figures. Rendering figures is starting to get as powerful in python as it is in IDL. For truly cool 3d stuff, check out Mayavi2, which is part of the Enthought Distribution.

21 python .gt. IDL January 26, 2010 at 12:21 am

python + numpy/scipy + matplotlib + pyfits + pymc is far and away a better choice for new astronomers than IDL. It is true that python still doesn’t natively do all of the niceties that IDL does for astronomers (by the way, the Greenfield document is quite old in python years), but just because IDL does something should not imply that IDL does it well. Interpolation is an easy example.

Support for sparse matrices and a thorough wrapping of linalg packages (e.g. lapack), robust and flexible spline interpolation, automagic MCMC, a huge assortment of filtering options, pickle, shelve, matplotlib (which makes much nicer plots than IDL), clean syntax, simple OO interfacing, many optimization routines (MPFIT has been ported, but that’s really just levenberg-marquardt, which has been in scipy for a long time — check out openopt for a great deal of optimizers!), huge community, and it is free. And this last point really is important because it attracts even more (perhaps non-astronomer) users to help build the codebase. Python is worth it just for pymc — perhaps this will finally motivate astronomers to treat their data and models properly!
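As a small illustration of the linear algebra wrapping mentioned in this comment, `numpy.linalg.lstsq` (which calls LAPACK underneath) can fit a straight line by least squares. This sketch assumes NumPy is installed; the data are synthetic and noise-free, chosen only so the result is easy to check.

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0                      # synthetic data: slope 2, intercept 1

# Design matrix with columns [x, 1] so that y = slope * x + intercept.
design = np.column_stack([x, np.ones_like(x)])

coeffs, residuals, rank, singular_values = np.linalg.lstsq(design, y, rcond=None)
slope, intercept = coeffs
```

For nonlinear models the Levenberg-Marquardt routines in scipy that the comment refers to would be the analogous tool.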

22 Jessica January 27, 2010 at 11:48 pm

@Josh

Now we have a version of the Pros/Cons list on the wiki. The wiki version is meant to be updated. I modified it to reflect your comment about Mayavi2.

http://www.astrobetter.com/wiki/tiki-index.php?page=idl_vs_python

23 Kelle January 28, 2010 at 9:23 am

I’m closing the comments. Please continue this conversation on the wiki page instead: http://www.astrobetter.com/wiki/tiki-index.php?page=idl_vs_python

Comments on this entry are closed.
