# Visualization Fun with Python

by on February 10, 2014

The plot below is my favorite data visualization from my thesis. It is a 2D density plot with histograms projected along each axis.

I based the above plot on code from here; however, this plot also includes a 2D temperature/density plot in the middle and 1/2/3 sigma contour lines. Below is the code I used to generate this plot in Python. It uses quite a few tricks, including:

• Adding multiple plots to a single figure
• Making 2D temperature/density plots
• Making 1D histogram plots
• Adding contour lines to plots
• Adjusting size and font of labels and tick-marks
• Adding text to a plot
• Adjusting the limits of the plot
• Removing tick-labels

Here is the code:

    import numpy as np
    import matplotlib.pyplot as plt
    from matplotlib.ticker import NullFormatter, MaxNLocator

    plt.ion()

    # Define a function to make the ellipses
    def ellipse(ra, rb, ang, x0, y0, Nb=100):
        # Parametric ellipse with semi-axes ra and rb, rotated by ang (radians)
        # and centered on (x0, y0)
        co, si = np.cos(ang), np.sin(ang)
        the = np.linspace(0, 2*np.pi, Nb)
        X = ra*np.cos(the)*co - si*rb*np.sin(the) + x0
        Y = ra*np.cos(the)*si + co*rb*np.sin(the) + y0
        return X, Y

    # Define the x and y data
    # For example just using random numbers
    x = np.random.randn(10000)
    y = np.random.randn(10000)

    # Set up default x and y limits
    xlims = [min(x), max(x)]
    ylims = [min(y), max(y)]

    # Set up your x and y labels (raw strings keep the LaTeX backslashes intact)
    xlabel = r'$\mathrm{Your\ X\ Label}$'
    ylabel = r'$\mathrm{Your\ Y\ Label}$'

    # Define the locations for the axes
    left, width = 0.12, 0.55
    bottom, height = 0.12, 0.55
    bottom_h = left_h = left + width + 0.02

    # Set up the geometry of the three plots
    rect_temperature = [left, bottom, width, height]  # dimensions of temp plot
    rect_histx = [left, bottom_h, width, 0.25]  # dimensions of x-histogram
    rect_histy = [left_h, bottom, 0.25, height]  # dimensions of y-histogram

    # Set up the size of the figure
    fig = plt.figure(1, figsize=(9.5, 9))

    # Make the three plots
    axTemperature = plt.axes(rect_temperature)  # temperature plot
    axHistx = plt.axes(rect_histx)  # x histogram
    axHisty = plt.axes(rect_histy)  # y histogram

    # Remove the inner axes numbers of the histograms
    nullfmt = NullFormatter()
    axHistx.xaxis.set_major_formatter(nullfmt)
    axHisty.yaxis.set_major_formatter(nullfmt)

    # Find the min/max of the data
    xmin, xmax = min(xlims), max(xlims)
    ymin, ymax = min(ylims), max(ylims)

    # Make the 'main' temperature plot
    # Define the number of bins
    nxbins = 50
    nybins = 50
    nbins = 100

    xbins = np.linspace(xmin, xmax, nxbins)
    ybins = np.linspace(ymin, ymax, nybins)
    # Ratio of the data ranges, so that the image is drawn square
    aspectratio = (xmax - xmin)/(ymax - ymin)

    # Rows of H follow y and columns follow x, which is what imshow expects
    H, yedges, xedges = np.histogram2d(y, x, bins=(ybins, xbins))

    # Plot the temperature data
    cax = axTemperature.imshow(H, extent=[xmin, xmax, ymin, ymax],
                               interpolation='nearest', origin='lower',
                               aspect=aspectratio)

    # Plot the temperature plot contours
    contourcolor = 'white'
    xcenter = np.mean(x)
    ycenter = np.mean(y)
    ra = np.std(x)
    rb = np.std(y)
    ang = 0

    X, Y = ellipse(ra, rb, ang, xcenter, ycenter)
    axTemperature.plot(X, Y, "k:", ms=1, linewidth=2.0)
    axTemperature.annotate(r'$1\sigma$', xy=(X[15], Y[15]), xycoords='data',
                           xytext=(10, 10), textcoords='offset points',
                           horizontalalignment='right',
                           verticalalignment='bottom', fontsize=25)

    # For the outer contours pass only the line style (":"), so that the
    # format string does not conflict with the color keyword
    X, Y = ellipse(2*ra, 2*rb, ang, xcenter, ycenter)
    axTemperature.plot(X, Y, ":", color=contourcolor, ms=1, linewidth=2.0)
    axTemperature.annotate(r'$2\sigma$', xy=(X[15], Y[15]), xycoords='data',
                           xytext=(10, 10), textcoords='offset points',
                           horizontalalignment='right',
                           verticalalignment='bottom', fontsize=25,
                           color=contourcolor)

    X, Y = ellipse(3*ra, 3*rb, ang, xcenter, ycenter)
    axTemperature.plot(X, Y, ":", color=contourcolor, ms=1, linewidth=2.0)
    axTemperature.annotate(r'$3\sigma$', xy=(X[15], Y[15]), xycoords='data',
                           xytext=(10, 10), textcoords='offset points',
                           horizontalalignment='right',
                           verticalalignment='bottom', fontsize=25,
                           color=contourcolor)

    # Plot the axes labels
    axTemperature.set_xlabel(xlabel, fontsize=25)
    axTemperature.set_ylabel(ylabel, fontsize=25)

    # Make the tickmarks pretty
    ticklabels = axTemperature.get_xticklabels()
    for label in ticklabels:
        label.set_fontsize(18)
        label.set_family('serif')

    ticklabels = axTemperature.get_yticklabels()
    for label in ticklabels:
        label.set_fontsize(18)
        label.set_family('serif')

    # Set up the plot limits
    axTemperature.set_xlim(xlims)
    axTemperature.set_ylim(ylims)

    # Set up the histogram bins
    # (nbins + 1 edges give nbins bins that span the full data range)
    xbins = np.linspace(xmin, xmax, nbins + 1)
    ybins = np.linspace(ymin, ymax, nbins + 1)

    # Plot the histograms
    axHistx.hist(x, bins=xbins, color='blue')
    axHisty.hist(y, bins=ybins, orientation='horizontal', color='red')

    # Set up the histogram limits
    axHistx.set_xlim(min(x), max(x))
    axHisty.set_ylim(min(y), max(y))

    # Make the tickmarks pretty
    ticklabels = axHistx.get_yticklabels()
    for label in ticklabels:
        label.set_fontsize(12)
        label.set_family('serif')

    # Make the tickmarks pretty
    ticklabels = axHisty.get_xticklabels()
    for label in ticklabels:
        label.set_fontsize(12)
        label.set_family('serif')

    # Cool trick that changes the number of tickmarks for the histogram axes
    axHisty.xaxis.set_major_locator(MaxNLocator(4))
    axHistx.yaxis.set_major_locator(MaxNLocator(4))

    # Show the plot
    plt.draw()

    # Save to a file
    filename = 'myplot'
    plt.savefig(filename + '.pdf', format='pdf', transparent=True)
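As a quick sanity check on the `ellipse` helper in the code above, the sketch below re-creates it with NumPy only and confirms that the unrotated curve satisfies the usual ellipse equation. It then counts how many samples fall inside each k-sigma ellipse. The fixed seed and the names `on_curve`, `r2` and `fractions` are assumptions made for illustration, not part of the original code. For independent Gaussian data one would expect roughly 39%, 86% and 99% of points inside the 1, 2 and 3 sigma ellipses, rather than the 68/95/99.7% familiar from one dimension.

```python
import numpy as np

def ellipse(ra, rb, ang, x0, y0, Nb=100):
    """Return Nb points on an ellipse with semi-axes ra, rb, rotated by ang (radians)."""
    the = np.linspace(0, 2 * np.pi, Nb)
    co, si = np.cos(ang), np.sin(ang)
    X = ra * np.cos(the) * co - si * rb * np.sin(the) + x0
    Y = ra * np.cos(the) * si + co * rb * np.sin(the) + y0
    return X, Y

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility
x = rng.standard_normal(10000)
y = rng.standard_normal(10000)

x0, y0 = np.mean(x), np.mean(y)
ra, rb = np.std(x), np.std(y)

# Points on the unrotated 1-sigma ellipse satisfy ((X-x0)/ra)^2 + ((Y-y0)/rb)^2 = 1
X, Y = ellipse(ra, rb, 0.0, x0, y0)
on_curve = ((X - x0) / ra) ** 2 + ((Y - y0) / rb) ** 2

# Fraction of samples inside the k-sigma ellipse for k = 1, 2, 3
r2 = ((x - x0) / ra) ** 2 + ((y - y0) / rb) ** 2
fractions = {k: float(np.mean(r2 <= k ** 2)) for k in (1, 2, 3)}
print(fractions)
```

Note that these ellipses are built from the sample mean and standard deviations, so they describe the data only to the extent that the data are close to an uncorrelated Gaussian.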

1 Anne February 10, 2014 at 12:22 pm

Nice! This is a very useful kind of plot. But the rainbow color map is a bad idea: http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=4118486&tag=1 Fortunately it’s just a matter of using plt.hot() (or your other favourite non-problematic color map).
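A minimal sketch of this suggestion, assuming the same kind of random test data as in the post: rather than calling `plt.hot()`, the colormap can also be passed directly to `imshow` through its `cmap` argument. The variable names, the seed and the `Agg` backend (used only so the sketch runs without a display) are assumptions for illustration. In Matplotlib 2.0 and later the default colormap is `viridis` rather than `jet`, so the rainbow problem mostly affects older versions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, only so this runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility
x = rng.standard_normal(10000)
y = rng.standard_normal(10000)

# Rows of H follow y, columns follow x
H, yedges, xedges = np.histogram2d(y, x, bins=50)

fig, ax = plt.subplots()
im = ax.imshow(H, extent=[xedges[0], xedges[-1], yedges[0], yedges[-1]],
               origin='lower', interpolation='nearest', aspect='auto',
               cmap='hot')  # any non-rainbow colormap name works here
fig.colorbar(im, ax=ax)
```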

2 Adam Ginsburg February 10, 2014 at 3:16 pm

This would be nice as a submission to the matplotlib gallery; it combines elements of a few examples already there:
http://matplotlib.org/examples/pylab_examples/scatter_hist.html
http://matplotlib.org/examples/axes_grid/scatter_hist.html
http://matplotlib.org/examples/pylab_examples/hist2d_demo.html

Also, while I often do as you did with ticks & loops, there are convenience functions to replace things like:
ticklabels = axHistx.get_yticklabels()
for label in ticklabels:
    label.set_fontsize(12)
    label.set_family('serif')
with:
axHistx.yaxis.set_tick_params(labelsize=12)
though apparently there’s no way to set the font family with this approach… that surprises me.

3 Coleman Krawczyk February 13, 2014 at 11:59 pm

You can always set the global font family in one line by doing:

plt.rcParams['font.family'] = 'serif'
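Putting the two suggestions together, a short sketch might look like the following. It assumes a throwaway line plot rather than the figure from the post, and the `Agg` backend is used only so the sketch runs without a display. The font family is set globally through `rcParams`, and the tick-label size is set per axis with `set_tick_params`.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, only so this runs without a display
import matplotlib.pyplot as plt

# Global setting: affects text created after this line
plt.rcParams['font.family'] = 'serif'

fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4])

# Per-axis tick-label size, replacing the explicit loop over tick labels
ax.xaxis.set_tick_params(labelsize=18)
ax.yaxis.set_tick_params(labelsize=18)

fig.canvas.draw()
sizes = [label.get_fontsize() for label in ax.get_xticklabels()]
```

Because `rcParams` is global, it changes every figure created afterwards in the same session; the loop over tick labels in the post remains the more local option.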

4 Bo Milvang-Jensen February 10, 2014 at 4:16 pm

Thanks for sharing Jess! I’m new to Python and I’m looking forward to getting to know it.

Why is the top panel wider than the middle panel?

5 adrian p-w February 10, 2014 at 6:20 pm

Try using histtype='step' or histtype='stepfilled' in the .hist() call; all those vertical black lines in the histogram just become noise to the reader.
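A minimal sketch of this change, assuming random test data and a single standalone axis rather than the full figure from the post (the `Agg` backend is there only so the sketch runs without a display). With `histtype='stepfilled'` the histogram is drawn as one filled outline instead of one bar per bin.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, only so this runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility
x = rng.standard_normal(10000)

fig, ax = plt.subplots()
# 'stepfilled' draws a single filled polygon, so there are no per-bar edges
counts, edges, patches = ax.hist(x, bins=100, histtype='stepfilled', color='blue')
```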

6 Peter I. P. February 11, 2014 at 3:50 am

Using gridspec (http://matplotlib.org/users/gridspec.html), you can get an even tidier arrangement of your subplots (as it stands, they are not perfectly lined up with the central plot).
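A rough sketch of what a `GridSpec` layout could look like, assuming random test data. The axis names, the 4:1 size ratios, the spacing values and the colormap are illustrative choices and not taken from the post. `aspect='auto'` is used so the image fills its grid cell, which keeps the central panel exactly as wide as the histogram above it.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, only so this runs without a display
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.gridspec import GridSpec

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility
x = rng.standard_normal(10000)
y = rng.standard_normal(10000)
H, yedges, xedges = np.histogram2d(y, x, bins=50)

fig = plt.figure(figsize=(9, 9))
gs = GridSpec(2, 2, width_ratios=[4, 1], height_ratios=[1, 4],
              wspace=0.05, hspace=0.05)

ax_main = fig.add_subplot(gs[1, 0])                    # central density plot
ax_histx = fig.add_subplot(gs[0, 0], sharex=ax_main)   # x histogram on top
ax_histy = fig.add_subplot(gs[1, 1], sharey=ax_main)   # y histogram on the right

ax_main.imshow(H, extent=[xedges[0], xedges[-1], yedges[0], yedges[-1]],
               origin='lower', interpolation='nearest', aspect='auto',
               cmap='hot')
ax_histx.hist(x, bins=xedges, histtype='stepfilled', color='blue')
ax_histy.hist(y, bins=yedges, orientation='horizontal',
              histtype='stepfilled', color='red')

# Hide the tick labels that face the central plot
ax_histx.tick_params(labelbottom=False)
ax_histy.tick_params(labelleft=False)

# Pin the shared limits to the binned data range
ax_main.set_xlim(xedges[0], xedges[-1])
ax_main.set_ylim(yedges[0], yedges[-1])
```

Sharing the axes means that zooming or changing the limits on the central plot carries over to both histograms.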

7 Pablo Marchant February 11, 2014 at 7:46 am

Just reposting this link from the astronomers facebook page, a small read that summarizes the problems with the rainbow color map:

http://www.jwave.vt.edu/~rkriz/Projects/create_color_table/color_07.pdf

And that doesn’t even deal with the issues for color-blind people. Very specific to matplotlib, this Stack Overflow page contains detailed answers on the available colormaps and their luminance from end to end:

http://stackoverflow.com/questions/13968520/color-selection-for-matplotlib-that-prints-well

A monotonically changing luminance is usually paramount for publications, so that plots remain usable in b/w.

Other than that, it’s a nice example.