Recently I’ve been investigating a key dataset in my research, trying to understand what is causing the patterns that I see. I realised that it would be really useful if I could plot an interactive scatter plot in Python, and then hover over points to find out further information about them.
Putting this into more technical terms: I wanted a scatter plot with tooltips (or ‘hover boxes’, or whatever else you want to call them) that showed information from the other columns in the pandas DataFrame that held the original data.
I’m fairly experienced with creating graphs using matplotlib, but I have very little experience with other systems (although I had played a bit with bokeh and mpld3). So, I emailed my friend and colleague Max Albert, asking if he knew of a nice simple function that looked a bit like this:
# df is a pandas DataFrame with columns A, B, C and D
#
# scatter plot of A vs B, with a hover tool giving the
# values of A, B, C and D
plot(A, B, 'x')

# same, but only showing C (to deal with DataFrames with
# loads of columns)
plot(A, B, 'x', cols=['C'])
He got back to me saying that he didn’t know of a function that worked quite like that, but that it was relatively simple to do in bokeh – and sent some example code. I’ve now tidied that code up and created a function that is very similar to the example above.
My function is called scatter_with_hover, and using it, the two examples above would be written as:
scatter_with_hover(df, 'A', 'B')
scatter_with_hover(df, 'A', 'B', cols=['C'])
You can pass all sorts of other parameters to the function too – such as the marker shape to use (circles, squares etc), a figure to plot onto, and any other parameters that the bokeh scatter function accepts (such as color, size etc).
Anyway, the code is below – feel free to use it in any way you want, and I hope it’s useful.
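As a rough illustration of the idea, here is a minimal sketch of how a function like this can be put together with bokeh. It is a simplified stand-in rather than the full function: the parameter names `fig` and `marker`, and the small helper `build_tooltips`, are assumptions made for this example. The core trick is to wrap the DataFrame in a `ColumnDataSource` and attach a `HoverTool` whose tooltips refer to the column names.

```python
def build_tooltips(cols):
    """Build bokeh tooltip pairs such as ("C", "@{C}") for each column name.

    Hypothetical helper for this sketch. The braces let bokeh resolve
    column names that contain spaces.
    """
    return [(str(col), "@{" + str(col) + "}") for col in cols]


def scatter_with_hover(df, x, y, fig=None, cols=None, marker="x", **kwargs):
    """Scatter plot of df[x] against df[y] with a hover tool (sketch only).

    df     : pandas DataFrame holding the data
    x, y   : names of the columns to plot
    fig    : existing bokeh figure to draw on (assumed name); a new one
             is created if None
    cols   : columns to show in the tooltip; all columns if None
    marker : bokeh marker name, e.g. "x", "circle", "square" (assumed name)
    kwargs : passed straight through to bokeh's scatter (color, size, ...)
    """
    # Imported inside the function so that build_tooltips can be used
    # (and tested) without bokeh installed.
    from bokeh.models import ColumnDataSource, HoverTool
    from bokeh.plotting import figure

    if fig is None:
        fig = figure()
    if cols is None:
        cols = list(df.columns)

    # A ColumnDataSource makes every DataFrame column available to the
    # hover tool, not just the two being plotted.
    source = ColumnDataSource(df)
    renderer = fig.scatter(x, y, source=source, marker=marker, **kwargs)

    # Restrict the hover tool to this renderer so other glyphs already on
    # the figure do not trigger these tooltips.
    hover = HoverTool(tooltips=build_tooltips(cols), renderers=[renderer])
    fig.add_tools(hover)
    return fig
```

With this sketch, `scatter_with_hover(df, 'A', 'B', cols=['C'])` returns a bokeh figure, which can then be displayed by passing it to `show` from `bokeh.plotting`.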