As we all know, Histograms are an extremely common way to make sense of discrete data. Whether we mean to or not, when we're using histograms, we're usually doing some form of density estimation. That is, although we only have a few discrete data points, we'd really pretend that we have some sort of continuous distribution, and we'd really like to know what that distribution is. For instance, I was recently grading an exam and trying to figure out what the underlying distribution of grades looked like, whether I should curve the exam, and, if so, how I should curve it.
I'll poke at this in an IPython Notebook; if you're doing this in a different environments, you may wish to uncomment out the commented lines so that your namespace is properly polluted.
from __future__ import division %pylab inline grades = array((93.5,93,60.8,94.5,82,87.5,91.5,99.5,86,93.5,92.5,78,76,69,94.5,89.5,92.8,78,65.5,98,98.5,92.3,95.5,76,91,95,61.4,96,90)) junk = hist(grades)
Populating the interactive namespace from numpy and matplotlib