Clearly Color Categories
As a grad student, I work a lot with satellite data. This satellite data sometimes has categorical information like whether the pixel is from a broken detector or it’s nightime in that pixel or ocean. Often this data also measures something, and both these things should go on a plot(please forgive the colors):
import matplotlib.colors as mcolors
import matplotlib.cm as mcm
import matplotlib.patches as mpatches
# Create a custom color scheme for the missing value codes
colors = ["#FFCCCC", "#FFCC99", "#CCCCCC",
"#CC9933", "#99FFFF", "#66FFFF",
"#999999", "#CC9999", "#666666"]
mcmap = mcolors.ListedColormap(colors)
mnorm = mcolors.BoundaryNorm([200, 201, 211, 225, 237,239, 250, 254,255, 260], mcmap.N)
#make proper legend for the missing data
labels = ['missing data', 'no decision', 'night',
'land', 'inland water', 'ocean',
'cloud', 'detector saturated', 'fill']
def plot_fsc(fig, ax, snow, codes):
cd = ax.imshow(codes, cmap=mcmap, norm=mnorm, interpolation='nearest')
# everything else omitted
patches = [mpatches.Patch(color=c, label=l) for c, l in zip(colors, labels)]
ax3.legend(patches, labels, ncol=1, loc=6)
Even edited such that the non-categorical stuff is omitted, that’s a lot of boilerplate. But, even worse, it’s confusing
as all out because 260 isn’t even in the data but necessary to exploit
BoundaryNorm
, which
assigns colors based on lower bound <= value < upper bound
. This is confusing enough when the categories are nominally numerical, but it’s even more problematic when the data is numerical. So my [WIP] categorical color pull request aims to simplify the hoops and make it easier to keep track of which colors are attached to what categories. The aim is to be able to create the above image with something like the following code:
import matplotlib.category as cat
codings = [('missing data', '#FFCCCC'), ('no decision', '#FFCC99'),
('night', '#CCCCCC'), ('land', '#CC9933'),
('inland water', '#99FFFF'), ('ocean', '#66FFFF'),
('cloud', '#999999'), ('detector saturated', '#CC9999'),
('fill', '#666666')]
cmap, norm = cat.colors_from_categories(codings)
fig, ax = plt.subplots()
sm = ax.imshow(data, cmap=cmap, norm=norm)
ax.legend()
There will also likely be the ability to pass labels into legend in case the data is encoded in integers, like the satellite data. The first step in this has been to develop a CategoryNorm
that maps a list of categories to colors directly and doesn’t rely on them being contiguous. This has the added bonus of allowing users to select the categories they care about without having to explicitely create a mask.