v0.6.0 (June 2015)#
This is a major release from 0.5. The main objective of this release was to unify the API for categorical plots, which means that there are some relatively large API changes in some of the older functions. See below for details of those changes, which may break code written for older versions of seaborn. There are also some new functions (stripplot() and countplot()), numerous enhancements to existing functions, and bug fixes.
Additionally, the documentation has been completely revamped and expanded for the 0.6 release. Now, the API docs page for each function has multiple examples with embedded plots showing how to use the various options. These pages should be considered the most comprehensive resource for examples, and the tutorial pages are now streamlined and oriented towards a higher-level overview of the various features.
Changes and updates to categorical plots#
In version 0.6, the “categorical” plots have been unified with a common API. This new category of functions groups together plots that show the relationship between one numeric variable and one or two categorical variables. This includes plots that show the distribution of the numeric variable in each bin (boxplot(), violinplot(), and stripplot()) and plots that apply a statistical estimation within each bin (pointplot(), barplot(), and countplot()). There is a new tutorial chapter that introduces these functions.
The categorical functions now each accept the same formats of input data and can be invoked in the same way. They can plot using long- or wide-form data, and can be drawn vertically or horizontally. When long-form data is used, the orientation of the plots is inferred from the types of the input data. Additionally, all functions natively take a hue variable to add a second layer of categorization.
With the (in some cases new) API, these functions can all be drawn correctly by FacetGrid. However, factorplot can also now create faceted versions of any of these kinds of plots, so in most cases it will be unnecessary to use FacetGrid directly. By default, factorplot draws a point plot, but this is controlled by the kind parameter.
Here are details on what has changed in the process of unifying these APIs:
Changes to boxplot() and violinplot() will probably be the most disruptive. Both functions maintain backwards-compatibility in terms of the kind of data they can accept, but the syntax has changed to be more similar to other seaborn functions. These functions are now invoked with x and/or y parameters that are either vectors of data or names of variables in a long-form DataFrame passed to the new data parameter. You can still pass wide-form DataFrames or arrays to data, but it is no longer the first positional argument. See the GitHub pull request (#410) for more information on these changes and the logic behind them.
As pointplot() and barplot() can now plot with the major categorical variable on the y axis, the x_order parameter has been renamed to order.
Added a hue argument to boxplot() and violinplot(), which allows for nested grouping of the plot elements by a third categorical variable. For violinplot(), this nesting can also be accomplished by splitting the violins when there are two levels of the hue variable (using split=True). To make this functionality feasible, the ability to specify where the plots will be drawn in data coordinates has been removed. These plots are now drawn at set positions, like (and identical to) barplot() and pointplot().
Added a palette parameter to boxplot()/violinplot(). The color parameter still exists, but no longer does double-duty in accepting the name of a seaborn palette. palette supersedes color so that it can be used with a FacetGrid.
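A short sketch of the hue, split, and palette parameters described in the list above. The data is randomly generated and the column names (day, sex, tip) are hypothetical; the example assumes only that the listed parameters behave as the notes describe.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical data: 2 days x 2 levels of the hue variable x 20 observations.
rng = np.random.RandomState(0)
df = pd.DataFrame({
    "day": np.repeat(["thu", "fri"], 40),
    "sex": np.tile(np.repeat(["f", "m"], 20), 2),
    "tip": rng.normal(loc=3.0, scale=1.0, size=80),
})

fig, (ax_box, ax_violin) = plt.subplots(1, 2, figsize=(8, 3))

# Nested grouping: one box per (day, sex) pair, colored from a named palette.
sns.boxplot(x="day", y="tip", hue="sex", data=df,
            palette="muted", ax=ax_box)

# With exactly two hue levels, split=True draws each level as one half
# of a single violin instead of two violins side by side.
sns.violinplot(x="day", y="tip", hue="sex", data=df,
               split=True, palette="muted", ax=ax_violin)
```

Note that data is passed by keyword in both calls, which matches the change that it is no longer the first positional argument.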
Along with these API changes, the following changes/enhancements were made to the plotting functions:
The default rules for ordering the categories have changed. Instead of automatically sorting the category levels, the plots now show the levels in the order they appear in the input data (i.e., the order given by Series.unique()). Order can be specified when plotting with the order and hue_order parameters. Additionally, when variables are pandas objects with a “categorical” dtype, the category order is inferred from the data object. This change also affects FacetGrid and PairGrid.
Added the scale and scale_hue parameters to violinplot(). These control how the width of the violins is scaled. The default is area, which is different from how the violins used to be drawn. Use scale='width' to get the old behavior.
Used a different style for the box kind of interior plot in violinplot(), which shows the whisker range in addition to the quartiles. Use inner='quartile' to get the old style.
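The ordering rule can be illustrated with pandas alone. This sketch only mimics the rule as described above (order of appearance, or the declared order of a categorical dtype); it does not call seaborn's internal logic, and the level names are hypothetical.

```python
import pandas as pd

# Hypothetical categorical variable, in the order the rows happen to appear.
levels = pd.Series(["low", "high", "mid", "low", "high"])

# New default: levels in order of first appearance, as given by Series.unique().
appearance_order = list(levels.unique())

# Old default: levels sorted automatically.
sorted_order = sorted(levels.unique())

# With a categorical dtype, the order declared on the data object is used.
as_categorical = pd.Series(
    pd.Categorical(levels, categories=["low", "mid", "high"], ordered=True)
)
categorical_order = list(as_categorical.cat.categories)

print(appearance_order)
print(sorted_order)
print(categorical_order)
```

Any of these lists could be passed explicitly through the order or hue_order parameters, for example order=sorted_order to recover the previous sorted layout.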
New plotting functions#
Added the stripplot() function, which draws a scatterplot where one of the variables is categorical. This plot has the same API as boxplot() and violinplot(). It is useful both on its own and when composed with one of these other plot kinds to show both the observations and underlying distribution.
Added the countplot() function, which uses a bar plot representation to show counts of variables in one or more categorical bins. This replaces the old approach of calling barplot() without a numeric variable.
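A minimal sketch of the two new functions, using a small hypothetical DataFrame (the column names group and value are assumptions made for illustration). The first axes composes stripplot() with boxplot(); the second uses countplot() with only a categorical variable.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical data: four observations in group "a", two in group "b".
df = pd.DataFrame({
    "group": ["a", "a", "a", "a", "b", "b"],
    "value": [1.0, 2.0, 2.5, 4.0, 3.0, 5.0],
})

fig, (ax_dist, ax_count) = plt.subplots(1, 2, figsize=(8, 3))

# Draw the distribution summary first, then overlay the raw observations.
sns.boxplot(x="group", y="value", data=df, color="white", ax=ax_dist)
sns.stripplot(x="group", y="value", data=df, color="black", ax=ax_dist)

# Count the rows in each category; no numeric variable is needed.
sns.countplot(x="group", data=df, ax=ax_count)
```

Because stripplot() shares the API of boxplot() and violinplot(), the overlay needs no extra position arguments: both calls place their elements at the same categorical positions.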
Other additions and changes#
The corrplot() and underlying symmatplot() functions have been deprecated in favor of heatmap(), which is much more flexible and robust. These two functions are still available in version 0.6, but they will be removed in a future version.
Added the set_color_codes() function and the color_codes argument to set() and set_palette(). This changes the interpretation of shorthand color codes (e.g. “b”, “g”, “k”, etc.) within matplotlib to use the values from one of the named seaborn palettes (e.g. “deep”, “muted”, etc.). That makes it easier to have a more uniform look when using matplotlib functions directly with seaborn imported. This could be disruptive to existing plots, so it does not happen by default. It is possible this could change in the future.
The color_palette() function no longer trims palettes that are longer than 6 colors when passed into it.
Added the as_hex method to color palette objects, to return a list of hex codes rather than rgb tuples.
jointplot() now passes additional keyword arguments to the function used to draw the plot on the joint axes.
Changed the default linewidths in heatmap() and clustermap() to 0 so that larger matrices plot correctly. This parameter still exists and can be used to get the old effect of lines demarcating each cell in the heatmap (the old default linewidths was 0.5).
heatmap() and clustermap() now automatically use a mask for missing values, which previously were shown with the “under” value of the colormap, per the default plt.pcolormesh behavior.
Added the seaborn.crayons dictionary and the crayon_palette() function to define colors from the 120 box (!) of Crayola crayons.
Added the line_kws parameter to residplot() to change the style of the lowess line, when used.
Added open-ended **kwargs to the add_legend method on FacetGrid and PairGrid, which will pass additional keyword arguments through when calling the legend function on the Figure or Axes.
Added the gridspec_kws parameter to FacetGrid, which allows for control over the size of individual facets in the grid to emphasize certain plots or account for differences in variable ranges.
The interactive palette widgets now show a continuous colorbar, rather than a discrete palette, when as_cmap is True.
The default Axes size for pairplot() and PairGrid is now slightly smaller.
Added the shade_lowest parameter to kdeplot(), which will set the alpha for the lowest contour level to 0, making it easier to plot multiple bivariate distributions on the same axes.
The height parameter of rugplot() is now interpreted as a function of the axis size and is invariant to changes in the data scale on that axis. The rug lines are also slightly narrower by default.
Added a catch in distplot() when calculating a default number of bins. For highly skewed data it will now use sqrt(n) bins, where previously the reference rule would return “infinite” bins and cause an exception in matplotlib.
Added a ceiling (50) to the default number of bins used for distplot() histograms. This will help avoid confusing errors with certain kinds of datasets that heavily violate the assumptions of the reference rule used to get a default number of bins. The ceiling is not applied when passing a specific number of bins.
The various property dictionaries that can be passed to plt.boxplot are now applied after the seaborn restyling to allow for full customizability.
Added a savefig method to JointGrid that defaults to a tight bounding box to make it easier to save figures using this class, and set a tight bbox as the default for the savefig method on other Grid objects.
You can now pass an integer to the xticklabels and yticklabels parameters of heatmap() (and, by extension, clustermap()). This will make the plot use the ticklabels inferred from the data, but only plot every nth label, where n is the number you pass. This can help when visualizing larger matrices with some sensible ordering to the rows or columns of the dataframe.
Added "figure.facecolor" to the style parameters and set the default to white.
The load_dataset() function now caches datasets locally after downloading them, and uses the local copy on subsequent calls.
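The two distplot() bin changes in the list above can be sketched in plain Python. These notes do not name the reference rule, so the sketch assumes a Freedman-Diaconis bin width; the function names reference_rule_bins and default_bins are hypothetical and are not seaborn functions.

```python
import math
import statistics

MAX_DEFAULT_BINS = 50  # ceiling on the default bin count described above


def reference_rule_bins(values):
    """Bin count from a Freedman-Diaconis width (an assumed reference rule)."""
    n = len(values)
    if n < 2:
        return 1
    q1, _, q3 = statistics.quantiles(values, n=4)
    width = 2 * (q3 - q1) / n ** (1 / 3)
    if width == 0:
        # Zero interquartile range (highly skewed data): fall back to
        # sqrt(n) bins rather than dividing by zero.
        return int(math.sqrt(n))
    return math.ceil((max(values) - min(values)) / width)


def default_bins(values, bins=None):
    """Apply the ceiling only when no explicit bin count is given."""
    if bins is not None:
        return bins
    return min(reference_rule_bins(values), MAX_DEFAULT_BINS)


# Highly skewed sample: the interquartile range is zero.
skewed = [0] * 99 + [1000]
# Sample with one extreme outlier: the rule alone asks for far too many bins.
outlier = list(range(1000)) + [100000]

print(default_bins(skewed))
print(default_bins(outlier))
print(default_bins(outlier, bins=200))
```

The first sample exercises the sqrt(n) fallback, the second the ceiling, and the third shows that an explicitly requested bin count is left unchanged.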
Bug fixes#
Fixed bugs in clustermap() where the mask and specified ticklabels were not being reorganized using the dendrograms.
Fixed a bug in FacetGrid and PairGrid that led to incorrect legend labels when levels of the hue variable appeared in hue_order but not in the data.
Fixed a bug in FacetGrid.set_xticklabels() or FacetGrid.set_yticklabels() when col_wrap is being used.
Fixed a bug in PairGrid where the hue_order parameter was ignored.
Fixed two bugs in despine() that caused errors when trying to trim the spines on plots that had inverted axes or no ticks.
Improved support for the margin_titles option in FacetGrid, which can now be used with a legend.