v0.11.0 (September 2020)#
This is a major release with several important new features, enhancements to existing functions, and changes to the library. Highlights include an overhaul and modernization of the distributions plotting functions, more flexible data specification, new colormaps, and better narrative documentation.
For an overview of the new features and a guide to updating, see this Medium post.
Required keyword arguments#
API
Most plotting functions now require all of their parameters to be specified using keyword arguments. To ease adaptation, code without keyword arguments will trigger a FutureWarning in v0.11. In a future release (v0.12 or v0.13, depending on release cadence), this will become an error. Once keyword arguments are fully enforced, the signature of the plotting functions will be reorganized to accept data as the first and only positional argument (#2052, #2081).
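As a minimal sketch of the keyword-only calling style, the following call names every parameter. The data frame and its column names are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import pandas as pd
import seaborn as sns

# Hypothetical toy data, not taken from the release notes
df = pd.DataFrame({"total": [10.0, 20.5, 15.2, 30.1], "tip": [1.0, 3.5, 2.0, 4.8]})

# Every parameter is passed by name, so the call does not depend on argument order
ax = sns.scatterplot(data=df, x="total", y="tip")
```

Writing calls this way should keep them working both before and after the signatures are reorganized.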
Modernization of distribution functions#
The distribution module has been completely overhauled, modernizing the API and introducing several new functions and features within existing functions. Some new features are explained here; the tutorial documentation has also been rewritten and serves as a good introduction to the functions.
New plotting functions#
Feature Enhancement
First, three new functions, displot(), histplot() and ecdfplot() have been added (#2157, #2125, #2141).
The figure-level displot() function is an interface to the various distribution plots (analogous to relplot() or catplot()). It can draw univariate or bivariate histograms, density curves, ECDFs, and rug plots on a FacetGrid.
The axes-level histplot() function draws univariate or bivariate histograms with a number of features, including:
mapping multiple distributions with a hue semantic
normalization to show density, probability, or frequency statistics
flexible parameterization of bin size, including proper bins for discrete variables
adding a KDE fit to show a smoothed distribution over all bin statistics
experimental support for histograms over categorical and datetime variables.
The axes-level ecdfplot() function draws univariate empirical cumulative distribution functions, using a similar interface.
Changes to existing functions#
API Feature Enhancement Defaults
Second, the existing functions kdeplot() and rugplot() have been completely overhauled (#2060, #2104).
The overhauled functions now share a common API with the rest of seaborn. They can show conditional distributions by mapping a third variable with a hue semantic, and they have been improved in numerous other ways. The GitHub pull request (#2104) has a longer explanation of the changes and the motivation behind them.
This is a necessarily API-breaking change. The parameter names for the positional variables are now x and y, and the old names have been deprecated. Efforts were made to handle and warn when using the deprecated API, but it is strongly suggested to check your plots carefully.
Additionally, the statsmodels-based computation of the KDE has been removed. Because there were some inconsistencies between the way different parameters (specifically, bw, clip, and cut) were implemented by each backend, this may cause plots to look different with non-default parameters. Support for using non-Gaussian kernels, which was available only in the statsmodels backend, has been removed.
Other new features include:
several options for representing multiple densities (using the multiple and common_norm parameters)
weighted density estimation (using the new weights parameter)
better control over the smoothing bandwidth (using the new bw_adjust parameter)
more meaningful parameterization of the contours that represent a bivariate density (using the thresh and levels parameters)
log-space density estimation (using the new log_scale parameter, or by scaling the data axis before plotting)
“bivariate” rug plots with a single function call (by assigning both x and y)
Deprecations#
API
Finally, the distplot() function is now formally deprecated. Its features have been subsumed by displot() and histplot(). Some effort was made to gradually transition distplot() by adding the features in displot() and handling backwards compatibility, but this proved to be too difficult. The similarity in the names will likely cause some confusion during the transition, which is regrettable.
Standardization and enhancements of data ingest#
Feature Enhancement Docs
The code that processes input data has been refactored and enhanced. In v0.11, this new code takes effect for the relational and distribution modules; other modules will be refactored to use it in future releases (#2071).
These changes should be transparent for most use-cases, although they allow a few new features:
Named variables for long-form data can refer to the named index of a pandas.DataFrame or to levels in the case of a multi-index. Previously, it was necessary to call pandas.DataFrame.reset_index() before using index variables (e.g., after a groupby operation).
relplot() now has the same flexibility as the axes-level functions to accept data in long- or wide-format and to accept data vectors (rather than named variables) in long-form mode.
The data parameter can now be a Python dict or an object that implements that interface. This is a new feature for wide-form data. For long-form data, it was previously supported but not documented.
A wide-form data object can have a mixture of types; the non-numeric types will be removed before plotting. Previously, this caused an error.
There are better error messages for other instances of data mis-specification.
See the new user guide chapter on data formats for more information about what is supported.
Other changes#
Documentation improvements#
Docs Added two new chapters to the user guide, one giving an overview of the types of functions in seaborn, and one discussing the different data formats that seaborn understands.
Docs Expanded the color palette tutorial to give more background on color theory and better motivate the use of color in statistical graphics.
Docs Added more information to the installation guidelines and streamlined the introduction page.
Docs Improved cross-linking within the seaborn docs and between the seaborn and matplotlib docs.
Theming#
API The set() function has been renamed to set_theme() for more clarity about what it does. For the foreseeable future, set() will remain as an alias, but it is recommended to update your code.
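Updating existing code is a one-line change, as in this short sketch. The "whitegrid" style is an arbitrary choice for the example.

```python
import matplotlib as mpl
import seaborn as sns

# Preferred spelling from v0.11 onward; sns.set(...) remains as an alias
sns.set_theme(style="whitegrid")

# The theme is applied through matplotlib's global rcParams
grid_on = mpl.rcParams["axes.grid"]
```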
Relational plots#
Enhancement Defaults Reduced some of the surprising behavior of relational plot legends when using a numeric hue or size mapping (#2229):
Added an “auto” mode (the new default) that chooses between “brief” and “full” legends based on the number of unique levels of each variable.
Modified the ticking algorithm for a “brief” legend to show up to 6 values and not to show values outside the limits of the data.
Changed the approach to the legend title: the normal matplotlib legend title is used when only one variable is assigned a semantic mapping, whereas the old approach of adding an invisible legend artist with a subtitle label is used only when multiple semantic variables are defined.
Modified legend subtitles to be left-aligned and to be drawn in the default legend title font size.
Enhancement Defaults Changed how functions that use different representations for numeric and categorical data handle vectors with an object data type. Previously, data was considered numeric if it could be coerced to a float representation without error. Now, object-typed vectors are considered numeric only when their contents are themselves numeric. As a consequence, numbers that are encoded as strings will now be treated as categorical data (#2084).
Enhancement Defaults Plots with a style semantic can now generate an infinite number of unique dashes and/or markers by default. Previously, an error would be raised if the style variable had more levels than could be mapped using the default lists. The existing defaults were slightly modified as part of this change; if you need to exactly reproduce plots from earlier versions, refer to the old defaults (#2075).
Defaults Changed how scatterplot() sets the default linewidth for the edges of the scatter points. New behavior is to scale with the point sizes themselves (on a plot-wise, not point-wise basis). This change also slightly reduces the default width when point sizes are not varied. Set linewidth=0.75 to reproduce the previous behavior (#2708).
Enhancement Improved support for datetime variables in scatterplot() and lineplot() (#2138).
Fix Fixed a bug where lineplot() did not pass the linestyle parameter down to matplotlib (#2095).
Fix Adapted to a change in matplotlib that prevented passing vectors of literal values to c and s in scatterplot() (#2079).
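The change to object-typed vectors can affect existing code that stores numbers as strings. The sketch below shows one way to opt back into a numeric mapping by converting explicitly with pandas; the "dose" column and its values are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical data where the "dose" numbers are encoded as strings
df = pd.DataFrame({
    "x": [1, 2, 3, 4],
    "y": [2, 1, 4, 3],
    "dose": ["1", "2", "10", "20"],
})

# Explicit conversion gives a numeric column
df["dose_num"] = pd.to_numeric(df["dose"])

fig, (ax_cat, ax_num) = plt.subplots(1, 2, figsize=(8, 3))

# String-encoded numbers are mapped as categories
sns.scatterplot(data=df, x="x", y="y", hue="dose", ax=ax_cat)

# The converted column gets a numeric hue mapping; legend="full" lists every level
sns.scatterplot(data=df, x="x", y="y", hue="dose_num", legend="full", ax=ax_num)
```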
Categorical plots#
Enhancement Defaults Fix Fixed a few computational issues in boxenplot() and improved its visual appearance (#2086):
Changed the default method for computing the number of boxes to k_depth="tukey", as the previous default (k_depth="proportion") is based on a heuristic that produces too many boxes for small datasets.
Added the option to specify an exact number of boxes (e.g. k_depth=6) or to plot boxes that will cover most of the data points (k_depth="full").
Added a new parameter, trust_alpha, to control the number of boxes when k_depth="trustworthy".
Changed the visual appearance of boxenplot() to more closely resemble boxplot(). Notably, thin boxes will remain visible when the edges are white.
Enhancement Allowed catplot() to use different values on the categorical axis of each facet when axis sharing is turned off (e.g. by specifying sharex=False) (#2196).
Enhancement Improved the error messages produced when categorical plots process the orientation parameter.
Enhancement Added an explicit warning in swarmplot() when more than 5% of the points overlap in the “gutters” of the swarm (#2045).
Multi-plot grids#
Feature Enhancement Defaults A few small changes to make life easier when using PairGrid (#2234):
Added public access to the legend object through the legend attribute (also affects FacetGrid).
The color and label parameters are no longer passed to the plotting functions when hue is not used.
The data is no longer converted to a numpy object before plotting on the marginal axes.
It is possible to specify only one of x_vars or y_vars, using all variables for the unspecified dimension.
The layout_pad parameter is stored and used every time you call the PairGrid.tight_layout() method.
Feature Added a tight_layout method to FacetGrid and PairGrid, which runs the matplotlib.pyplot.tight_layout() algorithm without interference from the external legend (#2073).
Feature Added the axes_dict attribute to FacetGrid for named access to the component axes (#2046).
Enhancement Made FacetGrid.set_axis_labels() clear labels from “interior” axes (#2046).
Feature Added the marginal_ticks parameter to JointGrid which, if set to True, will show ticks on the count/density axis of the marginal plots (#2210).
Enhancement Improved FacetGrid.set_titles() with margin_titles=True, such that texts representing the original row titles are removed before adding new ones (#2083).
Defaults Changed the default value for dropna to False in FacetGrid, PairGrid, JointGrid, and corresponding functions. As all or nearly all seaborn and matplotlib plotting functions handle missing data well, this option is no longer useful, but it causes problems in some edge cases. It may be deprecated in the future (#2204).
Fix Fixed a bug in PairGrid that appeared when setting corner=True and despine=False (#2203).
Color palettes#
Docs Improved and modernized the color palettes chapter of the seaborn tutorial.
Feature Added two new perceptually-uniform colormaps: “flare” and “crest”. The new colormaps are similar to “rocket” and “mako”, but their luminance range is reduced. This makes them well suited to numeric mappings of line or scatter plots, which need contrast with the axes background at the extremes (#2237).
Enhancement Defaults Enhanced numeric colormap functionality in several ways (#2237):
Added string-based access within the color_palette() interface to dark_palette(), light_palette(), and blend_palette(). This means that anywhere you specify a palette in seaborn, a name like "dark:blue" will use dark_palette() with the input "blue".
Added the as_cmap parameter to color_palette() and changed internal code that uses a continuous colormap to take this route.
Tweaked the light_palette() and dark_palette() functions to use an endpoint that is a very desaturated version of the input color, rather than a pure gray. This produces smoother ramps. To exactly reproduce previous plots, use blend_palette() with ".13" for dark or ".95" for light.
Changed diverging_palette() to have a default value of sep=1, which gives better results.
Enhancement Added a rich HTML representation to the object returned by color_palette() (#2225).
Fix Fixed the "{palette}_d" logic to modify reversed colormaps and to use the correct direction of the luminance ramp in both cases.
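The palette changes can be explored with a short sketch like the following. The color names are arbitrary, and the blend_palette() call is only one reading of the "use blend_palette() with .13" advice above, not a verified reproduction of the old ramp.

```python
import seaborn as sns
from matplotlib.colors import Colormap

# String-based access: "dark:blue" routes through dark_palette("blue")
dark_blue = sns.color_palette("dark:blue")

# One of the new perceptually uniform colormaps, requested as a continuous colormap
flare_cmap = sns.color_palette("flare", as_cmap=True)

# as_cmap also works with the string-based access
light_cmap = sns.color_palette("light:seagreen", as_cmap=True)

# Assumed way to approximate the older dark ramp: blend from a pure gray endpoint
legacy_dark = sns.blend_palette([".13", "blue"], n_colors=6)
```

Without as_cmap, color_palette() returns a discrete list of colors; with as_cmap=True it returns a matplotlib colormap object that can be passed to any cmap parameter.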
Deprecations and removals#
Enhancement Removed an optional (and undocumented) dependency on BeautifulSoup (#2190) in get_dataset_names().
API Deprecated the axlabel function; use ax.set(xlabel=, ylabel=) instead.
API Deprecated the iqr function; use scipy.stats.iqr() instead.
API Final removal of the previously-deprecated annotate method on JointGrid, along with related parameters.
API Final removal of the lvplot function (the previously-deprecated name for boxenplot()).
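The suggested replacements for the two deprecated helpers look roughly like this. The plotted values and label text are made up for the example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
from scipy import stats

fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4])

# Instead of the deprecated axlabel helper, set both labels on the axes directly
ax.set(xlabel="time", ylabel="distance")

# Instead of the deprecated iqr helper, use scipy's interquartile range
spread = stats.iqr([1, 2, 3, 4, 5])
```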