Cache Filtering#
In many cases you will want to choose what to cache instead of caching everything. By
default, all read-only (GET and HEAD) requests with a 200 response code are cached. A
few options are available to modify this behavior.
Note
When using CachedSession, any requests that you don’t want to cache can also be made
with a regular requests.Session object, or wrapper functions like requests.get(), etc.
Filter by HTTP Methods#
To cache additional HTTP methods, specify them with allowable_methods:
>>> session = CachedSession(allowable_methods=('GET', 'POST'))
>>> session.post('https://httpbin.org/post', json={'param': 'value'})
For example, some APIs use the POST method to request data via a JSON-formatted request body, for
requests that may exceed the max size of a GET request. You may also want to cache POST requests
to ensure you don’t send the exact same data multiple times.
Filter by Status Codes#
To cache additional status codes, specify them with allowable_codes:
>>> session = CachedSession(allowable_codes=(200, 418))
>>> session.get('https://httpbin.org/teapot')
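As a rough mental model, the method and status code filters can be thought of as two membership checks that must both pass. The sketch below is a simplified illustration only, not the library's actual implementation; the function name passes_default_filters is hypothetical, and the defaults mirror the behavior described at the top of this page.

```python
from typing import Collection


def passes_default_filters(
    method: str,
    status_code: int,
    allowable_methods: Collection[str] = ('GET', 'HEAD'),
    allowable_codes: Collection[int] = (200,),
) -> bool:
    """Hypothetical helper: True if a response would pass both basic filters."""
    # Both conditions must hold: an allowed method AND an allowed status code
    return method.upper() in allowable_methods and status_code in allowable_codes


# With the defaults, a POST request or a 418 response is not cached
print(passes_default_filters('POST', 200))
print(passes_default_filters('GET', 418))

# Widening either option lets the corresponding responses through
print(passes_default_filters('POST', 200, allowable_methods=('GET', 'POST')))
print(passes_default_filters('GET', 418, allowable_codes=(200, 418)))
```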
Filter by URLs#
You can use URL patterns to define an allowlist for selective caching, by
using an expiration value of requests_cache.DO_NOT_CACHE for non-matching request URLs:
>>> from requests_cache import DO_NOT_CACHE, NEVER_EXPIRE, CachedSession
>>> urls_expire_after = {
... '*.site_1.com': 30,
... 'site_2.com/static': NEVER_EXPIRE,
... '*': DO_NOT_CACHE,
... }
>>> session = CachedSession(urls_expire_after=urls_expire_after)
Note that the catch-all rule above ('*') will behave the same as setting the session-level
expiration to 0:
>>> urls_expire_after = {'*.site_1.com': 30, 'site_2.com/static': -1}
>>> session = CachedSession(urls_expire_after=urls_expire_after, expire_after=0)
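To make the allowlist idea concrete, here is a rough approximation of how an ordered pattern table might be resolved for a given URL. It is a sketch under several assumptions, not the library's real matching code: it assumes the URL scheme is ignored, that a pattern also matches anything beneath it, and that the first matching pattern wins. The lookup_expiration function is hypothetical, and the DO_NOT_CACHE value here is a local stand-in rather than the real constant.

```python
from fnmatch import fnmatchcase

# Local stand-ins for illustration; the real constants live in requests_cache
DO_NOT_CACHE = 'do-not-cache'
NEVER_EXPIRE = -1


def lookup_expiration(url: str, urls_expire_after: dict, default=NEVER_EXPIRE):
    """Hypothetical helper: return the expiration of the first matching pattern."""
    # Assumption: patterns are compared against the URL without its scheme
    target = url.split('://', 1)[-1]
    for pattern, expire_after in urls_expire_after.items():
        # Assumption: a pattern matches itself and any path beneath it
        if fnmatchcase(target, pattern) or fnmatchcase(target, pattern.rstrip('/') + '/*'):
            return expire_after
    return default


rules = {
    '*.site_1.com': 30,
    'site_2.com/static': NEVER_EXPIRE,
    '*': DO_NOT_CACHE,
}

print(lookup_expiration('https://api.site_1.com/data', rules))
print(lookup_expiration('https://site_2.com/static/logo.png', rules))
print(lookup_expiration('https://example.com/page', rules))
```

Because dictionaries preserve insertion order, placing the catch-all '*' last means it only applies to URLs that no earlier pattern matched, which is what turns the table into an allowlist.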
Custom Cache Filtering#
If you need more advanced behavior for choosing what to cache, you can provide a custom filtering
function via the filter_fn param. This can be any function that takes a requests.Response
object and returns a boolean indicating whether or not that response
should be cached. It will be applied to both new responses (on write) and previously cached
responses (on read):
>>> from sys import getsizeof
>>> from requests import Response
>>> from requests_cache import CachedSession
>>> def filter_by_size(response: Response) -> bool:
...     """Don't cache responses with a body over 1 MB"""
...     return getsizeof(response.content) <= 1024 * 1024
>>> session = CachedSession(filter_fn=filter_by_size)
Note
filter_fn() will be used in addition to other filtering options.
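As another illustration of a custom filter, the sketch below only allows JSON responses to be cached, based on the Content-Type header. The function name filter_json_only is hypothetical, and the SimpleNamespace objects are stand-ins for real requests.Response objects so the example can run without making any requests.

```python
from types import SimpleNamespace


def filter_json_only(response) -> bool:
    """Only cache responses whose Content-Type is JSON"""
    content_type = response.headers.get('Content-Type', '')
    # Drop parameters such as "; charset=utf-8" before comparing
    return content_type.split(';')[0].strip().lower() == 'application/json'


# Stand-in objects exposing only the attribute the filter reads
json_response = SimpleNamespace(headers={'Content-Type': 'application/json; charset=utf-8'})
html_response = SimpleNamespace(headers={'Content-Type': 'text/html'})

print(filter_json_only(json_response))
print(filter_json_only(html_response))
```

A function like this could be passed as CachedSession(filter_fn=filter_json_only). Since filter_fn is applied in addition to the other options, a JSON response would still need an allowed method and status code to be cached.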