gevent.threadpool - A pool of native threads

- class ThreadPool(maxsize, hub=None, idle_task_timeout=-1)

Bases: GroupMappingMixin

A pool of native worker threads.
This can be useful for CPU-intensive functions, or those that otherwise will not cooperate with gevent. The best functions to execute in a thread pool are small, single-purpose functions that ideally release the CPython GIL; such functions are typically extension functions implemented in C.

It implements the same operations as a gevent.pool.Pool, but using threads instead of greenlets.

Note: The method apply_async() will always return a new greenlet, bypassing the threadpool entirely.

Most users will not need to create instances of this class. Instead, use the threadpool already associated with gevent's hub:

    pool = gevent.get_hub().threadpool
    result = pool.spawn(lambda: "Some func").get()
Important: It is only possible to use instances of this class from the thread running their hub. Typically that means from the thread that created them. Using the pattern shown above takes care of this.
There is no gevent-provided way to have a single process-wide limit on the number of threads in various pools when doing that, however. The suggested way to use gevent and threadpools is to have a single gevent hub and its one threadpool (which is the default without doing any extra work). Only dispatch minimal blocking functions to the threadpool, functions that do not use the gevent hub.
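The recommended pattern above can be sketched as follows. Here `blocking_square` is a hypothetical stand-in for the kind of minimal blocking function, one that does not use the gevent hub, that the docs suggest dispatching to the pool:

```python
import time

from gevent import get_hub

def blocking_square(n):
    # Stands in for blocking work (a C extension call, a syscall, ...)
    # that does not touch the gevent hub.
    time.sleep(0.01)
    return n * n

# Use the single threadpool already associated with gevent's hub,
# rather than creating a new ThreadPool instance.
pool = get_hub().threadpool
result = pool.spawn(blocking_square, 7).get()
```

Because there is one hub (and thus one pool) per process by default, this also keeps a single process-wide bound on the number of threads.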
The len of instances of this class is the number of enqueued (unfinished) tasks.

Just before a task starts running in a worker thread, the values of threading.setprofile() and threading.settrace() are consulted. Any values there are installed in that thread for the duration of the task (using sys.setprofile() and sys.settrace(), respectively). Because worker threads are long-lived and outlast any given task, this arrangement lets the hook functions change between tasks, but does not let them see the bookkeeping done by the worker thread itself.

Caution: Instances of this class are only true if they have unfinished tasks.

Changed in version 1.5a3: The undocumented apply_e function, deprecated since 1.1, was removed.

Changed in version 20.12.0: Install the profile and trace functions in the worker thread while the worker thread is running the supplied task.

Changed in version 22.08.0: Added the option to let idle threads expire and be removed from the pool after idle_task_timeout seconds (-1 for no timeout).
- apply(func, args=None, kwds=None)

Rough equivalent of the apply() builtin function, blocking until the result is ready and returning it.

The func will usually, but not always, be run in a way that allows the current greenlet to switch out (for example, in a new greenlet or thread, depending on implementation). But if the current greenlet or thread is already one that was spawned by this pool, the pool may choose to immediately run the func synchronously.

Any exception func raises will be propagated to the caller of apply (that is, this method will raise the exception that func raised).

Note: As implemented, attempting to use ThreadPool.apply() from inside another function that was itself spawned in a threadpool (any threadpool) will cause the function to be run immediately.

Changed in version 1.1a2: Now raises any exception raised by func instead of dropping it.
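A minimal sketch of the blocking call and the exception propagation described above, using the hub's default pool (the `divide` helper is illustrative):

```python
from gevent import get_hub

pool = get_hub().threadpool

def divide(a, b):
    return a / b

# apply() blocks the calling greenlet until the worker thread finishes,
# then returns the function's result.
quotient = pool.apply(divide, args=(10, 4))

# Exceptions raised by func are re-raised in the caller.
caught = False
try:
    pool.apply(divide, args=(1, 0))
except ZeroDivisionError:
    caught = True
```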
- apply_async(func, args=None, kwds=None, callback=None)

A variant of the apply() method which returns a Greenlet object.

When the returned greenlet gets to run, it will call apply(), passing in func, args and kwds.

If callback is specified, then it should be a callable which accepts a single argument. When the result becomes ready, callback is applied to it (unless the call failed).

This method will never block, even if this group is full (that is, even if spawn() would block, this method will not).

Caution: The returned greenlet may or may not be tracked as part of this group, so joining this group is not a reliable way to wait for the results to be available or for the returned greenlet to run; instead, join the returned greenlet.

Tip: Because ThreadPool objects do not track greenlets, the returned greenlet will never be a part of it. To reduce overhead and improve performance, Group and Pool may choose to track the returned greenlet. These are implementation details that may change.
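The caution above can be sketched like this: wait on the greenlet that apply_async() returns, not on the pool itself (a hypothetical minimal example):

```python
from gevent import get_hub

pool = get_hub().threadpool
results = []

# apply_async() returns immediately with a Greenlet; nothing has run yet.
glet = pool.apply_async(sum, args=([1, 2, 3],), callback=results.append)

# Per the caution: join the returned greenlet, not the pool,
# to wait for the result (the callback receives it when ready).
glet.join()
```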
- apply_cb(func, args=None, kwds=None, callback=None)

apply() the given func(*args, **kwds), and, if a callback is given, run it with the results of func (unless an exception was raised).

The callback may be called synchronously or asynchronously. If called asynchronously, it will not be tracked by this group. (Group and Pool call it asynchronously in a new greenlet; ThreadPool calls it synchronously in the current greenlet.)
- imap(func, *iterables, maxsize=None) → iterable

An equivalent of itertools.imap(), operating in parallel. The func is applied to each element yielded from each iterable in iterables in turn, collecting the result.

If this object has a bound on the number of active greenlets it can contain (such as Pool), then at most that number of tasks will operate in parallel.

Parameters: maxsize (int) – If given and not-None, specifies the maximum number of finished results that will be allowed to accumulate awaiting the reader; more than that number of results will cause map function greenlets to begin to block. This is most useful when there is a great disparity between the speed of the mapping code and the speed of the consumer, and the results consume a great deal of resources.

Note: This is separate from any bound on the number of active parallel tasks, though they may have some interaction (for example, limiting the number of parallel tasks to the smallest bound).

Note: Using a bound is slightly more computationally expensive than not using a bound.

Tip: The imap_unordered() method makes much better use of this parameter. Some additional, unspecified, number of objects may be required to be kept in memory to maintain order by this function.

Returns: An iterable object.

Changed in version 1.1b3: Added the maxsize keyword parameter.

Changed in version 1.1a1: Accept multiple iterables to iterate in parallel.
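A short sketch of imap() with the maxsize bound described above (the `square` helper is illustrative):

```python
from gevent import get_hub

pool = get_hub().threadpool

def square(n):
    return n * n

# Results arrive in input order; maxsize=2 limits how many finished
# results may accumulate ahead of the consumer.
squares = list(pool.imap(square, range(5), maxsize=2))
```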
- imap_unordered(func, *iterables, maxsize=None) → iterable

The same as imap(), except that the results from the returned iterator may arrive in arbitrary order.

This is lighter weight than imap() and should be preferred if order doesn't matter.

See also: imap() for more details.
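Because ordering is arbitrary, results from imap_unordered() are best compared as a set, as in this small sketch (the `cube` helper is illustrative):

```python
from gevent import get_hub

pool = get_hub().threadpool

def cube(n):
    return n ** 3

# Results may arrive in any order, so collect them into a set.
cubes = set(pool.imap_unordered(cube, [1, 2, 3]))
```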
- map(func, iterable)

Return a list made by applying the func to each element of the iterable.
- map_async(func, iterable, callback=None)#
A variant of the map() method which returns a Greenlet object that is executing the map function.
If callback is specified then it should be a callable which accepts a single argument.
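A minimal sketch of map_async(): the returned greenlet runs the map, and the callback (if given) receives the whole result list:

```python
from gevent import get_hub

pool = get_hub().threadpool
collected = []

# map_async() returns a Greenlet executing the map function;
# the callback receives the complete result list when it finishes.
glet = pool.map_async(str.upper, ["a", "b"], callback=collected.append)
glet.join()
```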
- spawn(func, *args, **kwargs)

Add a new task to the threadpool that will run func(*args, **kwargs).

Waits until a slot is available. Creates a new native thread if necessary.

This must only be called from the native thread that owns this object's hub. This is because creating the necessary data structures to communicate back to this thread isn't thread safe, so the hub must not be running something else. Also, ensuring the pool size stays correct only works within a single thread.

Raises: InvalidThreadUseError – If called from a different thread.

Changed in version 1.5: Document the thread-safety requirements.
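A small sketch of spawn() on a dedicated pool, created and used from the thread running its hub as required above:

```python
from gevent.threadpool import ThreadPool

# A dedicated pool of at most 2 worker threads.
pool = ThreadPool(2)

task = pool.spawn(pow, 2, 10)
# get() blocks only the current greenlet while the worker thread runs.
value = task.get()
pool.join()
```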
- property maxsize
The maximum allowed number of worker threads.
This is also (approximately) a limit on the number of tasks that can be queued without blocking the waiting greenlet. If this many tasks are already running, then the next greenlet that submits a task will block waiting for a task to finish.
- class ThreadPoolExecutor(*args, **kwargs)

Bases: ThreadPoolExecutor

A version of concurrent.futures.ThreadPoolExecutor that always uses native threads, even when threading is monkey-patched.

The Future objects returned from this object can be used with gevent waiting primitives like gevent.wait().

Caution: If threading is not monkey-patched, then the Future objects returned by this object are not guaranteed to work with as_completed() and wait(). The individual blocking methods like result() and exception() will always work.

New in version 1.2a1: This is a provisional API.

Takes the same arguments as concurrent.futures.ThreadPoolExecutor, which vary between Python versions. The first argument is always max_workers, the maximum number of threads to use. Most other arguments, while accepted, are ignored.
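A minimal sketch of the executor: submit() returns a Future, and the individual blocking methods such as result() work whether or not threading is monkey-patched:

```python
from gevent.threadpool import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=2)
future = executor.submit(sorted, [3, 1, 2])

# result() blocks until the worker thread finishes; per the caution
# above, this works with or without monkey-patching.
ordered = future.result()
executor.shutdown()
```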
- kill(wait=True, **kwargs)

Clean up the resources associated with the Executor.

It is safe to call this method several times; once it has been called, no other methods may be called.

Args:
    wait: If True, then shutdown will not return until all running futures have finished executing and the resources used by the executor have been reclaimed.
    cancel_futures: If True, then shutdown will cancel all pending futures. Futures that are completed or running will not be cancelled.
- shutdown(wait=True, **kwargs)

Clean up the resources associated with the Executor.

It is safe to call this method several times; once it has been called, no other methods may be called.

Args:
    wait: If True, then shutdown will not return until all running futures have finished executing and the resources used by the executor have been reclaimed.
    cancel_futures: If True, then shutdown will cancel all pending futures. Futures that are completed or running will not be cancelled.