I/O in Trio¶

The abstract Stream API¶

Trio provides a set of abstract base classes that define a standard interface for unidirectional and bidirectional byte streams.

Why is this useful? Because it lets you write generic protocol implementations that can work over arbitrary transports, and easily create complex transport configurations. Here are some examples:

trio.SocketStream wraps a raw socket (like a TCP connection over the network), and converts it to the standard stream interface.
trio.SSLStream is a “stream adapter” that can take any object that implements the trio.abc.Stream interface, and convert it into an encrypted stream. In Trio the standard way to speak SSL over the network is to wrap an SSLStream around a SocketStream.
If you spawn a subprocess, you can get a SendStream that lets you write to its stdin, and a ReceiveStream that lets you read from its stdout. If for some reason you wanted to speak SSL to a subprocess, you could use a StapledStream to combine its stdin/stdout into a single bidirectional Stream, and then wrap that in an SSLStream:
```
ssl_context = ssl.create_default_context()
ssl_context.check_hostname = False
s = SSLStream(StapledStream(process.stdin, process.stdout), ssl_context)
```

It sometimes happens that you want to connect to an HTTPS server, but you have to go through a web proxy… and the proxy also uses HTTPS. So you end up having to do SSL-on-top-of-SSL. In Trio this is trivial – just wrap your first SSLStream in a second SSLStream:

# Get a raw SocketStream connection to the proxy:
s0 = await open_tcp_stream("proxy", 443)

# Set up SSL connection to proxy:
s1 = SSLStream(s0, proxy_ssl_context, server_hostname="proxy")
# Request a connection to the website
await s1.send_all(b"CONNECT website:443 HTTP/1.0\r\n\r\n")
await check_CONNECT_response(s1)

# Set up SSL connection to the real website. Notice that s1 is
# already an SSLStream object, and here we're wrapping a second
# SSLStream object around it.
s2 = SSLStream(s1, website_ssl_context, server_hostname="website")
# Make our request
await s2.send_all(b"GET /index.html HTTP/1.0\r\n\r\n")
...

The trio.testing module provides a set of flexible in-memory stream object implementations, so if you have a protocol implementation to test then you can start two tasks, set up a virtual “socket” connecting them, and then do things like inject random-but-repeatable delays into the connection.

Abstract base classes¶

Overview: abstract base classes for I/O¶
| Abstract base class | Inherits from… | Adds these abstract methods… | And these concrete methods. | Example implementations |
| --- | --- | --- | --- | --- |
| `AsyncResource` | | `aclose()` | `__aenter__`, `__aexit__` | Asynchronous file objects |
| `SendStream` | `AsyncResource` | `send_all()`, `wait_send_all_might_not_block()` | | `MemorySendStream` |
| `ReceiveStream` | `AsyncResource` | `receive_some()` | `__aiter__`, `__anext__` | `MemoryReceiveStream` |
| `Stream` | `SendStream`, `ReceiveStream` | | | `SSLStream` |
| `HalfCloseableStream` | `Stream` | `send_eof()` | | `SocketStream`, `StapledStream` |
| `Listener` | `AsyncResource` | `accept()` | | `SocketListener`, `SSLListener` |
| `SendChannel` | `AsyncResource` | `send()` | | `MemorySendChannel` |
| `ReceiveChannel` | `AsyncResource` | `receive()` | `__aiter__`, `__anext__` | `MemoryReceiveChannel` |
| `Channel` | `SendChannel`, `ReceiveChannel` | | | |

class trio.abc.AsyncResource¶

A standard interface for resources that need to be cleaned up, and where that cleanup may require blocking operations.

This class distinguishes between “graceful” closes, which may perform I/O and thus block, and a “forceful” close, which cannot. For example, cleanly shutting down a TLS-encrypted connection requires sending a “goodbye” message; but if a peer has become non-responsive, then sending this message might block forever, so we may want to just drop the connection instead. Therefore the aclose() method is unusual in that it should always close the connection (or at least make its best attempt) even if it fails; failure indicates a failure to achieve grace, not a failure to close the connection.

Objects that implement this interface can be used as async context managers, i.e., you can write:

async with create_resource() as some_async_resource:
    ...

Entering the context manager is synchronous (not a checkpoint); exiting it calls aclose(). The default implementations of __aenter__ and __aexit__ should be adequate for all subclasses.

abstractmethod await aclose()¶

Close this resource, possibly blocking.

IMPORTANT: This method may block in order to perform a “graceful” shutdown. But, if this fails, then it still must close any underlying resources before returning. An error from this method indicates a failure to achieve grace, not a failure to close the connection.

For example, suppose we call aclose() on a TLS-encrypted connection. This requires sending a “goodbye” message; but if the peer has become non-responsive, then our attempt to send this message might block indefinitely, until it eventually times out and is cancelled. In this case the aclose() method on SSLStream will immediately close the underlying transport stream using trio.aclose_forcefully() before raising Cancelled.

If the resource is already closed, then this method should silently succeed.

Once this method completes, any other pending or future operations on this resource should generally raise ClosedResourceError, unless there’s a good reason to do otherwise.

Generic stream tools¶

Trio currently provides a generic helper for writing servers that listen for connections using one or more Listeners, and a generic utility class for working with streams. And if you want to test code that’s written against the streams interface, you should also check out Streams in trio.testing.

await trio.serve_listeners(handler, listeners, *, handler_nursery=None, task_status=TASK_STATUS_IGNORED)¶

Listen for incoming connections on listeners, and for each one start a task running handler(stream).

Warning

If handler raises an exception, then this function doesn’t do anything special to catch it – so by default the exception will propagate out and crash your server. If you don’t want this, then catch exceptions inside your handler, or use a handler_nursery object that responds to exceptions in some other way.

Parameters

handler – An async callable, that will be invoked like handler_nursery.start_soon(handler, stream) for each incoming connection.
listeners – A list of Listener objects. serve_listeners() takes responsibility for closing them.
handler_nursery – The nursery used to start handlers, or any object with a start_soon method. If None (the default), then serve_listeners() will create a new nursery internally and use that.
task_status – This function can be used with nursery.start, which will return listeners.

Returns

This function never returns unless cancelled.

Resource handling:

If handler neglects to close the stream, then it will be closed using trio.aclose_forcefully().

Error handling:

Most errors coming from accept() are allowed to propagate out (crashing the server in the process). However, some errors – those which indicate that the server is temporarily overloaded – are handled specially. These are OSErrors with one of the following errnos:

EMFILE: process is out of file descriptors

ENFILE: system is out of file descriptors

ENOBUFS, ENOMEM: the kernel hit some sort of memory limitation when trying to create a socket object

When serve_listeners() gets one of these errors, then it:

Logs the error to the standard library logger trio.serve_listeners (level = ERROR, with exception information included). By default this causes it to be printed to stderr.

Waits 100 ms before calling accept again, in hopes that the system will recover.

class trio.StapledStream(send_stream, receive_stream)¶

Bases: trio.abc.HalfCloseableStream

This class staples together two unidirectional streams to make a single bidirectional stream.

Parameters

send_stream (SendStream) – The stream to use for sending.
receive_stream (ReceiveStream) – The stream to use for receiving.

Example

A silly way to make a stream that echoes back whatever you write to it:

left, right = trio.testing.memory_stream_pair()
echo_stream = StapledStream(left, right)
await echo_stream.send_all(b"x")
assert await echo_stream.receive_some() == b"x"

StapledStream objects implement the methods in the HalfCloseableStream interface. They also have two additional public attributes:

send_stream¶: The underlying SendStream. send_all() and wait_send_all_might_not_block() are delegated to this object.

receive_stream¶: The underlying ReceiveStream. receive_some() is delegated to this object.

await aclose()¶: Calls aclose on both underlying streams.

await receive_some(max_bytes=None)¶: Calls self.receive_stream.receive_some.

await send_all(data)¶: Calls self.send_stream.send_all.

await send_eof()¶

Shuts down the send side of the stream.

If self.send_stream.send_eof exists, then calls it. Otherwise, calls self.send_stream.aclose().

await wait_send_all_might_not_block()¶: Calls self.send_stream.wait_send_all_might_not_block.

Sockets and networking¶

The high-level network interface is built on top of our stream abstraction.

await trio.open_tcp_stream(host, port, *, happy_eyeballs_delay=0.25)¶

Connect to the given host and port over TCP.

If the given host has multiple IP addresses associated with it, then we have a problem: which one do we use?

One approach would be to attempt to connect to the first one, and then if that fails, attempt to connect to the second one … until we’ve tried all of them. But the problem with this is that if the first IP address is unreachable (for example, because it’s an IPv6 address and our network discards IPv6 packets), then we might end up waiting tens of seconds for the first connection attempt to timeout before we try the second address.

Another approach would be to attempt to connect to all of the addresses at the same time, in parallel, and then use whichever connection succeeds first, abandoning the others. This would be fast, but create a lot of unnecessary load on the network and the remote server.

This function strikes a balance between these two extremes: it works its way through the available addresses one at a time, like the first approach; but, if happy_eyeballs_delay seconds have passed and it’s still waiting for an attempt to succeed or fail, then it gets impatient and starts the next connection attempt in parallel. As soon as any one connection attempt succeeds, all the other attempts are cancelled. This avoids unnecessary load because most connections will succeed after just one or two attempts, but if one of the addresses is unreachable then it doesn’t slow us down too much.

This is known as a “happy eyeballs” algorithm, and our particular variant is modelled after how Chrome connects to webservers; see RFC 6555 for more details.

Parameters

host (str or bytes) – The host to connect to. Can be an IPv4 address, IPv6 address, or a hostname.
port (int) – The port to connect to.
happy_eyeballs_delay (float) – How many seconds to wait for each connection attempt to succeed or fail before getting impatient and starting another one in parallel. Set to math.inf if you want to limit to only one connection attempt at a time (like socket.create_connection()). Default: 0.25 (250 ms).

Returns

A Stream connected to the given server.

Return type

SocketStream

Raises

OSError – if the connection fails.
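
The staggered start schedule described above can be modelled with a few lines of plain Python. This is a simplified model, not Trio's implementation: attempt_start_times is a made-up helper, and it assumes every attempt eventually fails after the given number of seconds, so it only shows when each attempt would begin.

```python
import math


def attempt_start_times(attempt_durations, happy_eyeballs_delay=0.25):
    """Return the time (in seconds) at which each connection attempt starts.

    attempt_durations[i] is how long attempt i takes to fail. The next
    attempt starts as soon as the previous one fails, or after
    happy_eyeballs_delay seconds, whichever comes first.
    """
    start_times = []
    next_start = 0.0
    for duration in attempt_durations:
        start_times.append(next_start)
        next_start += min(duration, happy_eyeballs_delay)
    return start_times


# First address hangs for 10 s, second fails quickly, third hangs for 5 s.
staggered = attempt_start_times([10.0, 0.125, 5.0])
# With an infinite delay the attempts run strictly one after another.
sequential = attempt_start_times([10.0, 0.125, 5.0], math.inf)
```

In this model the staggered schedule reaches the third address after 0.375 seconds, whereas the sequential schedule only gets there after more than 10 seconds.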

SSL / TLS support¶

Trio provides SSL/TLS support based on the standard library ssl module. Trio’s SSLStream and SSLListener take their configuration from a ssl.SSLContext, which you can create using ssl.create_default_context() and customize using the other constants and functions in the ssl module.

Warning

Avoid instantiating ssl.SSLContext directly. A newly constructed SSLContext has less secure defaults than one returned by ssl.create_default_context(), dramatically so before Python 3.6.
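
For reference, the following standard-library-only sketch shows the kind of defaults ssl.create_default_context() gives you. The variable names are arbitrary; either context could then be passed as the ssl_context argument described below.

```python
import ssl

# Client-side context: verifies the server certificate and its hostname.
client_context = ssl.create_default_context()

# Server-side context: by default it does not ask clients for a certificate.
# A real server would also call load_cert_chain() with its own certificate.
server_context = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)

client_checks_hostname = client_context.check_hostname
client_verify_mode = client_context.verify_mode
server_verify_mode = server_context.verify_mode
```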

Instead of using ssl.SSLContext.wrap_socket(), you create a SSLStream:

class trio.SSLStream(transport_stream, ssl_context, *, server_hostname=None, server_side=False, https_compatible=False, max_refill_bytes='unused and deprecated')¶

Bases: trio.abc.Stream

Encrypted communication using SSL/TLS.

SSLStream wraps an arbitrary Stream, and allows you to perform encrypted communication over it using the usual Stream interface. You pass regular data to send_all(), then it encrypts it and sends the encrypted data on the underlying Stream; receive_some() takes encrypted data out of the underlying Stream and decrypts it before returning it.

You should read the standard library’s ssl documentation carefully before attempting to use this class, and probably other general documentation on SSL/TLS as well. SSL/TLS is subtle and quick to anger. Really. I’m not kidding.

Parameters

transport_stream (Stream) – The stream used to transport encrypted data. Required.
ssl_context (SSLContext) – The SSLContext used for this connection. Required. Usually created by calling ssl.create_default_context().
server_hostname (str or None) – The name of the server being connected to. Used for SNI and for validating the server’s certificate (if hostname checking is enabled). This is effectively mandatory for clients, and actually mandatory if ssl_context.check_hostname is True.
server_side (bool) – Whether this stream is acting as a client or server. Defaults to False, i.e. client mode.
https_compatible (bool) –
There are two versions of SSL/TLS commonly encountered in the wild: the standard version, and the version used for HTTPS (HTTP-over-SSL/TLS).

Standard-compliant SSL/TLS implementations always send a cryptographically signed close_notify message before closing the connection. This is important because if the underlying transport were simply closed, then there wouldn’t be any way for the other side to know whether the connection was intentionally closed by the peer that they negotiated a cryptographic connection to, or by some man-in-the-middle attacker who can’t manipulate the cryptographic stream, but can manipulate the transport layer (a so-called “truncation attack”).

However, this part of the standard is widely ignored by real-world HTTPS implementations, which means that if you want to interoperate with them, then you NEED to ignore it too.

Fortunately this isn’t as bad as it sounds, because the HTTP protocol already includes its own equivalent of close_notify, so doing this again at the SSL/TLS level is redundant. But not all protocols do! Therefore, by default Trio implements the safer standard-compliant version (https_compatible=False). But if you’re speaking HTTPS or some other protocol where close_notifys are commonly skipped, then you should set https_compatible=True; with this setting, Trio will neither expect nor send close_notify messages.

If you have code that was written to use ssl.SSLSocket and now you’re porting it to Trio, then it may be useful to know that a difference between SSLStream and ssl.SSLSocket is that SSLSocket implements the https_compatible=True behavior by default.

transport_stream¶

The underlying transport stream that was passed to __init__. An example of when this would be useful is if you’re using SSLStream over a SocketStream and want to call the SocketStream’s setsockopt() method.

Type: trio.abc.Stream

Internally, this class is implemented using an instance of ssl.SSLObject, and all of SSLObject’s methods and attributes are re-exported as methods and attributes on this class. However, there is one difference: SSLObject has several methods that return information about the encrypted connection, like cipher() or selected_alpn_protocol(). If you call them before the handshake, when they can’t possibly return useful data, then ssl.SSLObject returns None, but trio.SSLStream raises NeedHandshakeError.

This also means that if you register a SNI callback using sni_callback, then the first argument your callback receives will be a ssl.SSLObject.

await aclose()¶

Gracefully shut down this connection, and close the underlying transport.

If https_compatible is False (the default), then this attempts to first send a close_notify and then close the underlying stream by calling its aclose() method.

If https_compatible is set to True, then this simply closes the underlying stream and marks this stream as closed.

await do_handshake()¶

Ensure that the initial handshake has completed.

The SSL protocol requires an initial handshake to exchange certificates, select cryptographic keys, and so forth, before any actual data can be sent or received. You don’t have to call this method; if you don’t, then SSLStream will automatically perform the handshake as needed, the first time you try to send or receive data. But if you want to trigger it manually – for example, because you want to look at the peer’s certificate before you start talking to them – then you can call this method.

If the initial handshake is already in progress in another task, this waits for it to complete and then returns.

If the initial handshake has already completed, this returns immediately without doing anything (except executing a checkpoint).

Warning

If this method is cancelled, then it may leave the SSLStream in an unusable state. If this happens then any future attempt to use the object will raise trio.BrokenResourceError.

await receive_some(max_bytes=None)¶

Read some data from the underlying transport, decrypt it, and return it.

See trio.abc.ReceiveStream.receive_some() for details.

Warning

If this method is cancelled while the initial handshake or a renegotiation is in progress, then it may leave the SSLStream in an unusable state. If this happens then any future attempt to use the object will raise trio.BrokenResourceError.

await send_all(data)¶

Encrypt some data and then send it on the underlying transport.

See trio.abc.SendStream.send_all() for details.

Warning

If this method is cancelled, then it may leave the SSLStream in an unusable state. If this happens then any future attempt to use the object will raise trio.BrokenResourceError.

await unwrap()¶

Cleanly close down the SSL/TLS encryption layer, allowing the underlying stream to be used for unencrypted communication.

You almost certainly don’t need this.

Returns: A pair (transport_stream, trailing_bytes), where transport_stream is the underlying transport stream, and trailing_bytes is a byte string. Since SSLStream doesn’t necessarily know where the end of the encrypted data will be, it can happen that it accidentally reads too much from the underlying stream. trailing_bytes contains this extra data; you should process it as if it was returned from a call to transport_stream.receive_some(...).

await wait_send_all_might_not_block()¶: See trio.abc.SendStream.wait_send_all_might_not_block().

And if you’re implementing a server, you can use SSLListener:

class trio.SSLListener(transport_listener, ssl_context, *, https_compatible=False, max_refill_bytes='unused and deprecated')¶

Bases: trio.abc.Listener

A Listener for SSL/TLS-encrypted servers.

SSLListener wraps around another Listener, and converts all incoming connections to encrypted connections by wrapping them in a SSLStream.

Parameters

transport_listener (Listener) – The listener whose incoming connections will be wrapped in SSLStream.
ssl_context (SSLContext) – The SSLContext that will be used for incoming connections.
https_compatible (bool) – Passed on to SSLStream.

transport_listener¶

The underlying listener that was passed to __init__.

Type: trio.abc.Listener

await accept()¶

Accept the next connection and wrap it in an SSLStream.

See trio.abc.Listener.accept() for details.

await aclose()¶: Close the transport listener.

Some methods on SSLStream raise NeedHandshakeError if you call them before the handshake completes:

exception trio.NeedHandshakeError¶: Some SSLStream methods can’t return any meaningful data until after the handshake. If you call them before the handshake, they raise this error.

Low-level networking with `trio.socket`¶

The trio.socket module provides Trio’s basic low-level networking API. If you’re doing ordinary things with stream-oriented connections over IPv4/IPv6/Unix domain sockets, then you probably want to stick to the high-level API described above. If you want to use UDP, or exotic address families like AF_BLUETOOTH, or otherwise get direct access to all the quirky bits of your system’s networking API, then you’re in the right place.

Top-level exports¶

Generally, the API exposed by trio.socket mirrors that of the standard library socket module. Most constants (like SOL_SOCKET) and simple utilities (like inet_aton()) are simply re-exported unchanged. But there are also some differences, which are described here.

First, Trio provides analogues to all the standard library functions that return socket objects; their interface is identical, except that they’re modified to return Trio socket objects instead:

trio.socket.socket(family=-1, type=-1, proto=-1, fileno=None)¶

Create a new Trio socket, like socket.socket().

This function’s behavior can be customized using set_custom_socket_factory().

trio.socket.socketpair(family=None, type=SOCK_STREAM, proto=0)¶: Like socket.socketpair(), but returns a pair of Trio socket objects.

trio.socket.fromfd(fd, family, type, proto=0)¶: Like socket.fromfd(), but returns a Trio socket object.

trio.socket.fromshare(data)¶: Like socket.fromshare(), but returns a Trio socket object.

In addition, there is a new function to directly convert a standard library socket into a Trio socket:

trio.socket.from_stdlib_socket(sock)¶: Convert a standard library socket.socket() object into a Trio socket object.

Unlike socket.socket(), trio.socket.socket() is a function, not a class; if you want to check whether an object is a Trio socket, use isinstance(obj, trio.socket.SocketType).

For name lookup, Trio provides the standard functions, but with some changes:

await trio.socket.getaddrinfo(host, port, family=0, type=0, proto=0, flags=0)¶

Look up a numeric address given a name.

Arguments and return values are identical to socket.getaddrinfo(), except that this version is async.

Also, trio.socket.getaddrinfo() correctly uses IDNA 2008 to process non-ASCII domain names. (socket.getaddrinfo() uses IDNA 2003, which can give the wrong result in some cases and cause you to connect to a different host than the one you intended; see bpo-17305.)

This function’s behavior can be customized using set_custom_hostname_resolver().

await trio.socket.getnameinfo(sockaddr, flags)¶

Look up a name given a numeric address.

Arguments and return values are identical to socket.getnameinfo(), except that this version is async.

This function’s behavior can be customized using set_custom_hostname_resolver().

await trio.socket.getprotobyname(name)¶

Look up a protocol number by name. (Rarely used.)

Like socket.getprotobyname(), but async.

Trio intentionally DOES NOT include some obsolete, redundant, or broken features:

gethostbyname(), gethostbyname_ex(), gethostbyaddr(): obsolete; use getaddrinfo() and getnameinfo() instead.

getservbyport(): obsolete and buggy; instead, do:

_, service_name = await getnameinfo(("127.0.0.1", port), NI_NUMERICHOST)

getservbyname(): obsolete and buggy; instead, do:
```
await getaddrinfo(None, service_name)
```
getfqdn(): obsolete; use getaddrinfo() with the AI_CANONNAME flag.
getdefaulttimeout(), setdefaulttimeout(): instead, use Trio’s standard support for Cancellation and timeouts.
On Windows, SO_REUSEADDR is not exported, because it’s a trap: the name is the same as Unix SO_REUSEADDR, but the semantics are different and extremely broken. In the very rare cases where you actually want SO_REUSEADDR on Windows, then it can still be accessed from the standard library’s socket module.

Socket objects¶

class trio.socket.SocketType¶

Note

trio.socket.SocketType is an abstract class and cannot be instantiated directly; you get concrete socket objects by calling constructors like trio.socket.socket(). However, you can use it to check if an object is a Trio socket via isinstance(obj, trio.socket.SocketType).

Trio socket objects are overall very similar to the standard library socket objects, with a few important differences:

First, and most obviously, everything is made “Trio-style”: blocking methods become async methods, and the following attributes are not supported:

setblocking(): Trio sockets always act like blocking sockets; if you need to read/write from multiple sockets at once, then create multiple tasks.
settimeout(): see Cancellation and timeouts instead.
makefile(): Python’s file-like API is synchronous, so it can’t be implemented on top of an async socket.
sendall(): Could be supported, but you’re better off using the higher-level SocketStream, and specifically its send_all() method, which also does additional error checking.

In addition, the following methods are similar to the equivalents in socket.socket(), but have some Trio-specific quirks:

await connect()¶

Connect the socket to a remote address.

Similar to socket.socket.connect(), except async.

Warning

Due to limitations of the underlying operating system APIs, it is not always possible to properly cancel a connection attempt once it has begun. If connect() is cancelled, and is unable to abort the connection attempt, then it will:

forcibly close the socket to prevent accidental re-use
raise Cancelled.

tl;dr: if connect() is cancelled then the socket is left in an unknown state – possibly open, and possibly closed. The only reasonable thing to do is to close it.

is_readable()¶: Check whether the socket is readable or not.

sendfile()¶: Not implemented yet!

We also keep track of an extra bit of state, because it turns out to be useful for trio.SocketStream:

did_shutdown_SHUT_WR¶: This bool attribute is True if you’ve called sock.shutdown(SHUT_WR) or sock.shutdown(SHUT_RDWR), and False otherwise.

The following methods are identical to their equivalents in socket.socket(), except async, and the ones that take address arguments require pre-resolved addresses:

All methods and attributes not mentioned above are identical to their equivalents in socket.socket():

Asynchronous filesystem I/O¶

Trio provides built-in facilities for performing asynchronous filesystem operations like reading or renaming a file. Generally, we recommend that you use these instead of Python’s normal synchronous file APIs. But the tradeoffs here are somewhat subtle: sometimes people switch to async I/O, and then they’re surprised and confused when they find it doesn’t speed up their program. The next section explains the theory behind async file I/O, to help you better understand your code’s behavior. Or, if you just want to get started, you can jump down to the API overview.

Background: Why is async file I/O useful? The answer may surprise you¶

Many people expect that switching from synchronous file I/O to async file I/O will always make their program faster. This is not true! If we just look at total throughput, then async file I/O might be faster, slower, or about the same, and it depends in a complicated way on things like your exact patterns of disk access, or how much RAM you have. The main motivation for async file I/O is not to improve throughput, but to reduce the frequency of latency glitches.

To understand why, you need to know two things.

First, right now no mainstream operating system offers a generic, reliable, native API for async file or filesystem operations, so we have to fake it by using threads (specifically, trio.to_thread.run_sync()). This is cheap but isn’t free: on a typical PC, dispatching to a worker thread adds something like ~100 µs of overhead to each operation. (“µs” is pronounced “microseconds”, and there are 1,000,000 µs in a second. Note that all the numbers here are going to be rough orders of magnitude to give you a sense of scale; if you need precise numbers for your environment, measure!)

And second, the cost of a disk operation is incredibly bimodal. Sometimes, the data you need is already cached in RAM, and then accessing it is very, very fast – calling io.FileIO’s read method on a cached file takes on the order of ~1 µs. But when the data isn’t cached, then accessing it is much, much slower: the average is ~100 µs for SSDs and ~10,000 µs for spinning disks, and if you look at tail latencies then for both types of storage you’ll see cases where occasionally some operation will be 10x or 100x slower than average. And that’s assuming your program is the only thing trying to use that disk – if you’re on some oversold cloud VM fighting for I/O with other tenants then who knows what will happen. And some operations can require multiple disk accesses.

Putting these together: if your data is in RAM then it should be clear that using a thread is a terrible idea – if you add 100 µs of overhead to a 1 µs operation, then that’s a 100x slowdown! On the other hand, if your data’s on a spinning disk, then using a thread is great – instead of blocking the main thread and all tasks for 10,000 µs, we only block them for 100 µs and can spend the rest of that time running other tasks to get useful work done, which can effectively be a 100x speedup.

But here’s the problem: for any individual I/O operation, there’s no way to know in advance whether it’s going to be one of the fast ones or one of the slow ones, so you can’t pick and choose. When you switch to async file I/O, it makes all the fast operations slower, and all the slow operations faster. Is that a win? In terms of overall speed, it’s hard to say: it depends what kind of disks you’re using and your kernel’s disk cache hit rate, which in turn depends on your file access patterns, how much spare RAM you have, the load on your service, … all kinds of things. If the answer is important to you, then there’s no substitute for measuring your code’s actual behavior in your actual deployment environment. But what we can say is that async disk I/O makes performance much more predictable across a wider range of runtime conditions.

If you’re not sure what to do, then we recommend that you use async disk I/O by default, because it makes your code more robust when conditions are bad, especially with regard to tail latencies; this improves the chances that what your users see matches what you saw in testing. Blocking the main thread stops all tasks from running for that time. 10,000 µs is 10 ms, and it doesn’t take many 10 ms glitches to start adding up to real money; async disk I/O can help prevent those. Just don’t expect it to be magic, and be aware of the tradeoffs.
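
The tradeoff can be put into numbers with a back-of-the-envelope model. This sketch reuses the rough figures from the text (about 1 µs for a cached read, about 100 µs of thread overhead) and treats the cache hit rate as a free parameter; main_thread_blocked_us is a made-up helper, and real systems will behave less tidily.

```python
THREAD_OVERHEAD_US = 100.0  # rough cost of dispatching to a worker thread
CACHED_READ_US = 1.0        # rough cost of reading data already in RAM


def main_thread_blocked_us(cache_hit_rate, disk_read_us, use_thread):
    """Average time one read blocks the main thread, in microseconds."""
    if use_thread:
        # The main thread only pays the dispatch overhead, hit or miss.
        return THREAD_OVERHEAD_US
    # Synchronous reads block for the full duration of the operation.
    return cache_hit_rate * CACHED_READ_US + (1 - cache_hit_rate) * disk_read_us


all_cached_sync = main_thread_blocked_us(1.0, 10_000.0, use_thread=False)
all_cached_async = main_thread_blocked_us(1.0, 10_000.0, use_thread=True)
half_cached_sync = main_thread_blocked_us(0.5, 10_000.0, use_thread=False)
```

With everything cached, the thread makes each read roughly 100 times more expensive for the main thread; with a 50% hit rate on a spinning disk, the synchronous version blocks for about 5,000 µs per read on average, against a flat 100 µs with the thread.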

API overview¶

If you want to perform general filesystem operations like creating and listing directories, renaming files, or checking file metadata – or if you just want a friendly way to work with filesystem paths – then you want trio.Path. It’s an asyncified replacement for the standard library’s pathlib.Path, and provides the same comprehensive set of operations.

For reading and writing to files and file-like objects, Trio also provides a mechanism for wrapping any synchronous file-like object into an asynchronous interface. If you have a trio.Path object you can get one of these by calling its open() method; or if you know the file’s name you can open it directly with trio.open_file(). Alternatively, if you already have an open file-like object, you can wrap it with trio.wrap_file() – one case where this is especially useful is to wrap io.BytesIO or io.StringIO when writing tests.

Asynchronous path objects¶

class trio.Path(*args)¶

A pathlib.Path wrapper that executes blocking methods in trio.to_thread.run_sync().

as_posix()¶: Return the string representation of the path with forward (/) slashes.

as_uri()¶: Return the path as a ‘file’ URI.

await chmod(*args, **kwargs)¶: Like chmod(), but async.

classmethod await cwd(*args, **kwargs)¶: Like cwd(), but async.

await exists(*args, **kwargs)¶: Like exists(), but async.

await expanduser(*args, **kwargs)¶: Like expanduser(), but async.

await glob(*args, **kwargs)¶: Like glob(), but async.

await group(*args, **kwargs)¶: Like group(), but async.

classmethod await home(*args, **kwargs)¶: Like home(), but async.

is_absolute()¶: True if the path is absolute (has both a root and, if applicable, a drive).

await is_block_device(*args, **kwargs)¶: Like is_block_device(), but async.

await is_char_device(*args, **kwargs)¶: Like is_char_device(), but async.

await is_dir(*args, **kwargs)¶: Like is_dir(), but async.

await is_fifo(*args, **kwargs)¶: Like is_fifo(), but async.

await is_file(*args, **kwargs)¶: Like is_file(), but async.

await is_mount(*args, **kwargs)¶: Like is_mount(), but async.

is_reserved()¶: Return True if the path contains one of the special names reserved by the system, if any.

await is_socket(*args, **kwargs)¶: Like is_socket(), but async.

await is_symlink(*args, **kwargs)¶: Like is_symlink(), but async.

await iterdir(*args, **kwargs)¶

Like pathlib.Path.iterdir(), but async.

This is an async method that returns a synchronous iterator, so you use it like:

for subpath in await mypath.iterdir():
    ...

Note that it actually loads the whole directory list into memory immediately, during the initial call. (See issue #501 for discussion.)

joinpath(*args)¶: Combine this path with one or several arguments, and return a new path representing either a subpath (if all arguments are relative paths) or a totally different path (if one of the arguments is anchored).

await lchmod(*args, **kwargs)¶: Like lchmod(), but async.

await lstat(*args, **kwargs)¶: Like lstat(), but async.

match(path_pattern)¶: Return True if this path matches the given pattern.

await mkdir(*args, **kwargs)¶: Like mkdir(), but async.

await open(mode='r', buffering=-1, encoding=None, errors=None, newline=None)¶: Open the file pointed to by this path and return an asynchronous file object, as trio.open_file() does.

await owner(*args, **kwargs)¶: Like owner(), but async.

await read_bytes(*args, **kwargs)¶: Like read_bytes(), but async.

await read_text(*args, **kwargs)¶: Like read_text(), but async.

relative_to(*other)¶: Return the relative path to another path identified by the passed arguments. If the operation is not possible (because this is not a subpath of the other path), raise ValueError.

await rename(*args, **kwargs)¶: Like rename(), but async.

await replace(*args, **kwargs)¶: Like replace(), but async.

await resolve(*args, **kwargs)¶: Like resolve(), but async.

await rglob(*args, **kwargs)¶: Like rglob(), but async.

await rmdir(*args, **kwargs)¶: Like rmdir(), but async.

await samefile(*args, **kwargs)¶: Like samefile(), but async.

await stat(*args, **kwargs)¶: Like stat(), but async.

await symlink_to(*args, **kwargs)¶: Like symlink_to(), but async.

await touch(*args, **kwargs)¶: Like touch(), but async.

await unlink(*args, **kwargs)¶: Like unlink(), but async.

with_name(name)¶: Return a new path with the file name changed.

with_suffix(suffix)¶: Return a new path with the file suffix changed. If the path has no suffix, add given suffix. If the given suffix is an empty string, remove the suffix from the path.

await write_bytes(*args, **kwargs)¶: Like write_bytes(), but async.

await write_text(*args, **kwargs)¶: Like write_text(), but async.

Asynchronous file objects¶

await trio.open_file(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True, opener=None)¶

Asynchronous version of io.open().

Returns: An asynchronous file object

Example:

async with await trio.open_file(filename) as f:
    async for line in f:
        pass

assert f.closed

Spawning subprocesses¶

Trio provides support for spawning other programs as subprocesses, communicating with them via pipes, sending them signals, and waiting for them to exit. The interface for doing so consists of two layers:

trio.run_process() runs a process from start to finish and returns a CompletedProcess object describing its outputs and return value. This is what you should reach for if you want to run a process to completion before continuing, while possibly sending it some input or capturing its output. It is modelled after the standard subprocess.run() with some additional features and safer defaults.
trio.open_process starts a process in the background and returns a Process object to let you interact with it. Using it requires a bit more code than run_process, but exposes additional capabilities: back-and-forth communication, processing output as soon as it is generated, and so forth. It is modelled after the standard library subprocess.Popen.

Options for starting subprocesses¶

All of Trio’s subprocess APIs accept the numerous keyword arguments used by the standard subprocess module to control the environment in which a process starts and the mechanisms used for communicating with it. These may be passed wherever you see **options in the documentation below. See the full list or just the frequently used ones in the subprocess documentation. (You may need to import subprocess in order to access constants such as PIPE or DEVNULL.)

Currently, Trio always uses unbuffered byte streams for communicating with a process, so it does not support the encoding, errors, universal_newlines (alias text in 3.7+), and bufsize options.

Running a process and waiting for it to finish¶

The basic interface for running a subprocess start-to-finish is trio.run_process(). It always waits for the subprocess to exit before returning, so there’s no need to worry about leaving a process running by mistake after you’ve gone on to do other things. run_process() is similar to the standard library subprocess.run() function, but tries to have safer defaults: with no options, the subprocess’s input is empty rather than coming from the user’s terminal, and a failure in the subprocess will be propagated as a subprocess.CalledProcessError exception. Of course, these defaults can be changed where necessary.

await trio.run_process(command, *, stdin=b'', capture_stdout=False, capture_stderr=False, check=True, deliver_cancel=None, **options)¶

Run command in a subprocess, wait for it to complete, and return a subprocess.CompletedProcess instance describing the results.

If cancelled, run_process() terminates the subprocess and waits for it to exit before propagating the cancellation, like Process.aclose().

Input: The subprocess’s standard input stream is set up to receive the bytes provided as stdin. Once the given input has been fully delivered, or if none is provided, the subprocess will receive end-of-file when reading from its standard input. Alternatively, if you want the subprocess to read its standard input from the same place as the parent Trio process, you can pass stdin=None.

Output: By default, any output produced by the subprocess is passed through to the standard output and error streams of the parent Trio process. If you would like to capture this output and do something with it, you can pass capture_stdout=True to capture the subprocess’s standard output, and/or capture_stderr=True to capture its standard error. Captured data is provided as the stdout and/or stderr attributes of the returned CompletedProcess object. The value for any stream that was not captured will be None.

If you want to capture both stdout and stderr while keeping them separate, pass capture_stdout=True, capture_stderr=True.

If you want to capture both stdout and stderr but mixed together in the order they were printed, use: capture_stdout=True, stderr=subprocess.STDOUT. This directs the child’s stderr into its stdout, so the combined output will be available in the stdout attribute.

Error checking: If the subprocess exits with a nonzero status code, indicating failure, run_process() raises a subprocess.CalledProcessError exception rather than returning normally. The captured outputs are still available as the stdout and stderr attributes of that exception. To disable this behavior, so that run_process() returns normally even if the subprocess exits abnormally, pass check=False.

Parameters

command (list or str) – The command to run. Typically this is a sequence of strings such as ['ls', '-l', 'directory with spaces'], where the first element names the executable to invoke and the other elements specify its arguments. With shell=True in the **options, or on Windows, command may alternatively be a string, which will be parsed following platform-dependent quoting rules.
stdin (bytes, file descriptor, or None) – The bytes to provide to the subprocess on its standard input stream, or None if the subprocess’s standard input should come from the same place as the parent Trio process’s standard input. As is the case with the subprocess module, you can also pass a file descriptor or an object with a fileno() method, in which case the subprocess’s standard input will come from that file.
capture_stdout (bool) – If true, capture the bytes that the subprocess writes to its standard output stream and return them in the stdout attribute of the returned CompletedProcess object.
capture_stderr (bool) – If true, capture the bytes that the subprocess writes to its standard error stream and return them in the stderr attribute of the returned CompletedProcess object.
check (bool) – If false, don’t validate that the subprocess exits successfully. You should be sure to check the returncode attribute of the returned object if you pass check=False, so that errors don’t pass silently.
deliver_cancel (async function or None) –
If run_process is cancelled, then it needs to kill the child process. There are multiple ways to do this, so we let you customize it.

If you pass None (the default), then the behavior depends on the platform:
- On Windows, Trio calls TerminateProcess, which should kill the process immediately.
- On Unix-likes, the default behavior is to send a SIGTERM, wait 5 seconds, and send a SIGKILL.
Alternatively, you can customize this behavior by passing in an arbitrary async function, which will be called with the Process object as an argument. For example, the default Unix behavior could be implemented like this:
```
async def my_deliver_cancel(process):
    process.send_signal(signal.SIGTERM)
    await trio.sleep(5)
    process.send_signal(signal.SIGKILL)
```
When the process actually exits, the deliver_cancel function will automatically be cancelled – so if the process exits after SIGTERM, then we’ll never reach the SIGKILL.

In any case, run_process will always wait for the child process to exit before raising Cancelled.
**options – run_process() also accepts any general subprocess options and passes them on when creating the Process. This includes the stdout and stderr options, which provide additional redirection possibilities such as stderr=subprocess.STDOUT, stdout=subprocess.DEVNULL, or file descriptors.

Returns

A subprocess.CompletedProcess instance describing the return code and outputs.

Raises

UnicodeError – if stdin is specified as a Unicode string, rather than bytes
ValueError – if multiple redirections are specified for the same stream, e.g., both capture_stdout=True and stdout=subprocess.DEVNULL
subprocess.CalledProcessError – if check=False is not passed and the process exits with a nonzero exit status
OSError – if an error is encountered starting or communicating with the process

Note

The child process runs in the same process group as the parent Trio process, so a Ctrl+C will be delivered simultaneously to both parent and child. If you don’t want this behavior, consult your platform’s documentation for starting child processes in a different process group.

Interacting with a process as it runs¶

If you want more control than run_process() affords, you can use trio.open_process to spawn a subprocess, and then interact with it using the Process interface.

await trio.open_process(command, *, stdin=None, stdout=None, stderr=None, **options) → trio.Process¶

Execute a child program in a new process.

After construction, you can interact with the child process by writing data to its stdin stream (a SendStream), reading data from its stdout and/or stderr streams (both ReceiveStreams), sending it signals using terminate, kill, or send_signal, and waiting for it to exit using wait. See Process for details.

Each standard stream is only available if you specify that a pipe should be created for it. For example, if you pass stdin=subprocess.PIPE, you can write to the stdin stream, else stdin will be None.

Parameters

command (list or str) – The command to run. Typically this is a sequence of strings such as ['ls', '-l', 'directory with spaces'], where the first element names the executable to invoke and the other elements specify its arguments. With shell=True in the **options, or on Windows, command may alternatively be a string, which will be parsed following platform-dependent quoting rules.
stdin – Specifies what the child process’s standard input stream should connect to: output written by the parent (subprocess.PIPE), nothing (subprocess.DEVNULL), or an open file (pass a file descriptor or something whose fileno method returns one). If stdin is unspecified, the child process will have the same standard input stream as its parent.
stdout – Like stdin, but for the child process’s standard output stream.
stderr – Like stdin, but for the child process’s standard error stream. An additional value subprocess.STDOUT is supported, which causes the child’s standard output and standard error messages to be intermixed on a single standard output stream, attached to whatever the stdout option says to attach it to.
**options – Other general subprocess options are also accepted.

Returns

A new Process object.

Raises

OSError – if the process spawning fails, for example because the specified command could not be found.

class trio.Process(popen, stdin, stdout, stderr)¶

A child process. Like subprocess.Popen, but async.

This class has no public constructor. To create a child process, use open_process:

process = await trio.open_process(...)

Process implements the AsyncResource interface. In order to make sure your process doesn’t end up getting abandoned by mistake or after an exception, you can use async with:

async with await trio.open_process(...) as process:
    ...

“Closing” a Process will close any pipes to the child and wait for it to exit; if cancelled, the child will be forcibly killed and we will ensure it has finished exiting before allowing the cancellation to propagate.

args¶

The command passed at construction time, specifying the process to execute and its arguments.

Type: str or list

pid¶

The process ID of the child process managed by this object.

Type: int

stdin¶

A stream connected to the child’s standard input stream: when you write bytes here, they become available for the child to read. Only available if the Process was constructed using stdin=PIPE; otherwise this will be None.

Type: trio.abc.SendStream or None

stdout¶

A stream connected to the child’s standard output stream: when the child writes to standard output, the written bytes become available for you to read here. Only available if the Process was constructed using stdout=PIPE; otherwise this will be None.

Type: trio.abc.ReceiveStream or None

stderr¶

A stream connected to the child’s standard error stream: when the child writes to standard error, the written bytes become available for you to read here. Only available if the Process was constructed using stderr=PIPE; otherwise this will be None.

Type: trio.abc.ReceiveStream or None

stdio¶

A stream that sends data to the child’s standard input and receives from the child’s standard output. Only available if both stdin and stdout are available; otherwise this will be None.

Type: trio.StapledStream or None

returncode¶

The exit status of the process (an integer), or None if it’s still running.

By convention, a return code of zero indicates success. On UNIX, negative values indicate termination due to a signal, e.g., -11 if terminated by signal 11 (SIGSEGV). On Windows, a process that exits due to a call to Process.terminate() will have an exit status of 1.

Unlike the standard library subprocess.Popen.returncode, you don’t have to call poll or wait to update this attribute; it’s automatically updated as needed, and will always give you the latest information.

await aclose()¶

Close any pipes we have to the process (both input and output) and wait for it to exit.

If cancelled, kills the process and waits for it to finish exiting before propagating the cancellation.

await wait()¶

Block until the process exits.

Returns: The exit status of the process; see returncode.

poll()¶

Returns the exit status of the process (an integer), or None if it’s still running.

Note that in Trio (unlike the standard library subprocess.Popen), process.poll() and process.returncode always give the same result. See returncode for more details. This method is only included to make it easier to port code from subprocess.

kill()¶

Immediately terminate the process.

On UNIX, this is equivalent to send_signal(signal.SIGKILL). On Windows, it calls TerminateProcess. In both cases, the process cannot prevent itself from being killed, but the termination will be delivered asynchronously; use wait() if you want to ensure the process is actually dead before proceeding.

terminate()¶

Terminate the process, politely if possible.

On UNIX, this is equivalent to send_signal(signal.SIGTERM); by convention this requests graceful termination, but a misbehaving or buggy process might ignore it. On Windows, terminate() forcibly terminates the process in the same manner as kill().

send_signal(sig)¶

Send signal sig to the process.

On UNIX, sig may be any signal defined in the signal module, such as signal.SIGINT or signal.SIGTERM. On Windows, it may be anything accepted by the standard library subprocess.Popen.send_signal().

Note

communicate() is not provided as a method on Process objects; use run_process() instead, or write the loop yourself if you have unusual needs. communicate() has quite unusual cancellation behavior in the standard library (on some platforms it spawns a background thread which continues to read from the child process even after the timeout has expired) and we wanted to provide an interface with fewer surprises.

Quoting: more than you wanted to know¶

The command to run and its arguments usually must be passed to Trio’s subprocess APIs as a sequence of strings, where the first element in the sequence specifies the command to run and the remaining elements specify its arguments, one argument per element. This form is used because it avoids potential quoting pitfalls; for example, you can run ["cp", "-f", source_file, dest_file] without worrying about whether source_file or dest_file contains spaces.

If you only run subprocesses without shell=True and on UNIX, that’s all you need to know about specifying the command. If you use shell=True or run on Windows, you probably should read the rest of this section to be aware of potential pitfalls.

With shell=True on UNIX, you must specify the command as a single string, which will be passed to the shell as if you’d entered it at an interactive prompt. The advantage of this option is that it lets you use shell features like pipes and redirection without writing code to handle them. For example, you can write trio.run_process("ls | grep some_string", shell=True). The disadvantage is that you must account for the shell’s quoting rules, generally by wrapping in shlex.quote() any argument that might contain spaces, quotes, or other shell metacharacters. If you don’t do that, your safe-looking f"ls | grep {some_string}" might end in disaster when invoked with some_string = "foo; rm -rf /".
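The difference that shlex.quote() makes can be seen without running anything. The sketch below only builds the command strings; the helper name grep_listing_command is made up for illustration.

```python
import shlex


def grep_listing_command(pattern):
    # shlex.quote() turns the pattern into a single word for UNIX shells.
    return f"ls | grep {shlex.quote(pattern)}"


dangerous_input = "foo; rm -rf /"

# Unquoted interpolation: the shell would see a second command after ';'.
unsafe = f"ls | grep {dangerous_input}"

# Quoted interpolation: the whole input stays one argument to grep.
safe = grep_listing_command(dangerous_input)

print(unsafe)
print(safe)
```

On UNIX, the safe string is the kind of value you would pass as the command together with shell=True.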

On Windows, the fundamental API for process spawning (the CreateProcess() system call) takes a string, not a list, and it’s actually up to the child process to decide how it wants to split that string into individual arguments. Since the C language specifies that main() should take a list of arguments, most programs you encounter will follow the rules used by the Microsoft C/C++ runtime. subprocess.Popen, and thus also Trio, uses these rules when it converts an argument sequence to a string, and they are documented alongside the subprocess module. There is no documented Python standard library function that can directly perform that conversion, so even on Windows, you almost always want to pass an argument sequence rather than a string. But if the program you’re spawning doesn’t split its command line back into individual arguments in the standard way, you might need to pass a string to work around this. (Or you might just be out of luck: as far as I can tell, there’s simply no way to pass an argument containing a double-quote to a Windows batch file.)

On Windows with shell=True, things get even more chaotic. Now there are two separate sets of quoting rules applied, one by the Windows command shell CMD.EXE and one by the process being spawned, and they’re different. (And there’s no shlex.quote() to save you: it uses UNIX-style quoting rules, even on Windows.) Most special characters interpreted by the shell &<>()^| are not treated as special if the shell thinks they’re inside double quotes, but %FOO% environment variable substitutions still are, and the shell doesn’t provide any way to write a double quote inside a double-quoted string. Outside double quotes, any character (including a double quote) can be escaped using a leading ^. But since a pipeline is processed by running each command in the pipeline in a subshell, multiple layers of escaping can be needed:

echo ^^^&x | find "x" | find "x"          # prints: &x

And if you combine pipelines with () grouping, you can need even more levels of escaping:

(echo ^^^^^^^&x | find "x") | find "x"    # prints: &x

Since process creation takes a single arguments string, CMD.EXE’s quoting does not influence word splitting, and double quotes are not removed during CMD.EXE’s expansion pass. Double quotes are troublesome because CMD.EXE handles them differently from the MSVC runtime rules; in:

prog.exe "foo \"bar\" baz"

the program will see one argument foo "bar" baz but CMD.EXE thinks bar\ is not quoted while foo \ and baz are. All of this makes it a formidable task to reliably interpolate anything into a shell=True command line on Windows, and Trio falls back on the subprocess behavior: If you pass a sequence with shell=True, it’s quoted in the same way as a sequence with shell=False, and had better not contain any shell metacharacters you weren’t planning on.

Signals¶

with trio.open_signal_receiver(*signals) as signal_aiter¶

A context manager for catching signals.

Entering this context manager starts listening for the given signals and returns an async iterator; exiting the context manager stops listening.

The async iterator blocks until a signal arrives, and then yields it.

Note that if you leave the with block while the iterator has unextracted signals still pending inside it, then they will be re-delivered using Python’s regular signal handling logic. This avoids a race condition when a signal arrives just before we exit the with block.

Parameters

signals – the signals to listen for.

Raises

TypeError – if no signals were provided.
RuntimeError – if you try to use this anywhere except Python’s main thread. (This is a Python limitation.)

Example

A common convention for Unix daemons is that they should reload their configuration when they receive a SIGHUP. Here’s a sketch of what that might look like using open_signal_receiver():

with trio.open_signal_receiver(signal.SIGHUP) as signal_aiter:
    async for signum in signal_aiter:
        assert signum == signal.SIGHUP
        reload_configuration()
