io_uring_register(2) | Linux Programmer's Manual | io_uring_register(2) |
io_uring_register - register files or user buffers for asynchronous I/O
#include <liburing.h>
int io_uring_register(unsigned int fd, unsigned int opcode, void *arg, unsigned int nr_args);
The io_uring_register(2) system call registers resources (e.g. user buffers, files, eventfd, personality, restrictions) for use in an io_uring(7) instance referenced by fd. Registering files or user buffers allows the kernel to take long-term references to internal data structures or create long-term mappings of application memory, greatly reducing per-I/O overhead.
fd is the file descriptor returned by a call to io_uring_setup(2). opcode can be one of:
After a successful call, the supplied buffers are mapped into the kernel and eligible for I/O. To make use of them, the application must specify the IORING_OP_READ_FIXED or IORING_OP_WRITE_FIXED opcodes in the submission queue entry (see the struct io_uring_sqe definition in io_uring_enter(2)), and set the buf_index field to the desired buffer index. The memory range described by the submission queue entry's addr and len fields must fall within the indexed buffer.
It is perfectly valid to set up a large buffer and then use only part of it for an I/O, as long as the range is within the originally mapped region.
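The range rule above can be sketched as follows. This is a minimal illustration, not part of any library: the helper names range_in_buffer() and prep_read_fixed() are hypothetical, and the bufs array is assumed to be the same struct iovec array that was registered earlier.

```c
#include <linux/io_uring.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>
#include <sys/uio.h>

/* True if [addr, addr + len) lies entirely inside the registered buffer. */
static bool range_in_buffer(const struct iovec *buf, const void *addr, size_t len)
{
    uintptr_t start = (uintptr_t)buf->iov_base;
    uintptr_t a = (uintptr_t)addr;

    if (a < start || len > buf->iov_len)
        return false;
    return a - start <= buf->iov_len - len;
}

/*
 * Fill an SQE for a read into part of registered buffer buf_index.
 * Returns false (leaving the SQE untouched) if the range is out of bounds.
 */
static bool prep_read_fixed(struct io_uring_sqe *sqe, int fd,
                            const struct iovec *bufs, unsigned buf_index,
                            void *addr, unsigned len, uint64_t file_off)
{
    if (!range_in_buffer(&bufs[buf_index], addr, len))
        return false;

    memset(sqe, 0, sizeof(*sqe));
    sqe->opcode = IORING_OP_READ_FIXED;
    sqe->fd = fd;
    sqe->off = file_off;
    sqe->addr = (uint64_t)(uintptr_t)addr;   /* must fall inside the buffer */
    sqe->len = len;
    sqe->buf_index = (uint16_t)buf_index;    /* which registered buffer */
    return true;
}
```

Checking the range in user space is optional; it only surfaces the mistake before the request is submitted.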
An application can increase or decrease the size or number of registered buffers by first unregistering the existing buffers, and then issuing a new call to io_uring_register(2) with the new buffers.
Note that before 5.13, registering buffers waited for the ring to idle: if the application had requests in flight, the registration waited for them to finish before proceeding.
An application need not unregister buffers explicitly before shutting down the io_uring instance. Available since 5.1.
arg points to a struct io_uring_rsrc_register, and nr_args should be set to the number of bytes in the structure.
struct io_uring_rsrc_register {
    __u32 nr;
    __u32 resv;
    __u64 resv2;
    __aligned_u64 data;
    __aligned_u64 tags;
};
The data field contains a pointer to a struct iovec array of nr entries. The tags field should either be 0, in which case tagging is disabled, or point to an array of nr "tags" (unsigned 64-bit integers). If a tag is zero, then tagging for this particular resource (a buffer in this case) is disabled. Otherwise, after the resource has been unregistered and is no longer in use, a CQE will be posted with user_data set to the specified tag and all other fields zeroed.
Note that resource updates, e.g. IORING_REGISTER_BUFFERS_UPDATE, don't necessarily deallocate resources by the time they return; the resources might be held alive until all requests using them complete.
Available since 5.13.
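As an illustration of the tagged registration described above, the following sketch fills in a struct io_uring_rsrc_register. The helper name fill_buffers2() is hypothetical, and the sketch assumes kernel headers from 5.13 or later so that <linux/io_uring.h> provides the structure.

```c
#include <linux/io_uring.h>
#include <stdint.h>
#include <string.h>
#include <sys/uio.h>

/*
 * Describe nr buffers plus one tag per buffer. A tag of 0 disables the
 * release notification for that buffer only; passing tags == NULL stores 0
 * in the tags field, which disables tagging altogether.
 */
static void fill_buffers2(struct io_uring_rsrc_register *reg,
                          const struct iovec *iovs, const uint64_t *tags,
                          unsigned nr)
{
    memset(reg, 0, sizeof(*reg));            /* keep reserved fields zero */
    reg->nr = nr;
    reg->data = (uint64_t)(uintptr_t)iovs;   /* struct iovec array */
    reg->tags = (uint64_t)(uintptr_t)tags;   /* array of nr 64-bit tags */
}
```

The filled structure would then be passed as arg with the IORING_REGISTER_BUFFERS2 opcode, with nr_args set to sizeof(struct io_uring_rsrc_register) as stated above.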
arg must contain a pointer to a struct io_uring_rsrc_update2, which contains an offset on which to start the update, and an array of struct iovec. tags points to an array of tags. nr must contain the number of descriptors in the passed in arrays. See IORING_REGISTER_BUFFERS2 for the resource tagging description.
struct io_uring_rsrc_update2 {
    __u32 offset;
    __u32 resv;
    __aligned_u64 data;
    __aligned_u64 tags;
    __u32 nr;
    __u32 resv2;
};
Available since 5.13.
To make use of the registered files, the IOSQE_FIXED_FILE flag must be set in the flags member of the struct io_uring_sqe, and the fd member is set to the index of the file in the file descriptor array.
The file set may be sparse, meaning that the fd field in the array may be set to -1. See IORING_REGISTER_FILES_UPDATE for how to update files in place.
Note that before 5.13, registering files waited for the ring to idle: if the application had requests in flight, the registration waited for them to finish before proceeding. See IORING_REGISTER_FILES_UPDATE for how to update an existing set without that limitation.
Files are automatically unregistered when the io_uring instance is torn down. An application need only unregister if it wishes to register a new set of fds. Available since 5.1.
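A minimal sketch of the two ideas above, a sparse file table and an SQE that refers to a registered slot rather than a real descriptor. The helper names init_sparse_table() and sqe_use_fixed_file() are hypothetical.

```c
#include <linux/io_uring.h>
#include <string.h>

/* Initialise a file table in which every slot is empty (sparse). */
static void init_sparse_table(int *fds, unsigned nr)
{
    for (unsigned i = 0; i < nr; i++)
        fds[i] = -1;
}

/* Point an already prepared SQE at registered file slot `index`. */
static void sqe_use_fixed_file(struct io_uring_sqe *sqe, unsigned index)
{
    sqe->fd = (int)index;              /* index into the registered array */
    sqe->flags |= IOSQE_FIXED_FILE;    /* fd is an index, not a descriptor */
}
```

A table initialised this way can be registered as is and populated later through IORING_REGISTER_FILES_UPDATE.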
arg points to a struct io_uring_rsrc_register, and nr_args should be set to the number of bytes in the structure.
The data field contains a pointer to an array of nr file descriptors (signed 32-bit integers). The tags field should either be 0 or point to an array of nr "tags" (unsigned 64-bit integers). See IORING_REGISTER_BUFFERS2 for more info on resource tagging.
Note that resource updates, e.g. IORING_REGISTER_FILES_UPDATE, don't necessarily deallocate resources; they might be held until all requests using that resource complete.
Available since 5.13.
arg must contain a pointer to a struct io_uring_files_update, which contains an offset on which to start the update, and an array of file descriptors to use for the update. nr_args must contain the number of descriptors in the passed in array. Available since 5.5.
File descriptors can be skipped if they are set to IORING_REGISTER_FILES_SKIP. Skipping an fd will not touch the file associated with the previous fd at that index. Available since 5.12.
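The skip marker can be illustrated with an update that covers a window of slots but changes only one of them. This is a sketch: fill_single_slot_update() is a hypothetical helper, and the fallback definition of IORING_REGISTER_FILES_SKIP is only used when the installed headers predate 5.12.

```c
#include <linux/io_uring.h>
#include <stdint.h>
#include <string.h>

#ifndef IORING_REGISTER_FILES_SKIP
#define IORING_REGISTER_FILES_SKIP (-2)   /* value from the kernel UAPI header */
#endif

/*
 * Prepare an update of nr consecutive slots starting at first_slot that
 * replaces only slot `target` with new_fd. Every other slot in the window
 * is marked to be skipped, so its current file stays in place.
 * Returns 0 on success, -1 if target lies outside the window.
 */
static int fill_single_slot_update(struct io_uring_files_update *up,
                                   int32_t *fds, unsigned nr,
                                   unsigned first_slot, unsigned target,
                                   int new_fd)
{
    if (target < first_slot || target - first_slot >= nr)
        return -1;

    for (unsigned i = 0; i < nr; i++)
        fds[i] = IORING_REGISTER_FILES_SKIP;
    fds[target - first_slot] = new_fd;

    memset(up, 0, sizeof(*up));
    up->offset = first_slot;                 /* where the update starts */
    up->fds = (uint64_t)(uintptr_t)fds;      /* array of nr descriptors */
    return 0;
}
```

The structure would be passed as arg with IORING_REGISTER_FILES_UPDATE, with nr_args set to nr.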
arg must contain a pointer to a struct io_uring_rsrc_update2, which contains an offset on which to start the update, and an array of file descriptors to use for the update stored in data. tags points to an array of tags. nr must contain the number of descriptors in the passed in arrays. See IORING_REGISTER_BUFFERS2 for the resource tagging description.
Available since 5.13.
An application can temporarily disable notifications coming through the registered eventfd by setting the IORING_CQ_EVENTFD_DISABLED bit in the flags field of the CQ ring. Available since 5.8.
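Toggling the bit might look like the sketch below. It assumes cq_flags points at the flags word of the mmap'ed CQ ring (located through the cq_off.flags offset returned by io_uring_setup(2)); the helper name set_eventfd_notifications() is hypothetical, and the __atomic builtins are GCC/Clang extensions chosen here for simplicity.

```c
#include <linux/io_uring.h>
#include <stdbool.h>

#ifndef IORING_CQ_EVENTFD_DISABLED
#define IORING_CQ_EVENTFD_DISABLED (1U << 0)   /* value from the UAPI header */
#endif

/* Enable or disable eventfd notifications by flipping the CQ ring flag. */
static void set_eventfd_notifications(unsigned *cq_flags, bool enabled)
{
    unsigned flags = __atomic_load_n(cq_flags, __ATOMIC_RELAXED);

    if (enabled)
        flags &= ~IORING_CQ_EVENTFD_DISABLED;
    else
        flags |= IORING_CQ_EVENTFD_DISABLED;

    __atomic_store_n(cq_flags, flags, __ATOMIC_RELAXED);
}
```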
After the execution of this operation, the io_uring ring is enabled: submissions and registration are allowed, but they will be validated following the registered restrictions (if any). This operation takes no argument and must be invoked with arg set to NULL and nr_args set to zero. Available since 5.10.
With an entry it is possible to allow an io_uring_register(2) opcode, or specify which opcode and flags of the submission queue entry are allowed, or require certain flags to be specified (these flags must be set on each submission queue entry).
All the restrictions must be submitted with a single io_uring_register(2) call and they are handled as an allowlist (opcodes and flags that are not registered are not allowed).
Restrictions can be registered only if the io_uring ring started in a disabled state (IORING_SETUP_R_DISABLED must be specified in the call to io_uring_setup(2)).
Available since 5.10.
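An allowlist of this kind could be built as in the following sketch. The structure and constant names (struct io_uring_restriction, IORING_RESTRICTION_*) come from <linux/io_uring.h> in 5.10 or later headers rather than from the text above; build_restrictions() and the particular policy it encodes are purely illustrative.

```c
#include <linux/io_uring.h>
#include <string.h>

#define NR_RESTRICTIONS 4

/*
 * Example policy: only fixed-buffer reads and writes may be submitted,
 * every SQE must use a registered file, and the only register opcode
 * still permitted is IORING_REGISTER_BUFFERS.
 * Returns the number of entries written.
 */
static unsigned build_restrictions(struct io_uring_restriction *res)
{
    memset(res, 0, NR_RESTRICTIONS * sizeof(res[0]));

    res[0].opcode = IORING_RESTRICTION_SQE_OP;
    res[0].sqe_op = IORING_OP_READ_FIXED;

    res[1].opcode = IORING_RESTRICTION_SQE_OP;
    res[1].sqe_op = IORING_OP_WRITE_FIXED;

    res[2].opcode = IORING_RESTRICTION_SQE_FLAGS_REQUIRED;
    res[2].sqe_flags = IOSQE_FIXED_FILE;

    res[3].opcode = IORING_RESTRICTION_REGISTER_OP;
    res[3].register_op = IORING_REGISTER_BUFFERS;

    return NR_RESTRICTIONS;
}
```

On a ring created with IORING_SETUP_R_DISABLED, the array would be registered in a single call with nr_args set to the entry count, after which the ring is enabled with arg set to NULL and nr_args set to zero.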
Available since 5.14.
Available since 5.14.
arg must be set to a pointer to an array of two unsigned int values, giving the maximum count of workers per NUMA node. Index 0 holds the bounded worker count, and index 1 holds the unbounded worker count. On successful return, the passed-in array will contain the previous maximum values for each type. If the count being passed in is 0, then this command returns the current maximum values and doesn't modify the current setting. nr_args must be set to 2, as the command takes two values.
Available since 5.15.
Similarly to how io_uring allows registration of files, this allows registration of the ring file descriptor itself. This reduces the overhead of the io_uring_enter(2) system call.
arg must be set to a pointer to an array of type struct io_uring_rsrc_update of nr_args entries. The data field of this struct must contain an io_uring file descriptor, and the offset field can be either -1 or an explicit offset desired for the registered file descriptor value. If -1 is used, then upon successful return of this system call, the field will contain the value of the registered file descriptor to be used for future io_uring_enter(2) system calls.
On successful completion of this request, the returned descriptors may be used instead of the real file descriptor for io_uring_enter(2), provided that IORING_ENTER_REGISTERED_RING is set in the flags for the system call. This flag tells the kernel that a registered descriptor is used rather than a real file descriptor.
Each thread or process using a ring must register the file descriptor directly by issuing this request.
The maximum number of supported registered ring descriptors is currently limited to 16.
Available since 5.18.
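A request for a single ring descriptor might be prepared as sketched below. The offset and data fields used here are those of struct io_uring_rsrc_update from <linux/io_uring.h> (5.13 or later headers assumed); fill_ring_fd_request() is a hypothetical helper.

```c
#include <linux/io_uring.h>
#include <stdint.h>
#include <string.h>

/* Ask the kernel to register ring_fd in whichever slot is free. */
static void fill_ring_fd_request(struct io_uring_rsrc_update *up, int ring_fd)
{
    memset(up, 0, sizeof(*up));
    up->offset = -1U;                 /* -1: let the kernel choose the slot */
    up->data = (uint64_t)ring_fd;     /* the real io_uring file descriptor */
}
```

After a successful call with nr_args set to 1, up->offset holds the registered value, which is then passed in place of the file descriptor to io_uring_enter(2) together with IORING_ENTER_REGISTERED_RING.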
arg must be set to a pointer to an array of type struct io_uring_rsrc_update of nr_args entries. Only the offset field should be set in the structure, containing the registered file descriptor offset previously returned from IORING_REGISTER_RING_FDS that the application wishes to unregister.
Note that this isn't done automatically on ring exit if the thread or task that previously registered a ring file descriptor isn't exiting. It is recommended to manually unregister any previously registered ring descriptors if the ring is closed and the task persists. This will free up a registration slot, making it available for future use.
Available since 5.18.
The arg argument must be filled in with the appropriate information. It looks as follows:
struct io_uring_buf_reg {
    __u64 ring_addr;
    __u32 ring_entries;
    __u16 bgid;
    __u16 pad;
    __u64 resv[3];
};
The ring_addr field must contain the address of the memory allocated to fit this ring. The memory must be page aligned and hence allocated appropriately using e.g. posix_memalign(3) or similar. The size of the ring is the product of ring_entries and the size of struct io_uring_buf. ring_entries is the desired size of the ring, and must be a power of 2. The maximum size allowed is 2^15 (32768). bgid is the buffer group ID associated with this ring. SQEs that select a buffer have a buffer group associated with them in their buf_group field, and the associated CQE will have IORING_CQE_F_BUFFER set in its flags member, which will also contain the specific ID of the buffer selected. The rest of the fields are reserved and must be cleared to zero.
The flags argument is currently unused and must be set to zero.
nr_args must be set to 1.
Also see io_uring_register_buf_ring(3) for more details. Available since 5.19.
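The sizing and alignment rules above can be sketched as a small allocation helper. alloc_buf_ring() and BUF_RING_MAX_ENTRIES are hypothetical names; the entry size is taken as a parameter so the sketch does not depend on 5.19 headers, and real code would pass sizeof(struct io_uring_buf).

```c
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define BUF_RING_MAX_ENTRIES 32768u   /* 2^15, the limit quoted above */

/*
 * Allocate zeroed, page-aligned memory for a buffer ring of `entries`
 * entries. Returns NULL if entries is not a power of two within the
 * limit, or if allocation fails. The caller releases it with free(3).
 */
static void *alloc_buf_ring(unsigned entries, size_t entry_size,
                            size_t *ring_bytes)
{
    void *mem = NULL;
    long page = sysconf(_SC_PAGESIZE);
    size_t bytes;

    if (entries == 0 || (entries & (entries - 1)) != 0 ||
        entries > BUF_RING_MAX_ENTRIES)
        return NULL;
    if (page <= 0)
        return NULL;

    bytes = (size_t)entries * entry_size;
    if (posix_memalign(&mem, (size_t)page, bytes) != 0)
        return NULL;

    memset(mem, 0, bytes);
    *ring_bytes = bytes;
    return mem;
}
```

The returned address would go into ring_addr and the entry count into ring_entries, with the remaining reserved fields left at zero.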
Available since 5.19.
arg must be set to a pointer to a struct io_uring_sync_cancel_reg structure, with the details filled in for what request(s) to target for cancelation. See io_uring_register_sync_cancel(3) for details on that. The return values are the same, except they are passed back synchronously rather than through the CQE res field. nr_args must be set to 1.
Available since 6.0.
nr_args must be set to 1 and arg must be set to a pointer to a struct io_uring_file_index_range:
struct io_uring_file_index_range {
    __u32 off;
    __u32 len;
    __u64 resv;
};
with off being set to the starting value for the range, and len being set to the number of descriptors. The reserved resv field must be cleared to zero.
The application must have registered a file table first.
Available since 6.0.
On success, io_uring_register(2) returns either 0 or a positive value, depending on the opcode used. On error, a negative error value is returned. The caller should not rely on the errno variable.
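The return convention above differs from that of a raw syscall(2) invocation, which returns -1 and sets errno. A minimal sketch of a wrapper that maps one onto the other follows. The name register_or_negerrno() is hypothetical, and the fallback syscall number is an assumption that holds on most architectures but not all.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef __NR_io_uring_register
#define __NR_io_uring_register 427   /* assumption: common syscall number */
#endif

/*
 * Invoke the system call directly and fold errno into the return value:
 * 0 or a positive value on success, a negative error value on failure.
 */
static int register_or_negerrno(unsigned fd, unsigned opcode,
                                const void *arg, unsigned nr_args)
{
    long ret = syscall(__NR_io_uring_register, fd, opcode, arg, nr_args);

    return ret < 0 ? -errno : (int)ret;
}
```

A caller then tests the result itself, for example `if (ret < 0) fprintf(stderr, "%s\n", strerror(-ret));`, instead of consulting errno.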
2019-01-17 | Linux |