stenotype(1) stenotype(1)

stenotype - dump raw packets to disk

stenotype [-qv?] [--aiops=NUM] [--blocks=NUM] [--count=NUM]
          [--dir=STRING] [--fanout_id=NUM] [--fanout_type=NUM]
          [--fileage_sec=NUM] [--filesize_mb=NUM] [--filter=STRING]
          [--gid=NUM] [--iface=STRING] [--index_nicelevel=NUM] [--no_index]
          [--no_watchdogs] [--preallocate_file_mb=NUM] [--seccomp=STRING]
          [--threads=NUM] [--uid=NUM] [--help] [--usage]

Stenotype is a mechanism for quickly dumping raw packets to disk. It aims to have a simple interface (no file rotation: that's left as an exercise for the reader) while being very powerful.

stenotype uses a NIC->disk pipeline specifically designed to provide as fast an output to disk as possible while just using the kernel's built-in mechanisms.

1. NIC -> RAM: stenotype uses MMAP'd AF_PACKET with 1MB blocks and a high timeout to offload writing packets and deciding their layout to the kernel. The kernel packs all the packets it can into 1MB, then lets the userspace process know there's a block available in the MMAP'd ring buffer. Nicely, it guarantees no overruns (packets crossing the 1MB boundary) and good alignment to memory pages.

2. RAM -> Disk: Since the kernel already gave us a single 1MB block of packets that's nicely aligned, we can O_DIRECT write it straight to disk. This avoids any additional copying or kernel buffering. To keep sequential reads going strong, we do all disk IO asynchronously via io_submit (which works specifically for O_DIRECT files... joy!). Since the data is being written to disk asynchronously, we use the time it's writing to disk to do our own in-memory processing and indexing.
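
The alignment property that step 2 relies on can be checked in a few lines. O_DIRECT generally requires the buffer address and the transfer length to be multiples of the device's logical block size, and a page-aligned 1MB block satisfies both. The following is a minimal Python sketch, not stenotype's own code: it assumes an anonymous mmap as a stand-in for the kernel's AF_PACKET ring, and the name block_alignment is hypothetical.

```python
import ctypes
import mmap

BLOCK_SIZE = 1 << 20  # 1MB, the block size named in the text


def block_alignment():
    """Return (address % page size, length % page size) for a 1MB mmap'd block.

    Both values being zero is the property that makes a direct
    (unbuffered) write of the whole block possible.
    """
    # Anonymous mapping: a stand-in for one block of the packet ring.
    buf = mmap.mmap(-1, BLOCK_SIZE)
    # ctypes exposes the mapping's start address without copying it.
    view = ctypes.c_char.from_buffer(buf)
    addr = ctypes.addressof(view)
    # The mapping is released when buf and view go out of scope.
    return addr % mmap.PAGESIZE, BLOCK_SIZE % mmap.PAGESIZE


if __name__ == "__main__":
    print(block_alignment())
```

A buffer obtained from an ordinary heap allocation carries no such guarantee, which is why the block handed over by the kernel can be written as-is while arbitrary user buffers usually cannot.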

There are N async IO operations available, where N is set with --aiops. Once all N are in use, we block until the oldest one finishes, then reuse it. The whole pipeline consists of:

  • kernel gives userspace a 1MB block of packets
  • userspace iterates over packets in block, updates any indexes
  • userspace starts async IO operation to write block to disk
  • after N async IO operations are submitted, we synchronously wait for the least recent one to finish
  • when an async IO operation finishes, we release its 1MB block back to the kernel so it can be refilled with packets
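
The slot accounting in the list above can be sketched as a bounded queue of in-flight writes. This is a simulation under stated assumptions, not stenotype's implementation: worker threads stand in for io_submit, and run_pipeline together with its callback parameters (index_block, write_block, release_block) are hypothetical names.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor


def run_pipeline(blocks, aiops, write_block, index_block, release_block):
    """Index each block, write it asynchronously, and release it when done.

    At most `aiops` writes are in flight. Returns the peak number of
    writes that were in flight at once.
    """
    in_flight = deque()  # (future, block) pairs, oldest first
    peak = 0
    with ThreadPoolExecutor(max_workers=aiops) as pool:
        for block in blocks:
            # Indexing happens while earlier writes are still running.
            index_block(block)
            if len(in_flight) == aiops:
                # All slots used: wait for the least recent write.
                oldest, done = in_flight.popleft()
                oldest.result()
                release_block(done)
            in_flight.append((pool.submit(write_block, block), block))
            peak = max(peak, len(in_flight))
        # Drain the remaining writes in submission order.
        while in_flight:
            oldest, done = in_flight.popleft()
            oldest.result()
            release_block(done)
    return peak


if __name__ == "__main__":
    out = []
    data = [bytes([i]) * 8 for i in range(5)]
    print(run_pipeline(data, 2, out.append, lambda b: None, lambda b: None))
```

Because blocks are released in submission order, a slow write at the head of the queue holds back the release of later blocks. That matches the "least recent one" wording above and keeps the bookkeeping to a single queue.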

--aiops=NUM
    Max number of async IO operations
--blocks=NUM
    Total number of blocks to use, each is 1MB
--count=NUM
    Total number of packets to read, -1 to read forever
--dir=STRING
    Directory to store packet files in
--fanout_id=NUM
    If fanning out across processes, set this
--fanout_type=NUM
    TPACKET_V3 fanout type to fanout packets
--fileage_sec=NUM
    Files older than this many secs are rotated
--filesize_mb=NUM
    Max file size in MB before file is rotated
--filter=STRING
    BPF compiled filter used to filter which packets will be captured. This has to be a compiled BPF in hexadecimal, which can be obtained from a human readable filter expression using the provided compile_bpf.sh script.
--gid=NUM
    Drop privileges to this group
--iface=STRING
    Interface to read packets from
--index_nicelevel=NUM
    Nice level of indexing threads
--no_index
    Do not compute or write indexes
--no_watchdogs
    Don't start any watchdogs
--preallocate_file_mb=NUM
    When creating new files, preallocate to this many MB
-q
    Quiet logging. Each -q counteracts one -v
--seccomp=STRING
    Seccomp style, one of 'none', 'trace', 'kill'.
--threads=NUM
    Number of parallel threads to read packets with
--uid=NUM
    Drop privileges to this user
-v
    Verbose logging, may be given multiple times
-?, --help
    Give this help list
--usage
    Give a short usage message
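
As an illustration of how the options combine, the following Python sketch assembles a stenotype command line. It uses only flags listed in the synopsis. The helper name stenotype_argv and every value shown (interface eth0, directory /tmp/packets, the numbers) are hypothetical and chosen for illustration, not recommended settings.

```python
import shlex


def stenotype_argv(iface, directory, *, threads=None, aiops=None, blocks=None,
                   fileage_sec=None, filesize_mb=None, verbose=0):
    """Build an argument list for stenotype from a few documented flags."""
    argv = ["stenotype", f"--iface={iface}", f"--dir={directory}"]
    optional = {
        "threads": threads,
        "aiops": aiops,
        "blocks": blocks,
        "fileage_sec": fileage_sec,
        "filesize_mb": filesize_mb,
    }
    for name, value in optional.items():
        # Flags left as None are omitted, so stenotype's defaults apply.
        if value is not None:
            argv.append(f"--{name}={int(value)}")
    # -v may be given multiple times to raise verbosity.
    argv.extend(["-v"] * verbose)
    return argv


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(shlex.join(stenotype_argv("eth0", "/tmp/packets", threads=2, aiops=64)))
```

The resulting list can be passed to subprocess.run as-is. Capturing packets normally requires elevated privileges, which is what the --uid and --gid options exist to drop afterwards.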
09 April 2023 stenographer 1.0.1