ugrep, ug -- file pattern searcher
ugrep [OPTIONS] [-A NUM] [-B NUM] [-C NUM] [-y] [-Q|PATTERN] [-f FILE]
[-e PATTERN] [-N PATTERN] [-t TYPES] [-g GLOBS] [--sort[=KEY]]
[--color[=WHEN]|--colour[=WHEN]] [--pager[=COMMAND]] [FILE ...]
The ugrep utility searches any given input files, selecting
lines that match one or more patterns. By default, a pattern matches an
input line if the regular expression (RE) matches the input line. A pattern
matches multiple input lines if the RE in the pattern matches one or more
newlines in the input. An empty pattern matches every line. Each input line
that matches at least one of the patterns is written to the standard
output.
ugrep accepts input of various encoding formats and
normalizes the output to UTF-8. When a UTF byte order mark is present in the
input, the input is automatically normalized; otherwise, ugrep
assumes the input is ASCII, UTF-8, or raw binary. An input encoding format
may be specified with option --encoding.
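The byte order mark handling described above can be modelled with a short Python sketch. This is an illustration of the idea only, not ugrep's implementation: the function name normalize_to_utf8 is hypothetical, and input without a BOM is simply returned unchanged.

```python
import codecs

# The 4-byte UTF-32 marks are tested before the 2-byte UTF-16 marks,
# because the UTF-32 LE mark begins with the UTF-16 LE mark.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def normalize_to_utf8(data: bytes) -> bytes:
    """Re-encode data as UTF-8 (without BOM) when it starts with a UTF BOM.

    Data without a BOM is returned as-is, on the assumption that it is
    already ASCII, UTF-8, or raw binary.
    """
    for bom, encoding in _BOMS:
        if data.startswith(bom):
            text = data[len(bom):].decode(encoding)
            return text.encode("utf-8")
    return data
```

Input in an encoding that carries no BOM (for example ISO-8859-1) cannot be detected this way, which is why option --encoding exists.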
The ug command is equivalent to ugrep --config, which loads
the default configuration file and allows for customization; see
CONFIGURATION.
If no FILE arguments are specified and standard input is
read from a terminal, recursive searches are performed as if -R is
specified. To force reading from standard input, specify `-' as a
FILE argument.
Directories specified as FILE arguments are searched
without recursing into subdirectories, unless -R, -r, or
-2...-9 is specified.
Hidden files and directories are ignored in recursive searches.
Option -. (--hidden) includes hidden files and directories in
recursive searches.
A query interface is opened with -Q (--query) to
interactively specify search patterns and view search results. Note that a
PATTERN argument cannot be specified in this case. To specify one or
more patterns with -Q, use -e PATTERN.
For help, --help WHAT displays help on options
related to WHAT.
The following options are available:
- -A NUM,
--after-context=NUM
- Print NUM lines of trailing context after matching lines. Places a
--group-separator between contiguous groups of matches. See also
options -B, -C, and -y.
- -a, --text
- Process a binary file as if it were text. This is equivalent to the
--binary-files=text option. This option might output binary garbage
to the terminal, which can have problematic consequences if the terminal
driver interprets some of it as commands.
- --and [[-e]
PATTERN] ... -e PATTERN
- Specify additional patterns to match. Patterns must be specified with
-e. Each -e PATTERN following this option is
considered an alternative pattern to match, i.e. each -e is
interpreted as an OR pattern. For example, -e A -e
B --and -e C -e D matches lines
with (`A' or `B') and (`C' or `D'). Note that multiple -e
PATTERN are alternations that bind more tightly together than
--and. Option --stats displays the search patterns applied.
See also options --not, --andnot, and --bool.
- --andnot
[[-e] PATTERN] ...
- Combines --and --not. See also options --and,
--not, and --bool.
- -B NUM,
--before-context=NUM
- Print NUM lines of leading context before matching lines. Places a
--group-separator between contiguous groups of matches. See also
options -A, -C, and -y.
- -b,
--byte-offset
- The offset in bytes of a matched line is displayed in front of the
respective matched line. If -u is specified, displays the offset
for each pattern matched on the same line. Byte offsets are exact for
ASCII, UTF-8, and raw binary input. Otherwise, the byte offset in the
UTF-8 normalized input is displayed.
- --binary-files=TYPE
- Controls searching and reporting pattern matches in binary files. TYPE can
be `binary', `without-match', `text', `hex', or `with-hex'. The default
is `binary' to search binary files and to report a match without
displaying the match. `without-match' ignores binary matches. `text'
treats all binary files as text, which might output binary garbage to the
terminal, which can have problematic consequences if the terminal driver
interprets some of it as commands. `hex' reports all matches in
hexadecimal. `with-hex' only reports binary matches in hexadecimal,
leaving text matches alone. A match is considered binary when matching a
zero byte or invalid UTF. Short options are -a, -I,
-U, -W, and -X.
- --bool,
-%
- Specifies Boolean search patterns. A Boolean search pattern is composed of
`AND', `OR', `NOT' operators and grouping with `(' `)'. Spacing between
subpatterns is the same as `AND', `|' is the same as `OR', and a `-' is
the same as `NOT'. The `OR' operator binds more tightly than `AND'. For
example, --bool 'A|B C|D' matches lines with (`A' or `B') and (`C'
or `D'), --bool 'A -B' matches lines with `A' and not `B'.
Operators `AND', `OR', `NOT' require proper spacing. For example,
--bool 'A OR B AND C OR D' matches lines with (`A' or `B') and (`C'
or `D'), --bool 'A AND NOT B' matches lines with `A' without `B'.
Quoted subpatterns are matched literally as strings. For example,
--bool 'A "AND"|"OR"' matches lines with `A'
and also either `AND' or `OR'. Parentheses are used for grouping. For
example, --bool '(A B)|C' matches lines with `A' and `B', or lines
with `C'. Note that all subpatterns in a Boolean search pattern are
regular expressions, unless option -F is used. Options -E,
-F, -G, -P, and -Z can be combined with
--bool to match subpatterns as strings or regular expressions
(-E is the default.) This option does not apply to -f
FILE patterns. Option --stats displays the search patterns
applied. See also options --and, --andnot, and
--not.
- --break
- Adds a line break between results from different files.
- -C NUM,
--context=NUM
- Print NUM lines of leading and trailing context surrounding each match.
Places a --group-separator between contiguous groups of matches.
See also options -A, -B, and -y.
- -c, --count
- Only a count of selected lines is written to standard output. If -o
or -u is specified, counts the number of patterns matched. If
-v is specified, counts the number of non-matching lines.
- --color[=WHEN],
--colour[=WHEN]
- Mark up the matching text with the expression stored in the GREP_COLOR or
GREP_COLORS environment variable. WHEN can be `never', `always', or
`auto', where `auto' marks up matches only when output on a terminal. The
default is `auto'.
- --colors=COLORS,
--colours=COLORS
- Use COLORS to mark up text. COLORS is a colon-separated list of one or
more parameters `sl=' (selected line), `cx=' (context line), `mt='
(matched text), `ms=' (match selected), `mc=' (match context), `fn=' (file
name), `ln=' (line number), `cn=' (column number), `bn=' (byte offset),
`se=' (separator). Parameter values are ANSI SGR color codes or `k'
(black), `r' (red), `g' (green), `y' (yellow), `b' (blue), `m' (magenta),
`c' (cyan), `w' (white). Upper case specifies background colors. A `+'
qualifies a color as bright. A foreground and a background color may be
combined with font properties `n' (normal), `f' (faint), `h' (highlight),
`i' (invert), `u' (underline). Parameter `hl' enables file name
hyperlinks. Parameter `rv' reverses the `sl=' and `cx=' parameters with
option -v. Selectively overrides GREP_COLORS.
- --config[=FILE],
---[FILE]
- Use configuration FILE. The default FILE is `.ugrep'. The working
directory is checked first for FILE, then the home directory. The options
specified in the configuration FILE are parsed first, followed by the
remaining options specified on the command line.
- --confirm
- Confirm actions in -Q query mode. The default is confirm.
- --cpp
- Output file matches in C++. See also options --format and
-u.
- --csv
- Output file matches in CSV. If -H, -n, -k, or
-b is specified, additional values are output. See also options
--format and -u.
- -D ACTION,
--devices=ACTION
- If an input file is a device, FIFO or socket, use ACTION to process it. By
default, ACTION is `skip', which means that devices are silently skipped.
If ACTION is `read', devices are read just as if they were ordinary
files.
- -d ACTION,
--directories=ACTION
- If an input file is a directory, use ACTION to process it. By default,
ACTION is `skip', i.e., silently skip directories unless specified on the
command line. If ACTION is `read', warn when directories are read as
input. If ACTION is `recurse', read all files under each directory,
recursively, following symbolic links only if they are on the command
line. This is equivalent to the -r option. If ACTION is
`dereference-recurse', read all files under each directory, recursively,
following symbolic links. This is equivalent to the -R option.
- --depth=[MIN,][MAX],
-1, -2 ... -9, --10, --11 ...
- Restrict recursive searches from MIN to MAX directory levels deep, where
-1 (--depth=1) searches the specified path without recursing
into subdirectories. Note that -3 -5, -3-5, or
-35 searches 3 to 5 levels deep. Enables -R if -R or
-r is not specified.
- --dotall
- Dot `.' in regular expressions matches anything, including newline. Note
that `.*' matches all input and should not be used.
- -E,
--extended-regexp
- Interpret patterns as extended regular expressions (EREs). This is the
default.
- -e PATTERN,
--regexp=PATTERN
- Specify a PATTERN used during the search of the input: an input line is
selected if it matches any of the specified patterns. Note that longer
patterns take precedence over shorter patterns. This option is most useful
when multiple -e options are used to specify multiple patterns,
when a pattern begins with a dash (`-'), or to specify a pattern after option
-f or after the FILE arguments.
- --encoding=ENCODING
- The encoding format of the input, where ENCODING can be: `binary',
`ASCII', `UTF-8', `UTF-16', `UTF-16BE', `UTF-16LE', `UTF-32', `UTF-32BE',
`UTF-32LE', `LATIN1', `ISO-8859-1', `ISO-8859-2', `ISO-8859-3',
`ISO-8859-4', `ISO-8859-5', `ISO-8859-6', `ISO-8859-7', `ISO-8859-8',
`ISO-8859-9', `ISO-8859-10', `ISO-8859-11', `ISO-8859-13', `ISO-8859-14',
`ISO-8859-15', `ISO-8859-16', `MAC', `MACROMAN', `EBCDIC', `CP437',
`CP850', `CP858', `CP1250', `CP1251', `CP1252', `CP1253', `CP1254',
`CP1255', `CP1256', `CP1257', `CP1258', `KOI8-R', `KOI8-U',
`KOI8-RU'.
- --exclude=GLOB
- Skip files whose name matches GLOB using wildcard matching, same as
-g ^GLOB. GLOB can use **, *, ?, and [...] as wildcards, and \ to
quote a wildcard or backslash character literally. When GLOB contains a
`/', full pathnames are matched. Otherwise basenames are matched. When
GLOB ends with a `/', directories are excluded as if --exclude-dir
is specified. Otherwise files are excluded. Note that --exclude
patterns take priority over --include patterns. GLOB should be
quoted to prevent shell globbing. This option may be repeated.
- --exclude-dir=GLOB
- Exclude directories whose name matches GLOB from recursive searches, same
as -g ^GLOB/. GLOB can use **, *, ?, and [...] as wildcards, and \
to quote a wildcard or backslash character literally. When GLOB contains a
`/', full pathnames are matched. Otherwise basenames are matched. Note
that --exclude-dir patterns take priority over --include-dir
patterns. GLOB should be quoted to prevent shell globbing. This option may
be repeated.
- --exclude-from=FILE
- Read the globs from FILE and skip files and directories whose name matches
one or more globs (as if specified by --exclude and
--exclude-dir). Lines starting with a `#' and empty lines in FILE
are ignored. When FILE is a `-', standard input is read. This option may
be repeated.
- --exclude-fs=MOUNTS
- Exclude file systems specified by MOUNTS from recursive searches. MOUNTS
is a comma-separated list of mount points or pathnames of directories on
file systems. Note that --exclude-fs mounts take priority over
--include-fs mounts. This option may be repeated.
- -F,
--fixed-strings
- Interpret pattern as a set of fixed strings, separated by newlines, any of
which is to be matched. This makes ugrep behave as fgrep. If a PATTERN is
specified, or -e PATTERN or -N PATTERN, then
this option has no effect on -f FILE patterns to allow
-f FILE patterns to narrow or widen the scope of the PATTERN
search.
- -f FILE,
--file=FILE
- Read newline-separated patterns from FILE. White space in patterns is
significant. Empty lines in FILE are ignored. If FILE does not exist, the
GREP_PATH environment variable is used as path to FILE. If that fails,
looks for FILE in /usr/local/share/ugrep/patterns. When FILE is a `-',
standard input is read. Empty files contain no patterns; thus nothing is
matched. This option may be repeated.
- --filter=COMMANDS
- Filter files through the specified COMMANDS first before searching.
COMMANDS is a comma-separated list of `exts:command [option ...]', where
`exts' is a comma-separated list of filename extensions and `command' is a
filter utility. The filter utility should read from standard input and
write to standard output. Files matching one of `exts' are filtered. When
`exts' is `*', files with non-matching extensions are filtered. One or
more `option' separated by spacing may be specified, which are passed
verbatim to the command. A `%' as `option' expands into the pathname to
search. For example, --filter='pdf:pdftotext % -' searches PDF
files. The `%' expands into a `-' when searching standard input. Option
--label=.ext may be used to specify extension `ext' when searching
standard input.
- --filter-magic-label=[+]LABEL:MAGIC
- Associate LABEL with files whose signature "magic bytes" match
the MAGIC regex pattern. Only files that have no filename extension are
labeled, unless +LABEL is specified. When LABEL matches an extension
specified in --filter=COMMANDS, the corresponding command is
invoked. This option may be repeated.
- --format=FORMAT
- Output FORMAT-formatted matches. For example, --format='%f:%n:%O%~'
outputs matching lines `%O' with filename `%f' and line number `%n'
followed by a newline `%~'. Context options -A, -B,
-C, and -y are ignored. See `man ugrep' section FORMAT.
- --free-space
- Spacing (blanks and tabs) in regular expressions is ignored.
- -G,
--basic-regexp
- Interpret pattern as a basic regular expression, i.e. make ugrep behave as
traditional grep.
- -g GLOBS,
--glob=GLOBS
- Search only files whose name matches the specified comma-separated list of
GLOBS, same as --include='glob' for each `glob' in GLOBS. When a
`glob' is preceded by a `!' or a `^', skip files whose name matches
`glob', same as --exclude='glob'. When `glob' contains a `/', full
pathnames are matched. Otherwise basenames are matched. When `glob' ends
with a `/', directories are matched, same as --include-dir='glob'
and --exclude-dir='glob'. A leading `/' matches the working
directory. This option may be repeated and may be combined with options
-M, -O and -t to expand the recursive search.
- --group-separator[=SEP]
- Use SEP as a group separator for context options -A, -B, and
-C. The default is a double hyphen (`--').
- -H,
--with-filename
- Always print the filename with output lines. This is the default when
there is more than one file to search.
- -h,
--no-filename
- Never print filenames with output lines. This is the default when there is
only one file (or only standard input) to search.
- --heading,
-+
- Group matches per file. Adds a heading and a line break between results
from different files.
- --help [WHAT],
-? [WHAT]
- Display a help message, specifically on WHAT when specified.
- --hexdump=[1-8][b][c][h]
- Output matches in 1 to 8 columns of 8 hexadecimal octets. The default is 2
columns or 16 octets per line. Option `b' removes all space breaks, `c'
removes the character column, and `h' removes the hex spacing. Enables
-X if -W or -X is not specified.
- --hidden,
-.
- Search hidden files and directories.
- --hyperlink
- Hyperlinks are enabled for file names when colors are enabled. Same as
--colors=hl.
- -I,
--ignore-binary
- Ignore matches in binary files. This option is equivalent to the
--binary-files=without-match option.
- -i,
--ignore-case
- Perform case insensitive matching. By default, ugrep is case sensitive. By
default, this option applies to ASCII letters only. Use options -P
and -i for Unicode case insensitive matching.
- --ignore-files[=FILE]
- Ignore files and directories matching the globs in each FILE that is
encountered in recursive searches. The default FILE is `.gitignore'.
Matching files and directories located in the directory of a FILE's
location and in directories below are ignored by temporarily overriding
the --exclude and --exclude-dir globs. Files and directories
that are explicitly specified as command line arguments are never ignored.
This option may be repeated.
- --include=GLOB
- Search only files whose name matches GLOB using wildcard matching, same as
-g GLOB. GLOB can use **, *, ?, and [...] as wildcards, and
\ to quote a wildcard or backslash character literally. When GLOB contains
a `/', full pathnames are matched. Otherwise basenames are matched. When
GLOB ends with a `/', directories are included as if --include-dir
is specified. Otherwise files are included. Note that --exclude
patterns take priority over --include patterns. GLOB should be
quoted to prevent shell globbing. This option may be repeated.
- --include-dir=GLOB
- Only directories whose name matches GLOB are included in recursive
searches, same as -g GLOB/. GLOB can use **, *, ?, and [...]
as wildcards, and \ to quote a wildcard or backslash character literally.
When GLOB contains a `/', full pathnames are matched. Otherwise basenames
are matched. Note that --exclude-dir patterns take priority over
--include-dir patterns. GLOB should be quoted to prevent shell
globbing. This option may be repeated.
- --include-from=FILE
- Read the globs from FILE and search only files and directories whose name
matches one or more globs (as if specified by --include and
--include-dir). Lines starting with a `#' and empty lines in FILE
are ignored. When FILE is a `-', standard input is read. This option may
be repeated.
- --include-fs=MOUNTS
- Only file systems specified by MOUNTS are included in recursive searches.
MOUNTS is a comma-separated list of mount points or pathnames of
directories on file systems. --include-fs=. restricts recursive
searches to the file system of the working directory only. Note that
--exclude-fs mounts take priority over --include-fs mounts.
This option may be repeated.
- -J NUM,
--jobs=NUM
- Specifies the number of threads spawned to search files. By default an
optimum number of threads is spawned to search files simultaneously.
-J1 disables threading: files are searched in the same order as
specified.
- -j,
--smart-case
- Perform case insensitive matching like option -i, unless a pattern
is specified with a literal ASCII upper case letter.
- --json
- Output file matches in JSON. If -H, -n, -k, or
-b is specified, additional values are output. See also options
--format and -u.
- -K
FIRST[,LAST],
--range=FIRST[,LAST]
- Start searching at line FIRST, stop at line LAST when specified.
- -k,
--column-number
- The column number of a matched pattern is displayed in front of the
respective matched line, starting at column 1. Tabs are expanded when
columns are counted, see also option --tabs.
- -L,
--files-without-match
- Only the names of files not containing selected lines are written to
standard output. Pathnames are listed once per file searched. If the
standard input is searched, the string ``(standard input)'' is
written.
- -l,
--files-with-matches
- Only the names of files containing selected lines are written to standard
output. ugrep will only search a file until a match has been found, making
searches potentially less expensive. Pathnames are listed once per file
searched. If the standard input is searched, the string ``(standard
input)'' is written.
- --label=LABEL
- Displays the LABEL value when input is read from standard input where a
file name would normally be printed in the output. Associates a filename
extension with standard input when LABEL has a suffix. The default value
is `(standard input)'.
- --line-buffered
- Force output to be line buffered instead of block buffered.
- -M MAGIC,
--file-magic=MAGIC
- Only files matching the signature pattern MAGIC are searched. The
signature "magic bytes" at the start of a file are compared to
the MAGIC regex pattern. When matching, the file will be searched. When
MAGIC is preceded by a `!' or a `^', skip files with matching MAGIC
signatures. This option may be repeated and may be combined with options
-O and -t to expand the search. Every file on the search
path is read, making searches potentially more expensive.
- -m NUM,
--max-count=NUM
- Stop reading the input after NUM matches in each input file.
- --match
- Match all input. Same as specifying an empty pattern to search.
- --max-files=NUM
- Restrict the number of files matched to NUM. Note that --sort or
-J1 may be specified to produce replicable results. If
--sort is specified, the number of threads spawned is limited to
NUM.
- --mmap[=MAX]
- Use memory maps to search files. By default, memory maps are used under
certain conditions to improve performance. When MAX is specified, use up
to MAX mmap memory per thread.
- -N PATTERN,
--neg-regexp=PATTERN
- Specify a negative PATTERN used during the search of the input: an input
line is selected only if it matches any of the specified patterns unless a
subpattern of PATTERN. Same as -e (?^PATTERN). Negative PATTERN
matches are essentially removed before any other patterns are matched.
Note that longer patterns take precedence over shorter patterns. This
option may be repeated.
- -n,
--line-number
- Each output line is preceded by its relative line number in the file,
starting at line 1. The line number counter is reset for each file
processed.
- --no-group-separator
- Removes the group separator line from the output for context options
-A, -B, and -C.
- --not [-e]
PATTERN
- Specifies that PATTERN should not match. Note that -e A
--not -e B matches lines with `A' or lines without a
`B'. To match lines with `A' that have no `B', specify -e A
--andnot -e B. Option --stats displays the
search patterns applied. See also options --and, --andnot,
and --bool.
- -O EXTENSIONS,
--file-extension=EXTENSIONS
- Search only files whose filename extensions match the specified
comma-separated list of EXTENSIONS, same as --include='*.ext' for
each `ext' in EXTENSIONS. When an `ext' is preceded by a `!' or a `^',
skip files whose filename extensions match `ext', same as
--exclude='*.ext'. This option may be repeated and may be combined
with options -g, -M and -t to expand the recursive
search.
- -o,
--only-matching
- Print only the matching part of lines. When multiple lines match, the line
numbers with option -n are displayed using `|' as the field
separator for each additional line matched by the pattern. If -u is
specified, ungroups multiple matches on the same line. This option cannot
be combined with options -A, -B, -C, -v, and
-y.
- --only-line-number
- The line number of the matching line in the file is output without
displaying the match. The line number counter is reset for each file
processed.
- -P,
--perl-regexp
- Interpret PATTERN as a Perl regular expression using PCRE2.
- -p,
--no-dereference
- If -R or -r is specified, no symbolic links are followed,
even when they are specified on the command line.
- --pager[=COMMAND]
- When output is sent to the terminal, uses COMMAND to page through the
output. The default COMMAND is `less -R'. Enables --heading
and --line-buffered.
- --pretty
- When output is sent to a terminal, enables --color,
--heading, -n, --sort and -T when not
explicitly disabled or set.
- -Q[DELAY],
--query[=DELAY]
- Query mode: user interface to perform interactive searches. This mode
requires an ANSI capable terminal. An optional DELAY argument may be
specified to reduce or increase the response time to execute searches
after the last key press, in increments of 100ms, where the default is 5
(0.5s delay). No whitespace may be given between -Q and its
argument DELAY. Initial patterns may be specified with -e
PATTERN, i.e. a PATTERN argument requires option -e. Press
F1 or CTRL-Z to view the help screen. Press F2 or CTRL-Y to invoke a
command to view or edit the file shown at the top of the screen. The
command can be specified with option --view, or defaults to
environment variable PAGER if defined, or EDITOR. Press Tab and Shift-Tab
to navigate directories and to select a file to search. Press Enter to
select lines to output. Press ALT-l for option -l to list files,
ALT-n for -n, etc. Non-option commands include ALT-] to increase
fuzziness and ALT-} to increase context. Enables --heading. See
also options --confirm and --view.
- -q, --quiet,
--silent
- Quiet mode: suppress all output. ugrep will only search until a match has
been found.
- -R,
--dereference-recursive
- Recursively read all files under each directory. Follow all symbolic
links, unlike -r. When -J1 is specified, files are searched
in the same order as specified. Note that when no FILE arguments are
specified and input is read from a terminal, recursive searches are
performed as if -R is specified.
- -r,
--recursive
- Recursively read all files under each directory, following symbolic links
only if they are on the command line. When -J1 is specified, files
are searched in the same order as specified.
- -S,
--dereference
- If -r is specified, all symbolic links are followed, like
-R. The default is not to follow symbolic links.
- -s,
--no-messages
- Silent mode: nonexistent and unreadable files are ignored, i.e. their
error messages are suppressed.
- --save-config[=FILE]
- Save configuration FILE. By default `.ugrep' is saved. If FILE is a `-',
write the configuration to standard output.
- --separator[=SEP]
- Use SEP as field separator between file name, line number, column number,
byte offset, and the matched line. The default is a colon (`:').
- --sort[=KEY]
- Displays matching files in the order specified by KEY in recursive
searches. KEY can be `name' to sort by pathname (default), `best' to sort
by best match with option -Z (sort by best match requires two
passes over the input files), `size' to sort by file size, `used' to sort
by last access time, `changed' to sort by last modification time, and
`created' to sort by creation time. Sorting is reversed with `rname',
`rbest', `rsize', `rused', `rchanged', or `rcreated'. Archive contents are
not sorted. Subdirectories are sorted and displayed after matching files.
FILE arguments are searched in the same order as specified. Normally ugrep
displays matches in no particular order to improve performance.
- --stats
- Output statistics on the number of files and directories searched, and the
inclusion and exclusion constraints applied.
- -T,
--initial-tab
- Add a tab space to separate the file name, line number, column number, and
byte offset with the matched line.
- -t TYPES,
--file-type=TYPES
- Search only files associated with TYPES, a comma-separated list of file
types. Each file type corresponds to a set of filename extensions passed
to option -O. For capitalized file types, the search is expanded to
include files with matching file signature magic bytes, as if passed to
option -M. When a type is preceded by a `!' or a `^', excludes
files of the specified type. This option may be repeated. The possible
file types can be (where -tlist displays a detailed list):
`actionscript', `ada', `asm', `asp', `aspx', `autoconf', `automake',
`awk', `Awk', `basic', `batch', `bison', `c', `c++', `clojure', `csharp',
`css', `csv', `dart', `Dart', `delphi', `elisp', `elixir', `erlang',
`fortran', `gif', `Gif', `go', `groovy', `gsp', `haskell', `html', `jade',
`java', `jpeg', `Jpeg', `js', `json', `jsp', `julia', `kotlin', `less',
`lex', `lisp', `lua', `m4', `make', `markdown', `matlab', `node', `Node',
`objc', `objc++', `ocaml', `parrot', `pascal', `pdf', `Pdf', `perl',
`Perl', `php', `Php', `png', `Png', `prolog', `python', `Python', `r',
`rpm', `Rpm', `rst', `rtf', `Rtf', `ruby', `Ruby', `rust', `scala',
`scheme', `shell', `Shell', `smalltalk', `sql', `svg', `swift', `tcl',
`tex', `text', `tiff', `Tiff', `tt', `typescript', `verilog', `vhdl',
`vim', `xml', `Xml', `yacc', `yaml'.
- --tabs[=NUM]
- Set the tab size to NUM to expand tabs for option -k. The value of
NUM may be 1, 2, 4, or 8. The default tab size is 8.
- --tag[=TAG[,END]]
- Disables colors to mark up matches with TAG. END marks the end of a match
if specified, otherwise TAG. The default is `___'.
- -U, --binary
- Disables Unicode matching for binary file matching, forcing PATTERN to
match bytes, not Unicode characters. For example, -U '\xa3' matches
byte A3 (hex) instead of the Unicode code point U+00A3 represented by the
UTF-8 sequence C2 A3. See also option --dotall.
- -u, --ungroup
- Do not group multiple pattern matches on the same matched line. Output the
matched line again for each additional pattern match, using `+' as the
field separator.
- -V, --version
- Display version information and exit.
- -v,
--invert-match
- Selected lines are those not matching any of the specified patterns.
- --view[=COMMAND]
- Use COMMAND to view/edit a file in query mode when pressing CTRL-Y.
- -W,
--with-hex
- Output binary matches in hexadecimal, leaving text matches alone. This
option is equivalent to the --binary-files=with-hex option.
- -w,
--word-regexp
- The PATTERN is searched for as a word, such that the matching text is
preceded by a non-word character and is followed by a non-word character.
Word characters are letters, digits, and the underscore. With option
-P, word characters are Unicode letters, digits, and underscore.
This option has no effect if -x is also specified. If a PATTERN is
specified, or -e PATTERN or -N PATTERN, then
this option has no effect on -f FILE patterns to allow
-f FILE patterns to narrow or widen the scope of the PATTERN
search.
- -X, --hex
- Output matches in hexadecimal. This option is equivalent to the
--binary-files=hex option. See also option --hexdump.
- -x,
--line-regexp
- Select only those matches that exactly match the whole line, as if the
patterns are surrounded by ^ and $. If a PATTERN is specified, or
-e PATTERN or -N PATTERN, then this option has
no effect on -f FILE patterns to allow -f FILE
patterns to narrow or widen the scope of the PATTERN search.
- --xml
- Output file matches in XML. If -H, -n, -k, or
-b is specified, additional values are output. See also options
--format and -u.
- -Y, --empty
- Permits empty matches. By default, empty matches are disabled, unless a
pattern begins with `^' or ends with `$'. With this option, empty-matching
patterns such as x? and x* match all input, not only lines containing the
character `x'.
- -y,
--any-line
- Any matching or non-matching line is output. Non-matching lines are output
with the `-' separator as context of the matching lines. See also options
-A, -B, and -C.
- -Z[[+-~]MAX],
--fuzzy[=[+-~]MAX]
- Fuzzy mode: report approximate pattern matches within MAX errors. By
default, MAX is 1: one deletion, insertion or substitution is allowed.
When `+' and/or `-' precede MAX, only insertions and/or deletions are
allowed, respectively. When `~' precedes MAX, substitution counts as one
error. For example, -Z+~3 allows up to three insertions or
substitutions, but no deletions. The first character of an approximate
match always matches the beginning of a pattern. Option --sort=best
orders matching files by best match. No whitespace may be given between
-Z and its argument.
- -z,
--decompress
- Decompress files to search, when compressed. Archives (.cpio, .pax, .tar
and .zip) and compressed archives (e.g. .taz, .tgz, .tpz, .tbz, .tbz2,
.tb2, .tz2, .tlz, .txz, .tzst) are searched and matching pathnames of
files in archives are output in braces. If -g, -O,
-M, or -t is specified, searches files within archives whose
name matches globs, matches file name extensions, matches file signature
magic bytes, or matches file types, respectively. Supported compression
formats: gzip (.gz), compress (.Z), zip, bzip2 (requires suffix .bz, .bz2,
.bzip2, .tbz, .tbz2, .tb2, .tz2), lzma and xz (requires suffix .lzma,
.tlz, .xz, .txz), lz4 (requires suffix .lz4), zstd (requires suffix .zst,
.zstd, .tzst).
- -0, --null
- Prints a zero-byte (NUL) after the file name. This option can be used with
commands such as `find -print0' and `xargs -0' to process
arbitrary file names.
A `--' signals the end of options; the rest of the parameters are
FILE arguments, allowing filenames to begin with a `-' character.
Long options may start with `--no-' to disable, when
applicable.
The regular expression pattern syntax is an extended form of the
POSIX ERE syntax. For an overview of the syntax see README.md or visit:
- https://github.com/Genivia/ugrep
Note that `.' matches any non-newline character. Pattern `\n'
matches a newline character. Multiple lines may be matched with patterns
that match one or more newline characters.
The ugrep utility exits with one of the following
values:
- 0
- One or more lines were selected.
- 1
- No lines were selected.
- >1
- An error occurred.
If -q or --quiet or --silent is used and a
line is selected, the exit status is 0 even if an error occurred.
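As a minimal sketch of how a script might branch on these exit values, the Python helper below maps a status to one of three outcomes. The function names `describe_exit_status' and `quiet_search' are hypothetical; only the 0, 1, and >1 convention and the options -q, -e and `--' come from this manual.

```python
import shutil
import subprocess


def describe_exit_status(status):
    """Classify an exit status using the 0 / 1 / >1 convention above."""
    if status == 0:
        return "selected"       # one or more lines were selected
    if status == 1:
        return "none selected"  # no lines were selected
    return "error"              # any other value is treated as an error


def quiet_search(pattern, path, program="ugrep"):
    """Run `program -q -e PATTERN -- PATH` and classify the result.

    Returns None when the program is not installed. With -q, a status of 0
    means a line was selected even if an error also occurred.
    """
    if shutil.which(program) is None:
        return None
    completed = subprocess.run([program, "-q", "-e", pattern, "--", path])
    return describe_exit_status(completed.returncode)
```

A caller would typically treat "none selected" as a normal outcome and only report "error" to the user.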
The ug command is intended for context-dependent
interactive searching and is equivalent to the ugrep --config command
to load the default configuration file `.ugrep' when present in the working
directory or in the home directory.
A configuration file contains one `NAME=VALUE' pair per line, where
`NAME' is the name of a long option (without `--') and `=VALUE' is an
argument, which is optional and may be omitted depending on the option.
Empty lines and lines starting with a `#' are ignored.
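The file format described above can be modelled in a few lines of Python. This is only an illustrative sketch, not ugrep's own parser: the function name `parse_config' is hypothetical, and stripping surrounding whitespace from each line is an assumption.

```python
def parse_config(text):
    """Turn configuration file text into the equivalent long options."""
    options = []
    for raw_line in text.splitlines():
        line = raw_line.strip()          # assumption: surrounding blanks are ignored
        if not line or line.startswith("#"):
            continue                     # empty lines and comments are skipped
        name, equals, value = line.partition("=")
        # NAME alone becomes --NAME; NAME=VALUE becomes --NAME=VALUE
        options.append("--" + name + (equals + value if equals else ""))
    return options


example = """\
# search hidden files and order results by best match
hidden

sort=best
color=auto
"""
print(parse_config(example))
```

Each resulting string corresponds to an option that could also be given on the command line, where it would be parsed after the configuration file.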
The --config=FILE option and its abbreviated form
---FILE load the specified configuration file located in the
working directory or, when not found, located in the home directory. An
error is produced when FILE is not found or cannot be read.
Command line options are parsed in the following order: the
configuration file is loaded first, followed by the remaining options and
arguments on the command line.
The --save-config option saves a `.ugrep' configuration
file to the working directory with a subset of the current options. The
--save-config=FILE option saves the configuration to
FILE. The configuration is written to standard output when
FILE is a `-'.
Globbing is used by options -g, --include,
--include-dir, --include-from, --exclude,
--exclude-dir, and --exclude-from to match pathnames and basenames
in recursive searches. Glob arguments for these options should be quoted to
prevent shell globbing.
Globbing supports gitignore syntax and the corresponding matching
rules. When a glob ends in a path separator it matches directories as if
--include-dir or --exclude-dir is specified. When a glob
contains a path separator `/', the full pathname is matched. Otherwise the
basename of a file or directory is matched. For example, *.h matches
foo.h and bar/foo.h. bar/*.h matches bar/foo.h but not foo.h and not
bar/bar/foo.h. Use a leading `/' to force /*.h to match foo.h but not
bar/foo.h.
When a glob starts with a `^' or a `!' as in
-g^GLOB, the match is negated. Likewise, a `!' (but not a `^')
may be used with globs in the files specified with --include-from,
--exclude-from, and --ignore-files to negate the glob match.
Empty lines or lines starting with a `#' are ignored.
Glob Syntax and Conventions
- *
- Matches anything except a /.
- ?
- Matches any one character except a /.
- [a-z]
- Matches one character in the selected range of characters.
- [^a-z]
- Matches one character not in the selected range of characters.
- [!a-z]
- Matches one character not in the selected range of characters.
- /
- When used at the beginning of a glob, matches if pathname has no /. When
used at the end of a glob, matches directories only.
- **/
- Matches zero or more directories.
- /**
- When used at the end of a glob, matches everything after the /.
- \?
- Matches a ? (or any character specified after the backslash).
Glob Matching Examples
- *
- Matches a, b, x/a, x/y/b
- a
- Matches a, x/a, x/y/a, but not b, x/b, a/a/b
- /*
- Matches a, b, but not x/a, x/b, x/y/a
- /a
- Matches a, but not x/a, x/y/a
- a?b
- Matches axb, ayb, but not a, b, ab, a/b
- a[xy]b
- Matches axb, ayb but not a, b, azb
- a[a-z]b
- Matches aab, abb, acb, azb, but not a, b, a3b, aAb, aZb
- a[^xy]b
- Matches aab, abb, acb, azb, but not a, b, axb, ayb
- a[^a-z]b
- Matches a3b, aAb, aZb but not a, b, aab, abb, acb, azb
- a/*/b
- Matches a/x/b, a/y/b, but not a/b, a/x/y/b
- **/a
- Matches a, x/a, x/y/a, but not b, x/b.
- a/**/b
- Matches a/b, a/x/b, a/x/y/b, but not x/a/b, a/b/x
- a/**
- Matches a/x, a/y, a/x/y, but not a, b/x
- a\?b
- Matches a?b, but not a, b, ab, axb, a/b
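The syntax and examples above can be checked against a small model. The Python sketch below translates a glob into a regular expression following the documented rules; it is an approximation written for illustration, not ugrep's matcher. The names `glob_to_regex' and `glob_match' are hypothetical, bracket expressions are passed through unchanged, and globs ending in `/' (directories only) are not modelled.

```python
import re


def glob_to_regex(glob):
    """Translate a glob (without a leading /) into a regex string."""
    parts = []
    i, n = 0, len(glob)
    while i < n:
        ch = glob[i]
        at_segment_start = i == 0 or glob[i - 1] == "/"
        if at_segment_start and glob.startswith("**/", i):
            parts.append("(?:[^/]+/)*")          # **/ : zero or more directories
            i += 3
        elif glob.startswith("/**", i) and i + 3 == n:
            parts.append("/.+")                  # /** : everything after the /
            i += 3
        elif ch == "*":
            parts.append("[^/]*")                # anything except a /
            i += 1
        elif ch == "?":
            parts.append("[^/]")                 # one character except a /
            i += 1
        elif ch == "[":
            end = glob.find("]", i + 2)
            body = glob[i + 1:end] if end != -1 else ""
            negate = body[:1] in ("^", "!")
            members = body[1:] if negate else body
            if members:
                parts.append("[" + ("^" if negate else "") + members + "]")
                i = end + 1
            else:
                parts.append(re.escape(ch))      # unterminated or empty: literal [
                i += 1
        elif ch == "\\" and i + 1 < n:
            parts.append(re.escape(glob[i + 1])) # escaped character is literal
            i += 2
        else:
            parts.append(re.escape(ch))
            i += 1
    return "".join(parts)


def glob_match(glob, pathname):
    """Return True when pathname matches glob under the rules above."""
    if glob.startswith("/"):
        pattern, target = glob[1:], pathname     # leading /: anchor at the top
    elif "/" in glob:
        pattern, target = glob, pathname         # contains /: match full pathname
    else:
        pattern, target = glob, pathname.rsplit("/", 1)[-1]  # match basename
    return re.fullmatch(glob_to_regex(pattern), target) is not None
```

For instance, `glob_match("bar/*.h", "bar/foo.h")' holds in this model while `glob_match("bar/*.h", "bar/bar/foo.h")' does not, mirroring the example given earlier.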
Note that exclude glob patterns take priority over include glob
patterns when specified with options -g, --exclude, --exclude-dir, --include
and --include-dir.
Glob patterns specified with prefix `!' in any of the files
associated with --include-from, --exclude-from and --ignore-files will
negate a previous glob match. That is, any matching file or directory
excluded by a previous glob pattern specified in the files associated with
--exclude-from or --ignore-files will become included again. Likewise, any
matching file or directory included by a previous glob pattern specified in
the files associated with --include-from will become excluded again.
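The negation rule for exclude files can be sketched as a pass over the lines in which a later line overrides an earlier one. The Python below is a simplified model under stated assumptions: it matches basenames only, uses the standard `fnmatch' module instead of the full glob syntax, and the function name `is_excluded' is hypothetical.

```python
from fnmatch import fnmatchcase


def is_excluded(basename, lines):
    """Apply exclude-file lines in order; a `!' line re-includes a match."""
    excluded = False
    for raw_line in lines:
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue                      # empty lines and comments are ignored
        if line.startswith("!"):
            if fnmatchcase(basename, line[1:]):
                excluded = False          # negated glob: included again
        elif fnmatchcase(basename, line):
            excluded = True               # plain glob: excluded
    return excluded


exclude_lines = ["# skip logs, except one", "*.log", "", "!keep.log"]
print(is_excluded("debug.log", exclude_lines))
print(is_excluded("keep.log", exclude_lines))
```

The same ordering idea applies in reverse to files given with --include-from, where a `!' line excludes a previously included match.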
- GREP_PATH
- May be used to specify a file path to pattern files. The file path is used
by option -f to open a pattern file, when the pattern file does not
exist.
- GREP_COLOR
- May be used to specify ANSI SGR parameters to highlight matches when
option --color is used, e.g. 1;35;40 shows pattern matches in bold
magenta text on a black background. Deprecated in favor of
GREP_COLORS, but still supported.
- GREP_COLORS
- May be used to specify ANSI SGR parameters to highlight matches and other
attributes when option --color is used. Its value is a
colon-separated list of ANSI SGR parameters that defaults to
cx=33:mt=1;31:fn=1;35:ln=1;32:cn=1;32:bn=1;32:se=36. The
mt=, ms=, and mc= capabilities of GREP_COLORS
take priority over GREP_COLOR. Option --colors takes
priority over GREP_COLORS.
Colors are specified as a string of colon-separated ANSI SGR
parameters of the form `what=substring', where `substring' is a
semicolon-separated list of ANSI SGR codes or `k' (black), `r' (red), `g'
(green), `y' (yellow), `b' (blue), `m' (magenta), `c' (cyan), `w' (white).
Upper case specifies background colors. A `+' qualifies a color as bright. A
foreground and a background color may be combined with one or more font
properties `n' (normal), `f' (faint), `h' (highlight), `i' (invert), `u'
(underline). Substrings may be specified for:
- sl=
- SGR substring for selected lines.
- cx=
- SGR substring for context lines.
- rv
- Swaps the sl= and cx= capabilities when -v is
specified.
- mt=
- SGR substring for matching text in any matching line.
- ms=
- SGR substring for matching text in a selected line. Defaults to the
mt= substring.
- mc=
- SGR substring for matching text in a context line. Defaults to the
mt= substring.
- fn=
- SGR substring for filenames.
- ln=
- SGR substring for line numbers.
- cn=
- SGR substring for column numbers.
- bn=
- SGR substring for byte offsets.
- se=
- SGR substring for separators.
- hl
- a Boolean parameter, enables filename hyperlinks
(\33]8;;link).
- ne
- a Boolean parameter, disables ``erase in line'' \33[K.
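To make the structure of a GREP_COLORS value concrete, the Python sketch below splits it into capabilities and applies the documented fallback of ms= and mc= to mt=. The function names `parse_grep_colors' and `capability' are hypothetical, and the single-letter color codes are kept as plain text rather than translated to SGR numbers.

```python
DEFAULT_GREP_COLORS = "cx=33:mt=1;31:fn=1;35:ln=1;32:cn=1;32:bn=1;32:se=36"


def parse_grep_colors(value):
    """Split a colon-separated GREP_COLORS value into a dictionary."""
    capabilities = {}
    for item in value.split(":"):
        if not item:
            continue
        name, equals, substring = item.partition("=")
        # `what=substring' keeps its substring; bare names are Boolean
        capabilities[name] = substring if equals else True
    return capabilities


def capability(capabilities, name):
    """Look up a capability, letting ms= and mc= default to mt=."""
    if name in ("ms", "mc"):
        return capabilities.get(name, capabilities.get("mt", ""))
    return capabilities.get(name, "")


colors = parse_grep_colors(DEFAULT_GREP_COLORS)
print(capability(colors, "ms"))
```

With the default value shown in this manual, looking up ms= yields the mt= substring because no ms= entry is present.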
Option --format=FORMAT specifies an output format
for file matches. Fields may be used in FORMAT, which expand into the
following values:
- %[ARG]F
- if option -H is used: ARG, the file pathname and
separator.
- %f
- the file pathname.
- %a
- the file basename without directory path.
- %p
- the directory path to the file.
- %z
- the file pathname in a (compressed) archive.
- %[ARG]H
- if option -H is used: ARG, the quoted pathname and
separator.
- %h
- the quoted file pathname.
- %[ARG]N
- if option -n is used: ARG, the line number and
separator.
- %n
- the line number of the match.
- %[ARG]K
- if option -k is used: ARG, the column number and
separator.
- %k
- the column number of the match.
- %[ARG]B
- if option -b is used: ARG, the byte offset and
separator.
- %b
- the byte offset of the match.
- %[ARG]T
- if option -T is used: ARG and a tab character.
- %t
- a tab character.
- %[SEP]$
- set field separator to SEP for the rest of the format fields.
- %[ARG]<
- if the first match: ARG.
- %[ARG]>
- if not the first match: ARG.
- %,
- if not the first match: a comma, same as %[,]>.
- %:
- if not the first match: a colon, same as %[:]>.
- %;
- if not the first match: a semicolon, same as %[;]>.
- %|
- if not the first match: a vertical bar, same as %[|]>.
- %[ARG]S
- if not the first match: ARG and separator, see also %$.
- %s
- the separator, see also %S and %$.
- %~
- a newline character.
- %m
- the number of matches or matched files.
- %O
- the matching line is output as a raw string of bytes.
- %o
- the match is output as a raw string of bytes.
- %Q
- the matching line as a quoted string, \" and \\ replace " and
\.
- %q
- the match as a quoted string, \" and \\ replace " and \.
- %C
- the matching line formatted as a quoted C/C++ string.
- %c
- the match formatted as a quoted C/C++ string.
- %J
- the matching line formatted as a quoted JSON string.
- %j
- the match formatted as a quoted JSON string.
- %V
- the matching line formatted as a quoted CSV string.
- %v
- the match formatted as a quoted CSV string.
- %X
- the matching line formatted as XML character data.
- %x
- the match formatted as XML character data.
- %w
- the width of the match, counting wide characters.
- %d
- the size of the match, counting bytes.
- %e
- the ending byte offset of the match.
- %Z
- the edit distance cost of an approximate match with option -Z.
- %u
- select unique lines only, unless option -u is used.
- %1
- the first regex group capture of the match, and so on up to group
%9, same as %[1]#; requires option -P.
- %[NUM]#
- the regex group capture NUM; requires option -P.
- %[NUM1|NUM2|...]#
- the first group capture NUM that matched; requires option
-P.
- %[NAME]#
- the NAMEd group capture; requires option -P and capturing
pattern `(?<NAME>PATTERN)', see also %G.
- %[NAME1|NAME2|...]#
- the first NAMEd group capture that matched; requires option
-P and capturing pattern `(?<NAME>PATTERN)', see also
%G.
- %G
- list of group capture indices/names that matched; requires option
-P.
- %[TEXT1|TEXT2|...]G
- list of TEXT indexed by group capture indices that matched;
requires option -P.
- %g
- the group capture index/name matched or 1; requires option -P.
- %[TEXT1|TEXT2|...]g
- the first TEXT indexed by the first group capture index that
matched; requires option -P.
- %%
- the percentage sign.
Formatted output is written without a terminating newline, unless
%~ or `\n' is explicitly specified in the format string.
The [ARG] part of a field is optional and may
be omitted. When present, the argument must be placed in [] brackets,
for example %[,]F to output a comma, the pathname, and a
separator.
%[SEP]$ and %u are switches and do not
send anything to the output.
The separator used by the %F, %H, %N,
%K, %B, %S and %G fields may be changed by
preceding the field by %[SEP]$. When
[SEP] is not provided, this reverts the separator to
the default separator or the separator specified with
--separator.
Formatted output is written for each matching pattern, which means
that a line may be output multiple times when patterns match more than once
on the same line. If field %u is specified anywhere in a format
string, matching lines are output only once, unless option -u,
--ungroup is specified or when more than one line of input matched
the search pattern.
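How a format string is expanded once per match can be illustrated with a toy expander. The Python sketch below handles only a handful of the fields listed above (%f, %n, %k, %o, %~, %%, %[ARG]<, %[ARG]>, and the %, %: %; %| shorthands); it is not ugrep's formatter, and the function name `expand' and the dictionary keys are hypothetical.

```python
import re

# A field is `%', an optional [ARG], and one field character.
FIELD = re.compile(r"%(?:\[([^\]]*)\])?(.)", re.S)


def expand(fmt, matches):
    """Expand fmt once for every match and concatenate the results."""
    output = []
    for index, match in enumerate(matches):
        first = index == 0

        def field(found):
            arg, code = found.group(1) or "", found.group(2)
            if code == "f":
                return match["file"]
            if code == "n":
                return str(match["line"])
            if code == "k":
                return str(match["column"])
            if code == "o":
                return match["text"]
            if code == "~":
                return "\n"
            if code == "%":
                return "%"
            if code == "<":
                return arg if first else ""
            if code == ">":
                return "" if first else arg
            if code in ",:;|":
                return "" if first else code
            raise ValueError("field not handled in this sketch: %" + code)

        output.append(FIELD.sub(field, fmt))
    return "".join(output)


found = [
    {"file": "a.c", "line": 3, "column": 5, "text": "FIXME"},
    {"file": "a.c", "line": 7, "column": 2, "text": "FIXME"},
    {"file": "a.c", "line": 12, "column": 9, "text": "FIXME"},
]
print(expand("%,%n", found))
```

Note that nothing in the sketch appends a newline on its own, in line with the rule that a newline appears only where %~ or `\n' is written in the format.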
Additional formatting options:
- --format-begin=FORMAT
- the FORMAT when beginning the search.
- --format-open=FORMAT
- the FORMAT when opening a file and a match was found.
- --format-close=FORMAT
- the FORMAT when closing a file and a match was found.
- --format-end=FORMAT
- the FORMAT when ending the search.
The context options -A, -B, -C, -y,
and display options --break, --heading, --color,
-T, and --null have no effect on formatted output.
Display lines containing the word `patricia' in `myfile.txt':
- $ ugrep -w patricia myfile.txt
Display lines containing the word `patricia', ignoring case:
- $ ugrep -wi patricia myfile.txt
Display lines approximately matching the word `patricia', ignoring
case and allowing up to 2 spelling errors using fuzzy search:
- $ ugrep -Z2 -wi patricia myfile.txt
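To give a feel for what "up to 2 spelling errors" means, the Python sketch below computes the classic edit distance, in which each insertion, deletion, or substitution counts as one error. It compares two whole words and is meant only as an illustration of the error count; it is not the algorithm ugrep uses, and the names `edit_distance' and `within' are hypothetical.

```python
def edit_distance(pattern, text):
    """Count the fewest single-character edits turning pattern into text."""
    previous = list(range(len(text) + 1))
    for i, pattern_char in enumerate(pattern, start=1):
        current = [i]
        for j, text_char in enumerate(text, start=1):
            current.append(min(
                previous[j] + 1,                               # character dropped
                current[j - 1] + 1,                            # character added
                previous[j - 1] + (pattern_char != text_char), # substitution or match
            ))
        previous = current
    return previous[-1]


def within(pattern, word, max_errors=1):
    """Case-insensitive check that word is within max_errors of pattern."""
    return edit_distance(pattern.lower(), word.lower()) <= max_errors


print(within("patricia", "Patrisha", max_errors=2))
```

Under this model `Patrica' is one error away from `patricia' and `Patrisha' is two, so both would fall within a limit of 2.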
Count the number of lines containing `patricia', ignoring
case:
- $ ugrep -cwi patricia myfile.txt
Count the number of words `patricia', ignoring case:
- $ ugrep -cowi patricia myfile.txt
List lines with both `amount' and a decimal number, ignoring
case:
- $ ugrep -wi --bool 'amount \d+(\.\d+)?' myfile.txt
Alternative query:
- $ ugrep -wi -e amount --and '\d+(\.\d+)?' myfile.txt
List all Unicode words in a file:
- $ ugrep -o '\w+' myfile.txt
List all ASCII words in a file:
- $ ugrep -o '[[:word:]]+' myfile.txt
List the laughing face emojis (Unicode code points U+1F600 to
U+1F60F):
- $ ugrep -o '[\x{1F600}-\x{1F60F}]' myfile.txt
Check if a file contains any non-ASCII (i.e. Unicode)
characters:
- $ ugrep -q '[^[:ascii:]]' myfile.txt && echo "contains Unicode"
Display the line and column number of `FIXME' in C++ files using
recursive search, with one line of context before and after a matched
line:
- $ ugrep -C1 -R -n -k -tc++ FIXME
List the C/C++ comments in a file with line numbers:
- $ ugrep -n -e '//.*' -e '/\*([^*]|(\*+[^*/]))*\*+\/' myfile.cpp
The same, but using predefined pattern c++/comments:
- $ ugrep -n -f c++/comments myfile.cpp
List the lines that need fixing in a C/C++ source file by looking
for the word `FIXME' while skipping any `FIXME' in quoted strings:
- $ ugrep -e FIXME -N '"(\\.|\\\r?\n|[^\\\n"])*"' myfile.cpp
The same, but using predefined pattern cpp/zap_strings:
- $ ugrep -e FIXME -f cpp/zap_strings myfile.cpp
Find lines with `FIXME' or `TODO':
- $ ugrep -n -e FIXME -e TODO myfile.cpp
Find lines with `FIXME' that also contain the word `urgent':
- $ ugrep -n FIXME myfile.cpp | ugrep -w urgent
Find lines with `FIXME' but not the word `later':
- $ ugrep -n FIXME myfile.cpp | ugrep -v -w later
Output a list of line numbers of lines with `FIXME' but not
`later':
- $ ugrep -n FIXME myfile.cpp | ugrep -vw later |
ugrep -P '^(\d+)' --format='%,%1'
Find lines with `FIXME' in the C/C++ files stored in a
tarball:
- $ ugrep -z -tc++ -n FIXME project.tgz
Recursively find lines with `FIXME' in C/C++ files, but do not
search any `bak' and `old' directories:
- $ ugrep -n FIXME -tc++ -g^bak/,^old/
Recursively search for the word `copyright' in
cpio/jar/pax/tar/zip archives, compressed and regular files, and in PDFs
using a PDF filter:
- $ ugrep -z -w --filter='pdf:pdftotext % -' copyright
Match the binary pattern `A3hhhhA3hh' (hex) in a binary file
without Unicode pattern matching -U (which would otherwise match
`\xa3' as a Unicode character U+00A3 with UTF-8 byte sequence C2 A3) and
display the results in hex with -X using `less -R' as a pager:
- $ ugrep --pager -UXo '\xa3[\x00-\xff]{2}\xa3[\x00-\xff]' a.out
Hexdump an entire file:
- $ ugrep -X '' a.out
List all files that are not ignored by one or more
`.gitignore' files:
- $ ugrep -l '' --ignore-files
List all files containing an RPM signature, located in the `rpm'
directory and recursively below up to two levels deeper (3 levels
total):
- $ ugrep -3 -l -tRpm '' rpm/
Monitor the system log for bug reports and ungroup multiple
matches on a line:
- $ tail -f /var/log/system.log | ugrep -u -i -w bug
Interactive fuzzy search with Boolean search queries:
- $ ugrep -Q --bool -Z3 --sort=best
Display all words in a MacRoman-encoded file that has CR
newlines:
- $ ugrep --encoding=MACROMAN '\w+' mac.txt
Display all options related to "fuzzy" searching:
- $ ugrep --help fuzzy
Report bugs at:
- https://github.com/Genivia/ugrep/issues
ugrep is released under the BSD-3 license. All parts of the
software have reasonable copyright terms permitting free redistribution.
This includes the ability to reuse all or parts of the ugrep source
tree.