NAME

gbget - Basic data extraction and manipulation tool

SYNOPSIS

gbget [options] 'filename[index](C,R)trans'

DESCRIPTION

Print slices of tabular data from files and apply transformations. Data are read from text files with fields separated by spaces (use option -F to specify a different separator). Inside a data file, data-blocks are separated by two empty lines. Files can be compressed with zlib (.gz).
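The input layout described above can be pictured with a short Python sketch. It is not gbget's implementation: the helper name split_blocks is hypothetical, and treating a lone empty line inside a block as ignorable is an assumption.

```python
import re

def split_blocks(text, sep=None):
    """Split text into data-blocks separated by two empty lines.

    Each block is a list of rows and each row is a list of string fields.
    sep=None splits on runs of whitespace (assumed to approximate the
    default separators); pass an explicit string to mimic option -F.
    """
    blocks = []
    # Two empty lines show up as three consecutive line breaks.
    for chunk in re.split(r"\n[ \t]*\n[ \t]*\n", text.strip("\n")):
        # Assumption: stray empty lines inside a block are skipped.
        rows = [line.split(sep) for line in chunk.splitlines() if line.strip()]
        if rows:
            blocks.append(rows)
    return blocks

sample = "1 2\n3 4\n\n\n5 6\n7 8\n"
blocks = split_blocks(sample)
print(len(blocks))  # 2
print(blocks[1])    # [['5', '6'], ['7', '8']]
```

In this picture, the second element of the returned list corresponds to the data-block that a spec such as '[2]' would address.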

filename: is the input file. If not specified, it defaults to stdin or to the last specified filename, if any.
index: stands for a data-block index.
C,R: stands for the column and row specs, each given as "min:max:skip" to select from "min" to "max" every "skip" steps. If negative, min and max are counted from the end. By default all data are printed ("1:-1:1"). If min>max then the count is reversed and skip must be negative (-1 by default). Different specs are separated by a semicolon ';' and considered sequentially.
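The "min:max:skip" rule can be illustrated with a minimal Python sketch that maps a spec to the 1-based positions it would select among n items. The function resolve_spec is hypothetical and only mirrors the description above. Treating a bare number as a single position is an assumption inferred from the example 'file(2,-10:-1)', and out-of-range values are not handled.

```python
def resolve_spec(spec, n):
    """Return the 1-based positions selected by 'min:max:skip' among n items."""
    spec = spec.strip() or "1:-1"        # empty spec: select everything
    parts = spec.split(":")
    lo = int(parts[0]) if parts[0] else 1
    if len(parts) == 1:
        hi = lo                          # assumption: bare number = one position
    else:
        hi = int(parts[1]) if parts[1] else -1
    # Negative values are counted from the end: -1 is the last item.
    if lo < 0:
        lo += n + 1
    if hi < 0:
        hi += n + 1
    if len(parts) > 2 and parts[2]:
        step = int(parts[2])
    else:
        step = 1 if lo <= hi else -1     # reversed range defaults to -1
    if step == 0 or (hi - lo) * step < 0:
        raise ValueError("skip has the wrong sign for this range")
    return list(range(lo, hi + (1 if step > 0 else -1), step))

print(resolve_spec("1:-1:1", 5))  # [1, 2, 3, 4, 5]
print(resolve_spec("-3:-1", 10))  # [8, 9, 10]
print(resolve_spec("5:1", 5))     # [5, 4, 3, 2, 1]
print(resolve_spec("1:5:2", 5))   # [1, 3, 5]
```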
trans: is a list of transformations applied to the selected data: 'd' take the diff of subsequent columns; 'D' remove all rows with at least one Not-A-Number (NAN) entry; 'f' flatten the output piling all columns; 'l' take the log of all entries; 'P' print all entries collected as a data-block; 't' transpose the matrix of data; 'z' subtract from the entries in each column their mean; 'Z' replace the entries in each column with their z-scores; 'w' divide the entries in each column by their mean.

: '<..;..>' functions separated by semicolons in angle brackets can be used for generic data transformation; each function is computed for each row of data. Variable names are 'x' followed by the number of the column and optionally by 'l' and the number of lags. For instance 'x2+x3l1' means the sum of the entries in the 2nd column plus the entries in the 3rd column in the previous row. 'x0' stands for the row number and 'x' is equal to 'x1'.
: '<@..;..>' if the function specification starts with a '@', the functions are computed recursively along the columns. In this case the number after the 'x' is the relative column, counted starting from the one considered at each step.
: '{...}' a function in curly brackets can be used to select data: only rows that return a non-negative value are retained.
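To make the row-wise semantics concrete, the following Python sketch imitates the function 'x2+x3l1' and the selector '{x2-2}' on a small matrix. Both helper names are hypothetical. The handling of the first row, which has no previous row to supply the lagged value, is an assumption: here it is simply skipped, and the actual tool may behave differently.

```python
def lagged_sum(rows):
    """Imitate 'x2+x3l1': column 2 plus column 3 of the previous row."""
    out = []
    # Assumption: the first row has no lag-1 value, so it is skipped.
    for i in range(1, len(rows)):
        out.append(rows[i][1] + rows[i - 1][2])
    return out

def keep_rows(rows, func):
    """Imitate '{...}': keep rows for which func(row) is non-negative."""
    return [row for row in rows if func(row) >= 0]

rows = [
    [1.0, 2.0, 3.0],
    [4.0, 1.0, 6.0],
    [7.0, 5.0, 9.0],
]

print(lagged_sum(rows))                     # [4.0, 11.0]
print(keep_rows(rows, lambda r: r[1] - 2))  # [[1.0, 2.0, 3.0], [7.0, 5.0, 9.0]]
```

Note that the selector keeps the first row because its value is exactly zero, which counts as non-negative.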

OPTIONS

-F: set the input fields separators (default ' \t')
-o: set the output format (default '%12.6e')
-e: set the output format for empty fields (default '%13s')
-s: set the output separation string (default ' ')
-t: define global transformations applied before each output (default '')
-v: verbose mode

EXAMPLES

gbget 'file(1:3)ld': select the first three columns in 'file', take the log and the difference of successive columns;
gbget 'file(2,-10:-1)<x^2>': select the last ten elements of the second column of 'file' and print their squares
gbget '[2]()' '[1]()' < ...: select the second and first data block from the standard input.
gbget 'file(1:3)<x1*x2-x3>': select the first three columns in 'file' and in each row multiply the first two entries and subtract the third.
gbget 'file()<@x1+x2>': print the sum of two subsequent columns
gbget 'file(1:3){x2-2}': select the first three columns in 'file' for the rows whose second field is not lower than 2

AUTHOR

Written by Giulio Bottazzi

REPORTING BUGS

Report bugs to <gbutils@googlegroups.com>

Package home page <http://cafim.sssup.it/~giulio/software/gbutils/index.html>

COPYRIGHT

Copyright © 2001-2018 Giulio Bottazzi. This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License (version 2) as published by the Free Software Foundation.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

June 2021

gbget 6.2