
NCBI Entrez Direct User's Manual

NAME

align-columns, gbf2xml, transmute - transform (NCBI Entrez Direct) data

SYNOPSIS

transmute -x2p|-j2p

transmute -align [-a codes] [-g N] [-h N]

transmute -j2x [-set tag] [-rec tag] [-nest flat|recurse|plural|depth]

transmute -a2x [-set tag] [-rec tag]

transmute -t2x|-c2x [-set tag] [-rec tag] [-skip N] [-header] [-lower|-upper] [-indent|-flush] columnName1 ...

transmute -g2x (gbf2xml)

transmute -diff

transmute -revcomp

transmute -remove [-first N] [-last N]

transmute -retain -leading N|-trailing N

transmute -replace -offset N|-column N [-delete N] [-insert seq]

transmute -extract feat_loc

transmute -cds2prot [-code N] [-frame N] [-stop] [-trim] [-part5] [-part3] [-every]

transmute -molwt [-met]

transmute -hgvs

transmute -encodeXML|-decodeXML|-plainXML

transmute -encodeURL|-decodeURL

transmute -encode64|-decode64

transmute -aa1to3|-aa3to1

transmute -format fmt [-xml declaration] [-doctype declaration] [-comment] [-cdata] [-separate] [-self] [-unicode style] [-script style] [-mathml terse]

transmute -filter element action target

transmute -normalize database

DESCRIPTION

transmute reads data from standard input, transforms it according to the specified mode, and writes the transformed data to standard output.

align-columns aligns the columns of tab-delimited input, and is roughly equivalent to transmute -align, but accepts - as shorthand for -h 2 -g 4 -a l.

gbf2xml converts from GenBank flatfile format to INSDSeq XML, and is equivalent to transmute -g2x.

OPTIONS

Pretty-Printing

-x2p: Reformat XML
-j2p: Reformat JSON
-align: Table column alignment

-a codes: Column alignment codes:

l: left
c: center
r: right
n: numeric align on decimal point
N: trailing zero-pad decimals
z: leading zero-pad integers

-g N: Spacing between columns
-h N: Indentation before columns
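
To make the -a, -g, and -h options concrete, here is a minimal Python sketch of the padding logic they describe. It is not transmute's implementation: the function name align_columns is hypothetical, only the l, c, and r codes are modeled, and reusing the last code for any extra columns is an assumption. The defaults mirror the align-columns shorthand (-h 2 -g 4 -a l).

```python
def align_columns(text, codes="l", gutter=4, indent=2):
    """Pad tab-delimited rows so that the columns line up.

    codes  -- one alignment code per column (l, c, or r only in this sketch)
    gutter -- spaces between columns (models -g N)
    indent -- spaces before the first column (models -h N)
    """
    rows = [line.split("\t") for line in text.splitlines()]
    ncols = max(len(row) for row in rows)
    # Width of each column is the length of its longest cell.
    widths = [max(len(row[i]) for row in rows if i < len(row))
              for i in range(ncols)]
    pad = {"l": str.ljust, "c": str.center, "r": str.rjust}
    out = []
    for row in rows:
        cells = []
        for i, cell in enumerate(row):
            # Assumption: the last code repeats for any remaining columns.
            code = codes[min(i, len(codes) - 1)]
            cells.append(pad[code](cell, widths[i]))
        # Trailing padding on the last column is dropped in this sketch.
        out.append((" " * indent + (" " * gutter).join(cells)).rstrip())
    return "\n".join(out)


if __name__ == "__main__":
    print(align_columns("gene\tlength\nBRCA1\t81189\nTP53\t19149", codes="lr"))
```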

Data Conversion

-j2x: Convert JSON stream to XML suitable for -path navigation.

-set tag: Replace set wrapper tag.
-rec tag: Replace record wrapper tag.
-nest flat|recurse|plural|depth: Nested array naming policy.

-a2x: Convert text ASN.1 stream to XML suitable for -path navigation.

-set tag: Replace set wrapper tag.
-rec tag: Replace record wrapper tag.

-t2x, -c2x: Convert tab-delimited table or comma-separated values file, respectively, to XML.

-set tag: Replace set wrapper tag.
-rec tag: Replace record wrapper tag.
-skip N: Skip the first N lines.
-header: Use fields from first row for column names.
-lower: Convert text to lowercase.
-upper: Convert text to uppercase.
-indent: Indent XML output.
-flush: Do not indent XML output.
columnName1 ...: XML object names per column.

-g2x: Convert GenBank flatfile format to INSDSeq XML.
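
As an illustration of what the -t2x and -c2x options describe, the following Python sketch wraps each input row in a record element and each field in an element named after its column. The function name table_to_xml and the default wrapper tags Set and Rec are placeholders chosen for this example, not documented transmute defaults; fields are split naively on the separator, so quoted CSV fields are not handled.

```python
from xml.sax.saxutils import escape


def table_to_xml(text, names, set_tag="Set", rec_tag="Rec", sep="\t", skip=0):
    """Convert delimited text to a simple XML document.

    names   -- XML element name for each column (models columnName1 ...)
    set_tag -- outer wrapper tag (models -set tag; default is a placeholder)
    rec_tag -- per-row wrapper tag (models -rec tag; default is a placeholder)
    skip    -- number of leading lines to ignore (models -skip N)
    """
    out = [f"<{set_tag}>"]
    for line in text.splitlines()[skip:]:
        out.append(f"  <{rec_tag}>")
        for name, value in zip(names, line.split(sep)):
            # Escape &, <, and > so the field is valid element content.
            out.append(f"    <{name}>{escape(value)}</{name}>")
        out.append(f"  </{rec_tag}>")
    out.append(f"</{set_tag}>")
    return "\n".join(out)


if __name__ == "__main__":
    table = "id\tname\n1\tA&B\n2\tC"
    print(table_to_xml(table, ["Id", "Name"], skip=1))
```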

Sequence Comparison

-diff: Compare two aligned files for point differences.

Sequence Editing

-revcomp: Reverse complement nucleotide sequence.
-remove: Trim at ends of sequence.

-first N: Delete first N bases or residues.
-last N: Delete last N bases or residues.

-retain: Save either end of sequence.

-leading N: Keep first N bases or residues.
-trailing N: Keep last N bases or residues.

-replace: Apply base or residue substitution.

-offset N: Skip ahead by 0-based count (SPDI), or
-column N: Move just before 1-based position (HGVS).
-delete N: Delete N bases or residues.
-insert seq: Insert given sequence.

-extract feat_loc: Use xtract -insd ... feat_location instructions.
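
The editing modes above can be modeled with a few lines of Python. This is a sketch of the described semantics only, not transmute's code: the function names are hypothetical, and revcomp here handles only A, C, G, and T (ambiguity codes are left unchanged). The replace function shows how the two coordinate conventions relate: a 1-based -column N addresses the same position as a 0-based -offset N-1.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def revcomp(seq):
    """Complement each base, then reverse the sequence (models -revcomp)."""
    return seq.translate(_COMPLEMENT)[::-1]


def remove(seq, first=0, last=0):
    """Delete `first` leading and `last` trailing characters (models -remove)."""
    return seq[first:max(len(seq) - last, 0)]


def retain(seq, leading=None, trailing=None):
    """Keep only one end of the sequence (models -retain)."""
    if leading is not None:
        return seq[:leading]
    if trailing is not None:
        return seq[max(len(seq) - trailing, 0):]
    return seq


def replace(seq, offset=None, column=None, delete=0, insert=""):
    """Delete and/or insert at a position (models -replace).

    offset is a 0-based count of characters to skip (SPDI style);
    column is a 1-based position to edit just before (HGVS style).
    """
    start = offset if offset is not None else column - 1
    return seq[:start] + insert + seq[start + delete:]


if __name__ == "__main__":
    print(revcomp("AACG"))
    print(replace("ACGTACGT", offset=2, delete=1, insert="TT"))
    print(replace("ACGTACGT", column=3, delete=1, insert="TT"))
```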

Sequence Processing

-cds2prot: Translate coding region into protein.

-code N: Use genetic code N (1 by default).
-frame N: Offset in sequence.
-stop: Include stop residue.
-trim: Remove trailing Xs.
-part5: CDS partial at 5' end.
-part3: CDS extends past 3' end.
-every: Translate all codons.

-molwt: Calculate molecular weight of peptide.

-met: Do not cleave leading methionine.
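
For readers unfamiliar with coding-region translation, here is a small Python sketch of the basic operation behind -cds2prot using the standard genetic code (code 1). It is illustrative only. The function name translate is hypothetical, the frame parameter is assumed to mean "number of leading bases to skip", and the -trim, -part5, -part3, and -every behaviors are not modeled.

```python
BASES = "TCAG"
# Standard genetic code (table 1), with codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds, frame=0, stop=False):
    """Translate a coding sequence up to the first stop codon.

    frame -- leading bases to skip (assumed meaning of -frame N)
    stop  -- include the stop residue '*' in the output (models -stop)
    """
    seq = cds.upper().replace("U", "T")[frame:]
    protein = []
    for i in range(0, len(seq) - 2, 3):
        # Codons containing ambiguous bases are translated as X.
        residue = CODON_TABLE.get(seq[i:i + 3], "X")
        if residue == "*":
            if stop:
                protein.append(residue)
            break
        protein.append(residue)
    return "".join(protein)


if __name__ == "__main__":
    print(translate("ATGGCCTAAGGG"))
    print(translate("ATGGCCTAAGGG", stop=True))
```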

Variation Processing

-hgvs: Convert Human Genome Variation Society variation format to XML.

String Transformations

-encodeXML: XML-encode <, >, &, ", and ' characters.
-decodeXML: Decode XML entity references.
-plainXML: Remove embedded mixed-content tags and compress runs of spaces.
-encodeURL: Compress runs of spaces, and URI-escape the result.
-decodeURL: URI-unescape the input.
-encode64: Base64-encode the input.
-decode64: Base64-decode the input.
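
The string transformations above correspond to well-known encodings, sketched below with the Python standard library. The function names are hypothetical, and details such as which characters are percent-escaped or which entity is emitted for an apostrophe are assumptions; transmute's exact output may differ.

```python
import base64
import re
from urllib.parse import quote, unquote
from xml.sax.saxutils import escape, unescape


def encode_xml(s):
    """Encode <, >, &, ", and ' as XML entities (models -encodeXML)."""
    return escape(s, {'"': "&quot;", "'": "&apos;"})


def decode_xml(s):
    """Decode the five predefined XML entities (models -decodeXML)."""
    return unescape(s, {"&quot;": '"', "&apos;": "'"})


def encode_url(s):
    """Compress runs of spaces, then URI-escape (models -encodeURL)."""
    return quote(re.sub(r" +", " ", s), safe="")


def decode_url(s):
    """URI-unescape the input (models -decodeURL)."""
    return unquote(s)


def encode64(s):
    """Base64-encode UTF-8 text (models -encode64)."""
    return base64.b64encode(s.encode("utf-8")).decode("ascii")


def decode64(s):
    """Base64-decode to UTF-8 text (models -decode64)."""
    return base64.b64decode(s).decode("utf-8")


if __name__ == "__main__":
    print(encode_xml('<a href="x">T&C\'s</a>'))
    print(encode_url("a   b&c"))
    print(encode64("hello"))
```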

Protein

-aa1to3: Convert amino acids from 1-character to 3-character format.
-aa3to1: Convert amino acids from 3-character to 1-character format.

Customized XML Reformatting

-format fmt

copy: Fast block copy (still applies processing flags).
compact: Compress runs of spaces.
flush: Suppress line indentation.
indent: Indent according to nesting depth.
expand: Place each attribute on a separate line.

-xml declaration: Use the given XML declaration.
-doctype declaration: Use the given document type declaration.
-comment: Preserve comments.
-cdata: Preserve CDATA blocks.
-separate: If the input contains multiple top-level documents, keep them separate.
-self: Keep empty self-closing tags.
-unicode style: How to handle Unicode superscript and subscript digits (first converted to ASCII form in all cases).

fuse: Run them all together, with no additional markup.
space: Add spaces between digits in different positions.
period: Add periods between digits in different positions.
brackets: Surround superscripts by square brackets and subscripts by parentheses.
markdown: Surround superscripts with carets and subscripts with tildes.
slash: Add backslashes when going up in height and forward slashes when going down.
tag: Put superscripts in XML sup elements and subscripts in sub elements.

-script style: How to handle XML sup and sub elements (denoting superscripts and subscripts, respectively).

brackets: Surround superscripts by square brackets and subscripts by parentheses.
markdown: Surround superscripts with carets and subscripts with tildes.

-mathml terse: Flatten MathML markup tersely.
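
The brackets and markdown styles for -script can be pictured with a short Python sketch. It is a rough model of the descriptions above, not transmute's implementation: the function name convert_script is hypothetical, and the regular expression handles only simple, non-nested sup and sub elements without attributes.

```python
import re

# Opening and closing marks for each style, per element type.
STYLES = {
    "brackets": {"sup": ("[", "]"), "sub": ("(", ")")},
    "markdown": {"sup": ("^", "^"), "sub": ("~", "~")},
}


def convert_script(xml_text, style):
    """Replace <sup>...</sup> and <sub>...</sub> with plain-text marks."""
    marks = STYLES[style]

    def repl(match):
        opening, closing = marks[match.group(1)]
        return opening + match.group(2) + closing

    return re.sub(r"<(sup|sub)>(.*?)</\1>", repl, xml_text)


if __name__ == "__main__":
    text = "H<sub>2</sub>O and x<sup>2</sup>"
    print(convert_script(text, "brackets"))
    print(convert_script(text, "markdown"))
```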

XML Modification

-filter element action target: Actions:

retain: Keep matching elements (no-op).
remove: Remove matching elements.
encode: HTML-escape special characters.
decode: Decode HTML escapes.
shrink: Compress runs of spaces.
expand: Place each attribute on a separate line.
accent: Strip off Unicode accents.

Targets:

content: Plain-text content.
cdata: CDATA blocks.
comment: Comments.
object: The whole object.
attributes: Attributes.
container: Start and end tags.

EFetch XML Normalization

-normalize database: Adjust XML fields to conform to common conventions.