FILTERBAM(1)   FILTERBAM(1)

filterBam - filter BAM file for use with AUGUSTUS tools

filterBam --in in.bam --out out.bam [options]

The input file must be sorted lexicographically by 'queryname'. This can be done as follows:

• Convert the file into SAM format, sort it lexicographically by queryname and convert it back into BAM format.
For example:
export LC_ALL=C
samtools view -H file.bam > file_sorted.sam
samtools view file.bam | sort -k 1 >> file_sorted.sam
samtools view -bS file_sorted.sam > file_sorted.bam
[Be aware: 'export LC_ALL=C' may be needed because, depending on the locale, sort otherwise ignores characters like ':'.]
Also, bear in mind that this approach requires converting your BAM file into SAM.
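The effect of the locale on sort can be checked on two made-up read names. The following is a minimal sketch, assuming a POSIX shell and a standard sort; the names HWIa and HWI:x are illustrative and do not come from any real file.

```shell
# Two hypothetical read names that differ at the ':' character.
names='HWIa
HWI:x'

# Byte-wise (C locale) order: ':' (0x3A) sorts before 'a' (0x61),
# so HWI:x is listed first.
sorted_c=$(printf '%s\n' "$names" | LC_ALL=C sort)
printf '%s\n' "$sorted_c"
```

In a locale whose collation skips punctuation, the same input may come out in the opposite order, which is the situation that setting LC_ALL=C avoids.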

samtools and bamtools also provide facilities to do the sorting, but they are not guaranteed to produce the required order because of the character-ordering problem mentioned above.

• In the case of samtools, the command to sort by 'queryname' is:
samtools sort -n -o file_sorted.bam file.bam
For more information, check the man page included in the samtools distribution.

• bamtools can also sort BAM files:
bamtools sort -byname -in file.bam -out file_sorted.bam
but it only provides the option to sort by queryname.
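Whichever tool did the sorting, the resulting name order can be checked against the byte-wise order before running filterBam. The following is a sketch only: is_name_sorted is a hypothetical helper, not part of AUGUSTUS, samtools or bamtools, and it assumes that the order produced by 'LC_ALL=C sort' is the one filterBam expects.

```shell
# Hypothetical helper: reads SAM alignment lines (without header) on stdin
# and succeeds only if the first column (query name) is in byte-wise order.
is_name_sorted() {
    cut -f1 | LC_ALL=C sort -C
}

# Usage with a real file (requires samtools):
#   samtools view file_sorted.bam | is_name_sorted && echo "sorted"

# Self-contained demonstration on made-up, tab-separated records.
if printf 'readA\t0\nreadB\t16\n' | is_name_sorted; then
    echo "sorted"
else
    echo "not sorted"
fi
```

The check is independent of filterBam's own sortedness test and reads the whole input, so it can be slow on large files.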

-i, --in=in.bam

input file in BAM format

-o, --out=out.bam

output file in BAM format

-u, --uniq

keep only the best match; remove all matches of a read if the second best is not much worse

-q f, --uniqThresh=f

threshold for uniq, given as a fraction: the second-best match must score lower than this fraction of the best match for the best match to be kept (default 0.96)
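The threshold rule can be illustrated with plain arithmetic. This is a sketch under assumptions: keeps_best is a hypothetical helper, the scores are made-up numbers, and the strict "lower than" comparison is taken from the wording above rather than from the filterBam source.

```shell
# Hypothetical helper: succeeds if the best match would be kept, i.e. if
# second < thresh * best (strict comparison assumed from the option text).
keeps_best() {
    awk -v best="$1" -v second="$2" -v thresh="${3:-0.96}" \
        'BEGIN { exit !(second < thresh * best) }'
}

# 90 is below 0.96 * 100 = 96, so the best match is kept.
keeps_best 100 90 && echo "100 vs 90: best match kept"
# 98 is not below 96, so the second best is "not much worse".
keeps_best 100 98 || echo "100 vs 98: all matches removed"
```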

-b, --best

output all best matches that satisfy minId and minCover

-e n, --minId=n

minimal percentage of identity (default 92)

-c n, --minCover=n

minimal percentage of coverage of the query read (default 80)

-n, --noIntrons

do not allow longer gaps (for RNA-RNA alignments)

-l n, --insertLimit=n

maximum assumed size of inserts (default 10)

-s n, --maxSortesTest=n

test if input file is sorted by query name for this number of alignments (default 100000)

-p, --paired

require that paired reads are on opposite strands of the same target. Requires alignment names to contain the suffixes /1, /2 or /f, /r.

-w, --pairwiseAlignments

use in case alignments were done in pairwise fashion

-x n, --maxIntronLen=n

maximal separation of paired reads (default 500000)

-d file, --pairBedFile=file

file name of pairedness coverage: a BED format file that reports, for each position, the number of filtered read pairs that contain the position in or between the reads

-g file, --commonGeneFile=file

file name in which to write cases where one read maps to several different genes

-t n, --threads=n

use n threads for compression/decompression (default 1); available only if library SeqLib >= 1.2 is used

-v, --verbose

output debugging info

-h, --help

produce help message.
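The options above can be combined in a single call. The following sketch uses only options listed here; the file names are placeholders, and whether --uniq and --paired suit a given data set is an assumption that depends on how the reads were aligned.

```shell
in_bam=file_sorted.bam      # name-sorted input (placeholder file name)
out_bam=file_filtered.bam   # filtered output (placeholder file name)

# Run only when filterBam is installed and the input file exists.
if command -v filterBam >/dev/null 2>&1 && [ -f "$in_bam" ]; then
    filterBam --uniq --paired --in "$in_bam" --out "$out_bam"
    status="ran"
else
    echo "filterBam or $in_bam not found; skipping" >&2
    status="skipped"
fi
```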

AUGUSTUS was written by M. Stanke, O. Keller, S. König, L. Gerischer and L. Romoth.

Exhaustive documentation can be found in the file /usr/share/doc/augustus/README.md.gz.