vcftools(1) | vcftools man page | vcftools(1) |
vcftools - Utilities for the variant call format (VCF) and binary variant call format (BCF)
vcftools [ --vcf FILE | --gzvcf FILE | --bcf FILE ] [ --out OUTPUT PREFIX ] [ FILTERING OPTIONS ] [ OUTPUT OPTIONS ]
vcftools is a suite of functions for use on genetic variation data in the form of VCF and BCF files. The tools provided are mainly used to summarize data, run calculations on data, filter out data, and convert data into other useful file formats.
Output allele frequency for all sites in the input vcf file from chromosome 1
Output a new vcf file from the input vcf file that removes any indel sites
Output a file comparing the sites in two vcf files
Output a new vcf file to standard output without any sites that have a filter tag, then compress it with gzip
Output a Hardy-Weinberg p-value for every site in the bcf file that does not have any missing genotypes
Output nucleotide diversity at a list of positions
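The six tasks described above can be sketched as the invocations below. All file names are placeholders. Several options used here (--freq, --chr, --remove-indels, --recode, --gzdiff, --diff-site, --remove-filtered-all, --site-pi, --vcf, --out) are not described in the option listing as reproduced on this page, so check them against the installed version. The shell function wrappers are only an illustrative device so that nothing runs until a function is called.

```shell
# Defining these functions runs nothing; call one once its input files exist.

# Allele frequency for all sites on chromosome 1.
freq_chr1() {
  vcftools --gzvcf input_file.vcf.gz --freq --chr 1 --out chr1_analysis
}

# New VCF file with all indel sites removed.
snps_only() {
  vcftools --vcf input_file.vcf --remove-indels --recode --recode-INFO-all --out SNPs_only
}

# File comparing the sites in two VCF files.
compare_sites() {
  vcftools --gzvcf input_file1.vcf.gz --gzdiff input_file2.vcf.gz --diff-site --out in1_v_in2
}

# Sites without a filter tag, written to standard output and compressed with gzip.
pass_only() {
  vcftools --gzvcf input_file.vcf.gz --remove-filtered-all --recode --stdout | gzip -c > output_PASS_only.vcf.gz
}

# Hardy-Weinberg p-value for every site with no missing genotypes.
hwe_no_missing() {
  vcftools --bcf input_file.bcf --hardy --max-missing 1.0 --out output_noMissing
}

# Nucleotide diversity at a list of positions, reading the VCF from standard input.
pi_at_positions() {
  zcat input_file.vcf.gz | vcftools --vcf - --site-pi --positions SNP_list.txt --out nucleotide_diversity
}
```

For instance, running freq_chr1 in a directory that contains input_file.vcf.gz performs the first task.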
These options are used to specify the input and output files.
--gzvcf <input_filename>
--bcf <input_filename>
--stdout
-c
--temp <temporary_directory>
These options are used to include or exclude certain sites from any analysis being performed by the program.
--from-bp <integer>
--to-bp <integer>
--positions <filename>
--exclude-positions <filename>
--positions-overlap <filename>
--exclude-positions-overlap <filename>
--bed <filename>
--exclude-bed <filename>
--thin <integer>
--mask <filename>
--invert-mask <filename>
--mask-min <integer>
--snps <filename>
--exclude <filename>
--keep-filtered <string>
--remove-filtered <string>
--non-ref-af <float>
--max-non-ref-af <float>
--non-ref-ac <integer>
--max-non-ref-ac <integer>
--non-ref-af-any <float>
--max-non-ref-af-any <float>
--non-ref-ac-any <integer>
--max-non-ref-ac-any <integer>
--mac <integer>
--max-mac <integer>
--min-alleles <integer>
--max-alleles <integer>
--hwe <float>
--max-missing <float>
--max-missing-count <integer>
--phased
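Site filters can be combined in a single invocation. The following is a minimal sketch, assuming a placeholder input file and arbitrary illustrative thresholds; --vcf, --chr, --recode and --out are not described in the listing as reproduced here, and the function wrapper exists only so that nothing runs until it is called.

```shell
# Keep sites on chromosome 1 between 1,000,000 and 2,000,000 bp that have a
# minor allele count of at least 2 and at most 10% missing genotypes.
# input_file.vcf and chr1_region are placeholder names.
region_filter() {
  vcftools --vcf input_file.vcf --chr 1 --from-bp 1000000 --to-bp 2000000 \
    --mac 2 --max-missing 0.9 --recode --out chr1_region
}
```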
These options are used to include or exclude certain individuals from any analysis being performed by the program.
--keep <filename>
--remove <filename>
--max-indv <integer>
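As a sketch of individual filtering, the example below writes a list of individual IDs, one per line, and passes it to --keep. The sample IDs and file names are hypothetical, and --vcf, --recode and --out are not described in the listing as reproduced here.

```shell
# Hypothetical list of individuals to keep: one ID per line, matching the
# sample names in the VCF header line.
printf 'sample01\nsample02\n' > keep_list.txt

# Wrapped in a function so that vcftools is only invoked when this is called.
subset_individuals() {
  vcftools --vcf input_file.vcf --keep keep_list.txt --recode --out subset
}
```

Passing the same file to --remove instead would exclude those individuals.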
These options are used to exclude genotypes from any analysis being performed by the program. If excluded, these values will be treated as missing.
--remove-filtered-geno <string>
--minGQ <float>
--minDP <float>
--maxDP <float>
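A minimal sketch of genotype filtering follows, assuming a placeholder input file and arbitrary illustrative thresholds. Genotypes that fail a threshold are treated as missing rather than removing the site. --vcf, --recode and --out are not described in the listing as reproduced here.

```shell
# Treat genotypes as missing when genotype quality is below 20 or read depth
# falls outside the range 5 to 100. File names are placeholders.
genotype_filter() {
  vcftools --vcf input_file.vcf --minGQ 20 --minDP 5 --maxDP 100 \
    --recode --out geno_filtered
}
```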
These options specify which analyses or conversions to perform on the data that passed through all specified filters.
--counts
--counts2
--derived
--site-depth
--site-mean-depth
--geno-depth
--geno-r2
--geno-chisq
--hap-r2-positions <positions list file>
--geno-r2-positions <positions list file>
--ld-window <integer>
--ld-window-bp <integer>
--ld-window-min <integer>
--ld-window-bp-min <integer>
--min-r2 <float>
--interchrom-hap-r2
--interchrom-geno-r2
--TsTv-summary
--TsTv-by-count
--TsTv-by-qual
--FILTER-summary
--window-pi <integer>
--window-pi-step <integer>
--fst-window-size <integer>
--fst-window-step <integer>
--hardy
--TajimaD <integer>
--indv-freq-burden
--LROH
--relatedness
--relatedness2
--site-quality
--missing-indv
--missing-site
--SNPdensity <integer>
--kept-sites
--removed-sites
--singletons
--hist-indel-len
--hapcount <BED file>
--mendel <PED file>
--extract-FORMAT-info <string>
--get-INFO <string>
--recode-INFO <string>
--recode-INFO-all
--contigs <string>
--IMPUTE
--ldhat
--ldhelmet
--ldhat-geno
--BEAGLE-GL
--BEAGLE-PL
--plink
--plink-tped
--chrom-map
These options are used to compare the original variant file to another variant file and output the results. All of the diff functions require that both files contain the same chromosomes and be sorted in the same order. If one of the files contains chromosomes that the other file does not, use the --not-chr filter to remove them from the analysis.
--diff-indv
--diff-site-discordance
--diff-indv-discordance
--diff-indv-map <filename>
--diff-discordance-matrix
--diff-switch-error
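As a sketch of a comparison in which one file carries an extra chromosome, the example below excludes it with --not-chr before comparing sites. The chromosome name chrM and the file names are hypothetical, and --vcf, --diff, --diff-site and --out are not described in the listing as reproduced here.

```shell
# Compare sites between two VCF files, skipping a chromosome (here the
# hypothetical chrM) that is present in only one of them.
diff_shared_chroms() {
  vcftools --vcf input_file1.vcf --diff input_file2.vcf --not-chr chrM \
    --diff-site --out in1_v_in2
}
```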
Adam Auton
Anthony Marcketta
2 August 2018 | 0.1.16 |