DefineClones.py - Repertoire clonal assignment toolkit (Python 3)
usage: DefineClones.py [--version] [-h] -d DB_FILES [DB_FILES ...]
    [-o OUT_FILES [OUT_FILES ...]] [--outdir OUT_DIR]
    [--outname OUT_NAME] [--log LOG_FILE] [--failed] [--format {airr,changeo}]
    [--nproc NPROC] [--sf SEQ_FIELD] [--vf V_FIELD] [--jf J_FIELD]
    [--gf GROUP_FIELDS [GROUP_FIELDS ...]] [--mode {allele,gene}]
    [--act {first,set}]
    [--model {ham,aa,hh_s1f,hh_s5f,mk_rs1nf,mk_rs5nf,hs1f_compat,m1n_compat}]
    [--dist DISTANCE] [--norm {len,mut,none}] [--sym {avg,min}]
    [--link {single,average,complete}] [--maxmiss MAX_MISSING]
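
As an illustration of the synopsis, the sketch below assembles one possible command line from the listed options. It only builds and prints the command. The input file name sample_db.tsv and the values given to --dist and --nproc are hypothetical placeholders, not recommended settings.

```python
import shlex

# Hypothetical input file and parameter values; every flag is taken
# from the usage synopsis.
cmd = [
    "DefineClones.py",
    "-d", "sample_db.tsv",
    "--format", "airr",
    "--mode", "gene",
    "--act", "set",
    "--model", "ham",
    "--norm", "len",
    "--dist", "0.16",
    "--nproc", "4",
]

# Print a shell-quoted command line. To actually run it, the list
# could be passed to subprocess.run(cmd, check=True).
print(shlex.join(cmd))
```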
 
Assign Ig sequences into clones
  - -d DB_FILES [DB_FILES ...]
 
  - A list of tab delimited database files. (default: None)
 
  - -o OUT_FILES [OUT_FILES ...]

  - Explicit output file name. Note, this argument cannot be used with the
      --failed, --outdir, or --outname arguments. If unspecified, then the
      output filename will be based on the input filename(s). (default: None)
 
  - --outdir OUT_DIR

  - Changes the output directory to the location specified. The input file
      directory is used if this is not specified. (default: None)
 
  - --outname OUT_NAME

  - Changes the prefix of the successfully processed output file to the string
      specified. May not be specified with multiple input files. (default:
      None)
 
  - --log LOG_FILE
 
  - Specify to write verbose logging to a file. May not be specified with
      multiple input files. (default: None)
 
  - --failed
 
  - If specified, create files containing records that fail processing.
      (default: False)
 
  - --format {airr,changeo}
 
  - Specify input and output format. (default: airr)
 
  - --nproc NPROC

  - The number of simultaneous computational processes to execute (CPU cores
      to utilize). (default: 8)
 
  - --sf SEQ_FIELD
 
  - Field to be used to calculate distance between records. Defaults to
      junction (airr) or JUNCTION (changeo). (default: None)
 
  - --vf V_FIELD
 
  - Field containing the germline V segment call. Defaults to v_call (airr) or
      V_CALL (changeo). (default: None)
 
  - --jf J_FIELD
 
  - Field containing the germline J segment call. Defaults to j_call (airr) or
      J_CALL (changeo). (default: None)
 
  - --gf GROUP_FIELDS [GROUP_FIELDS ...]
 
  - Additional fields to use for grouping clones aside from V, J and junction
      length. (default: None)
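
The effect of adding grouping fields can be sketched as follows: records are partitioned on V call, J call and junction length, and any extra field splits those partitions further. This is a minimal illustration, not the program's implementation. The helper names group_key and partition, the example records, and the use of c_call as an extra field are all hypothetical.

```python
from collections import defaultdict

def group_key(record, group_fields=()):
    # Key on V call, J call and junction length, then any extra fields.
    base = (record["v_call"], record["j_call"], len(record["junction"]))
    return base + tuple(record[f] for f in group_fields)

def partition(records, group_fields=()):
    # Map each grouping key to the sequence_id values that share it.
    groups = defaultdict(list)
    for rec in records:
        groups[group_key(rec, group_fields)].append(rec["sequence_id"])
    return dict(groups)

# Hypothetical records; "c_call" stands in for a user-chosen extra field.
records = [
    {"sequence_id": "s1", "v_call": "IGHV1-2", "j_call": "IGHJ4",
     "junction": "TGTGCGAGATGG", "c_call": "IGHM"},
    {"sequence_id": "s2", "v_call": "IGHV1-2", "j_call": "IGHJ4",
     "junction": "TGTGCGAGTTGG", "c_call": "IGHG1"},
    {"sequence_id": "s3", "v_call": "IGHV1-2", "j_call": "IGHJ4",
     "junction": "TGTGCGAGATGG", "c_call": "IGHM"},
]

print(partition(records))              # one partition with all three records
print(partition(records, ["c_call"]))  # split into two partitions by c_call
```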
 
  - --mode {allele,gene}
 
  - Specifies whether to use the V(D)J allele or gene for initial grouping.
      (default: gene)
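
The difference between the two modes can be sketched as below, assuming IMGT-style calls in which the allele follows an asterisk (for example IGHV1-69*01). The helper name v_group_label is hypothetical, and the sketch handles a single call only; it is not the program's own parser.

```python
import re

def v_group_label(call, mode="gene"):
    # "allele" keeps the full call; "gene" drops the "*NN" allele suffix.
    # Assumes an IMGT-style call such as "IGHV1-69*01".
    if mode == "allele":
        return call
    return re.sub(r"\*.*$", "", call)

print(v_group_label("IGHV1-69*01", mode="gene"))
print(v_group_label("IGHV1-69*01", mode="allele"))
```

Under gene mode, two records called as different alleles of the same gene receive the same label and so fall into the same initial group.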
 
  - --act {first,set}
 
  - Specifies how to handle multiple V(D)J assignments for initial grouping.
      The "first" action will use only the first gene listed. The
      "set" action will use all gene assignments and construct a
      larger gene grouping composed of any sequences sharing an assignment or
      linked to another sequence by a common assignment (similar to
      single-linkage). (default: set)
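
The chaining behaviour described for the "set" action can be sketched with a union-find structure: records that share any assignment are merged, and merges propagate through intermediate records. This is one way to realise the description, not necessarily how the program does it. The function name group_by_shared_genes and the example gene sets are hypothetical.

```python
def group_by_shared_genes(gene_sets):
    """Merge records whose gene sets overlap, directly or through a chain."""
    parent = list(range(len(gene_sets)))

    def find(i):
        # Follow parent links to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_seen = {}  # gene -> index of the first record carrying it
    for idx, genes in enumerate(gene_sets):
        for gene in genes:
            if gene in first_seen:
                parent[find(idx)] = find(first_seen[gene])
            else:
                first_seen[gene] = idx

    groups = {}
    for idx in range(len(gene_sets)):
        groups.setdefault(find(idx), []).append(idx)
    return sorted(groups.values())

# Records 0 and 2 share no gene but are linked through record 1.
gene_sets = [
    {"IGHV1-2"},
    {"IGHV1-2", "IGHV1-3"},
    {"IGHV1-3"},
    {"IGHV3-23"},
]
print(group_by_shared_genes(gene_sets))  # [[0, 1, 2], [3]]
```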
 
  - --model {ham,aa,hh_s1f,hh_s5f,mk_rs1nf,mk_rs5nf,hs1f_compat,m1n_compat}

  - Specifies which substitution model to use for calculating distance between
      sequences. The "ham" model is nucleotide Hamming distance and
      "aa" is amino acid Hamming distance. The "hh_s1f" and
      "hh_s5f" models are human-specific single nucleotide and 5-mer
      content models, respectively, from Yaari et al, 2013. The
      "mk_rs1nf" and "mk_rs5nf" models are mouse-specific
      single nucleotide and 5-mer content models, respectively, from Cui et al,
      2016. The "m1n_compat" and "hs1f_compat" models are
      deprecated models provided for backwards compatibility with the
      "m1n" and "hs1f" models in Change-O v0.3.3 and SHazaM
      v0.1.4. Both 5-mer models should be considered experimental. (default:
      ham)
 
  - --dist DISTANCE

  - The distance threshold for clonal grouping. (default: 0.0)
 
  - --norm {len,mut,none}
 
  - Specifies how to normalize distances. One of none (do not normalize), len
      (normalize by length), or mut (normalize by number of mutations between
      sequences). (default: len)
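
For the nucleotide Hamming model, the "none" and "len" choices can be sketched as below. The function names hamming and normalized_distance are hypothetical. The "mut" choice is left out because its exact computation is not described here. Equal-length, non-empty sequences are assumed.

```python
def hamming(a, b):
    # Count positions at which two equal-length sequences differ.
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))

def normalized_distance(a, b, norm="len"):
    # "none" returns the raw mismatch count; "len" divides by sequence length.
    if not a:
        raise ValueError("sequences must be non-empty")
    dist = hamming(a, b)
    if norm == "none":
        return float(dist)
    if norm == "len":
        return dist / len(a)
    raise ValueError("this sketch supports only 'none' and 'len'")

# Hypothetical junctions of length 12 differing at one position.
a = "TGTGCGAGATGG"
b = "TGTGCGAGTTGG"
print(normalized_distance(a, b, norm="none"))
print(normalized_distance(a, b, norm="len"))
```

With length normalization the value given to --dist reads as a fraction of mismatched positions; without it, the threshold is a raw mismatch count.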
 
  - --sym {avg,min}
 
  - Specifies how to combine asymmetric distances. One of avg (average of
      A->B and B->A) or min (minimum of A->B and B->A). (default:
      avg)
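
The two ways of combining asymmetric distances can be sketched as a small helper. The name symmetrize is hypothetical; d_ab and d_ba stand for the A->B and B->A distances.

```python
def symmetrize(d_ab, d_ba, sym="avg"):
    # Combine the two directed distances into a single symmetric value.
    if sym == "avg":
        return (d_ab + d_ba) / 2
    if sym == "min":
        return min(d_ab, d_ba)
    raise ValueError("sym must be 'avg' or 'min'")

print(symmetrize(0.25, 0.5, sym="avg"))  # 0.375
print(symmetrize(0.25, 0.5, sym="min"))  # 0.25
```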
 
  - --link {single,average,complete}
 
  - Type of linkage to use for hierarchical clustering. (default: single)
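
How the linkage type changes the outcome at a given distance threshold can be sketched with a naive agglomerative procedure. This is an illustrative pure-Python version, not the program's implementation. The function name cluster is hypothetical, "average" is taken to mean the mean of all pairwise distances between two clusters, and merging when the distance is less than or equal to the threshold is an assumption.

```python
def cluster(dist, threshold, link="single"):
    """Naive agglomerative clustering over a symmetric distance matrix."""
    combine = {
        "single": min,
        "complete": max,
        "average": lambda values: sum(values) / len(values),
    }[link]
    clusters = [[i] for i in range(len(dist))]

    def cluster_distance(a, b):
        # Linkage distance between two clusters of record indices.
        return combine([dist[i][j] for i in a for j in b])

    while len(clusters) > 1:
        # Find the closest pair of clusters under the chosen linkage.
        d, x, y = min(
            (cluster_distance(clusters[x], clusters[y]), x, y)
            for x in range(len(clusters))
            for y in range(x + 1, len(clusters))
        )
        if d > threshold:  # assumption: merge only while d <= threshold
            break
        clusters[x].extend(clusters.pop(y))
    return sorted(sorted(c) for c in clusters)

# Three records in a chain: 0 and 2 are each close to 1 but far from each other.
dist = [
    [0, 1, 2],
    [1, 0, 1],
    [2, 1, 0],
]
print(cluster(dist, threshold=1, link="single"))    # [[0, 1, 2]]
print(cluster(dist, threshold=1, link="complete"))  # [[0, 1], [2]]
print(cluster(dist, threshold=1, link="average"))   # [[0, 1], [2]]
```

Single linkage joins the chain into one group because each neighbouring pair is within the threshold, while complete and average linkage keep record 2 separate.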
 
  - --maxmiss MAX_MISSING
 
  - The maximum number of non-ACGT characters (gaps or Ns) to permit in the
      junction sequence before excluding the record from clonal assignment.
      Note, under single linkage non-informative positions can create
      artifactual links between unrelated sequences. Use with caution. (default:
      0)
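
The filter described above can be sketched as a count of non-ACGT characters compared against the permitted maximum. The helper names count_missing and passes_maxmiss are hypothetical, and upper-casing the junction before counting is an assumption of this sketch.

```python
def count_missing(junction):
    # Count characters other than A, C, G or T (for example N or gap symbols).
    return sum(ch not in "ACGT" for ch in junction.upper())

def passes_maxmiss(junction, max_missing=0):
    # Keep the record only if it has no more than max_missing such characters.
    return count_missing(junction) <= max_missing

junction = "TGTNNG-A"  # hypothetical junction with two Ns and one gap
print(count_missing(junction))                  # 3
print(passes_maxmiss(junction))                 # False
print(passes_maxmiss(junction, max_missing=3))  # True
```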
 
  
output files:
  - clone-pass

  - database with assigned clonal group numbers.

  - clone-fail

  - database with records failing clonal grouping.

required fields:
  - sequence_id, v_call, j_call, junction
 
This manpage was written by Nilesh Patra for the Debian distribution and
can be used for any other usage of the program.