SUBSTITCH(1) User Contributed Perl Documentation SUBSTITCH(1)

substitch.pl -- Split/merge stitch files into/out of stitch files

substitch.pl --split 5 allchromosomes.stitch #split big stitch into 5 roughly equal chunks

substitch.pl --project allspecies.seqs sub.anchors #project some anchors into a different coordinate space (as long as the stitch component sequences match)

--verbose => makes output more verbose
--faketfidf => fake tfidf scores based on score stat in file

Note on split: This program does not claim to produce an optimal splitting. It tries a couple of heuristics, refines the results, and picks the best arrangement it has found so far. Technically this is a variation on the traditional "trunk packing problem," which is (at least in the abstract case) NP-hard, if I remember 15-251 correctly. This particular variety of trunk packing, however, seems like it should be solvable faster (worst case some n^k dynamic programming, I think, but I'm betting this way is faster and tons easier to write for 90% of the cases out there). If anyone reading this goes "You moron, this has been solved a thousand times already," please let me know how: krisp@dna.bio.keio.ac.jp
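
To make the idea of "try a couple of heuristics and keep the best" concrete, here is a minimal Python sketch. It is not the code substitch.pl uses. It assumes each stitch component is just a (name, length) pair that may go into any chunk, and it compares two simple greedy orderings (file order and longest-first). The function names and the lengths used in the usage note are hypothetical.

```python
import heapq


def greedy_split(items, k):
    """Assign each (name, length) item, in the given order, to the
    chunk that currently has the smallest total length."""
    heap = [(0, i) for i in range(k)]  # (chunk total, chunk index)
    heapq.heapify(heap)
    chunks = [[] for _ in range(k)]
    for name, length in items:
        total, i = heapq.heappop(heap)  # lightest chunk so far
        chunks[i].append(name)
        heapq.heappush(heap, (total + length, i))
    return chunks


def heaviest(chunks, lengths):
    """Total length of the largest chunk (smaller means better balanced)."""
    return max(sum(lengths[name] for name in chunk) for chunk in chunks)


def best_split(lengths, k):
    """Try two heuristics and keep the arrangement whose largest chunk
    is smallest. `lengths` maps component name -> length."""
    in_order = list(lengths.items())
    longest_first = sorted(in_order, key=lambda kv: (-kv[1], kv[0]))
    candidates = [greedy_split(in_order, k), greedy_split(longest_first, k)]
    return min(candidates, key=lambda chunks: heaviest(chunks, lengths))
```

For example, best_split({"chr1": 248, "chr2": 242, "chr3": 198, "chr4": 190}, 2) returns two lists of names whose total lengths are roughly equal. Neither heuristic is guaranteed to be optimal; greedy assignment only guarantees that the largest chunk is no bigger than the average chunk size plus the longest single component.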

2021-08-15 perl v5.32.1