add manpages
Sascha Steinbiss
7 years ago
0 | # mash-dist(1) | |
1 | ||
2 | ## NAME | |
3 | ||
4 | mash-dist - estimate the distance of query sequences to references | |
5 | ||
6 | ## SYNOPSIS | |
7 | ||
8 | *mash dist* [options] <reference> <query> [<query>] ... | |
9 | ||
10 | ## DESCRIPTION | |
11 | ||
12 | Estimate the distance of each query sequence to the reference. Both the | |
13 | reference and queries can be fasta or fastq, gzipped or not, or Mash sketch | |
14 | files (.msh) with matching k-mer sizes. Query files can also be files of file | |
15 | names (see *-l*). Whole files are compared by default (see *-i*). The output | |
16 | fields are [reference-ID, query-ID, distance, p-value, shared-hashes]. | |
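
The five output fields can be read back with a few lines of Python. This is only a sketch: it assumes one comparison per line with tab-separated fields, and it keeps shared-hashes as plain text because its exact shape is not stated here. `DistRow`, `parse_dist_line` and `keep` are hypothetical names for illustration, not part of Mash.

```python
from typing import NamedTuple


class DistRow(NamedTuple):
    reference_id: str
    query_id: str
    distance: float
    p_value: float
    shared_hashes: str  # kept as text; its internal format is an assumption


def parse_dist_line(line: str) -> DistRow:
    """Split one output line into the five documented fields (tab-separated assumed)."""
    ref, query, dist, pval, shared = line.rstrip("\n").split("\t")
    return DistRow(ref, query, float(dist), float(pval), shared)


def keep(row: DistRow, max_dist: float = 1.0, max_p: float = 1.0) -> bool:
    """Post-filter in the spirit of -d and -v; inclusive bounds are an assumption."""
    return row.distance <= max_dist and row.p_value <= max_p
```

Filtering after parsing is handy when the same result file is inspected at several thresholds without re-running the comparison.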
17 | ||
18 | ## OPTIONS | |
19 | ||
20 | *-h*:: | |
21 | Help | |
22 | ||
23 | *-p* <int>:: | |
24 | Parallelism. This many threads will be spawned for processing. [1] | |
25 | ||
26 | ### Input | |
27 | ||
28 | *-l*:: | |
29 | List input. Each query file contains a list of sequence files, one | |
30 | per line. The reference file is not affected. | |
31 | ||
32 | ### Output | |
33 | ||
34 | *-t*:: | |
35 | Table output (will not report p-values, but fields will be blank if | |
36 | they do not meet the p-value threshold). | |
37 | ||
38 | *-v* <num>:: | |
39 | Maximum p-value to report. (0-1) [1.0] | |
40 | ||
41 | *-d* <num>:: | |
42 | Maximum distance to report. (0-1) [1.0] | |
43 | ||
44 | ### Sketching | |
45 | ||
46 | *-k* <int>:: | |
47 | K-mer size. Hashes will be based on strings of this many | |
48 | nucleotides. Canonical nucleotides are used by default (see | |
49 | Alphabet options below). (1-32) [21] | |
50 | ||
51 | *-s* <int>:: | |
52 | Sketch size. Each sketch will have at most this many non-redundant | |
53 | min-hashes. [1000] | |
54 | ||
55 | *-i*:: | |
56 | Sketch individual sequences, rather than whole files. | |
57 | ||
58 | *-w* <num>:: | |
59 | Probability threshold for warning about low k-mer size. (0-1) [0.01] | |
60 | ||
61 | *-r*:: | |
62 | Input is a read set. See the Sketching (reads) options below. Incompatible with *-i*. | |
63 | ||
64 | ### Sketching (reads) | |
65 | ||
66 | *-b* <size>:: | |
67 | Use a Bloom filter of this size (raw bytes or with K/M/G/T) to | |
68 | filter out unique k-mers. This is useful if exact filtering with *-m* | |
69 | uses too much memory. However, some unique k-mers may pass | |
70 | erroneously, and copies cannot be counted beyond 2. Implies *-r*. | |
71 | ||
72 | *-m* <int>:: | |
73 | Minimum copies of each k-mer required to pass noise filter for | |
74 | reads. Implies *-r*. [1] | |
75 | ||
76 | *-c* <num>:: | |
77 | Target coverage. Sketching will conclude if this coverage is | |
78 | reached before the end of the input file (estimated by average | |
79 | k-mer multiplicity). Implies *-r*. | |
80 | ||
81 | *-g* <size>:: | |
82 | Genome size. If specified, will be used for p-value calculation | |
83 | instead of an estimated size from k-mer content. Implies *-r*. | |
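
The exact noise filter described for *-m* can be pictured as a plain k-mer count. The sketch below is an illustration only: it counts raw substrings, ignores strand and hashing, and holds every count in memory, which is the cost the Bloom filter option is meant to avoid. `kmers` and `filter_by_copies` are hypothetical helper names.

```python
from collections import Counter


def kmers(seq: str, k: int):
    """Yield every k-length substring of seq, left to right."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def filter_by_copies(reads, k: int, min_copies: int):
    """Return the set of k-mers seen at least min_copies times across all reads."""
    counts = Counter(km for read in reads for km in kmers(read, k))
    return {km for km, n in counts.items() if n >= min_copies}
```

With `min_copies=1` nothing is filtered, which matches the documented default of 1.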
84 | ||
85 | ### Sketching (alphabet) | |
86 | ||
87 | *-n*:: | |
88 | Preserve strand (by default, strand is ignored by using canonical | |
89 | DNA k-mers, which are alphabetical minima of forward-reverse | |
90 | pairs). Implied if an alphabet is specified with *-a* or *-z*. | |
91 | ||
92 | *-a*:: | |
93 | Use amino acid alphabet (A-Z, except BJOUXZ). Implies *-n*, *-k* 9. | |
94 | ||
95 | *-z* <text>:: | |
96 | Alphabet to base hashes on (case ignored by default; see *-Z*). | |
97 | K-mers with other characters will be ignored. Implies *-n*. | |
98 | ||
99 | *-Z*:: | |
100 | Preserve case in k-mers and alphabet (case is ignored by default). | |
101 | Sequence letters whose case is not in the current alphabet will be | |
102 | skipped when sketching. | |
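
The canonical DNA k-mers mentioned under *-n* can be sketched as follows, reading "forward-reverse pairs" as a k-mer and its reverse complement (an interpretation, not something this page spells out). The helper names are hypothetical, and only the letters A, C, G and T are handled.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(kmer: str) -> str:
    """Complement each base, then reverse the string."""
    return kmer.translate(_COMPLEMENT)[::-1]


def canonical(kmer: str) -> str:
    """Return the alphabetically smaller of a k-mer and its reverse complement."""
    kmer = kmer.upper()  # case is ignored by default
    return min(kmer, reverse_complement(kmer))
```

Because both strands map to the same string, a sequence and its reverse complement produce the same set of k-mers, which is why strand is ignored unless *-n* is given.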
103 | ||
104 | ## SEE ALSO | |
105 | ||
106 | mash(1) |
0 | # mash-info(1) | |
1 | ||
2 | ## NAME | |
3 | ||
4 | mash-info - display information about sketch files | |
5 | ||
6 | ## SYNOPSIS | |
7 | ||
8 | *mash info* [options] <sketch> | |
9 | ||
10 | ## DESCRIPTION | |
11 | ||
12 | Display information about sketch files. | |
13 | ||
14 | ## OPTIONS | |
15 | ||
16 | *-h*:: | |
17 | Help | |
18 | ||
19 | *-H*:: | |
20 | Only show header info. Do not list each sketch. Incompatible with *-t* | |
21 | and *-c*. | |
22 | ||
23 | *-t*:: | |
24 | Tabular output (rather than padded), with no header. Incompatible with | |
25 | *-H* and *-c*. | |
26 | ||
27 | *-c*:: | |
28 | Show hash count histograms for each sketch. Incompatible with *-H* and | |
29 | *-t*. | |
30 | ||
31 | ## SEE ALSO | |
32 | ||
33 | mash(1) |
0 | # mash-paste(1) | |
1 | ||
2 | ## NAME | |
3 | ||
4 | mash-paste - create a single sketch file from multiple sketch files | |
5 | ||
6 | ## SYNOPSIS | |
7 | ||
8 | *mash paste* [options] <out_prefix> <sketch> [<sketch>] ... | |
9 | ||
10 | ## DESCRIPTION | |
11 | ||
12 | Create a single sketch file from multiple sketch files. | |
13 | ||
14 | ## OPTIONS | |
15 | ||
16 | *-h*:: | |
17 | Help | |
18 | ||
19 | *-l*:: | |
20 | Input files are lists of file names. | |
21 | ||
22 | ## SEE ALSO | |
23 | ||
24 | mash(1) |
0 | # mash-sketch(1) | |
1 | ||
2 | ## NAME | |
3 | ||
4 | mash-sketch - create sketches (reduced representations for fast operations) | |
5 | ||
6 | ## SYNOPSIS | |
7 | ||
8 | *mash sketch* [options] fast(a|q)[.gz] ... | |
9 | ||
10 | ## DESCRIPTION | |
11 | ||
12 | Create a sketch file, which is a reduced representation of a sequence or set | |
13 | of sequences (based on min-hashes) that can be used for fast distance | |
14 | estimations. Input can be fasta or fastq files (gzipped or not), and "-" can | |
15 | be given to read from standard input. Input files can also be files of file | |
16 | names (see *-l*). For output, one sketch file will be generated, but it can have | |
17 | multiple sketches within it, divided by sequences or files (see *-i*). By | |
18 | default, the output file name will be the first input file with a '.msh' | |
19 | extension, or 'stdin.msh' if standard input is used (see *-o*). | |
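
To make "reduced representation based on min-hashes" concrete, here is a minimal bottom-k sketch in Python. It is an assumption-laden illustration, not Mash's implementation: the hash is a stand-in (`blake2b` truncated to 64 bits), strand and alphabet handling are left out, and `kmer_hash` and `sketch` are hypothetical names. The defaults mirror the documented *-k* and *-s* defaults.

```python
import hashlib
import heapq
from typing import List


def kmer_hash(kmer: str) -> int:
    """64-bit stand-in hash (an assumption; not the hash function Mash uses)."""
    digest = hashlib.blake2b(kmer.encode("ascii"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def sketch(seq: str, k: int = 21, size: int = 1000) -> List[int]:
    """Return up to `size` smallest distinct k-mer hashes, in ascending order."""
    hashes = {kmer_hash(seq[i:i + k]) for i in range(len(seq) - k + 1)}
    return heapq.nsmallest(size, hashes)
```

A sequence shorter than `k` yields an empty sketch, and a sequence with fewer distinct k-mers than `size` yields a shorter one, which is the sense of "at most this many" min-hashes.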
20 | ||
21 | ## OPTIONS | |
22 | ||
23 | *-h*:: | |
24 | Help | |
25 | ||
26 | *-p* <int>:: | |
27 | Parallelism. This many threads will be spawned for processing. [1] | |
28 | ||
29 | ### Input | |
30 | ||
31 | *-l*:: | |
32 | List input. Each file contains a list of sequence files, one per line. | |
33 | ||
34 | ### Output | |
35 | ||
36 | *-o* <path>:: | |
37 | Output prefix (first input file used if unspecified). The suffix | |
38 | '.msh' will be appended. | |
39 | ||
40 | ### Sketching | |
41 | ||
42 | *-k* <int>:: | |
43 | K-mer size. Hashes will be based on strings of this many | |
44 | nucleotides. Canonical nucleotides are used by default (see | |
45 | Alphabet options below). (1-32) [21] | |
46 | ||
47 | *-s* <int>:: | |
48 | Sketch size. Each sketch will have at most this many non-redundant | |
49 | min-hashes. [1000] | |
50 | ||
51 | *-i*:: | |
52 | Sketch individual sequences, rather than whole files. | |
53 | ||
54 | *-w* <num>:: | |
55 | Probability threshold for warning about low k-mer size. (0-1) [0.01] | |
56 | ||
57 | *-r*:: | |
58 | Input is a read set. See the Sketching (reads) options below. Incompatible with *-i*. | |
59 | ||
60 | ### Sketching (reads) | |
61 | ||
62 | *-b* <size>:: | |
63 | Use a Bloom filter of this size (raw bytes or with K/M/G/T) to | |
64 | filter out unique k-mers. This is useful if exact filtering with *-m* | |
65 | uses too much memory. However, some unique k-mers may pass | |
66 | erroneously, and copies cannot be counted beyond 2. Implies *-r*. | |
67 | ||
68 | *-m* <int>:: | |
69 | Minimum copies of each k-mer required to pass noise filter for | |
70 | reads. Implies *-r*. [1] | |
71 | ||
72 | *-c* <num>:: | |
73 | Target coverage. Sketching will conclude if this coverage is | |
74 | reached before the end of the input file (estimated by average | |
75 | k-mer multiplicity). Implies *-r*. | |
76 | ||
77 | *-g* <size>:: | |
78 | Genome size. If specified, will be used for p-value calculation | |
79 | instead of an estimated size from k-mer content. Implies *-r*. | |
80 | ||
81 | ### Sketching (alphabet) | |
82 | ||
83 | *-n*:: | |
84 | Preserve strand (by default, strand is ignored by using canonical | |
85 | DNA k-mers, which are alphabetical minima of forward-reverse | |
86 | pairs). Implied if an alphabet is specified with *-a* or *-z*. | |
87 | ||
88 | *-a*:: | |
89 | Use amino acid alphabet (A-Z, except BJOUXZ). Implies *-n*, *-k* 9. | |
90 | ||
91 | *-z* <text>:: | |
92 | Alphabet to base hashes on (case ignored by default; see *-Z*). | |
93 | K-mers with other characters will be ignored. Implies *-n*. | |
94 | ||
95 | *-Z*:: | |
96 | Preserve case in k-mers and alphabet (case is ignored by default). | |
97 | Sequence letters whose case is not in the current alphabet will be | |
98 | skipped when sketching. | |
99 | ||
100 | ## SEE ALSO | |
101 | ||
102 | mash(1) |
0 | # mash(1) | |
1 | ||
2 | ## NAME | |
3 | ||
4 | mash - fast genome and metagenome distance estimation using MinHash | |
5 | ||
6 | ## SYNOPSIS | |
7 | ||
8 | *mash* <command> [options] [arguments ...] | |
9 | ||
10 | ## DESCRIPTION | |
11 | ||
12 | *mash* is the main executable for the **Mash** software. The actual | |
13 | functionality is provided by the subtools (*commands*): | |
14 | ||
15 | ### Commands | |
16 | ||
17 | *bounds*:: | |
18 | Print a table of Mash error bounds. | |
19 | ||
20 | *dist*:: | |
21 | Estimate the distance of query sequences to references. | |
22 | ||
23 | *info*:: | |
24 | Display information about sketch files. | |
25 | ||
26 | *paste*:: | |
27 | Create a single sketch file from multiple sketch files. | |
28 | ||
29 | *sketch*:: | |
30 | Create sketches (reduced representations for fast operations). | |
31 | ||
32 | ## SEE ALSO | |
33 | ||
34 | mash-dist(1), mash-info(1), mash-paste(1), mash-sketch(1) |
0 | debian/man/*.1 |
23 | 23 | |
24 | 24 | override_dh_auto_clean: |
25 | 25 | dh_auto_clean |
26 | rm -rf $(CURDIR)/debian/sphinxdoc | |
26 | rm -rf $(CURDIR)/debian/sphinxdoc $(CURDIR)/debian/man | |
27 | 27 | |
28 | 28 | override_dh_auto_configure: |
29 | 29 | dh_auto_configure -- --with-capnp=/usr --with-gsl=/usr --prefix=$(CURDIR)/debian/tmp/usr |
31 | 31 | override_dh_auto_build: |
32 | 32 | dh_auto_build |
33 | 33 | sphinx-build doc/sphinx $(CURDIR)/debian/sphinxdoc |
34 | ||
35 | override_dh_installman: | |
36 | mkdir -p $(CURDIR)/debian/man | |
37 | asciidoctor -a docdate='' -b manpage $(CURDIR)/debian/man_src/*.adoc | |
38 | cp $(CURDIR)/debian/man_src/*.? $(CURDIR)/debian/man | |
39 | dh_installman -- |