mash / 8fed007
add manpages Sascha Steinbiss 7 years ago
7 changed file(s) with 312 addition(s) and 1 deletion(s).
0 # mash-dist(1)
1
2 ## NAME
3
4 mash-dist - estimate the distance of query sequences to references
5
6 ## SYNOPSIS
7
8 *mash dist* [options] <reference> <query> [<query>] ...
9
10 ## DESCRIPTION
11
12 Estimate the distance of each query sequence to the reference. Both the
13 reference and queries can be fasta or fastq, gzipped or not, or Mash sketch
14 files (.msh) with matching k-mer sizes. Query files can also be files of file
15 names (see *-l*). Whole files are compared by default (see *-i*). The output
16 fields are [reference-ID, query-ID, distance, p-value, shared-hashes].
17
18 ## OPTIONS
19
20 *-h*::
21 Help
22
23 *-p* <int>::
24 Parallelism. This many threads will be spawned for processing. [1]
25
26 ### Input
27
28 *-l*::
29 List input. Each query file contains a list of sequence files, one
30 per line. The reference file is not affected.
31
32 ### Output
33
34 *-t*::
35 Table output (will not report p-values, but fields will be blank if
36 they do not meet the p-value threshold).
37
38 *-v* <num>::
39 Maximum p-value to report. (0-1) [1.0]
40
41 *-d* <num>::
42 Maximum distance to report. (0-1) [1.0]
43
44 ### Sketching
45
46 *-k* <int>::
47 K-mer size. Hashes will be based on strings of this many
48 nucleotides. Canonical nucleotides are used by default (see
49 Alphabet options below). (1-32) [21]
50
51 *-s* <int>::
52 Sketch size. Each sketch will have at most this many non-redundant
53 min-hashes. [1000]
54
55 *-i*::
56 Sketch individual sequences, rather than whole files.
57
58 *-w* <num>::
59 Probability threshold for warning about low k-mer size. (0-1) [0.01]
60
61 *-r*::
62 Input is a read set. See Reads options below. Incompatible with *-i*.
63
64 ### Sketching (reads)
65
66 *-b* <size>::
67 Use a Bloom filter of this size (raw bytes or with K/M/G/T) to
68 filter out unique k-mers. This is useful if exact filtering with *-m*
69 uses too much memory. However, some unique k-mers may pass
70 erroneously, and copies cannot be counted beyond 2. Implies *-r*.
71
72 *-m* <int>::
73 Minimum copies of each k-mer required to pass noise filter for
74 reads. Implies *-r*. [1]
75
76 *-c* <num>::
77 Target coverage. Sketching will conclude if this coverage is
78 reached before the end of the input file (estimated by average
79 k-mer multiplicity). Implies *-r*.
80
81 *-g* <size>::
82 Genome size. If specified, will be used for p-value calculation
83 instead of an estimated size from k-mer content. Implies *-r*.
84
85 ### Sketching (alphabet)
86
87 *-n*::
88 Preserve strand (by default, strand is ignored by using canonical
89 DNA k-mers, which are alphabetical minima of forward-reverse
90 pairs). Implied if an alphabet is specified with *-a* or *-z*.
91
92 *-a*::
93 Use amino acid alphabet (A-Z, except BJOUXZ). Implies *-n*, *-k* 9.
94
95 *-z* <text>::
96 Alphabet to base hashes on (case ignored by default; see *-Z*).
97 K-mers with other characters will be ignored. Implies *-n*.
98
99 *-Z*::
100 Preserve case in k-mers and alphabet (case is ignored by default).
101 Sequence letters whose case is not in the current alphabet will be
102 skipped when sketching.
103
104 ## SEE ALSO
105
106 mash(1)
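
Note on the mash-dist page above: its DESCRIPTION lists five output fields (reference-ID, query-ID, distance, p-value, shared-hashes). The Python sketch below shows one way to read such a line into a record. It is an illustration, not part of this commit. It assumes the fields are tab-separated, which the manpage does not state, and the names `DistRecord` and `parse_dist_line` are hypothetical. The shared-hashes field is kept as a string because its format is not specified here.

```python
from typing import NamedTuple


class DistRecord(NamedTuple):
    """One line of `mash dist` output (hypothetical helper type)."""
    reference_id: str
    query_id: str
    distance: float
    p_value: float
    shared_hashes: str  # kept verbatim; format not specified in the manpage


def parse_dist_line(line: str, sep: str = "\t") -> DistRecord:
    """Split one output line into the five documented fields.

    Assumption: fields are separated by `sep` (tab by default).
    """
    fields = line.rstrip("\n").split(sep)
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields, got {len(fields)}: {line!r}")
    ref, query, dist, pval, shared = fields
    return DistRecord(ref, query, float(dist), float(pval), shared)


if __name__ == "__main__":
    # Made-up example line, for illustration only.
    example = "ref.fna\tquery.fna\t0.05\t1e-10\t200/1000"
    print(parse_dist_line(example))
```

If the delimiter turns out to differ, only the `sep` argument needs to change.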
0 # mash-info(1)
1
2 ## NAME
3
4 mash-info - display information about sketch files
5
6 ## SYNOPSIS
7
8 *mash info* [options] <sketch>
9
10 ## DESCRIPTION
11
12 Displays information about sketch files.
13
14 ## OPTIONS
15
16 *-h*::
17 Help
18
19 *-H*::
20 Only show header info. Do not list each sketch. Incompatible with *-t*
21 and *-c*.
22
23 *-t*::
24 Tabular output (rather than padded), with no header. Incompatible with
25 *-H* and *-c*.
26
27 *-c*::
28 Show hash count histograms for each sketch. Incompatible with *-H* and
29 *-t*.
30
31 ## SEE ALSO
32
33 mash(1)
0 # mash-paste(1)
1
2 ## NAME
3
4 mash-paste - create a single sketch file from multiple sketch files
5
6 ## SYNOPSIS
7
8 *mash paste* [options] <out_prefix> <sketch> [<sketch>] ...
9
10 ## DESCRIPTION
11
12 Create a single sketch file from multiple sketch files.
13
14 ## OPTIONS
15
16 *-h*::
17 Help
18
19 *-l*::
20 Input files are lists of file names.
21
22 ## SEE ALSO
23
24 mash(1)
0 # mash-sketch(1)
1
2 ## NAME
3
4 mash-sketch - create sketches (reduced representations for fast operations)
5
6 ## SYNOPSIS
7
8 *mash sketch* [options] fast(a|q)[.gz] ...
9
10 ## DESCRIPTION
11
12 Create a sketch file, which is a reduced representation of a sequence or set
13 of sequences (based on min-hashes) that can be used for fast distance
14 estimations. Input can be fasta or fastq files (gzipped or not), and "-" can
15 be given to read from standard input. Input files can also be files of file
16 names (see *-l*). For output, one sketch file will be generated, but it can have
17 multiple sketches within it, divided by sequences or files (see *-i*). By
18 default, the output file name will be the first input file with a '.msh'
19 extension, or 'stdin.msh' if standard input is used (see *-o*).
20
21 ## OPTIONS
22
23 *-h*::
24 Help
25
26 *-p* <int>::
27 Parallelism. This many threads will be spawned for processing. [1]
28
29 ### Input
30
31 *-l*::
32 List input. Each file contains a list of sequence files, one per line.
33
34 ### Output
35
36 *-o* <path>::
37 Output prefix (first input file used if unspecified). The suffix
38 '.msh' will be appended.
39
40 ### Sketching
41
42 *-k* <int>::
43 K-mer size. Hashes will be based on strings of this many
44 nucleotides. Canonical nucleotides are used by default (see
45 Alphabet options below). (1-32) [21]
46
47 *-s* <int>::
48 Sketch size. Each sketch will have at most this many non-redundant
49 min-hashes. [1000]
50
51 *-i*::
52 Sketch individual sequences, rather than whole files.
53
54 *-w* <num>::
55 Probability threshold for warning about low k-mer size. (0-1) [0.01]
56
57 *-r*::
58 Input is a read set. See Reads options below. Incompatible with *-i*.
59
60 ### Sketching (reads)
61
62 *-b* <size>::
63 Use a Bloom filter of this size (raw bytes or with K/M/G/T) to
64 filter out unique k-mers. This is useful if exact filtering with *-m*
65 uses too much memory. However, some unique k-mers may pass
66 erroneously, and copies cannot be counted beyond 2. Implies *-r*.
67
68 *-m* <int>::
69 Minimum copies of each k-mer required to pass noise filter for
70 reads. Implies *-r*. [1]
71
72 *-c* <num>::
73 Target coverage. Sketching will conclude if this coverage is
74 reached before the end of the input file (estimated by average
75 k-mer multiplicity). Implies *-r*.
76
77 *-g* <size>::
78 Genome size. If specified, will be used for p-value calculation
79 instead of an estimated size from k-mer content. Implies *-r*.
80
81 ### Sketching (alphabet)
82
83 *-n*::
84 Preserve strand (by default, strand is ignored by using canonical
85 DNA k-mers, which are alphabetical minima of forward-reverse
86 pairs). Implied if an alphabet is specified with *-a* or *-z*.
87
88 *-a*::
89 Use amino acid alphabet (A-Z, except BJOUXZ). Implies *-n*, *-k* 9.
90
91 *-z* <text>::
92 Alphabet to base hashes on (case ignored by default; see *-Z*).
93 K-mers with other characters will be ignored. Implies *-n*.
94
95 *-Z*::
96 Preserve case in k-mers and alphabet (case is ignored by default).
97 Sequence letters whose case is not in the current alphabet will be
98 skipped when sketching.
99
100 ## SEE ALSO
101
102 mash(1)
0 # mash(1)
1
2 ## NAME
3
4 mash - fast genome and metagenome distance estimation using MinHash
5
6 ## SYNOPSIS
7
8 *mash* <command> [options] [arguments ...]
9
10 ## DESCRIPTION
11
12 *mash* is the main executable for the **Mash** software. The actual
13 functionality is provided by the subtools (*commands*):
14
15 ### Commands
16
17 *bounds*::
18 Print a table of Mash error bounds.
19
20 *dist*::
21 Estimate the distance of query sequences to references.
22
23 *info*::
24 Display information about sketch files.
25
26 *paste*::
27 Create a single sketch file from multiple sketch files.
28
29 *sketch*::
30 Create sketches (reduced representations for fast operations).
31
32 ## SEE ALSO
33
34 mash-dist(1), mash-info(1), mash-paste(1), mash-sketch(1)
0 debian/man/*.1
23 23
24 24   override_dh_auto_clean:
25 25       dh_auto_clean
26    -     rm -rf $(CURDIR)/debian/sphinxdoc
   26 +     rm -rf $(CURDIR)/debian/sphinxdoc $(CURDIR)/debian/man
27 27
28 28   override_dh_auto_configure:
29 29       dh_auto_configure -- --with-capnp=/usr --with-gsl=/usr --prefix=$(CURDIR)/debian/tmp/usr
31 31   override_dh_auto_build:
32 32       dh_auto_build
33 33       sphinx-build doc/sphinx $(CURDIR)/debian/sphinxdoc
   34 +
   35 + override_dh_installman:
   36 +     mkdir -p $(CURDIR)/debian/man
   37 +     asciidoctor -a docdate='' -b manpage $(CURDIR)/debian/man_src/*.adoc
   38 +     cp $(CURDIR)/debian/man_src/*.? $(CURDIR)/debian/man
   39 +     dh_installman --