fastaq / debian/1.6.0
Add debian/ dir as uploaded for version 1.6.0 which magically vanished after `git import-orig`. Seems this repository remains broken. :-(
Andreas Tille 9 years ago
12 changed file(s) with 1162 addition(s) and 0 deletion(s).
0 fastaq (1.6.0-1) experimental; urgency=medium
1
2 * New upstream release
3
4 -- Jorge Soares <j.s.soares@gmail.com> Tue, 18 Nov 2014 16:34:01 +0000
5
6 fastaq (1.5.0-1) unstable; urgency=medium
7
8 * Initial release (Closes: #766321)
9
10 -- Jorge Soares <j.s.soares@gmail.com> Thu, 23 Oct 2014 20:23:54 +0200
0 Source: fastaq
1 Maintainer: Debian Med Packaging Team <debian-med-packaging@lists.alioth.debian.org>
2 Uploaders: Andreas Tille <tille@debian.org>,
3 Jorge Soares <j.s.soares@gmail.com>
4 Section: science
5 Priority: optional
6 Build-Depends: debhelper (>= 9),
7 python3,
8 python3-setuptools,
9 python3-numpy,
10 python3-nose,
11 samtools,
12 help2man
13 Standards-Version: 3.9.6
14 Vcs-Browser: https://anonscm.debian.org/cgit/debian-med/fastaq.git
15 Homepage: https://github.com/sanger-pathogens/Fastaq
16
17 Package: fastaq
18 Architecture: all
19 Depends: ${python3:Depends},
20 ${misc:Depends}
21 Description: FASTA and FASTQ file manipulation tools
22 A collection of scripts that perform useful and common
23 fasta/q manipulation tasks.
24 .
25 All scripts automatically detect whether the input is
26 a FASTA or FASTQ file.
27 .
28 Input and output files can be gzipped.
29 .
30 fastaq_capillary_to_pairs -
31 Given a fasta/q file of capillary reads,
32 makes an interleaved file of read pairs
33 .
34 fastaq_chunker -
35 Splits a multi fasta/q file into separate files.
36 Splits sequences into chunks of a fixed size.
37 .
38 fastaq_count_sequences -
39 Counts the number of sequences in a fasta/q file
40 .
41 fastaq_deinterleave -
42 Deinterleaves fasta/q file, so that reads are written
43 alternately between two output files
44 .
45 fastaq_enumerate_names -
46 Renames sequences in a file, calling them 1,2,3...
47 .
48 fastaq_expand_nucleotides -
49 Makes all combinations of sequences in input file
50 by using all possibilities of redundant bases.
51 e.g. ART could be AAT or AGT.
52 .
53 fastaq_extend_gaps -
54 Extends the length of all gaps (and trims the start/end
55 of sequences) in a fasta/q file.
56 .
57 fastaq_fasta_to_fastq -
58 Given a fasta and qual file, makes a fastq file.
59 .
60 fastaq_filter -
61 Filters a fasta/q file by sequence length and/or
62 by name matching a regular expression.
63 .
64 fastaq_get_ids -
65 Gets IDs from each sequence in a fasta or fastq file.
66 .
67 fastaq_get_seq_flanking_gaps -
68 Gets the sequences either side of gaps in a fasta/q file.
69 .
70 fastaq_insert_or_delete_bases -
71 Deletes or inserts bases at given position(s)
72 from a fasta/q file.
73 .
74 fastaq_interleave -
75 Interleaves two fasta/q files, so that reads are written
76 alternately first/second in output file.
77 .
78 fastaq_long_read_simulate -
79 Simulates long reads from a fasta/q file. Can optionally
80 make insertions into the reads, like pacbio does.
81 .
82 fastaq_make_random_contigs -
83 Makes a multi-fasta file of random sequences,
84 all of the same length. Each base has equal chance of
85 being A,C,G or T
86 .
87 fastaq_merge -
88 Converts multi fasta/q file to single sequence file,
89 preserving original order of sequences.
90 .
91 fastaq_replace_bases -
92 Replaces all occurrences of one letter with another in
93 a fasta/q file.
94 .
95 fastaq_reverse_complement -
96 Reverse complements all sequences in a fasta/q file
97 .
98 fastaq_scaffolds_to_contigs -
99 Creates a file of contigs from a file of scaffolds - i.e.
100 breaks at every gap in the input.
101 .
102 fastaq_search_for_seq -
103 Searches for an exact match on a given string and its
104 reverse complement, in every sequence of a fasta/q file.
105 Case insensitive. Guaranteed to find all hits.
106 .
107 fastaq_sequence_trim -
108 Trims sequences off the start of all sequences in a pair
109 of fasta/q files, whenever there is a perfect match.
110 Only keeps a read pair if both reads of the pair are at
111 least a minimum length after any trimming.
112 .
113 fastaq_split_by_base_count -
114 Splits a multi fasta/q file into separate files.
115 Does not split sequences. Puts up to max_bases
116 into each split file. The exception is that any
117 sequence longer than max_bases is put into its own file.
118 .
119 fastaq_strip_illumina_suffix -
120 Strips /1 or /2 off the end of every read name
121 in a fasta/q file.
122 .
123 fastaq_to_fake_qual -
124 Makes fake quality scores file from a fasta/q file.
125 .
126 fastaq_to_fasta -
127 Converts sequence file to FASTA format.
128 .
129 fastaq_to_mira_xml -
130 Creates an xml file from a fasta/q file of reads,
131 for use with Mira assembler.
132 .
133 fastaq_to_orfs_gff -
134 Writes a GFF file of open reading frames from a fasta/q file
135 .
136 fastaq_to_perfect_reads -
137 Makes perfect paired end fastq reads from a fasta/q file,
138 with insert sizes sampled from a normal distribution.
139 Read orientation is innies. Output is an interleaved fastq file.
140 .
141 fastaq_to_random_subset -
142 Takes a random subset of reads from a fasta/q file and optionally
143 the corresponding read from a mates file.
144 Output is interleaved if mates file given.
145 .
146 fastaq_to_tiling_bam -
147 Takes a fasta/q file. Makes a BAM file containing perfect
148 (unpaired) reads tiling the whole genome.
149 .
150 fastaq_to_unique_by_id -
151 Removes duplicate sequences from a fasta/q file,
152 based on their names. If the same name is found
153 more than once, then the longest sequence is kept.
154 Order of sequences is preserved in output.
155 .
156 fastaq_translate -
157 Translates all sequences in a fasta or fastq file.
158 Output is always fasta format
159 .
160 fastaq_trim_ends -
161 Trims set number of bases off each sequence in a fasta/q file
162 .
163 fastaq_trim_Ns_at_end -
164 Trims any Ns off each sequence in a fasta/q file.
165 Does nothing to gaps in the middle, just trims the ends
166 .
167 A developer API is also provided by this package.
168 There are plenty of examples in tasks.py
0 Format: http://www.debian.org/doc/packaging-manuals/copyright-format/1.0/
1 Upstream-Name: Fastaq
2 Source: https://github.com/sanger-pathogens/Fastaq
3
4 Files: *
5 Copyright: © 2012-2013 Martin Hunt <mh12@sanger.ac.uk>
6 License: GPL-3
7 This package is free software; you can redistribute it and/or modify
8 it under the terms of the GNU General Public License as published by
9 the Free Software Foundation; either version 3 of the License, or
10 (at your option) any later version.
11 .
12 This package is distributed in the hope that it will be useful,
13 but WITHOUT ANY WARRANTY; without even the implied warranty of
14 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
15 GNU General Public License for more details.
16 .
17 You should have received a copy of the GNU General Public License
18 along with this program. If not, see <http://www.gnu.org/licenses/>.
19 .
20 On Debian systems, the complete text of the GNU General
21 Public License version 3 can be found in "/usr/share/common-licenses/GPL-3".
0 debian/man/*
0 Description: Delay import of Fastaq modules by the python executables
1 Man pages for this package are created automatically through the
2 help2man wrapper called usage_to_man. help2man calls the python executables
3 with the -h option and converts the usage into a man page.
4 .
5 The first step done by all the executables is the import of the modules deployed
6 by this package. Since the package is not installed in the system at build time,
7 that import fails and the man pages would never be properly created.
8 .
9 This patch solves this problem by importing the modules in this package after
10 the argument parsing code, so that -h exits before any import is attempted.
11 .
12 Upstream preferred to keep the code as it is for styling reasons, which is
13 perfectly reasonable.
14 .
15 fastaq (1.5.0-1) UNRELEASED; urgency=low
16 .
17 * Initial release (Closes: #766321)
18 Author: Jorge Soares <j.s.soares@gmail.com>
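The deferred-import pattern described in this patch header can be sketched in isolation as below. This is a minimal illustration, not the packaged script: the `build_parser` and `main` helpers and the `example_tool` program name are hypothetical, and only `tasks.count_sequences` is taken from the patched scripts. The point is that argparse handles `-h` inside `parse_args()` and exits there, so the `fastaq` import is never reached when help2man asks for the usage text.

```python
import argparse


def build_parser():
    # Hypothetical helper: the real scripts build the parser at module level.
    parser = argparse.ArgumentParser(
        prog="example_tool",  # hypothetical name, for illustration only
        description="Counts the number of sequences in a fasta/q file",
    )
    parser.add_argument("infile", help="Name of input fasta/q file")
    return parser


def main(argv=None):
    # With -h, parse_args() prints the usage and raises SystemExit(0),
    # so nothing below this line runs in that case.
    options = build_parser().parse_args(argv)

    # Deferred import: only attempted once argument parsing has succeeded,
    # so the package does not need to be installed to render the help text.
    from fastaq import tasks

    print(tasks.count_sequences(options.infile))
```

A real script would finish by calling `main()`. The trade-off is stylistic: the import no longer sits at the top of the file, which is the reason upstream gave for keeping the original layout.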
19 Index: fastaq/scripts/fastaq_capillary_to_pairs
20 ===================================================================
21 --- fastaq.orig/scripts/fastaq_capillary_to_pairs
22 +++ fastaq/scripts/fastaq_capillary_to_pairs
23 @@ -1,7 +1,6 @@
24 #!/usr/bin/env python3
25
26 import argparse
27 -from fastaq import tasks
28
29 parser = argparse.ArgumentParser(
30 description = 'Given a fasta/q file of capillary reads, makes an interleaved file of read pairs (where more than read from same ligation, takes the longest read) and a file of unpaired reads. Replaces the .p1k/.q1k part of read names to denote fwd/rev reads with /1 and /2',
31 @@ -9,4 +8,8 @@ parser = argparse.ArgumentParser(
32 parser.add_argument('infile', help='Name of input fasta/q file')
33 parser.add_argument('outprefix', help='Prefix of output files', metavar='outfiles prefix')
34 options = parser.parse_args()
35 +
36 +
37 +from fastaq import tasks
38 +
39 tasks.capillary_to_pairs(options.infile, options.outprefix)
40 Index: fastaq/scripts/fastaq_chunker
41 ===================================================================
42 --- fastaq.orig/scripts/fastaq_chunker
43 +++ fastaq/scripts/fastaq_chunker
44 @@ -1,7 +1,6 @@
45 #!/usr/bin/env python3
46
47 import argparse
48 -from fastaq import tasks
49
50 parser = argparse.ArgumentParser(
51 description = 'Splits a multi fasta/q file into separate files. Splits sequences into chunks of a fixed size. Aims for chunk_size chunks in each file, but allows a little extra, so chunk can be up to (chunk_size + tolerance), to prevent tiny chunks made from the ends of sequences',
52 @@ -12,6 +11,10 @@ parser.add_argument('chunk_size', type=i
53 parser.add_argument('tolerance', type=int, help='Tolerance allowed in chunk size')
54 parser.add_argument('--skip_all_Ns', action='store_true', help='Do not output any sequence that consists of all Ns')
55 options = parser.parse_args()
56 +
57 +
58 +from fastaq import tasks
59 +
60 tasks.split_by_fixed_size(
61 options.infile,
62 options.outprefix,
63 Index: fastaq/scripts/fastaq_count_sequences
64 ===================================================================
65 --- fastaq.orig/scripts/fastaq_count_sequences
66 +++ fastaq/scripts/fastaq_count_sequences
67 @@ -1,11 +1,14 @@
68 #!/usr/bin/env python3
69
70 import argparse
71 -from fastaq import tasks
72
73 parser = argparse.ArgumentParser(
74 description = 'Counts the number of sequences in a fasta/q file',
75 usage = '%(prog)s <fasta/q in>')
76 parser.add_argument('infile', help='Name of input fasta/q file')
77 options = parser.parse_args()
78 +
79 +
80 +from fastaq import tasks
81 +
82 print(tasks.count_sequences(options.infile))
83 Index: fastaq/scripts/fastaq_deinterleave
84 ===================================================================
85 --- fastaq.orig/scripts/fastaq_deinterleave
86 +++ fastaq/scripts/fastaq_deinterleave
87 @@ -1,7 +1,6 @@
88 #!/usr/bin/env python3
89
90 import argparse
91 -from fastaq import tasks
92
93 parser = argparse.ArgumentParser(
94 description = 'Deinterleaves fasta/q file, so that reads are written alternately between two output files',
95 @@ -11,4 +10,8 @@ parser.add_argument('infile', help='Name
96 parser.add_argument('out_fwd', help='Name of output fasta/q file of forwards reads')
97 parser.add_argument('out_rev', help='Name of output fasta/q file of reverse reads')
98 options = parser.parse_args()
99 +
100 +
101 +from fastaq import tasks
102 +
103 tasks.deinterleave(options.infile, options.out_fwd, options.out_rev, fasta_out=options.fasta_out)
104 Index: fastaq/scripts/fastaq_enumerate_names
105 ===================================================================
106 --- fastaq.orig/scripts/fastaq_enumerate_names
107 +++ fastaq/scripts/fastaq_enumerate_names
108 @@ -1,7 +1,6 @@
109 #!/usr/bin/env python3
110
111 import argparse
112 -from fastaq import tasks
113
114 parser = argparse.ArgumentParser(
115 description = 'Renames sequences in a file, calling them 1,2,3... etc',
116 @@ -12,6 +11,10 @@ parser.add_argument('--keep_suffix', act
117 parser.add_argument('infile', help='Name of fasta/q file to be read')
118 parser.add_argument('outfile', help='Name of output fasta/q file')
119 options = parser.parse_args()
120 +
121 +
122 +from fastaq import tasks
123 +
124 tasks.enumerate_names(options.infile,
125 options.outfile,
126 start_index=options.start_index,
127 Index: fastaq/scripts/fastaq_expand_nucleotides
128 ===================================================================
129 --- fastaq.orig/scripts/fastaq_expand_nucleotides
130 +++ fastaq/scripts/fastaq_expand_nucleotides
131 @@ -1,7 +1,6 @@
132 #!/usr/bin/env python3
133
134 import argparse
135 -from fastaq import tasks
136
137 parser = argparse.ArgumentParser(
138 description = 'Makes all combinations of sequences in input file by using all possibilities of redundant bases. e.g. ART could be AAT or AGT. Assumes input is nucleotides, not amino acids',
139 @@ -9,6 +8,10 @@ parser = argparse.ArgumentParser(
140 parser.add_argument('infile', help='Name of input file. Can be any of FASTA, FASTQ, GFF3, EMBL, GBK, Phylip')
141 parser.add_argument('outfile', help='Name of output file')
142 options = parser.parse_args()
143 +
144 +
145 +from fastaq import tasks
146 +
147 tasks.expand_nucleotides(
148 options.infile,
149 options.outfile,
150 Index: fastaq/scripts/fastaq_extend_gaps
151 ===================================================================
152 --- fastaq.orig/scripts/fastaq_extend_gaps
153 +++ fastaq/scripts/fastaq_extend_gaps
154 @@ -1,7 +1,6 @@
155 #!/usr/bin/env python3
156
157 import argparse
158 -from fastaq import tasks
159
160 parser = argparse.ArgumentParser(
161 description = 'Extends the length of all gaps (and trims the start/end of sequences) in a fasta/q file. Does this by replacing a set number of bases either side of each gap with Ns. Any sequence that ends up as all Ns is lost',
162 @@ -10,4 +9,8 @@ parser.add_argument('--trim_number', typ
163 parser.add_argument('infile', help='Name of input fasta/q file')
164 parser.add_argument('outfile', help='Name of output fasta/q file')
165 options = parser.parse_args()
166 +
167 +
168 +from fastaq import tasks
169 +
170 tasks.extend_gaps(options.infile, options.outfile, options.trim_number)
171 Index: fastaq/scripts/fastaq_fasta_to_fastq
172 ===================================================================
173 --- fastaq.orig/scripts/fastaq_fasta_to_fastq
174 +++ fastaq/scripts/fastaq_fasta_to_fastq
175 @@ -1,7 +1,6 @@
176 #!/usr/bin/env python3
177
178 import argparse
179 -from fastaq import tasks
180
181 parser = argparse.ArgumentParser(
182 description = 'Given a fasta and qual file, makes a fastq file',
183 @@ -10,4 +9,8 @@ parser.add_argument('fasta', help='Name
184 parser.add_argument('qual', help='Name of input quality scores file', metavar='qual in')
185 parser.add_argument('outfile', help='Name of output fastq file', metavar='fastq out')
186 options = parser.parse_args()
187 +
188 +
189 +from fastaq import tasks
190 +
191 tasks.fasta_to_fastq(options.fasta, options.qual, options.outfile)
192 Index: fastaq/scripts/fastaq_filter
193 ===================================================================
194 --- fastaq.orig/scripts/fastaq_filter
195 +++ fastaq/scripts/fastaq_filter
196 @@ -1,7 +1,6 @@
197 #!/usr/bin/env python3
198
199 import argparse
200 -from fastaq import tasks
201
202 parser = argparse.ArgumentParser(
203 description = 'Filters a fasta/q file by sequence length and/or by name matching a regular expression',
204 @@ -14,6 +13,10 @@ parser.add_argument('-v', '--invert', ac
205 parser.add_argument('infile', help='Name of fasta/q file to be filtered')
206 parser.add_argument('outfile', help='Name of output fasta/q file')
207 options = parser.parse_args()
208 +
209 +
210 +from fastaq import tasks
211 +
212 tasks.filter(options.infile,
213 options.outfile,
214 minlength=options.min_length,
215 Index: fastaq/scripts/fastaq_get_ids
216 ===================================================================
217 --- fastaq.orig/scripts/fastaq_get_ids
218 +++ fastaq/scripts/fastaq_get_ids
219 @@ -1,7 +1,6 @@
220 #!/usr/bin/env python3
221
222 import argparse
223 -from fastaq import tasks
224
225 parser = argparse.ArgumentParser(
226 description = 'Gets IDs from each sequence in a fasta or fastq file',
227 @@ -9,4 +8,8 @@ parser = argparse.ArgumentParser(
228 parser.add_argument('infile', help='Name of input fasta/q file')
229 parser.add_argument('outfile', help='Name of output file')
230 options = parser.parse_args()
231 +
232 +
233 +from fastaq import tasks
234 +
235 tasks.get_ids(options.infile, options.outfile)
236 Index: fastaq/scripts/fastaq_get_seq_flanking_gaps
237 ===================================================================
238 --- fastaq.orig/scripts/fastaq_get_seq_flanking_gaps
239 +++ fastaq/scripts/fastaq_get_seq_flanking_gaps
240 @@ -1,7 +1,6 @@
241 #!/usr/bin/env python3
242
243 import argparse
244 -from fastaq import tasks
245
246 parser = argparse.ArgumentParser(
247 description = 'Gets the sequences either side of gaps in a fasta/q file',
248 @@ -11,4 +10,8 @@ parser.add_argument('--right', type=int,
249 parser.add_argument('infile', help='Name of input fasta/q file')
250 parser.add_argument('outfile', help='Name of output fasta/q file')
251 options = parser.parse_args()
252 +
253 +
254 +from fastaq import tasks
255 +
256 tasks.get_seqs_flanking_gaps(options.infile, options.outfile, options.left, options.right)
257 Index: fastaq/scripts/fastaq_insert_or_delete_bases
258 ===================================================================
259 --- fastaq.orig/scripts/fastaq_insert_or_delete_bases
260 +++ fastaq/scripts/fastaq_insert_or_delete_bases
261 @@ -1,9 +1,6 @@
262 #!/usr/bin/env python3
263
264 import argparse
265 -import sys
266 -import random
267 -from fastaq import sequences, utils, intervals
268
269 parser = argparse.ArgumentParser(
270 description = 'Deletes or inserts bases at given position(s) from a fasta/q file',
271 @@ -16,6 +13,11 @@ parser.add_argument('-i','--insert', act
272 parser.add_argument('--insert_range', help='Inserts random bases starting after position P in each sequence of the input file. Inserts start + (n-1)*step bases into sequence n.', metavar='P,start,step')
273 options = parser.parse_args()
274
275 +
276 +import sys
277 +import random
278 +from fastaq import sequences, utils, intervals
279 +
280 test_ops = [int(x is not None) for x in [options.delete, options.insert, options.delete_range, options.insert_range]]
281
282 if sum(test_ops) != 1:
283 Index: fastaq/scripts/fastaq_interleave
284 ===================================================================
285 --- fastaq.orig/scripts/fastaq_interleave
286 +++ fastaq/scripts/fastaq_interleave
287 @@ -1,7 +1,6 @@
288 #!/usr/bin/env python3
289
290 import argparse
291 -from fastaq import tasks
292
293 parser = argparse.ArgumentParser(
294 description = 'Interleaves two fasta/q files, so that reads are written alternately first/second in output file',
295 @@ -10,4 +9,8 @@ parser.add_argument('infile_1', help='Na
296 parser.add_argument('infile_2', help='Name of second input fasta/q file')
297 parser.add_argument('outfile', help='Name of output fasta/q file of interleaved reads')
298 options = parser.parse_args()
299 +
300 +
301 +from fastaq import tasks
302 +
303 tasks.interleave(options.infile_1, options.infile_2, options.outfile)
304 Index: fastaq/scripts/fastaq_long_read_simulate
305 ===================================================================
306 --- fastaq.orig/scripts/fastaq_long_read_simulate
307 +++ fastaq/scripts/fastaq_long_read_simulate
308 @@ -1,7 +1,6 @@
309 #!/usr/bin/env python3
310
311 import argparse
312 -from fastaq import tasks
313
314 parser = argparse.ArgumentParser(
315 description = 'Simulates long reads from a fasta/q file. Can optionally make insertions into the reads, like pacbio does. If insertions made, coverage calculation is done before the insertions (so total read length may appear longer then expected).',
316 @@ -29,8 +28,11 @@ ins_group = parser.add_argument_group('o
317 ins_group.add_argument('--ins_skip', type=int, help='Insert a random base every --skip bases plus or minus --ins_window. If this option is used, must also use --ins_window.', metavar='INT')
318 ins_group.add_argument('--ins_window', type=int, help='See --ins_skip. If this option is used, must also use --ins_skip.', metavar='INT')
319
320 -
321 options = parser.parse_args()
322 +
323 +
324 +from fastaq import tasks
325 +
326 tasks.make_long_reads(
327 options.infile,
328 options.outfile,
329 Index: fastaq/scripts/fastaq_make_random_contigs
330 ===================================================================
331 --- fastaq.orig/scripts/fastaq_make_random_contigs
332 +++ fastaq/scripts/fastaq_make_random_contigs
333 @@ -1,7 +1,6 @@
334 #!/usr/bin/env python3
335
336 import argparse
337 -from fastaq import tasks
338
339 parser = argparse.ArgumentParser(
340 description = 'Makes a multi-fasta file of random sequences, all of the same length. Each base has equal chance of being A,C,G or T',
341 @@ -14,6 +13,10 @@ parser.add_argument('contigs', type=int,
342 parser.add_argument('length', type=int, help='Length of each contig')
343 parser.add_argument('outfile', help='Name of output file')
344 options = parser.parse_args()
345 +
346 +
347 +from fastaq import tasks
348 +
349 tasks.make_random_contigs(
350 options.contigs,
351 options.length,
352 Index: fastaq/scripts/fastaq_merge
353 ===================================================================
354 --- fastaq.orig/scripts/fastaq_merge
355 +++ fastaq/scripts/fastaq_merge
356 @@ -1,7 +1,6 @@
357 #!/usr/bin/env python3
358
359 import argparse
360 -from fastaq import tasks
361
362 parser = argparse.ArgumentParser(
363 description = 'Converts multi fasta/q file to single sequence file, preserving original order of sequences',
364 @@ -10,6 +9,10 @@ parser.add_argument('infile', help='Name
365 parser.add_argument('outfile', help='Name of output file')
366 parser.add_argument('-n', '--name', help='Name of sequence in output file [%(default)s]', default='union')
367 options = parser.parse_args()
368 +
369 +
370 +from fastaq import tasks
371 +
372 tasks.merge_to_one_seq(
373 options.infile,
374 options.outfile,
375 Index: fastaq/scripts/fastaq_replace_bases
376 ===================================================================
377 --- fastaq.orig/scripts/fastaq_replace_bases
378 +++ fastaq/scripts/fastaq_replace_bases
379 @@ -1,7 +1,6 @@
380 #!/usr/bin/env python3
381
382 import argparse
383 -from fastaq import tasks
384
385 parser = argparse.ArgumentParser(
386 description = 'Replaces all occurences of one letter with another in a fasta/q file',
387 @@ -11,4 +10,8 @@ parser.add_argument('outfile', help='Nam
388 parser.add_argument('old', help='Base to be replaced')
389 parser.add_argument('new', help='Replace with this letter')
390 options = parser.parse_args()
391 +
392 +
393 +from fastaq import tasks
394 +
395 tasks.replace_bases(options.infile, options.outfile, options.old, options.new)
396 Index: fastaq/scripts/fastaq_reverse_complement
397 ===================================================================
398 --- fastaq.orig/scripts/fastaq_reverse_complement
399 +++ fastaq/scripts/fastaq_reverse_complement
400 @@ -1,7 +1,6 @@
401 #!/usr/bin/env python3
402
403 import argparse
404 -from fastaq import tasks
405
406 parser = argparse.ArgumentParser(
407 description = 'Reverse complements all sequences in a fasta/q file',
408 @@ -9,4 +8,8 @@ parser = argparse.ArgumentParser(
409 parser.add_argument('infile', help='Name of input fasta/q file')
410 parser.add_argument('outfile', help='Name of output fasta/q file')
411 options = parser.parse_args()
412 +
413 +
414 +from fastaq import tasks
415 +
416 tasks.reverse_complement(options.infile, options.outfile)
417 Index: fastaq/scripts/fastaq_scaffolds_to_contigs
418 ===================================================================
419 --- fastaq.orig/scripts/fastaq_scaffolds_to_contigs
420 +++ fastaq/scripts/fastaq_scaffolds_to_contigs
421 @@ -1,7 +1,6 @@
422 #!/usr/bin/env python3
423
424 import argparse
425 -from fastaq import tasks
426
427 parser = argparse.ArgumentParser(
428 description = 'Creates a file of contigs from a file of scaffolds - i.e. breaks at every gap in the input',
429 @@ -10,4 +9,7 @@ parser.add_argument('--number_contigs',
430 parser.add_argument('infile', help='Name of input fasta/q file')
431 parser.add_argument('outfile', help='Name of output contigs file')
432 options = parser.parse_args()
433 +
434 +from fastaq import tasks
435 +
436 tasks.scaffolds_to_contigs(options.infile, options.outfile, number_contigs=options.number_contigs)
437 Index: fastaq/scripts/fastaq_search_for_seq
438 ===================================================================
439 --- fastaq.orig/scripts/fastaq_search_for_seq
440 +++ fastaq/scripts/fastaq_search_for_seq
441 @@ -1,7 +1,6 @@
442 #!/usr/bin/env python3
443
444 import argparse
445 -from fastaq import tasks
446
447 parser = argparse.ArgumentParser(
448 description = 'Searches for an exact match on a given string and its reverese complement, in every sequences of a fasta/q file. Case insensitive. Guaranteed to find all hits',
449 @@ -10,4 +9,7 @@ parser.add_argument('infile', help='Name
450 parser.add_argument('outfile', help='Name of outputfile. Tab-delimited output: sequence name, position, strand')
451 parser.add_argument('search_string', help='String to search for in the sequences')
452 options = parser.parse_args()
453 +
454 +from fastaq import tasks
455 +
456 tasks.search_for_seq(options.infile, options.outfile, options.search_string)
457 Index: fastaq/scripts/fastaq_sequence_trim
458 ===================================================================
459 --- fastaq.orig/scripts/fastaq_sequence_trim
460 +++ fastaq/scripts/fastaq_sequence_trim
461 @@ -1,7 +1,6 @@
462 #!/usr/bin/env python3
463
464 import argparse
465 -from fastaq import tasks
466
467 parser = argparse.ArgumentParser(
468 description = 'Trims sequences off the start of all sequences in a pair of fasta/q files, whenever there is a perfect match. Only keeps a read pair if both reads of the pair are at least a minimum length after any trimming',
469 @@ -14,6 +13,10 @@ parser.add_argument('outfile_1', help='N
470 parser.add_argument('outfile_2', help='Name of output reverse fasta/q file', metavar='out_2')
471 parser.add_argument('trim_seqs', help='Name of fasta/q file of sequences to search for at the start of each input sequence', metavar='trim_seqs')
472 options = parser.parse_args()
473 +
474 +
475 +from fastaq import tasks
476 +
477 tasks.sequence_trim(
478 options.infile_1,
479 options.infile_2,
480 Index: fastaq/scripts/fastaq_split_by_base_count
481 ===================================================================
482 --- fastaq.orig/scripts/fastaq_split_by_base_count
483 +++ fastaq/scripts/fastaq_split_by_base_count
484 @@ -1,7 +1,6 @@
485 #!/usr/bin/env python3
486
487 import argparse
488 -from fastaq import tasks
489
490 parser = argparse.ArgumentParser(
491 description = 'Splits a multi fasta/q file into separate files. Does not split sequences. Puts up to max_bases into each split file. The exception is that any sequence longer than max_bases is put into its own file.',
492 @@ -12,4 +11,8 @@ parser.add_argument('max_bases', type=in
493 parser.add_argument('--max_seqs', type=int, help='Max number of sequences in each output split file [no limit]', metavar='INT')
494
495 options = parser.parse_args()
496 +
497 +
498 +from fastaq import tasks
499 +
500 tasks.split_by_base_count(options.infile, options.outprefix, options.max_bases, options.max_seqs)
501 Index: fastaq/scripts/fastaq_strip_illumina_suffix
502 ===================================================================
503 --- fastaq.orig/scripts/fastaq_strip_illumina_suffix
504 +++ fastaq/scripts/fastaq_strip_illumina_suffix
505 @@ -1,7 +1,6 @@
506 #!/usr/bin/env python3
507
508 import argparse
509 -from fastaq import tasks
510
511 parser = argparse.ArgumentParser(
512 description = 'Strips /1 or /2 off the end of every read name in a fasta/q file',
513 @@ -9,4 +8,8 @@ parser = argparse.ArgumentParser(
514 parser.add_argument('infile', help='Name of input fasta/q file')
515 parser.add_argument('outfile', help='Name of output fasta/q file')
516 options = parser.parse_args()
517 +
518 +
519 +from fastaq import tasks
520 +
521 tasks.strip_illumina_suffix(options.infile, options.outfile)
522 Index: fastaq/scripts/fastaq_to_fake_qual
523 ===================================================================
524 --- fastaq.orig/scripts/fastaq_to_fake_qual
525 +++ fastaq/scripts/fastaq_to_fake_qual
526 @@ -1,7 +1,6 @@
527 #!/usr/bin/env python3
528
529 import argparse
530 -from fastaq import tasks
531
532 parser = argparse.ArgumentParser(
533 description = 'Makes fake quality scores file from a fasta/q file',
534 @@ -10,6 +9,10 @@ parser.add_argument('infile', help='Name
535 parser.add_argument('outfile', help='Name of output file')
536 parser.add_argument('-q', '--qual', type=int, help='Quality score to assign to all bases [%(default)s]', default=40)
537 options = parser.parse_args()
538 +
539 +
540 +from fastaq import tasks
541 +
542 tasks.fastaq_to_fake_qual(
543 options.infile,
544 options.outfile,
545 Index: fastaq/scripts/fastaq_to_fasta
546 ===================================================================
547 --- fastaq.orig/scripts/fastaq_to_fasta
548 +++ fastaq/scripts/fastaq_to_fasta
549 @@ -1,7 +1,6 @@
550 #!/usr/bin/env python3
551
552 import argparse
553 -from fastaq import tasks
554
555 parser = argparse.ArgumentParser(
556 description = 'Converts sequence file to FASTA format',
557 @@ -11,6 +10,10 @@ parser.add_argument('outfile', help='Nam
558 parser.add_argument('-l', '--line_length', type=int, help='Number of bases on each sequence line of output file [%(default)s]', default=60)
559 parser.add_argument('-s', '--strip_after_whitespace', action='store_true', help='Remove everything after first whitesapce in every sequence name')
560 options = parser.parse_args()
561 +
562 +
563 +from fastaq import tasks
564 +
565 tasks.to_fasta(
566 options.infile,
567 options.outfile,
568 Index: fastaq/scripts/fastaq_to_mira_xml
569 ===================================================================
570 --- fastaq.orig/scripts/fastaq_to_mira_xml
571 +++ fastaq/scripts/fastaq_to_mira_xml
572 @@ -1,7 +1,6 @@
573 #!/usr/bin/env python3
574
575 import argparse
576 -from fastaq import tasks
577
578 parser = argparse.ArgumentParser(
579 description = 'Creates an xml file from a fasta/q file of reads, for use with Mira assembler',
580 @@ -9,4 +8,8 @@ parser = argparse.ArgumentParser(
581 parser.add_argument('infile', help='Name of input fasta/q file')
582 parser.add_argument('xml_out', help='Name of output xml file')
583 options = parser.parse_args()
584 +
585 +
586 +from fastaq import tasks
587 +
588 tasks.fastaq_to_mira_xml(options.infile, options.xml_out)
589 Index: fastaq/scripts/fastaq_to_orfs_gff
590 ===================================================================
591 --- fastaq.orig/scripts/fastaq_to_orfs_gff
592 +++ fastaq/scripts/fastaq_to_orfs_gff
593 @@ -1,7 +1,6 @@
594 #!/usr/bin/env python3
595
596 import argparse
597 -from fastaq import tasks
598
599 parser = argparse.ArgumentParser(
600 description = 'Writes a GFF file of open reading frames from a fasta/q file',
601 @@ -10,4 +9,8 @@ parser.add_argument('--min_length', type
602 parser.add_argument('infile', help='Name of input fasta/q file')
603 parser.add_argument('gff_out', help='Name of output gff file')
604 options = parser.parse_args()
605 +
606 +
607 +from fastaq import tasks
608 +
609 tasks.fastaq_to_orfs_gff(options.infile, options.gff_out, min_length=options.min_length)
610 Index: fastaq/scripts/fastaq_to_perfect_reads
611 ===================================================================
612 --- fastaq.orig/scripts/fastaq_to_perfect_reads
613 +++ fastaq/scripts/fastaq_to_perfect_reads
614 @@ -1,10 +1,6 @@
615 #!/usr/bin/env python3
616
617 import argparse
618 -import random
619 -from math import floor, ceil
620 -from fastaq import sequences, utils
621 -import sys
622
623 parser = argparse.ArgumentParser(
624 description = 'Makes perfect paired end fastq reads from a fasta/q file, with insert sizes sampled from a normal distribution. Read orientation is innies. Output is an interleaved fastq file.',
625 @@ -20,6 +16,12 @@ parser.add_argument('--no_n', action='st
626 parser.add_argument('--seed', type=int, help='Seed for random number generator. Default is to use python\'s default', default=None, metavar='INT')
627 options = parser.parse_args()
628
629 +
630 +import random
631 +from math import floor, ceil
632 +from fastaq import sequences, utils
633 +import sys
634 +
635 random.seed(a=options.seed)
636
637 seq_reader = sequences.file_reader(options.infile)
638 Index: fastaq/scripts/fastaq_to_random_subset
639 ===================================================================
640 --- fastaq.orig/scripts/fastaq_to_random_subset
641 +++ fastaq/scripts/fastaq_to_random_subset
642 @@ -1,9 +1,6 @@
643 #!/usr/bin/env python3
644
645 -import sys
646 import argparse
647 -import random
648 -from fastaq import sequences, utils
649
650 parser = argparse.ArgumentParser(
651 description = 'Takes a random subset of reads from a fasta/q file and optionally the corresponding read ' +
652 @@ -15,6 +12,11 @@ parser.add_argument('outfile', help='Nam
653 parser.add_argument('probability', type=int, help='Probability of keeping any given read (pair) in [0,100]', metavar='INT')
654 options = parser.parse_args()
655
656 +
657 +import sys
658 +import random
659 +from fastaq import sequences, utils
660 +
661 seq_reader = sequences.file_reader(options.infile)
662 fout = utils.open_file_write(options.outfile)
663
664 Index: fastaq/scripts/fastaq_to_tiling_bam
665 ===================================================================
666 --- fastaq.orig/scripts/fastaq_to_tiling_bam
667 +++ fastaq/scripts/fastaq_to_tiling_bam
668 @@ -1,9 +1,6 @@
669 #!/usr/bin/env python3
670
671 import argparse
672 -import sys
673 -import os
674 -from fastaq import sequences, utils
675
676 parser = argparse.ArgumentParser(
677 description = 'Takes a fasta/q file. Makes a BAM file containing perfect (unpaired) reads tiling the whole genome',
678 @@ -17,6 +14,11 @@ parser.add_argument('outfile', help='Nam
679 parser.add_argument('--read_group', help='Add the given read group ID to all reads [%(default)s]' ,default='42')
680 options = parser.parse_args()
681
682 +
683 +import sys
684 +import os
685 +from fastaq import sequences, utils
686 +
687 # make a header first - we need to add the @RG line to the default header made by samtools
688 tmp_empty_file = options.outfile + '.tmp.empty'
689 f = utils.open_file_write(tmp_empty_file)
690 Index: fastaq/scripts/fastaq_to_unique_by_id
691 ===================================================================
692 --- fastaq.orig/scripts/fastaq_to_unique_by_id
693 +++ fastaq/scripts/fastaq_to_unique_by_id
694 @@ -1,7 +1,6 @@
695 #!/usr/bin/env python3
696
697 import argparse
698 -from fastaq import tasks
699
700 parser = argparse.ArgumentParser(
701 description = 'Removes duplicate sequences from a fasta/q file, based on their names. If the same name is found more than once, then the longest sequence is kept. Order of sequences is preserved in output',
702 @@ -9,4 +8,8 @@ parser = argparse.ArgumentParser(
703 parser.add_argument('infile', help='Name of input fasta/q file')
704 parser.add_argument('outfile', help='Name of output fasta/q file')
705 options = parser.parse_args()
706 +
707 +
708 +from fastaq import tasks
709 +
710 tasks.to_unique_by_id(options.infile, options.outfile)
711 Index: fastaq/scripts/fastaq_translate
712 ===================================================================
713 --- fastaq.orig/scripts/fastaq_translate
714 +++ fastaq/scripts/fastaq_translate
715 @@ -1,7 +1,6 @@
716 #!/usr/bin/env python3
717
718 import argparse
719 -from fastaq import tasks
720
721 parser = argparse.ArgumentParser(
722 description = 'Translates all sequences in a fasta or fastq file. Output is always fasta format',
723 @@ -10,4 +9,8 @@ parser.add_argument('--frame', type=int,
724 parser.add_argument('infile', help='Name of fasta/q file to be translated', metavar='in.fasta/q')
725 parser.add_argument('outfile', help='Name of output fasta file', metavar='out.fasta')
726 options = parser.parse_args()
727 +
728 +
729 +from fastaq import tasks
730 +
731 tasks.translate(options.infile, options.outfile, frame=options.frame)
732 Index: fastaq/scripts/fastaq_trim_Ns_at_end
733 ===================================================================
734 --- fastaq.orig/scripts/fastaq_trim_Ns_at_end
735 +++ fastaq/scripts/fastaq_trim_Ns_at_end
736 @@ -1,7 +1,6 @@
737 #!/usr/bin/env python3
738
739 import argparse
740 -from fastaq import tasks
741
742 parser = argparse.ArgumentParser(
743 description = 'Trims any Ns off each sequence in a fasta/q file. Does nothing to gaps in the middle, just trims the ends',
744 @@ -9,4 +8,8 @@ parser = argparse.ArgumentParser(
745 parser.add_argument('infile', help='Name of input fasta/q file')
746 parser.add_argument('outfile', help='Name of output fasta/q file')
747 options = parser.parse_args()
748 +
749 +
750 +from fastaq import tasks
751 +
752 tasks.trim_Ns_at_end(options.infile, options.outfile)
753 Index: fastaq/scripts/fastaq_trim_ends
754 ===================================================================
755 --- fastaq.orig/scripts/fastaq_trim_ends
756 +++ fastaq/scripts/fastaq_trim_ends
757 @@ -1,7 +1,6 @@
758 #!/usr/bin/env python3
759
760 import argparse
761 -from fastaq import tasks
762
763 parser = argparse.ArgumentParser(
764 description = 'Trims set number of bases off each sequence in a fasta/q file',
765 @@ -11,4 +10,8 @@ parser.add_argument('start_trim', type=i
766 parser.add_argument('end_trim', type=int, help='Number of bases to trim off end')
767 parser.add_argument('outfile', help='Name of output fasta/q file')
768 options = parser.parse_args()
769 +
770 +
771 +from fastaq import tasks
772 +
773 tasks.trim(options.infile, options.outfile, options.start_trim, options.end_trim)
0 delay-import-statements-for-manpage-creation.patch
0 #!/usr/bin/make -f
1
2 export DH_VERBOSE := 1
3 export PYBUILD_NAME=fastaq
4
5 mandir := $(CURDIR)/debian/man
6 debfolder := $(CURDIR)/debian
7
8 %:
9 dh $@ --with python3 --buildsystem=pybuild
10
11 override_dh_auto_build:
12 dh_python3
13 dh_auto_build
14 mkdir -p $(CURDIR)/doc
15 cd $(CURDIR)/doc
16
17 override_dh_auto_clean:
18 rm -rf build .pybuild
19 rm -rf $(mandir)
20
21 override_dh_installman:
22 mkdir -p $(mandir)
23 $(debfolder)/usage_to_man
24 dh_installman --
0 3.0 (quilt)
0 Reference:
1 Author:
2 Title:
3 Journal:
4 Year:
5 Volume:
6 Number:
7 Pages:
8 DOI:
9 PMID:
10 URL:
11 eprint:
0 #!/usr/bin/perl
1 use strict;
2 use warnings;
3
4 #Converts Fastaq python scripts usage into man pages.
5 #The man pages are placed in debian/man, relative to the directory the script is run from
6
7 createManPages();
8
9 sub createManPages {
10
11 my $source= 'scripts';
12 my $destination= 'debian/man';
13 my $app_name = 'Fastaq';
14 my $descriptions = shortDescription();
15
16 unless ( -d $destination ) {
17 mkdir $destination or die "Cannot create directory $destination: $!\n";
18 }
19
20 my @files;
21
22 push(@files,`ls $source/fastaq_*`);
23
24 if ( scalar @files > 0 ) {
25
26 print "Creating manpages\n";
27 for my $file ( @files ) {
28 $file =~ s/\n$//;
29
30 my $filename = $file;
31 $filename =~ s/$source\///;
32
33 my $uc_filename = uc($filename);
34 my $man_file = $filename;
35
36 $man_file = $destination . '/' . $man_file . '.1';
37
38 open (my $man_fh, ">", $man_file) or die "Cannot open $man_file for writing: $!\n";
39
40 my $grep_string = $filename . ': error: too few arguments';
41
42 my $cmd = "help2man -m $filename -n $filename --no-discard-stderr $file | sed 's/usage://gi'";
43 my @output;
44 push(@output, `$cmd`);
45
46 for my $line (@output) {
47 $line =~ s/\n$//;
48
49 }
50
51 for (my $i = 0; $i < scalar @output; $i++) {
52 my $output_line = $output[$i];
53
54 if ($output_line =~ m/^\.TH/) {
55 $output_line =~ s/\s+/ /g;
56 $output_line =~ s/(\.TH) ("\d+") ("[a-zA-Z0-9_ ]*") ("[a-zA-Z0-9_<>\[\]\/\.\(\), ]*") ("[a-zA-Z0-9_]*")/$1 $uc_filename $2 $3 "$app_name" "Fastaq executables"/;
57 }
58
59 $output_line =~ s/ \\- $filename/$filename \- $descriptions->{$filename}/;
60
61 if ( $output_line =~ m/^\.PP/ && defined $output[$i + 1] && $output[$i + 1] =~ m/^\Q$filename\E:/ ) {
62 $output_line = $output[$i + 1] = '';
63 }
64
65 if ($output_line =~ m/^\.SH "SEE ALSO"/) {
66 last;
67 }
68 print $man_fh "$output_line\n";
69 }
70
71 writeAuthorAndCopyright($man_fh,$filename);
72 close($man_fh);
73 }
74 print "Manpage creation complete\n";
75 }
76 }
77
78 sub writeAuthorAndCopyright {
79
80 my ($man_fh,$filename) = @_;
81
82 my $author_blurb = <<END_OF_AUTHOR_BLURB;
83 .SH "AUTHOR"
84 .sp
85 $filename was originally written by Martin Hunt (mh12\@sanger.ac.uk)
86 END_OF_AUTHOR_BLURB
87
88 print $man_fh "$author_blurb\n";
89
90 my $copyright_blurb = <<'END_OF_C_BLURB';
91 .SH "COPYING"
92 .sp
93 Wellcome Trust Sanger Institute Copyright \(co 2013 Wellcome Trust Sanger Institute This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 3 of the License, or (at your option) any later version\&.
94 END_OF_C_BLURB
95
96 print $man_fh "$copyright_blurb\n";
97
98 }
99
100
101 sub shortDescription {
102
103 my %descriptions = (
104 fastaq_capillary_to_pairs => 'makes an interleaved file of read pairs',
105 fastaq_chunker => 'splits a multi fasta/q file into separate files',
106 fastaq_count_sequences => 'counts the number of sequences in a fasta/q file',
107 fastaq_deinterleave => 'deinterleaves fasta/q file',
108 fastaq_enumerate_names => 'renames sequences in a file, calling them 1,2,3...',
109 fastaq_expand_nucleotides => 'makes all combinations of sequences in input file',
110 fastaq_extend_gaps => 'extends the length of all gaps in a fasta/q file',
111 fastaq_fasta_to_fastq => 'given a fasta and qual file, makes a fastq file',
112 fastaq_filter => 'filters a fasta/q file by sequence length and/or by name',
113 fastaq_get_ids => 'gets ids from each sequence in a fasta or fastq file',
114 fastaq_get_seq_flanking_gaps => 'gets the sequences either side of gaps in a fasta/q file',
115 fastaq_insert_or_delete_bases => 'deletes or inserts bases at given position(s)',
116 fastaq_interleave => 'interleaves two fasta/q files',
117 fastaq_long_read_simulate => 'simulates long reads from a fasta/q file',
118 fastaq_make_random_contigs => 'makes a multi-fasta file of random sequences',
119 fastaq_merge => 'converts multi fasta/q file to single sequence file',
120 fastaq_replace_bases => 'replaces all occurrences of one letter with another',
121 fastaq_reverse_complement => 'reverse complements all sequences',
122 fastaq_scaffolds_to_contigs => 'creates a file of contigs from a file of scaffolds',
123 fastaq_search_for_seq => 'searches for an exact match on a given string and its reverse complement. guaranteed to find all hits',
124 fastaq_sequence_trim => 'trims sequences off the start of all sequences in a pair of fasta/q files',
125 fastaq_split_by_base_count => 'splits a multi fasta/q file into separate files',
126 fastaq_strip_illumina_suffix => 'strips /1 or /2 off the end of every read name',
127 fastaq_to_fake_qual => 'makes fake quality scores file',
128 fastaq_to_fasta => 'converts sequence file to fasta format',
129 fastaq_to_mira_xml => 'creates an xml file from a fasta/q file of reads, for use with mira assembler',
130 fastaq_to_orfs_gff => 'writes a gff file of open reading frames',
131 fastaq_to_perfect_reads => 'makes perfect paired end fastq reads',
132 fastaq_to_random_subset => 'takes a random subset of reads',
133 fastaq_to_tiling_bam => 'makes a bam file containing perfect (unpaired) reads tiling the whole genome',
134 fastaq_to_unique_by_id => 'removes duplicate sequences',
135 fastaq_translate => 'translates all sequences',
136 fastaq_trim_ends => 'trims set number of bases off each sequence',
137 fastaq_trim_Ns_at_end => 'trims any Ns off each sequence'
138 );
139
140 return(\%descriptions);
141 }
0 version=3
1 https://github.com/sanger-pathogens/fastaq/releases .*/archive/v(\d[\d.-]+)\.(?:tar(?:\.gz|\.bz2)?|tgz)
2