Imported Upstream version 1.0.0
Sascha Steinbiss
7 years ago
36 | 36 | # Add Tests |
37 | 37 | # ---------------------------------------------------------------------------- |
38 | 38 | |
39 | # message ("\n${ColourBold}Setting up unit tests${ColourReset}") | |
40 | # add_subdirectory(tests) | |
39 | message ("\n${ColourBold}Setting up unit tests${ColourReset}") | |
40 | add_subdirectory(tests) |
0 | GNU Affero General Public License | |
1 | ================================= | |
2 | ||
3 | *Version 3, 19 November 2007* | |
4 | *Copyright © 2007 Free Software Foundation, Inc.* <http://fsf.org> | |
5 | ||
6 | Everyone is permitted to copy and distribute verbatim copies | |
7 | of this license document, but changing it is not allowed. | |
8 | ||
9 | Preamble | |
10 | -------- | |
11 | ||
12 | The GNU Affero General Public License is a free, copyleft license for | |
13 | software and other kinds of works, specifically designed to ensure | |
14 | cooperation with the community in the case of network server software. | |
15 | ||
16 | The licenses for most software and other practical works are designed | |
17 | to take away your freedom to share and change the works. By contrast, | |
18 | our General Public Licenses are intended to guarantee your freedom to | |
19 | share and change all versions of a program--to make sure it remains free | |
20 | software for all its users. | |
21 | ||
22 | When we speak of free software, we are referring to freedom, not | |
23 | price. Our General Public Licenses are designed to make sure that you | |
24 | have the freedom to distribute copies of free software (and charge for | |
25 | them if you wish), that you receive source code or can get it if you | |
26 | want it, that you can change the software or use pieces of it in new | |
27 | free programs, and that you know you can do these things. | |
28 | ||
29 | Developers that use our General Public Licenses protect your rights | |
30 | with two steps: **(1)** assert copyright on the software, and **(2)** offer | |
31 | you this License which gives you legal permission to copy, distribute | |
32 | and/or modify the software. | |
33 | ||
34 | A secondary benefit of defending all users' freedom is that | |
35 | improvements made in alternate versions of the program, if they | |
36 | receive widespread use, become available for other developers to | |
37 | incorporate. Many developers of free software are heartened and | |
38 | encouraged by the resulting cooperation. However, in the case of | |
39 | software used on network servers, this result may fail to come about. | |
40 | The GNU General Public License permits making a modified version and | |
41 | letting the public access it on a server without ever releasing its | |
42 | source code to the public. | |
43 | ||
44 | The GNU Affero General Public License is designed specifically to | |
45 | ensure that, in such cases, the modified source code becomes available | |
46 | to the community. It requires the operator of a network server to | |
47 | provide the source code of the modified version running there to the | |
48 | users of that server. Therefore, public use of a modified version, on | |
49 | a publicly accessible server, gives the public access to the source | |
50 | code of the modified version. | |
51 | ||
52 | An older license, called the Affero General Public License and | |
53 | published by Affero, was designed to accomplish similar goals. This is | |
54 | a different license, not a version of the Affero GPL, but Affero has | |
55 | released a new version of the Affero GPL which permits relicensing under | |
56 | this license. | |
57 | ||
58 | The precise terms and conditions for copying, distribution and | |
59 | modification follow. | |
60 | ||
61 | TERMS AND CONDITIONS | |
62 | -------------------- | |
63 | ||
64 | 0. Definitions | |
65 | ~~~~~~~~~~~~~~ | |
66 | ||
67 | "This License" refers to version 3 of the GNU Affero General Public License. | |
68 | ||
69 | "Copyright" also means copyright-like laws that apply to other kinds of | |
70 | works, such as semiconductor masks. | |
71 | ||
72 | "The Program" refers to any copyrightable work licensed under this | |
73 | License. Each licensee is addressed as "you". "Licensees" and | |
74 | "recipients" may be individuals or organizations. | |
75 | ||
76 | To "modify" a work means to copy from or adapt all or part of the work | |
77 | in a fashion requiring copyright permission, other than the making of an | |
78 | exact copy. The resulting work is called a "modified version" of the | |
79 | earlier work or a work "based on" the earlier work. | |
80 | ||
81 | A "covered work" means either the unmodified Program or a work based | |
82 | on the Program. | |
83 | ||
84 | To "propagate" a work means to do anything with it that, without | |
85 | permission, would make you directly or secondarily liable for | |
86 | infringement under applicable copyright law, except executing it on a | |
87 | computer or modifying a private copy. Propagation includes copying, | |
88 | distribution (with or without modification), making available to the | |
89 | public, and in some countries other activities as well. | |
90 | ||
91 | To "convey" a work means any kind of propagation that enables other | |
92 | parties to make or receive copies. Mere interaction with a user through | |
93 | a computer network, with no transfer of a copy, is not conveying. | |
94 | ||
95 | An interactive user interface displays "Appropriate Legal Notices" | |
96 | to the extent that it includes a convenient and prominently visible | |
97 | feature that **(1)** displays an appropriate copyright notice, and **(2)** | |
98 | tells the user that there is no warranty for the work (except to the | |
99 | extent that warranties are provided), that licensees may convey the | |
100 | work under this License, and how to view a copy of this License. If | |
101 | the interface presents a list of user commands or options, such as a | |
102 | menu, a prominent item in the list meets this criterion. | |
103 | ||
104 | 1. Source Code | |
105 | ~~~~~~~~~~~~~~ | |
106 | ||
107 | The "source code" for a work means the preferred form of the work | |
108 | for making modifications to it. "Object code" means any non-source | |
109 | form of a work. | |
110 | ||
111 | A "Standard Interface" means an interface that either is an official | |
112 | standard defined by a recognized standards body, or, in the case of | |
113 | interfaces specified for a particular programming language, one that | |
114 | is widely used among developers working in that language. | |
115 | ||
116 | The "System Libraries" of an executable work include anything, other | |
117 | than the work as a whole, that **(a)** is included in the normal form of | |
118 | packaging a Major Component, but which is not part of that Major | |
119 | Component, and **(b)** serves only to enable use of the work with that | |
120 | Major Component, or to implement a Standard Interface for which an | |
121 | implementation is available to the public in source code form. A | |
122 | "Major Component", in this context, means a major essential component | |
123 | (kernel, window system, and so on) of the specific operating system | |
124 | (if any) on which the executable work runs, or a compiler used to | |
125 | produce the work, or an object code interpreter used to run it. | |
126 | ||
127 | The "Corresponding Source" for a work in object code form means all | |
128 | the source code needed to generate, install, and (for an executable | |
129 | work) run the object code and to modify the work, including scripts to | |
130 | control those activities. However, it does not include the work's | |
131 | System Libraries, or general-purpose tools or generally available free | |
132 | programs which are used unmodified in performing those activities but | |
133 | which are not part of the work. For example, Corresponding Source | |
134 | includes interface definition files associated with source files for | |
135 | the work, and the source code for shared libraries and dynamically | |
136 | linked subprograms that the work is specifically designed to require, | |
137 | such as by intimate data communication or control flow between those | |
138 | subprograms and other parts of the work. | |
139 | ||
140 | The Corresponding Source need not include anything that users | |
141 | can regenerate automatically from other parts of the Corresponding | |
142 | Source. | |
143 | ||
144 | The Corresponding Source for a work in source code form is that | |
145 | same work. | |
146 | ||
147 | 2. Basic Permissions | |
148 | ~~~~~~~~~~~~~~~~~~~~ | |
149 | ||
150 | All rights granted under this License are granted for the term of | |
151 | copyright on the Program, and are irrevocable provided the stated | |
152 | conditions are met. This License explicitly affirms your unlimited | |
153 | permission to run the unmodified Program. The output from running a | |
154 | covered work is covered by this License only if the output, given its | |
155 | content, constitutes a covered work. This License acknowledges your | |
156 | rights of fair use or other equivalent, as provided by copyright law. | |
157 | ||
158 | You may make, run and propagate covered works that you do not | |
159 | convey, without conditions so long as your license otherwise remains | |
160 | in force. You may convey covered works to others for the sole purpose | |
161 | of having them make modifications exclusively for you, or provide you | |
162 | with facilities for running those works, provided that you comply with | |
163 | the terms of this License in conveying all material for which you do | |
164 | not control copyright. Those thus making or running the covered works | |
165 | for you must do so exclusively on your behalf, under your direction | |
166 | and control, on terms that prohibit them from making any copies of | |
167 | your copyrighted material outside their relationship with you. | |
168 | ||
169 | Conveying under any other circumstances is permitted solely under | |
170 | the conditions stated below. Sublicensing is not allowed; section 10 | |
171 | makes it unnecessary. | |
172 | ||
173 | 3. Protecting Users' Legal Rights From Anti-Circumvention Law | |
174 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
175 | ||
176 | No covered work shall be deemed part of an effective technological | |
177 | measure under any applicable law fulfilling obligations under article | |
178 | 11 of the WIPO copyright treaty adopted on 20 December 1996, or | |
179 | similar laws prohibiting or restricting circumvention of such | |
180 | measures. | |
181 | ||
182 | When you convey a covered work, you waive any legal power to forbid | |
183 | circumvention of technological measures to the extent such circumvention | |
184 | is effected by exercising rights under this License with respect to | |
185 | the covered work, and you disclaim any intention to limit operation or | |
186 | modification of the work as a means of enforcing, against the work's | |
187 | users, your or third parties' legal rights to forbid circumvention of | |
188 | technological measures. | |
189 | ||
190 | 4. Conveying Verbatim Copies | |
191 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
192 | ||
193 | You may convey verbatim copies of the Program's source code as you | |
194 | receive it, in any medium, provided that you conspicuously and | |
195 | appropriately publish on each copy an appropriate copyright notice; | |
196 | keep intact all notices stating that this License and any | |
197 | non-permissive terms added in accord with section 7 apply to the code; | |
198 | keep intact all notices of the absence of any warranty; and give all | |
199 | recipients a copy of this License along with the Program. | |
200 | ||
201 | You may charge any price or no price for each copy that you convey, | |
202 | and you may offer support or warranty protection for a fee. | |
203 | ||
204 | ### 5. Conveying Modified Source Versions | |
205 | ||
206 | You may convey a work based on the Program, or the modifications to | |
207 | produce it from the Program, in the form of source code under the | |
208 | terms of section 4, provided that you also meet all of these conditions: | |
209 | ||
210 | * **a)** The work must carry prominent notices stating that you modified | |
211 | it, and giving a relevant date. | |
212 | * **b)** The work must carry prominent notices stating that it is | |
213 | released under this License and any conditions added under section 7. | |
214 | This requirement modifies the requirement in section 4 to | |
215 | "keep intact all notices". | |
216 | * **c)** You must license the entire work, as a whole, under this | |
217 | License to anyone who comes into possession of a copy. This | |
218 | License will therefore apply, along with any applicable section 7 | |
219 | additional terms, to the whole of the work, and all its parts, | |
220 | regardless of how they are packaged. This License gives no | |
221 | permission to license the work in any other way, but it does not | |
222 | invalidate such permission if you have separately received it. | |
223 | * **d)** If the work has interactive user interfaces, each must display | |
224 | Appropriate Legal Notices; however, if the Program has interactive | |
225 | interfaces that do not display Appropriate Legal Notices, your | |
226 | work need not make them do so. | |
227 | ||
228 | A compilation of a covered work with other separate and independent | |
229 | works, which are not by their nature extensions of the covered work, | |
230 | and which are not combined with it such as to form a larger program, | |
231 | in or on a volume of a storage or distribution medium, is called an | |
232 | "aggregate" if the compilation and its resulting copyright are not | |
233 | used to limit the access or legal rights of the compilation's users | |
234 | beyond what the individual works permit. Inclusion of a covered work | |
235 | in an aggregate does not cause this License to apply to the other | |
236 | parts of the aggregate. | |
237 | ||
238 | 6. Conveying Non-Source Forms | |
239 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
240 | ||
241 | You may convey a covered work in object code form under the terms | |
242 | of sections 4 and 5, provided that you also convey the | |
243 | machine-readable Corresponding Source under the terms of this License, | |
244 | in one of these ways: | |
245 | ||
246 | * **a)** Convey the object code in, or embodied in, a physical product | |
247 | (including a physical distribution medium), accompanied by the | |
248 | Corresponding Source fixed on a durable physical medium | |
249 | customarily used for software interchange. | |
250 | * **b)** Convey the object code in, or embodied in, a physical product | |
251 | (including a physical distribution medium), accompanied by a | |
252 | written offer, valid for at least three years and valid for as | |
253 | long as you offer spare parts or customer support for that product | |
254 | model, to give anyone who possesses the object code either **(1)** a | |
255 | copy of the Corresponding Source for all the software in the | |
256 | product that is covered by this License, on a durable physical | |
257 | medium customarily used for software interchange, for a price no | |
258 | more than your reasonable cost of physically performing this | |
259 | conveying of source, or **(2)** access to copy the | |
260 | Corresponding Source from a network server at no charge. | |
261 | * **c)** Convey individual copies of the object code with a copy of the | |
262 | written offer to provide the Corresponding Source. This | |
263 | alternative is allowed only occasionally and noncommercially, and | |
264 | only if you received the object code with such an offer, in accord | |
265 | with subsection 6b. | |
266 | * **d)** Convey the object code by offering access from a designated | |
267 | place (gratis or for a charge), and offer equivalent access to the | |
268 | Corresponding Source in the same way through the same place at no | |
269 | further charge. You need not require recipients to copy the | |
270 | Corresponding Source along with the object code. If the place to | |
271 | copy the object code is a network server, the Corresponding Source | |
272 | may be on a different server (operated by you or a third party) | |
273 | that supports equivalent copying facilities, provided you maintain | |
274 | clear directions next to the object code saying where to find the | |
275 | Corresponding Source. Regardless of what server hosts the | |
276 | Corresponding Source, you remain obligated to ensure that it is | |
277 | available for as long as needed to satisfy these requirements. | |
278 | * **e)** Convey the object code using peer-to-peer transmission, provided | |
279 | you inform other peers where the object code and Corresponding | |
280 | Source of the work are being offered to the general public at no | |
281 | charge under subsection 6d. | |
282 | ||
283 | A separable portion of the object code, whose source code is excluded | |
284 | from the Corresponding Source as a System Library, need not be | |
285 | included in conveying the object code work. | |
286 | ||
287 | A "User Product" is either **(1)** a "consumer product", which means any | |
288 | tangible personal property which is normally used for personal, family, | |
289 | or household purposes, or **(2)** anything designed or sold for incorporation | |
290 | into a dwelling. In determining whether a product is a consumer product, | |
291 | doubtful cases shall be resolved in favor of coverage. For a particular | |
292 | product received by a particular user, "normally used" refers to a | |
293 | typical or common use of that class of product, regardless of the status | |
294 | of the particular user or of the way in which the particular user | |
295 | actually uses, or expects or is expected to use, the product. A product | |
296 | is a consumer product regardless of whether the product has substantial | |
297 | commercial, industrial or non-consumer uses, unless such uses represent | |
298 | the only significant mode of use of the product. | |
299 | ||
300 | "Installation Information" for a User Product means any methods, | |
301 | procedures, authorization keys, or other information required to install | |
302 | and execute modified versions of a covered work in that User Product from | |
303 | a modified version of its Corresponding Source. The information must | |
304 | suffice to ensure that the continued functioning of the modified object | |
305 | code is in no case prevented or interfered with solely because | |
306 | modification has been made. | |
307 | ||
308 | If you convey an object code work under this section in, or with, or | |
309 | specifically for use in, a User Product, and the conveying occurs as | |
310 | part of a transaction in which the right of possession and use of the | |
311 | User Product is transferred to the recipient in perpetuity or for a | |
312 | fixed term (regardless of how the transaction is characterized), the | |
313 | Corresponding Source conveyed under this section must be accompanied | |
314 | by the Installation Information. But this requirement does not apply | |
315 | if neither you nor any third party retains the ability to install | |
316 | modified object code on the User Product (for example, the work has | |
317 | been installed in ROM). | |
318 | ||
319 | The requirement to provide Installation Information does not include a | |
320 | requirement to continue to provide support service, warranty, or updates | |
321 | for a work that has been modified or installed by the recipient, or for | |
322 | the User Product in which it has been modified or installed. Access to a | |
323 | network may be denied when the modification itself materially and | |
324 | adversely affects the operation of the network or violates the rules and | |
325 | protocols for communication across the network. | |
326 | ||
327 | Corresponding Source conveyed, and Installation Information provided, | |
328 | in accord with this section must be in a format that is publicly | |
329 | documented (and with an implementation available to the public in | |
330 | source code form), and must require no special password or key for | |
331 | unpacking, reading or copying. | |
332 | ||
333 | 7. Additional Terms | |
334 | ~~~~~~~~~~~~~~~~~~~ | |
335 | ||
336 | "Additional permissions" are terms that supplement the terms of this | |
337 | License by making exceptions from one or more of its conditions. | |
338 | Additional permissions that are applicable to the entire Program shall | |
339 | be treated as though they were included in this License, to the extent | |
340 | that they are valid under applicable law. If additional permissions | |
341 | apply only to part of the Program, that part may be used separately | |
342 | under those permissions, but the entire Program remains governed by | |
343 | this License without regard to the additional permissions. | |
344 | ||
345 | When you convey a copy of a covered work, you may at your option | |
346 | remove any additional permissions from that copy, or from any part of | |
347 | it. (Additional permissions may be written to require their own | |
348 | removal in certain cases when you modify the work.) You may place | |
349 | additional permissions on material, added by you to a covered work, | |
350 | for which you have or can give appropriate copyright permission. | |
351 | ||
352 | Notwithstanding any other provision of this License, for material you | |
353 | add to a covered work, you may (if authorized by the copyright holders of | |
354 | that material) supplement the terms of this License with terms: | |
355 | ||
356 | * **a)** Disclaiming warranty or limiting liability differently from the | |
357 | terms of sections 15 and 16 of this License; or | |
358 | * **b)** Requiring preservation of specified reasonable legal notices or | |
359 | author attributions in that material or in the Appropriate Legal | |
360 | Notices displayed by works containing it; or | |
361 | * **c)** Prohibiting misrepresentation of the origin of that material, or | |
362 | requiring that modified versions of such material be marked in | |
363 | reasonable ways as different from the original version; or | |
364 | * **d)** Limiting the use for publicity purposes of names of licensors or | |
365 | authors of the material; or | |
366 | * **e)** Declining to grant rights under trademark law for use of some | |
367 | trade names, trademarks, or service marks; or | |
368 | * **f)** Requiring indemnification of licensors and authors of that | |
369 | material by anyone who conveys the material (or modified versions of | |
370 | it) with contractual assumptions of liability to the recipient, for | |
371 | any liability that these contractual assumptions directly impose on | |
372 | those licensors and authors. | |
373 | ||
374 | All other non-permissive additional terms are considered "further | |
375 | restrictions" within the meaning of section 10. If the Program as you | |
376 | received it, or any part of it, contains a notice stating that it is | |
377 | governed by this License along with a term that is a further | |
378 | restriction, you may remove that term. If a license document contains | |
379 | a further restriction but permits relicensing or conveying under this | |
380 | License, you may add to a covered work material governed by the terms | |
381 | of that license document, provided that the further restriction does | |
382 | not survive such relicensing or conveying. | |
383 | ||
384 | If you add terms to a covered work in accord with this section, you | |
385 | must place, in the relevant source files, a statement of the | |
386 | additional terms that apply to those files, or a notice indicating | |
387 | where to find the applicable terms. | |
388 | ||
389 | Additional terms, permissive or non-permissive, may be stated in the | |
390 | form of a separately written license, or stated as exceptions; | |
391 | the above requirements apply either way. | |
392 | ||
393 | 8. Termination | |
394 | ~~~~~~~~~~~~~~ | |
395 | ||
396 | You may not propagate or modify a covered work except as expressly | |
397 | provided under this License. Any attempt otherwise to propagate or | |
398 | modify it is void, and will automatically terminate your rights under | |
399 | this License (including any patent licenses granted under the third | |
400 | paragraph of section 11). | |
401 | ||
402 | However, if you cease all violation of this License, then your | |
403 | license from a particular copyright holder is reinstated **(a)** | |
404 | provisionally, unless and until the copyright holder explicitly and | |
405 | finally terminates your license, and **(b)** permanently, if the copyright | |
406 | holder fails to notify you of the violation by some reasonable means | |
407 | prior to 60 days after the cessation. | |
408 | ||
409 | Moreover, your license from a particular copyright holder is | |
410 | reinstated permanently if the copyright holder notifies you of the | |
411 | violation by some reasonable means, this is the first time you have | |
412 | received notice of violation of this License (for any work) from that | |
413 | copyright holder, and you cure the violation prior to 30 days after | |
414 | your receipt of the notice. | |
415 | ||
416 | Termination of your rights under this section does not terminate the | |
417 | licenses of parties who have received copies or rights from you under | |
418 | this License. If your rights have been terminated and not permanently | |
419 | reinstated, you do not qualify to receive new licenses for the same | |
420 | material under section 10. | |
421 | ||
422 | 9. Acceptance Not Required for Having Copies | |
423 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
424 | ||
425 | You are not required to accept this License in order to receive or | |
426 | run a copy of the Program. Ancillary propagation of a covered work | |
427 | occurring solely as a consequence of using peer-to-peer transmission | |
428 | to receive a copy likewise does not require acceptance. However, | |
429 | nothing other than this License grants you permission to propagate or | |
430 | modify any covered work. These actions infringe copyright if you do | |
431 | not accept this License. Therefore, by modifying or propagating a | |
432 | covered work, you indicate your acceptance of this License to do so. | |
433 | ||
434 | 10. Automatic Licensing of Downstream Recipients | |
435 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
436 | ||
437 | Each time you convey a covered work, the recipient automatically | |
438 | receives a license from the original licensors, to run, modify and | |
439 | propagate that work, subject to this License. You are not responsible | |
440 | for enforcing compliance by third parties with this License. | |
441 | ||
442 | An "entity transaction" is a transaction transferring control of an | |
443 | organization, or substantially all assets of one, or subdividing an | |
444 | organization, or merging organizations. If propagation of a covered | |
445 | work results from an entity transaction, each party to that | |
446 | transaction who receives a copy of the work also receives whatever | |
447 | licenses to the work the party's predecessor in interest had or could | |
448 | give under the previous paragraph, plus a right to possession of the | |
449 | Corresponding Source of the work from the predecessor in interest, if | |
450 | the predecessor has it or can get it with reasonable efforts. | |
451 | ||
452 | You may not impose any further restrictions on the exercise of the | |
453 | rights granted or affirmed under this License. For example, you may | |
454 | not impose a license fee, royalty, or other charge for exercise of | |
455 | rights granted under this License, and you may not initiate litigation | |
456 | (including a cross-claim or counterclaim in a lawsuit) alleging that | |
457 | any patent claim is infringed by making, using, selling, offering for | |
458 | sale, or importing the Program or any portion of it. | |
459 | ||
460 | 11. Patents | |
461 | ~~~~~~~~~~~ | |
462 | ||
463 | A "contributor" is a copyright holder who authorizes use under this | |
464 | License of the Program or a work on which the Program is based. The | |
465 | work thus licensed is called the contributor's "contributor version". | |
466 | ||
467 | A contributor's "essential patent claims" are all patent claims | |
468 | owned or controlled by the contributor, whether already acquired or | |
469 | hereafter acquired, that would be infringed by some manner, permitted | |
470 | by this License, of making, using, or selling its contributor version, | |
471 | but do not include claims that would be infringed only as a | |
472 | consequence of further modification of the contributor version. For | |
473 | purposes of this definition, "control" includes the right to grant | |
474 | patent sublicenses in a manner consistent with the requirements of | |
475 | this License. | |
476 | ||
477 | Each contributor grants you a non-exclusive, worldwide, royalty-free | |
478 | patent license under the contributor's essential patent claims, to | |
479 | make, use, sell, offer for sale, import and otherwise run, modify and | |
480 | propagate the contents of its contributor version. | |
481 | ||
482 | In the following three paragraphs, a "patent license" is any express | |
483 | agreement or commitment, however denominated, not to enforce a patent | |
484 | (such as an express permission to practice a patent or covenant not to | |
485 | sue for patent infringement). To "grant" such a patent license to a | |
486 | party means to make such an agreement or commitment not to enforce a | |
487 | patent against the party. | |
488 | ||
489 | If you convey a covered work, knowingly relying on a patent license, | |
490 | and the Corresponding Source of the work is not available for anyone | |
491 | to copy, free of charge and under the terms of this License, through a | |
492 | publicly available network server or other readily accessible means, | |
493 | then you must either **(1)** cause the Corresponding Source to be so | |
494 | available, or **(2)** arrange to deprive yourself of the benefit of the | |
495 | patent license for this particular work, or **(3)** arrange, in a manner | |
496 | consistent with the requirements of this License, to extend the patent | |
497 | license to downstream recipients. "Knowingly relying" means you have | |
498 | actual knowledge that, but for the patent license, your conveying the | |
499 | covered work in a country, or your recipient's use of the covered work | |
500 | in a country, would infringe one or more identifiable patents in that | |
501 | country that you have reason to believe are valid. | |
502 | ||
503 | If, pursuant to or in connection with a single transaction or | |
504 | arrangement, you convey, or propagate by procuring conveyance of, a | |
505 | covered work, and grant a patent license to some of the parties | |
506 | receiving the covered work authorizing them to use, propagate, modify | |
507 | or convey a specific copy of the covered work, then the patent license | |
508 | you grant is automatically extended to all recipients of the covered | |
509 | work and works based on it. | |
510 | ||
511 | A patent license is "discriminatory" if it does not include within | |
512 | the scope of its coverage, prohibits the exercise of, or is | |
513 | conditioned on the non-exercise of one or more of the rights that are | |
514 | specifically granted under this License. You may not convey a covered | |
515 | work if you are a party to an arrangement with a third party that is | |
516 | in the business of distributing software, under which you make payment | |
517 | to the third party based on the extent of your activity of conveying | |
518 | the work, and under which the third party grants, to any of the | |
519 | parties who would receive the covered work from you, a discriminatory | |
520 | patent license **(a)** in connection with copies of the covered work | |
521 | conveyed by you (or copies made from those copies), or **(b)** primarily | |
522 | for and in connection with specific products or compilations that | |
523 | contain the covered work, unless you entered into that arrangement, | |
524 | or that patent license was granted, prior to 28 March 2007. | |
525 | ||
526 | Nothing in this License shall be construed as excluding or limiting | |
527 | any implied license or other defenses to infringement that may | |
528 | otherwise be available to you under applicable patent law. | |
529 | ||
530 | 12. No Surrender of Others' Freedom | |
531 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
532 | ||
533 | If conditions are imposed on you (whether by court order, agreement or | |
534 | otherwise) that contradict the conditions of this License, they do not | |
535 | excuse you from the conditions of this License. If you cannot convey a | |
536 | covered work so as to satisfy simultaneously your obligations under this | |
537 | License and any other pertinent obligations, then as a consequence you may | |
538 | not convey it at all. For example, if you agree to terms that obligate you | |
539 | to collect a royalty for further conveying from those to whom you convey | |
540 | the Program, the only way you could satisfy both those terms and this | |
541 | License would be to refrain entirely from conveying the Program. | |
542 | ||
543 | 13. Remote Network Interaction; Use with the GNU General Public License | |
544 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
545 | ||
546 | Notwithstanding any other provision of this License, if you modify the | |
547 | Program, your modified version must prominently offer all users | |
548 | interacting with it remotely through a computer network (if your version | |
549 | supports such interaction) an opportunity to receive the Corresponding | |
550 | Source of your version by providing access to the Corresponding Source | |
551 | from a network server at no charge, through some standard or customary | |
552 | means of facilitating copying of software. This Corresponding Source | |
553 | shall include the Corresponding Source for any work covered by version 3 | |
554 | of the GNU General Public License that is incorporated pursuant to the | |
555 | following paragraph. | |
556 | ||
557 | Notwithstanding any other provision of this License, you have | |
558 | permission to link or combine any covered work with a work licensed | |
559 | under version 3 of the GNU General Public License into a single | |
560 | combined work, and to convey the resulting work. The terms of this | |
561 | License will continue to apply to the part which is the covered work, | |
562 | but the work with which it is combined will remain governed by version | |
563 | 3 of the GNU General Public License. | |
564 | ||
565 | 14. Revised Versions of this License | |
566 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
567 | ||
568 | The Free Software Foundation may publish revised and/or new versions of | |
569 | the GNU Affero General Public License from time to time. Such new versions | |
570 | will be similar in spirit to the present version, but may differ in detail to | |
571 | address new problems or concerns. | |
572 | ||
573 | Each version is given a distinguishing version number. If the | |
574 | Program specifies that a certain numbered version of the GNU Affero General | |
575 | Public License "or any later version" applies to it, you have the | |
576 | option of following the terms and conditions either of that numbered | |
577 | version or of any later version published by the Free Software | |
578 | Foundation. If the Program does not specify a version number of the | |
579 | GNU Affero General Public License, you may choose any version ever published | |
580 | by the Free Software Foundation. | |
581 | ||
582 | If the Program specifies that a proxy can decide which future | |
583 | versions of the GNU Affero General Public License can be used, that proxy's | |
584 | public statement of acceptance of a version permanently authorizes you | |
585 | to choose that version for the Program. | |
586 | ||
587 | Later license versions may give you additional or different | |
588 | permissions. However, no additional obligations are imposed on any | |
589 | author or copyright holder as a result of your choosing to follow a | |
590 | later version. | |
591 | ||
592 | 15. Disclaimer of Warranty | |
593 | ~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
594 | ||
595 | THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY | |
596 | APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT | |
597 | HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY | |
598 | OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, | |
599 | THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR | |
600 | PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM | |
601 | IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF | |
602 | ALL NECESSARY SERVICING, REPAIR OR CORRECTION. | |
603 | ||
604 | 16. Limitation of Liability | |
605 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
606 | ||
607 | IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING | |
608 | WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS | |
609 | THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY | |
610 | GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE | |
611 | USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF | |
612 | DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD | |
613 | PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), | |
614 | EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF | |
615 | SUCH DAMAGES. | |
616 | ||
617 | 17. Interpretation of Sections 15 and 16 | |
618 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
619 | ||
620 | If the disclaimer of warranty and limitation of liability provided | |
621 | above cannot be given local legal effect according to their terms, | |
622 | reviewing courts shall apply local law that most closely approximates | |
623 | an absolute waiver of all civil liability in connection with the | |
624 | Program, unless a warranty or assumption of liability accompanies a | |
625 | copy of the Program in return for a fee. | |
626 | ||
627 | *END OF TERMS AND CONDITIONS* | |
628 | ||
629 | How to Apply These Terms to Your New Programs | |
630 | --------------------------------------------- | |
631 | ||
632 | If you develop a new program, and you want it to be of the greatest | |
633 | possible use to the public, the best way to achieve this is to make it | |
634 | free software which everyone can redistribute and change under these terms. | |
635 | ||
636 | To do so, attach the following notices to the program. It is safest | |
637 | to attach them to the start of each source file to most effectively | |
638 | state the exclusion of warranty; and each file should have at least | |
639 | the "copyright" line and a pointer to where the full notice is found. | |
640 | ||
641 | | <one line to give the program's name and a brief idea of what it does.> | |
642 | | Copyright (C) <year> <name of author> | |
643 | | | |
644 | | This program is free software: you can redistribute it and/or modify | |
645 | | it under the terms of the GNU Affero General Public License as published by | |
646 | | the Free Software Foundation, either version 3 of the License, or | |
647 | | (at your option) any later version. | |
648 | | | |
649 | | This program is distributed in the hope that it will be useful, | |
650 | | but WITHOUT ANY WARRANTY; without even the implied warranty of | |
651 | | MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the | |
652 | | GNU Affero General Public License for more details. | |
653 | | | |
654 | | You should have received a copy of the GNU Affero General Public License | |
655 | | along with this program. If not, see <http://www.gnu.org/licenses/>. | |
656 | ||
657 | Also add information on how to contact you by electronic and paper mail. | |
658 | ||
659 | If your software can interact with users remotely through a computer | |
660 | network, you should also make sure that it provides a way for users to | |
661 | get its source. For example, if your program is a web application, its | |
662 | interface could display a "Source" link that leads users to an archive | |
663 | of the code. There are many ways you could offer source, and different | |
664 | solutions will be better for different programs; see section 13 for the | |
665 | specific requirements. | |
666 | ||
667 | You should also get your employer (if you work as a programmer) or school, | |
668 | if any, to sign a "copyright disclaimer" for the program, if necessary. | |
669 | For more information on this, and how to apply and follow the GNU AGPL, see | |
670 | <http://www.gnu.org/licenses/>. |
0 | GNU GENERAL PUBLIC LICENSE | |
1 | ========================== | |
2 | ||
3 | Version 3, 29 June 2007 | |
4 | ||
5 | Copyright (C) 2007 `Free Software Foundation, Inc. <http://fsf.org/>`_ | |
6 | ||
7 | Everyone is permitted to copy and distribute verbatim copies of this | |
8 | license document, but changing it is not allowed. | |
9 | ||
10 | Preamble | |
11 | -------- | |
12 | ||
13 | The GNU General Public License is a free, copyleft license for software | |
14 | and other kinds of works. | |
15 | ||
16 | The licenses for most software and other practical works are designed to | |
17 | take away your freedom to share and change the works. By contrast, the | |
18 | GNU General Public License is intended to guarantee your freedom to | |
19 | share and change all versions of a program--to make sure it remains free | |
20 | software for all its users. We, the Free Software Foundation, use the | |
21 | GNU General Public License for most of our software; it applies also to | |
22 | any other work released this way by its authors. You can apply it to | |
23 | your programs, too. | |
24 | ||
25 | When we speak of free software, we are referring to freedom, not price. | |
26 | Our General Public Licenses are designed to make sure that you have the | |
27 | freedom to distribute copies of free software (and charge for them if | |
28 | you wish), that you receive source code or can get it if you want it, | |
29 | that you can change the software or use pieces of it in new free | |
30 | programs, and that you know you can do these things. | |
31 | ||
32 | To protect your rights, we need to prevent others from denying you these | |
33 | rights or asking you to surrender the rights. Therefore, you have | |
34 | certain responsibilities if you distribute copies of the software, or if | |
35 | you modify it: responsibilities to respect the freedom of others. | |
36 | ||
37 | For example, if you distribute copies of such a program, whether gratis | |
38 | or for a fee, you must pass on to the recipients the same freedoms that | |
39 | you received. You must make sure that they, too, receive or can get the | |
40 | source code. And you must show them these terms so they know their | |
41 | rights. | |
42 | ||
43 | Developers that use the GNU GPL protect your rights with two steps: | |
44 | ||
45 | 1. assert copyright on the software, and | |
46 | 2. offer you this License giving you legal permission to copy, | |
47 | distribute and/or modify it. | |
48 | ||
49 | For the developers' and authors' protection, the GPL clearly explains | |
50 | that there is no warranty for this free software. For both users' and | |
51 | authors' sake, the GPL requires that modified versions be marked as | |
52 | changed, so that their problems will not be attributed erroneously to | |
53 | authors of previous versions. | |
54 | ||
55 | Some devices are designed to deny users access to install or run | |
56 | modified versions of the software inside them, although the manufacturer | |
57 | can do so. This is fundamentally incompatible with the aim of protecting | |
58 | users' freedom to change the software. The systematic pattern of such | |
59 | abuse occurs in the area of products for individuals to use, which is | |
60 | precisely where it is most unacceptable. Therefore, we have designed | |
61 | this version of the GPL to prohibit the practice for those products. If | |
62 | such problems arise substantially in other domains, we stand ready to | |
63 | extend this provision to those domains in future versions of the GPL, as | |
64 | needed to protect the freedom of users. | |
65 | ||
66 | Finally, every program is threatened constantly by software patents. | |
67 | States should not allow patents to restrict development and use of | |
68 | software on general-purpose computers, but in those that do, we wish to | |
69 | avoid the special danger that patents applied to a free program could | |
70 | make it effectively proprietary. To prevent this, the GPL assures that | |
71 | patents cannot be used to render the program non-free. | |
72 | ||
73 | The precise terms and conditions for copying, distribution and | |
74 | modification follow. | |
75 | ||
76 | TERMS AND CONDITIONS | |
77 | -------------------- | |
78 | ||
79 | 0. Definitions. | |
80 | ~~~~~~~~~~~~~~~ | |
81 | ||
82 | *This License* refers to version 3 of the GNU General Public License. | |
83 | ||
84 | *Copyright* also means copyright-like laws that apply to other kinds of | |
85 | works, such as semiconductor masks. | |
86 | ||
87 | *The Program* refers to any copyrightable work licensed under this | |
88 | License. Each licensee is addressed as *you*. *Licensees* and | |
89 | *recipients* may be individuals or organizations. | |
90 | ||
91 | To *modify* a work means to copy from or adapt all or part of the work | |
92 | in a fashion requiring copyright permission, other than the making of an | |
93 | exact copy. The resulting work is called a *modified version* of the | |
94 | earlier work or a work *based on* the earlier work. | |
95 | ||
96 | A *covered work* means either the unmodified Program or a work based on | |
97 | the Program. | |
98 | ||
99 | To *propagate* a work means to do anything with it that, without | |
100 | permission, would make you directly or secondarily liable for | |
101 | infringement under applicable copyright law, except executing it on a | |
102 | computer or modifying a private copy. Propagation includes copying, | |
103 | distribution (with or without modification), making available to the | |
104 | public, and in some countries other activities as well. | |
105 | ||
106 | To *convey* a work means any kind of propagation that enables other | |
107 | parties to make or receive copies. Mere interaction with a user through | |
108 | a computer network, with no transfer of a copy, is not conveying. | |
109 | ||
110 | An interactive user interface displays *Appropriate Legal Notices* to | |
111 | the extent that it includes a convenient and prominently visible feature | |
112 | that | |
113 | ||
114 | 1. displays an appropriate copyright notice, and | |
115 | 2. tells the user that there is no warranty for the work (except to the | |
116 | extent that warranties are provided), that licensees may convey the | |
117 | work under this License, and how to view a copy of this License. | |
118 | ||
119 | If the interface presents a list of user commands or options, such as a | |
120 | menu, a prominent item in the list meets this criterion. | |
121 | ||
122 | 1. Source Code. | |
123 | ~~~~~~~~~~~~~~~ | |
124 | ||
125 | The *source code* for a work means the preferred form of the work for | |
126 | making modifications to it. *Object code* means any non-source form of a | |
127 | work. | |
128 | ||
129 | A *Standard Interface* means an interface that either is an official | |
130 | standard defined by a recognized standards body, or, in the case of | |
131 | interfaces specified for a particular programming language, one that is | |
132 | widely used among developers working in that language. | |
133 | ||
134 | The *System Libraries* of an executable work include anything, other | |
135 | than the work as a whole, that (a) is included in the normal form of | |
136 | packaging a Major Component, but which is not part of that Major | |
137 | Component, and (b) serves only to enable use of the work with that Major | |
138 | Component, or to implement a Standard Interface for which an | |
139 | implementation is available to the public in source code form. A *Major | |
140 | Component*, in this context, means a major essential component (kernel, | |
141 | window system, and so on) of the specific operating system (if any) on | |
142 | which the executable work runs, or a compiler used to produce the work, | |
143 | or an object code interpreter used to run it. | |
144 | ||
145 | The *Corresponding Source* for a work in object code form means all the | |
146 | source code needed to generate, install, and (for an executable work) | |
147 | run the object code and to modify the work, including scripts to control | |
148 | those activities. However, it does not include the work's System | |
149 | Libraries, or general-purpose tools or generally available free programs | |
150 | which are used unmodified in performing those activities but which are | |
151 | not part of the work. For example, Corresponding Source includes | |
152 | interface definition files associated with source files for the work, | |
153 | and the source code for shared libraries and dynamically linked | |
154 | subprograms that the work is specifically designed to require, such as | |
155 | by intimate data communication or control flow between those subprograms | |
156 | and other parts of the work. | |
157 | ||
158 | The Corresponding Source need not include anything that users can | |
159 | regenerate automatically from other parts of the Corresponding Source. | |
160 | ||
161 | The Corresponding Source for a work in source code form is that same | |
162 | work. | |
163 | ||
164 | 2. Basic Permissions. | |
165 | ~~~~~~~~~~~~~~~~~~~~~ | |
166 | ||
167 | All rights granted under this License are granted for the term of | |
168 | copyright on the Program, and are irrevocable provided the stated | |
169 | conditions are met. This License explicitly affirms your unlimited | |
170 | permission to run the unmodified Program. The output from running a | |
171 | covered work is covered by this License only if the output, given its | |
172 | content, constitutes a covered work. This License acknowledges your | |
173 | rights of fair use or other equivalent, as provided by copyright law. | |
174 | ||
175 | You may make, run and propagate covered works that you do not convey, | |
176 | without conditions so long as your license otherwise remains in force. | |
177 | You may convey covered works to others for the sole purpose of having | |
178 | them make modifications exclusively for you, or provide you with | |
179 | facilities for running those works, provided that you comply with the | |
180 | terms of this License in conveying all material for which you do not | |
181 | control copyright. Those thus making or running the covered works for | |
182 | you must do so exclusively on your behalf, under your direction and | |
183 | control, on terms that prohibit them from making any copies of your | |
184 | copyrighted material outside their relationship with you. | |
185 | ||
186 | Conveying under any other circumstances is permitted solely under the | |
187 | conditions stated below. Sublicensing is not allowed; section 10 makes | |
188 | it unnecessary. | |
189 | ||
190 | 3. Protecting Users' Legal Rights From Anti-Circumvention Law. | |
191 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
192 | ||
193 | No covered work shall be deemed part of an effective technological | |
194 | measure under any applicable law fulfilling obligations under article 11 | |
195 | of the WIPO copyright treaty adopted on 20 December 1996, or similar | |
196 | laws prohibiting or restricting circumvention of such measures. | |
197 | ||
198 | When you convey a covered work, you waive any legal power to forbid | |
199 | circumvention of technological measures to the extent such circumvention | |
200 | is effected by exercising rights under this License with respect to the | |
201 | covered work, and you disclaim any intention to limit operation or | |
202 | modification of the work as a means of enforcing, against the work's | |
203 | users, your or third parties' legal rights to forbid circumvention of | |
204 | technological measures. | |
205 | ||
206 | 4. Conveying Verbatim Copies. | |
207 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
208 | ||
209 | You may convey verbatim copies of the Program's source code as you | |
210 | receive it, in any medium, provided that you conspicuously and | |
211 | appropriately publish on each copy an appropriate copyright notice; keep | |
212 | intact all notices stating that this License and any non-permissive | |
213 | terms added in accord with section 7 apply to the code; keep intact all | |
214 | notices of the absence of any warranty; and give all recipients a copy | |
215 | of this License along with the Program. | |
216 | ||
217 | You may charge any price or no price for each copy that you convey, and | |
218 | you may offer support or warranty protection for a fee. | |
219 | ||
220 | 5. Conveying Modified Source Versions. | |
221 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
222 | ||
223 | You may convey a work based on the Program, or the modifications to | |
224 | produce it from the Program, in the form of source code under the terms | |
225 | of section 4, provided that you also meet all of these conditions: | |
226 | ||
227 | - a) The work must carry prominent notices stating that you modified it, | |
228 |   and giving a relevant date. | |
229 | - b) The work must carry prominent notices stating that it is released under | |
230 |   this License and any conditions added under section 7. This requirement | |
231 |   modifies the requirement in section 4 to *keep intact all notices*. | |
232 | - c) You must license the entire work, as a whole, under this License to | |
233 |   anyone who comes into possession of a copy. This License will therefore | |
234 |   apply, along with any applicable section 7 additional terms, to the whole | |
235 |   of the work, and all its parts, regardless of how they are packaged. This | |
236 |   License gives no permission to license the work in any other way, but it | |
237 |   does not invalidate such permission if you have separately received it. | |
238 | - d) If the work has interactive user interfaces, each must display Appropriate | |
239 |   Legal Notices; however, if the Program has interactive interfaces that do not | |
240 |   display Appropriate Legal Notices, your work need not make them do so. | |
241 | ||
242 | A compilation of a covered work with other separate and independent | |
243 | works, which are not by their nature extensions of the covered work, and | |
244 | which are not combined with it such as to form a larger program, in or | |
245 | on a volume of a storage or distribution medium, is called an | |
246 | *aggregate* if the compilation and its resulting copyright are not used | |
247 | to limit the access or legal rights of the compilation's users beyond | |
248 | what the individual works permit. Inclusion of a covered work in an | |
249 | aggregate does not cause this License to apply to the other parts of the | |
250 | aggregate. | |
251 | ||
252 | 6. Conveying Non-Source Forms. | |
253 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
254 | ||
255 | You may convey a covered work in object code form under the terms of | |
256 | sections 4 and 5, provided that you also convey the machine-readable | |
257 | Corresponding Source under the terms of this License, in one of these | |
258 | ways: | |
259 | ||
260 | - a) Convey the object code in, or embodied in, a physical product | |
261 |   (including a physical distribution medium), accompanied by the | |
262 |   Corresponding Source fixed on a durable physical medium customarily used | |
263 |   for software interchange. | |
264 | - b) Convey the object code in, or embodied in, a physical product | |
265 |   (including a physical distribution medium), accompanied by a written | |
266 |   offer, valid for at least three years and valid for as long as you offer | |
267 |   spare parts or customer support for that product model, to give anyone | |
268 |   who possesses the object code either | |
269 |   1. a copy of the Corresponding Source for all the software in the product | |
270 |      that is covered by this License, on a durable physical medium | |
271 |      customarily used for software interchange, for a price no more than | |
272 |      your reasonable cost of physically performing this conveying of | |
273 |      source, or | |
274 |   2. access to copy the Corresponding Source from a network server at no | |
275 |      charge. | |
276 | - c) Convey individual copies of the object code with a copy of the written | |
277 |   offer to provide the Corresponding Source. This alternative is allowed | |
278 |   only occasionally and noncommercially, and only if you received the | |
279 |   object code with such an offer, in accord with subsection 6b. | |
280 | - d) Convey the object code by offering access from a designated place | |
281 |   (gratis or for a charge), and offer equivalent access to the | |
282 |   Corresponding Source in the same way through the same place at no further | |
283 |   charge. You need not require recipients to copy the Corresponding Source | |
284 |   along with the object code. If the place to copy the object code is a | |
285 |   network server, the Corresponding Source may be on a different server | |
286 |   (operated by you or a third party) that supports equivalent copying | |
287 |   facilities, provided you maintain clear directions next to the object | |
288 |   code saying where to find the Corresponding Source. Regardless of what | |
289 |   server hosts the Corresponding Source, you remain obligated to ensure | |
290 |   that it is available for as long as needed to satisfy these requirements. | |
291 | - e) Convey the object code using peer-to-peer transmission, provided you inform | |
292 |   other peers where the object code and Corresponding Source of the work are | |
293 |   being offered to the general public at no charge under subsection 6d. | |
294 | ||
295 | A separable portion of the object code, whose source code is excluded | |
296 | from the Corresponding Source as a System Library, need not be included | |
297 | in conveying the object code work. | |
298 | ||
299 | A *User Product* is either | |
300 | ||
301 | 1. a *consumer product*, which means any tangible personal property | |
302 | which is normally used for personal, family, or household purposes, | |
303 | or | |
304 | 2. anything designed or sold for incorporation into a dwelling. | |
305 | ||
306 | In determining whether a product is a consumer product, doubtful cases | |
307 | shall be resolved in favor of coverage. For a particular product | |
308 | received by a particular user, *normally used* refers to a typical or | |
309 | common use of that class of product, regardless of the status of the | |
310 | particular user or of the way in which the particular user actually | |
311 | uses, or expects or is expected to use, the product. A product is a | |
312 | consumer product regardless of whether the product has substantial | |
313 | commercial, industrial or non-consumer uses, unless such uses represent | |
314 | the only significant mode of use of the product. | |
315 | ||
316 | *Installation Information* for a User Product means any methods, | |
317 | procedures, authorization keys, or other information required to install | |
318 | and execute modified versions of a covered work in that User Product | |
319 | from a modified version of its Corresponding Source. The information | |
320 | must suffice to ensure that the continued functioning of the modified | |
321 | object code is in no case prevented or interfered with solely because | |
322 | modification has been made. | |
323 | ||
324 | If you convey an object code work under this section in, or with, or | |
325 | specifically for use in, a User Product, and the conveying occurs as | |
326 | part of a transaction in which the right of possession and use of the | |
327 | User Product is transferred to the recipient in perpetuity or for a | |
328 | fixed term (regardless of how the transaction is characterized), the | |
329 | Corresponding Source conveyed under this section must be accompanied by | |
330 | the Installation Information. But this requirement does not apply if | |
331 | neither you nor any third party retains the ability to install modified | |
332 | object code on the User Product (for example, the work has been | |
333 | installed in ROM). | |
334 | ||
335 | The requirement to provide Installation Information does not include a | |
336 | requirement to continue to provide support service, warranty, or updates | |
337 | for a work that has been modified or installed by the recipient, or for | |
338 | the User Product in which it has been modified or installed. Access to a | |
339 | network may be denied when the modification itself materially and | |
340 | adversely affects the operation of the network or violates the rules and | |
341 | protocols for communication across the network. | |
342 | ||
343 | Corresponding Source conveyed, and Installation Information provided, in | |
344 | accord with this section must be in a format that is publicly documented | |
345 | (and with an implementation available to the public in source code | |
346 | form), and must require no special password or key for unpacking, | |
347 | reading or copying. | |
348 | ||
349 | 7. Additional Terms. | |
350 | ~~~~~~~~~~~~~~~~~~~~ | |
351 | ||
352 | *Additional permissions* are terms that supplement the terms of this | |
353 | License by making exceptions from one or more of its conditions. | |
354 | Additional permissions that are applicable to the entire Program shall | |
355 | be treated as though they were included in this License, to the extent | |
356 | that they are valid under applicable law. If additional permissions | |
357 | apply only to part of the Program, that part may be used separately | |
358 | under those permissions, but the entire Program remains governed by this | |
359 | License without regard to the additional permissions. | |
360 | ||
361 | When you convey a copy of a covered work, you may at your option remove | |
362 | any additional permissions from that copy, or from any part of it. | |
363 | (Additional permissions may be written to require their own removal in | |
364 | certain cases when you modify the work.) You may place additional | |
365 | permissions on material, added by you to a covered work, for which you | |
366 | have or can give appropriate copyright permission. | |
367 | ||
368 | Notwithstanding any other provision of this License, for material you | |
369 | add to a covered work, you may (if authorized by the copyright holders | |
370 | of that material) supplement the terms of this License with terms: | |
371 | ||
372 | a. Disclaiming warranty or limiting liability differently from the terms | |
373 | of sections 15 and 16 of this License; or | |
374 | b. Requiring preservation of specified reasonable legal notices or | |
375 | author attributions in that material or in the Appropriate Legal | |
376 | Notices displayed by works containing it; or | |
377 | c. Prohibiting misrepresentation of the origin of that material, or | |
378 | requiring that modified versions of such material be marked in | |
379 | reasonable ways as different from the original version; or | |
380 | d. Limiting the use for publicity purposes of names of licensors or | |
381 | authors of the material; or | |
382 | e. Declining to grant rights under trademark law for use of some trade | |
383 | names, trademarks, or service marks; or | |
384 | f. Requiring indemnification of licensors and authors of that material | |
385 | by anyone who conveys the material (or modified versions of it) with | |
386 | contractual assumptions of liability to the recipient, for any | |
387 | liability that these contractual assumptions directly impose on those | |
388 | licensors and authors. | |
389 | ||
390 | All other non-permissive additional terms are considered *further | |
391 | restrictions* within the meaning of section 10. If the Program as you | |
392 | received it, or any part of it, contains a notice stating that it is | |
393 | governed by this License along with a term that is a further | |
394 | restriction, you may remove that term. If a license document contains a | |
395 | further restriction but permits relicensing or conveying under this | |
396 | License, you may add to a covered work material governed by the terms of | |
397 | that license document, provided that the further restriction does not | |
398 | survive such relicensing or conveying. | |
399 | ||
400 | If you add terms to a covered work in accord with this section, you must | |
401 | place, in the relevant source files, a statement of the additional terms | |
402 | that apply to those files, or a notice indicating where to find the | |
403 | applicable terms. | |
404 | ||
405 | Additional terms, permissive or non-permissive, may be stated in the | |
406 | form of a separately written license, or stated as exceptions; the above | |
407 | requirements apply either way. | |
408 | ||
409 | 8. Termination. | |
410 | ~~~~~~~~~~~~~~~ | |
411 | ||
412 | You may not propagate or modify a covered work except as expressly | |
413 | provided under this License. Any attempt otherwise to propagate or | |
414 | modify it is void, and will automatically terminate your rights under | |
415 | this License (including any patent licenses granted under the third | |
416 | paragraph of section 11). | |
417 | ||
418 | However, if you cease all violation of this License, then your license | |
419 | from a particular copyright holder is reinstated | |
420 | ||
421 | a. provisionally, unless and until the copyright holder explicitly and | |
422 | finally terminates your license, and | |
423 | b. permanently, if the copyright holder fails to notify you of the | |
424 | violation by some reasonable means prior to 60 days after the | |
425 | cessation. | |
426 | ||
427 | Moreover, your license from a particular copyright holder is reinstated | |
428 | permanently if the copyright holder notifies you of the violation by | |
429 | some reasonable means, this is the first time you have received notice | |
430 | of violation of this License (for any work) from that copyright holder, | |
431 | and you cure the violation prior to 30 days after your receipt of the | |
432 | notice. | |
433 | ||
434 | Termination of your rights under this section does not terminate the | |
435 | licenses of parties who have received copies or rights from you under | |
436 | this License. If your rights have been terminated and not permanently | |
437 | reinstated, you do not qualify to receive new licenses for the same | |
438 | material under section 10. | |
439 | ||
440 | 9. Acceptance Not Required for Having Copies. | |
441 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
442 | ||
443 | You are not required to accept this License in order to receive or run a | |
444 | copy of the Program. Ancillary propagation of a covered work occurring | |
445 | solely as a consequence of using peer-to-peer transmission to receive a | |
446 | copy likewise does not require acceptance. However, nothing other than | |
447 | this License grants you permission to propagate or modify any covered | |
448 | work. These actions infringe copyright if you do not accept this | |
449 | License. Therefore, by modifying or propagating a covered work, you | |
450 | indicate your acceptance of this License to do so. | |
451 | ||
452 | 10. Automatic Licensing of Downstream Recipients. | |
453 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
454 | ||
455 | Each time you convey a covered work, the recipient automatically | |
456 | receives a license from the original licensors, to run, modify and | |
457 | propagate that work, subject to this License. You are not responsible | |
458 | for enforcing compliance by third parties with this License. | |
459 | ||
460 | An *entity transaction* is a transaction transferring control of an | |
461 | organization, or substantially all assets of one, or subdividing an | |
462 | organization, or merging organizations. If propagation of a covered work | |
463 | results from an entity transaction, each party to that transaction who | |
464 | receives a copy of the work also receives whatever licenses to the work | |
465 | the party's predecessor in interest had or could give under the previous | |
466 | paragraph, plus a right to possession of the Corresponding Source of the | |
467 | work from the predecessor in interest, if the predecessor has it or can | |
468 | get it with reasonable efforts. | |
469 | ||
470 | You may not impose any further restrictions on the exercise of the | |
471 | rights granted or affirmed under this License. For example, you may not | |
472 | impose a license fee, royalty, or other charge for exercise of rights | |
473 | granted under this License, and you may not initiate litigation | |
474 | (including a cross-claim or counterclaim in a lawsuit) alleging that any | |
475 | patent claim is infringed by making, using, selling, offering for sale, | |
476 | or importing the Program or any portion of it. | |
477 | ||
478 | 11. Patents. | |
479 | ~~~~~~~~~~~~ | |
480 | ||
481 | A *contributor* is a copyright holder who authorizes use under this | |
482 | License of the Program or a work on which the Program is based. The work | |
483 | thus licensed is called the contributor's *contributor version*. | |
484 | ||
485 | A contributor's *essential patent claims* are all patent claims owned or | |
486 | controlled by the contributor, whether already acquired or hereafter | |
487 | acquired, that would be infringed by some manner, permitted by this | |
488 | License, of making, using, or selling its contributor version, but do | |
489 | not include claims that would be infringed only as a consequence of | |
490 | further modification of the contributor version. For purposes of this | |
491 | definition, *control* includes the right to grant patent sublicenses in | |
492 | a manner consistent with the requirements of this License. | |
493 | ||
494 | Each contributor grants you a non-exclusive, worldwide, royalty-free | |
495 | patent license under the contributor's essential patent claims, to make, | |
496 | use, sell, offer for sale, import and otherwise run, modify and | |
497 | propagate the contents of its contributor version. | |
498 | ||
499 | In the following three paragraphs, a *patent license* is any express | |
500 | agreement or commitment, however denominated, not to enforce a patent | |
501 | (such as an express permission to practice a patent or covenant not to | |
502 | sue for patent infringement). To *grant* such a patent license to a | |
503 | party means to make such an agreement or commitment not to enforce a | |
504 | patent against the party. | |
505 | ||
506 | If you convey a covered work, knowingly relying on a patent license, and | |
507 | the Corresponding Source of the work is not available for anyone to | |
508 | copy, free of charge and under the terms of this License, through a | |
509 | publicly available network server or other readily accessible means, | |
510 | then you must either | |
511 | ||
512 | 1. cause the Corresponding Source to be so available, or | |
513 | 2. arrange to deprive yourself of the benefit of the patent license for | |
514 | this particular work, or | |
515 | 3. arrange, in a manner consistent with the requirements of this | |
516 | License, to extend the patent license to downstream recipients. | |
517 | ||
518 | *Knowingly relying* means you have actual knowledge that, but for the | |
519 | patent license, your conveying the covered work in a country, or your | |
520 | recipient's use of the covered work in a country, would infringe one or | |
521 | more identifiable patents in that country that you have reason to | |
522 | believe are valid. | |
523 | ||
524 | If, pursuant to or in connection with a single transaction or | |
525 | arrangement, you convey, or propagate by procuring conveyance of, a | |
526 | covered work, and grant a patent license to some of the parties | |
527 | receiving the covered work authorizing them to use, propagate, modify or | |
528 | convey a specific copy of the covered work, then the patent license you | |
529 | grant is automatically extended to all recipients of the covered work | |
530 | and works based on it. | |
531 | ||
532 | A patent license is *discriminatory* if it does not include within the | |
533 | scope of its coverage, prohibits the exercise of, or is conditioned on | |
534 | the non-exercise of one or more of the rights that are specifically | |
535 | granted under this License. You may not convey a covered work if you are | |
536 | a party to an arrangement with a third party that is in the business of | |
537 | distributing software, under which you make payment to the third party | |
538 | based on the extent of your activity of conveying the work, and under | |
539 | which the third party grants, to any of the parties who would receive | |
540 | the covered work from you, a discriminatory patent license | |
541 | ||
542 | a. in connection with copies of the covered work conveyed by you (or | |
543 | copies made from those copies), or | |
544 | b. primarily for and in connection with specific products or | |
545 | compilations that contain the covered work, unless you entered into | |
546 | that arrangement, or that patent license was granted, prior to 28 | |
547 | March 2007. | |
548 | ||
549 | Nothing in this License shall be construed as excluding or limiting any | |
550 | implied license or other defenses to infringement that may otherwise be | |
551 | available to you under applicable patent law. | |
552 | ||
553 | 12. No Surrender of Others' Freedom. | |
554 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
555 | ||
556 | If conditions are imposed on you (whether by court order, agreement or | |
557 | otherwise) that contradict the conditions of this License, they do not | |
558 | excuse you from the conditions of this License. If you cannot convey a | |
559 | covered work so as to satisfy simultaneously your obligations under this | |
560 | License and any other pertinent obligations, then as a consequence you | |
561 | may not convey it at all. For example, if you agree to terms that | |
562 | obligate you to collect a royalty for further conveying from those to | |
563 | whom you convey the Program, the only way you could satisfy both those | |
564 | terms and this License would be to refrain entirely from conveying the | |
565 | Program. | |
566 | ||
567 | 13. Use with the GNU Affero General Public License. | |
568 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
569 | ||
570 | Notwithstanding any other provision of this License, you have permission | |
571 | to link or combine any covered work with a work licensed under version 3 | |
572 | of the GNU Affero General Public License into a single combined work, | |
573 | and to convey the resulting work. The terms of this License will | |
574 | continue to apply to the part which is the covered work, but the special | |
575 | requirements of the GNU Affero General Public License, section 13, | |
576 | concerning interaction through a network will apply to the combination | |
577 | as such. | |
578 | ||
579 | 14. Revised Versions of this License. | |
580 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
581 | ||
582 | The Free Software Foundation may publish revised and/or new versions of | |
583 | the GNU General Public License from time to time. Such new versions will | |
584 | be similar in spirit to the present version, but may differ in detail to | |
585 | address new problems or concerns. | |
586 | ||
587 | Each version is given a distinguishing version number. If the Program | |
588 | specifies that a certain numbered version of the GNU General Public | |
589 | License *or any later version* applies to it, you have the option of | |
590 | following the terms and conditions either of that numbered version or of | |
591 | any later version published by the Free Software Foundation. If the | |
592 | Program does not specify a version number of the GNU General Public | |
593 | License, you may choose any version ever published by the Free Software | |
594 | Foundation. | |
595 | ||
596 | If the Program specifies that a proxy can decide which future versions | |
597 | of the GNU General Public License can be used, that proxy's public | |
598 | statement of acceptance of a version permanently authorizes you to | |
599 | choose that version for the Program. | |
600 | ||
601 | Later license versions may give you additional or different permissions. | |
602 | However, no additional obligations are imposed on any author or | |
603 | copyright holder as a result of your choosing to follow a later version. | |
604 | ||
605 | 15. Disclaimer of Warranty. | |
606 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
607 | ||
608 | THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY | |
609 | APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT | |
610 | HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM *AS IS* WITHOUT | |
611 | WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT | |
612 | LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A | |
613 | PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF | |
614 | THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME | |
615 | THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. | |
616 | ||
617 | 16. Limitation of Liability. | |
618 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
619 | ||
620 | IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING | |
621 | WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR | |
622 | CONVEYS THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, | |
623 | INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES | |
624 | ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT | |
625 | NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES | |
626 | SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE | |
627 | WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN | |
628 | ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. | |
629 | ||
630 | 17. Interpretation of Sections 15 and 16. | |
631 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
632 | ||
633 | If the disclaimer of warranty and limitation of liability provided above | |
634 | cannot be given local legal effect according to their terms, reviewing | |
635 | courts shall apply local law that most closely approximates an absolute | |
636 | waiver of all civil liability in connection with the Program, unless a | |
637 | warranty or assumption of liability accompanies a copy of the Program in | |
638 | return for a fee. | |
639 | ||
640 | END OF TERMS AND CONDITIONS | |
641 | --------------------------- | |
642 | ||
643 | How to Apply These Terms to Your New Programs | |
644 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
645 | ||
646 | If you develop a new program, and you want it to be of the greatest | |
647 | possible use to the public, the best way to achieve this is to make it | |
648 | free software which everyone can redistribute and change under these | |
649 | terms. | |
650 | ||
651 | To do so, attach the following notices to the program. It is safest to | |
652 | attach them to the start of each source file to most effectively state | |
653 | the exclusion of warranty; and each file should have at least the | |
654 | *copyright* line and a pointer to where the full notice is found. | |
655 | ||
656 | :: | |
657 | ||
658 | <one line to give the program's name and a brief idea of what it does.> | |
659 | Copyright (C) <year> <name of author> | |
660 | ||
661 | This program is free software: you can redistribute it and/or modify | |
662 | it under the terms of the GNU General Public License as published by | |
663 | the Free Software Foundation, either version 3 of the License, or | |
664 | (at your option) any later version. | |
665 | ||
666 | This program is distributed in the hope that it will be useful, | |
667 | but WITHOUT ANY WARRANTY; without even the implied warranty of | |
668 | MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the | |
669 | GNU General Public License for more details. | |
670 | ||
671 | You should have received a copy of the GNU General Public License | |
672 | along with this program. If not, see <http://www.gnu.org/licenses/>. | |
673 | ||
674 | Also add information on how to contact you by electronic and paper mail. | |
675 | ||
676 | If the program does terminal interaction, make it output a short notice | |
677 | like this when it starts in an interactive mode: | |
678 | ||
679 | :: | |
680 | ||
681 | <program> Copyright (C) <year> <name of author> | |
682 | This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'. | |
683 | This is free software, and you are welcome to redistribute it | |
684 | under certain conditions; type `show c' for details. | |
685 | ||
686 | The hypothetical commands ``show w`` and ``show c`` should show the | |
687 | appropriate parts of the General Public License. Of course, your | |
688 | program's commands might be different; for a GUI interface, you would | |
689 | use an *about box*. | |
690 | ||
691 | You should also get your employer (if you work as a programmer) or | |
692 | school, if any, to sign a *copyright disclaimer* for the program, if | |
693 | necessary. For more information on this, and how to apply and follow the | |
694 | GNU GPL, see | |
695 | `http://www.gnu.org/licenses/ <http://www.gnu.org/licenses/>`_. | |
696 | ||
697 | The GNU General Public License does not permit incorporating your | |
698 | program into proprietary programs. If your program is a subroutine | |
699 | library, you may consider it more useful to permit linking proprietary | |
700 | applications with the library. If this is what you want to do, use the | |
701 | GNU Lesser General Public License instead of this License. But first, | |
702 | please read | |
703 | `http://www.gnu.org/philosophy/why-not-lgpl.html <http://www.gnu.org/philosophy/why-not-lgpl.html>`_. |
5 | 5 | All rights reserved. |
6 | 6 | |
7 | 7 | Lambda is *free software*: you can redistribute it and/or modify |
8 | it under the terms of the GNU Affero General Public License as | |
9 | published by the Free Software Foundation, either version 3 of the | |
10 | License, or (at your option) any later version. | |
8 | it under the terms of the GNU General Public License as published by | |
9 | the Free Software Foundation, either version 3 of the License, or | |
10 | (at your option) any later version. | |
11 | 11 | |
12 | 12 | Lambda is distributed in the hope that it will be useful, |
13 | 13 | but **without any warranty**; without even the implied warranty of |
14 | 14 | **merchantability** or **fitness for a particular purpose**. |
15 | 15 | |
16 | See the file `LICENSE-AGPL3.rst <./LICENSE-AGPL3.rst>`__ or | |
16 | See the file `LICENSE-GPL3.rst <./LICENSE-GPL3.rst>`__ or | |
17 | 17 | http://www.gnu.org/licenses/ for a full text of the license and the |
18 | 18 | rights and obligations implied. |
19 | 19 |
11 | 11 | |
12 | 12 | # change this after every release |
13 | 13 | set (SEQAN_APP_VERSION_MAJOR "1") |
14 | set (SEQAN_APP_VERSION_MINOR "9") | |
14 | set (SEQAN_APP_VERSION_MINOR "0") | |
15 | 15 | set (SEQAN_APP_VERSION_PATCH "0") |
16 | 16 | |
17 | 17 | # don't change the following |
85 | 85 | option (LAMBDA_FASTBUILD "Build only blastp and blastx modes (speeds up build)." OFF) |
86 | 86 | option (LAMBDA_NATIVE_BUILD "Architecture-specific optimizations, i.e. g++ -march=native." ON) |
87 | 87 | option (LAMBDA_STATIC_BUILD "Include all libraries in the binaries." OFF) |
88 | option (LAMBDA_MMAPPED_DB "Use mmapped access to the database." OFF) | |
88 | option (LAMBDA_MMAPPED_DB "Use mmapped access to the database." ON) | |
89 | 89 | option (LAMBDA_LINGAPS_OPT "Add optimized codepaths for linear gap costs (inc. bin size and compile time)." OFF) |
90 | 90 | |
91 | 91 | if (LAMBDA_FASTBUILD) |
219 | 219 | # Install non-binary files for the package to share/lambda |
220 | 220 | install (FILES ../LICENSE.rst |
221 | 221 | ../LICENSE-BSD.rst |
222 | ../LICENSE-AGPL3.rst | |
222 | ../LICENSE-GPL3.rst | |
223 | 223 | ../README.rst |
224 | 224 | DESTINATION "share/doc/lambda") |
225 | 225 |
53 | 53 | uint64_t hitsMerged; |
54 | 54 | uint64_t hitsTooShort; |
55 | 55 | uint64_t hitsMasked; |
56 | std::vector<uint16_t> seedLengths; | |
57 | 56 | |
58 | 57 | // pre-extension |
59 | 58 | uint64_t hitsFailedPreExtendTest; |
70 | 69 | uint64_t hitsFinal; |
71 | 70 | uint64_t qrysWithHit; |
72 | 71 | |
73 | // times | |
74 | double timeGenSeeds; | |
75 | double timeSearch; | |
76 | double timeSort; | |
77 | double timeExtend; | |
78 | ||
79 | 72 | StatsHolder() |
80 | 73 | { |
81 | 74 | clear(); |
87 | 80 | hitsMerged = 0; |
88 | 81 | hitsTooShort = 0; |
89 | 82 | hitsMasked = 0; |
90 | seedLengths.clear(); | |
91 | 83 | |
92 | 84 | hitsFailedPreExtendTest = 0; |
93 | 85 | hitsPutativeDuplicate = 0; |
100 | 92 | |
101 | 93 | hitsFinal = 0; |
102 | 94 | qrysWithHit = 0; |
103 | ||
104 | timeGenSeeds = 0; | |
105 | timeSearch = 0; | |
106 | timeSort = 0; | |
107 | timeExtend = 0; | |
108 | 95 | } |
109 | 96 | |
110 | 97 | StatsHolder plus(StatsHolder const & rhs) |
113 | 100 | hitsMerged += rhs.hitsMerged; |
114 | 101 | hitsTooShort += rhs.hitsTooShort; |
115 | 102 | hitsMasked += rhs.hitsMasked; |
116 | append(seedLengths, rhs.seedLengths); | |
117 | 103 | |
118 | 104 | hitsFailedPreExtendTest += rhs.hitsFailedPreExtendTest; |
119 | 105 | hitsPutativeDuplicate += rhs.hitsPutativeDuplicate; |
126 | 112 | |
127 | 113 | hitsFinal += rhs.hitsFinal; |
128 | 114 | qrysWithHit += rhs.qrysWithHit; |
129 | ||
130 | timeGenSeeds += rhs.timeGenSeeds; | |
131 | timeSearch += rhs.timeSearch; | |
132 | timeSort += rhs.timeSort; | |
133 | timeExtend += rhs.timeExtend; | |
134 | ||
135 | 115 | return *this; |
136 | 116 | } |
137 | 117 | |
165 | 145 | std::cout << "Remaining\033[0m" |
166 | 146 | << "\n after Seeding "; BLANKS; |
167 | 147 | std::cout << R << rem; |
168 | if (stats.hitsMasked) | |
169 | std::cout << "\n - masked " << R << stats.hitsMasked | |
170 | << RR << (rem -= stats.hitsMasked); | |
171 | if (options.mergePutativeSiblings) | |
172 | std::cout << "\n - merged " << R << stats.hitsMerged | |
173 | << RR << (rem -= stats.hitsMerged); | |
174 | if (options.filterPutativeDuplicates) | |
175 | std::cout << "\n - putative duplicates " << R | |
176 | << stats.hitsPutativeDuplicate << RR | |
177 | << (rem -= stats.hitsPutativeDuplicate); | |
178 | if (options.filterPutativeAbundant) | |
179 | std::cout << "\n - putative abundant " << R | |
180 | << stats.hitsPutativeAbundant << RR | |
181 | << (rem -= stats.hitsPutativeAbundant); | |
182 | if (options.preScoring) | |
183 | std::cout << "\n - failed pre-extend test " << R | |
184 | << stats.hitsFailedPreExtendTest << RR | |
185 | << (rem -= stats.hitsFailedPreExtendTest); | |
148 | std::cout << "\n - masked " << R << stats.hitsMasked | |
149 | << RR << (rem -= stats.hitsMasked); | |
150 | std::cout << "\n - merged " << R << stats.hitsMerged | |
151 | << RR << (rem -= stats.hitsMerged); | |
152 | std::cout << "\n - putative duplicates " << R | |
153 | << stats.hitsPutativeDuplicate << RR | |
154 | << (rem -= stats.hitsPutativeDuplicate); | |
155 | std::cout << "\n - putative abundant " << R | |
156 | << stats.hitsPutativeAbundant << RR | |
157 | << (rem -= stats.hitsPutativeAbundant); | |
158 | std::cout << "\n - failed pre-extend test " << R | |
159 | << stats.hitsFailedPreExtendTest << RR | |
160 | << (rem -= stats.hitsFailedPreExtendTest); | |
186 | 161 | std::cout << "\n - failed %-identity test " << R |
187 | 162 | << stats.hitsFailedExtendPercentIdentTest << RR |
188 | 163 | << (rem -= stats.hitsFailedExtendPercentIdentTest); |
199 | 174 | |
200 | 175 | if (rem != stats.hitsFinal) |
201 | 176 | std::cout << "WARNING: hits dont add up\n"; |
202 | ||
203 | std::cout << "Detailed Non-Wall-Clock times:\n" | |
204 | << " genSeeds: " << stats.timeGenSeeds << "\n" | |
205 | << " search: " << stats.timeSearch << "\n" | |
206 | << " sort: " << stats.timeSort << "\n" | |
207 | << " extend: " << stats.timeExtend << "\n\n"; | |
208 | ||
209 | if (length(stats.seedLengths)) | |
210 | { | |
211 | double _seedLengthSum = std::accumulate(stats.seedLengths.begin(), stats.seedLengths.end(), 0.0); | |
212 | double seedLengthMean = _seedLengthSum / stats.seedLengths.size(); | |
213 | ||
214 | double _seedLengthMeanSqSum = std::inner_product(stats.seedLengths.begin(), | |
215 | stats.seedLengths.end(), | |
216 | stats.seedLengths.begin(), | |
217 | 0.0); | |
218 | double seedLengthStdDev = std::sqrt(_seedLengthMeanSqSum / stats.seedLengths.size() - | |
219 | seedLengthMean * seedLengthMean); | |
220 | uint16_t seedLengthMax = *std::max_element(stats.seedLengths.begin(), stats.seedLengths.end()); | |
221 | ||
222 | std::cout << "SeedStats:\n" | |
223 | << " avgLength: " << seedLengthMean << "\n" | |
224 | << " stddev: " << seedLengthStdDev << "\n" | |
225 | << " max: " << seedLengthMax << "\n\n"; | |
226 | } | |
227 | 177 | } |
228 | 178 | |
229 | 179 | if (options.verbosity >= 1) |
398 | 348 | // ---------------------------------------------------------------------------- |
399 | 349 | |
400 | 350 | template <typename TGlobalHolder_, |
401 | typename TScoreExtension_> | |
351 | typename TScoreExtension> | |
402 | 352 | class LocalDataHolder |
403 | 353 | { |
404 | 354 | public: |
407 | 357 | using TSeeds = StringSet<typename Infix<TRedQrySeq const>::Type>; |
408 | 358 | using TSeedIndex = Index<TSeeds, IndexSa<>>; |
409 | 359 | using TMatch = typename TGlobalHolder::TMatch; |
410 | using TScoreExtension = TScoreExtension_; | |
411 | 360 | |
412 | 361 | |
413 | 362 | // references to global stuff |
19 | 19 | // ========================================================================== |
20 | 20 | |
21 | 21 | #include <iostream> |
22 | ||
23 | //TODO TEMPORARY REMOVE | |
24 | #define amd64 | |
25 | 22 | |
26 | 23 | #include <seqan/basic.h> |
27 | 24 | #include <seqan/sequence.h> |
53 | 50 | // forwards |
54 | 51 | |
55 | 52 | inline int |
56 | argConv0(LambdaOptions & options); | |
53 | argConv0(LambdaOptions const & options); | |
57 | 54 | //- |
58 | 55 | template <typename TOutFormat, |
59 | 56 | BlastTabularSpec h> |
60 | 57 | inline int |
61 | argConv1(LambdaOptions & options, | |
58 | argConv1(LambdaOptions const & options, | |
62 | 59 | TOutFormat const & /**/, |
63 | 60 | BlastTabularSpecSelector<h> const &); |
64 | 61 | //- |
66 | 63 | BlastTabularSpec h, |
67 | 64 | BlastProgram p> |
68 | 65 | inline int |
69 | argConv2(LambdaOptions & options, | |
66 | argConv2(LambdaOptions const & options, | |
70 | 67 | TOutFormat const & /**/, |
71 | 68 | BlastTabularSpecSelector<h> const &, |
72 | 69 | BlastProgramSelector<p> const &); |
76 | 73 | BlastTabularSpec h, |
77 | 74 | BlastProgram p> |
78 | 75 | inline int |
79 | argConv3(LambdaOptions & options, | |
76 | argConv3(LambdaOptions const & options, | |
80 | 77 | TOutFormat const &, |
81 | 78 | BlastTabularSpecSelector<h> const &, |
82 | 79 | BlastProgramSelector<p> const &, |
88 | 85 | BlastTabularSpec h, |
89 | 86 | BlastProgram p> |
90 | 87 | inline int |
91 | argConv4(LambdaOptions & options, | |
88 | argConv4(LambdaOptions const & options, | |
92 | 89 | TOutFormat const & /**/, |
93 | 90 | BlastTabularSpecSelector<h> const &, |
94 | 91 | BlastProgramSelector<p> const &, |
102 | 99 | BlastProgram p, |
103 | 100 | BlastTabularSpec h> |
104 | 101 | inline int |
105 | realMain(LambdaOptions & options, | |
102 | realMain(LambdaOptions const & options, | |
106 | 103 | TOutFormat const & /**/, |
107 | 104 | BlastTabularSpecSelector<h> const &, |
108 | 105 | BlastProgramSelector<p> const &, |
135 | 132 | |
136 | 133 | // CONVERT Run-time options to compile-time Format-Type |
137 | 134 | inline int |
138 | argConv0(LambdaOptions & options) | |
135 | argConv0(LambdaOptions const & options) | |
139 | 136 | { |
140 | 137 | CharString output = options.output; |
141 | 138 | if (endsWith(output, ".gz")) |
159 | 156 | template <typename TOutFormat, |
160 | 157 | BlastTabularSpec h> |
161 | 158 | inline int |
162 | argConv1(LambdaOptions & options, | |
159 | argConv1(LambdaOptions const & options, | |
163 | 160 | TOutFormat const & /**/, |
164 | 161 | BlastTabularSpecSelector<h> const &) |
165 | 162 | { |
208 | 205 | BlastTabularSpec h, |
209 | 206 | BlastProgram p> |
210 | 207 | inline int |
211 | argConv2(LambdaOptions & options, | |
208 | argConv2(LambdaOptions const & options, | |
212 | 209 | TOutFormat const & /**/, |
213 | 210 | BlastTabularSpecSelector<h> const &, |
214 | 211 | BlastProgramSelector<p> const &) |
244 | 241 | BlastTabularSpec h, |
245 | 242 | BlastProgram p> |
246 | 243 | inline int |
247 | argConv3(LambdaOptions & options, | |
244 | argConv3(LambdaOptions const & options, | |
248 | 245 | TOutFormat const &, |
249 | 246 | BlastTabularSpecSelector<h> const &, |
250 | 247 | BlastProgramSelector<p> const &, |
279 | 276 | BlastTabularSpec h, |
280 | 277 | BlastProgram p> |
281 | 278 | inline int |
282 | argConv4(LambdaOptions & options, | |
279 | argConv4(LambdaOptions const & options, | |
283 | 280 | TOutFormat const & /**/, |
284 | 281 | BlastTabularSpecSelector<h> const &, |
285 | 282 | BlastProgramSelector<p> const &, |
286 | 283 | TRedAlph const & /**/, |
287 | 284 | TScoreExtension const & /**/) |
288 | 285 | { |
289 | if (options.dbIndexType == DbIndexType::SUFFIX_ARRAY) | |
286 | int indexType = options.dbIndexType; | |
287 | // if (indexType == -1) // autodetect | |
288 | // { | |
289 | // //TODO FIX THIS WITH NEW EXTENSIONS | |
290 | // CharString file = options.dbFile; | |
291 | // append(file, ".sa"); | |
292 | // struct stat buffer; | |
293 | // if (stat(toCString(file), &buffer) == 0) | |
294 | // { | |
295 | // indexType = 0; | |
296 | // } else | |
297 | // { | |
298 | // file = options.dbFile; | |
299 | // append(file, ".sa.val"); // FM Index | |
300 | // struct stat buffer; | |
301 | // if (stat(toCString(file), &buffer) == 0) | |
302 | // { | |
303 | // indexType = 1; | |
304 | // } else | |
305 | // { | |
306 | // std::cerr << "No Index file could be found, please make sure paths " | |
307 | // << "are correct and the files are readable.\n" << std::flush; | |
308 | // | |
309 | // return -1; | |
310 | // } | |
311 | // } | |
312 | // } | |
313 | ||
314 | if (indexType == 0) | |
290 | 315 | return realMain<IndexSa<>>(options, |
291 | 316 | TOutFormat(), |
292 | 317 | BlastTabularSpecSelector<h>(), |
315 | 340 | BlastProgram p, |
316 | 341 | BlastTabularSpec h> |
317 | 342 | inline int |
318 | realMain(LambdaOptions & options, | |
343 | realMain(LambdaOptions const & options, | |
319 | 344 | TOutFormat const & /**/, |
320 | 345 | BlastTabularSpecSelector<h> const &, |
321 | 346 | BlastProgramSelector<p> const &, |
329 | 354 | "\n======================================================" |
330 | 355 | "\nVersion ", SEQAN_APP_VERSION, "\n\n"); |
331 | 356 | |
332 | int ret = validateIndexOptions<TRedAlph, p>(options); | |
333 | if (ret) | |
334 | return ret; | |
335 | ||
336 | 357 | if (options.verbosity >= 2) |
337 | 358 | printOptions<TLocalHolder>(options); |
338 | 359 | |
339 | 360 | TGlobalHolder globalHolder; |
340 | 361 | // context(globalHolder.outfile).scoringScheme._internalScheme = matr; |
341 | 362 | |
342 | ret = prepareScoring(globalHolder, options); | |
363 | int ret = prepareScoring(globalHolder, options); | |
343 | 364 | if (ret) |
344 | 365 | return ret; |
345 | 366 | |
351 | 372 | if (ret) |
352 | 373 | return ret; |
353 | 374 | |
354 | // ret = loadSegintervals(globalHolder, options); | |
355 | // if (ret) | |
356 | // return ret; | |
375 | ret = loadSegintervals(globalHolder, options); | |
376 | if (ret) | |
377 | return ret; | |
357 | 378 | |
358 | 379 | ret = loadQuery(globalHolder, options); |
359 | 380 | if (ret) |
419 | 440 | localHolder.init(t); |
420 | 441 | |
421 | 442 | // seed |
422 | double buf = sysTime(); | |
423 | if (!options.adaptiveSeeding) | |
424 | { | |
425 | res = generateSeeds(localHolder); | |
426 | if (res) | |
427 | continue; | |
428 | } | |
443 | res = generateSeeds(localHolder); | |
444 | if (res) | |
445 | continue; | |
429 | 446 | |
430 | 447 | if (options.doubleIndexing) |
431 | 448 | { |
433 | 450 | if (res) |
434 | 451 | continue; |
435 | 452 | } |
436 | localHolder.stats.timeGenSeeds += sysTime() - buf; | |
437 | 453 | |
438 | 454 | // search |
439 | buf = sysTime(); | |
440 | search(localHolder); //TODO seed refining if iterateMatches gives 0 results | |
441 | localHolder.stats.timeSearch += sysTime() - buf; | |
442 | ||
443 | // // TODO DEBUG | |
444 | // for (auto const & m : localHolder.matches) | |
445 | // _printMatch(m); | |
455 | search(localHolder); | |
446 | 456 | |
447 | 457 | // sort |
448 | if (options.filterPutativeAbundant || options.filterPutativeDuplicates || options.mergePutativeSiblings) | |
449 | { | |
450 | buf = sysTime(); | |
451 | sortMatches(localHolder); | |
452 | localHolder.stats.timeSort += sysTime() - buf; | |
453 | } | |
458 | sortMatches(localHolder); | |
454 | 459 | |
455 | 460 | // extend |
456 | buf = sysTime(); | |
457 | if (length(localHolder.matches) > 0) | |
458 | res = iterateMatches(localHolder); | |
459 | localHolder.stats.timeExtend += sysTime() - buf; | |
461 | res = iterateMatches(localHolder); | |
460 | 462 | if (res) |
461 | 463 | continue; |
464 | ||
462 | 465 | |
463 | 466 | if ((!options.doubleIndexing) && (TID == 0) && |
464 | 467 | (options.verbosity >= 1)) |
485 | 488 | |
486 | 489 | if (!options.doubleIndexing) |
487 | 490 | { |
488 | myPrint(options, 2, "Runtime total: ", sysTime() - start, "s.\n\n"); | |
491 | myPrint(options, 2, "Runtime: ", sysTime() - start, "s.\n\n"); | |
489 | 492 | } |
490 | 493 | |
491 | 494 | printStats(globalHolder.stats, options); |
94 | 94 | // ============================================================================ |
95 | 95 | |
96 | 96 | // -------------------------------------------------------------------------- |
97 | // Function readIndexOption() | |
98 | // -------------------------------------------------------------------------- | |
99 | ||
100 | inline void | |
101 | readIndexOption(std::string & optionString, | |
102 | std::string const & optionIdentifier, | |
103 | LambdaOptions const & options) | |
104 | { | |
105 | std::ifstream f{(options.indexDir + "/option:" + optionIdentifier).c_str(), | |
106 | std::ios_base::in | std::ios_base::binary}; | |
107 | if (f.is_open()) | |
108 | { | |
109 | auto fit = directionIterator(f, Input()); | |
110 | readLine(optionString, fit); | |
111 | f.close(); | |
112 | } | |
113 | else | |
114 | { | |
115 | throw std::runtime_error("ERROR: Expected option specifier:\n" + options.indexDir + "/option:" + | |
116 | optionIdentifier + "\nYour index seems incompatible, try to recreate it " | |
117 | "and report a bug if the issue persists."); | |
118 | } | |
119 | } | |
120 | ||
121 | // -------------------------------------------------------------------------- | |
122 | // Function validateIndexOptions() | |
123 | // -------------------------------------------------------------------------- | |
124 | ||
125 | template <typename TRedAlph, | |
126 | BlastProgram p> | |
127 | inline int | |
128 | validateIndexOptions(LambdaOptions const & options) | |
129 | { | |
130 | std::string buffer; | |
131 | readIndexOption(buffer, "alph_translated", options); | |
132 | if (buffer != _alphName(TransAlph<p>())) | |
133 | { | |
134 | std::cerr << "ERROR: Your index is of translated alphabet type: " << buffer << "\n But lambda expected: " | |
135 | << _alphName(TransAlph<p>()) << "\n Did you specify the right -p parameter?\n\n"; | |
136 | return -1; | |
137 | ||
138 | } | |
139 | buffer.clear(); | |
140 | readIndexOption(buffer, "alph_reduced", options); | |
141 | if (buffer != _alphName(TRedAlph())) | |
142 | { | |
143 | std::cerr << "ERROR: Your index is of reduced alphabet type: " << buffer << "\n But lambda expected: " | |
144 | << _alphName(TRedAlph()) << "\n Did you specify the right -ar parameter?\n\n"; | |
145 | return -1; | |
146 | } | |
147 | buffer.clear(); | |
148 | readIndexOption(buffer, "db_index_type", options); | |
149 | unsigned long b = 0; | |
150 | if ((!lexicalCast(b, buffer)) || (b != static_cast<unsigned long>(options.dbIndexType))) | |
151 | { | |
152 | std::cerr << "ERROR: Your index type is: " << _indexName(static_cast<DbIndexType>(std::stoul(buffer))) | |
153 | << "\n But lambda expected: " << _indexName(options.dbIndexType) | |
154 | << "\n Did you specify the right -di parameter?\n\n"; | |
155 | return -1; | |
156 | } | |
157 | if (qIsTranslated(p) && sIsTranslated(p)) | |
158 | { | |
159 | buffer.clear(); | |
160 | readIndexOption(buffer, "genetic_code", options); | |
161 | unsigned long b = 0; | |
162 | if ((!lexicalCast(b, buffer)) || (b != static_cast<unsigned long>(options.geneticCode))) | |
163 | { | |
164 | std::cerr << "WARNING: The codon translation table used during indexing and during search are different. " | |
165 | "This is not a problem per se, but is likely not what you want.\n\n"; | |
166 | } | |
167 | } | |
168 | return 0; | |
169 | } | |
170 | ||
171 | // -------------------------------------------------------------------------- | |
172 | 97 | // Function prepareScoring() |
173 | 98 | // -------------------------------------------------------------------------- |
174 | 99 | |
269 | 194 | strIdent = "Loading Subj Sequences..."; |
270 | 195 | myPrint(options, 1, strIdent); |
271 | 196 | |
272 | _dbSeqs = options.indexDir; | |
273 | append(_dbSeqs, "/translated_seqs"); | |
197 | _dbSeqs = options.dbFile; | |
198 | append(_dbSeqs, "."); | |
199 | append(_dbSeqs, _alphName(TransAlph<p>())); | |
274 | 200 | |
275 | 201 | ret = open(globalHolder.subjSeqs, toCString(_dbSeqs), OPEN_RDONLY); |
276 | 202 | if (ret != true) |
299 | 225 | strIdent = "Loading Subj Ids..."; |
300 | 226 | myPrint(options, 1, strIdent); |
301 | 227 | |
302 | _dbSeqs = options.indexDir; | |
303 | append(_dbSeqs, "/seq_ids"); | |
228 | _dbSeqs = options.dbFile; | |
229 | append(_dbSeqs, ".ids"); | |
304 | 230 | ret = open(globalHolder.subjIds, toCString(_dbSeqs), OPEN_RDONLY); |
305 | 231 | if (ret != true) |
306 | 232 | { |
312 | 238 | myPrint(options, 1, " done.\n"); |
313 | 239 | myPrint(options, 2, "Runtime: ", finish, "s \n\n"); |
314 | 240 | |
315 | context(globalHolder.outfile).dbName = options.indexDir; | |
241 | context(globalHolder.outfile).dbName = options.dbFile; | |
316 | 242 | |
317 | 243 | // if subjects were translated, we don't have the untranslated seqs at all
318 | 244 | // but we still need the data for statistics and position un-translation |
322 | 248 | std::string strIdent = "Loading Lengths of untranslated Subj sequences..."; |
323 | 249 | myPrint(options, 1, strIdent); |
324 | 250 | |
325 | _dbSeqs = options.indexDir; | |
326 | append(_dbSeqs, "/untranslated_seq_lengths"); | |
251 | _dbSeqs = options.dbFile; | |
252 | append(_dbSeqs, ".untranslengths"); | |
327 | 253 | ret = open(globalHolder.untransSubjSeqLengths, toCString(_dbSeqs), OPEN_RDONLY); |
328 | 254 | if (ret != true) |
329 | 255 | { |
352 | 278 | std::string strIdent = "Loading Database Index..."; |
353 | 279 | myPrint(options, 1, strIdent); |
354 | 280 | double start = sysTime(); |
355 | std::string path = toCString(options.indexDir); | |
356 | path += "/index"; | |
281 | std::string path = toCString(options.dbFile); | |
282 | path += '.' + std::string(_alphName(typename TGlobalHolder::TRedAlph())); | |
283 | if (TGlobalHolder::indexIsFM) | |
284 | path += ".fm"; | |
285 | else | |
286 | path += ".sa"; | |
287 | ||
288 | // Check if the index is of the old format (pre 0.9.0) by looking for different files | |
289 | if ((globalHolder.blastProgram != BlastProgram::BLASTN) && // BLASTN indexes are compatible | |
290 | ((TGlobalHolder::alphReduction && fileExists(toCString(path + ".txt.concat"))) || | |
291 | (!TGlobalHolder::alphReduction && TGlobalHolder::indexIsFM && !fileExists(toCString(path + ".lf.drv.wtc.24"))))) | |
292 | { | |
293 | std::cerr << ((options.verbosity == 0) ? strIdent : std::string()) | |
294 | << " failed.\n" | |
295 | << "It appears you tried to open an old index (created before 0.9.0) which " | |
296 | << "is not supported. Please remove the old files and create a new index with lambda_indexer!\n"; | |
297 | return 200; | |
298 | } | |
357 | 299 | |
358 | 300 | int ret = open(globalHolder.dbIndex, path.c_str(), OPEN_RDONLY); |
359 | 301 | if (ret != true) |
375 | 317 | length(indexSA(globalHolder.dbIndex)), "\n\n"); |
376 | 318 | |
377 | 319 | // this is actually part of prepareScoring(), but the values are just available now |
378 | if (sIsTranslated(TGlobalHolder::blastProgram )) | |
320 | if (sIsTranslated(globalHolder.blastProgram )) | |
379 | 321 | { |
380 | 322 | // last value has sum of lengths |
381 | 323 | context(globalHolder.outfile).dbTotalLength = back(globalHolder.untransSubjSeqLengths); |
393 | 335 | // Function loadSegintervals() |
394 | 336 | // -------------------------------------------------------------------------- |
395 | 337 | |
396 | // template <BlastTabularSpec h, | |
397 | // BlastProgram p, | |
398 | // typename TRedAlph, | |
399 | // typename TIndexSpec, | |
400 | // typename TOutFormat> | |
401 | // inline int | |
402 | // loadSegintervals(GlobalDataHolder<TRedAlph, TIndexSpec, TOutFormat, p, h> & globalHolder, | |
403 | // LambdaOptions const & options) | |
404 | // { | |
405 | // | |
406 | // double start = sysTime(); | |
407 | // std::string strIdent = "Loading Database Masking file..."; | |
408 | // myPrint(options, 1, strIdent); | |
409 | // | |
410 | // CharString segFileS = options.dbFile; | |
411 | // append(segFileS, ".binseg_s.concat"); | |
412 | // CharString segFileE = options.dbFile; | |
413 | // append(segFileE, ".binseg_e.concat"); | |
414 | // bool fail = false; | |
415 | // struct stat buffer; | |
416 | // // file exists | |
417 | // if ((stat(toCString(segFileS), &buffer) == 0) && | |
418 | // (stat(toCString(segFileE), &buffer) == 0)) | |
419 | // { | |
420 | // //cut off ".concat" again | |
421 | // resize(segFileS, length(segFileS) - 7); | |
422 | // resize(segFileE, length(segFileE) - 7); | |
423 | // | |
424 | // fail = !open(globalHolder.segIntStarts, toCString(segFileS), OPEN_RDONLY); | |
425 | // if (!fail) | |
426 | // fail = !open(globalHolder.segIntEnds, toCString(segFileE), OPEN_RDONLY); | |
427 | // } else | |
428 | // { | |
429 | // fail = true; | |
430 | // } | |
431 | // | |
432 | // if (fail) | |
433 | // { | |
434 | // std::cerr << ((options.verbosity == 0) ? strIdent : std::string()) | |
435 | // << " failed.\n"; | |
436 | // return 1; | |
437 | // } | |
438 | // | |
439 | // double finish = sysTime() - start; | |
440 | // myPrint(options, 1, " done.\n"); | |
441 | // myPrint(options, 2, "Runtime: ", finish, "s \n\n"); | |
442 | // return 0; | |
443 | // } | |
338 | template <BlastTabularSpec h, | |
339 | BlastProgram p, | |
340 | typename TRedAlph, | |
341 | typename TIndexSpec, | |
342 | typename TOutFormat> | |
343 | inline int | |
344 | loadSegintervals(GlobalDataHolder<TRedAlph, TIndexSpec, TOutFormat, p, h> & globalHolder, | |
345 | LambdaOptions const & options) | |
346 | { | |
347 | ||
348 | double start = sysTime(); | |
349 | std::string strIdent = "Loading Database Masking file..."; | |
350 | myPrint(options, 1, strIdent); | |
351 | ||
352 | CharString segFileS = options.dbFile; | |
353 | append(segFileS, ".binseg_s.concat"); | |
354 | CharString segFileE = options.dbFile; | |
355 | append(segFileE, ".binseg_e.concat"); | |
356 | bool fail = false; | |
357 | struct stat buffer; | |
358 | // file exists | |
359 | if ((stat(toCString(segFileS), &buffer) == 0) && | |
360 | (stat(toCString(segFileE), &buffer) == 0)) | |
361 | { | |
362 | //cut off ".concat" again | |
363 | resize(segFileS, length(segFileS) - 7); | |
364 | resize(segFileE, length(segFileE) - 7); | |
365 | ||
366 | fail = !open(globalHolder.segIntStarts, toCString(segFileS), OPEN_RDONLY); | |
367 | if (!fail) | |
368 | fail = !open(globalHolder.segIntEnds, toCString(segFileE), OPEN_RDONLY); | |
369 | } else | |
370 | { | |
371 | fail = true; | |
372 | } | |
373 | ||
374 | if (fail) | |
375 | { | |
376 | std::cerr << ((options.verbosity == 0) ? strIdent : std::string()) | |
377 | << " failed.\n"; | |
378 | return 1; | |
379 | } | |
380 | ||
381 | double finish = sysTime() - start; | |
382 | myPrint(options, 1, " done.\n"); | |
383 | myPrint(options, 2, "Runtime: ", finish, "s \n\n"); | |
384 | return 0; | |
385 | } | |
444 | 386 | |
445 | 387 | // -------------------------------------------------------------------------- |
446 | 388 | // Function loadQuery() |
545 | 487 | typename TIndexSpec, |
546 | 488 | typename TOutFormat> |
547 | 489 | inline int |
548 | loadQuery(GlobalDataHolder<TRedAlph, TIndexSpec, TOutFormat, p, h> & globalHolder, | |
549 | LambdaOptions & options) | |
490 | loadQuery(GlobalDataHolder<TRedAlph, TIndexSpec, TOutFormat, p, h> & globalHolder, | |
491 | LambdaOptions const & options) | |
550 | 492 | { |
551 | 493 | using TGH = GlobalDataHolder<TRedAlph, TIndexSpec, TOutFormat, p, h>; |
552 | 494 | double start = sysTime(); |
578 | 520 | options); |
579 | 521 | |
580 | 522 | // sam and bam need original sequences if translation happened |
581 | if (qIsTranslated(TGH::blastProgram) && (options.outFileFormat > 0) && | |
523 | if (qIsTranslated(globalHolder.blastProgram) && (options.outFileFormat > 0) && | |
582 | 524 | (options.samBamSeq > 0)) |
583 | 525 | std::swap(origSeqs, globalHolder.untranslatedQrySeqs); |
584 | 526 | |
619 | 561 | << ".\n"; |
620 | 562 | return -1; |
621 | 563 | } |
622 | ||
623 | if (options.extensionMode == LambdaOptions::ExtensionMode::AUTO) | |
624 | { | |
625 | if (maxLen <= 100) | |
626 | { | |
627 | #if 0 // defined(SEQAN_SIMD_ENABLED) && defined(__AVX2__) | |
628 | options.extensionMode = LambdaOptions::ExtensionMode::FULL_SIMD; | |
629 | options.band = -1; | |
630 | #else | |
631 | options.extensionMode = LambdaOptions::ExtensionMode::FULL_SERIAL; | |
632 | #endif | |
633 | options.xDropOff = -1; | |
634 | options.filterPutativeAbundant = false; | |
635 | options.filterPutativeDuplicates = false; | |
636 | options.mergePutativeSiblings = false; | |
637 | } | |
638 | else | |
639 | { | |
640 | options.extensionMode = LambdaOptions::ExtensionMode::XDROP; | |
641 | } | |
642 | } | |
643 | ||
644 | 564 | return 0; |
645 | 565 | } |
646 | 566 | |
763 | 683 | |
764 | 684 | int64_t effectiveQBegin = m.qryStart; |
765 | 685 | int64_t effectiveSBegin = m.subjStart; |
766 | uint64_t actualLength = m.qryEnd - m.qryStart; | |
767 | uint64_t effectiveLength = std::max(static_cast<uint64_t>(lH.options.seedLength * lH.options.preScoring), | |
768 | actualLength); | |
769 | ||
770 | if (effectiveLength > actualLength) | |
686 | uint64_t effectiveLength = lH.options.seedLength * lH.options.preScoring; | |
687 | if (lH.options.preScoring > 1) | |
771 | 688 | { |
772 | 689 | effectiveQBegin -= (lH.options.preScoring - 1) * |
773 | actualLength / 2; | |
690 | lH.options.seedLength / 2; | |
774 | 691 | effectiveSBegin -= (lH.options.preScoring - 1) * |
775 | actualLength / 2; | |
692 | lH.options.seedLength / 2; | |
776 | 693 | // std::cout << effectiveQBegin << "\t" << effectiveSBegin << "\n"; |
777 | 694 | int64_t min = std::min(effectiveQBegin, effectiveSBegin); |
778 | 695 | if (min < 0) |
842 | 759 | - getSeqOffset(subjOcc) |
843 | 760 | - lH.options.seedLength); |
844 | 761 | |
845 | TMatch m {static_cast<typename TMatch::TQId>(lH.seedRefs[seedId]), | |
846 | static_cast<typename TMatch::TSId>(getSeqNo(subjOcc)), | |
847 | static_cast<typename TMatch::TPos>(lH.seedRanks[seedId] * lH.options.seedOffset), | |
848 | static_cast<typename TMatch::TPos>(lH.seedRanks[seedId] * lH.options.seedOffset + lH.options.seedLength), | |
849 | static_cast<typename TMatch::TPos>(getSeqOffset(subjOcc)), | |
850 | static_cast<typename TMatch::TPos>(getSeqOffset(subjOcc) + lH.options.seedLength)}; | |
762 | TMatch m{static_cast<typename TMatch::TQId>(lH.seedRefs[seedId]), | |
763 | static_cast<typename TMatch::TSId>(getSeqNo(subjOcc)), | |
764 | static_cast<typename TMatch::TPos>(lH.seedRanks[seedId] * lH.options.seedOffset), | |
765 | static_cast<typename TMatch::TPos>(getSeqOffset(subjOcc))}; | |
851 | 766 | |
852 | 767 | bool discarded = false; |
853 | 768 | auto const halfSubjL = lH.options.seedLength / 2; |
854 | 769 | |
855 | if (!sIsTranslated(TGlobalHolder::blastProgram)) | |
770 | if (!sIsTranslated(lH.gH.blastProgram)) | |
856 | 771 | { |
857 | 772 | for (unsigned k = 0; k < length(lH.gH.segIntStarts[m.subjId]); ++k) |
858 | 773 | { |
859 | 774 | // more than half of the seed falls into masked interval |
860 | 775 | if (intervalOverlap(m.subjStart, |
861 | m.subjEnd, | |
776 | m.subjStart + lH.options.seedLength, | |
862 | 777 | lH.gH.segIntStarts[m.subjId][k], |
863 | 778 | lH.gH.segIntEnds[m.subjId][k]) |
864 | 779 | >= halfSubjL) |
880 | 795 | lH.matches.emplace_back(m); |
881 | 796 | } |
882 | 797 | |
883 | template <typename TGlobalHolder, | |
884 | typename TScoreExtension, | |
885 | typename TSubjOcc> | |
886 | inline void | |
887 | onFindVariable(LocalDataHolder<TGlobalHolder, TScoreExtension> & lH, | |
888 | TSubjOcc subjOcc, | |
889 | typename TGlobalHolder::TMatch::TQId const seedId, | |
890 | typename TGlobalHolder::TMatch::TPos const seedBegin, | |
891 | typename TGlobalHolder::TMatch::TPos const seedLength) | |
892 | { | |
893 | using TMatch = typename TGlobalHolder::TMatch; | |
894 | if (TGlobalHolder::indexIsFM) // positions are reversed | |
895 | setSeqOffset(subjOcc, | |
896 | length(lH.gH.subjSeqs[getSeqNo(subjOcc)]) | |
897 | - getSeqOffset(subjOcc) | |
898 | - seedLength); | |
899 | ||
900 | TMatch m {seedId, | |
901 | static_cast<typename TGlobalHolder::TMatch::TSId>(getSeqNo(subjOcc)), | |
902 | seedBegin, | |
903 | static_cast<typename TGlobalHolder::TMatch::TPos>(seedBegin + seedLength), | |
904 | static_cast<typename TGlobalHolder::TMatch::TPos>(getSeqOffset(subjOcc)), | |
905 | static_cast<typename TGlobalHolder::TMatch::TPos>(getSeqOffset(subjOcc) + seedLength)}; | |
906 | ||
907 | if (!seedLooksPromising(lH, m)) | |
908 | ++lH.stats.hitsFailedPreExtendTest; | |
909 | else | |
910 | lH.matches.emplace_back(m); | |
911 | } | |
912 | ||
913 | 798 | // -------------------------------------------------------------------------- |
914 | 799 | // Function search() |
915 | 800 | // -------------------------------------------------------------------------- |
916 | ||
917 | //TODO experiment with tuned branch prediction | |
918 | ||
919 | template <typename TIndexIt, typename TNeedleIt, typename TLambda, typename TLambda2> | |
920 | inline void | |
921 | __goDownNoErrors(TIndexIt const & indexIt, | |
922 | TNeedleIt const & needleIt, | |
923 | TNeedleIt const & needleItEnd, | |
924 | TLambda & continRunnable, | |
925 | TLambda2 & reportRunnable) | |
926 | { | |
927 | TIndexIt nextIndexIt(indexIt); | |
928 | if ((needleIt != needleItEnd) && | |
929 | goDown(nextIndexIt, *needleIt) && | |
930 | continRunnable(indexIt, nextIndexIt)) | |
931 | { | |
932 | __goDownNoErrors(nextIndexIt, needleIt + 1, needleItEnd, continRunnable, reportRunnable); | |
933 | } else | |
934 | { | |
935 | reportRunnable(indexIt); | |
936 | } | |
937 | } | |
938 | ||
939 | template <typename TIndexIt, typename TNeedleIt, typename TLambda, typename TLambda2> | |
940 | inline void | |
941 | __goDownErrors(TIndexIt const & indexIt, | |
942 | TNeedleIt const & needleIt, | |
943 | TNeedleIt const & needleItEnd, | |
944 | TLambda & continRunnable, | |
945 | TLambda2 & reportRunnable) | |
946 | { | |
947 | using TAlph = typename Value<TNeedleIt>::Type; | |
948 | ||
949 | unsigned contin = 0; | |
950 | ||
951 | if (needleIt != needleItEnd) | |
952 | { | |
953 | for (unsigned i = 0; i < ValueSize<TAlph>::VALUE; ++i) | |
954 | { | |
955 | TIndexIt nextIndexIt(indexIt); | |
956 | if (goDown(nextIndexIt, static_cast<TAlph>(i)) && | |
957 | continRunnable(indexIt, nextIndexIt)) | |
958 | { | |
959 | ++contin; | |
960 | if (ordValue(*needleIt) == i) | |
961 | __goDownErrors(nextIndexIt, needleIt + 1, needleItEnd, continRunnable, reportRunnable); | |
962 | else | |
963 | __goDownNoErrors(nextIndexIt, needleIt + 1, needleItEnd, continRunnable, reportRunnable); | |
964 | } | |
965 | } | |
966 | } | |
967 | ||
968 | if (contin == 0) | |
969 | reportRunnable(indexIt); | |
970 | } | |
971 | ||
972 | template <typename TGlobalHolder, | |
973 | typename TScoreExtension> | |
974 | inline void | |
975 | __serachAdaptive(LocalDataHolder<TGlobalHolder, TScoreExtension> & lH, | |
976 | uint64_t const seedLength) | |
977 | { | |
978 | typedef typename Iterator<typename TGlobalHolder::TDbIndex, TopDown<> >::Type TIndexIt; | |
979 | ||
980 | // TODO optionize | |
981 | size_t constexpr seedHeurFactor = 10; | |
982 | size_t constexpr minResults = 1; | |
983 | ||
984 | size_t needlesSum = lH.gH.redQrySeqs.limits[lH.indexEndQry] - lH.gH.redQrySeqs.limits[lH.indexBeginQry]; | |
985 | // BROKEN:lengthSum(infix(lH.gH.redQrySeqs, lH.indexBeginQry, lH.indexEndQry)); | |
986 | // the above is faster anyway (but only works on concatdirect sets) | |
987 | ||
988 | size_t needlesPos = 0; | |
989 | ||
990 | TIndexIt root(lH.gH.dbIndex); | |
991 | TIndexIt indexIt = root; | |
992 | ||
993 | for (size_t i = lH.indexBeginQry; i < lH.indexEndQry; ++i) | |
994 | { | |
995 | for (size_t seedBegin = 0; /* below */; seedBegin += lH.options.seedOffset) | |
996 | { | |
997 | // skip protein 'X' or DNA 'N' | |
998 | while ((lH.gH.qrySeqs[i][seedBegin] == unknownValue<TransAlph<TGlobalHolder::blastProgram>>()) && | |
999 | (seedBegin <= length(lH.gH.redQrySeqs[i]) - seedLength)) | |
1000 | ++seedBegin; | |
1001 | ||
1002 | // termination criterion | |
1003 | if (seedBegin > length(lH.gH.redQrySeqs[i]) - seedLength) | |
1004 | break; | |
1005 | ||
1006 | indexIt = root; | |
1007 | ||
1008 | size_t desiredOccs = length(lH.matches) >= lH.options.maxMatches | |
1009 | ? minResults | |
1010 | : (lH.options.maxMatches - length(lH.matches)) * seedHeurFactor / | |
1011 | ((needlesSum - needlesPos - seedBegin) / lH.options.seedOffset); | |
1012 | ||
1013 | if (desiredOccs == 0) | |
1014 | desiredOccs = minResults; | |
1015 | ||
1016 | // go down seedOffset number of characters without errors | |
1017 | for (size_t k = 0; k < lH.options.seedOffset; ++k) | |
1018 | if (!goDown(indexIt, lH.gH.redQrySeqs[i][seedBegin + k])) | |
1019 | break; | |
1020 | // if unsuccessful, move to next seed | |
1021 | if (repLength(indexIt) != lH.options.seedOffset) | |
1022 | continue; | |
1023 | ||
1024 | auto continRunnable = [&seedLength, &desiredOccs] (TIndexIt const & prevIndexIt, TIndexIt const & indexIt) | |
1025 | { | |
1026 | // NON-ADAPTIVE | |
1027 | // return (repLength(indexIt) <= seedLength); | |
1028 | // ADAPTIVE SEEDING: | |
1029 | ||
1030 | // always continue if minimum seed length not reached | |
1031 | if (repLength(indexIt) <= seedLength) | |
1032 | return true; | |
1033 | ||
1034 | // always continue if it means not losing hits | |
1035 | if (countOccurrences(indexIt) == countOccurrences(prevIndexIt)) | |
1036 | return true; | |
1037 | ||
1038 | // do voodoo heuristics to see if this hit is too frequent | |
1039 | if (countOccurrences(indexIt) < desiredOccs) | |
1040 | return false; | |
1041 | ||
1042 | return true; | |
1043 | }; | |
1044 | ||
1045 | auto reportRunnable = [&seedLength, &lH, &i, &seedBegin] (TIndexIt const & indexIt) | |
1046 | { | |
1047 | if (repLength(indexIt) >= seedLength) | |
1048 | { | |
1049 | appendValue(lH.stats.seedLengths, repLength(indexIt)); | |
1050 | lH.stats.hitsAfterSeeding += countOccurrences(indexIt); | |
1051 | for (auto const & occ : getOccurrences(indexIt)) | |
1052 | onFindVariable(lH, occ, i, seedBegin, repLength(indexIt)); | |
1053 | } | |
1054 | }; | |
1055 | ||
1056 | __goDownErrors(indexIt, | |
1057 | begin(lH.gH.redQrySeqs[i], Standard()) + seedBegin + lH.options.seedOffset, | |
1058 | end(lH.gH.redQrySeqs[i], Standard()), | |
1059 | continRunnable, | |
1060 | reportRunnable); | |
1061 | } | |
1062 | ||
1063 | needlesPos += length(lH.gH.redQrySeqs[i]); | |
1064 | } | |
1065 | } | |
1066 | 801 | |
1067 | 802 | template <typename BackSpec, typename TLocalHolder> |
1068 | 803 | inline void |
1149 | 884 | inline void |
1150 | 885 | search(TLocalHolder & lH) |
1151 | 886 | { |
1152 | //TODO implement adaptive seeding with 0-n mismatches | |
1153 | 887 | if (lH.options.maxSeedDist == 0) |
1154 | 888 | __search<Backtracking<Exact>>(lH); |
1155 | else if (lH.options.adaptiveSeeding) | |
1156 | __serachAdaptive(lH, lH.options.seedLength); | |
889 | else if (lH.options.hammingOnly) | |
890 | __search<Backtracking<HammingDistance>>(lH); | |
1157 | 891 | else |
1158 | __search<Backtracking<HammingDistance>>(lH); | |
892 | #if 0 // reactivate if edit-distance seeding is readded | |
893 | __search<Backtracking<EditDistance>>(lH); | |
894 | #else | |
895 | return; | |
896 | #endif | |
1159 | 897 | } |
1160 | 898 | |
1161 | 899 | // -------------------------------------------------------------------------- |
1162 | 900 | // Function joinAndFilterMatches() |
1163 | 901 | // -------------------------------------------------------------------------- |
902 | ||
1164 | 903 | |
1165 | 904 | template <typename TLocalHolder> |
1166 | 905 | inline void |
1201 | 940 | } |
1202 | 941 | } |
1203 | 942 | |
1204 | // -------------------------------------------------------------------------- | |
1205 | // Function _setFrames() | |
1206 | // -------------------------------------------------------------------------- | |
1207 | ||
1208 | template <typename TBlastMatch, | |
1209 | typename TLocalHolder> | |
1210 | inline void | |
1211 | _setFrames(TBlastMatch & bm, | |
1212 | typename TLocalHolder::TMatch const & m, | |
1213 | TLocalHolder const & lH) | |
1214 | { | |
1215 | if (qIsTranslated(TLocalHolder::TGlobalHolder::blastProgram)) | |
1216 | { | |
1217 | bm.qFrameShift = (m.qryId % 3) + 1; | |
1218 | if (m.qryId % 6 > 2) | |
1219 | bm.qFrameShift = -bm.qFrameShift; | |
1220 | } else if (qHasRevComp(TLocalHolder::TGlobalHolder::blastProgram)) | |
1221 | { | |
1222 | bm.qFrameShift = 1; | |
1223 | if (m.qryId % 2) | |
1224 | bm.qFrameShift = -bm.qFrameShift; | |
1225 | } else | |
1226 | { | |
1227 | bm.qFrameShift = 0; | |
1228 | } | |
1229 | ||
1230 | if (sIsTranslated(TLocalHolder::TGlobalHolder::blastProgram)) | |
1231 | { | |
1232 | bm.sFrameShift = (m.subjId % 3) + 1; | |
1233 | if (m.subjId % 6 > 2) | |
1234 | bm.sFrameShift = -bm.sFrameShift; | |
1235 | } else if (sHasRevComp(TLocalHolder::TGlobalHolder::blastProgram)) | |
1236 | { | |
1237 | bm.sFrameShift = 1; | |
1238 | if (m.subjId % 2) | |
1239 | bm.sFrameShift = -bm.sFrameShift; | |
1240 | } else | |
1241 | { | |
1242 | bm.sFrameShift = 0; | |
1243 | } | |
1244 | } | |
1245 | ||
1246 | // -------------------------------------------------------------------------- | |
1247 | // Function _writeMatches() | |
1248 | // -------------------------------------------------------------------------- | |
1249 | ||
1250 | template <typename TBlastRecord, | |
1251 | typename TLocalHolder> | |
1252 | inline void | |
1253 | _writeRecord(TBlastRecord & record, | |
1254 | TLocalHolder & lH) | |
1255 | { | |
1256 | if (length(record.matches) > 0) | |
1257 | { | |
1258 | ++lH.stats.qrysWithHit; | |
1259 | // sort and remove duplicates -> STL, yeah! | |
1260 | auto const before = record.matches.size(); | |
1261 | record.matches.sort(); | |
1262 | if (!lH.options.filterPutativeDuplicates) | |
1263 | { | |
1264 | record.matches.unique(); | |
1265 | lH.stats.hitsDuplicate += before - record.matches.size(); | |
1266 | } | |
1267 | if (record.matches.size() > lH.options.maxMatches) | |
1268 | { | |
1269 | lH.stats.hitsAbundant += record.matches.size() - | |
1270 | lH.options.maxMatches; | |
1271 | record.matches.resize(lH.options.maxMatches); | |
1272 | } | |
1273 | lH.stats.hitsFinal += record.matches.size(); | |
1274 | ||
1275 | myWriteRecord(lH, record); | |
1276 | } | |
1277 | } | |
1278 | ||
1279 | // -------------------------------------------------------------------------- | |
1280 | // Function computeBlastMatch() | |
1281 | // -------------------------------------------------------------------------- | |
1282 | ||
1283 | 943 | template <typename TBlastMatch, |
1284 | 944 | typename TLocalHolder> |
1285 | 945 | inline int |
1303 | 963 | // bm.sEnd); |
1304 | 964 | |
1305 | 965 | // std::cout << "Query Id: " << m.qryId |
1306 | // << "\t TrueQryId: " << getTrueQryId(bm.m, lH.options, TGlobalHolder::blastProgram) | |
966 | // << "\t TrueQryId: " << getTrueQryId(bm.m, lH.options, lH.gH.blastProgram) | |
1307 | 967 | // << "\t length(qryIds): " << length(qryIds) |
1308 | 968 | // << "Subj Id: " << m.subjId |
1309 | // << "\t TrueSubjId: " << getTrueSubjId(bm.m, lH.options, TGlobalHolder::blastProgram) | |
969 | // << "\t TrueSubjId: " << getTrueSubjId(bm.m, lH.options, lH.gH.blastProgram) | |
1310 | 970 | // << "\t length(subjIds): " << length(subjIds) << "\n\n"; |
1311 | 971 | |
1312 | 972 | assignSource(bm.alignRow0, infix(lH.gH.qrySeqs[m.qryId], bm.qStart, bm.qEnd)); |
1608 | 1268 | // std::cout << "ALIGN BEFORE STATS:\n" << bm.align << "\n"; |
1609 | 1269 | |
1610 | 1270 | computeAlignmentStats(bm, context(lH.gH.outfile)); |
1271 | ||
1611 | 1272 | if (bm.alignStats.alignmentIdentity < lH.options.idCutOff) |
1612 | 1273 | return PERCENTIDENT; |
1613 | 1274 | |
1614 | 1275 | // const unsigned long qryLength = length(row0); |
1615 | 1276 | computeBitScore(bm, context(lH.gH.outfile)); |
1616 | 1277 | |
1617 | computeEValueThreadSafe(bm, context(lH.gH.outfile)); | |
1278 | // the length adjustment cache must not be written to by multiple threads | |
1279 | SEQAN_OMP_PRAGMA(critical(evalue_length_adj_cache)) | |
1280 | { | |
1281 | computeEValue(bm, context(lH.gH.outfile)); | |
1282 | } | |
1283 | ||
1618 | 1284 | if (bm.eValue > lH.options.eCutOff) |
1285 | { | |
1619 | 1286 | return EVALUE; |
1620 | ||
1621 | _setFrames(bm, m, lH); | |
1287 | } | |
1288 | ||
1289 | if (qIsTranslated(TLocalHolder::TGlobalHolder::blastProgram)) | |
1290 | { | |
1291 | bm.qFrameShift = (m.qryId % 3) + 1; | |
1292 | if (m.qryId % 6 > 2) | |
1293 | bm.qFrameShift = -bm.qFrameShift; | |
1294 | } else if (qHasRevComp(TLocalHolder::TGlobalHolder::blastProgram)) | |
1295 | { | |
1296 | bm.qFrameShift = 1; | |
1297 | if (m.qryId % 2) | |
1298 | bm.qFrameShift = -bm.qFrameShift; | |
1299 | } else | |
1300 | { | |
1301 | bm.qFrameShift = 0; | |
1302 | } | |
1303 | ||
1304 | if (sIsTranslated(TLocalHolder::TGlobalHolder::blastProgram)) | |
1305 | { | |
1306 | bm.sFrameShift = (m.subjId % 3) + 1; | |
1307 | if (m.subjId % 6 > 2) | |
1308 | bm.sFrameShift = -bm.sFrameShift; | |
1309 | } else if (sHasRevComp(TLocalHolder::TGlobalHolder::blastProgram)) | |
1310 | { | |
1311 | bm.sFrameShift = 1; | |
1312 | if (m.subjId % 2) | |
1313 | bm.sFrameShift = -bm.sFrameShift; | |
1314 | } else | |
1315 | { | |
1316 | bm.sFrameShift = 0; | |
1317 | } | |
1622 | 1318 | |
1623 | 1319 | return 0; |
1624 | 1320 | } |
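The frame assignment inlined above (replacing `_setFrames`) derives a signed reading frame purely from the sequence id. A minimal standalone sketch of that arithmetic follows, assuming translated sequences are stored six per original sequence with ids 0..2 forward and 3..5 reverse, and reverse-complemented sequences two per original. The helper names `translatedFrame`, `revCompFrame` and `trueId` are hypothetical and not part of the code base.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper mirroring the qIsTranslated / sIsTranslated branch:
// ids are assumed to be laid out as 6 consecutive frames per original sequence.
inline int translatedFrame(std::uint64_t id)
{
    int frame = static_cast<int>(id % 3) + 1;  // 1, 2 or 3
    if (id % 6 > 2)                            // second half of the six: reverse strand
        frame = -frame;
    return frame;
}

// Hypothetical helper mirroring the qHasRevComp / sHasRevComp branch:
// even ids are forward (+1), odd ids are the reverse complement (-1).
inline int revCompFrame(std::uint64_t id)
{
    return (id % 2) ? -1 : 1;
}

// Index of the original (untranslated) sequence, as in
// `it->qryId / qNumFrames(...)` in the diff.
inline std::uint64_t trueId(std::uint64_t id, std::uint64_t numFrames)
{
    assert(numFrames > 0);
    return id / numFrames;
}
```

With this layout, id 10 of a translated set belongs to original sequence 1 and maps to frame -2.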
1626 | 1322 | |
1627 | 1323 | template <typename TLocalHolder> |
1628 | 1324 | inline int |
1629 | iterateMatchesExtend(TLocalHolder & lH) | |
1325 | iterateMatches(TLocalHolder & lH) | |
1630 | 1326 | { |
1631 | 1327 | using TGlobalHolder = typename TLocalHolder::TGlobalHolder; |
1632 | 1328 | // using TMatch = typename TGlobalHolder::TMatch; |
1642 | 1338 | using TBlastRecord = BlastRecord<TBlastMatch>; |
1643 | 1339 | |
1644 | 1340 | // constexpr TPos TPosMax = std::numeric_limits<TPos>::max(); |
1645 | // constexpr uint8_t qFactor = qHasRevComp(TGlobalHolder::blastProgram) ? 3 : 1; | |
1646 | // constexpr uint8_t sFactor = sHasRevComp(TGlobalHolder::blastProgram) ? 3 : 1; | |
1341 | // constexpr uint8_t qFactor = qHasRevComp(lH.gH.blastProgram) ? 3 : 1; | |
1342 | // constexpr uint8_t sFactor = sHasRevComp(lH.gH.blastProgram) ? 3 : 1; | |
1647 | 1343 | |
1648 | 1344 | double start = sysTime(); |
1649 | 1345 | if (lH.options.doubleIndexing) |
1657 | 1353 | // std::cout << "Length of matches: " << length(lH.matches); |
1658 | 1354 | // for (auto const & m : lH.matches) |
1659 | 1355 | // { |
1660 | // std::cout << m.qryId << "\t" << getTrueQryId(m,lH.options, TGlobalHolder::blastProgram) << "\n"; | |
1356 | // std::cout << m.qryId << "\t" << getTrueQryId(m,lH.options, lH.gH.blastProgram) << "\n"; | |
1661 | 1357 | // } |
1662 | 1358 | |
1663 | 1359 | // double topMaxMatchesMedianBitScore = 0; |
1670 | 1366 | ++it) |
1671 | 1367 | { |
1672 | 1368 | itN = std::next(it,1); |
1673 | auto const trueQryId = it->qryId / qNumFrames(TGlobalHolder::blastProgram); | |
1369 | auto const trueQryId = it->qryId / qNumFrames(lH.gH.blastProgram); | |
1674 | 1370 | |
1675 | 1371 | TBlastRecord record(lH.gH.qryIds[trueQryId]); |
1676 | 1372 | |
1677 | record.qLength = (qIsTranslated(TGlobalHolder::blastProgram) | |
1373 | record.qLength = (qIsTranslated(lH.gH.blastProgram) | |
1678 | 1374 | ? lH.gH.untransQrySeqLengths[trueQryId] |
1679 | 1375 | : length(lH.gH.qrySeqs[it->qryId])); |
1680 | 1376 | |
1683 | 1379 | // inner loop over matches per record |
1684 | 1380 | for (; it != itEnd; ++it) |
1685 | 1381 | { |
1686 | auto const trueSubjId = it->subjId / sNumFrames(TGlobalHolder::blastProgram); | |
1382 | auto const trueSubjId = it->subjId / sNumFrames(lH.gH.blastProgram); | |
1687 | 1383 | itN = std::next(it,1); |
1688 | 1384 | // std::cout << "FOO\n" << std::flush; |
1689 | 1385 | // std::cout << "QryStart: " << it->qryStart << "\n" << std::flush; |
1735 | 1431 | { |
1736 | 1432 | // declare all the rest as putative abundant |
1737 | 1433 | while ((it != itEnd) && |
1738 | (trueQryId == it->qryId / qNumFrames(TGlobalHolder::blastProgram))) | |
1434 | (trueQryId == it->qryId / qNumFrames(lH.gH.blastProgram))) | |
1739 | 1435 | { |
1740 | 1436 | // not already marked as abundant, duplicate or merged |
1741 | 1437 | if (!isSetToSkip(*it)) |
1757 | 1453 | auto & bm = back(record.matches); |
1758 | 1454 | |
1759 | 1455 | bm.qStart = it->qryStart; |
1760 | bm.qEnd = it->qryEnd; // it->qryStart + lH.options.seedLength; | |
1456 | bm.qEnd = it->qryStart + lH.options.seedLength; | |
1761 | 1457 | bm.sStart = it->subjStart; |
1762 | bm.sEnd = it->subjEnd;//it->subjStart + lH.options.seedLength; | |
1458 | bm.sEnd = it->subjStart + lH.options.seedLength; | |
1763 | 1459 | |
1764 | 1460 | bm.qLength = record.qLength; |
1765 | bm.sLength = sIsTranslated(TGlobalHolder::blastProgram) | |
1461 | bm.sLength = sIsTranslated(lH.gH.blastProgram) | |
1766 | 1462 | ? lH.gH.untransSubjSeqLengths[trueSubjId] |
1767 | 1463 | : length(lH.gH.subjSeqs[it->subjId]); |
1768 | 1464 | |
1769 | 1465 | // MERGE PUTATIVE SIBLINGS INTO THIS MATCH |
1770 | if (lH.options.mergePutativeSiblings) | |
1466 | for (auto it2 = itN; | |
1467 | (it2 != itEnd) && | |
1468 | (trueQryId == it2->qryId / qNumFrames(lH.gH.blastProgram)) && | |
1469 | (trueSubjId == it2->subjId / sNumFrames(lH.gH.blastProgram)); | |
1470 | ++it2) | |
1771 | 1471 | { |
1772 | for (auto it2 = itN; | |
1773 | (it2 != itEnd) && | |
1774 | (trueQryId == it2->qryId / qNumFrames(TGlobalHolder::blastProgram)) && | |
1775 | (trueSubjId == it2->subjId / sNumFrames(TGlobalHolder::blastProgram)); | |
1776 | ++it2) | |
1472 | // same frame | |
1473 | if ((it->qryId % qNumFrames(lH.gH.blastProgram) == it2->qryId % qNumFrames(lH.gH.blastProgram)) && | |
1474 | (it->subjId % sNumFrames(lH.gH.blastProgram) == it2->subjId % sNumFrames(lH.gH.blastProgram))) | |
1777 | 1475 | { |
1778 | // same frame | |
1779 | if ((it->qryId % qNumFrames(TGlobalHolder::blastProgram) == it2->qryId % qNumFrames(TGlobalHolder::blastProgram)) && | |
1780 | (it->subjId % sNumFrames(TGlobalHolder::blastProgram) == it2->subjId % sNumFrames(TGlobalHolder::blastProgram))) | |
1476 | ||
1477 | // TPos const qDist = (it2->qryStart >= bm.qEnd) | |
1478 | // ? it2->qryStart - bm.qEnd // upstream | |
1479 | // : 0; // overlap | |
1480 | // | |
1481 | // TPos sDist = TPosMax; // subj match region downstream of *it | |
1482 | // if (it2->subjStart >= bm.sEnd) // upstream | |
1483 | // sDist = it2->subjStart - bm.sEnd; | |
1484 | // else if (it2->subjStart >= it->subjStart) // overlap | |
1485 | // sDist = 0; | |
1486 | ||
1487 | // due to sorting it2->qryStart never <= it->qStart | |
1488 | // so subject sequences must have same order | |
1489 | if (it2->subjStart < it->subjStart) | |
1490 | continue; | |
1491 | ||
1492 | long const qDist = it2->qryStart - bm.qEnd; | |
1493 | long const sDist = it2->subjStart - bm.sEnd; | |
1494 | ||
1495 | if ((qDist == sDist) && | |
1496 | (qDist <= (long)lH.options.seedGravity)) | |
1781 | 1497 | { |
1782 | ||
1783 | // TPos const qDist = (it2->qryStart >= bm.qEnd) | |
1784 | // ? it2->qryStart - bm.qEnd // upstream | |
1785 | // : 0; // overlap | |
1786 | // | |
1787 | // TPos sDist = TPosMax; // subj match region downstream of *it | |
1788 | // if (it2->subjStart >= bm.sEnd) // upstream | |
1789 | // sDist = it2->subjStart - bm.sEnd; | |
1790 | // else if (it2->subjStart >= it->subjStart) // overlap | |
1791 | // sDist = 0; | |
1792 | ||
1793 | // due to sorting it2->qryStart never <= it->qStart | |
1794 | // so subject sequences must have same order | |
1795 | if (it2->subjStart < it->subjStart) | |
1796 | continue; | |
1797 | ||
1798 | long const qDist = it2->qryStart - bm.qEnd; | |
1799 | long const sDist = it2->subjStart - bm.sEnd; | |
1800 | ||
1801 | if ((qDist == sDist) && | |
1802 | (qDist <= (long)lH.options.seedGravity)) | |
1803 | { | |
1804 | bm.qEnd = std::max(bm.qEnd, static_cast<TBlastPos>(it2->qryEnd)); | |
1805 | bm.sEnd = std::max(bm.sEnd, static_cast<TBlastPos>(it2->subjEnd)); | |
1806 | ++lH.stats.hitsMerged; | |
1807 | ||
1808 | setToSkip(*it2); | |
1809 | } | |
1498 | bm.qEnd = std::max(bm.qEnd, | |
1499 | static_cast<TBlastPos>(it2->qryStart | |
1500 | + lH.options.seedLength)); | |
1501 | bm.sEnd = std::max(bm.sEnd, | |
1502 | static_cast<TBlastPos>(it2->subjStart | |
1503 | + lH.options.seedLength)); | |
1504 | ++lH.stats.hitsMerged; | |
1505 | ||
1506 | setToSkip(*it2); | |
1810 | 1507 | } |
1811 | 1508 | } |
1812 | 1509 | } |
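The merge loop above joins a later seed into the current match when both lie on the same diagonal (`qDist == sDist`) and the gap does not exceed `seedGravity`. The sketch below isolates that rule for fixed-length seeds, as in the new side of the diff. `Seed`, `Region`, `regionFromSeed` and `tryMerge` are hypothetical stand-ins, not types from the code base, and the frame comparison is left out for brevity.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-ins for a seed hit and the growing match region.
struct Seed   { long qryStart; long subjStart; };
struct Region { long qStart; long qEnd; long sStart; long sEnd; };

// A seed covers [start, start + seedLength) on both sequences.
inline Region regionFromSeed(Seed const & s, long seedLength)
{
    assert(seedLength > 0);
    return Region{s.qryStart, s.qryStart + seedLength,
                  s.subjStart, s.subjStart + seedLength};
}

// Returns true and extends `r` if `next` is on the same diagonal and
// close enough; otherwise leaves `r` untouched.
inline bool tryMerge(Region & r, Seed const & next, long seedLength, long seedGravity)
{
    if (next.subjStart < r.sStart)              // subject order must match query order
        return false;

    long const qDist = next.qryStart  - r.qEnd; // negative means overlap
    long const sDist = next.subjStart - r.sEnd;

    if (qDist != sDist || qDist > seedGravity)
        return false;

    r.qEnd = std::max(r.qEnd, next.qryStart  + seedLength);
    r.sEnd = std::max(r.sEnd, next.subjStart + seedLength);
    return true;
}
```

Note that overlapping seeds yield a negative distance and are still merged, which matches the signed `long` arithmetic used in the diff.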
1820 | 1517 | // ++lH.stats.goodMatches; |
1821 | 1518 | if (lH.options.outFileFormat > 0) |
1822 | 1519 | { |
1823 | bm._n_qId = it->qryId / qNumFrames(TGlobalHolder::blastProgram); | |
1824 | bm._n_sId = it->subjId / sNumFrames(TGlobalHolder::blastProgram); | |
1520 | bm._n_qId = it->qryId / qNumFrames(lH.gH.blastProgram); | |
1521 | bm._n_sId = it->subjId / sNumFrames(lH.gH.blastProgram); | |
1825 | 1522 | } |
1826 | 1523 | break; |
1827 | 1524 | case EVALUE: |
1839 | 1536 | << "subjId: " << it->subjId << "\t" |
1840 | 1537 | << "seed qry: " << infix(lH.gH.redQrySeqs, |
1841 | 1538 | it->qryStart, |
1842 | it->qryEnd) | |
1843 | // it->qryStart + lH.options.seedLength) | |
1539 | it->qryStart + lH.options.seedLength) | |
1844 | 1540 | << "\n subj: " << infix(lH.gH.redSubjSeqs, |
1845 | 1541 | it->subjStart, |
1846 | it->subjEnd) | |
1847 | // it->subjStart + lH.options.seedLength) | |
1542 | it->subjStart + lH.options.seedLength) | |
1848 | 1543 | << "\nunred qry: " << infix(lH.gH.qrySeqs, |
1849 | 1544 | it->qryStart, |
1850 | it->qryEnd) | |
1851 | // it->qryStart + lH.options.seedLength) | |
1545 | it->qryStart + lH.options.seedLength) | |
1852 | 1546 | << "\n subj: " << infix(lH.gH.subjSeqs, |
1853 | 1547 | it->subjStart, |
1854 | it->subjEnd) | |
1855 | // it->subjStart + lH.options.seedLength) | |
1548 | it->subjStart + lH.options.seedLength) | |
1856 | 1549 | << "\nmatch qry: " << infix(lH.gH.qrySeqs, |
1857 | 1550 | bm.qStart, |
1858 | 1551 | bm.qEnd) |
1873 | 1566 | // PUTATIVE DUPLICATES CHECK |
1874 | 1567 | for (auto it2 = itN; |
1875 | 1568 | (it2 != itEnd) && |
1876 | (trueQryId == it2->qryId / qNumFrames(TGlobalHolder::blastProgram)) && | |
1877 | (trueSubjId == it2->subjId / sNumFrames(TGlobalHolder::blastProgram)); | |
1569 | (trueQryId == it2->qryId / qNumFrames(lH.gH.blastProgram)) && | |
1570 | (trueSubjId == it2->subjId / sNumFrames(lH.gH.blastProgram)); | |
1878 | 1571 | ++it2) |
1879 | 1572 | { |
1880 | 1573 | // same frame and same range |
1881 | 1574 | if ((it->qryId == it2->qryId) && |
1882 | 1575 | (it->subjId == it2->subjId) && |
1883 | 1576 | (intervalOverlap(it2->qryStart, |
1884 | it2->qryEnd, | |
1885 | // it2->qryStart + lH.options.seedLength, | |
1577 | it2->qryStart + lH.options.seedLength, | |
1886 | 1578 | bm.qStart, |
1887 | 1579 | bm.qEnd) > 0) && |
1888 | 1580 | (intervalOverlap(it2->subjStart, |
1889 | it2->subjEnd, | |
1890 | // it2->subjStart + lH.options.seedLength, | |
1581 | it2->subjStart + lH.options.seedLength, | |
1891 | 1582 | bm.sStart, |
1892 | 1583 | bm.sEnd) > 0)) |
1893 | 1584 | { |
1912 | 1603 | |
1913 | 1604 | // last item or new TrueQryId |
1914 | 1605 | if ((itN == itEnd) || |
1915 | (trueQryId != itN->qryId / qNumFrames(TGlobalHolder::blastProgram))) | |
1606 | (trueQryId != itN->qryId / qNumFrames(lH.gH.blastProgram))) | |
1916 | 1607 | break; |
1917 | 1608 | } |
1918 | 1609 | |
1919 | _writeRecord(record, lH); | |
1610 | if (length(record.matches) > 0) | |
1611 | { | |
1612 | ++lH.stats.qrysWithHit; | |
1613 | // sort and remove duplicates -> STL, yeah! | |
1614 | auto const before = record.matches.size(); | |
1615 | record.matches.sort(); | |
1616 | if (!lH.options.filterPutativeDuplicates) | |
1617 | { | |
1618 | record.matches.unique(); | |
1619 | lH.stats.hitsDuplicate += before - record.matches.size(); | |
1620 | } | |
1621 | if (record.matches.size() > lH.options.maxMatches) | |
1622 | { | |
1623 | lH.stats.hitsAbundant += record.matches.size() - | |
1624 | lH.options.maxMatches; | |
1625 | record.matches.resize(lH.options.maxMatches); | |
1626 | } | |
1627 | lH.stats.hitsFinal += record.matches.size(); | |
1628 | ||
1629 | myWriteRecord(lH, record); | |
1630 | } | |
1631 | ||
1920 | 1632 | } |
1921 | 1633 | |
1922 | 1634 | if (lH.options.doubleIndexing) |
1931 | 1643 | return 0; |
1932 | 1644 | } |
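The block added before `myWriteRecord` sorts the matches of a record, optionally removes exact duplicates, and truncates the list to `maxMatches` while updating the statistics. A minimal sketch of that sequence on a `std::list`, using plain `int` values as stand-ins for matches (smaller sorts first) and a hypothetical `HitStats` struct in place of `lH.stats`:

```cpp
#include <cassert>
#include <cstddef>
#include <list>

// Hypothetical counters, loosely modelled on hitsDuplicate / hitsAbundant / hitsFinal.
struct HitStats
{
    std::size_t duplicate = 0;
    std::size_t abundant  = 0;
    std::size_t kept      = 0;
};

// `alreadyFiltered` plays the role of options.filterPutativeDuplicates:
// unique() only runs when duplicates were not filtered earlier.
inline void finalizeMatches(std::list<int> & matches,
                            std::size_t maxMatches,
                            bool alreadyFiltered,
                            HitStats & stats)
{
    auto const before = matches.size();
    matches.sort();                        // best matches first

    if (!alreadyFiltered)
    {
        matches.unique();                  // removes adjacent equal elements
        stats.duplicate += before - matches.size();
    }

    if (matches.size() > maxMatches)
    {
        stats.abundant += matches.size() - maxMatches;
        matches.resize(maxMatches);        // drops the tail of the sorted list
    }

    stats.kept += matches.size();
    assert(matches.size() <= maxMatches);
}
```

`std::list::unique` only collapses neighbouring duplicates, which is why the sort has to come first.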
1933 | 1645 | |
1934 | #ifdef SEQAN_SIMD_ENABLED | |
1935 | template <typename TLocalHolder> | |
1936 | inline int | |
1937 | iterateMatchesFullSimd(TLocalHolder & lH) | |
1938 | { | |
1939 | using TGlobalHolder = typename TLocalHolder::TGlobalHolder; | |
1940 | using TMatch = typename TGlobalHolder::TMatch; | |
1941 | using TPos = typename TMatch::TPos; | |
1942 | using TBlastPos = uint32_t; //TODO why can't this be == TPos | |
1943 | using TBlastMatch = BlastMatch< | |
1944 | typename TLocalHolder::TAlignRow0, | |
1945 | typename TLocalHolder::TAlignRow1, | |
1946 | TBlastPos, | |
1947 | typename Value<typename TGlobalHolder::TQryIds>::Type,// const &, | |
1948 | typename Value<typename TGlobalHolder::TSubjIds>::Type// const &, | |
1949 | >; | |
1950 | using TBlastRecord = BlastRecord<TBlastMatch>; | |
1951 | ||
1952 | typedef FreeEndGaps_<True, True, True, True> TFreeEndGaps; | |
1953 | typedef AlignConfig2<LocalAlignment_<>, | |
1954 | DPBandConfig<BandOff>, | |
1955 | TFreeEndGaps, | |
1956 | TracebackOn<TracebackConfig_<CompleteTrace, GapsLeft> > > TAlignConfig; | |
1957 | ||
1958 | typedef int TScoreValue; //TODO don't hardcode | |
1959 | typedef typename Size<typename TLocalHolder::TAlignRow0>::Type TSize; | |
1960 | typedef TraceSegment_<TPos, TSize> TTraceSegment; | |
1961 | ||
1962 | typedef typename SimdVector<int16_t>::Type TSimdAlign; | |
1963 | ||
1964 | unsigned const numAlignments = length(lH.matches); | |
1965 | unsigned const sizeBatch = LENGTH<TSimdAlign>::VALUE; | |
1966 | unsigned const fullSize = sizeBatch * ((numAlignments + sizeBatch - 1) / sizeBatch); | |
1967 | ||
1968 | String<TScoreValue> results; | |
1969 | resize(results, numAlignments); | |
1970 | ||
1971 | // Create a SIMD scoring scheme. | |
1972 | Score<TSimdAlign, ScoreSimdWrapper<typename TGlobalHolder::TScoreScheme> > simdScoringScheme(seqanScheme(context(lH.gH.outfile).scoringScheme)); | |
1973 | ||
1974 | // Prepare string sets with sequences. | |
1975 | StringSet<typename Source<typename TLocalHolder::TAlignRow0>::Type, Dependent<> > depSetH; | |
1976 | StringSet<typename Source<typename TLocalHolder::TAlignRow1>::Type, Dependent<> > depSetV; | |
1977 | reserve(depSetH, fullSize); | |
1978 | reserve(depSetV, fullSize); | |
1979 | ||
1980 | ||
1981 | auto const trueQryId = lH.matches[0].qryId / qNumFrames(TGlobalHolder::blastProgram); | |
1982 | ||
1983 | TBlastRecord record(lH.gH.qryIds[trueQryId]); | |
1984 | record.qLength = (qIsTranslated(TGlobalHolder::blastProgram) | |
1985 | ? lH.gH.untransQrySeqLengths[trueQryId] | |
1986 | : length(lH.gH.qrySeqs[lH.matches[0].qryId])); | |
1987 | ||
1988 | size_t maxDist = 0; | |
1989 | switch (lH.options.band) | |
1990 | { | |
1991 | case -3: maxDist = ceil(log2(record.qLength)); break; | |
1992 | case -2: maxDist = floor(sqrt(record.qLength)); break; | |
1993 | case -1: break; | |
1994 | default: maxDist = lH.options.band; break; | |
1995 | } | |
1996 | ||
1997 | TAlignConfig config;//(-maxDist, maxDist); | |
1998 | ||
1999 | // create blast matches | |
2000 | for (auto it = lH.matches.begin(), itEnd = lH.matches.end(); it != itEnd; ++it) | |
2001 | { | |
2002 | auto const trueSubjId = it->subjId / sNumFrames(TGlobalHolder::blastProgram); | |
2003 | ||
2004 | // create blastmatch in list without copy or move | |
2005 | record.matches.emplace_back(lH.gH.qryIds [trueQryId], | |
2006 | lH.gH.subjIds[trueSubjId]); | |
2007 | ||
2008 | auto & bm = back(record.matches); | |
2009 | auto & m = *it; | |
2010 | ||
2011 | bm.qLength = record.qLength; | |
2012 | bm.sLength = sIsTranslated(TGlobalHolder::blastProgram) | |
2013 | ? lH.gH.untransSubjSeqLengths[trueSubjId] | |
2014 | : length(lH.gH.subjSeqs[it->subjId]); | |
2015 | ||
2016 | long lenDiff = (long)it->subjStart - (long)it->qryStart; | |
2017 | ||
2018 | TPos sStart; | |
2019 | TPos qStart; | |
2020 | if (lenDiff >= 0) | |
2021 | { | |
2022 | sStart = lenDiff; | |
2023 | qStart = 0; | |
2024 | } | |
2025 | else | |
2026 | { | |
2027 | sStart = 0; | |
2028 | qStart = -lenDiff; | |
2029 | } | |
2030 | TPos sEnd = std::min(sStart + length(lH.gH.qrySeqs[it->qryId]), length(lH.gH.subjSeqs[it->subjId])); | |
2031 | ||
2032 | assignSource(bm.alignRow0, infix(lH.gH.qrySeqs[it->qryId], qStart, length(lH.gH.qrySeqs[it->qryId]))); | |
2033 | assignSource(bm.alignRow1, infix(lH.gH.subjSeqs[it->subjId], sStart, sEnd)); | |
2034 | ||
2035 | ||
2036 | appendValue(depSetH, source(bm.alignRow0)); | |
2037 | appendValue(depSetV, source(bm.alignRow1)); | |
2038 | ||
2039 | _setFrames(bm, *it, lH); | |
2040 | } | |
2041 | ||
2042 | // fill up last batch | |
2043 | for (size_t i = numAlignments; i < fullSize; ++i) | |
2044 | { | |
2045 | appendValue(depSetH, source(back(record.matches).alignRow0)); | |
2046 | appendValue(depSetV, source(back(record.matches).alignRow1)); | |
2047 | } | |
2048 | ||
2049 | // Run alignments in batches. | |
2050 | auto matchIt = record.matches.begin(); | |
2051 | for (auto pos = 0u; pos < fullSize; pos += sizeBatch) | |
2052 | { | |
2053 | auto infSetH = infixWithLength(depSetH, pos, sizeBatch); | |
2054 | auto infSetV = infixWithLength(depSetV, pos, sizeBatch); | |
2055 | ||
2056 | TSimdAlign resultsBatch; | |
2057 | ||
2058 | StringSet<String<TTraceSegment> > trace; | |
2059 | resize(trace, sizeBatch, Exact()); | |
2060 | ||
2061 | _prepareAndRunSimdAlignment(resultsBatch, trace, infSetH, infSetV, simdScoringScheme, config, typename TLocalHolder::TScoreExtension()); | |
2062 | ||
2063 | // copy results and finish traceback | |
2064 | // TODO(rrahn): Could be parallelized! | |
2065 | // to for_each call | |
2066 | for(auto x = pos; x < pos + sizeBatch && x < numAlignments; ++x) | |
2067 | { | |
2068 | results[x] = resultsBatch[x - pos]; | |
2069 | _adaptTraceSegmentsTo(matchIt->alignRow0, matchIt->alignRow1, trace[x - pos]); | |
2070 | ++matchIt; | |
2071 | } | |
2072 | } | |
2073 | ||
2074 | // TODO share this code with above function | |
2075 | for (auto it = record.matches.begin(), itEnd = record.matches.end(); it != itEnd; /*below*/) | |
2076 | { | |
2077 | TBlastMatch & bm = *it; | |
2078 | ||
2079 | bm.sStart = beginPosition(bm.alignRow1); | |
2080 | bm.qStart = beginPosition(bm.alignRow0); | |
2081 | bm.sEnd = endPosition(bm.alignRow1); | |
2082 | bm.qEnd = endPosition(bm.alignRow0); | |
2083 | ||
2084 | computeAlignmentStats(bm, context(lH.gH.outfile)); | |
2085 | ||
2086 | if (bm.alignStats.alignmentIdentity < lH.options.idCutOff) | |
2087 | { | |
2088 | ++lH.stats.hitsFailedExtendPercentIdentTest; | |
2089 | it = record.matches.erase(it); | |
2090 | continue; | |
2091 | } | |
2092 | ||
2093 | computeBitScore(bm, context(lH.gH.outfile)); | |
2094 | ||
2095 | computeEValueThreadSafe(bm, context(lH.gH.outfile)); | |
2096 | ||
2097 | if (bm.eValue > lH.options.eCutOff) | |
2098 | { | |
2099 | ++lH.stats.hitsFailedExtendEValueTest; | |
2100 | it = record.matches.erase(it); | |
2101 | continue; | |
2102 | } | |
2103 | ||
2104 | ++it; | |
2105 | } | |
2106 | ||
2107 | _writeRecord(record, lH); | |
2108 | ||
2109 | return 0; | |
2110 | } | |
2111 | ||
2112 | #endif // SEQAN_SIMD_ENABLED | |
2113 | ||
2114 | template <typename TLocalHolder> | |
2115 | inline int | |
2116 | iterateMatchesFullSerial(TLocalHolder & lH) | |
2117 | { | |
2118 | using TGlobalHolder = typename TLocalHolder::TGlobalHolder; | |
2119 | using TMatch = typename TGlobalHolder::TMatch; | |
2120 | using TPos = typename TMatch::TPos; | |
2121 | using TBlastPos = uint32_t; //TODO why can't this be == TPos | |
2122 | using TBlastMatch = BlastMatch< | |
2123 | typename TLocalHolder::TAlignRow0, | |
2124 | typename TLocalHolder::TAlignRow1, | |
2125 | TBlastPos, | |
2126 | typename Value<typename TGlobalHolder::TQryIds>::Type,// const &, | |
2127 | typename Value<typename TGlobalHolder::TSubjIds>::Type// const &, | |
2128 | >; | |
2129 | using TBlastRecord = BlastRecord<TBlastMatch>; | |
2130 | ||
2131 | auto const trueQryId = lH.matches[0].qryId / qNumFrames(TGlobalHolder::blastProgram); | |
2132 | ||
2133 | TBlastRecord record(lH.gH.qryIds[trueQryId]); | |
2134 | record.qLength = (qIsTranslated(TGlobalHolder::blastProgram) | |
2135 | ? lH.gH.untransQrySeqLengths[trueQryId] | |
2136 | : length(lH.gH.qrySeqs[lH.matches[0].qryId])); | |
2137 | ||
2138 | unsigned maxDist = 0; | |
2139 | switch (lH.options.band) | |
2140 | { | |
2141 | case -3: maxDist = ceil(log2(record.qLength)); break; | |
2142 | case -2: maxDist = floor(sqrt(record.qLength)); break; | |
2143 | case -1: break; | |
2144 | default: maxDist = lH.options.band; break; | |
2145 | } | |
2146 | ||
2147 | // create blast matches | |
2148 | for (auto it = lH.matches.begin(), itEnd = lH.matches.end(); it != itEnd; ++it) | |
2149 | { | |
2150 | auto const trueSubjId = it->subjId / sNumFrames(TGlobalHolder::blastProgram); | |
2151 | ||
2152 | // create blastmatch in list without copy or move | |
2153 | record.matches.emplace_back(lH.gH.qryIds [trueQryId], | |
2154 | lH.gH.subjIds[trueSubjId]); | |
2155 | ||
2156 | auto & bm = back(record.matches); | |
2157 | auto & m = *it; | |
2158 | ||
2159 | bm.qLength = record.qLength; | |
2160 | bm.sLength = sIsTranslated(TGlobalHolder::blastProgram) | |
2161 | ? lH.gH.untransSubjSeqLengths[trueSubjId] | |
2162 | : length(lH.gH.subjSeqs[it->subjId]); | |
2163 | ||
2164 | long lenDiff = (long)it->subjStart - (long)it->qryStart; | |
2165 | ||
2166 | TPos sStart; | |
2167 | TPos qStart; | |
2168 | if (lenDiff >= 0) | |
2169 | { | |
2170 | sStart = lenDiff; | |
2171 | qStart = 0; | |
2172 | } | |
2173 | else | |
2174 | { | |
2175 | sStart = 0; | |
2176 | qStart = -lenDiff; | |
2177 | } | |
2178 | TPos sEnd = std::min(sStart + length(lH.gH.qrySeqs[it->qryId]), length(lH.gH.subjSeqs[it->subjId])); | |
2179 | ||
2180 | assignSource(bm.alignRow0, infix(lH.gH.qrySeqs[it->qryId], qStart, length(lH.gH.qrySeqs[it->qryId]))); | |
2181 | assignSource(bm.alignRow1, infix(lH.gH.subjSeqs[it->subjId], sStart, sEnd)); | |
2182 | ||
2183 | // localAlignment2(bm.alignRow0, | |
2184 | // bm.alignRow1, | |
2185 | // seqanScheme(context(lH.gH.outfile).scoringScheme), | |
2186 | // -maxDist, | |
2187 | // maxDist, | |
2188 | // lH.alignContext); | |
2189 | localAlignment(bm.alignRow0, | |
2190 | bm.alignRow1, | |
2191 | seqanScheme(context(lH.gH.outfile).scoringScheme), | |
2192 | -maxDist, | |
2193 | maxDist); | |
2194 | ||
2195 | bm.sStart = beginPosition(bm.alignRow1); | |
2196 | bm.qStart = beginPosition(bm.alignRow0); | |
2197 | bm.sEnd = endPosition(bm.alignRow1); | |
2198 | bm.qEnd = endPosition(bm.alignRow0); | |
2199 | ||
2200 | computeAlignmentStats(bm, context(lH.gH.outfile)); | |
2201 | ||
2202 | if (bm.alignStats.alignmentIdentity < lH.options.idCutOff) | |
2203 | { | |
2204 | ++lH.stats.hitsFailedExtendPercentIdentTest; | |
2205 | record.matches.pop_back(); | |
2206 | continue; | |
2207 | } | |
2208 | ||
2209 | computeBitScore(bm, context(lH.gH.outfile)); | |
2210 | ||
2211 | computeEValueThreadSafe(bm, context(lH.gH.outfile)); | |
2212 | ||
2213 | if (bm.eValue > lH.options.eCutOff) | |
2214 | { | |
2215 | ++lH.stats.hitsFailedExtendEValueTest; | |
2216 | record.matches.pop_back(); | |
2217 | continue; | |
2218 | } | |
2219 | ||
2220 | _setFrames(bm, m, lH); | |
2221 | } | |
2222 | ||
2223 | _writeRecord(record, lH); | |
2224 | ||
2225 | return 0; | |
2226 | } | |
2227 | ||
2228 | template <typename TLocalHolder> | |
2229 | inline int | |
2230 | iterateMatches(TLocalHolder & lH) | |
2231 | { | |
2232 | #ifdef SEQAN_SIMD_ENABLED | |
2233 | if (lH.options.extensionMode == LambdaOptions::ExtensionMode::FULL_SIMD) | |
2234 | return iterateMatchesFullSimd(lH); | |
2235 | else | |
2236 | #endif | |
2237 | if (lH.options.extensionMode == LambdaOptions::ExtensionMode::FULL_SERIAL) | |
2238 | return iterateMatchesFullSerial(lH); | |
2239 | else | |
2240 | return iterateMatchesExtend(lH); | |
2241 | } | |
2242 | ||
2243 | 1646 | #endif // HEADER GUARD |
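The removed `iterateMatchesFullSimd` rounds the number of alignments up to a multiple of the SIMD vector length (`fullSize`) and fills the last batch by repeating the final sequence pair. A minimal sketch of that padding on a `std::vector`, with the hypothetical helpers `paddedSize` and `padToBatch`:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Smallest multiple of `batch` that is >= n, computed as in
// `sizeBatch * ((numAlignments + sizeBatch - 1) / sizeBatch)`.
inline std::size_t paddedSize(std::size_t n, std::size_t batch)
{
    assert(batch > 0);
    return batch * ((n + batch - 1) / batch);
}

// Fill the last batch by repeating the final element, so every batch is full.
template <typename T>
inline void padToBatch(std::vector<T> & v, std::size_t batch)
{
    assert(!v.empty());
    T const last = v.back();               // copy first: resize may reallocate
    v.resize(paddedSize(v.size(), batch), last);
}
```

Results computed for the padding entries are simply ignored, as the removed loop does with its `x < numAlignments` bound.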
18 | 18 | // lambda.cpp: Main File for the main application |
19 | 19 | // ========================================================================== |
20 | 20 | |
21 | #include <initializer_list> | |
22 | ||
23 | 21 | #include <seqan/basic.h> |
22 | ||
24 | 23 | #include <seqan/arg_parse.h> |
25 | 24 | #include <seqan/seq_io.h> |
26 | 25 | |
178 | 177 | if (sIsTranslated(p)) |
179 | 178 | _saveOriginalSeqLengths(originalSeqs.limits, options); |
180 | 179 | |
181 | // // convert the seg file to seqan binary format | |
182 | // ret = convertMaskingFile(length(originalSeqs), options); | |
183 | // if (ret) | |
184 | // return ret; | |
180 | // convert the seg file to seqan binary format | |
181 | ret = convertMaskingFile(length(originalSeqs), options); | |
182 | if (ret) | |
183 | return ret; | |
185 | 184 | |
186 | 185 | // translate or swap depending on program |
187 | 186 | translateOrSwap(translatedSeqs, originalSeqs, options); |
188 | 187 | } |
189 | 188 | |
190 | 189 | // dump translated and unreduced sequences (except where they are included in index) |
191 | if ((options.alphReduction != 0) || (options.dbIndexType == DbIndexType::FM_INDEX)) | |
190 | if ((options.alphReduction != 0) || (options.dbIndexType != 0)) | |
192 | 191 | dumpTranslatedSeqs(translatedSeqs, options); |
193 | 192 | |
194 | 193 | // see if final sequence set actually fits into index |
195 | 194 | if (!checkIndexSize(translatedSeqs)) |
196 | 195 | return -1; |
197 | 196 | |
198 | if (options.dbIndexType == DbIndexType::FM_INDEX) | |
197 | if (options.dbIndexType == 1) | |
199 | 198 | { |
200 | 199 | using TIndexSpec = TFMIndex<TIndexSpecSpec>; |
201 | 200 | generateIndexAndDump<TIndexSpec,TIndexSpecSpec>(translatedSeqs, |
211 | 210 | TRedAlph()); |
212 | 211 | } |
213 | 212 | |
214 | // dump options | |
215 | for (auto && s : std::initializer_list<std::pair<std::string, std::string>> | |
216 | { | |
217 | { options.indexDir + "/option:db_index_type", std::to_string(static_cast<uint32_t>(options.dbIndexType))}, | |
218 | { options.indexDir + "/option:alph_original", std::string(_alphName(OrigSubjAlph<p>())) }, | |
219 | { options.indexDir + "/option:alph_translated", std::string(_alphName(TransAlph<p>())) }, | |
220 | { options.indexDir + "/option:alph_reduced", std::string(_alphName(TRedAlph())) }, | |
221 | { options.indexDir + "/option:genetic_code", std::to_string(options.geneticCode) } | |
222 | }) | |
223 | { | |
224 | std::ofstream f{std::get<0>(s).c_str(), std::ios_base::out | std::ios_base::binary}; | |
225 | f << std::get<1>(s); | |
226 | f.close(); | |
227 | } | |
228 | ||
229 | 213 | return 0; |
230 | 214 | } |
231 | 215 |
112 | 112 | myPrint(options, 1, "Dumping Subj Ids..."); |
113 | 113 | |
114 | 114 | //TODO save to TMPDIR instead |
115 | CharString _path = options.indexDir; | |
116 | append(_path, "/seq_ids"); | |
115 | CharString _path = options.dbFile; | |
116 | append(_path, ".ids"); | |
117 | 117 | save(ids, toCString(_path)); |
118 | 118 | |
119 | 119 | myPrint(options, 1, " done.\n"); |
138 | 138 | |
139 | 139 | myPrint(options, 1, " dumping untranslated subject lengths..."); |
140 | 140 | //TODO save to TMPDIR instead |
141 | CharString _path = options.indexDir; | |
142 | append(_path, "/untranslated_seq_lengths"); | |
141 | CharString _path = options.dbFile; | |
142 | append(_path, ".untranslengths"); | |
143 | 143 | save(limits, toCString(_path)); |
144 | 144 | } |
145 | 145 | |
183 | 183 | myPrint(options, 1, "Dumping unreduced Subj Sequences..."); |
184 | 184 | |
185 | 185 | //TODO save to TMPDIR instead |
186 | std::string _path = options.indexDir + "/translated_seqs"; | |
186 | std::string _path = options.dbFile + '.' + std::string(_alphName(TTransAlph())); | |
187 | 187 | save(translatedSeqs, _path.c_str()); |
188 | 188 | |
189 | 189 | myPrint(options, 1, " done.\n"); |
251 | 251 | return true; |
252 | 252 | } |
253 | 253 | |
254 | // // -------------------------------------------------------------------------- | |
255 | // // Function loadSubj() | |
256 | // // -------------------------------------------------------------------------- | |
257 | // | |
258 | // inline int | |
259 | // convertMaskingFile(uint64_t numberOfSeqs, | |
260 | // LambdaIndexerOptions const & options) | |
261 | // | |
262 | // { | |
263 | // StringSet<String<unsigned>, Owner<ConcatDirect<>>> segIntStarts; | |
264 | // StringSet<String<unsigned>, Owner<ConcatDirect<>>> segIntEnds; | |
265 | // // resize(segIntervals, numberOfSeqs, Exact()); | |
266 | // | |
267 | // if (options.segFile != "") | |
254 | // -------------------------------------------------------------------------- | |
255 | // Function convertMaskingFile() | |
256 | // -------------------------------------------------------------------------- | |
257 | ||
258 | inline int | |
259 | convertMaskingFile(uint64_t numberOfSeqs, | |
260 | LambdaIndexerOptions const & options) | |
261 | ||
262 | { | |
263 | StringSet<String<unsigned>, Owner<ConcatDirect<>>> segIntStarts; | |
264 | StringSet<String<unsigned>, Owner<ConcatDirect<>>> segIntEnds; | |
265 | // resize(segIntervals, numberOfSeqs, Exact()); | |
266 | ||
267 | if (options.segFile != "") | |
268 | { | |
269 | myPrint(options, 1, "Constructing binary seqan masking from seg-file..."); | |
270 | ||
271 | std::ifstream stream; | |
272 | stream.open(toCString(options.segFile)); | |
273 | if (!stream.is_open()) | |
274 | { | |
275 | std::cerr << "ERROR: could not open seg file.\n"; | |
276 | return -1; | |
277 | } | |
278 | ||
279 | auto reader = directionIterator(stream, Input()); | |
280 | ||
281 | // StringSet<String<Tuple<unsigned, 2>>> _segIntervals; | |
282 | // auto & _segIntervals = segIntervals; | |
283 | // resize(_segIntervals, numberOfSeqs, Exact()); | |
284 | StringSet<String<unsigned>> _segIntStarts; | |
285 | StringSet<String<unsigned>> _segIntEnds; | |
286 | resize(_segIntStarts, numberOfSeqs, Exact()); | |
287 | resize(_segIntEnds, numberOfSeqs, Exact()); | |
288 | CharString buf; | |
289 | // std::tuple<unsigned, unsigned> tup; | |
290 | ||
291 | // auto curSeq = begin(_segIntervals); | |
292 | unsigned curSeq = 0; | |
293 | while (value(reader) == '>') | |
294 | { | |
295 | // if (curSeq == end(_segIntervals)) | |
296 | // return -7; | |
297 | if (curSeq == numberOfSeqs) | |
298 | { | |
299 | std::cerr << "ERROR: seg file has more entries than database.\n"; | |
300 | return -7; | |
301 | } | |
302 | skipLine(reader); | |
303 | if (atEnd(reader)) | |
304 | break; | |
305 | ||
306 | unsigned curInt = 0; | |
307 | while ((!atEnd(reader)) && (value(reader) != '>')) | |
308 | { | |
309 | resize(_segIntStarts[curSeq], length(_segIntStarts[curSeq])+1); | |
310 | resize(_segIntEnds[curSeq], length(_segIntEnds[curSeq])+1); | |
311 | clear(buf); | |
312 | readUntil(buf, reader, IsWhitespace()); | |
313 | ||
314 | // std::get<0>(tup) = strtoumax(toCString(buf), 0, 10); | |
315 | _segIntStarts[curSeq][curInt] = strtoumax(toCString(buf), 0, 10); | |
316 | skipUntil(reader, IsDigit()); | |
317 | ||
318 | clear(buf); | |
319 | readUntil(buf, reader, IsWhitespace()); | |
320 | ||
321 | // std::get<1>(tup) = strtoumax(toCString(buf), 0, 10); | |
322 | _segIntEnds[curSeq][curInt] = strtoumax(toCString(buf), 0, 10); | |
323 | ||
324 | // appendValue(*curSeq, tup); | |
325 | ||
326 | skipLine(reader); | |
327 | curInt++; | |
328 | } | |
329 | if (atEnd(reader)) | |
330 | break; | |
331 | else | |
332 | curSeq++; | |
333 | } | |
334 | // if (curSeq != end(_segIntervals)) | |
335 | // return -9; | |
336 | if (curSeq != (numberOfSeqs - 1)) | |
337 | { | |
338 | std::cerr << "ERROR: seg file has fewer entries (" << curSeq + 1 | |
339 | << ") than database (" << numberOfSeqs << ").\n"; | |
340 | return -9; | |
341 | } | |
342 | ||
343 | segIntStarts.concat = concat(_segIntStarts); | |
344 | segIntStarts.limits = stringSetLimits(_segIntStarts); | |
345 | segIntEnds.concat = concat(_segIntEnds); | |
346 | segIntEnds.limits = stringSetLimits(_segIntEnds); | |
347 | // segIntEnds = _segIntEnds; | |
348 | // segIntervals = _segIntervals; // non-concatdirect to concatdirect | |
349 | ||
350 | stream.close(); | |
351 | ||
352 | } else | |
353 | { | |
354 | myPrint(options, 1, "No Seg-File specified, no masking will take place.\n"); | |
355 | // resize(segIntervals, numberOfSeqs, Exact()); | |
356 | resize(segIntStarts, numberOfSeqs, Exact()); | |
357 | resize(segIntEnds, numberOfSeqs, Exact()); | |
358 | } | |
359 | ||
360 | // for (unsigned u = 0; u < length(segIntStarts); ++u) | |
268 | 361 | // { |
269 | // myPrint(options, 1, "Constructing binary seqan masking from seg-file..."); | |
270 | // | |
271 | // std::ifstream stream; | |
272 | // stream.open(toCString(options.segFile)); | |
273 | // if (!stream.is_open()) | |
362 | // myPrint(options, 1,u, ": "; | |
363 | // for (unsigned v = 0; v < length(segIntStarts[u]); ++v) | |
274 | 364 | // { |
275 | // std::cerr << "ERROR: could not open seg file.\n"; | |
276 | // return -1; | |
365 | // myPrint(options, 1,'(', segIntStarts[u][v], ", ", segIntEnds[u][v], ") "; | |
277 | 366 | // } |
278 | // | |
279 | // auto reader = directionIterator(stream, Input()); | |
280 | // | |
281 | // // StringSet<String<Tuple<unsigned, 2>>> _segIntervals; | |
282 | // // auto & _segIntervals = segIntervals; | |
283 | // // resize(_segIntervals, numberOfSeqs, Exact()); | |
284 | // StringSet<String<unsigned>> _segIntStarts; | |
285 | // StringSet<String<unsigned>> _segIntEnds; | |
286 | // resize(_segIntStarts, numberOfSeqs, Exact()); | |
287 | // resize(_segIntEnds, numberOfSeqs, Exact()); | |
288 | // CharString buf; | |
289 | // // std::tuple<unsigned, unsigned> tup; | |
290 | // | |
291 | // // auto curSeq = begin(_segIntervals); | |
292 | // unsigned curSeq = 0; | |
293 | // while (value(reader) == '>') | |
294 | // { | |
295 | // // if (curSeq == end(_segIntervals)) | |
296 | // // return -7; | |
297 | // if (curSeq == numberOfSeqs) | |
298 | // { | |
299 | // std::cerr << "ERROR: seg file has more entries then database.\n"; | |
300 | // return -7; | |
301 | // } | |
302 | // skipLine(reader); | |
303 | // if (atEnd(reader)) | |
304 | // break; | |
305 | // | |
306 | // unsigned curInt = 0; | |
307 | // while ((!atEnd(reader)) && (value(reader) != '>')) | |
308 | // { | |
309 | // resize(_segIntStarts[curSeq], length(_segIntStarts[curSeq])+1); | |
310 | // resize(_segIntEnds[curSeq], length(_segIntEnds[curSeq])+1); | |
311 | // clear(buf); | |
312 | // readUntil(buf, reader, IsWhitespace()); | |
313 | // | |
314 | // // std::get<0>(tup) = strtoumax(toCString(buf), 0, 10); | |
315 | // _segIntStarts[curSeq][curInt] = strtoumax(toCString(buf), 0, 10); | |
316 | // skipUntil(reader, IsDigit()); | |
317 | // | |
318 | // clear(buf); | |
319 | // readUntil(buf, reader, IsWhitespace()); | |
320 | // | |
321 | // // std::get<1>(tup) = strtoumax(toCString(buf), 0, 10); | |
322 | // _segIntEnds[curSeq][curInt] = strtoumax(toCString(buf), 0, 10); | |
323 | // | |
324 | // // appendValue(*curSeq, tup); | |
325 | // | |
326 | // skipLine(reader); | |
327 | // curInt++; | |
328 | // } | |
329 | // if (atEnd(reader)) | |
330 | // break; | |
331 | // else | |
332 | // curSeq++; | |
333 | // } | |
334 | // // if (curSeq != end(_segIntervals)) | |
335 | // // return -9; | |
336 | // if (curSeq != (numberOfSeqs - 1)) | |
337 | // { | |
338 | // std::cerr << "ERROR: seg file has less entries (" << curSeq + 1 | |
339 | // << ") than database (" << numberOfSeqs << ").\n"; | |
340 | // return -9; | |
341 | // } | |
342 | // | |
343 | // segIntStarts.concat = concat(_segIntStarts); | |
344 | // segIntStarts.limits = stringSetLimits(_segIntStarts); | |
345 | // segIntEnds.concat = concat(_segIntEnds); | |
346 | // segIntEnds.limits = stringSetLimits(_segIntEnds); | |
347 | // // segIntEnds = _segIntEnds; | |
348 | // // segIntervals = _segIntervals; // non-concatdirect to concatdirect | |
349 | // | |
350 | // stream.close(); | |
351 | // | |
352 | // } else | |
353 | // { | |
354 | // myPrint(options, 1, "No Seg-File specified, no masking will take place.\n"); | |
355 | // // resize(segIntervals, numberOfSeqs, Exact()); | |
356 | // resize(segIntStarts, numberOfSeqs, Exact()); | |
357 | // resize(segIntEnds, numberOfSeqs, Exact()); | |
367 | // myPrint(options, 1,'\n'; | |
358 | 368 | // } |
359 | // | |
360 | // // for (unsigned u = 0; u < length(segIntStarts); ++u) | |
361 | // // { | |
362 | // // myPrint(options, 1,u, ": "; | |
363 | // // for (unsigned v = 0; v < length(segIntStarts[u]); ++v) | |
364 | // // { | |
365 | // // myPrint(options, 1,'(', segIntStarts[u][v], ", ", segIntEnds[u][v], ") "; | |
366 | // // } | |
367 | // // myPrint(options, 1,'\n'; | |
368 | // // } | |
369 | // myPrint(options, 1, "Dumping binary seqan mask file..."); | |
370 | // CharString _path = options.dbFile; | |
371 | // append(_path, ".binseg_s"); | |
372 | // save(segIntStarts, toCString(_path)); | |
373 | // _path = options.dbFile; | |
374 | // append(_path, ".binseg_e"); | |
375 | // save(segIntEnds, toCString(_path)); | |
376 | // myPrint(options, 1, " done.\n"); | |
377 | // myPrint(options, 2, "\n"); | |
378 | // return 0; | |
379 | // } | |
369 | myPrint(options, 1, "Dumping binary seqan mask file..."); | |
370 | CharString _path = options.dbFile; | |
371 | append(_path, ".binseg_s"); | |
372 | save(segIntStarts, toCString(_path)); | |
373 | _path = options.dbFile; | |
374 | append(_path, ".binseg_e"); | |
375 | save(segIntEnds, toCString(_path)); | |
376 | myPrint(options, 1, " done.\n"); | |
377 | myPrint(options, 2, "\n"); | |
378 | return 0; | |
379 | } | |
380 | 380 | |
381 | 381 | // -------------------------------------------------------------------------- |
382 | 382 | // Function createSuffixArray() |
565 | 565 | // Dump Index |
566 | 566 | myPrint(options, 1, "Writing Index to disk..."); |
567 | 567 | s = sysTime(); |
568 | std::string path = options.indexDir + "/index"; | |
569 | ||
568 | std::string path = toCString(options.dbFile); | |
569 | path += '.' + std::string(_alphName(TRedAlph())); | |
570 | if (indexIsFM) | |
571 | path += ".fm"; | |
572 | else | |
573 | path += ".sa"; | |
570 | 574 | save(dbIndex, path.c_str()); |
571 | ||
572 | 575 | e = sysTime() - s; |
573 | 576 | myPrint(options, 1, " done.\n"); |
574 | 577 | myPrint(options, 2, "Runtime: ", e, "s \n"); |
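The hunk above replaces the fixed `indexDir + "/index"` location with a path derived from the database file name, the reduced alphabet name, and the index type. A minimal sketch of that naming scheme follows; `makeIndexPath` is a hypothetical helper introduced only for illustration and is not part of Lambda, and the alphabet name is passed in as a plain string rather than obtained from `_alphName(TRedAlph())`.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of Lambda): mirrors the naming scheme shown in
// the hunk above, i.e. "<dbFile>.<alphabet name>.<fm|sa>".
inline std::string makeIndexPath(std::string const & dbFile,
                                 std::string const & alphName,
                                 bool const indexIsFM)
{
    std::string path = dbFile;          // start from the database path
    path += '.' + alphName;             // append the (reduced) alphabet name
    path += indexIsFM ? ".fm" : ".sa";  // suffix depends on the index type
    return path;
}
```

Under these assumptions, a call such as `makeIndexPath("db.fasta", "murphy10", true)` yields a file that sits next to the database rather than inside a separate index directory; the alphabet name used here is only an example value.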
43 | 43 | TQId qryId; |
44 | 44 | TSId subjId; |
45 | 45 | TPos qryStart; |
46 | TPos qryEnd; | |
46 | // TPos qryEnd; | |
47 | 47 | |
48 | 48 | TPos subjStart; |
49 | TPos subjEnd; | |
49 | // TPos subjEnd; | |
50 | 50 | |
51 | 51 | // Match() |
52 | 52 | // : |
66 | 66 | |
67 | 67 | inline bool operator== (Match const & m2) const |
68 | 68 | { |
69 | return std::tie(qryId, subjId, qryStart, subjStart, qryEnd, subjEnd) | |
70 | == std::tie(m2.qryId, m2.subjId, m2.qryStart, m2.subjStart, m2.qryEnd, m2.subjEnd); | |
69 | return std::tie(qryId, subjId, qryStart, subjStart/*, qryEnd, subjEnd*/) | |
70 | == std::tie(m2.qryId, m2.subjId, m2.qryStart, m2.subjStart/*, m2.qryEnd, m2.subjEnd*/); | |
71 | 71 | } |
72 | 72 | inline bool operator< (Match const & m2) const |
73 | 73 | { |
74 | return std::tie(qryId, subjId, qryStart, subjStart, qryEnd, subjEnd) | |
75 | < std::tie(m2.qryId, m2.subjId, m2.qryStart, m2.subjStart, m2.qryEnd, m2.subjEnd); | |
74 | return std::tie(qryId, subjId, qryStart, subjStart/*, qryEnd, subjEnd*/) | |
75 | < std::tie(m2.qryId, m2.subjId, m2.qryStart, m2.subjStart/*, m2.qryEnd, m2.subjEnd*/); | |
76 | 76 | } |
77 | 77 | }; |
78 | 78 | |
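With `qryEnd` and `subjEnd` commented out, equality and ordering of matches depend only on the two IDs and the two start positions. The following is a simplified sketch of that `std::tie` idiom; `SimpleMatch` and its field types are assumptions made for illustration, not the actual `Match` definition.

```cpp
#include <cassert>
#include <cstdint>
#include <tuple>

// Simplified stand-in for Lambda's Match struct; the field types are assumed.
struct SimpleMatch
{
    uint32_t qryId;
    uint32_t subjId;
    uint32_t qryStart;
    uint32_t subjStart;

    // Equal if all four fields are equal.
    bool operator==(SimpleMatch const & m2) const
    {
        return std::tie(qryId, subjId, qryStart, subjStart)
            == std::tie(m2.qryId, m2.subjId, m2.qryStart, m2.subjStart);
    }

    // Lexicographic order: qryId first, then subjId, qryStart, subjStart.
    bool operator<(SimpleMatch const & m2) const
    {
        return std::tie(qryId, subjId, qryStart, subjStart)
             < std::tie(m2.qryId, m2.subjId, m2.qryStart, m2.subjStart);
    }
};
```

One consequence of this design is that two matches with the same start positions but different extents compare equal, so sorting followed by `std::unique` would collapse them.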
271 | 271 | // m1.subjEnd = std::max(m1.subjEnd, m2.subjEnd); |
272 | 272 | // } |
273 | 273 | |
274 | template <typename TAlph> | |
275 | inline void | |
276 | _printMatch(Match<TAlph> const & m) | |
277 | { | |
278 | std::cout << "MATCH Query " << m.qryId | |
279 | << "(" << m.qryStart << ", " << m.qryEnd | |
280 | << ") on Subject "<< m.subjId | |
281 | << "(" << m.subjStart << ", " << m.subjEnd | |
282 | << ")" << std::endl << std::flush; | |
283 | } | |
274 | ||
275 | // inline void | |
276 | // _printMatch(Match const & m) | |
277 | // { | |
278 | // std::cout << "MATCH Query " << m.qryId | |
279 | // << "(" << m.qryStart << ", " << m.qryEnd | |
280 | // << ") on Subject "<< m.subjId | |
281 | // << "(" << m.subjStart << ", " << m.subjEnd | |
282 | // << ")" << std::endl << std::flush; | |
283 | // } | |
284 | 284 | |
285 | 285 | |
286 | 286 |
411 | 411 | } |
412 | 412 | |
413 | 413 | // ---------------------------------------------------------------------------- |
414 | // Function computeEValueThreadSafe | |
415 | // ---------------------------------------------------------------------------- | |
416 | ||
417 | template <typename TBlastMatch, | |
418 | typename TScore, | |
419 | BlastProgram p, | |
420 | BlastTabularSpec h> | |
421 | inline double | |
422 | computeEValueThreadSafe(TBlastMatch & match, | |
423 | BlastIOContext<TScore, p, h> & context) | |
424 | { | |
425 | #if defined(__FreeBSD__) && defined(STDLIB_LLVM) | |
426 | // https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=192320 | |
427 | static std::vector<std::unordered_map<uint64_t, uint64_t>> _cachedLengthAdjustmentsArray(omp_get_num_threads()); | |
428 | static std::unordered_map<uint64_t, uint64_t> & _cachedLengthAdjustments = _cachedLengthAdjustmentsArray[omp_get_thread_num()]; | |
429 | #else | |
430 | static thread_local std::unordered_map<uint64_t, uint64_t> _cachedLengthAdjustments; | |
431 | #endif | |
432 | ||
433 | // convert to 64bit and divide for translated sequences | |
434 | uint64_t ql = match.qLength / (qIsTranslated(context.blastProgram) ? 3 : 1); | |
435 | // length adjustment not yet computed | |
436 | if (_cachedLengthAdjustments.find(ql) == _cachedLengthAdjustments.end()) | |
437 | _cachedLengthAdjustments[ql] = _lengthAdjustment(context.dbTotalLength, ql, context.scoringScheme); | |
438 | ||
439 | uint64_t adj = _cachedLengthAdjustments[ql]; | |
440 | ||
441 | match.eValue = _computeEValue(match.alignStats.alignmentScore, | |
442 | ql - adj, | |
443 | context.dbTotalLength - adj, | |
444 | context.scoringScheme); | |
445 | return match.eValue; | |
446 | } | |
447 | ||
448 | // ---------------------------------------------------------------------------- | |
449 | 414 | // remove tag type |
450 | 415 | // ---------------------------------------------------------------------------- |
451 | 416 |
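The removed `computeEValueThreadSafe` caches one length adjustment per query length in a `thread_local` map, so each thread fills its own cache without locking. A minimal sketch of that memoization pattern is given below; `expensiveAdjustment` is a hypothetical placeholder and does not reproduce SeqAn's `_lengthAdjustment`.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Hypothetical placeholder for an expensive per-length computation; the
// formula is arbitrary and only serves the example.
inline uint64_t expensiveAdjustment(uint64_t const queryLength)
{
    return queryLength / 10;
}

// Returns the cached value for queryLength, computing it on first use.
// The cache is thread_local, so every thread owns a separate map and no
// synchronization is needed.
inline uint64_t cachedAdjustment(uint64_t const queryLength)
{
    static thread_local std::unordered_map<uint64_t, uint64_t> cache;

    auto it = cache.find(queryLength);
    if (it == cache.end())
        it = cache.emplace(queryLength, expensiveAdjustment(queryLength)).first;

    return it->second;
}
```

The trade-off is memory for speed: each thread may recompute a value another thread already has, but lookups never contend on a shared lock.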
103 | 103 | template <typename TDirection, typename TStorageSpec> |
104 | 104 | struct FormattedFileContext<FormattedFile<Bam, TDirection, BlastTabular>, TStorageSpec> |
105 | 105 | { |
106 | typedef typename DefaultIndexStringSpec<StringSet<void, void>>::Type TStringSpec; // see above | |
107 | typedef StringSet<Segment<String<char, TStringSpec>, InfixSegment> > TNameStore; | |
108 | typedef NameStoreCache<TNameStore> TNameStoreCache; | |
109 | typedef BamIOContext<TNameStore, TNameStoreCache, TStorageSpec> Type; | |
106 | typedef StringSet<Segment<String<char, MMap<> >, InfixSegment> > TNameStore; | |
107 | typedef NameStoreCache<TNameStore> TNameStoreCache; | |
108 | typedef BamIOContext<TNameStore, TNameStoreCache, TStorageSpec> Type; | |
110 | 109 | }; |
111 | 110 | |
112 | 111 | } |
122 | 121 | #else |
123 | 122 | using TAlloc = Alloc<>; |
124 | 123 | #endif |
125 | // using Bwt = WaveletTree<void, WTRDConfig<LengthSum, TAlloc> >; | |
126 | using Bwt = Levels<void, LevelsRDConfig<LengthSum, TAlloc, 1, 3> >; | |
124 | using Bwt = WaveletTree<void, WTRDConfig<LengthSum, TAlloc> >; | |
127 | 125 | using Sentinels = Levels<void, LevelsRDConfig<LengthSum, TAlloc> >; |
128 | 126 | |
129 | 127 | static const unsigned SAMPLING = 10; |
188 | 186 | } |
189 | 187 | |
190 | 188 | // ========================================================================== |
191 | // Option Enums | |
192 | // ========================================================================== | |
193 | ||
194 | enum class DbIndexType : uint8_t | |
195 | { | |
196 | SUFFIX_ARRAY, | |
197 | FM_INDEX, | |
198 | BI_FM_INDEX | |
199 | }; | |
200 | ||
201 | // ========================================================================== | |
202 | 189 | // Classes |
203 | 190 | // ========================================================================== |
204 | 191 | |
205 | 192 | // -------------------------------------------------------------------------- |
206 | // Class SharedOptions | |
193 | // Class LambdaOptions | |
207 | 194 | // -------------------------------------------------------------------------- |
208 | 195 | |
209 | 196 | // This struct stores the options from the command line. |
215 | 202 | |
216 | 203 | std::string commandLine; |
217 | 204 | |
218 | std::string indexDir; | |
219 | ||
220 | DbIndexType dbIndexType; | |
205 | std::string dbFile; | |
206 | ||
207 | int dbIndexType = 0; | |
208 | // for indexer, the file format of database sequences | |
209 | // for main app, the file format of query sequences | |
210 | // 0 -- fasta, 1 -- fastq | |
211 | // int fileFormat = 0; | |
221 | 212 | |
222 | 213 | int alphReduction = 0; |
223 | 214 | |
241 | 232 | } |
242 | 233 | }; |
243 | 234 | |
244 | // -------------------------------------------------------------------------- | |
245 | // Class LambdaOptions | |
246 | // -------------------------------------------------------------------------- | |
247 | 235 | |
248 | 236 | struct LambdaOptions : public SharedOptions |
249 | 237 | { |
266 | 254 | // bool semiGlobal; |
267 | 255 | |
268 | 256 | bool doubleIndexing = true; |
269 | bool adaptiveSeeding; | |
270 | 257 | |
271 | 258 | unsigned seedLength = 0; |
272 | 259 | unsigned maxSeedDist = 1; |
293 | 280 | int idCutOff = 0; |
294 | 281 | unsigned long maxMatches = 500; |
295 | 282 | |
296 | enum class ExtensionMode : uint8_t | |
297 | { | |
298 | AUTO, | |
299 | XDROP, | |
300 | FULL_SERIAL, | |
301 | FULL_SIMD | |
302 | }; | |
303 | ExtensionMode extensionMode; | |
304 | ||
305 | 283 | bool filterPutativeDuplicates = true; |
306 | 284 | bool filterPutativeAbundant = true; |
307 | bool mergePutativeSiblings = true; | |
308 | 285 | |
309 | 286 | int preScoring = 0; // 0 = off, 1 = seed, 2 = region ( |
310 | 287 | double preScoringThresh = 0.0; |
315 | 292 | } |
316 | 293 | }; |
317 | 294 | |
318 | // -------------------------------------------------------------------------- | |
319 | // Class LambdaIndexerOptions | |
320 | // -------------------------------------------------------------------------- | |
321 | ||
322 | 295 | struct LambdaIndexerOptions : public SharedOptions |
323 | 296 | { |
324 | std::string dbFile; | |
325 | // std::string segFile = ""; | |
297 | std::string segFile = ""; | |
326 | 298 | std::string algo = ""; |
327 | 299 | |
328 | 300 | bool truncateIDs; |
337 | 309 | // ========================================================================== |
338 | 310 | |
339 | 311 | // -------------------------------------------------------------------------- |
340 | // Function sharedSetup() | |
312 | // Function displayCopyright() | |
341 | 313 | // -------------------------------------------------------------------------- |
342 | 314 | |
343 | 315 | void |
348 | 320 | std::string(SEQAN_REVISION) + ")"; |
349 | 321 | setVersion(parser, versionString); |
350 | 322 | setDate(parser, __DATE__); |
351 | setShortCopyright(parser, "2013-2016 Hannes Hauswedell, released under the GNU AGPL v3 (or later); " | |
323 | setShortCopyright(parser, "2013-2016 Hannes Hauswedell, released under the GNU GPL v3 (or later); " | |
352 | 324 | "2016 Knut Reinert and Freie Universität Berlin, released under the 3-clause-BSDL"); |
353 | 325 | |
354 | 326 | setCitation(parser, "Hauswedell et al (2014); doi: 10.1093/bioinformatics/btu439"); |
357 | 329 | " Copyright (c) 2013-2016, Hannes Hauswedell\n" |
358 | 330 | " All rights reserved.\n" |
359 | 331 | "\n" |
360 | " This program is free software: you can redistribute it and/or modify\n" | |
361 | " it under the terms of the GNU Affero General Public License as\n" | |
362 | " published by the Free Software Foundation, either version 3 of the\n" | |
363 | " License, or (at your option) any later version.\n" | |
332 | " Lambda is free software: you can redistribute it and/or modify\n" | |
333 | " it under the terms of the GNU General Public License as published by\n" | |
334 | " the Free Software Foundation, either version 3 of the License, or\n" | |
335 | " (at your option) any later version.\n" | |
364 | 336 | "\n" |
365 | 337 | " Lambda is distributed in the hope that it will be useful,\n" |
366 | 338 | " but WITHOUT ANY WARRANTY; without even the implied warranty of\n" |
367 | 339 | " MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the\n" |
368 | 340 | " GNU General Public License for more details.\n" |
369 | 341 | "\n" |
370 | " You should have received a copy of the GNU Affero General Public License\n" | |
371 | " along with this program. If not, see <http://www.gnu.org/licenses/>.\n" | |
342 | " You should have received a copy of the GNU General Public License\n" | |
343 | " along with Lambda. If not, see <http://www.gnu.org/licenses/>.\n" | |
372 | 344 | "\n" |
373 | 345 | " Copyright (c) 2016 Knut Reinert and Freie Universität Berlin\n" |
374 | 346 | " All rights reserved.\n" |
437 | 409 | |
438 | 410 | // Define usage line and long description. |
439 | 411 | addUsageLine(parser, "[\\fIOPTIONS\\fP] \\fI-q QUERY.fasta\\fP " |
440 | "\\fI-i INDEX.lambda\\fP " | |
412 | "\\fI-d DATABASE.fasta\\fP " | |
441 | 413 | "[\\fI-o output.m8\\fP]"); |
442 | 414 | |
443 | 415 | sharedSetup(parser); |
450 | 422 | setValidValues(parser, "query", toCString(concat(getFileExtensions(SeqFileIn()), ' '))); |
451 | 423 | setRequired(parser, "q"); |
452 | 424 | |
453 | addOption(parser, ArgParseOption("i", "index", | |
454 | "The database index (created by the lambda_indexer executable).", | |
425 | addOption(parser, ArgParseOption("d", "database", | |
426 | "Path to original database sequences (a precomputed index with .sa or .fm needs to exist!).", | |
455 | 427 | ArgParseArgument::INPUT_FILE, |
456 | 428 | "IN")); |
457 | setRequired(parser, "index"); | |
458 | setValidValues(parser, "index", ".lambda"); | |
429 | setValidValues(parser, "database", toCString(concat(getFileExtensions(SeqFileIn()), ' '))); | |
430 | setRequired(parser, "d"); | |
459 | 431 | |
460 | 432 | addOption(parser, ArgParseOption("di", "db-index-type", |
461 | 433 | "database index is in this format.", |
654 | 626 | // ArgParseArgument::INTEGER)); |
655 | 627 | // setDefaultValue(parser, "ungapped-seeds", "1"); |
656 | 628 | |
657 | addOption(parser, ArgParseOption("as", "adaptive-seeding", | |
658 | "SECRET", | |
659 | ArgParseArgument::STRING, | |
660 | "STR")); | |
661 | setValidValues(parser, "adaptive-seeding", "on off"); | |
662 | setDefaultValue(parser, "adaptive-seeding", "on"); | |
663 | setAdvanced(parser, "adaptive-seeding"); | |
664 | ||
665 | 629 | addOption(parser, ArgParseOption("sl", "seed-length", |
666 | 630 | "Length of the seeds (default = 14 for BLASTN).", |
667 | 631 | ArgParseArgument::INTEGER)); |
672 | 636 | "Offset for seeding (if unset = seed-length, non-overlapping; " |
673 | 637 | "default = 5 for BLASTN).", |
674 | 638 | ArgParseArgument::INTEGER)); |
675 | setDefaultValue(parser, "seed-offset", "5"); | |
639 | setDefaultValue(parser, "seed-offset", "10"); | |
676 | 640 | setAdvanced(parser, "seed-offset"); |
677 | 641 | |
678 | 642 | addOption(parser, ArgParseOption("sd", "seed-delta", |
726 | 690 | setDefaultValue(parser, "filter-putative-abundant", "on"); |
727 | 691 | setAdvanced(parser, "filter-putative-abundant"); |
728 | 692 | |
729 | addOption(parser, ArgParseOption("pm", "merge-putative-siblings", | |
730 | "Merge seed from one region, " | |
731 | "stop searching if the remaining realm looks unfeasable.", | |
732 | ArgParseArgument::STRING)); | |
733 | setValidValues(parser, "merge-putative-siblings", "on off"); | |
734 | setDefaultValue(parser, "merge-putative-siblings", "on"); | |
735 | setAdvanced(parser, "merge-putative-siblings"); | |
736 | ||
737 | 693 | // addOption(parser, ArgParseOption("se", |
738 | 694 | // "seedminevalue", |
739 | 695 | // "after postproc worse seeds are " |
801 | 757 | setDefaultValue(parser, "band", "-3"); |
802 | 758 | setMinValue(parser, "band", "-3"); |
803 | 759 | setAdvanced(parser, "band"); |
804 | ||
805 | addOption(parser, ArgParseOption("em", "extension-mode", | |
806 | "Choice of extension algorithms.", | |
807 | ArgParseArgument::STRING)); | |
808 | #ifdef SEQAN_SIMD_ENABLED | |
809 | setValidValues(parser, "extension-mode", "auto xdrop fullSerial fullSIMD"); | |
810 | #else | |
811 | setValidValues(parser, "extension-mode", "auto xdrop fullSerial"); | |
812 | #endif | |
813 | setDefaultValue(parser, "extension-mode", "auto"); | |
814 | setAdvanced(parser, "extension-mode"); | |
815 | 760 | |
816 | 761 | addTextSection(parser, "Tuning"); |
817 | 762 | addText(parser, "Tuning the seeding parameters and (de)activating alphabet " |
858 | 803 | |
859 | 804 | // Extract option values. |
860 | 805 | getOptionValue(options.queryFile, parser, "query"); |
861 | ||
862 | getOptionValue(options.indexDir, parser, "index"); | |
863 | ||
864 | 806 | // if (endsWith(options.queryFile, ".fastq") || |
865 | 807 | // endsWith(options.queryFile, ".fq")) |
866 | 808 | // options.fileFormat = 1; |
986 | 928 | options.versionInformationToOutputFile = (buffer == "on"); |
987 | 929 | |
988 | 930 | clear(buffer); |
989 | getOptionValue(buffer, parser, "adaptive-seeding"); | |
990 | options.adaptiveSeeding = (buffer == "on"); | |
991 | ||
992 | clear(buffer); | |
993 | 931 | getOptionValue(options.seedLength, parser, "seed-length"); |
994 | 932 | if ((!isSet(parser, "seed-length")) && |
995 | 933 | (options.blastProgram == BlastProgram::BLASTN)) |
998 | 936 | if (isSet(parser, "seed-offset")) |
999 | 937 | getOptionValue(options.seedOffset, parser, "seed-offset"); |
1000 | 938 | else |
1001 | options.seedOffset = options.seedLength / 2; | |
939 | options.seedOffset = options.seedLength; | |
1002 | 940 | |
1003 | 941 | if (isSet(parser, "seed-gravity")) |
1004 | 942 | getOptionValue(options.seedGravity, parser, "seed-gravity"); |
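The fallback for an unset `seed-offset` changes here from `seedLength / 2` (half-overlapping seeds) to `seedLength` (non-overlapping seeds). The sketch below illustrates the effect on seed start positions; `seedStarts` is a hypothetical helper, and Lambda's actual seeding is more involved than this simple stride.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: start positions of all seeds of length seedLength that
// fit into a sequence of length seqLength when stepping by seedOffset.
inline std::vector<std::size_t> seedStarts(std::size_t const seqLength,
                                           std::size_t const seedLength,
                                           std::size_t const seedOffset)
{
    std::vector<std::size_t> starts;

    // Guard against degenerate input (would otherwise loop forever or overflow).
    if (seedLength == 0 || seedOffset == 0 || seqLength < seedLength)
        return starts;

    for (std::size_t pos = 0; pos + seedLength <= seqLength; pos += seedOffset)
        starts.push_back(pos);

    return starts;
}
```

For a sequence of length 30 and seeds of length 10, an offset of 10 gives three seeds, while an offset of 5 gives five, so the new fallback roughly halves the number of seeds per query.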
1070 | 1008 | getOptionValue(buffer, parser, "filter-putative-abundant"); |
1071 | 1009 | options.filterPutativeAbundant = (buffer == "on"); |
1072 | 1010 | |
1073 | getOptionValue(buffer, parser, "merge-putative-siblings"); | |
1074 | options.mergePutativeSiblings = (buffer == "on"); | |
1075 | ||
1076 | 1011 | // TODO always prescore 1 |
1077 | 1012 | getOptionValue(options.preScoring, parser, "pre-scoring"); |
1078 | 1013 | if ((!isSet(parser, "pre-scoring")) && |
1079 | 1014 | (options.alphReduction == 0)) |
1080 | 1015 | options.preScoring = 1; |
1081 | // for adaptive seeding we take the full resized seed (and no surroundings) | |
1082 | // if (options.adaptiveSeeding) | |
1083 | // options.preScoring = 1; | |
1084 | 1016 | |
1085 | 1017 | getOptionValue(options.preScoringThresh, parser, "pre-scoring-threshold"); |
1086 | 1018 | // if (options.preScoring == 0) |
1090 | 1022 | getOptionValue(numbuf, parser, "num-matches"); |
1091 | 1023 | options.maxMatches = static_cast<unsigned long>(numbuf); |
1092 | 1024 | |
1093 | getOptionValue(buffer, parser, "extension-mode"); | |
1094 | if (buffer == "fullSIMD") | |
1095 | { | |
1096 | options.extensionMode = LambdaOptions::ExtensionMode::FULL_SIMD; | |
1097 | options.filterPutativeAbundant = false; | |
1098 | options.filterPutativeDuplicates = false; | |
1099 | options.mergePutativeSiblings = false; | |
1100 | options.xDropOff = -1; | |
1101 | options.band = -1; | |
1102 | } | |
1103 | else if (buffer == "fullSerial") | |
1104 | { | |
1105 | options.extensionMode = LambdaOptions::ExtensionMode::FULL_SERIAL; | |
1106 | options.filterPutativeAbundant = false; | |
1107 | options.filterPutativeDuplicates = false; | |
1108 | options.mergePutativeSiblings = false; | |
1109 | options.xDropOff = -1; | |
1110 | } | |
1111 | else if (buffer == "xdrop") | |
1112 | { | |
1113 | options.extensionMode = LambdaOptions::ExtensionMode::XDROP; | |
1114 | } | |
1115 | else | |
1116 | { | |
1117 | options.extensionMode = LambdaOptions::ExtensionMode::AUTO; | |
1118 | } | |
1119 | ||
1120 | 1025 | return ArgumentParser::PARSE_OK; |
1121 | 1026 | } |
1122 | 1027 | |
1128 | 1033 | ArgumentParser parser("lambda_indexer"); |
1129 | 1034 | |
1130 | 1035 | // Define usage line and long description. |
1131 | addUsageLine(parser, "[\\fIOPTIONS\\fP] \\-d DATABASE.fasta [-i INDEX.lambda]\\fP"); | |
1036 | addUsageLine(parser, "[\\fIOPTIONS\\fP] \\-d DATABASE.fasta\\fP"); | |
1132 | 1037 | |
1133 | 1038 | sharedSetup(parser); |
1134 | 1039 | |
1142 | 1047 | setRequired(parser, "database"); |
1143 | 1048 | setValidValues(parser, "database", toCString(concat(getFileExtensions(SeqFileIn()), ' '))); |
1144 | 1049 | |
1145 | // addOption(parser, ArgParseOption("s", | |
1146 | // "segfile", | |
1147 | // "SEG intervals for database" | |
1148 | // "(optional).", | |
1149 | // ArgParseArgument::INPUT_FILE)); | |
1150 | // setValidValues(parser, "segfile", "seg"); | |
1151 | // hideOption(parser, "segfile"); // TODO remove completely | |
1050 | addOption(parser, ArgParseOption("s", | |
1051 | "segfile", | |
1052 | "SEG intervals for database " | |
1053 | "(optional).", | |
1054 | ArgParseArgument::INPUT_FILE)); | |
1055 | ||
1056 | setValidValues(parser, "segfile", "seg"); | |
1152 | 1057 | |
1153 | 1058 | addSection(parser, "Output Options"); |
1154 | addOption(parser, ArgParseOption("i", "index", | |
1155 | "The output directory for the index files (defaults to \"DATABASE.lambda\").", | |
1156 | ArgParseArgument::INPUT_FILE, | |
1157 | "OUT")); | |
1158 | setValidValues(parser, "index", ".lambda"); | |
1059 | // addOption(parser, ArgParseOption("o", | |
1060 | // "output", | |
1061 | // "Index of database sequences", | |
1062 | // ArgParseArgument::OUTPUT_FILE, | |
1063 | // "OUT")); | |
1064 | // setValidValues(parser, "output", "sa fm"); | |
1159 | 1065 | |
1160 | 1066 | addOption(parser, ArgParseOption("di", "db-index-type", |
1161 | 1067 | "Suffix array or FM-index (full-text index in minute space).",
1279 | 1185 | return res; |
1280 | 1186 | |
1281 | 1187 | // Extract option values |
1282 | // getOptionValue(options.segFile, parser, "segfile"); | |
1188 | getOptionValue(options.segFile, parser, "segfile"); | |
1283 | 1189 | getOptionValue(options.algo, parser, "algorithm"); |
1284 | 1190 | if ((options.algo == "mergesort") || (options.algo == "quicksort") || (options.algo == "quicksortbuckets")) |
1285 | 1191 | { |
1295 | 1201 | getOptionValue(buffer, parser, "truncate-ids"); |
1296 | 1202 | options.truncateIDs = (buffer == "on"); |
1297 | 1203 | |
1298 | ||
1299 | getOptionValue(options.dbFile, parser, "database"); | |
1300 | if (isSet(parser, "index")) | |
1301 | getOptionValue(options.indexDir, parser, "index"); | |
1302 | else | |
1303 | options.indexDir = options.dbFile + ".lambda"; | |
1304 | ||
1305 | ||
1306 | if (fileExists(options.indexDir.c_str())) | |
1307 | { | |
1308 | std::cerr << "ERROR: An output directory already exists at " << options.indexDir << '\n' | |
1309 | << "Remove it, or choose a different location.\n"; | |
1310 | return ArgumentParser::PARSE_ERROR; | |
1311 | } | |
1312 | else | |
1313 | { | |
1314 | if (mkdir(options.indexDir.c_str(), S_IRWXU | S_IRGRP | S_IXGRP | S_IROTH | S_IXOTH)) | |
1315 | { | |
1316 | std::cerr << "ERROR: Cannot create output directory at " << options.indexDir << '\n';; | |
1317 | return ArgumentParser::PARSE_ERROR; | |
1318 | } | |
1319 | } | |
1320 | ||
1321 | 1204 | return ArgumentParser::PARSE_OK; |
1322 | 1205 | } |
1323 | 1206 | |
1328 | 1211 | int buf = 0; |
1329 | 1212 | std::string buffer; |
1330 | 1213 | |
1214 | getOptionValue(options.dbFile, parser, "database"); | |
1215 | ||
1331 | 1216 | getOptionValue(buffer, parser, "db-index-type"); |
1332 | 1217 | if (buffer == "sa") |
1333 | options.dbIndexType = DbIndexType::SUFFIX_ARRAY; | |
1334 | else if (buffer == "bifm") | |
1335 | options.dbIndexType = DbIndexType::BI_FM_INDEX; | |
1336 | else | |
1337 | options.dbIndexType = DbIndexType::FM_INDEX; | |
1218 | options.dbIndexType = 0; | |
1219 | else // if fm | |
1220 | options.dbIndexType = 1; | |
1338 | 1221 | |
1339 | 1222 | getOptionValue(buffer, parser, "program"); |
1340 | 1223 | if (buffer == "blastn") |
1389 | 1272 | |
1390 | 1273 | return ArgumentParser::PARSE_OK; |
1391 | 1274 | } |
1392 | ||
1393 | // -------------------------------------------------------------------------- | |
1394 | // Function _alphName() | |
1395 | // -------------------------------------------------------------------------- | |
1396 | 1275 | |
1397 | 1276 | constexpr const char * |
1398 | 1277 | _alphName(AminoAcid const & /**/) |
1436 | 1315 | return "dna5"; |
1437 | 1316 | } |
1438 | 1317 | |
1439 | // -------------------------------------------------------------------------- | |
1440 | // Function _indexName() | |
1441 | // -------------------------------------------------------------------------- | |
1442 | ||
1443 | inline std::string | |
1444 | _indexName(DbIndexType const t) | |
1445 | { | |
1446 | switch (t) | |
1447 | { | |
1448 | case DbIndexType::SUFFIX_ARRAY: return "suffix_array"; | |
1449 | case DbIndexType::FM_INDEX: return "fm_index"; | |
1450 | case DbIndexType::BI_FM_INDEX: return "bi_fm_index"; | |
1451 | } | |
1452 | return "ERROR_UNKNOWN_INDEX_TYPE"; | |
1453 | } | |
1454 | ||
1455 | // -------------------------------------------------------------------------- | |
1456 | // Function printOptions() | |
1457 | // -------------------------------------------------------------------------- | |
1458 | ||
1459 | 1318 | template <typename TLH> |
1460 | 1319 | inline void |
1461 | 1320 | printOptions(LambdaOptions const & options) |
1474 | 1333 | std::cout << "OPTIONS\n" |
1475 | 1334 | << " INPUT\n" |
1476 | 1335 | << " query file: " << options.queryFile << "\n" |
1477 | << " index directory: " << options.indexDir << "\n" | |
1336 | << " db file: " << options.dbFile << "\n" | |
1478 | 1337 | << " db index type: " << (TGH::indexIsFM |
1479 | 1338 | ? "FM-Index\n" |
1480 | 1339 | : "SA-Index\n") |
1537 | 1396 | << " putative-duplicates: " << (options.filterPutativeDuplicates |
1538 | 1397 | ? std::string("on") |
1539 | 1398 | : std::string("off")) << "\n" |
1540 | ||
1541 | 1399 | << " SCORING\n" |
1542 | 1400 | << " scoring scheme: " << options.scoringMethod << "\n" |
1543 | 1401 | << " score-match: " << (options.scoringMethod |
1548 | 1406 | : std::to_string(options.misMatch)) << "\n" |
1549 | 1407 | << " score-gap: " << options.gapExtend << "\n" |
1550 | 1408 | << " score-gap-open: " << options.gapOpen << "\n" |
1551 | << " EXTENSION\n"; | |
1552 | switch (options.extensionMode) | |
1553 | { | |
1554 | case LambdaOptions::ExtensionMode::AUTO: | |
1555 | std::cout | |
1556 | << " extensionMode: auto (depends on query length)\n" | |
1409 | << " EXTENSION\n" | |
1557 | 1410 | << " x-drop: " << options.xDropOff << "\n" |
1558 | 1411 | << " band: " << bandStr << "\n" |
1559 | << " [depending on the automatically chosen mode x-drop or band might get disabled.\n"; | |
1560 | break; | |
1561 | case LambdaOptions::ExtensionMode::XDROP: | |
1562 | std::cout | |
1563 | << " extensionMode: individual\n" | |
1564 | << " x-drop: " << options.xDropOff << "\n" | |
1565 | << " band: " << bandStr << "\n"; | |
1566 | break; | |
1567 | case LambdaOptions::ExtensionMode::FULL_SERIAL: | |
1568 | std::cout | |
1569 | << " extensionMode: batch, but serialized\n" | |
1570 | << " x-drop: not used\n" | |
1571 | << " band: " << bandStr << "\n"; | |
1572 | break; | |
1573 | case LambdaOptions::ExtensionMode::FULL_SIMD: | |
1574 | std::cout | |
1575 | << " extensionMode: batch with SIMD\n" | |
1576 | << " x-drop: not used\n" | |
1577 | << " band: not used\n"; | |
1578 | break; | |
1579 | } | |
1580 | std::cout << " BUILD OPTIONS:\n" | |
1412 | << " BUILD OPTIONS:\n" | |
1581 | 1413 | << " cmake_build_type: " << std::string(CMAKE_BUILD_TYPE) << "\n" |
1582 | 1414 | << " fastbuild: " |
1583 | 1415 | #if defined(FASTBUILD) |
1609 | 1441 | #else |
1610 | 1442 | << "off\n" |
1611 | 1443 | #endif |
1612 | << " seqan_simd: " | |
1613 | #if defined(SEQAN_SIMD_ENABLED) && defined(__AVX2__) | |
1614 | << "avx2\n" | |
1615 | #elif defined(SEQAN_SIMD_ENABLED) && defined(__SSE4_2__) | |
1616 | << "sse4\n" | |
1617 | #else | |
1618 | << "off\n" | |
1619 | #endif | |
1620 | 1444 | << "\n"; |
1621 | 1445 | } |
1622 | 1446 |
109 | 109 | TLocalHolder const & lH) |
110 | 110 | { |
111 | 111 | using TCElem = typename Value<TCigar>::Type; |
112 | using TGlobalHolder = typename TLocalHolder::TGlobalHolder; | |
113 | 112 | |
114 | 113 | SEQAN_ASSERT_EQ(length(m.alignRow0), length(m.alignRow1)); |
115 | 114 | |
116 | 115 | // translate positions into dna space |
117 | unsigned const transFac = qIsTranslated(TGlobalHolder::blastProgram) ? 3 : 1; | |
116 | unsigned const transFac = qIsTranslated(lH.gH.blastProgram) ? 3 : 1; | |
118 | 117 | // clips resulting from translation / frameshift are always hard clips |
119 | 118 | unsigned const leftFrameClip = std::abs(m.qFrameShift) - 1; |
120 | unsigned const rightFrameClip = qIsTranslated(TGlobalHolder::blastProgram) ? (m.qLength - leftFrameClip) % 3 : 0; | |
119 | unsigned const rightFrameClip = qIsTranslated(lH.gH.blastProgram) ? (m.qLength - leftFrameClip) % 3 : 0; | |
121 | 120 | // regular clipping from local alignment (regions outside match) can be hard or soft |
122 | 121 | unsigned const leftClip = m.qStart * transFac; |
123 | 122 | unsigned const rightClip = (length(source(m.alignRow0)) - m.qEnd) * transFac; |
192 | 191 | TLocalHolder const & lH) |
193 | 192 | { |
194 | 193 | using TCElem = typename Value<TCigar>::Type; |
195 | using TGlobalHolder = typename TLocalHolder::TGlobalHolder; | |
196 | 194 | |
197 | 195 | SEQAN_ASSERT_EQ(length(m.alignRow0), length(m.alignRow1)); |
198 | 196 | |
302 | 300 | context(globalHolder.outfile).fields = options.columns; |
303 | 301 | auto & versionString = context(globalHolder.outfile).versionString; |
304 | 302 | clear(versionString); |
305 | append(versionString, _programTagToString(TGH::blastProgram)); | |
303 | append(versionString, _programTagToString(globalHolder.blastProgram)); | |
306 | 304 | append(versionString, " 2.2.26+ [created by LAMBDA"); |
307 | 305 | if (options.versionInformationToOutputFile) |
308 | 306 | { |
319 | 317 | auto & subjIds = contigNames(context); |
320 | 318 | |
321 | 319 | // set sequence lengths |
322 | if (sIsTranslated(TGH::blastProgram)) | |
320 | if (sIsTranslated(globalHolder.blastProgram)) | |
323 | 321 | { |
324 | 322 | //TODO can we get around a copy? |
325 | 323 | subjSeqLengths = globalHolder.untransSubjSeqLengths; |
426 | 424 | inline void |
427 | 425 | myWriteRecord(TLH & lH, TRecord const & record) |
428 | 426 | { |
429 | using TGH = typename TLH::TGlobalHolder; | |
430 | 427 | if (lH.options.outFileFormat == 0) // BLAST |
431 | 428 | { |
432 | 429 | SEQAN_OMP_PRAGMA(critical(filewrite)) |
447 | 444 | for (auto & bamR : bamRecords) |
448 | 445 | { |
449 | 446 | // untranslate for sIsTranslated |
450 | if (sIsTranslated(TGH::blastProgram)) | |
447 | if (sIsTranslated(lH.gH.blastProgram)) | |
451 | 448 | { |
452 | 449 | bamR.beginPos = mIt->sStart * 3 + std::abs(mIt->sFrameShift) - 1; |
453 | 450 | if (mIt->sFrameShift < 0) |
474 | 471 | { |
475 | 472 | clear(protCigar); |
476 | 473 | // native protein |
477 | if ((TGH::blastProgram == BlastProgram::BLASTP) || (TGH::blastProgram == BlastProgram::TBLASTN)) | |
474 | if ((lH.gH.blastProgram == BlastProgram::BLASTP) || (lH.gH.blastProgram == BlastProgram::TBLASTN)) | |
478 | 475 | blastMatchOneCigar(protCigar, *mIt, lH); |
479 | else if (qIsTranslated(TGH::blastProgram)) // translated | |
476 | else if (qIsTranslated(lH.gH.blastProgram)) // translated | |
480 | 477 | blastMatchTwoCigar(bamR.cigar, protCigar, *mIt, lH); |
481 | 478 | else // BLASTN can't have protein sequence |
482 | 479 | blastMatchOneCigar(bamR.cigar, *mIt, lH); |
483 | 480 | } |
484 | 481 | else |
485 | 482 | { |
486 | if ((TGH::blastProgram != BlastProgram::BLASTP) && (TGH::blastProgram != BlastProgram::TBLASTN)) | |
483 | if ((lH.gH.blastProgram != BlastProgram::BLASTP) && (lH.gH.blastProgram != BlastProgram::TBLASTN)) | |
487 | 484 | blastMatchOneCigar(bamR.cigar, *mIt, lH); |
488 | 485 | } |
489 | 486 | // we want to include the seq |
500 | 497 | (endPosition(mIt->alignRow0) != endPosition(mPrevIt->alignRow0))); |
501 | 498 | } |
502 | 499 | |
503 | if (TGH::blastProgram == BlastProgram::BLASTN) | |
500 | if (lH.gH.blastProgram == BlastProgram::BLASTN) | |
504 | 501 | { |
505 | 502 | if (lH.options.samBamHardClip) |
506 | 503 | { |
514 | 511 | bamR.seq = source(mIt->alignRow0); |
515 | 512 | } |
516 | 513 | } |
517 | else if (qIsTranslated(TGH::blastProgram)) | |
514 | else if (qIsTranslated(lH.gH.blastProgram)) | |
518 | 515 | { |
519 | 516 | if (lH.options.samBamHardClip) |
520 | 517 | { |
573 | 570 | int8_t(mIt->sFrameShift), 'c'); |
574 | 571 | if (lH.options.samBamTags[SamBamExtraTags<>::Q_AA_SEQ]) |
575 | 572 | { |
576 | if ((TGH::blastProgram == BlastProgram::BLASTN) || (!writeSeq)) | |
573 | if ((lH.gH.blastProgram == BlastProgram::BLASTN) || (!writeSeq)) | |
577 | 574 | appendTagValue(bamR.tags, |
578 | 575 | std::get<0>(SamBamExtraTags<>::keyDescPairs[SamBamExtraTags<>::Q_AA_SEQ]), |
579 | 576 | "*", 'Z'); |