Codebase list vmatch / 03c2142
New upstream version 2.3.0+dfsg Sascha Steinbiss 4 years ago
20 changed file(s) with 0 addition(s) and 5709 deletion(s).
+0
-1
src/doc/WWW/.gitignore
0 vmweb.pdf
+0
-660
src/doc/WWW/.ispell_american
0 EMBL
1 Fasta
2 Genbank
3 Gzipped
4 Kurtz
5 SWISSPROT
6 Vmatch
7 postprocessing
8 pt
9 AtGDB
10 barleypop
11 biophys
12 cDNA
13 dataflow
14 Dataflowfig
15 DJ
16 duesseldorf
17 edu
18 Fernandes
19 Gbp
20 GCB
21 genalyzer
22 GeneChip
23 genomic
24 gif
25 GL
26 html
27 hypa
28 iastate
29 img
30 matchgraph
31 mga
32 oligo
33 patternsearch
34 PatternSearch
35 pdf
36 PDF
37 php
38 PlantGDB
39 possumsearch
40 PossumSearch
41 RescueMu
42 RSA
43 rsat
44 splicenest
45 transposons
46 virtman
47 vmatchlic
48 vmweb
49 vrac
50 Walbot
51 Xenopus
52 zmdb
53 ac
54 AMD
55 Arabidopsis
56 BarleyBase
57 bibiserv
58 bielefeld
59 Bioinformatics
60 biotools
61 cgi
62 commolbio
63 Debian
64 ESTs
65 GenBank
66 hamburg
67 Helden
68 http
69 HyPa
70 KPATH
71 kurtz
72 latexonly
73 mkvtree
74 molgen
75 mpg
76 Multimat
77 oligos
78 org
79 OSX
80 pl
81 plantgdb
82 PowerPC
83 Promide
84 Rahmann
85 Redhat
86 reputer
87 REPuter
88 Sparc
89 SpliceNest
90 src
91 SuSe
92 Sven
93 techfak
94 thaliana
95 ulb
96 vmatch
97 www
98 zbh
99 Zea
100 ASRP
101 GSSanalysis
102 laevis
103 meningioma
104 microarray
105 micromatches
106 miRNA
107 mRNA
108 mutransposon
109 Transposon
110 ABR
111 AKA
112 ALI
113 ALT
114 ALV
115 AME
116 ANG
117 AQU
118 ARN
119 Affymetrix
120 Agave
121 Apr
122 Aureobasidium
123 Azadirachta
124 BAI
125 BAL
126 BEC
127 BEH
128 BEI
129 BEL
130 BER
131 BLA
132 BLI
133 BLO
134 BOD
135 BOH
136 BOR
137 BOU
138 BOV
139 BRE
140 BRI
141 BRO
142 BRU
143 BUC
144 BUE
145 BioExtract
146 Biopieces
147 BlastP
148 CAS
149 CAV
150 CEV
151 CHA
152 CHAU
153 CHE
154 CHI
155 CHO
156 CHOU
157 CHU
158 CLA
159 CLO
160 COH
161 COL
162 CRISPRFinder
163 CRISPRdatabase
164 CRISPRfinder
165 CRISPRs
166 CRO
167 CUL
168 Chlorophyceae
169 Chs
170 CrossLink
171 Cscript
172 CurrentUsage
173 DAR
174 DAS
175 DAV
176 DEA
177 DEE
178 DES
179 DEZ
180 DHA
181 DIBO
182 DIC
183 DIJ
184 DNAVis
185 DOL
186 DOO
187 DOR
188 DUC
189 DUM
190 DUR
191 EBR
192 EIS
193 ELD
194 ELL
195 ENG
196 ERI
197 EST
198 Ecoli
199 FANTOM
200 FAU
201 FER
202 FESCH
203 FIC
204 FIE
205 FLE
206 FOF
207 FOS
208 FRE
209 Floydiella
210 Frankia
211 GAE
212 GAO
213 GAR
214 GAU
215 GEN
216 GEO
217 GER
218 GES
219 GEY
220 GIE
221 GLA
222 GLAE
223 GNF
224 GNI
225 GOL
226 GOM
227 GON
228 GOS
229 GRAE
230 GRE
231 GRI
232 GRO
233 GRU
234 GU
235 GUA
236 Genome
237 GenomeThreader
238 Gepard
239 GmbH
240 Graminella
241 HAA
242 HAB
243 HAC
244 HAK
245 HAN
246 HAR
247 HAU
248 HAZ
249 HEA
250 HED
251 HEL
252 HEP
253 HES
254 HIL
255 HIN
256 HLE
257 HOEH
258 HOF
259 HOL
260 HOM
261 HOR
262 HRI
263 HRV
264 HUL
265 HUR
266 HUS
267 HYS
268 HYT
269 Heliothis
270 IKE
271 IMM
272 Illumina
273 JAC
274 JAI
275 JEN
276 JI
277 JOH
278 JOR
279 JOU
280 KAE
281 KAM
282 KAN
283 KAR
284 KAT
285 KAW
286 KEM
287 KER
288 KEV
289 KHA
290 KLE
291 KNA
292 KOE
293 KOEN
294 KOG
295 KOL
296 KON
297 KOP
298 KOS
299 KOU
300 KOZ
301 KRI
302 KRO
303 KRU
304 KRUE
305 KUB
306 KUC
307 KUD
308 KUG
309 KUM
310 KUR
311 LAL
312 LAM
313 LAR
314 LAU
315 LAZ
316 LBNL
317 LEA
318 LEM
319 LER
320 LI
321 LIA
322 LIN
323 LIU
324 LOEW
325 LOM
326 LOR
327 LScSA
328 LU
329 LUO
330 LUS
331 Leptosira
332 MAA
333 MAL
334 MCC
335 MCL
336 MCM
337 MEH
338 MER
339 MEY
340 MIC
341 MIN
342 MIPSPlantsDB
343 MOC
344 MOE
345 MOH
346 MOL
347 MOR
348 MUL
349 MYE
350 Macronuclear
351 Maize
352 MicroRNAs
353 Mu
354 Myotis
355 NAI
356 NAO
357 NAR
358 NEL
359 NIC
360 NOU
361 NOV
362 NUR
363 NUS
364 NYG
365 Neochloris
366 OHL
367 OKA
368 OSS
369 OTI
370 OTT
371 PAU
372 PAV
373 PEC
374 PEE
375 PEL
376 PFE
377 PLE
378 PLEXdb
379 POB
380 POM
381 POU
382 PRI
383 Piwik
384 PriMUX
385 ProbeMatch
386 QPALMA
387 RAD
388 RAH
389 REE
390 REH
391 REI
392 REN
393 RIN
394 RIS
395 RIV
396 RNA
397 RNAi
398 RNAseq
399 ROS
400 RT
401 RefSeq
402 Rnnotator
403 SAF
404 SAK
405 SAR
406 SCHAA
407 SCHAEF
408 SCHAR
409 SCHAT
410 SCHEL
411 SCHIJ
412 SCHIL
413 SCHLE
414 SCHLU
415 SCHLUE
416 SCHMU
417 SCHNE
418 SCHOE
419 SCHOF
420 SCHOL
421 SCHOO
422 SCHRO
423 SCHUL
424 SCHWA
425 SCHWE
426 SCO
427 SCZ
428 SEG
429 SEI
430 SEK
431 SEV
432 SHA
433 SHI
434 SHU
435 SIE
436 SIM
437 SIMAP
438 SLE
439 SLO
440 SMA
441 SMI
442 SMY
443 SNY
444 SOE
445 SOU
446 SPA
447 SPI
448 STE
449 STEI
450 STEU
451 STO
452 STR
453 STU
454 SU
455 SUL
456 SZY
457 Schwab
458 Seidel
459 Shewanella
460 Silva
461 Slezak
462 Spirodela
463 Stefan
464 Streptococcaceae
465 TAC
466 TAF
467 TAK
468 TAL
469 TAY
470 TEM
471 THA
472 THI
473 TIK
474 TIR
475 TIS
476 TOL
477 TOR
478 TOU
479 TRI
480 TRO
481 TRU
482 TSA
483 TSU
484 TT
485 TUR
486 Tembe
487 Tetrahymena
488 Thellungiella
489 UDE
490 UPA
491 URA
492 UST
493 Univ
494 VAI
495 VAL
496 VAS
497 VAU
498 VEL
499 VER
500 VHA
501 VIN
502 VIT
503 VIV
504 VLA
505 VMatchForArabidopsis
506 VOI
507 VOS
508 VOSS
509 VRA
510 WAL
511 WEI
512 WES
513 WHA
514 WHE
515 WIC
516 WIE
517 WIL
518 WIS
519 WOD
520 WOL
521 WOR
522 WRE
523 WU
524 XU
525 Xanthomonas
526 YAN
527 YU
528 YUN
529 ZAJ
530 ZAL
531 ZAV
532 ZEM
533 ZHA
534 ZHE
535 ZHO
536 ZHU
537 ZIM
538 ZIN
539 aRNH
540 ab
541 al
542 alt
543 articlerender
544 artid
545 biomoby
546 biopieces
547 bp
548 budworm
549 cand
550 chromosomal
551 chromosome
552 chromosomes
553 colorlinks
554 com
555 contigs
556 crispr
557 crosslink
558 daytrial
559 de
560 div
561 dnavis
562 doc
563 enableLinkTracking
564 endophytes
565 epichloid
566 exome
567 exon
568 fcgi
569 fescue
570 fr
571 genmome
572 genome
573 genomes
574 genomethreader
575 getTracker
576 goltsman
577 gov
578 gsf
579 href
580 ht
581 https
582 idsite
583 indels
584 indica
585 informatik
586 interspaced
587 isotigs
588 javascript
589 jgi
590 js
591 li
592 linkcolor
593 llnl
594 lscsa
595 ltr
596 lucifugus
597 mailto
598 maize
599 meliloti
600 mer
601 mers
602 metagenome
603 metagenomes
604 miRNAs
605 microalgae
606 mips
607 mircoRNA
608 mitochondrial
609 ncRNAs
610 nigrifrons
611 nih
612 nl
613 noscript
614 novo
615 oleoabundans
616 oneidensis
617 oryzae
618 parvula
619 pesticidal
620 pigeonpea
621 piwik
622 piwikTracker
623 pkBaseURL
624 plastid
625 polyrhiza
626 preprocess
627 prj
628 probeBase
629 proteome
630 psud
631 pubmedcentral
632 pullulans
633 retroviruses
634 rnafolding
635 seq
636 siRNA
637 simap
638 sp
639 str
640 subfamilies
641 terrestris
642 tex
643 thermophila
644 tigr
645 trackPageView
646 transcriptional
647 transcriptome
648 trialbox
649 tue
650 tuebingen
651 ul
652 unescape
653 uni
654 var
655 virescens
656 weigelworld
657 wiki
658 wmd
659 zenlicensemanager
+0
-19
src/doc/WWW/Dataflowfig.tex
0 \documentclass[11pt]{article}
1 \usepackage{a4wide}
2 \pagestyle{empty}
3 \begin{document}
4 \input{Dataflow.inc}
5
6 \noindent The dataflow in \emph{Vmatch}.
7 The programs
8 are shown in ellipses. The inputs are the database
9 and the query sequences and the alphabet transformation,
10 represented by rectangles. At the center of all
11 computations the persistent index is shown. All other
12 rectangles represent the different kinds of output.
13
14 \begin{center}
15 \centerline{\box\graph}
16 \end{center}
17
18 \end{document}
+0
-99
src/doc/WWW/Makefile
0 vmwebfiles=Dataflowfig.pdf\
1 AboKurOhl2004.pdf\
2 AboKurOhl2002.pdf\
3 BreKurWal2002.pdf\
4 ChoSchleKurGie2004.pdf\
5 FitGarKucKurMyeOttSleVitZemMcc2002.pdf\
6 KurChoOhlSchleStoGie2001.pdf\
7 HoehKurOhl2002.pdf\
8 GraeStrKurSte2001.pdf\
9 BecStroHomGieKur2004.pdf\
10 KrueSczKurGie2004.pdf\
11 virtman.pdf
12
13 PAPERS=${WORK}/archive-etc/own-papers
14
15 SERVER=vmatchserver
16 WWWBASEDIR=/var/www/html
17
18 all:vmweb.pdf vmweb.tgz
19
20 vmweb.pdf:vmweb.tex introduction.inc
21 latexmk vmweb
22
23 introduction.inc:../virtman.tex introexclude
24 extractpart.pl introduction ../virtman.tex |\
25 grep -v -f introexclude |\
26 sed -e 's/\\subsection\*{The parts/\\section*{The parts/' | \
27 perl -pe 's/\\cite{ABO:KUR:OHL:2004}\.\%\%second/(Abouelhoda, Kurtz, Ohlebusch 2004)./' | \
28 perl -pe 's/\\cite{(.*)}[\.,]?/\n\\bibentry{\1}\n/' > $@
29
30 Dataflow.inc:../Dataflow.pic
31 pic -t ../Dataflow.pic > $@
32
33 index.html:vmweb.tex vmweb.pdf introduction.inc replace-header.rb replace-par.rb
34 htlatex vmweb.tex "xhtml, charset=utf-8" " -cunihtf -utf8"
35 cat vmweb.html | replace-header.rb | \
36 replace-par.rb | \
37 sed -e 's/CONTENT=/content=/' \
38 -e 's/ALT=/alt=/' > $@
39
40 validate:index.html xhtml-lat1.ent xhtml-symbol.ent xhtml-special.ent xhtml1-transitional.dtd
41 cat index.html | sed -e 's/\"http:\/\/www.w3.org\/TR\/xhtml1\/DTD\//\"/' > .tmp.html
42 xmllint --valid --noout .tmp.html
43 rm -f .tmp.html
44
45 # also run validation on https://validator.w3.org/
46
47 Dataflowfig.dvi:Dataflowfig.tex Dataflow.inc
48 latex Dataflowfig.tex
49
50 Dataflowfig.pdf:Dataflowfig.dvi
51 dvipdf $<
52
53 vmweb.tgz:matchgraph.gif ${vmwebfiles} index.html vmweb.pdf
54 tar -cvzf $@ matchgraph.gif ${vmwebfiles} index.html vmweb.pdf
55
56 virtman.pdf:
57 cp ../virtman.pdf .
58
59 AboKurOhl2004.pdf:
60 cp ${PAPERS}/$@ .
61
62 AboKurOhl2002.pdf:
63 cp ${PAPERS}/$@ .
64
65 BreKurWal2002.pdf:
66 cp ${PAPERS}/$@ .
67
68 ChoSchleKurGie2004.pdf:
69 cp ${PAPERS}/$@ .
70
71 FitGarKucKurMyeOttSleVitZemMcc2002.pdf:
72 cp ${PAPERS}/$@ .
73
74 KurChoOhlSchleStoGie2001.pdf:
75 cp ${PAPERS}/$@ .
76
77 HoehKurOhl2002.pdf:
78 cp ${PAPERS}/$@ .
79
80 GraeStrKurSte2001.pdf:
81 cp ${PAPERS}/$@ .
82
83 KrueSczKurGie2004.pdf:
84 cp ${PAPERS}/$@ .
85
86 BecStroHomGieKur2004.pdf:
87 cp ${PAPERS}/$@ .
88
89 installwww:
90 rsync -rv index.html vmweb.css download.html virtman.pdf vmweb.pdf matchgraph.gif distributions $(SERVER):$(WWWBASEDIR)
91
92 clean:
93 rm -f *.toc *.ilg *.out *.idx *.ind *.dvi *.ps *.log *.aux
94 rm -f *.bbl *.blg
95 rm -f vmweb.4ct vmweb.xref vmweb.tmp vmweb.lg vmweb.idv vmweb.html vmweb.4tc vmweb.fls vmweb.fdb_latexmk
96 rm -f comment.cut introduction.inc
97 rm -f Dataflow.inc
98 rm -f ${vmwebfiles}
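The `introduction.inc` rule in the Makefile above ends with a `perl -pe` substitution that turns each `\cite{KEY}` (plus an optional trailing period or comma) into a `\bibentry{KEY}` on its own line. The following is a minimal Python sketch of that single step, for illustration only; the function name `cite_to_bibentry` is hypothetical and not part of the repository.

```python
import re

# Same pattern as the Makefile's last perl substitution: a \cite{...} group,
# optionally followed by a period or a comma.
CITE_RE = re.compile(r"\\cite\{(.*)\}[.,]?")


def cite_to_bibentry(line: str) -> str:
    """Replace the first \\cite{KEY} on a line by \\bibentry{KEY} on its own line.

    Hypothetical helper; it mirrors perl's s/// without the /g flag,
    so at most one substitution is made per line.
    """
    return CITE_RE.sub(
        lambda m: "\n\\bibentry{" + m.group(1) + "}\n", line, count=1
    )


if __name__ == "__main__":
    print(cite_to_bibentry("algorithms of \\cite{KUR:2002B}.\n"), end="")
```

Because `(.*)` is greedy, a line containing two `\cite` commands would be captured as one group in both the Perl original and this sketch, so the approach assumes at most one citation per input line.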
+0
-1
src/doc/WWW/PUSH-cmd.txt
0 git push kurtz@genometools.org:/home/kurtz/lscsa/vstree.git sk
+0
-53
src/doc/WWW/download.html
0 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
1 <html>
2
3 <head>
4 <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
5 <title>Download Vmatch</title>
6 <link rel="stylesheet" type="text/css" href="vmweb.css">
7 </head>
8
9 <body>
10
11 <h1>Download <i>Vmatch</i></h1>
12
13 <p>
14 By downloading Vmatch you agree to the following:
15 </p>
16
17 <p>
18 <code>
19 Vmatch is provided on an <strong>AS IS</strong> basis. The developers do not warrant its validity
20 of performance, efficiency, or suitability, nor do they warrant that Vmatch is
21 free from errors. All warranties, including without limitation, any warranty or
22 merchantability or fitness for a particular purpose, are hereby excluded.
23 </code>
24 </p>
25
26 <h2>Linux</h2>
27 <a href="distributions/vmatch-2.3.0-Linux_x86_64-64bit.tar.gz">vmatch-2.3.0-Linux_x86_64-64bit.tar.gz</a> (2017-06-12)<br/>
28 <a href="distributions/vmatch-2.3.0-Linux_i386-32bit.tar.gz">vmatch-2.3.0-Linux_i386-32bit.tar.gz</a> (2017-06-12)<br/>
29
30 <h2>Mac OS</h2>
31 <a href="distributions/vmatch-2.3.0-Darwin_i386-64bit.tar.gz">vmatch-2.3.0-Darwin_i386-64bit.tar.gz</a> (2017-06-12)<br/>
32 <a href="distributions/vmatch-2.3.0-Darwin_i386-32bit.tar.gz">vmatch-2.3.0-Darwin_i386-32bit.tar.gz</a> (2017-06-12)<br/>
33
34 <h2>Windows</h2>
35 <a href="distributions/vmatch-2.3.0-Windows_i686-64bit.zip">vmatch-2.3.0-Windows_i686-64bit.zip</a> (2017-06-12)<br/>
36 <a href="distributions/vmatch-2.3.0-Windows_i686-32bit.zip">vmatch-2.3.0-Windows_i686-32bit.zip</a> (2017-06-12)<br/>
37
38 <!-- Piwik -->
39 <script type="text/javascript">
40 var pkBaseURL = "https://zenlicensemanager.com/piwik/";
41 document.write(unescape("%3Cscript src='" + pkBaseURL + "piwik.js' type='text/javascript'%3E%3C/script%3E"));
42 </script><script type="text/javascript">
43 try {
44 var piwikTracker = Piwik.getTracker(pkBaseURL + "piwik.php", 3);
45 piwikTracker.trackPageView();
46 piwikTracker.enableLinkTracking();
47 } catch( err ) {}
48 </script><noscript><p><img src="https://zenlicensemanager.com/piwik/piwik.php?idsite=3" style="border:0" alt="" /></p></noscript>
49 <!-- End Piwik Tracking Tag -->
50
51 </body>
52 </html>
+0
-85
src/doc/WWW/extractpart.pl
0 #!/usr/bin/env perl
1
2 # extract the parts enclosed in %%%BEGIN{tag} ... %%%END{tag} from a documentation file
3
4 use strict;
5 use warnings;
6
7 my $numofargs = scalar @ARGV;
8
9 if($numofargs <= 1)
10 {
11 print STDERR "Usage: $0 <tags to be selected> <filename>\n";
12 exit 1;
13 }
14
15 my(%tagtab) = ();
16
17 my $filename = $ARGV[$numofargs-1];
18
19 for(my $i=0; $i<$numofargs-1; $i++)
20 {
21 my $key = $ARGV[$i];
22 $tagtab{$key} = 1;
23 }
24
25 # Get file data
26
27 my @filecontents = get_file_data($filename);
28
29 my $currenttagtext = '';
30 my $currenttag = '';
31 my $intag = 0; # inside an option (which can be multiline)
32
33 for my $line (@filecontents)
34 {
35 if($intag)
36 {
37 if($line =~ /^\%\%\%END{([a-z]+)/)
38 {
39 if($1 ne $currenttag)
40 {
41 print STDERR "$0: BEGIN{$currenttag} ends with END{$1}\n";
42 exit 1;
43 }
44 print STDERR "\%\%\%matched END with \"$currenttag\"\n";
45 $intag = 0;
46 print $currenttagtext, "\n";
47 $currenttagtext = '';
48 $currenttag = '';
49 } else
50 {
51 $currenttagtext .= $line;
52 }
53 } else
54 {
55 if($line =~ /^\%\%\%BEGIN{([a-z]+)}/)
56 {
57 print STDERR "\%\%\%matched BEGIN with \"$1\"\n";
58 if(exists $tagtab{$1})
59 {
60 $intag = 1;
61 $currenttag = $1;
62 }
63 }
64 }
65 }
66
67 exit 0;
68
69 # read all lines of the given file into a list;
70 # exit with an error message if the file cannot be opened
71
72 sub get_file_data
73 {
74 my($filename) = @_;
75
76 unless(open(GET_FILE_DATA, $filename))
77 {
78 print STDERR "$0: Cannot open file $filename\n";
79 exit 1;
80 }
81 my @filedata = <GET_FILE_DATA>;
82 close GET_FILE_DATA;
83 return @filedata;
84 }
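The tag extraction performed by `extractpart.pl` above can be sketched in Python as follows. This is an illustrative re-implementation under stated assumptions, not code from the repository: the function name `extract_parts` is hypothetical, it returns the selected blocks instead of printing them, and it raises an exception where the Perl script exits with an error.

```python
import re

# Marker patterns copied from the Perl script; as in the original,
# the END pattern does not require the closing brace.
BEGIN_RE = re.compile(r"^%%%BEGIN\{([a-z]+)\}")
END_RE = re.compile(r"^%%%END\{([a-z]+)")


def extract_parts(lines, wanted_tags):
    """Return the text of each %%%BEGIN{tag} ... %%%END{tag} block with a wanted tag."""
    blocks = []
    current_tag = None   # tag of the block being collected, if any
    current_text = []    # lines collected for that block
    for line in lines:
        if current_tag is not None:
            end = END_RE.match(line)
            if end:
                if end.group(1) != current_tag:
                    raise ValueError(
                        f"BEGIN{{{current_tag}}} ends with END{{{end.group(1)}}}"
                    )
                blocks.append("".join(current_text))
                current_tag, current_text = None, []
            else:
                current_text.append(line)
        else:
            begin = BEGIN_RE.match(line)
            if begin and begin.group(1) in wanted_tags:
                current_tag = begin.group(1)
    return blocks


if __name__ == "__main__":
    sample = [
        "preamble\n",
        "%%%BEGIN{introduction}\n",
        "Vmatch is a versatile tool.\n",
        "%%%END{introduction}\n",
    ]
    for block in extract_parts(sample, {"introduction"}):
        print(block)
```

As in the Perl script, blocks whose tag was not requested are skipped entirely, and marker lines themselves are never part of the output.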
+0
-1980
src/doc/WWW/index.html
0 <?xml version="1.0" encoding="utf-8" ?>
1 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
2 "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
3 <!--http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd-->
4 <html xmlns="http://www.w3.org/1999/xhtml"
5 >
6 <head><title>The Vmatch large scale sequence analysis software</title>
7 <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
8 <meta name="generator" content="TeX4ht (http://www.tug.org/tex4ht/)" />
9 <meta name="originator" content="TeX4ht (http://www.tug.org/tex4ht/)" />
10 <!-- xhtml,charset=utf-8,html -->
11 <meta name="src" content="vmweb.tex" />
12 <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"/>
13 <meta name="description" content="The Vmatch large scale sequence analysis
14 software is a versatile software tool for efficiently solving large scale sequence matching tasks."/>
15 <meta name="keywords" content="sequence analysis, sequence mapping, BLAST, bioinformatics, computational biology"/>
16 <meta http-equiv="Content-Style-Type" content="text/css"/>
17 <link rel="stylesheet" type="text/css" href="vmweb.css" />
18 </head><body
19 >
20 <div class="maketitle">
21
22
23
24
25
26
27
28 <h1 align="center" class="titleHead">The Vmatch large scale sequence analysis
29 software</h1>
30 <div align="center" class="author" ><span
31 class="ptmr7t-x-x-144">Stefan Kurtz</span></div>
32 <br />
33 <div align="center" class="date" ><span
34 class="ptmr7t-x-x-144">June 15, 2017</span></div>
35 </div>
36 <!--l. 61--> <br/> <center> <img src="matchgraph.gif" alt="show matches of different sizes in a matchgraph"/> </center> <div id="downloadbox"> <ul> <li><a href="download.html">Download <i>Vmatch</i>!</a></li> </ul> </div>
37 <!--l. 63--><p class="noindent" >This is the website for <span
38 class="ptmri7t-x-x-120">Vmatch</span>, a versatile software tool for efficiently solving large
39 scale sequence matching tasks. <span
40 class="ptmri7t-x-x-120">Vmatch </span>subsumes the software tool <a
41 href="http://bibiserv.techfak.uni-bielefeld.de/reputer" >REPuter</a>, but is
42 much more general, with a very flexible user interface, and improved space and time
43 requirements. <a href="vmweb.pdf">Here</a> is a printable version of this HTML-page in PDF.
44 </p>
45 <h3 class="likesectionHead"><a
46 id="x1-1000"></a>Features of <span
47 class="ptmri7t-x-x-120">Vmatch</span></h3>
48 <!--l. 76--><p class="noindent" >The <a
49 href="virtman.pdf" ><span
50 class="ptmri7t-x-x-120">Vmatch</span>-manual</a> gives many examples of how to use <span
51 class="ptmri7t-x-x-120">Vmatch</span>. Here are the
52 program&#x2019;s most important features.
53 </p><!--l. 3--><p class="noindent" >
54 </p>
55
56
57
58 <h4 class="likesubsectionHead"><a
59 id="x1-2000"></a>Persistent index</h4>
60 <!--l. 4--><p class="noindent" >Usually, in a large scale matching problem, extensive portions of the sequences under
61 consideration are static, i.e. they do not change much over time. Therefore it makes
62 sense to preprocess this static data to extract information from it and to store this in a
63 structured manner, allowing efficient searches. <span
64 class="ptmri7t-x-x-120">Vmatch </span>does exactly this: it
65 preprocesses a set of sequences into an index structure. This is stored as a collection of
66 several files constituting the persistent index. The index efficiently represents all
67 substrings of the preprocessed sequences and, unlike many other sequence
68 comparison tools, allows matching tasks to be solved in time <span
69 class="ptmri7t-x-x-120">independent </span>of
70 the size of the index. Different matching tasks require different parts of the
71 index, but only the required parts of the index are accessed during the matching
72 process.
73 </p><!--l. 21--><p class="noindent" >
74 </p>
75 <h4 class="likesubsectionHead"><a
76 id="x1-3000"></a>Alphabet independence</h4>
77 <!--l. 22--><p class="noindent" >Most software tools for sequence analysis are restricted to DNA and/or protein
78 sequences. In contrast, <span
79 class="ptmri7t-x-x-120">Vmatch </span>can process sequences over any user defined alphabet
80 not larger than 250 symbols. <span
81 class="ptmri7t-x-x-120">Vmatch </span>fully implements the concept of <span
82 class="ptmri7t-x-x-120">symbol</span>
83 <span
84 class="ptmri7t-x-x-120">mappings</span>, denoting alphabet transformations. These allow the user to specify that
85 different characters in the input sequences should be considered identical in
86 the matching process. This feature is used to group similar amino acids, for
87 example.
88 </p><!--l. 31--><p class="noindent" >
89 </p>
90 <h4 class="likesubsectionHead"><a
91 id="x1-4000"></a>Versatility</h4>
92 <!--l. 32--><p class="noindent" ><span
93 class="ptmri7t-x-x-120">Vmatch </span>allows a multitude of different matching tasks to be solved using the
94 persistent index. Every matching task is basically characterized by (1) the <span
95 class="ptmri7t-x-x-120">kind</span>
96 <span
97 class="ptmri7t-x-x-120">of sequences </span>to be matched, (2) the <span
98 class="ptmri7t-x-x-120">kind of matches </span>sought, (3) additional
99 <span
100 class="ptmri7t-x-x-120">constraints </span>on the matches, and (4) the <span
101 class="ptmri7t-x-x-120">kind of postprocessing </span>to be done with the
102 matches.
103
104
105
106 </p><!--l. 39--><p class="noindent" >In the standard case, <span
107 class="ptmri7t-x-x-120">Vmatch </span>matches sequences over the same alphabet. Additionally,
108 DNA sequences can be matched against a protein sequence index in all six reading
109 frames. Finally, DNA sequences can be translated in all six reading frames and
110 compared against themselves.
111 </p><!--l. 44--><p class="noindent" >Where appropriate, <span
112 class="ptmri7t-x-x-120">Vmatch </span>can compute the following kinds of matches, using
113 state-of-the-art algorithms:
114 </p>
115 <ul class="itemize1">
116 <li class="itemize">maximal and supermaximal repeats using the algorithms of <a
117 id="XABO:KUR:OHL:2004"></a>M.I.
118 Abouelhoda, S. Kurtz, and E. Ohlebusch. Replacing suffix trees with
119 enhanced suffix arrays. <span
120 class="ptmri7t-x-x-120">Journal of Discrete Algorithms</span>, 2:53–86, 2004
121 </li>
122 <li class="itemize">branching tandem repeats using the algorithm of <a
123 id="XABO:KUR:OHL:2002"></a>M.I. Abouelhoda,
124 S. Kurtz, and E. Ohlebusch. The enhanced suffix array and its applications
125 to genome analysis. In <span
126 class="ptmri7t-x-x-120">Proceedings of the Second Workshop on Algorithms</span>
127 <span
128 class="ptmri7t-x-x-120">in Bioinformatics</span>, pages 449–463. Lecture Notes in Computer Science
129 2452, Springer-Verlag, 2002
130 </li>
131 <li class="itemize">maximal (unique) substring matches using the algorithms of <a
132 id="XKUR:2002B"></a>S. Kurtz. A
133 Time and Space Efficient Algorithm for the Substring Matching Problem,
134 2002
135 </li>
136 <li class="itemize">complete matches using the algorithms of <a
137 id="XMAN:MYE:1993"></a>U. Manber and E.W. Myers.
138 Suffix Arrays: A New Method for On-Line String Searches. <span
139 class="ptmri7t-x-x-120">SIAM Journal</span>
140 <span
141 class="ptmri7t-x-x-120">on Computing</span>, 22(5):935–948, 1993 and [<a
142 href="#XMYE:1999">86</a>]
143 </li></ul>
144 <!--l. 69--><p class="noindent" >To compute degenerate substring matches or degenerate repeats, each kind
145 of match (with the exception of tandem repeats and complete matches) can
146 be taken as an exact seed and extended by either of two different strategies:
147 </p>
148 <ul class="itemize1">
149 <li class="itemize">the <span
150 class="ptmri7t-x-x-120">maximum error </span>extension strategy, as described in
151
152
153
154 <!--l. 77--><p class="noindent" ><a
155 id="XKUR:CHO:OHL:SCHLE:STO:GIE:2001"></a>S. Kurtz, J.V. Choudhuri, E. Ohlebusch, C. Schleiermacher, J. Stoye, and
156 R. Giegerich. REPuter: The manifold applications of repeat analysis on
157 a genomic scale. <span
158 class="ptmri7t-x-x-120">Nucleic Acids Res.</span>, 29(22):4633–4642, 2001 for repeat
159 detection,
160 </p></li>
161 <li class="itemize">the <span
162 class="ptmri7t-x-x-120">greedy </span>extension strategy of <a
163 id="XZHA:SCHWA:WAG:MIL:2000"></a>Z. Zhang, S. Schwartz, L. Wagner,
164 and W. Miller. A Greedy Algorithm for Aligning DNA Sequences.
165 <span
166 class="ptmri7t-x-x-120">J.</span><span
167 class="ptmri7t-x-x-120"> Comp.</span><span
168 class="ptmri7t-x-x-120"> Biol.</span>, 7(1/2):203–214, 2000
169 </li></ul>
170 <!--l. 84--><p class="noindent" >Matches can be selected according to their length, their E-value, their identity value, or
171 match score.
172 </p><!--l. 87--><p class="noindent" >In the standard case, a match is displayed as an alignment including positional
173 information. Alternatively, a match can directly be postprocessed in different
174 ways:
175 </p>
176 <ul class="itemize1">
177 <li class="itemize"><span
178 class="ptmri7t-x-x-120">inverse output</span>, i.e. reporting of substrings <span
179 class="ptmri7t-x-x-120">not </span>covered by a match.
180 </li>
181 <li class="itemize"><span
182 class="ptmri7t-x-x-120">masking </span>of substrings covered by a match.
183 </li>
184 <li class="itemize"><span
185 class="ptmri7t-x-x-120">clustering </span>of sequences according to the matches found.
186 </li>
187 <li class="itemize"><span
188 class="ptmri7t-x-x-120">chaining </span>of matches, i.e. finding optimal subsets of matches which do not
189 cross, using the algorithms described in
190 <!--l. 104--><p class="noindent" ><a
191 id="XABO:OHL:2003"></a>M.I. Abouelhoda and E. Ohlebusch. A Local Chaining Algorithm
192 and its Applications in Comparative Genomics. In <span
193 class="ptmri7t-x-x-120">Proc. 3rd Worksh.</span>
194 <span
195 class="ptmri7t-x-x-120">Algorithms in Bioinformatics (WABI 2003)</span>, number 2812 in Lecture Notes
196 in Bioinformatics, pages 1–16. Springer-Verlag, 2003
197 </p></li>
198 <li class="itemize"><span
199 class="ptmri7t-x-x-120">clustering </span>of matches according to pairwise sequence similarities computed
200
201
202
203 by the dynamic programming algorithm of <a
204 id="XUKK:1985A"></a>E. Ukkonen. Algorithms for
205 Approximate String Matching. <span
206 class="ptmri7t-x-x-120">Information and Control</span>, 64:100–118, 1985
207 </li>
208 <li class="itemize"><span
209 class="ptmri7t-x-x-120">clustering </span>of matches according to the positions where they occur, following
210 the approach of
211 <!--l. 115--><p class="noindent" ><a
212 id="XVOL:HAA:SAL:2001"></a>N. Volfovsky,
213 B.J. Haas, and S.L. Salzberg. A Clustering Method for Repeat Analysis in
214 DNA Sequences. <span
215 class="ptmri7t-x-x-120">Genome Biology</span>, 2(8):research0027.1–0027.11, 2001
216 </p>
217 </li></ul>
218 <!--l. 119--><p class="noindent" >
219 </p>
220 <h4 class="likesubsectionHead"><a
221 id="x1-5000"></a>Efficient algorithms and data structures</h4>
222 <!--l. 120--><p class="noindent" ><span
223 class="ptmri7t-x-x-120">Vmatch </span>is based on enhanced suffix arrays as described by Abouelhoda, Kurtz &#x0026; Ohlebusch,
224 2004. This data structure has been shown to be as powerful as suffix trees, with the
225 advantage of a reduced space requirement and reduced processing time. Careful
226 implementation of the algorithms and data structures incorporated in <span
227 class="ptmri7t-x-x-120">Vmatch</span>
228 has led to exceedingly fast and robust software, allowing very large sequence
229 sets to be processed quickly. The 32-bit version of <span
230 class="ptmri7t-x-x-120">Vmatch </span>can process up to
231 400 million symbols, if enough memory is available. For large server class
232 machines (e.g. SUN-Sparc/Solaris, Intel Xeon/Linux, Compaq-Alpha/Tru64)
233 <span
234 class="ptmri7t-x-x-120">Vmatch </span>is available as a 64-bit version, enabling gigabytes of sequences to be
235 processed.
236 </p><!--l. 138--><p class="noindent" >
237 </p>
238 <h4 class="likesubsectionHead"><a
239 id="x1-6000"></a>Flexible input format</h4>
240 <!--l. 139--><p class="noindent" >The most common formats for input sequences (Fasta, Genbank, EMBL, and
241 SWISSPROT) are accepted. The user does not have to specify the input format. It is
242
243
244
245 automatically recognized. All input files can contain an arbitrary number of sequences.
246 Gzip-compressed inputs are accepted.
247 </p><!--l. 145--><p class="noindent" >
248 </p>
249 <h4 class="likesubsectionHead"><a
250 id="x1-7000"></a>Customized output and match selection</h4>
251 <!--l. 146--><p class="noindent" ><span
252 class="ptmri7t-x-x-120">Vmatch</span>&#x2019;s output can be parsed by other programs easily. Furthermore, several options
253 allow for its customization. XML output is available and new output formats can easily
254 be incorporated without changing <span
255 class="ptmri7t-x-x-120">Vmatch</span>&#x2019;s program code. Certain matches can easily
256 be selected by user defined criteria, without intermediate output and subsequent
257 parsing.
258 </p><!--l. 154--><p class="noindent" >
259 </p>
260 <h3 class="likesectionHead"><a
261 id="x1-8000"></a>The parts of Vmatch</h3>
262 <!--l. 155--><p class="noindent" >Up until now we have referred to <span
263 class="ptmri7t-x-x-120">Vmatch </span>as a collection of programs. In the following
264 we use the same name, <span
265 class="cmtt-12">vmatch </span>(in typewriter font), for the most important
266 program in this collection. Besides <span
267 class="cmtt-12">vmatch</span>, there are the following programs
268 available:
269 </p><ol class="enumerate1" >
270 <li
271 class="enumerate" id="x1-8002x1"><span
272 class="cmtt-12">mkvtree </span>constructs the persistent index and stores it in files.
273 </li>
274 <li
275 class="enumerate" id="x1-8004x2"><span
276 class="cmtt-12">mkdna6idx </span>constructs an index for a DNA sequence after translating it in
277 all six reading frames.
278 </li>
279 <li
280 class="enumerate" id="x1-8006x3"><span
281 class="cmtt-12">vseqinfo </span>delivers information about indexed database sequences.
282 </li>
283 <li
284 class="enumerate" id="x1-8008x4"><span
285 class="cmtt-12">vstree2tex </span>outputs a representation of the index in <span class="LATEX">L<span class="A">A</span><span class="TEX">T<span
286 class="E">E</span>X</span></span>-format. It can
287 be used, for example, for educational or debugging purposes.
288
289
290
291 </li>
292 <li
293 class="enumerate" id="x1-8010x5"><span
294 class="cmtt-12">vseqselect </span>selects indexed sequences satisfying specific criteria.
295 </li>
296 <li
297 class="enumerate" id="x1-8012x6"><span
298 class="cmtt-12">vsubseqselect </span>selects substrings of a specified length range from an
299 index.
300 </li>
301 <li
302 class="enumerate" id="x1-8014x7"><span
303 class="cmtt-12">vmigrate.sh </span>converts an index from big endian to little endian
304 architectures, or vice versa.
305 </li>
306 <li
307 class="enumerate" id="x1-8016x8"><span
308 class="cmtt-12">vmatchselect </span>sorts and selects matches delivered by <span
309 class="cmtt-12">vmatch</span>.
310 </li>
311 <li
312 class="enumerate" id="x1-8018x9"><span
313 class="cmtt-12">chain2dim </span>computes optimal chains of matches from files in
314 <span
315 class="ptmri7t-x-x-120">Vmatch</span>-format.
316 </li>
317 <li
318 class="enumerate" id="x1-8020x10"><span
319 class="cmtt-12">matchcluster </span>computes clusters of matches from files in <span
320 class="ptmri7t-x-x-120">Vmatch</span>-format.</li></ol>
321 <!--l. 85--><p class="noindent" > <a href="Dataflowfig.pdf">Here</a> is an overview of the dataflow in <i>Vmatch</i>.
322 </p><!--l. 87--><p class="noindent" >
323 </p>
324 <h3 class="likesectionHead"><a
325 id="x1-9000"></a>Related tools</h3>
326 <!--l. 88--><p class="noindent" >There are several tools which are based on the persistent index of <span
327 class="ptmri7t-x-x-120">Vmatch</span>:
328 </p><!--l. 91--><p class="noindent" >
329 </p><dl class="description"><dt class="description">
330 <span
331 class="ptmb7t-x-x-120">Genalyzer</span> </dt><dd
332 class="description">is a graphical user interface to visualize the output of <span
333 class="ptmri7t-x-x-120">Vmatch </span>in the form
334 of a match graph. For details see
335 <!--l. 97--><p class="noindent" ><a
336 id="XCHO:SCHLE:KUR:GIE:2004"></a>J.V. Choudhuri, C. Schleiermacher, S. Kurtz, and R. Giegerich.
337 Genalyzer: Interactive visualization of sequence similarities between entire
338 genomes. <span
339 class="ptmri7t-x-x-120">Bioinformatics</span>, 20:1964–1965, 2004
340 </p><!--l. 99--><p class="noindent" >Genalyzer is not available any more.
341
342
343
344 </p></dd><dt class="description">
345 <a
346 href="http://bibiserv.techfak.uni-bielefeld.de/mga/" ><span
347 class="ptmb7t-x-x-120">MGA</span></a> </dt><dd
348 class="description">is a program to compute multiple alignments of complete genomes. For
349 details see
350 <!--l. 104--><p class="noindent" ><a
351 id="XHOEH:KUR:OHL:2002"></a>M. Höhl, S. Kurtz, and E. Ohlebusch. Efficient multiple genome
352 alignment. <span
353 class="ptmri7t-x-x-120">Bioinformatics</span>, 18(Suppl. 1):S312–S320, 2002
354 </p></dd><dt class="description">
355 <span
356 class="ptmb7t-x-x-120">Multimat</span> </dt><dd
357 class="description">is a program to compute multiple exact matches between three or more
358 genome size sequences. For details see
359 <!--l. 108--><p class="noindent" ><a
360 id="XOHL:KUR:2008"></a>E. Ohlebusch and S. Kurtz. Space efficient computation of rare
361 maximal exact matches between multiple sequences. <span
362 class="ptmri7t-x-x-120">J.</span><span
363 class="ptmri7t-x-x-120"> Comp.</span><span
364 class="ptmri7t-x-x-120"> Biol.</span>,
365 15(4):357–377, 2008
366 </p><!--l. 110--><p class="noindent" >Please contact <a
367 href="http://www.zbh.uni-hamburg.de/kurtz" >Stefan Kurtz</a> if you are interested in using Multimat.
368 </p></dd><dt class="description">
369 <a
370 href="http://bibiserv.techfak.uni-bielefeld.de/possumsearch/" ><span
371 class="ptmb7t-x-x-120">PossumSearch</span></a> </dt><dd
372 class="description">is a program to search for position specific scoring matrices. For
373 details, see
374 <!--l. 118--><p class="noindent" ><a
375 id="XBEC:HOM:GIE:KUR:2006"></a>M. Beckstette, R. Homann, R. Giegerich, and S. Kurtz. Fast index based
376 algorithms and software for matching position specific scoring matrices.
377 <span
378 class="ptmri7t-x-x-120">BMC Bioinformatics</span>, 7:389, 2006
379 </p></dd><dt class="description">
380 </dt><dd
381 class="description">
382 </dd><dt class="description">
383 <a
384 href="http://www.genomethreader.org/" ><span
385 class="ptmb7t-x-x-120">GenomeThreader</span></a> </dt><dd
386 class="description">is a software tool to compute gene structure predictions. The
387 gene structure predictions are calculated using a similarity-based approach
388 where additional cDNA/EST and/or protein sequences are used to predict
389 gene structures via spliced alignments. <span
390 class="ptmri7t-x-x-120">GenomeThreader </span>uses the matching
391 capabilities of <span
392 class="ptmri7t-x-x-120">Vmatch </span>to efficiently map the reference sequence to a
393 genomic sequence. For details, see
394 <!--l. 128--><p class="noindent" ><a
395 id="XGRE:BRE:SPA:KUR:2005"></a>G. Gremme, V. Brendel, M.E. Sparks, and S. Kurtz. Engineering a
396 software tool for gene prediction in higher organisms. <span
397 class="ptmri7t-x-x-120">Information and</span>
398 <span
399 class="ptmri7t-x-x-120">Software Technology</span>, 47(15):965–978, 2005
400 </p></dd><dt class="description">
401 </dt><dd
402 class="description">
403
404
405
406 </dd><dt class="description">
407 <a
408 href="http://www.biopieces.org/" ><span
409 class="ptmb7t-x-x-120">Biopieces</span></a> </dt><dd
410 class="description">is a collection of bioinformatics tools that can be pieced together
411 in a very easy and flexible manner to perform both simple and
412 complex tasks. Some Biopieces depend on <span
413 class="ptmri7t-x-x-120">Vmatch</span>. For details see
414 <a
415 href="http://www.biopieces.org/" class="url" ><span
416 class="cmtt-12">http://www.biopieces.org/</span></a>.</dd></dl>
417 <!--l. 139--><p class="noindent" > <a name="CurrentUsage"/>
418 </p>
419 <h3 class="likesectionHead"><a
420 id="x1-10000"></a>Previous and Current Usages</h3>
421 <!--l. 142--><p class="noindent" >We provide an annotated bibliography listing papers which applied <span
422 class="ptmri7t-x-x-120">Vmatch </span>and briefly
423 describe the tasks for which <span
424 class="ptmri7t-x-x-120">Vmatch </span>was used. We omit our own papers. The references
425 were collected by a <a
426 href="https://scholar.google.de/scholar?q=Vmatch+AND+Kurtz+OR+www.vmatch.de" >search in Google Scholar</a> (which, as of Jan 2, 2016, retrieved 397
427 results).
428 </p><!--l. 149--><p class="noindent" >
429 </p>
430 <h4 class="likesubsectionHead"><a
431 id="x1-11000"></a>Usages in Plant Genome Research</h4>
432 <!--l. 150--><p class="noindent" >
433 </p><ol class="enumerate1" >
434 <li
435 class="enumerate" id="x1-11002x1"><a
436 id="XBRE:KUR:WAL:2002"></a>V. Brendel, S. Kurtz, and V. Walbot. Comparative genomics of
437 Arabidopsis and Maize: Prospects and limitations. <span
438 class="ptmri7t-x-x-120">Genome Biology</span>,
439 3(3):reviews1005.1–1005.6, 2002
440 <!--l. 153--><p class="noindent" >In this work <span
441 class="ptmri7t-x-x-120">Vmatch </span>was used to compute a non-redundant set from a
442 large collection of protein sequences from Zea-Maize.
443 </p><!--l. 155--><p class="noindent" >Similar applications are described in
444 </p><!--l. 157--><p class="noindent" ><a
445 id="XDON:ROY:FRE:WAL:BRE:2003"></a>Q. Dong, L. Roy, M. Freeling, V. Walbot, and V. Brendel. ZmDB, an
446 integrated Database for Maize Genome Research. <span
447 class="ptmri7t-x-x-120">Nucleic Acids Res.</span>,
448 31:244–247, 2003.
449
450
451
452 </p></li>
453 <li
454 class="enumerate" id="x1-11004x2">PLEXdb is a database for gene expression resources for plants and plant
455 pathogens, see
456 <!--l. 166--><p class="noindent" ><a
457 id="XDAS:VAN:HON:WIS:DIC:2012"></a>S. Dash, J. Van Hemert, L. Hong, R. P. Wise, and J. A. Dickerson.
458 PLEXdb: gene expression resources for plants and plant pathogens. <span
459 class="ptmri7t-x-x-120">Nucleic</span>
460 <span
461 class="ptmri7t-x-x-120">Acids Res.</span>, 40(Database issue):D1194–1201, Jan 2012
462 </p><!--l. 168--><p class="noindent" >PLEXdb provides a <span
463 class="ptmri7t-x-x-120">Vmatch</span>-based <a
464 href="http://www.plantgdb.org/cgi-bin/prj/PLEXdb/ProbeMatch.pl" >web-service</a> to match PLEXdb probes.
465 </p></li>
466 <li
467 class="enumerate" id="x1-11006x3">The assembly of the Arabidopsis thaliana genome from 2004 (GenBank
468 entries of 2/19/04) contained vector sequence contaminations. For example,
469 region 3 617 880 to 3 625 027 of chromosome II contained a cloning vector.
470 <span
471 class="ptmri7t-x-x-120">Vmatch </span>was used to detect the vector contamination, see <a
472 href="http://www.plantgdb.org/AtGDB/Annotation/vector.php" >here</a>.
473 </li>
474 <li
475 class="enumerate" id="x1-11008x4"><a
476 id="XDON:LAW:SCHLUE:WIL:KUR:LUS:BRE:2005"></a>Q. Dong, C.J. Lawrence, S.D. Schlueter, M.D. Wilkerson, S. Kurtz,
477 C. Lushbough, and V. Brendel. Comparative Plant Genomics Resources at
478 PlantGDB. <span
479 class="ptmri7t-x-x-120">Plant Physiology, Plant Database Focus Issue</span>, 2005
480 <!--l. 183--><p class="noindent" >This work describes PlantGDB, which provides a service called
481 <a
482 href="http://www.plantgdb.org/PlantGDB-cgi/vmatch/patternsearch.pl" >PatternSearch@PlantGDB</a> for genome wide pattern searches in plant
483 sequences. The service is based on <span
484 class="ptmri7t-x-x-120">Vmatch</span>.
485 </p></li>
486 <li
487 class="enumerate" id="x1-11010x5"><a
488 id="XLIN:KRO:2005"></a>M. Lindow and A. Krogh. Computational evidence for hundreds of
489 non-conserved plant micrornas. <span
490 class="ptmri7t-x-x-120">BMC Genomics</span>, 6(1):119, 2005
491 <!--l. 202--><p class="noindent" >In this work <span
492 class="ptmri7t-x-x-120">Vmatch </span>was used for three different tasks: </p>
493 <ul class="itemize1">
494 <li class="itemize">Searching spliced mRNA in the Arabidopsis genome to detect
495 micromatches of length at least 20 with at most 2 mismatches.
496 </li>
497 <li class="itemize">Finding matches of length at least 15 with at most one mismatch
498 between predicted mature miRNA-sequences and a set of ESTs as well
499 as sequences from the Arabidopsis Small RNA Project (ASRP).
500 </li>
501 <li class="itemize">Aligning and performing single linkage clustering of the predicted
502 mature miRNA sequences. Candidate pairs aligning over at least 17
503 bases, allowing an edit distance of 1, were grouped in the same family.</li></ul>
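<p class="noindent">The single linkage clustering in the last task can be illustrated by a minimal sketch. It is not taken from the paper and does not call Vmatch. It assumes that whole sequences are compared by plain edit distance (the paper instead requires an alignment over at least 17 bases), and the names <span class="cmtt-12">edit_distance</span> and <span class="cmtt-12">single_linkage</span> are hypothetical.</p>

```python
from itertools import combinations


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # match or substitution
        prev = cur
    return prev[-1]


def single_linkage(seqs, max_dist=1):
    """Group sequences into families: two sequences share a family if they
    are connected by a chain of pairs with edit distance <= max_dist."""
    parent = list(range(len(seqs)))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in combinations(range(len(seqs)), 2):
        if edit_distance(seqs[i], seqs[j]) <= max_dist:
            parent[find(i)] = find(j)

    families = {}
    for i, s in enumerate(seqs):
        families.setdefault(find(i), []).append(s)
    return sorted(sorted(f) for f in families.values())


# Hypothetical example data, not from the paper.
example = [
    "ACGTACGTACGTACGTA",
    "ACGTACGTACGTACGTT",  # distance 1 to the first sequence
    "ACGTACGTACGTACCTT",  # distance 1 to the second, 2 to the first
    "TTTTTTTTTTTTTTTTT",  # unrelated
]
families = single_linkage(example)
```

<p class="noindent">In this sketch the first three sequences end up in one family although the first and third differ by two edits, which is the chaining behaviour typical of single linkage.</p>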
504
505
506
507 </li>
508 <li
509 class="enumerate" id="x1-11012x6"><a
510 id="XPOM:LEM:TUR:2006"></a>J.-F. Pombert, C. Lemieux, and M. Turmel. The complete chloroplast DNA
511 sequence of the green alga Oltmannsiellopsis viridis reveals a distinctive
512 quadripartite architecture in the chloroplast genome of early diverging ulvophytes.
513 <span
514 class="ptmri7t-x-x-120">BMC Biology</span>, 4:3, 2006
515 <!--l. 207--><p class="noindent" ><a
516 id="XTUR:OTI:LEM:2006"></a>M. Turmel, C. Otis, and C. Lemieux. The Chloroplast Genome Sequence of
517 Chara vulgaris Sheds New Light into the Closest Green Algal Relatives of Land
518 Plants. <span
519 class="ptmri7t-x-x-120">Molecular Biology and Evolution</span>, 23:1324–1338, 2006
520 </p><!--l. 209--><p class="noindent" >In these papers <span
521 class="ptmri7t-x-x-120">Vmatch </span>was used to search and compare repeated elements in
522 different chloroplast DNA.
523 </p></li>
524 <li
525 class="enumerate" id="x1-11014x7"><a
526 id="XSPA:NOU:HAA:YAN:GUN:HIN:KLE:HAB:SCHOO:MAY:2007"></a>M. Spannagl, O. Noubibou, D. Haase, L. Yang, H. Gundlach, T. Hindemitt,
527 K. Klee, G. Haberer, H. Schoof, and K.F.X. Mayer. MIPSPlantsDB–plant
528 database resource for integrative and comparative plant genome research. <span
529 class="ptmri7t-x-x-120">Nucleic</span>
530 <span
531 class="ptmri7t-x-x-120">Acids Res</span>, 35(Database issue):D834–40, 2007. In this work on the
532 <span
533 class="ptmri7t-x-x-120">MIPSPlantsDB </span>database <span
534 class="ptmri7t-x-x-120">Vmatch </span>was used to cluster large sequence
535 sets.
536 </li>
537 <li
538 class="enumerate" id="x1-11016x8"><a
539 id="XSCHIJ:VOS:MAR:JON:ROS:MOL:TIK:ANG:TUN:BOV:2007"></a>E.G.W.M. Schijlen, C.H. Ric de Vos, S. Martens, H.H. Jonker, F.M. Rosin, J.W.
540 Molthoff, Y.M. Tikunov, G.C. Angenent, A.J. van Tunen, and A.G. Bovy. RNA
541 interference silencing of chalcone synthase, the first step in the flavonoid
542 biosynthesis pathway, leads to parthenocarpic tomato fruits. <span
543 class="ptmri7t-x-x-120">Plant Physiol</span>,
544 144(3):1520–30, 2007
545 <!--l. 218--><p class="noindent" >In this work <span
546 class="ptmri7t-x-x-120">Vmatch </span>was used to compare target genes of the tomato Chs RNAi
547 to a tomato gene index.
548 </p></li>
549 <li
550 class="enumerate" id="x1-11018x9"><a
551 id="XLIN:JAC:NYG:MAN:KRO:2007"></a>M. Lindow, A. Jacobsen, S. Nygaard, Y. Mang, and A. Krogh. Intragenomic
552 matching reveals a huge potential for mirna-mediated regulation in plants. <span
553 class="ptmri7t-x-x-120">PLOS</span>
554 <span
555 class="ptmri7t-x-x-120">Comput. Biol</span>, 3(11):e238, 2007
556 <!--l. 223--><p class="noindent" >In this work <span
557 class="ptmri7t-x-x-120">Vmatch </span>was used to search different plant genomes for matches of
558 length at least 20 with a maximum of 2 mismatches. Here the fact that <span
559 class="ptmri7t-x-x-120">Vmatch </span>is an
560 exhaustive search tool is important.
561 </p></li>
562 <li
563 class="enumerate" id="x1-11020x10"><a
564 id="XDEC:OTI:THU:LEM:2007"></a>J.-C. de Cambiaire, C. Otis, M. Turmel, and C. Lemieux. The chloroplast
565
566
567
568 genome sequence of the green alga leptosira terrestris: multiple losses of
569 the inverted repeat and extensive genome rearrangements within the
570 trebouxiophyceae. <span
571 class="ptmri7t-x-x-120">BMC Genomics</span>, 8(1):213, 2007
572 <!--l. 228--><p class="noindent" >In this work <span
573 class="ptmri7t-x-x-120">Vmatch </span>was used to determine the presence of shared repeated
574 elements of minimum length 30, with up to 10% mismatches, in different
575 sequence sets from the green alga <span
576 class="ptmri7t-x-x-120">Leptosira terrestris</span>.
577 </p></li>
578 <li
579 class="enumerate" id="x1-11022x11"><a
580 id="XOSS:SCHNE:CLA:LAN:WAR:WEI:2008"></a>S. Ossowski, K. Schneeberger, R.M. Clark, C. Lanz, N. Warthmann, and
581 D. Weigel. Sequencing of natural strains of Arabidopsis thaliana with short
582 reads. <span
583 class="ptmri7t-x-x-120">Genome Res.</span>, 18:2024–2033, 2008
584 <!--l. 235--><p class="noindent" >In this work <span
585 class="ptmri7t-x-x-120">Vmatch </span>was used to map millions of short sequence reads to the
586 <span
587 class="ptmri7t-x-x-120">A.</span><span
588 class="ptmri7t-x-x-120"> thaliana </span>genome. Up to four mismatches and up to three indels were allowed
589 in the matching process. The seed size was chosen to be 0. The reads were aligned
590 using the best match strategy by iteratively increasing the allowed number of
591 mismatches and gaps at each round.
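The idea of relaxing the error threshold round by round can be sketched as follows. This is only an illustration under simplifying assumptions, not the pipeline from the paper: it does not call Vmatch, it counts mismatches only (no indels), it scans the genome naively, and the names <span class="cmtt-12">hamming_hits</span> and <span class="cmtt-12">best_match</span> are hypothetical.

```python
def hamming_hits(read, genome, max_mismatches):
    """Return (position, mismatches) for every window of the genome that
    matches the read with at most max_mismatches substitutions."""
    hits = []
    for pos in range(len(genome) - len(read) + 1):
        window = genome[pos:pos + len(read)]
        mismatches = sum(1 for r, g in zip(read, window) if r != g)
        if mismatches <= max_mismatches:
            hits.append((pos, mismatches))
    return hits


def best_match(read, genome, max_allowed=4):
    """Try 0 mismatches first, then 1, 2, ... and stop at the first round
    that yields at least one hit, so only the best hits are reported."""
    for k in range(max_allowed + 1):
        hits = hamming_hits(read, genome, k)
        if hits:
            return k, hits
    return None  # read stays unmapped within the allowed error budget


# Hypothetical toy data.
genome = "TTACGTACGTTTGGCCAATT"
result = best_match("ACGTACGA", genome)
unmapped = best_match("ACGTACGA", genome, max_allowed=0)
```

Stopping at the first successful round means a read with an exact occurrence is never reported at a worse position, while reads without one still get a chance in later rounds.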
592 </p></li>
593 <li
594 class="enumerate" id="x1-11024x12"><a
595 id="XDIBO:OSS:SCHNE:RAT:2008"></a>F. De Bona, S. Ossowski, K. Schneeberger, and G. Rätsch. Optimal spliced
596 alignments of short sequence reads. <span
597 class="ptmri7t-x-x-120">Bioinformatics</span>, 24(16):i174–180,
598 2008
599 <!--l. 242--><p class="noindent" >In this work <span
600 class="ptmri7t-x-x-120">Vmatch </span>was used to map millions of short sequence reads to the
601 <span
602 class="ptmri7t-x-x-120">A.</span><span
603 class="ptmri7t-x-x-120"> thaliana </span>genome. <span
604 class="ptmri7t-x-x-120">Vmatch </span>was part of a multi-step pipeline, combining a fast
605 matching algorithm (<span
606 class="ptmri7t-x-x-120">Vmatch</span>) for initial read mapping and an optimal alignment
607 algorithm based on dynamic programming (QPALMA) for high quality detection
608 of splice sites.
609 </p></li>
610 <li
611 class="enumerate" id="x1-11026x13"><a
612 id="XASS:HER:LIN:HUE:TAL:SMA:IMM:ELD:FIE:SCHAT:2010"></a>A. G. L. Assunção, E. Herrero, Y-F. Lin, B. Huettel, S. Talukdar,
613 C. Smaczniak, R. GH Immink, M. Van Eldik, M. Fiers, H. Schat, et al.
614 Arabidopsis thaliana transcription factors bzip19 and bzip23 regulate the
615 adaptation to zinc deficiency. <span
616 class="ptmri7t-x-x-120">Proceedings of the National Academy of Sciences</span>,
617 107(22):10296–10301, 2010
618 <!--l. 245--><p class="noindent" >In this work <span
619 class="ptmri7t-x-x-120">Vmatch </span>was used for motif searching in different plant
620 genomes.
621 </p></li>
622 <li
623 class="enumerate" id="x1-11028x14"><a
624 id="XEVE:SAT:GOL:MEY:BET:SAK:WAR:JAC:2010"></a>Andrea L Eveland, Namiko Satoh-Nagasawa, Alexander Goldshmidt, Sandra
625
626
627
628 Meyer, Mary Beatty, Hajime Sakai, Doreen Ware, and David Jackson. Digital
629 gene expression signatures for maize development. <span
630 class="ptmri7t-x-x-120">Plant physiology</span>,
631 154(3):1024–1039, 2010
632 <!--l. 248--><p class="noindent" >In this work <span
633 class="ptmri7t-x-x-120">Vmatch </span>was used to map unique consensus sequence tags to the
634 maize reference genome.
635 </p></li>
636 <li
637 class="enumerate" id="x1-11030x15"><a
638 id="XBRO:OTI:LEM:TUR:2010"></a>Jean-Simon Brouard, Christian Otis, Claude Lemieux, and Monique Turmel. The
639 exceptionally large chloroplast genome of the green alga floydiella terrestris
640 illuminates the evolutionary history of the chlorophyceae. <span
641 class="ptmri7t-x-x-120">Genome biology and</span>
642 <span
643 class="ptmri7t-x-x-120">evolution</span>, 2:240, 2010
644 <!--l. 252--><p class="noindent" >In this work <span
645 class="ptmri7t-x-x-120">Vmatch </span>was used to identify and cluster repeated sequences in the
646 <span
647 class="ptmri7t-x-x-120">Floydiella </span>chloroplast genome.
648 </p></li>
649 <li
650 class="enumerate" id="x1-11032x16"><a
651 id="XREH:AQU:GRU:HEN:HIL:LAU:NAO:PAT:ROM:SHU:2010"></a>Hubert Rehrauer, Catharine Aquino, Wilhelm Gruissem, Stefan R Henz, Pierre
652 Hilson, Sascha Laubinger, Naira Naouar, Andrea Patrignani, Stephane Rombauts,
653 Huan Shu, et al. Agronomics1: a new resource for arabidopsis transcriptome
654 profiling. <span
655 class="ptmri7t-x-x-120">Plant Physiology</span>, 152(2):487–499, 2010
656 <!--l. 257--><p class="noindent" >In this work <span
657 class="ptmri7t-x-x-120">Vmatch </span>was used to calculate direct and reverse complementary
658 matches of length 17 bp or greater with edit distance 1 or less between
659 five nuclear chromosomes and mitochondrial and chloroplast genome
660 sequences.
661 </p></li>
662 <li
663 class="enumerate" id="x1-11034x17"><a
664 id="XSEK:LIN:CHI:HAN:BUE:LEO:KAE:2011"></a>R. S. Sekhon, H. Lin, K. L. Childs, C. N. Hansey, C. R. Buell, N. de Leon,
665 and S. M. Kaeppler. Genome-wide atlas of transcription during maize
666 development. <span
667 class="ptmri7t-x-x-120">Plant J.</span>, 66(4):553–563, May 2011
668 <!--l. 261--><p class="noindent" >In this work <span
669 class="ptmri7t-x-x-120">Vmatch </span>was used to search probe sequences against the maize
670 genome and the cDNA sequences of the official maize gene models.
671 </p></li>
672 <li
673 class="enumerate" id="x1-11036x18"><a
674 id="XDAS:OH:HAA:HER:HON:ALI:YUN:BRE:ZHU:BOH:2011"></a>M. Dassanayake, D. H. Oh, J. S. Haas, A. Hernandez, H. Hong, S. Ali, D. J.
675 Yun, R. A. Bressan, J. K. Zhu, H. J. Bohnert, and J. M. Cheeseman. The
676 genome of the extremophile crucifer Thellungiella parvula. <span
677 class="ptmri7t-x-x-120">Nat. Genet.</span>,
678 43(9):913–918, Sep 2011
679 <!--l. 266--><p class="noindent" >In this work <span
680 class="ptmri7t-x-x-120">Vmatch </span>was used for clustering sequences assembled from 454-reads
681
682
683
684 of <span
685 class="ptmri7t-x-x-120">Thellungiella parvula</span>, a model for the evolution of plant adaptation to extreme
686 environments.
687 </p></li>
688 <li
689 class="enumerate" id="x1-11038x19"><a
690 id="XWIL:HOF:KLE:WEI:2011"></a>E. M. Willing, M. Hoffmann, J. D. Klein, D. Weigel, and C. Dreyer.
691 Paired-end RAD-seq for de novo assembly and marker design without available
692 reference. <span
693 class="ptmri7t-x-x-120">Bioinformatics</span>, 27(16):2187–2193, Aug 2011
694 <!--l. 270--><p class="noindent" >In this work <span
695 class="ptmri7t-x-x-120">Vmatch </span>was used for grouping short reads into pools representing
696 the same RAD tag.
697 </p></li>
698 <li
699 class="enumerate" id="x1-11040x20"><a
700 id="XGAO:ZHO:WAN:SU:WAN:2011"></a>L. Gao, Y. Zhou, Z.-W. Wang, Y.-J. Su, and T. Wang. Evolution of the
701 <span
702 class="ptmri7t-x-x-120">rpoB-psbZ </span>region in fern plastid genomes: notable structural rearrangements
703 and highly variable intergenic spacers. <span
704 class="ptmri7t-x-x-120">BMC Plant Biology</span>, 11(1):64,
705 2011
706 <!--l. 274--><p class="noindent" >In this work <span
707 class="ptmri7t-x-x-120">Vmatch </span>was used for detecting and clustering repetitive sequences in
708 diverse fern plastid genomes.
709 </p></li>
710 <li
711 class="enumerate" id="x1-11042x21"><a
712 id="XSLO:ALV:CHU:WU:MCC:PAL:TAY:2012"></a>D. B. Sloan, A. J. Alverson, J. P. Chuckalovcak, M. Wu, D. E. McCauley,
713 J. D. Palmer, and D. R. Taylor. Rapid evolution of enormous, multichromosomal
714 genomes in flowering plant mitochondria with exceptionally high mutation rates.
715 <span
716 class="ptmri7t-x-x-120">PLoS Biol.</span>, 10(1):e1001241, Jan 2012
717 <!--l. 278--><p class="noindent" >In this work <span
718 class="ptmri7t-x-x-120">Vmatch </span>was used to precisely define the boundaries of all repeats
719 with 100% sequence identity.
720 </p></li>
721 <li
722 class="enumerate" id="x1-11044x22"><a
723 id="XDUB:FAR:SCHLU:CAN:ABE:TUT:WOO:SHA:MUL:KUD:2011"></a>Anuja Dubey, Andrew Farmer, Jessica Schlueter, Steven B Cannon, Brian
724 Abernathy, Reetu Tuteja, Jimmy Woodward, Trushar Shah, Benjamin
725 Mulasmanovic, Himabindu Kudapa, et al. Defining the transcriptome
726 assembly and its use for genome dynamics and transcriptome profiling
727 studies in pigeonpea (<span
728 class="ptmri7t-x-x-120">Cajanus cajan </span>l.). <span
729 class="ptmri7t-x-x-120">DNA research</span>, 18(3):153–164,
730 2011
731 <!--l. 281--><p class="noindent" >In this work <span
732 class="ptmri7t-x-x-120">Vmatch </span>was used to cluster sequences based on their six-frame
733 translation.
734 </p></li>
735 <li
736 class="enumerate" id="x1-11046x23"><a
737 id="XSAX:PEN:UPA:KUM:CAR:SCHLU:FAR:WHA:SAR:MAY:2012"></a>Rachit K Saxena, R Varma Penmetsa, Hari D Upadhyaya, Ashish Kumar,
738
739
740
741 Noelia Carrasquilla-Garcia, Jessica A Schlueter, Andrew Farmer, Adam M
742 Whaley, Birinchi K Sarma, Gregory D May, et al. Large-scale development of
743 cost-effective single-nucleotide polymorphism marker assays for genetic
744 mapping in pigeonpea and comparative mapping in legumes. <span
745 class="ptmri7t-x-x-120">DNA research</span>,
746 19(6):449–461, 2012
747 <!--l. 285--><p class="noindent" >In this work <span
748 class="ptmri7t-x-x-120">Vmatch </span>was used to identify reciprocal best matches between the
749 pigeonpea sequences and other legume sequences.
750 </p></li>
751 <li
752 class="enumerate" id="x1-11048x24"><a
753 id="XHAZ:REE:RIS:PEC:2012"></a>B. Z. Haznedaroglu, D. Reeves, H. Rismani-Yazdi, and J. Peccia. Optimization
754 of de novo transcriptome assembly from high-throughput short read sequencing
755 data improves functional annotation for non-model organisms. <span
756 class="ptmri7t-x-x-120">BMC</span>
757 <span
758 class="ptmri7t-x-x-120">Bioinformatics</span>, 13:170, 2012
759 <!--l. 290--><p class="noindent" >In this work <span
760 class="ptmri7t-x-x-120">Vmatch </span>was used for assembly clustering and optimization
761 of contigs for <span
762 class="ptmri7t-x-x-120">Neochloris oleoabundans </span>(a Chlorophyceae class green
763 microalgae).
764 </p></li>
765 <li
766 class="enumerate" id="x1-11050x25"><a
767 id="XMAR:KLE:BAN:BLA:MAC:SCHMU:SCHOL:GUN:WIC:SIM:2012"></a>M. M. Martis, S. Klemme, A. M. Banaei-Moghaddam, F. R. Blattner,
768 J. Macas, T. Schmutzer, U. Scholz, H. Gundlach, T. Wicker, H. Šimková,
769 P. Novak, P. Neumann, M. Kubalakova, E. Bauer, G. Haseneyer, J. Fuchs,
770 J. Dolezel, N. Stein, K. F. Mayer, and A. Houben. Selfish supernumerary
771 chromosome reveals its origin as a mosaic of host genome and organellar
772 sequences. <span
773 class="ptmri7t-x-x-120">Proc. Natl. Acad. Sci. U.S.A.</span>, 109(33):13343–13346, Aug
774 2012
775 <!--l. 294--><p class="noindent" >In this work <span
776 class="ptmri7t-x-x-120">Vmatch </span>was used to match reads against a repeat library to identify
777 the content of the repetitive DNA per sequence read.
778 </p></li>
779 <li
780 class="enumerate" id="x1-11052x26"><a
781 id="XCHI:DAV:BUE:2011"></a>K. L. Childs, R. M. Davidson, and C. R. Buell. Gene coexpression network
782 analysis as a source of functional annotation for rice genes. <span
783 class="ptmri7t-x-x-120">PloS one</span>,
784 6(7):e22196, 2011
785 <!--l. 297--><p class="noindent" >In this work <span
786 class="ptmri7t-x-x-120">Vmatch </span>was used to align individual probes to representative gene
787 models.
788 </p></li>
789 <li
790 class="enumerate" id="x1-11054x27"><a
791 id="XSEV:DIJ:HAM:2011"></a>E. I. Severing, A. D. J. van Dijk, and R. C. H. J. van Ham. Assessing the
792 contribution of alternative splicing to proteome diversity in arabidopsis thaliana
793
794
795
796 using proteomics data. <span
797 class="ptmri7t-x-x-120">BMC Plant Biology</span>, 11(1):82, 2011
798 <!--l. 301--><p class="noindent" >In this work <span
799 class="ptmri7t-x-x-120">Vmatch </span>was used for performing exact searches with peptides
800 against the filtered proteome of <span
801 class="ptmri7t-x-x-120">A. thaliana</span>.
802 </p></li>
803 <li
804 class="enumerate" id="x1-11056x28"><a
805 id="XWOL:WEI:SEG:ROS:BEI:DON:SPI:NOR:REH:KOE:2011"></a>P. Wolff, I. Weinhofer, J. Seguin, P. Roszak, C. Beisel, M.T. Donoghue,
806 C. Spillane, M. Nordborg, M. Rehmsmeier, and C. Köhler. High-resolution
807 analysis of parent-of-origin allelic expression in the arabidopsis endosperm. <span
808 class="ptmri7t-x-x-120">PLoS</span>
809 <span
810 class="ptmri7t-x-x-120">Genet</span>, 7(6):e1002126–e1002126, 2011
811 <!--l. 307--><p class="noindent" >In this work <span
812 class="ptmri7t-x-x-120">Vmatch </span>was used to map RNAseq reads, allowing up to two
813 mismatches (option <span
814 class="cmtt-12">-h 2</span>) and generating maximal substring matches that are
815 unique in some reference dataset (option <span
816 class="cmtt-12">-mum cand</span>).
817 </p></li>
818 <li
819 class="enumerate" id="x1-11058x29"><a
820 id="XFLE:KHA:JOH:YOU:MIT:WRE:HES:FOS:SCHAR:SCO:2011"></a>D. J. Fleetwood, A. K. Khan, R. D. Johnson, C. A. Young, S. Mittal, R. E.
821 Wrenn, U. Hesse, S. J. Foster, C. L. Schardl, and B. Scott. Abundant
822 degenerate miniature inverted-repeat transposable elements in genomes of
823 epichloid fungal endophytes of grasses. <span
824 class="ptmri7t-x-x-120">Genome Biol Evol</span>, 3:1253–1264,
825 2011
826 <!--l. 312--><p class="noindent" >In this work <span
827 class="ptmri7t-x-x-120">Vmatch </span>was used to identify terminal inverted repeats of length
828 range 10-65 bp, <span
829 class="zptmcm7y-x-x-120">≥ </span><span
830 class="zptmcm7t-x-x-120">80% </span>identity, maximum inter-TIR distance 650 bp in
831 genomes of epichloid fungal endophytes of grasses.
832 </p></li>
833 <li
834 class="enumerate" id="x1-11060x30"><a
835 id="XCHI:KON:BUE:2012"></a>K. L. Childs, K. Konganti, and C. R. Buell. The Biofuel Feedstock Genomics
836 Resource: a web-based portal and database to enable functional genomics
837 of plant biofuel feedstock species. <span
838 class="ptmri7t-x-x-120">Database (Oxford)</span>, 2012:bar061,
839 2012
840 <!--l. 315--><p class="noindent" >In this work <span
841 class="ptmri7t-x-x-120">Vmatch </span>was used to match putative unique transcript sequence
842 assemblies.
843 </p></li>
844 <li
845 class="enumerate" id="x1-11062x31"><a
846 id="XCHE:CAS:BAI:RED:MIC:2012"></a>Y. Chen, B. J. Cassone, X. Bai, M. G. Redinbaugh, and A. P. Michel.
847 Transcriptome of the plant virus vector Graminella nigrifrons, and the molecular
848 interactions of maize fine streak rhabdovirus transmission. <span
849 class="ptmri7t-x-x-120">PLoS ONE</span>,
850 7(7):e40613, 2012
851 <!--l. 319--><p class="noindent" >In this work <span
852 class="ptmri7t-x-x-120">Vmatch </span>was used for refining assemblies of Illumina reads in
853
854
855
856 the context of a transcriptome project for plant virus vector <span
857 class="ptmri7t-x-x-120">Graminella</span>
858 <span
859 class="ptmri7t-x-x-120">nigrifrons</span>.
860 </p></li>
861 <li
862 class="enumerate" id="x1-11064x32"><a
863 id="XKRI:PAT:JAI:GAU:CHOU:VAI:DEE:HAR:KRI:NAI:2012"></a>N. M. Krishnan, S. Pattnaik, P. Jain, P. Gaur, R. Choudhary, S. Vaidyanathan,
864 S. Deepak, A. K. Hariharan, P. B. Krishna, J. Nair, L. Varghese, N. K.
865 Valivarthi, K. Dhas, K. Ramaswamy, and B. Panda. A draft of the genome and
866 four transcriptomes of a medicinal and pesticidal angiosperm Azadirachta indica.
867 <span
868 class="ptmri7t-x-x-120">BMC Genomics</span>, 13:464, 2012
869 <!--l. 324--><p class="noindent" >In this work <span
870 class="ptmri7t-x-x-120">Vmatch </span>was used for clustering repeats and for building a consensus
871 repeat library in the context of genome and transcriptome projects for <span
872 class="ptmri7t-x-x-120">Azadirachta</span>
873 <span
874 class="ptmri7t-x-x-120">indica</span>, a medicinal and pesticidal angiosperm.
875 </p></li>
876 <li
877 class="enumerate" id="x1-11066x33"><a
878 id="XLIU:KUM:ZHA:ZHE:WAR:2012"></a>Z. Liu, S. Kumari, L. Zhang, Y. Zheng, and D. Ware. Characterization of
879 mirnas in response to short-term waterlogging in three inbred lines of zea mays.
880 <span
881 class="ptmri7t-x-x-120">PLoS One</span>, 7(6):e39786, 2012
882 <!--l. 328--><p class="noindent" >In this work <span
883 class="ptmri7t-x-x-120">Vmatch </span>was used to map unique consensus sequence tags to the
884 maize reference genome and to predict targets of novel miRNAs.
885 </p></li>
886 <li
887 class="enumerate" id="x1-11068x34"><a
888 id="XBOU:KOU:PAV:MIN:TSA:DAR:2012"></a>A. Bousios, Y. A. I. Kourmpetis, P. Pavlidis, E. Minga, A. Tsaftaris, and
889 N. Darzentas. The turbulent life of sirevirus retrotransposons and the evolution of
890 the maize genome: more than ten thousand elements tell the story. <span
891 class="ptmri7t-x-x-120">The Plant</span>
892 <span
893 class="ptmri7t-x-x-120">Journal</span>, 69(3):475–488, 2012
894 <!--l. 331--><p class="noindent" >In this work <span
895 class="ptmri7t-x-x-120">Vmatch </span>was used for masking Long Terminal Repeats in the Maize
896 Genome Sequence.
897 </p></li>
898 <li
899 class="enumerate" id="x1-11070x35">In the papers
900 <!--l. 335--><p class="noindent" ><a
901 id="XHER:MAR:DOR:PFE:GAL:SCHAA:JOU:SIM:VAL:DOL:2012"></a>P. Hernandez, M. Martis, G. Dorado, M. Pfeifer, S. Galvez, S. Schaaf, N. Jouve,
902 H. Šimková, M. Valarik, J. Dolezel, and K. F. Mayer. Next-generation
903 sequencing and syntenic integration of flow-sorted arms of wheat chromosome
904 4A exposes the chromosome structure and gene content. <span
905 class="ptmri7t-x-x-120">Plant J.</span>, 69(3):377–386,
906 Feb 2012
907 </p><!--l. 337--><p class="noindent" ><a
908 id="XPHI:PAU:BER:SOU:CHO:LAU:SIM:SAF:BEL:VAU:2013"></a>R. Philippe, E. Paux, I. Bertin, P. Sourdille, F. Choulet, C. Laugier,
909 H. Šimková, J. Šafář, A. Bellec, S. Vautrin, et al. A high density physical map
910
911
912
913 of chromosome 1bl supports evolutionary studies, map-based cloning and
914 sequencing in wheat. <span
915 class="ptmri7t-x-x-120">Genome Biol</span>, 14(6):R64, 2013
916 </p><!--l. 339--><p class="noindent" ><span
917 class="ptmri7t-x-x-120">Vmatch </span>was used to mask repetitive DNA.
918 </p></li>
919 <li
920 class="enumerate" id="x1-11072x36"><a
921 id="XHOW:YU:KNA:CRO:KOL:DOL:LOR:DEA:2013"></a>G. T. Howe, J. Yu, B. Knaus, R. Cronn, S. Kolpak, P. Dolan, W. W. Lorenz,
922 and J. F. Dean. A SNP resource for Douglas-fir: de novo transcriptome
923 assembly and SNP detection and validation. <span
924 class="ptmri7t-x-x-120">BMC Genomics</span>, 14:137,
925 2013
926 <!--l. 342--><p class="noindent" >In this work <span
927 class="ptmri7t-x-x-120">Vmatch </span>was used to cluster 40 010 assembled isotigs.
928 </p></li>
929 <li
930 class="enumerate" id="x1-11074x37"><a
931 id="XKAR:HAA:MAL:GEE:BOV:LAM:ANG:MAA:2013"></a>R. Karlova, J. C. van Haarst, C. Maliepaard, H. van de Geest, A. G. Bovy,
932 M. Lammers, G. C. Angenent, and R. A. de Maagd. Identification of
933 microRNA targets in tomato fruit development using high-throughput
934 sequencing and degradome analysis. <span
935 class="ptmri7t-x-x-120">J. Exp. Bot.</span>, 64(7):1863–1878, Apr
936 2013
937 <!--l. 346--><p class="noindent" >In this work <span
938 class="ptmri7t-x-x-120">Vmatch </span>was used to preprocess short reads in the context of
939 identifying microRNA targets in tomato fruit development.
940 </p></li>
941 <li
942 class="enumerate" id="x1-11076x38"><a
943 id="XGRO:MAR:SIM:ABR:WAN:VIS:2013"></a>S. M. Gross, J. A. Martin, J. Simpson, M. J. Abraham-Juarez, Z. Wang, and
944 A. Visel. De novo transcriptome assembly of drought tolerant CAM
945 plants, Agave deserti and Agave tequilana. <span
946 class="ptmri7t-x-x-120">BMC Genomics</span>, 14:563,
947 2013
948 <!--l. 351--><p class="noindent" >In this work <span
949 class="ptmri7t-x-x-120">Vmatch </span>was used in an all-vs-all comparison to bin contigs into loci
950 based on a minimum of 200 bp sequence overlap in the context of transcriptome
951 assembly for two Agave-species.
952 </p></li>
953 <li
954 class="enumerate" id="x1-11078x39"><a
955 id="XKAN:HEL:DUR:WIN:ENG:BEH:HOL:BRA:HAU:FER:2013"></a>U. Kanter, W. Heller, J. Durner, J. B. Winkler, M. Engel, H. Behrendt,
956 A. Holzinger, P. Braun, M. Hauser, F. Ferreira, K. Mayer, M. Pfeifer, and
957 D. Ernst. Molecular and immunological characterization of ragweed (Ambrosia
958 artemisiifolia L.) pollen after exposure of the plants to elevated ozone over a
959 whole growing season. <span
960 class="ptmri7t-x-x-120">PLoS ONE</span>, 8(4):e61518, 2013
961 <!--l. 354--><p class="noindent" >In this work <span
962 class="ptmri7t-x-x-120">Vmatch </span>was used to align 454-reads to assembled isotigs for
963 Ragweed pollen.
964
965
966
967 </p></li>
968 <li
969 class="enumerate" id="x1-11080x40"><a
970 id="XKUG:SIE:NUS:AME:SPAN:STEI:LEM:MAY:BUE:SCHWE:2013"></a>K. G. Kugler, G. Siegwart, T. Nussbaumer, C. Ametz, M. Spannagl,
971 B. Steiner, M. Lemmens, K. F. X. Mayer, H. Buerstmayr, and W. Schweiger.
972 Quantitative trait loci-dependent analysis of a gene co-expression network
973 associated with fusarium head blight resistance in bread wheat (triticum aestivum
974 l.). <span
975 class="ptmri7t-x-x-120">BMC Genomics</span>, 14(1):728, 2013
976 <!--l. 357--><p class="noindent" >In this work <span
977 class="ptmri7t-x-x-120">Vmatch </span>was used for comparing gene sets.
978 </p></li>
979 <li
980 class="enumerate" id="x1-11082x41"><a
981 id="XMAR:ZHO:HAS:SCHMU:VRA:KUB:KOEN:KUG:SCHOL:HAC:2013"></a>Mihaela M Martis, Ruonan Zhou, Grit Haseneyer, Thomas Schmutzer, Jan
982 Vrána, Marie Kubaláková, Susanne König, Karl G Kugler, Uwe Scholz, Bernd
983 Hackauf, et al. Reticulate evolution of the rye genome. <span
984 class="ptmri7t-x-x-120">The Plant Cell</span>,
985 25(10):3685–3698, 2013
986 <!--l. 361--><p class="noindent" >In this work <span
987 class="ptmri7t-x-x-120">Vmatch </span>was used to detect repetitive DNA content of chromosomal
988 survey sequences from the Rye genome.
989 </p></li>
990 <li
991 class="enumerate" id="x1-11084x42">In the papers
992 <!--l. 366--><p class="noindent" ><a
993 id="XKOP:MAR:VHA:HRV:VRA:BAR:KOP:CAT:STO:NOV:2013"></a>D. Kopecký, M. Martis, J. Číhalíková, E. Hřibová, J. Vrána, J. Bartoš,
994 J. Kopecká, F. Cattonaro, Š. Stočes, Petr Novák, et al. Flow sorting and
995 sequencing meadow fescue chromosome 4F. <span
996 class="ptmri7t-x-x-120">Plant Physiology</span>, 163(3):1323–1337,
997 2013
998 </p><!--l. 368--><p class="noindent" ><a
999 id="XKOP:MAR:CHA:HRI:VRA:BAR:2013"></a>D. Kopecký, M Martis, J Číhalíková, E Hřibová, J Vrána, J Bartoš, et al.
1000 Genomics of meadow fescue chromosome 4F. <span
1001 class="ptmri7t-x-x-120">Plant Physiol</span>, 163:1323–1337,
1002 2013
1003 </p><!--l. 370--><p class="noindent" ><span
1004 class="ptmri7t-x-x-120">Vmatch </span>was used for identifying repetitive DNA content in contigs of meadow
1005 fescue chromosome 4F assembled from Illumina short reads.
1006 </p></li>
1007 <li
1008 class="enumerate" id="x1-11086x43">In the papers
1009 <!--l. 377--><p class="noindent" ><a
1010 id="XJAY:WAN:YU:TAC:PEL:COL:REN:VOI:2011"></a>F. Jay, Y. Wang, A. Yu, L. Taconnat, S. Pelletier, V. Colot, J.-P. Renou, and
1011 O. Voinnet. Misregulation of <span
1012 class="ptmri7t-x-x-120">AUXIN RESPONSE FACTOR 8 </span>underlies the
1013 developmental abnormalities caused by three distinct viral silencing
1014 suppressors in <span
1015 class="ptmri7t-x-x-120">Arabidopsis</span>. <span
1016 class="ptmri7t-x-x-120">PLoS Pathog</span>, 7(5):e1002035–e1002035,
1017 2011
1018 </p><!--l. 379--><p class="noindent" ><a
1019 id="XWAN:WEI:SMI:2013"></a>X. Wang, D. Weigel, and L. M. Smith. Transposon variants and their
1020
1021
1022
1023 effects on gene expression in Arabidopsis. <span
1024 class="ptmri7t-x-x-120">PLoS Genet</span>, 9(2):e1003255,
1025 2013
1026 </p><!--l. 381--><p class="noindent" ><span
1027 class="ptmri7t-x-x-120">Vmatch </span>was used for mapping siRNA sequences to the <span
1028 class="ptmri7t-x-x-120">Arabidopsis thaliana</span>
1029 genome.
1030 </p></li>
1031 <li
1032 class="enumerate" id="x1-11088x44"><a
1033 id="XHEN:VIV:DES:CHAU:PAY:GUT:CAS:2014"></a>E. Henaff, C. Vives, B. Desvoyes, A. Chaurasia, J. Payet, C. Gutierrez, and
1034 J. M. Casacuberta. Extensive amplification of the E2F transcription factor
1035 binding sites by transposons during evolution of Brassica species. <span
1036 class="ptmri7t-x-x-120">Plant J.</span>,
1037 77(6):852–862, Mar 2014
1038 <!--l. 385--><p class="noindent" >In this work <span
1039 class="ptmri7t-x-x-120">Vmatch </span>was used for the identification of binding motifs.
1040 </p></li>
1041 <li
1042 class="enumerate" id="x1-11090x45"><a
1043 id="XWAN:HAB:GUN:GLAE:NUS:LUO:LOM:BOR:KER:SHA:2014"></a>W Wang, G Haberer, H Gundlach, C Gläßer, TCLM Nussbaumer,
1044 MC Luo, A Lomsadze, M Borodovsky, RA Kerstetter, J Shanklin,
1045 et al. The <span
1046 class="ptmri7t-x-x-120">Spirodela polyrhiza </span>genome reveals insights into its neotenous
1047 reduction fast growth and aquatic lifestyle. <span
1048 class="ptmri7t-x-x-120">Nature Communications</span>, 5,
1049 2014
1050 <!--l. 390--><p class="noindent" >In this work <span
1051 class="ptmri7t-x-x-120">Vmatch </span>was used for masking one sequence set with another and for
1052 mapping miRNA sequences of all plant species present in a reference database to
1053 the whole-genome assembly of <span
1054 class="ptmri7t-x-x-120">Spirodela polyrhiza</span>.
1055 </p></li>
1056 <li
1057 class="enumerate" id="x1-11092x46"><a
1058 id="XLOG:SCHEL:NUR:SAM:PEN:2014"></a>M. D. Logacheva, M. I. Schelkunov, M. S. Nuraliev, T. H. Samigullin, and
1059 A. A. Penin. The plastid genome of mycoheterotrophic monocot Petrosavia
1060 stellaris exhibits both gene losses and multiple rearrangements. <span
1061 class="ptmri7t-x-x-120">Genome biology</span>
1062 <span
1063 class="ptmri7t-x-x-120">and evolution</span>, 6(1):238–246, 2014
1064 <!--l. 393--><p class="noindent" >In this work <span
1065 class="ptmri7t-x-x-120">Vmatch </span>was used for repeat detection.
1066 </p></li>
1067 <li
1068 class="enumerate" id="x1-11094x47"><a
1069 id="XWAN:SHI:RIN:2015"></a>X. Wang, W. Shi, and T. Rinehart. Transcriptomes That Confer to Plant Defense
1070 against Powdery Mildew Disease in Lagerstroemia indica. <span
1071 class="ptmri7t-x-x-120">Int J Genomics</span>,
1072 2015:528395, 2015
1073 <!--l. 397--><p class="noindent" >In this work <span
1074 class="ptmri7t-x-x-120">Vmatch </span>was used to eliminate redundancies in assemblies of
1075 Illumina reads in the context of studying plant defense mechanisms.
1076 </p></li>
1077 <li
1078 class="enumerate" id="x1-11096x48"><a
1079 id="XASH:HUL:WAN:YAN:GUA:JON:MAT:MOC:CHE:STE:2015"></a>H. Ashrafi, A. M. Hulse-Kemp, F. Wang, S. S. Yang, X. Guan, D. C. Jones,
1080
1081
1082
1083 M. Matvienko, K. Mockaitis, Z. J. Chen, D. M. Stelly, et al. A long-read
1084 transcriptome assembly of cotton (l.) and intraspecific single nucleotide
1085 polymorphism discovery. <span
1086 class="ptmri7t-x-x-120">The Plant Genome</span>, 2015
1087 <!--l. 400--><p class="noindent" >In this work <span
1088 class="ptmri7t-x-x-120">Vmatch </span>was used for clustering to determine a non-redundant set of
1089 assembled contigs.
1090 </p></li>
1091 <li
1092 class="enumerate" id="x1-11098x49"><a
1093 id="XUST:NOV:BLI:SMY:2015"></a>K. Ustyantsev, O. Novikova, A. Blinov, and G. Smyshlyaev. Convergent
1094 evolution of Ribonuclease H in LTR retrotransposons and retroviruses. <span
1095 class="ptmri7t-x-x-120">Molecular</span>
1096 <span
1097 class="ptmri7t-x-x-120">biology and evolution</span>, 32(5):1197–1207, 2015
1098 <!--l. 403--><p class="noindent" >In this work <span
1099 class="ptmri7t-x-x-120">Vmatch </span>was used for clustering sequences based on their RT and
1100 aRNH domain.
1101 </p></li>
1102 <li
1103 class="enumerate" id="x1-11100x50"><a
1104 id="XHLE:RIV:CLA:MAR:VAN:GON:GAR:LER:SIM:VAL:2015"></a>M. Helguera, M. Rivarola, B. Clavijo, M. M. Martis, L. S. Vanzetti,
1105 S. González, I. Garbus, P. Leroy, H. Šimková, M. Valárik, et al. New insights
1106 into the wheat chromosome 4D structure and virtual gene order, revealed by
1107 survey pyrosequencing. <span
1108 class="ptmri7t-x-x-120">Plant Science</span>, 233:200–212, 2015
1109 <!--l. 406--><p class="noindent" >In this work <span
1110 class="ptmri7t-x-x-120">Vmatch </span>was used for identifying repeats in contigs assembled from
1111 454-reads.
1112 </p></li>
1113 <li
1114 class="enumerate" id="x1-11102x51"><a
1115 id="XSHE:YAN:LU:WAN:SON:2015"></a>Qi Shen, Jun Yang, Chaolong Lu, Bo Wang, and Chi Song. The complete
1116 chloroplast genome sequence of Perilla frutescens (L.). <span
1117 class="ptmri7t-x-x-120">Mitochondrial DNA</span>,
1118 preprint:1–2, 2015
1119 <!--l. 409--><p class="noindent" >In this work <span
1120 class="ptmri7t-x-x-120">Vmatch </span>was used for identifying inverted repeats in chloroplast
1121 genomes.
1122 </p></li>
1123 <li
1124 class="enumerate" id="x1-11104x52"><a
1125 id="XPAN:MOH:KHA:MEH:EBR:2015"></a>Bahman Panahi, Seyed Abolghasem Mohammadi, Reyhaneh Ebrahimi
1126 Khaksefidi, Jalil Fallah Mehrabadi, and Esmaeil Ebrahimie. Genome-wide
1127 analysis of alternative splicing events in <span
1128 class="ptmri7t-x-x-120">Hordeum vulgare</span>: Highlighting retention
1129 of intron-based splicing and its possible function through network analysis. <span
1130 class="ptmri7t-x-x-120">FEBS</span>
1131 <span
1132 class="ptmri7t-x-x-120">letters</span>, 589(23):3564–3575, 2015
1133 <!--l. 413--><p class="noindent" >In this work <span
1134 class="ptmri7t-x-x-120">Vmatch </span>was used to identify contaminations and repetitive
1135 elements by comparison of mRNA sequences to vector, bacterial and repeat
1136 databases.
1137
1138
1139
1140 </p></li>
1141 <li
1142 class="enumerate" id="x1-11106x53"><a
1143 id="XWOL:TWO:GAD:KNA:GRU:GEN:2015"></a>SN Wolfenbarger, MC Twomey, DM Gadoury, BJ Knaus, NJ Grünwald, and
1144 DH Gent. Identification and distribution of mating-type idiomorphs in
1145 populations of Podosphaera macularis and development of chasmothecia of the
1146 fungus. <span
1147 class="ptmri7t-x-x-120">Plant Pathology</span>, 2015
1148 <!--l. 416--><p class="noindent" >In this work <span
1149 class="ptmri7t-x-x-120">Vmatch </span>was used to cluster contigs of different assemblies into
1150 groups of homologous sequences.
1151 </p></li>
1152 <li
1153 class="enumerate" id="x1-11108x54"><a
1154 id="XYAN:LU:SHE:YAN:XU:SON:2015"></a>Jun Yang, Chaolong Lu, Qi Shen, Yuying Yan, Changjiang Xu, and Chi Song.
1155 The complete chloroplast genome sequence of Fagopyrum cymosum.
1156 <span
1157 class="ptmri7t-x-x-120">Mitochondrial DNA</span>, pages 1–2, 2015
1158 <!--l. 419--><p class="noindent" >In this work <span
1159 class="ptmri7t-x-x-120">Vmatch </span>was used to identify inverted repeats in chloroplast
1160 genomes.
1161 </p>
1162 </li></ol>
1163 <!--l. 424--><p class="noindent" >
1164 </p>
1165 <h4 class="likesubsectionHead"><a
1166 id="x1-12000"></a>Usages in Microbial Genome Research</h4>
1167 <!--l. 425--><p class="noindent" >
1168 </p><ol class="enumerate1" >
1169 <li
1170 class="enumerate" id="x1-12002x1">The <a
1171 href="http://www.llnl.gov/str/April04/Slezak.html" >KPATH system</a>, developed at the Lawrence Livermore National
1172 Laboratory, and described in
1173 <!--l. 432--><p class="noindent" ><a
1174 id="XFIT:GAR:KUC:KUR:MYE:OTT:SLE:VIT:ZEM:MCC:2002"></a>J.P. Fitch, S.N. Gardner, T.A. Kuczmarski, S. Kurtz, R. Myers, L.L.
1175 Ott, T.R. Slezak, E.A. Vitalis, A.T. Zemla, and P.M. McCready. Rapid
1176 development of nucleic acid diagnostics. <span
1177 class="ptmri7t-x-x-120">Proceedings of the IEEE</span>,
1178 90(11):1708–1721, 2002
1179 </p><!--l. 434--><p class="noindent" ><a
1180 id="XSLE:KUC:OTT:TOR:MED:SMI:TRU:MUL:LAM:VIT:ZEM:ZHO:GAR:2003"></a>T. Slezak, T. Kuczmarski, L. Ott, C. Torres, D. Medeiros, J. Smith,
1181 B. Truitt, N. Mulakken, M. Lam, E. Vitalis, A. Zemla, C.E. Zhou, and
1182 S. Gardner. Comparative Genomics Tools Applied to Bioterrorism
1183 Defense. <span
1184 class="ptmri7t-x-x-120">Briefings in Bioinformatics</span>, 4(2):133–149, 2003
1185
1186
1187
1188 </p><!--l. 436--><p class="noindent" >used <span
1189 class="ptmri7t-x-x-120">Vmatch </span>to detect unique substrings in large collections of DNA
1190 sequences. These unique substrings serve as signatures allowing for rapid
1191 and accurate diagnostics to identify pathogenic bacteria and viruses. A
1192 similar application is reported in <a
1193 id="XGAR:KUC:VIT:SLE:2003"></a>S.N. Gardner, T.A. Kuczmarski, E.A.
1194 Vitalis, and T.R. Slezak. Limitations of TaqMan PCR for Detecting Viral
1195 Pathogens Illustrated by Hepatitis A, B, C, and E Viruses and Human
1196 Immunodeficiency Virus. <span
1197 class="ptmri7t-x-x-120">J.</span><span
1198 class="ptmri7t-x-x-120"> of Clinical Microbiology</span>, 41(6):2417–2427,
1199 2003.
1200 </p></li>
1201 <li
1202 class="enumerate" id="x1-12004x2"><a
1203 id="XPOB:WET:SZY:SCHIL:KUR:MEY:NAT:BECK:2006"></a>N. Pobigaylo, D. Wetter, S. Szymczak, U. Schiller, S. Kurtz, F. Meyer,
1204 T.W. Nattkemper, and A. Becker. Construction of a large signature-tagged
1205 mini-Tn5 transposon library and its application to mutagenesis of
1206 <span
1207 class="ptmri7t-x-x-120">Sinorhizobium meliloti</span>. <span
1208 class="ptmri7t-x-x-120">Appl Environ Microbiol.</span>, 72(6):4329–4337, 2006
1209 <!--l. 444--><p class="noindent" >In this work <span
1210 class="ptmri7t-x-x-120">Vmatch </span>was used to map signature tags to the genome of
1211 <span
1212 class="ptmri7t-x-x-120">S.</span><span
1213 class="ptmri7t-x-x-120"> meliloti</span>.
1214 </p></li>
1215 <li
1216 class="enumerate" id="x1-12006x3">The <a
1217 href="http://crispr.u-psud.fr/Server/CRISPRfinder.php" >CRISPRFinder</a>-program and the <a
1218 href="http://crispr.u-psud.fr/crispr/CRISPRdatabase.php" >CRISPRdatabase</a>, described in
1219 <!--l. 452--><p class="noindent" ><a
1220 id="XGRI:VER:POU:2007A"></a>I. Grissa, G. Vergnaud, and C. Pourcel. CRISPRFinder: a web tool to
1221 identify clustered regularly interspaced short palindromic repeats. <span
1222 class="ptmri7t-x-x-120">Nucleic</span>
1223 <span
1224 class="ptmri7t-x-x-120">Acids Res</span>, 35(Web Server issue):W52–7, 2007
1225 </p><!--l. 454--><p class="noindent" ><a
1226 id="XGRI:VER:POU:2007B"></a>I. Grissa, G. Vergnaud, and C. Pourcel. The CRISPRdb database and tools
1227 to display CRISPRs and to generate dictionaries of spacers and repeats.
1228 <span
1229 class="ptmri7t-x-x-120">BMC Bioinformatics</span>, 8:172, 2007
1230 </p><!--l. 456--><p class="noindent" >used <span
1231 class="ptmri7t-x-x-120">Vmatch </span>to efficiently find maximal repeats, as a first step in localizing
1232 Clustered regularly interspaced short palindromic repeats (CRISPRs).
1233 </p></li>
1234 <li
1235 class="enumerate" id="x1-12008x4"><a
1236 id="XVOSS:GEO:SCHOE:UDE:HES:2009"></a>B. Voss, J. Georg, V. Schön, S. Ude, and W. R. Hess. Biocomputational
1237 prediction of non-coding RNAs in model cyanobacteria. <span
1238 class="ptmri7t-x-x-120">BMC Genomics</span>,
1239 10:123, 2009
1240 <!--l. 462--><p class="noindent" >In this work <span
1241 class="ptmri7t-x-x-120">Vmatch </span>was used to map predicted sequences to information
1242 about Rho-independent terminators provided by a specific database.
1243 </p></li>
1244 <li
1245 class="enumerate" id="x1-12010x5"><a
1246 id="XSCHMU:CAN:SCHLU:MA:MIT:NEL:HYT:SON:THE:CHE:2010"></a>Jeremy Schmutz, Steven B Cannon, Jessica Schlueter, Jianxin Ma, Therese
1247
1248
1249
1250 Mitros, William Nelson, David L Hyten, Qijian Song, Jay J Thelen, Jianlin
1251 Cheng, et al. Genome sequence of the palaeopolyploid soybean. <span
1252 class="ptmri7t-x-x-120">Nature</span>,
1253 463(7278):178–183, 2010
1254 <!--l. 466--><p class="noindent" >In this work <span
1255 class="ptmri7t-x-x-120">Vmatch </span>was used to cluster DNA-sequences into families based
1256 on their six-frame translation.
1257 </p></li>
1258 <li
1259 class="enumerate" id="x1-12012x6"><a
1260 id="XZIM:GES:CHE:LOR:SCHRO:2010"></a>Bob Zimmermann, Tanja Gesell, Doris Chen, Christina Lorenz, Renée
1261 Schroeder, and J Valcarcel. Monitoring genomic sequences during selex
1262 using high-throughput sequencing: neutral selex. <span
1263 class="ptmri7t-x-x-120">PLoS One</span>, 5(2):e9169,
1264 2010
1265 <!--l. 469--><p class="noindent" >In this work <span
1266 class="ptmri7t-x-x-120">Vmatch </span>was used to align 454-sequences to the E. coli genome
1267 and to cluster the sequences.
1268 </p></li>
1269 <li
1270 class="enumerate" id="x1-12014x7"><a
1271 id="XTOU:DEN:MED:BAR:ELK:PET:2010"></a>Fabrice Touzain, Erick Denamur, Claudine Médigue, Valérie Barbe,
1272 Meriem El Karoui, Marie-Agnès Petit, et al. Small variable segments
1273 constitute a major type of diversity of bacterial genomes at the species level.
1274 <span
1275 class="ptmri7t-x-x-120">Genome Biol</span>, 11(4):R45, 2010
1276 <!--l. 472--><p class="noindent" >In this work <span
1277 class="ptmri7t-x-x-120">Vmatch </span>was used for detecting repeats in three bacterial
1278 species.
1279 </p></li>
1280 <li
1281 class="enumerate" id="x1-12016x8"><a
1282 id="XMAY:MAR:HED:SIM:LIU:MOR:STEU:TAU:ROE:GUN:2011"></a>Klaus FX Mayer, Mihaela Martis, Pete E Hedley, Hana Šimková, Hui Liu,
1283 Jenny A Morris, Burkhard Steuernagel, Stefan Taudien, Stephan Roessner,
1284 Heidrun Gundlach, et al. Unlocking the barley genome by chromosomal
1285 and comparative genomics. <span
1286 class="ptmri7t-x-x-120">The Plant Cell</span>, 23(4):1249–1263, 2011
1287 <!--l. 475--><p class="noindent" >In this work <span
1288 class="ptmri7t-x-x-120">Vmatch </span>was used for masking repeats in 454-reads.
1289 </p></li>
1290 <li
1291 class="enumerate" id="x1-12018x9"><a
1292 id="XPUS:MAN:JI:LI:EVA:CRA:MOR:MEA:SIN:SAX:2011"></a>Smruti Pushalkar, Shrinivasrao P Mane, Xiaojie Ji, Yihong Li, Clive Evans,
1293 Oswald R Crasta, Douglas Morse, Robert Meagher, Anup Singh, and
1294 Deepak Saxena. Microbial diversity in saliva of oral squamous cell
1295 carcinoma. <span
1296 class="ptmri7t-x-x-120">FEMS Immunology &#x0026; Medical Microbiology</span>, 61(3):269–277,
1297 2011
1298 <!--l. 478--><p class="noindent" >In this work <span
1299 class="ptmri7t-x-x-120">Vmatch </span>was used to identify distal primers.
1300
1301
1302
1303 </p></li>
1304 <li
1305 class="enumerate" id="x1-12020x10"><a
1306 id="XBRE:SHE:POP:2011"></a>J. E. Breitenbach, K. S. Shelby, and H. JR Popham. Baculovirus induced
1307 transcripts in hemocytes from the larvae of Heliothis virescens. <span
1308 class="ptmri7t-x-x-120">Viruses</span>,
1309 3(11):2047–2064, 2011
1310 <!--l. 483--><p class="noindent" >In this work <span
1311 class="ptmri7t-x-x-120">Vmatch </span>was used for removing redundant transcripts
1312 assembled in an RNA-seq study based on Illumina reads for <span
1313 class="ptmri7t-x-x-120">Heliothis</span>
1314 <span
1315 class="ptmri7t-x-x-120">virescens </span>(tobacco budworm), infected with a virus.
1316 </p></li>
1317 <li
1318 class="enumerate" id="x1-12022x11"><a
1319 id="XTRI:HAM:BUE:TIS:VER:ZIN:LEA:2011"></a>LR Triplett, JP Hamilton, CR Buell, NA Tisserat, V. Verdier, F Zink,
1320 and JE Leach. Genomic analysis of Xanthomonas oryzae isolates from
1321 rice grown in the United States reveals substantial divergence from
1322 known X. oryzae pathovars. <span
1323 class="ptmri7t-x-x-120">Applied and Environmental Microbiology</span>,
1324 77(12):3930–3937, 2011
1325 <!--l. 488--><p class="noindent" >In this work <span
1326 class="ptmri7t-x-x-120">Vmatch </span>was used to search unassembled Illumina reads of US
1327 and African strains of <span
1328 class="ptmri7t-x-x-120">Xanthomonas oryzae </span>for evidence of transcriptional
1329 activator-like effector sequences.
1330 </p></li>
1331 <li
1332 class="enumerate" id="x1-12024x12"><span
1333 class="ptmri7t-x-x-120">Vmatch </span>is used as an integral part of the PriMUX software package
1334 described in
1335 <!--l. 493--><p class="noindent" ><a
1336 id="XHYS:NAR:ELS:CAR:WIL:GAR:2012"></a>D. A. Hysom, P. Naraghi-Arani, M. Elsheikh, A. C. Carrillo, P. L.
1337 Williams, and S. N. Gardner. Skip the alignment: degenerate, multiplex
1338 primer and probe design using K-mer matching instead of alignments. <span
1339 class="ptmri7t-x-x-120">PLoS</span>
1340 <span
1341 class="ptmri7t-x-x-120">ONE</span>, 7(4):e34560, 2012
1342 </p><!--l. 495--><p class="noindent" >In this context <span
1343 class="ptmri7t-x-x-120">Vmatch </span>is used for selecting multiplex compatible, degenerate
1344 primers and probes to detect diverse targets such as viruses.
1345 </p></li>
1346 <li
1347 class="enumerate" id="x1-12026x13"><a
1348 id="XSHE:POP:2012"></a>K. S. Shelby and H. JR Popham. RNA-seq study of microbially
1349 induced hemocyte transcripts from larval Heliothis virescens (Lepidoptera:
1350 Noctuidae). <span
1351 class="ptmri7t-x-x-120">Insects</span>, 3(3):743–762, 2012
1352 <!--l. 499--><p class="noindent" >In this work <span
1353 class="ptmri7t-x-x-120">Vmatch </span>was used to identify redundant contigs from de novo
1354 exome assemblies.
1355 </p></li>
1356 <li
1357 class="enumerate" id="x1-12028x14"><a
1358 id="XHUR:SUL:2013"></a>B. L. Hurwitz and M. B. Sullivan. The Pacific Ocean virome (POV):
1359
1360
1361
1362 a marine viral metagenomic dataset and associated protein clusters for
1363 quantitative viral ecology. <span
1364 class="ptmri7t-x-x-120">PLoS ONE</span>, 8(2):e57355, 2013
1365 <!--l. 503--><p class="noindent" >In this work <span
1366 class="ptmri7t-x-x-120">Vmatch </span>was used to identify reads which have no common
1367 20-mers with other reads in the context of a marine viral metagenome project.
1368 </p></li>
1369 <li
1370 class="enumerate" id="x1-12030x15"><a
1371 id="XZHU:RHO:FESCH:2013"></a>X. Zhuo, M. Rho, and C. Feschotte. Genome-wide characterization of
1372 endogenous retroviruses in the bat Myotis lucifugus reveals recent and
1373 diverse infections. <span
1374 class="ptmri7t-x-x-120">J. Virol.</span>, 87(15):8493–8501, Aug 2013
1375 <!--l. 507--><p class="noindent" >In this work <span
1376 class="ptmri7t-x-x-120">Vmatch </span>was used for clustering potential complete endogenous
1377 retroviruses of the bat <span
1378 class="ptmri7t-x-x-120">Myotis lucifugus </span>into subfamilies.
1379 </p></li>
1380 <li
1381 class="enumerate" id="x1-12032x16">In the three papers
1382 <!--l. 511--><p class="noindent" ><a
1383 id="XHUR:WES:BRU:SUL:2014"></a>B. L. Hurwitz, A. H. Westveld, J. R. Brum, and M. B. Sullivan.
1384 Modeling ecological drivers in marine viral communities using comparative
1385 metagenomics and network analyses. <span
1386 class="ptmri7t-x-x-120">Proc. Natl. Acad. Sci. U.S.A.</span>,
1387 111(29):10714–10719, July 2014
1388 </p><!--l. 513--><p class="noindent" ><a
1389 id="XHUR:DEN:POU:SUL:2013"></a>B. L. Hurwitz, L. Deng, B. T. Poulos, and M. B. Sullivan. Evaluation
1390 of methods to concentrate and purify ocean virus communities
1391 through comparative, replicated metagenomics. <span
1392 class="ptmri7t-x-x-120">Environ. Microbiol.</span>,
1393 15(5):1428–1440, May 2013
1394 </p><!--l. 515--><p class="noindent" ><a
1395 id="XBRU:HUR:SCHOF:DUC:SUL:2015"></a>J. R. Brum, B. L. Hurwitz, O. Schofield, H. W. Ducklow, and M. B.
1396 Sullivan. Seasonal time bombs: dominant temperate viruses affect southern
1397 ocean microbial dynamics. <span
1398 class="ptmri7t-x-x-120">The ISME journal</span>, 2015
1399 </p><!--l. 517--><p class="noindent" ><span
1400 class="ptmri7t-x-x-120">Vmatch </span>was used for <span
1401 class="zptmcm7m-x-x-120">k</span>-mer analysis in the context of different marine
1402 metagenome projects.
1403 </p></li>
1404 <li
1405 class="enumerate" id="x1-12034x17"><a
1406 id="XDEC:PAR:2014"></a>C. J. Decker and R. Parker. Analysis of double-stranded RNA from
1407 microbial communities identifies double-stranded RNA virus-like elements.
1408 <span
1409 class="ptmri7t-x-x-120">Cell reports</span>, 7(3):898–906, 2014
1410 <!--l. 521--><p class="noindent" >In this work <span
1411 class="ptmri7t-x-x-120">Vmatch </span>was used for <span
1412 class="zptmcm7m-x-x-120">k</span>-mer analysis in the context of microbial
1413 communities.
1414 </p></li>
1415 <li
1416 class="enumerate" id="x1-12036x18"><a
1417 id="XBEN:BOU:FIC:KRI:LAR:2014"></a>J. Bengtsson-Palme, F. Boulund, J. Fick, E. Kristiansson, and D. G.
1418
1419
1420
1421 Larsson. Shotgun metagenomics reveals a wide array of antibiotic
1422 resistance genes and mobile elements in a polluted lake in India. <span
1423 class="ptmri7t-x-x-120">Front</span>
1424 <span
1425 class="ptmri7t-x-x-120">Microbiol</span>, 5:648, 2014
1426 <!--l. 525--><p class="noindent" >In this work <span
1427 class="ptmri7t-x-x-120">Vmatch </span>was used in an iterative scheme to construct contigs
1428 from reads associated with resistance genes in the context of a shotgun
1429 metagenome project.
1430 </p></li>
1431 <li
1432 class="enumerate" id="x1-12038x19"><a
1433 id="XNIC:THI:GAR:MCL:FOF:KOS:ELL:BRE:JAC:JAI:2013"></a>Nicholas A Be, James B Thissen, Shea N Gardner, Kevin S McLoughlin,
1434 Viacheslav Y Fofanov, Heather Koshinsky, Sally R Ellingson, Thomas S
1435 Brettin, Paul J Jackson, and Crystal J Jaing. Detection of <span
1436 class="ptmri7t-x-x-120">Bacillus</span>
1437 <span
1438 class="ptmri7t-x-x-120">anthracis </span>DNA in complex soil and air samples using next-generation
1439 sequencing. <span
1440 class="ptmri7t-x-x-120">PloS one</span>, 8(9), 2013
1441 <!--l. 529--><p class="noindent" >In this work <span
1442 class="ptmri7t-x-x-120">Vmatch </span>was used to match probe candidate sequences against
1443 viral sequences and the human genome sequence.
1444 </p></li>
1445 <li
1446 class="enumerate" id="x1-12040x20"><a
1447 id="XHEN:RUM:SCZ:VEL:DIE:GER:GOM:RAH:STO:BOR:2014"></a>Birgit Henrich, Madis Rumming, Alexander Sczyrba, Eunike Velleuer, Ralf
1448 Dietrich, Wolfgang Gerlach, Michael Gombert, Sebastian Rahn, Jens Stoye,
1449 Arndt Borkhardt, et al. <span
1450 class="ptmri7t-x-x-120">Mycoplasma salivarium </span>as a dominant coloniser of
1451 <span
1452 class="ptmri7t-x-x-120">Fanconi anaemia </span>associated oral carcinoma. <span
1453 class="ptmri7t-x-x-120">PloS one</span>, 9(3), 2014
1454 <!--l. 533--><p class="noindent" >In this work <span
1455 class="ptmri7t-x-x-120">Vmatch </span>was used to identify the species of the
1456 Streptococcaceae by comparing with Silva 115 release 16S reference
1457 sequence database.</p></li></ol>
1458 <!--l. 537--><p class="noindent" >
1459 </p>
1460 <h4 class="likesubsectionHead"><a
1461 id="x1-13000"></a>Usages in General Web-Servers or Sequence Analysis Software</h4>
1462 <!--l. 538--><p class="noindent" >
1463 </p><ol class="enumerate1" >
1464 <li
1465 class="enumerate" id="x1-13002x1">Since 2000, the <a
1466 href="http://rsat.ulb.ac.be/rsat/" >RSA-tools</a>, described in
1467 <!--l. 543--><p class="noindent" ><a
1468 id="XHEL:RIO:COL:2000"></a>J. van Helden, A.F. Rios, and J. Collado-Vides. Discovering Regulatory
1469 Elements in Non-Coding Sequences by Analysis of Spaced Dyads. <span
1470 class="ptmri7t-x-x-120">Nucleic</span>
1471
1472
1473
1474 <span
1475 class="ptmri7t-x-x-120">Acids Res.</span>, 28(8):1808–1818, 2000
1476 </p><!--l. 545--><p class="noindent" >and developed by Jacques van Helden use <span
1477 class="ptmri7t-x-x-120">Vmatch </span>to <a
1478 href="http://rsat.ulb.ac.be/rsat/purge-sequence_form.cgi" >purge</a> sequences
1479 before computing sequence statistics. Similar applications are reported in
1480 the following papers:
1481 </p><!--l. 550--><p class="noindent" ><a
1482 id="XHUL:WEE:CRO:GER:HEP:HEL:2003"></a>R.J.M. Hulzink, H. Weerdesteyn, A.F. Croes, M.M.A. Gerats, T. van
1483 Herpen, and J. van Helden. In Silico Identification of Putative Regulatory
1484 Sequence Elements in the 5&#x2019;-Untranslated Region of Genes That Are
1485 Expressed during Male Gametogenesis Gene Co-regulation. <span
1486 class="ptmri7t-x-x-120">Plant Physiol.</span>,
1487 132:75–83, 2003
1488 </p><!--l. 552--><p class="noindent" ><a
1489 id="XSIM:WOD:COH:HEL:2004"></a>N. Simonis, S.J. Wodak, G.N. Cohen, and
1490 J van Helden. Combining Pattern Discovery and Discriminant Analysis to
1491 Predict Gene Co-regulation. <span
1492 class="ptmri7t-x-x-120">Bioinformatics</span>, 20:2370–2379, 2004
1493 </p><!--l. 554--><p class="noindent" ><a
1494 id="XSIM:HEL:COH:WOD:2004"></a>N. Simonis, J. van Helden, G.N. Cohen, and S.J. Wodak. Transcriptional
1495 regulation of protein complexes in yeast. <span
1496 class="ptmri7t-x-x-120">Genome Biology</span>, 5:R33, 2004.
1497 </p></li>
1498 <li
1499 class="enumerate" id="x1-13004x2">The program <a
1500 href="http://splicenest.molgen.mpg.de/" >SpliceNest</a>, described in
1501 <!--l. 559--><p class="noindent" ><a
1502 id="XCOW:HAA:VIN:2002"></a>E. Coward, S.A. Haas, and M. Vingron. SpliceNest: Visualization of Gene
1503 Structure and Alternative Splicing Based on EST Clusters. <span
1504 class="ptmri7t-x-x-120">Trends Genet.</span>,
1505 18(1):53–55, 2002
1506 </p><!--l. 561--><p class="noindent" >computes gene indices and uses <span
1507 class="ptmri7t-x-x-120">Vmatch </span>to <a
1508 href="http://splicenest.molgen.mpg.de/doc/help.html#mapping" >map</a> clustered sequences to large
1509 genomes.
1510 </p></li>
1511 <li
1512 class="enumerate" id="x1-13006x3"><a
1513 href="http://bibiserv.techfak.uni-bielefeld.de/e2g/" >e2g</a> is a web-based server which efficiently maps large EST and cDNA data
1514 sets to genomic DNA. The use of <span
1515 class="ptmri7t-x-x-120">Vmatch </span>makes it possible to significantly extend the
1516 size of data that can be mapped in reasonable time. e2g is available as a
1517 web service and hosts large collections of EST sequences (e.g. 4.1 million
1518 mouse ESTs of 1.87 Gbp) in a precomputed persistent index. For details see
1519 <!--l. 579--><p class="noindent" ><a
1520 id="XKRUE:SCZ:KUR:GIE:2004"></a>J. Krüger, A. Sczyrba, S. Kurtz, and R. Giegerich. e2g: An interactive
1521 web-based server for efficiently mapping large EST and cDNA sets to
1522 genomic sequences. <span
1523 class="ptmri7t-x-x-120">Nucleic Acids Res.</span>, 32:W301–W304, 2004.
1524 </p></li>
1525 <li
1526 class="enumerate" id="x1-13008x4">The <a
1527 href="http://bibiserv.techfak.uni-bielefeld.de/" >Bielefeld Bioinformatics Server</a> provides the <a
1528 href="http://bibiserv.techfak.uni-bielefeld.de/reputer/" >REPuter</a> web-service to
1529 compute repeats in complete genomes. The service is based on <span
1530 class="ptmri7t-x-x-120">Vmatch</span>.
1531
1532
1533
1534 </li>
1535 <li
1536 class="enumerate" id="x1-13010x5"><a
1537 id="XFER:DON:SCHNE:MOR:NAN:BRE:WAL:2004"></a>J. Fernandes, Q. Dong, B. Schneider, D.J. Morrow,
1538 G.-L. Nan, V. Brendel, and V. Walbot. Genome-wide mutagenesis of Zea
1539 mays L. using RescueMu transposons. <span
1540 class="ptmri7t-x-x-120">Genome Biology</span>, 5(10):R82, 2004
1541 <!--l. 589--><p class="noindent" >In this work <span
1542 class="ptmri7t-x-x-120">Vmatch </span>was used to (1) match 130 861 vector-trimmed
1543 sequences against the maize repeat database, and (2) to cluster
1544 near-identical sequences.
1545 </p></li>
1546 <li
1547 class="enumerate" id="x1-13012x6"><a
1548 href="http://www-ab.informatik.uni-tuebingen.de/software/crosslink/welcome.html" >CrossLink</a>, described in
1549 <!--l. 595--><p class="noindent" ><a
1550 id="XDEZ:SCHAEF:WIE:WEI:HUS:2006"></a>T. Dezulian, M. Schaefer, R. Wiese, D. Weigel, and D.H. Huson.
1551 CrossLink: visualization and exploration of sequence relationships between
1552 (micro) RNAs. <span
1553 class="ptmri7t-x-x-120">Nucleic Acids Res.</span>, 34(Web Server Issue):W400–W404,
1554 2006
1555 </p><!--l. 597--><p class="noindent" >is a versatile computational tool which aids in visualizing relationships
1556 between RNA sequences (particularly between ncRNAs and their putative
1557 target transcripts) in an intuitive and accessible way. Besides BLAST,
1558 CrossLink uses <span
1559 class="ptmri7t-x-x-120">Vmatch </span>to reveal the sequence relationships to be visualized.
1560 </p></li>
1561 <li
1562 class="enumerate" id="x1-13014x7">The early version of the web-service <a
1563 href="http://mips.gsf.de/simap/" >Similarity matrix of Proteins (SIMAP)</a>,
1564 see
1565 <!--l. 607--><p class="noindent" ><a
1566 id="XARN:RAT:TIS:TRU:STU:MEW:2005"></a>R. Arnold, T. Rattei, P. Tischler, M.-D. Truong, V. Stümpflen, and H.W.
1567 Mewes. SIMAP - The similarity matrix of proteins. <span
1568 class="ptmri7t-x-x-120">Bioinformatics</span>,
1569 21(Suppl. 2):ii42–ii46, 2005
1570 </p><!--l. 609--><p class="noindent" >used <span
1571 class="ptmri7t-x-x-120">Vmatch </span>to locate the sequences in SIMAP which are similar to a given
1572 query. This is much faster than running BLAST.
1573 </p></li>
1574 <li
1575 class="enumerate" id="x1-13016x8"><a
1576 id="XFIE:VAN:PEE:VAN:NAP:2005"></a>M.W.E.J. Fiers, H. Van de Wetering, T.H.J.M. Peeters, J.J. van
1577 Wijk, and J-P. Nap. DNAVis: interactive visualization of comparative
1578 genome annotations. <span
1579 class="ptmri7t-x-x-120">Bioinformatics</span>, 22(3):354–355, 2005
1580 <!--l. 615--><p class="noindent" >In this work <span
1581 class="ptmri7t-x-x-120">Vmatch </span>was used to compute similarities between genomes,
1582 which are then visualized by the program <a
1583 href="http://www.win.tue.nl/dnavis/" >DNAVis</a>.
1584 </p></li>
1585 <li
1586 class="enumerate" id="x1-13018x9">In the paper
1587
1588
1589
1590 <!--l. 619--><p class="noindent" ><a
1591 id="XSEI:KRUE:HAR:SCHWA:LOEW:MER:DAN:GIE:2006"></a>P.N. Seibel, J. Krüger, S. Hartmeier, K. Schwarzer, K. Löwenthal,
1592 H. Mersch, T. Dandekar, and R. Giegerich. XML schemas for common
1593 bioinformatic data types and their application in workflow systems. <span
1594 class="ptmri7t-x-x-120">BMC</span>
1595 <span
1596 class="ptmri7t-x-x-120">Bioinformatics</span>, 7:490, 2006
1597 </p><!--l. 621--><p class="noindent" >Seibel et al. describe methods for creating web-services and give
1598 examples which, among other tools, also integrate <span
1599 class="ptmri7t-x-x-120">Vmatch</span>.
1600 </p></li>
1601 <li
1602 class="enumerate" id="x1-13020x10">The program <span
1603 class="ptmri7t-x-x-120">Gepard</span>
1604 <!--l. 628--><p class="noindent" ><a
1605 id="XKRU:ARN:RAT:2007"></a>J. Krumsiek, R. Arnold, and T. Rattei. Gepard: a rapid and sensitive tool
1606 for creating dotplots on genome scale. <span
1607 class="ptmri7t-x-x-120">Bioinformatics</span>, 23(8):1026–8, 2007
1608 </p><!--l. 630--><p class="noindent" >uses <span
1609 class="ptmri7t-x-x-120">mkvtree </span>to compute enhanced suffix arrays.
1610 </p></li>
1611 <li
1612 class="enumerate" id="x1-13022x11"><span
1613 class="ptmri7t-x-x-120">Vmatch </span>is used as part of the transcriptome assembler software Rnnotator,
1614 described in
1615 <!--l. 636--><p class="noindent" ><a
1616 id="XMAR:BRU:FAN:MEN:BLO:ZHA:SHE:SNY:WAN:2010"></a>J. Martin, V. M. Bruno, Z. Fang, X. Meng, M. Blow, T. Zhang,
1617 G. Sherlock, M. Snyder, and Z. Wang. Rnnotator: an automated de novo
1618 transcriptome assembly pipeline from stranded RNA-Seq reads. <span
1619 class="ptmri7t-x-x-120">BMC</span>
1620 <span
1621 class="ptmri7t-x-x-120">Genomics</span>, 11:663, 2010
1622 </p></li>
1623 <li
1624 class="enumerate" id="x1-13024x12">The BioExtract-Server described in
1625 <!--l. 640--><p class="noindent" ><a
1626 id="XLUS:JEN:BRE:2011"></a>C. M. Lushbough, D. M. Jennewein, and V. Brendel. The bioextract
1627 server: a web-based bioinformatic workflow platform. <span
1628 class="ptmri7t-x-x-120">Nucleic acids</span>
1629 <span
1630 class="ptmri7t-x-x-120">research</span>, 39(suppl 2):W528–W532, 2011
1631 </p><!--l. 642--><p class="noindent" >uses <span
1632 class="ptmri7t-x-x-120">Vmatch </span>to remove duplicated sequences.
1633 </p></li>
1634 <li
1635 class="enumerate" id="x1-13026x13"><a
1636 id="XLUS:GNI:DOO:2015"></a>C. M. Lushbough, E. Z. Gnimpieba, and R. Dooley. Life science data
1637 analysis workflow development using the bioextract server leveraging the
1638 iplant collaborative cyberinfrastructure. <span
1639 class="ptmri7t-x-x-120">Concurrency and Computation:</span>
1640 <span
1641 class="ptmri7t-x-x-120">Practice and Experience</span>, 27(2):408–419, 2015
1642 <!--l. 648--><p class="noindent" >In this work <span
1643 class="ptmri7t-x-x-120">Vmatch </span>was used for removing duplicates in BlastP results.
1644 This use is part of a workflow in <a
1645 href="http://www.myexperiment.org/workflows/3131.html" >myexperiment</a>.
1646
1647
1648
1649 </p></li>
1650 <li
1651 class="enumerate" id="x1-13028x14"><a
1652 id="XGRE:LOY:HOR:RAT:2015"></a>Daniel Greuter, Alexander Loy, Matthias Horn, and Thomas Rattei.
1653 ProbeBase-an online resource for rRNA-targeted oligonucleotide probes
1654 and primers: new features 2016. <span
1655 class="ptmri7t-x-x-120">Nucleic acids research</span>, page gkv1232,
1656 2015
1657 <!--l. 651--><p class="noindent" >In this work <span
1658 class="ptmri7t-x-x-120">Vmatch </span>was used for probe/primer search functionality in the
1659 probeBase database.</p></li></ol>
1660 <!--l. 655--><p class="noindent" >
1661 </p>
1662 <h4 class="likesubsectionHead"><a
1663 id="x1-14000"></a>Current Usages in Human Genome Research</h4>
1664 <!--l. 656--><p class="noindent" >
1665 </p><ol class="enumerate1" >
1666 <li
1667 class="enumerate" id="x1-14002x1"><a
1668 id="XBUC:JAR:MEN:MAT:SCO:GRE:LAN:DUM:2005"></a>P.G. Buckley, C. Jarbo, U. Menzel, T. Mathiesen, C. Scott, S.G. Gregory,
1669 C.F. Langford, and J.P. Dumanski. Comprehensive DNA Copy Number
1670 Profiling of Meningioma Using a Chromosome 1 Tiling Path Microarray
1671 identifies Novel Candidate Tumor Suppressor Loci. <span
1672 class="ptmri7t-x-x-120">Cancer Res.</span>,
1673 65(7):2653–2661, 2005
1674 <!--l. 659--><p class="noindent" >In this work <span
1675 class="ptmri7t-x-x-120">Vmatch </span>was used to reveal long repeats inside human
1676 chromosome 1 and long similar regions between human chromosome 1 and
1677 all other human chromosomes.
1678 </p></li>
1679 <li
1680 class="enumerate" id="x1-14004x2"><a
1681 id="XLIA:WAN:LIU:JI:LIU:CHE:WEB:REE:DEA:2007"></a>Liang, C. and Wang, G. and Liu, L. and Ji, G. and Liu, Y. and Chen, J. and
1682 Webb, J.S. and Reese, G. and Dean, J.F.D. WebTraceMiner: a web service
1683 for processing and mining EST sequence trace files. <span
1684 class="ptmri7t-x-x-120">Nucleic Acids Res</span>,
1685 35(Web Server issue):W137–42, 2007
1686 <!--l. 662--><p class="noindent" >In this work <span
1687 class="ptmri7t-x-x-120">Vmatch </span>was used for vector screening.
1688 </p></li>
1689 <li
1690 class="enumerate" id="x1-14006x3"><a
1691 id="XNYG:JAC:LIN:ERI:BAL:FLY:TOL:MOE:SOE:KRO:LIT:2009"></a>Sanne Nygaard, Anders Jacobsen, Morten Lindow, Jens Eriksen, Eva
1692 Balslev, Henrik Flyger, Niels Tolstrup, Søren Møller, Anders Krogh, and
1693 Thomas Litman. Identification and analysis of mirnas in human breast
1694
1695
1696
1697 cancer and teratoma samples using deep sequencing. <span
1698 class="ptmri7t-x-x-120">BMC Medical</span>
1699 <span
1700 class="ptmri7t-x-x-120">Genomics</span>, 2(1):35, 2009
1701 <!--l. 665--><p class="noindent" >In this work <span
1702 class="ptmri7t-x-x-120">Vmatch </span>was used for mapping short reads.
1703 </p></li>
1704 <li
1705 class="enumerate" id="x1-14008x4"><a
1706 id="XCOL:SOB:LU:THA:BOW:BRO:GRE:BAR:HUT:2009"></a>Christian Cole, Andrew Sobala, Cheng Lu, Shawn R Thatcher, Andrew
1707 Bowman, John WS Brown, Pamela J Green, Geoffrey J Barton, and
1708 Gyorgy Hutvagner. Filtering of deep sequencing data reveals the
1709 existence of abundant dicer-dependent small rnas derived from trnas. <span
1710 class="ptmri7t-x-x-120">Rna</span>,
1711 15(12):2147–2160, 2009
1712 <!--l. 668--><p class="noindent" >In this work <span
1713 class="ptmri7t-x-x-120">Vmatch </span>was used for matching reads to sets of RNA sequences
1714 and the Human genome.
1715 </p></li>
1716 <li
1717 class="enumerate" id="x1-14010x5"><a
1718 id="XCLO:WAN:XU:GU:LEA:HEA:BAR:STE:MAR:NOU:2011"></a>N. Cloonan, S. Wani, Q. Xu, J. Gu, K. Lea, S. Heater, C. Barbacioru,
1719 A. L. Steptoe, H. C. Martin, E. Nourbakhsh, et al. Micrornas and their
1720 isomirs function cooperatively to target common biological pathways.
1721 <span
1722 class="ptmri7t-x-x-120">Genome Biol</span>, 12(12):R126, 2011
1723 <!--l. 671--><p class="noindent" >In this work <span
1724 class="ptmri7t-x-x-120">Vmatch </span>was used to uniquely map miRNAs against the human
1725 genome.
1726 </p></li>
1727 <li
1728 class="enumerate" id="x1-14012x6"><a
1729 id="XTAK:TSU:KAT:OKA:HOR:IKE:URA:KAW:HAS:IKE:2011"></a>K Takayama, S Tsutsumi, S Katayama, T Okayama, K Horie-Inoue,
1730 K Ikeda, T Urano, C Kawazu, A Hasegawa, K Ikeo, et al. Integration
1731 of cap analysis of gene expression and chromatin immunoprecipitation
1732 analysis on array reveals genome-wide androgen receptor signaling in
1733 prostate cancer cells. <span
1734 class="ptmri7t-x-x-120">Oncogene</span>, 30(5):619–630, 2011
1735 <!--l. 674--><p class="noindent" >In this work <span
1736 class="ptmri7t-x-x-120">Vmatch </span>was used to determine the positions of CAGE tags on
1737 the human genome.
1738 </p></li>
1739 <li
1740 class="enumerate" id="x1-14014x7"><a
1741 id="XKEV:LAL:LI:CAV:NAR:KAM:MIT:HAK:KOZ:GEN:2011"></a>Kevin CH Ha, Emilie Lalonde, Lili Li, Luca Cavallone, Rachael Natrajan,
1742 Maryou B Lambros, Costas Mitsopoulos, Jarle Hakas, Iwanka Kozarewa,
1743 Kerry Fenwick, et al. Identification of gene fusion transcripts by
1744 transcriptome sequencing in BRCA1-mutated breast cancers and cell lines.
1745 <span
1746 class="ptmri7t-x-x-120">BMC Medical Genomics</span>, 4(1):75, 2011
1747 <!--l. 677--><p class="noindent" >In this work <span
1748 class="ptmri7t-x-x-120">Vmatch </span>was used to align sections of reads against RefSeq
1749 mRNA exon sequences.
1750
1751
1752
1753 </p></li>
1754 <li
1755 class="enumerate" id="x1-14016x8"><a
1756 id="XKID:CHE:WAN:JAC:ZHA:BOY:FIR:TAN:GAE:COL:2012"></a>Marie J Kidd, Zhiliang Chen, Yan Wang, Katherine J Jackson, Lyndon
1757 Zhang, Scott D Boyd, Andrew Z Fire, Mark M Tanaka, Bruno A Gaëta,
1758 and Andrew M Collins. The inference of phased haplotypes for the
1759 immunoglobulin h chain v region gene loci by analysis of vdj gene
1760 rearrangements. <span
1761 class="ptmri7t-x-x-120">The Journal of Immunology</span>, 188(3):1333–1340, 2012
1762 <!--l. 680--><p class="noindent" >In this work <span
1763 class="ptmri7t-x-x-120">Vmatch </span>was used to align sets of genes.
1764 </p></li>
1765 <li
1766 class="enumerate" id="x1-14018x9"><a
1767 id="XYAM:IKE:BOE:HOR:TAK:URA:KAI:CAR:KAW:HAY:2014"></a>Ryonosuke Yamaga, Kazuhiro Ikeda, Joost Boele, Kuniko Horie-Inoue,
1768 Ken-ichi Takayama, Tomohiko Urano, Kaoru Kaida, Piero Carninci,
1769 Jun Kawai, Yoshihide Hayashizaki, et al. Systemic identification of
1770 estrogen-regulated genes in breast cancer cells through cap analysis
1771 of gene expression mapping. <span
1772 class="ptmri7t-x-x-120">Biochemical and biophysical research</span>
1773 <span
1774 class="ptmri7t-x-x-120">communications</span>, 447(3):531–536, 2014
1775 <!--l. 683--><p class="noindent" >In this work <span
1776 class="ptmri7t-x-x-120">Vmatch </span>was used to determine the positions of CAGE tags on
1777 the human genome.
1778 </p>
1779 </li></ol>
1780 <!--l. 688--><p class="noindent" >
1781 </p>
1782 <h4 class="likesubsectionHead"><a
1783 id="x1-15000"></a>Current Usages for different Model Organisms</h4>
1784 <!--l. 689--><p class="noindent" >
1785 </p><ol class="enumerate1" >
1786 <li
1787 class="enumerate" id="x1-15002x1"><a
1788 id="XSCZ:BECK:BRI:GIE:ALT:2005"></a>A. Sczyrba, M. Beckstette, A.H. Brivanlou, R. Giegerich, and C.R.
1789 Altmann. Xendb: Full length cDNA prediction and cross species mapping
1790 in <span
1791 class="ptmri7t-x-x-120">xenopus laevis</span>. <span
1792 class="ptmri7t-x-x-120">BMC Genomics</span>, 2005
1793 <!--l. 706--><p class="noindent" >In this work <span
1794 class="ptmri7t-x-x-120">Vmatch </span>was used to cluster 317 242 EST and cDNA sequences
1795 from <span
1796 class="ptmri7t-x-x-120">Xenopus laevis</span>. <span
1797 class="ptmri7t-x-x-120">Vmatch </span>was chosen for the following reasons:
1798 </p>
1799
1800
1801
1802 <ul class="itemize1">
1803 <li class="itemize">First, there was no clustering tool available which could handle large
1804 data sets efficiently, and which was documented well enough to allow
1805 a detailed replication and evaluation of existing clusters.
1806 </li>
1807 <li class="itemize">Second, <span
1808 class="ptmri7t-x-x-120">Vmatch </span>identifies similarities between sequences rapidly, and
1809 it provides additional options to cluster a set of sequences based on
1810 these matches. Furthermore, the <span
1811 class="ptmri7t-x-x-120">Vmatch </span>output provides information
1812 about how the clusters were derived. Due to the efficiency of <span
1813 class="ptmri7t-x-x-120">Vmatch</span>, it
1814 was possible to perform the clustering for a wide variety of parameters
1815 on the complete sequence set. This makes it possible to study the effect of the
1816 parameter choice on the clustering.</li></ul>
1817 </li>
1818 <li
1819 class="enumerate" id="x1-15004x2"><a
1820 id="XSPIT:LOR:CUL:SCZ:FUEL:2006"></a>M. Spitzer, S. Lorkowski, P. Cullen, A. Sczyrba, and G. Fuellen. Distinguishing
1821 isoforms and paralogs on the protein level. <span
1822 class="ptmri7t-x-x-120">BMC Bioinformatics</span>, 7:110,
1823 2006
1824 <!--l. 709--><p class="noindent" >In this work <span
1825 class="ptmri7t-x-x-120">Vmatch </span>was used to cluster EST-sequences of <span
1826 class="ptmri7t-x-x-120">Xenopus</span>
1827 <span
1828 class="ptmri7t-x-x-120">laevis</span>.
1829 </p></li>
1830 <li
1831 class="enumerate" id="x1-15006x3"><a
1832 id="XEIS:COY:WU:WU:THI:WOR:BAD:REN:AME:JON:2006"></a>J.A. Eisen, R.S. Coyne, M. Wu, D. Wu, M. Thiagarajan, J.R. Wortman, J.H.
1833 Badger, Q. Ren, P. Amedeo, and K.M. Jones et al. Macronuclear Genome
1834 Sequence of the Ciliate Tetrahymena thermophila, a Model Eukaryote. <span
1835 class="ptmri7t-x-x-120">PLoS</span>
1836 <span
1837 class="ptmri7t-x-x-120">Biology</span>, 4(9):e286, 2006
1838 <!--l. 713--><p class="noindent" >In this work <span
1839 class="ptmri7t-x-x-120">Vmatch </span>was used to search for exact repeats in the Macronuclear
1840 Genome Sequence of the Ciliate <span
1841 class="ptmri7t-x-x-120">Tetrahymena thermophila</span>.
1842 </p></li>
1843 <li
1844 class="enumerate" id="x1-15008x4"><a
1845 id="XFAU:FOR:CHA:SCHRO:HAY:CAR:HUM:GRI:2008"></a>G. J. Faulkner, A. R. Forrest, A. M. Chalk, K. Schroder, Y. Hayashizaki,
1846 P. Carninci, D. A. Hume, and S. M. Grimmond. A rescue strategy for
1847 multimapping short sequence tags refines surveys of transcriptional activity by
1848 CAGE. <span
1849 class="ptmri7t-x-x-120">Genomics</span>, 91(3):281–288, Mar 2008
1850 <!--l. 736--><p class="noindent" >In this work <span
1851 class="ptmri7t-x-x-120">Vmatch </span>was used for mapping </p>
1852 <ul class="itemize1">
1853 <li class="itemize">11 567 973 FANTOM3 mouse CAGE tags to the mouse genome with
1854 minimum match length of 18 bp, a single internal mismatch allowed,
1855
1856
1857
1858 and multiple mismatches allowed at tag ends.
1859 </li>
1860 <li class="itemize">Affymetrix GNF probe sequences to transcripts without allowing for
1861 mismatches.</li></ul>
1862 </li>
1863 <li
1864 class="enumerate" id="x1-15010x5"><a
1865 id="XPRI:JOR:2008"></a>Jittima Piriyapongsa and I King Jordan. Dual coding of sirnas and mirnas by
1866 plant transposable elements. <span
1867 class="ptmri7t-x-x-120">RNA</span>, 14(5):814–821, 2008
1868 <!--l. 741--><p class="noindent" >In this work <span
1869 class="ptmri7t-x-x-120">Vmatch </span>was used to search for small RNA signatures in entire miRNA
1870 gene sequences for Arabidopsis and rice.
1871 </p></li>
1872 <li
1873 class="enumerate" id="x1-15012x6"><a
1874 id="XTAF:GLA:LASS:HAY:CAR:MAT:2009"></a>R. J. Taft, E. A. Glazov, T. Lassmann, Y. Hayashizaki, P. Carninci, and J. S.
1875 Mattick. Small RNAs derived from snoRNAs. <span
1876 class="ptmri7t-x-x-120">RNA</span>, 15(7):1233–1240, Jul
1877 2009
1878 <!--l. 745--><p class="noindent" >In this work <span
1879 class="ptmri7t-x-x-120">Vmatch </span>was used to map small RNA data sets onto the
1880 corresponding reference genomes for different model organisms.
1881 </p></li>
1882 <li
1883 class="enumerate" id="x1-15014x7"><a
1884 id="XPLE:PAS:BER:AKA:CAR:VAS:LAZ:SEV:VLA:SIM:2012"></a>C. Plessy, G. Pascarella, N. Bertin, A. Akalin, C. Carrieri, A. Vassalli,
1885 D. Lazarevic, J. Severin, C. Vlachouli, R. Simone, et al. Promoter architecture
1886 of mouse olfactory receptor genes. <span
1887 class="ptmri7t-x-x-120">Genome research</span>, 22(3):486–497,
1888 2012
1889 <!--l. 748--><p class="noindent" >In this work <span
1890 class="ptmri7t-x-x-120">Vmatch </span>was used for mapping Illumina reads to the mouse
1891 genome.
1892 </p></li>
1893 <li
1894 class="enumerate" id="x1-15016x8"><a
1895 id="XKEN:SHI:2012"></a>Nathan J Kenny and Sebastian M Shimeld. Additive multiple k-mer
1896 transcriptome of the keelworm <span
1897 class="ptmri7t-x-x-120">Pomatoceros lamarckii </span>(annelida; serpulidae)
1898 reveals annelid trochophore transcription factor cassette. <span
1899 class="ptmri7t-x-x-120">Development genes and</span>
1900 <span
1901 class="ptmri7t-x-x-120">evolution</span>, 222(6):325–339, 2012
1902 <!--l. 752--><p class="noindent" >In this work <span
1903 class="ptmri7t-x-x-120">Vmatch </span>was used for redundancy removal in the context of
1904 transcriptome assembly of a keelworm species.
1905 </p></li>
1906 <li
1907 class="enumerate" id="x1-15018x9"><a
1908 id="XGOS:OHM:KOG:SON:TUR:ZAJ:ZAL:GRU:SUN:HAN:2014"></a>Cene Gostin, Robin A Ohm, Tina Kogej, Silva Sonjak, Martina Turk, Janja Zajc,
1909 Polona Zalar, Martin Grube, Hui Sun, James Han, et al. Genome sequencing of
1910 four aureobasidium pullulans varieties: biotechnological potential, stress
1911
1912
1913
1914 tolerance, and description of new species. <span
1915 class="ptmri7t-x-x-120">BMC Genomics</span>, 15(1):549,
1916 2014
1917 <!--l. 756--><p class="noindent" >In this work <span
1918 class="ptmri7t-x-x-120">Vmatch </span>was used to remove redundant contigs in a genome project
1919 of four <span
1920 class="ptmri7t-x-x-120">Aureobasidium pullulans </span>varieties.
1921 </p></li>
1922 <li
1923 class="enumerate" id="x1-15020x10"><a
1924 id="XMCM:GAR:BAI:KEM:WAR:CEV:ROB:SCHUL:BAL:HOL:2015"></a>M. McMullan, A. Gardiner, K. Bailey, E. Kemen, B. J. Ward, V. Cevik,
1925 A. Robert-Seilaniantz, T. Schultz-Larsen, A. Balmuth, E. Holub, et al.
1926 Evidence for suppression of immunity as a driver for genomic introgressions and
1927 host range expansion in races of albugo candida, a generalist parasite. <span
1928 class="ptmri7t-x-x-120">eLife</span>,
1929 4:e04550, 2015
1930 <!--l. 759--><p class="noindent" >In this work <span
1931 class="ptmri7t-x-x-120">Vmatch </span>was used for merging assemblies of Illumina sequenced
1932 cDNA.
1933 </p></li>
1934 <li
1935 class="enumerate" id="x1-15022x11"><a
1936 id="XMOR:DHA:PAV:TRO:WHE:HEL:2015"></a>C Morandin, K Dhaygude, J Paviala, K Trontti, C Wheat, and H Helanterä.
1937 Caste-biases in gene expression are specific to developmental stage in the ant
1938 formica exsecta. <span
1939 class="ptmri7t-x-x-120">Journal of evolutionary biology</span>, 28(9):1705–1718,
1940 2015
1941 <!--l. 773--><p class="noindent" >In this work <span
1942 class="ptmri7t-x-x-120">Vmatch </span>was used to combine and scaffold contigs.
1943 </p>
1944 </li></ol>
1945 <!--l. 778--><p class="noindent" >Total number of usages: 108
1946 </p><!--l. 780--><p class="noindent" >
1947 </p>
1948 <h3 class="likesectionHead"><a
1949 id="x1-16000"></a>Availability</h3>
1950 <!--l. 781--><p class="noindent" ><span
1951 class="ptmri7t-x-x-120">Vmatch </span>is available for <a
1952 href="http://www.vmatch.de/download.html" >download</a> in executable form for the following platforms:
1953 </p>
1954 <ul class="itemize1">
1955 <li class="itemize">Linux
1956
1957
1958
1959 </li>
1960 <li class="itemize">Mac OS X
1961 </li>
1962 <li class="itemize">MS Windows</li></ul>
1963 <!--l. 794--><p class="noindent" >
1964 </p>
1965 <h3 class="likesectionHead"><a
1966 id="x1-17000"></a>Developer</h3>
1967 <!--l. 795--><p class="noindent" ><span
1968 class="ptmri7t-x-x-120">Vmatch </span>has been developed since May 2000 by <a
1969 href="http://www.zbh.uni-hamburg.de/kurtz" >Stefan Kurtz</a>, a professor of Computer
1970 Science at the Center for Bioinformatics, University of Hamburg, Germany.
1971 </p><!--l. 809--> <b>Important Documents</b> <ul> <li> The <a href="virtman.pdf"><i>Vmatch</i>-manual</a> </li> </ul>
1972 <!--l. 817--> <div id="footer"> Copyright &copy; 2000-2017 <a href="mailto:kurtz@zbh.uni-hamburg.de"> Stefan Kurtz</a>. Last update: 2017-06-15 </div>
1973 <!--l. 839--> <!-- Piwik --> <div id="piwik"> <script type="text/javascript"> var pkBaseURL = "https://zenlicensemanager.com/piwik/"; document.write(unescape("</script><script type="text/javascript"> try { var piwikTracker = Piwik.getTracker(pkBaseURL + "piwik.php", 3); piwikTracker.trackPageView(); piwikTracker.enableLinkTracking(); } catch( err ) {} </script> <br/> <noscript> <img src="https://zenlicensemanager.com/piwik/piwik.php?idsite=3" style="border:0" alt=""/> </noscript> <!-- End Piwik Tracking Tag --> </div>
1974
1975 </body></html>
1976
1977
1978
1979
+0
-2
src/doc/WWW/introexclude
0 \\begin{AboutVmatch}
1 \\end{AboutVmatch}
src/doc/WWW/matchgraph.gif
Binary diff not shown
+0
-12
src/doc/WWW/remove-html.rb
0 #!/usr/bin/env ruby
1
2 def remove_html_tags(s)
3 re = /<("[^"]*"|'[^']*'|[^'">])*>/
4 return s.gsub(re, "")
5 end
6
7 ARGV.each do |filename|
8 # Read in binary mode; File.binread also closes the file handle.
9 contents = File.binread(filename)
10 print remove_html_tags(contents)
11 end
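A minimal usage sketch for the `remove_html_tags` helper above, assuming the same regular expression. The sample string is made up for illustration. Note that only tags are removed: entities such as `&copy;` pass through unchanged, and a `>` inside a quoted attribute value does not end the tag.

```ruby
# Same pattern as in remove-html.rb: a tag is "<", then any mix of
# quoted attribute values and other characters, then ">".
def remove_html_tags(s)
  re = /<("[^"]*"|'[^']*'|[^'">])*>/
  s.gsub(re, "")
end

# Hypothetical sample input, not taken from the Vmatch web page.
sample = %q{<p class="noindent" title="a > b"><i>Vmatch</i> &copy; 2017</p>}
stripped = remove_html_tags(sample)
puts stripped
```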
+0
-20
src/doc/WWW/replace-header.rb
0 #!/usr/bin/env ruby
1
2 STDIN.each_line do |line|
3 if line.match(/<meta name=\"src\" content="vmweb.tex"/)
4 print "#{line}"
5 puts <<'HEADER'
6 <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"/>
7 <meta name="description" CONTENT="The Vmatch large scale sequence analysis
8 software is a versatile software tool for efficiently solving large scale sequence matching tasks."/>
9 <meta name="keywords" CONTENT="sequence analysis, sequence mapping, BLAST, bioinformatics, computational biology"/>
10 <meta http-equiv="Content-Style-Type" content="text/css"/>
11 HEADER
12 elsif line.match(/class=\"(titleHead|author|date)\"/)
13 print line.gsub(/class=/,"align=\"center\" class=").gsub(/h2/,"h1")
14 elsif line.match(/\/h2/)
15 print line.gsub(/\/h2/,"\/h1")
16 else
17 print line
18 end
19 end
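A short sketch of how alternation binds in a Ruby regular expression, which is relevant to the `class=` test in the script above. Without parentheses, `|` splits the whole pattern, so a line that merely contains the word `author` also matches. The two sample lines are hypothetical.

```ruby
# Without a group, "|" splits the entire pattern into three alternatives:
#   class="titleHead   OR   author   OR   date"
ungrouped = /class=\"titleHead|author|date\"/
# With a group, the alternation applies only to the class name.
grouped = /class=\"(titleHead|author|date)\"/

# Hypothetical input lines for illustration.
heading = '<h2 class="titleHead">Vmatch</h2>'
prose   = '<p>The author of Vmatch</p>'

ungrouped_hits = [heading, prose].map { |l| !!(l =~ ungrouped) }
grouped_hits   = [heading, prose].map { |l| !!(l =~ grouped) }
p ungrouped_hits
p grouped_hits
```

Whether the broader match is harmless depends on the actual tex4ht output, so this is only an illustration of the precedence rule.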
+0
-3
src/doc/WWW/replace-par.rb
0 #!/usr/bin/env ruby
1
2 puts STDIN.read.gsub(/<p class=\"noindent\" > <!--delete paragraph-->(.*)\n?<\/p\>/,"\\1\n")
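A small sketch of why the choice between `gsub` and `gsub!` matters for a filter like this one: `gsub!` returns `nil` when nothing was replaced, so printing its result would drop the whole input, whereas `gsub` always returns a string. The sample text and simplified pattern are assumptions for illustration.

```ruby
# Hypothetical input that contains no "delete paragraph" marker.
text = "<p>no marker here</p>\n"
pattern = /<!--delete paragraph-->/

# gsub! modifies in place and returns nil if no substitution happened.
bang_result = text.dup.gsub!(pattern, "")
# gsub returns a new string, identical to the input if nothing matched.
plain_result = text.gsub(pattern, "")

p bang_result
p plain_result
```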
+0
-44
src/doc/WWW/vmweb.css
0 body {
1 font-family: Verdana, Geneva, Arial, sans-serif;
2 }
3 h1, h2, h3, h4, h5, h6 {
4 color:black;
5 }
6 #w3checklogo {
7 float: right;
8 clear: right;
9 }
10 #downloadbox {
11 float: right;
12 clear: right;
13 width: 280px;
14 padding-right: 20px;
15 }
16 #downloadbox li {
17 list-style-type: none;
18 }
19 #downloadbox li a {
20 list-style-type: none;
21 border : 1px solid black;
22 display: block;
23 padding: 4px 15px 4px 15px;
24 text-decoration: none;
25 }
26 #downloadbox li a:link {
27 color: black;
28 background: rgb(90%, 90%, 20%);
29 }
30 #downloadbox li a:visited {
31 color: gray;
32 background: rgb(90%, 90%, 20%);
33 }
34 #downloadbox li a:hover {
35 color: white;
36 background: rgb(60%, 60%, 90%);
37 }
38 #footer {
39 font-size: 66%;
40 text-align: center;
41 margin-top: 20px;
42 background-color: #DDDDDD;
43 }
+0
-840
src/doc/WWW/vmweb.tex
0 \documentclass[12pt]{article}
1 \usepackage{ifpdf}
2 \usepackage{graphicx}
3 \usepackage{mathptmx}
4 \usepackage{natbib}
5 \usepackage{numprint}
6 \usepackage{bibentry}
7 \usepackage{xspace}
8 \usepackage[latin1]{inputenc}
9 \usepackage{hyperref}
10 \hypersetup{%
11 colorlinks=true,
12 linkcolor=blue
13 }
14
15 \newcommand{\Vmatch}[0]{\textit{Vmatch}\xspace}
16 \newcommand{\Mybibentry}[2]{\item \bibentry{#1} \par%
17 In this work \Vmatch was used \xspace #2\xspace}
18 %\newcommand{\href}[2]{#2}
19 %\nobibliography*
20 \newcounter{Allusages}
21 \newcommand{\Updateusages}{\addtocounter{Allusages}{\theenumi}}
22 \newcommand{\PrintVol}[1]{#1}
23 \newcommand{\Includegraphics}[2]{%
24 \includegraphics[#1]{#2.pdf}
25 }
26
27 \makeatletter
28 \edef\texforht{TT\noexpand\fi
29 \@ifpackageloaded{tex4ht}
30 {\noexpand\iftrue}
31 {\noexpand\iffalse}}
32 \makeatother
33 \ifpdf
34 \newcommand{\HCode}[1]{}
35 \fi
36
37 \title{The \Vmatch large scale sequence analysis software}
38 \author{Stefan Kurtz}
39 \date{\today}
40 \parskip5pt
41 \parindent0pt
42 \begin{document}
43 \maketitle
44
45 \bibliographystyle{plain}
46 \nobibliography{defines,ltr,assembly,algorithms,rnafolding,commolbio,biotools,kurtz,genomes,strings,genetics,metagenomes}
47
48 \HCode{
49 <!--delete paragraph-->
50 <br/>
51 <center>
52 <img src="matchgraph.gif"
53 alt="show matches of different sizes in a matchgraph"/>
54 </center>
55 <div id="downloadbox">
56 <ul>
57 <li><a href="download.html">Download <i>Vmatch</i>!</a></li>
58 </ul>
59 </div>
60 }
61
62 This is the web-site for \Vmatch,
63 a versatile software tool for efficiently
64 solving large scale se\-quence matching tasks.
65 \Vmatch subsumes the software tool
66 \href{http://bibiserv.techfak.uni-bielefeld.de/reputer}{REPuter},
67 but is much more general, with a very flexible user interface,
68 and improved space and time requirements.
69 \HCode{
70 <a href="vmweb.pdf">Here</a> is a printable version of this
71 HTML-page in PDF.
72 }
73
74 \section*{Features of \Vmatch}
75 The \href{virtman.pdf}{\Vmatch-manual}
76 gives many examples of how to use \Vmatch. Here are the program's most
77 important features.
78
79 \input{introduction.inc}
80
81 \HCode{
82 <a href="Dataflowfig.pdf">Here</a> is an overview of the dataflow
83 in <i>Vmatch</i>.
84 }
85
86 \section*{Related tools}
87 There are several tools which are
88 based on the persistent index of \Vmatch:
89
90 \begin{description}
91 \item[Genalyzer]
92 is a graphical user interface
93 to visualize the output of \Vmatch in form of a match graph.
94 For details see
95
96 \bibentry{CHO:SCHLE:KUR:GIE:2004}
97
98 Genalyzer is no longer available.
99 \item[\href{http://bibiserv.techfak.uni-bielefeld.de/mga/}{MGA}]
100 is a program to compute multiple alignments of complete
101 genomes. For details see
102
103 \bibentry{HOEH:KUR:OHL:2002}
104 \item[Multimat] is a program to compute multiple exact matches between
105 three or more genome size sequences. For details see
106
107 \bibentry{OHL:KUR:2008}
108
109 Please contact
110 \href{http://www.zbh.uni-hamburg.de/kurtz}{Stefan Kurtz} if you are interested
111 in using Multimat.
112
113 \item[\href{http://bibiserv.techfak.uni-bielefeld.de/possumsearch/}{PossumSearch}]
114 is a program to search for position specific scoring matrices.
115 For details, see
116
117 \bibentry{BEC:HOM:GIE:KUR:2006}
118 \item
119 \item[\href{http://www.genomethreader.org/}{GenomeThreader}]
120 is a software tool to compute gene structure predictions. The gene structure
121 predictions are calculated using a similarity-based approach where additional
122 cDNA/EST and/or protein sequences are used to predict gene structures via
123 spliced alignments. \textit{GenomeThreader} uses the matching capabilities
124 of \Vmatch to efficiently map the reference sequence to a genomic
125 sequence. For details, see
126
127 \bibentry{GRE:BRE:SPA:KUR:2005}
128 \item
129 \item[\href{http://www.biopieces.org/}{Biopieces}]
130 is a collection of bioinformatics tools that can be pieced together in a
131 very easy and flexible manner to perform both simple and complex tasks.
132 Some Biopieces depend on \Vmatch. For details see
133 \url{http://www.biopieces.org/}.
134 \end{description}
135
136 \HCode{
137 <a name="CurrentUsage"/>
138 }
139 \section*{Previous and Current Usages}
140
141 We provide an annotated bibliography listing papers which applied \Vmatch
142 and briefly describe the tasks for which \Vmatch was used. We omit our own
143 papers. The references were collected by a
144 \href{https://scholar.google.de/scholar?q=Vmatch+AND+Kurtz+OR+www.vmatch.de}%
145 {search in Google scholar}
146 (which, as of Jan 2, 2016, retrieved 397 results).
147
148 \subsection*{Usages in Plant Genome Research}
149 \begin{enumerate}
150 \Mybibentry{BRE:KUR:WAL:2002}{
151 to compute a non-redundant set from a large collection of protein sequences
152 from Zea-Maize.}
153
154 Similar applications are described in
155
156 \bibentry{DON:ROY:FRE:WAL:BRE:2003}.
157 %\item
158 %For the development of the
159 %\href{http://barleypop.vrac.iastate.edu/BarleyBase/content.php}{Barley1 GeneChip} \Vmatch is used to search
160 %against probes.
161 \item
162 PLEXdb is a database for gene expression resources for plants and plant
163 pathogens, see
164
165 \bibentry{DAS:VAN:HON:WIS:DIC:2012}
166
167 PLEXdb provides a \Vmatch-based
168 \href{http://www.plantgdb.org/cgi-bin/prj/PLEXdb/ProbeMatch.pl}{web-service}
169 to match PLEXdb probes.
170
171 \item
172 The assembly of the Arabidopsis thaliana genome from 2004
173 (GenBank entries of 2/19/04) contained vector sequence contaminations.
174 For example, region \numprint{3617880} to \numprint{3625027} of
175 chromosome II contained
176 a cloning vector. \Vmatch was used to detect the vector contamination,
177 see \href{http://www.plantgdb.org/AtGDB/Annotation/vector.php}{here}.
178
179 \item
180 \bibentry{DON:LAW:SCHLUE:WIL:KUR:LUS:BRE:2005}
181
182 This work describes PlantGDB, which
183 provides a service called
184 \href{http://www.plantgdb.org/PlantGDB-cgi/vmatch/patternsearch.pl}{PatternSearch@PlantGDB}
185 for genome wide pattern searches in plant sequences. The service is based
186 on \Vmatch.
187 \Mybibentry{LIN:KRO:2005}{
188 for three different tasks:
189 \begin{itemize}
190 \item
191 Searching spliced mRNA in the Arabidopsis genome to detect
192 micromatches of length at least 20 with at most 2 mismatches.
193 \item
194 Finding matches of length at least 15 with at most one mismatch
195 between predicted mature miRNA-sequences and a set of ESTs as well
196 as sequences from the Arabidopsis Small RNA Project (ASRP).
197 \item
198 Aligning and performing single linkage clustering
199 of the predicted mature miRNA sequences. Candidate pairs aligning over at least
200 17 bases, allowing an edit distance of 1, were grouped in the same family.
201 \end{itemize}}
202
203 \item
204 \bibentry{POM:LEM:TUR:2006}
205
206 \bibentry{TUR:OTI:LEM:2006}
207
208 In these papers \Vmatch was used to search
209 and compare repeated elements in different chloroplast DNA.
210
211 \item
212 \bibentry{SPA:NOU:HAA:YAN:GUN:HIN:KLE:HAB:SCHOO:MAY:2007}
213 In this work about the \textit{MIPSPlantsDB} database
214 \Vmatch was used to cluster large sequence sets.
215
216 \Mybibentry{SCHIJ:VOS:MAR:JON:ROS:MOL:TIK:ANG:TUN:BOV:2007}{
217 to compare target genes of the tomato Chs RNAi to a tomato gene index.}
218
219 \Mybibentry{LIN:JAC:NYG:MAN:KRO:2007}{
220 to search different
221 plant genomes for matches of length at least 20 with a maximum of 2 mismatches.
222 Here the fact that \Vmatch is an exhaustive search tool is important.}
223
224 \Mybibentry{DEC:OTI:THU:LEM:2007}{
225 to determine the presence of shared repeated elements of minimum length
226 30, with up to 10\% mismatches, in different sequence sets from
227 the green alga \textit{Leptosira terrestris}.}
228
229 \Mybibentry{OSS:SCHNE:CLA:LAN:WAR:WEI:2008}{
230 to map millions of short sequence reads to the \textit{A.~thaliana} genome.
231 Up to four mismatches and up to three indels were allowed in the matching
232 process. The seed size was chosen to be 0. The reads were aligned using the
233 best match strategy by iteratively increasing the allowed number of
234 mismatches and gaps at each round.}
235
236 \Mybibentry{DIBO:OSS:SCHNE:RAT:2008}{
237 to map millions of short sequence reads to the \textit{A.~thaliana} genome.
238 \Vmatch was part of a multi-step pipeline, combining a fast
239 matching algorithm (\Vmatch) for initial read mapping and
240 an optimal alignment algorithm based on dynamic programming (QPALMA)
241 for high quality detection of splice sites.}
242
243 \Mybibentry{ASS:HER:LIN:HUE:TAL:SMA:IMM:ELD:FIE:SCHAT:2010}{
244 for motif searching in different plant genomes.}
245
246 \Mybibentry{EVE:SAT:GOL:MEY:BET:SAK:WAR:JAC:2010}{
247 to map unique consensus sequence tags to the maize reference genome.}
248
249 \Mybibentry{BRO:OTI:LEM:TUR:2010}{
250 to identify and cluster repeated sequences in \textit{Floydiella} chloroplast
251 genome.}
252
253 \Mybibentry{REH:AQU:GRU:HEN:HIL:LAU:NAO:PAT:ROM:SHU:2010}{
254 to calculate direct and reverse complementary matches of length {17} bp or
255 greater with edit distance {1} or less between five nuclear chromosomes
256 and mitochondrial and chloroplast genome sequences.}
257
258 \Mybibentry{SEK:LIN:CHI:HAN:BUE:LEO:KAE:2011}{
259 to search probe sequences against the maize genome
260 and the cDNA sequences of the official maize gene models.}
261
262 \Mybibentry{DAS:OH:HAA:HER:HON:ALI:YUN:BRE:ZHU:BOH:2011}{
263 for clustering sequences assembled from 454-reads of
264 \textit{Thellungiella parvula}, a model for the evolution of plant
265 adaptation to extreme environments.}
266
267 \Mybibentry{WIL:HOF:KLE:WEI:2011}{
268 for grouping short reads into
269 pools representing the same RAD tag.}
270
271 \Mybibentry{GAO:ZHO:WAN:SU:WAN:2011}{
272 for detecting and
273 clustering repetitive sequences in diverse fern plastid genomes.}
274
275 \Mybibentry{SLO:ALV:CHU:WU:MCC:PAL:TAY:2012}{
276 to precisely
277 define the boundaries of all repeats with 100\% sequence identity.}
278
279 \Mybibentry{DUB:FAR:SCHLU:CAN:ABE:TUT:WOO:SHA:MUL:KUD:2011}{
280 to cluster sequences based on their six-frame translation.}
281
282 \Mybibentry{SAX:PEN:UPA:KUM:CAR:SCHLU:FAR:WHA:SAR:MAY:2012}{
283 to identify reciprocal best matches between the pigeonpea sequences and
284 other legume sequences.}
285
286 \Mybibentry{HAZ:REE:RIS:PEC:2012}{
287 for assembly clustering and
288 optimization of contigs for
289 \textit{Neochloris oleoabundans} (a green microalga of the class Chlorophyceae).}
290
291 \Mybibentry{MAR:KLE:BAN:BLA:MAC:SCHMU:SCHOL:GUN:WIC:SIM:2012}{
292 to match reads against a repeat library to
293 identify the content of the repetitive DNA per sequence read.}
294
295 \Mybibentry{CHI:DAV:BUE:2011}{
296 to align individual probes to representative gene models.}
297
298 \Mybibentry{SEV:DIJ:HAM:2011}{
299 for performing exact searches with
300 peptides against the filtered proteome of \textit{A.~thaliana}.}
301
302 \Mybibentry{WOL:WEI:SEG:ROS:BEI:DON:SPI:NOR:REH:KOE:2011}{
303 to map RNAseq reads,
304 allowing up to two mismatches (option \texttt{-h 2})
305 and generating maximal substring matches
306 that are unique in some reference dataset (option \texttt{-mum cand}).}
307
308 \Mybibentry{FLE:KHA:JOH:YOU:MIT:WRE:HES:FOS:SCHAR:SCO:2011}{
309 to identify terminal inverted repeats of length range {10-65} bp,
310 $\geq 80\%$ identity, maximum inter-TIR distance 650~bp in genomes of
311 epichloid fungal endophytes of grasses.}
312
313 \Mybibentry{CHI:KON:BUE:2012}{
314 to match putative unique transcript sequence assemblies.}
315
316 \Mybibentry{CHE:CAS:BAI:RED:MIC:2012}{
317 for refining assemblies of Illumina reads in the context of a transcriptome
318 project for plant virus vector \textit{Graminella nigrifrons}.}
319
320 \Mybibentry{KRI:PAT:JAI:GAU:CHOU:VAI:DEE:HAR:KRI:NAI:2012}{
321 for clustering repeats and for building a consensus repeat library in the
322 context of genome and transcriptome projects for \textit{Azadirachta indica}, a
323 medicinal and pesticidal angiosperm.}
324
325 \Mybibentry{LIU:KUM:ZHA:ZHE:WAR:2012}{
326 to map unique consensus sequence tags to the maize reference
327 genome and to predict targets of novel miRNAs.}
328
329 \Mybibentry{BOU:KOU:PAV:MIN:TSA:DAR:2012}{
330 for masking Long Terminal Repeats in the Maize Genome Sequence.}
331
332 \item In the papers
333
334 \bibentry{HER:MAR:DOR:PFE:GAL:SCHAA:JOU:SIM:VAL:DOL:2012}
335
336 \bibentry{PHI:PAU:BER:SOU:CHO:LAU:SIM:SAF:BEL:VAU:2013}
337
338 \Vmatch was used to mask repetitive DNA.
339
340 \Mybibentry{HOW:YU:KNA:CRO:KOL:DOL:LOR:DEA:2013}{
341 to cluster \numprint{40010} assembled isotigs.}
342
343 \Mybibentry{KAR:HAA:MAL:GEE:BOV:LAM:ANG:MAA:2013}{
344 to preprocess short reads in the context of identifying microRNA targets in
345 tomato fruit development.}
346
347 \Mybibentry{GRO:MAR:SIM:ABR:WAN:VIS:2013}{
348 in an all-vs-all comparison to bin contigs into
349 loci based on a minimum of 200~bp sequence overlap in the context of
350 transcriptome assembly for two Agave-species.}
351
352 \Mybibentry{KAN:HEL:DUR:WIN:ENG:BEH:HOL:BRA:HAU:FER:2013}{
353 to align 454-reads to assembled isotigs for Ragweed pollen.}
354
355 \Mybibentry{KUG:SIE:NUS:AME:SPAN:STEI:LEM:MAY:BUE:SCHWE:2013}{
356 for comparing gene sets.}
357
358 \Mybibentry{MAR:ZHO:HAS:SCHMU:VRA:KUB:KOEN:KUG:SCHOL:HAC:2013}{
359 to detect repetitive DNA content of chromosomal survey
360 sequences from the Rye genome.}
361
362 \item
363 In the papers
364
365 \bibentry{KOP:MAR:VHA:HRV:VRA:BAR:KOP:CAT:STO:NOV:2013}
366
367 \bibentry{KOP:MAR:CHA:HRI:VRA:BAR:2013}
368
369 \Vmatch was used for
370 identifying repetitive DNA content in contigs of meadow fescue chromosome 4F
371 assembled from Illumina short reads.
372
373 \item
374 In the papers
375
376 \bibentry{JAY:WAN:YU:TAC:PEL:COL:REN:VOI:2011}
377
378 \bibentry{WAN:WEI:SMI:2013}
379
380 \Vmatch was used for mapping siRNA sequences to the
381 \textit{Arabidopsis thaliana} genome.
382
383 \Mybibentry{HEN:VIV:DES:CHAU:PAY:GUT:CAS:2014}{
384 for the identification of binding motifs.}
385
386 \Mybibentry{WAN:HAB:GUN:GLAE:NUS:LUO:LOM:BOR:KER:SHA:2014}{
387 for masking one sequence set with another and for
388 mapping miRNA sequences of all plant species present in a reference database
389 to whole-genome assembly of \textit{Spirodela polyrhiza}.}
390
391 \Mybibentry{LOG:SCHEL:NUR:SAM:PEN:2014}{
392 for repeat detection.}
393
394 \Mybibentry{WAN:SHI:RIN:2015}{
395 to eliminate redundancies in assemblies of Illumina reads in
396 the context of studying plant defense mechanisms.}
397
398 \Mybibentry{ASH:HUL:WAN:YAN:GUA:JON:MAT:MOC:CHE:STE:2015}{
399 for clustering to determine a non-redundant set of assembled contigs.}
400
401 \Mybibentry{UST:NOV:BLI:SMY:2015}{
402 for clustering sequences based on their RT and aRNH domain.}
403
404 \Mybibentry{HLE:RIV:CLA:MAR:VAN:GON:GAR:LER:SIM:VAL:2015}{
405 for identifying repeats in contigs assembled from 454-reads.}
406
407 \Mybibentry{SHE:YAN:LU:WAN:SON:2015}{
408 for identifying inverted repeats in chloroplast genomes.}
409
410 \Mybibentry{PAN:MOH:KHA:MEH:EBR:2015}{
411 to identify contaminations and repetitive elements by
412 comparison of mRNA sequences to vector, bacterial and repeat databases.}
413
414 \Mybibentry{WOL:TWO:GAD:KNA:GRU:GEN:2015}{
415 to cluster contigs of different assemblies into groups of homologous sequences.}
416
417 \Mybibentry{YAN:LU:SHE:YAN:XU:SON:2015}{
418 to identify inverted repeats in chloroplast genomes.}
419
420 \Updateusages
421 \end{enumerate}
422
423 \subsection*{Usages in Microbial Genome Research}
424 \begin{enumerate}
425 \item
426 The
427 \href{http://www.llnl.gov/str/April04/Slezak.html}{KPATH system},
428 developed at the Lawrence Livermore National Laboratories, and
429 described in
430
431 \bibentry{FIT:GAR:KUC:KUR:MYE:OTT:SLE:VIT:ZEM:MCC:2002}
432
433 \bibentry{SLE:KUC:OTT:TOR:MED:SMI:TRU:MUL:LAM:VIT:ZEM:ZHO:GAR:2003}
434
435 used \Vmatch to detect unique substrings in large
436 collections of DNA sequences. These unique substrings serve as
437 signatures allowing for rapid and accurate diagnostics
438 to identify pathogenic bacteria and viruses. A similar application
439 is reported in \bibentry{GAR:KUC:VIT:SLE:2003}.
440
441 \Mybibentry{POB:WET:SZY:SCHIL:KUR:MEY:NAT:BECK:2006}{
442 to map signature tags to the genome
443 of \textit{S.~meliloti}.}
444
445 \item
446 The
447 \href{http://crispr.u-psud.fr/Server/CRISPRfinder.php}{CRISPRFinder}-program
448 and the
449 \href{http://crispr.u-psud.fr/crispr/CRISPRdatabase.php}{CRISPRdatabase}, described in
450
451 \bibentry{GRI:VER:POU:2007A}
452
453 \bibentry{GRI:VER:POU:2007B}
454
455 used \Vmatch to
456 efficiently find maximal repeats, as a first step in localizing
457 Clustered regularly interspaced short palindromic repeats (CRISPRs).
458
459 \Mybibentry{VOSS:GEO:SCHOE:UDE:HES:2009}{
460 to map predicted sequences to
461 information about Rho-independent terminators provided by a specific database.}
462
463 \Mybibentry{SCHMU:CAN:SCHLU:MA:MIT:NEL:HYT:SON:THE:CHE:2010}{
464 to cluster DNA-sequences into families based on their
465 six-frame translation.}
466
467 \Mybibentry{ZIM:GES:CHE:LOR:SCHRO:2010}{
468 to align 454-sequences to the \textit{E.~coli} genome and to cluster the sequences.}
469
470 \Mybibentry{TOU:DEN:MED:BAR:ELK:PET:2010}{
471 for detecting repeats in three bacterial species.}
472
473 \Mybibentry{MAY:MAR:HED:SIM:LIU:MOR:STEU:TAU:ROE:GUN:2011}{
474 for masking repeats in 454-reads.}
475
476 \Mybibentry{PUS:MAN:JI:LI:EVA:CRA:MOR:MEA:SIN:SAX:2011}{
477 to identify distal primers.}
478
479 \Mybibentry{BRE:SHE:POP:2011}{
480 for removing redundant transcripts assembled in an RNA-seq study based on
481 Illumina reads for \textit{Heliothis virescens} (tobacco budworm), infected
482 with a virus.}
483
484 \Mybibentry{TRI:HAM:BUE:TIS:VER:ZIN:LEA:2011}{
485 to search unassembled Illumina reads of US and African strains of
486 \textit{Xanthomonas oryzae} for evidence of transcriptional activator-like
487 effector sequences.}
488
489 \item
490 \Vmatch is used as an integral part of the PriMUX software package described in
491
492 \bibentry{HYS:NAR:ELS:CAR:WIL:GAR:2012}
493
494 In this context \Vmatch is used for selecting multiplex compatible,
495 degenerate primers and probes to detect diverse targets such as viruses.
496
497 \Mybibentry{SHE:POP:2012}{
498 to identify redundant contigs from de novo exome assemblies.}
499
500 \Mybibentry{HUR:SUL:2013}{
501 to identify reads which have no common 20-mers with other
502 reads in the context of a marine viral metagenome project.}
503
504 \Mybibentry{ZHU:RHO:FESCH:2013}{
505 for clustering potential complete
506 Endogenous retroviruses of the bat \textit{Myotis lucifugus} into subfamilies.}
507
508 \item In the three papers
509
510 \bibentry{HUR:WES:BRU:SUL:2014}
511
512 \bibentry{HUR:DEN:POU:SUL:2013}
513
514 \bibentry{BRU:HUR:SCHOF:DUC:SUL:2015}
515
516 \Vmatch was used for $k$-mer analysis in the context of different marine
517 metagenome projects.
518
519 \Mybibentry{DEC:PAR:2014}{
520 for $k$-mer analysis in the context of microbial communities.}
521
522 \Mybibentry{BEN:BOU:FIC:KRI:LAR:2014}{
523 in an iterative scheme to construct contigs from reads associated with
524 resistance genes in the context of a shotgun metagenome project.}
525
526 \Mybibentry{NIC:THI:GAR:MCL:FOF:KOS:ELL:BRE:JAC:JAI:2013}{
527 to match probe candidate sequences against viral sequences and the human
528 genome sequence.}
529
530 \Mybibentry{HEN:RUM:SCZ:VEL:DIE:GER:GOM:RAH:STO:BOR:2014}{
531 to identify the species of the Streptococcaceae
532 by comparing with the Silva 115 release 16S reference sequence database.}
533 \Updateusages
534 \end{enumerate}
535
536 \subsection*{Usages in General Web-Servers or Sequence Analysis Software}
537 \begin{enumerate}
538 \item
539 Since 2000,
540 the \href{http://rsat.ulb.ac.be/rsat/}{RSA-tools}, described in
541
542 \bibentry{HEL:RIO:COL:2000}
543
544 and developed by Jacques van Helden
545 use \Vmatch to \href{http://rsat.ulb.ac.be/rsat/purge-sequence_form.cgi}{purge}
546 sequences before computing sequence statistics. Similar applications are
547 reported in the following papers:
548
549 \bibentry{HUL:WEE:CRO:GER:HEP:HEL:2003}
550
551 \bibentry{SIM:WOD:COH:HEL:2004}
552
553 \bibentry{SIM:HEL:COH:WOD:2004}.
554
555 \item
556 The program \href{http://splicenest.molgen.mpg.de/}{SpliceNest}, described in
557
558 \bibentry{COW:HAA:VIN:2002}
559
560 computes gene indices and uses \Vmatch to
561 \href{http://splicenest.molgen.mpg.de/doc/help.html\#mapping}{map} clustered
562 sequences to large genomes.
563 %\item
564 %The oligo design program
565 %\href{http://oligos.molgen.mpg.de/}{Promide}
566 %\bibentry{RAH:2002} developed by
567 %Sven Rahmann is based on the persistent index structure of \Vmatch.
568 %Promide uses \textit{mkvtree} for generating the index.
569 \item
570 \href{http://bibiserv.techfak.uni-bielefeld.de/e2g/}{e2g}
571 is a web-based server which efficiently maps large
572 EST and cDNA data sets to genomic DNA. The use of \Vmatch
573 makes it possible to significantly extend the size of data that can be mapped in
574 reasonable time. e2g is available as a web service and hosts
575 large collections of EST sequences (e.g.\ 4.1 million mouse ESTs
576 of 1.87 Gbp) in a precomputed persistent index. For details see
577
578 \bibentry{KRUE:SCZ:KUR:GIE:2004}.
579
580 \item
581 The \href{http://bibiserv.techfak.uni-bielefeld.de/}{Bielefeld Bioinformatics Server} provides the
582 \href{http://bibiserv.techfak.uni-bielefeld.de/reputer/}{REPuter}
583 web-service to compute repeats in complete genomes. The service is based on
584 \Vmatch.
585
586 \Mybibentry{FER:DON:SCHNE:MOR:NAN:BRE:WAL:2004}{
587 to (1) match \numprint{130861} vector-trimmed sequences against the maize
588 repeat database, and (2) to cluster near-identical sequences. }
589 %The \href{http://www.mutransposon.org/project/RescueMu/research/GSSanalysis}%
590 %{Mu Transposon Information Resource},
591 \item
592 \href{http://www-ab.informatik.uni-tuebingen.de/software/crosslink/welcome.html}{CrossLink}, described in
593
594 \bibentry{DEZ:SCHAEF:WIE:WEI:HUS:2006}
595
596 is a versatile computational tool which aids in visualizing
597 relationships between RNA sequences (particularly between ncRNAs and
598 their putative target transcripts) in an intuitive and accessible way.
599 Besides BLAST, CrossLink uses \Vmatch to reveal the sequence
600 relationships to be visualized.
601
602 \item
603 The early version of the web-service \href{http://mips.gsf.de/simap/}%
604 {Similarity matrix of Proteins (SIMAP)}, see
605
606 \bibentry{ARN:RAT:TIS:TRU:STU:MEW:2005}
607
608 used \Vmatch to locate
609 the sequences in SIMAP which are similar to a given query. This is much
610 faster than running BLAST.
611
612 \Mybibentry{FIE:VAN:PEE:VAN:NAP:2005}{
613 to compute similarities between genomes, which are then visualized by the
614 program \href{http://www.win.tue.nl/dnavis/}{DNAVis}.}
615
616 \item In the paper
617
618 \bibentry{SEI:KRUE:HAR:SCHWA:LOEW:MER:DAN:GIE:2006}
619
620 Seidel et al.\ describe
621 methods for creating web-services and give examples which, among other tools,
622 also integrate \Vmatch.
623
624 \item
625 The program \textit{Gepard}
626
627 \bibentry{KRU:ARN:RAT:2007}
628
629 uses \textit{mkvtree} to compute enhanced suffix arrays.
630
631 \item
632 \Vmatch is used as part of the transcriptome assembler software Rnnotator,
633 described in
634
635 \bibentry{MAR:BRU:FAN:MEN:BLO:ZHA:SHE:SNY:WAN:2010}
636 \item
637 The BioExtract-Server described in
638
639 \bibentry{LUS:JEN:BRE:2011}
640
641 uses \Vmatch to remove duplicated sequences.
642
643 \Mybibentry{LUS:GNI:DOO:2015}{
644 for removing duplicates in BlastP results. This use is
645 part of a workflow in
646 \href{http://www.myexperiment.org/workflows/3131.html}{myexperiment}.
647 }
648
649 \Mybibentry{GRE:LOY:HOR:RAT:2015}{
650 for probe/primer search functionality in the probeBase database.}
651 \Updateusages
652 \end{enumerate}
653
654 \subsection*{Current Usages in Human Genome Research}
655 \begin{enumerate}
656 \Mybibentry{BUC:JAR:MEN:MAT:SCO:GRE:LAN:DUM:2005}{
657 to reveal long repeats inside human chromosome 1 and long similar regions
658 between human chromosome 1 and all other human chromosomes.}
659
660 \Mybibentry{LIA:WAN:LIU:JI:LIU:CHE:WEB:REE:DEA:2007}{
661 for vector screening.}
662
663 \Mybibentry{NYG:JAC:LIN:ERI:BAL:FLY:TOL:MOE:SOE:KRO:LIT:2009}{
664 for mapping short reads.}
665
666 \Mybibentry{COL:SOB:LU:THA:BOW:BRO:GRE:BAR:HUT:2009}{
667 for matching reads to sets of RNA sequences and the human genome.}
668
669 \Mybibentry{CLO:WAN:XU:GU:LEA:HEA:BAR:STE:MAR:NOU:2011}{
670 to uniquely map miRNAs against the human genome.}
671
672 \Mybibentry{TAK:TSU:KAT:OKA:HOR:IKE:URA:KAW:HAS:IKE:2011}{
673 to determine the positions of CAGE tags on the human genome.}
674
675 \Mybibentry{KEV:LAL:LI:CAV:NAR:KAM:MIT:HAK:KOZ:GEN:2011}{
676 to align sections of reads against RefSeq mRNA exon sequences.}
677
678 \Mybibentry{KID:CHE:WAN:JAC:ZHA:BOY:FIR:TAN:GAE:COL:2012}{
679 to align sets of genes.}
680
681 \Mybibentry{YAM:IKE:BOE:HOR:TAK:URA:KAI:CAR:KAW:HAY:2014}{
682 to determine the positions of CAGE tags on the human genome.}
683
684 \Updateusages
685 \end{enumerate}
686
687 \subsection*{Current Usages for Different Model Organisms}
688 \begin{enumerate}
689 \Mybibentry{SCZ:BECK:BRI:GIE:ALT:2005}{
690 to cluster \numprint{317242} EST and cDNA sequences from
691 \textit{Xenopus laevis}. \Vmatch was chosen for the following reasons:
692 \begin{itemize}
693 \item
694 At first, there was no clustering tool available which could handle
695 large data sets efficiently, and which was documented well enough to
696 allow a detailed replication and evaluation of existing clusters.
697 \item
698 Second, \Vmatch identifies similarities between sequences rapidly,
699 and it provides additional options to cluster a set of sequences
700 based on these matches. Furthermore, the \Vmatch output provides
701 information about how the clusters were derived. Due to the
702 efficiency of \Vmatch, it was possible to perform the clustering for a
703 wide variety of parameters on the complete sequence set.
704 This makes it possible to study the effect of the parameter choice on the clustering.
705 \end{itemize}}
706
707 \Mybibentry{SPIT:LOR:CUL:SCZ:FUEL:2006}{
708 to cluster EST-sequences of \textit{Xenopus laevis}.}
709
710 \Mybibentry{EIS:COY:WU:WU:THI:WOR:BAD:REN:AME:JON:2006}{
711 to search exact repeats in the Macronuclear Genome Sequence of the Ciliate
712 \textit{Tetrahymena thermophila}.}
713 %\item
714 %\href{http://www.plantgdb.org/}{PlantGDB} provides a Web Service
715 %named \href{https://biomoby.tigr.org/wiki/index.php/Code_Examples_-_Java}{VMatchForArabidopsis}%
716 %
717 %based on \Vmatch. It allows to search sequences
718 %from \textit{Arabidopsis Thaliana}.
719 %\item
720 %The \href{http://www.jgi.doe.gov/science/posters/LBNL-59860goltsman.pdf}{DOE Joint Genome Institute}%
721 %
722 %used \Vmatch to
723 %identify and mask all continuous non-unique sequence fragments over
724 %500~bp in \textit{Frankia sp.} and \textit{Shewanella oneidensis}.
725
726 \Mybibentry{FAU:FOR:CHA:SCHRO:HAY:CAR:HUM:GRI:2008}{
727 for mapping
728 \begin{itemize}
729 \item
730 \numprint{11567973} FANTOM3 mouse CAGE tags to the mouse genome
731 with minimum match length of {18} bp, a single internal mismatch allowed,
732 and multiple mismatches allowed at tag ends.
733 \item
734 Affymetrix GNF probe sequences to transcripts without allowing for mismatches.
735 \end{itemize}}
736
737
738 \Mybibentry{PRI:JOR:2008}{
739 to search small RNA signatures in entire miRNA gene sequences for
740 Arabidopsis and rice.}
741
742 \Mybibentry{TAF:GLA:LASS:HAY:CAR:MAT:2009}{
743 to map small RNA data sets onto the corresponding reference
744 genomes for different model organisms.}
745
746 \Mybibentry{PLE:PAS:BER:AKA:CAR:VAS:LAZ:SEV:VLA:SIM:2012}{
747 for mapping Illumina reads to the mouse genome.}
748
749 \Mybibentry{KEN:SHI:2012}{
750 for redundancy removal in the context of transcriptome assembly of
751 a keelworm species.}
752
753 \Mybibentry{GOS:OHM:KOG:SON:TUR:ZAJ:ZAL:GRU:SUN:HAN:2014}{
754 to remove redundant contigs in a genome project of four
755 \textit{Aureobasidium pullulans} varieties.}
756
757 \Mybibentry{MCM:GAR:BAI:KEM:WAR:CEV:ROB:SCHUL:BAL:HOL:2015}{
758 for merging assemblies of Illumina sequenced cDNA.}
759
760 % add applications at Bioinformatics Center Copenhagen Univ., see E-mails from
761 % Feb 2007, this may be the Biopieces.
762
763 % the following paper cites \Vmatch, but does not use it.
764 % http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1868776
765 % \bibentry{TEM:ZAV:BOD:CHA:GEY:WAS:BEN:REI:2007}, Tembe et et.\ al.\
766
767 %Add "Highly Specific Gene Silencing by Artificial MicroRNAs in Arabidopsis"
768 %of Schwab et al, 2006, and implemented on wmd2.weigelworld.org.
769 %refers to Hypa and not really to \Vmatch.
770
771 \Mybibentry{MOR:DHA:PAV:TRO:WHE:HEL:2015}{
772 to combine and scaffold contigs.}
773
774 \Updateusages
775 \end{enumerate}
776
777 Total number of usages: \arabic{Allusages}
778
779 \section*{Availability}
780 \Vmatch is available for
781 \href{http://www.vmatch.de/download.html}{download}
782 in executable form for the following platforms:
783
784 \begin{itemize}
785 \item
786 Linux
787 \item
788 Mac OS X
789 \item
790 MS Windows
791 \end{itemize}
792
793 \section*{Developer}
794 \Vmatch has been developed since May 2000 by
795 \href{http://www.zbh.uni-hamburg.de/kurtz}{Stefan Kurtz},
796 a professor of
797 Computer Science at the Center for Bioinformatics, University of Hamburg,
798 Germany.
799
800 \HCode{
801 <!--delete paragraph-->
802 <b>Important Documents</b>
803 <ul>
804 <li>
805 The <a href="virtman.pdf"><i>Vmatch</i>-manual</a>
806 </li>
807 </ul>
808 }
809
810 \HCode{
811 <!--delete paragraph-->
812 <div id="footer">
813 Copyright &copy; 2000-2017 <a href="mailto:kurtz@zbh.uni-hamburg.de">
814 Stefan Kurtz</a>. Last update: 2017-06-15
815 </div>
816 }
817
818 \HCode{
819 <!--delete paragraph-->
820 <!-- Piwik -->
821 <div id="piwik">
822 <script type="text/javascript">
823 var pkBaseURL = "https://zenlicensemanager.com/piwik/";
824 document.write(unescape("%3Cscript src='" + pkBaseURL + "piwik.js' type='text/javascript'%3E%3C/script%3E"));
825 </script><script type="text/javascript">
826 try {
827 var piwikTracker = Piwik.getTracker(pkBaseURL + "piwik.php", 3);
828 piwikTracker.trackPageView();
829 piwikTracker.enableLinkTracking();
830 } catch( err ) {}
831 </script>
832 <br/>
833 <noscript>
834 <img src="https://zenlicensemanager.com/piwik/piwik.php?idsite=3" style="border:0" alt=""/>
835 </noscript>
836 <!-- End Piwik Tracking Tag -->
837 </div>
838 }
839 \end{document}
+0
-196
src/doc/WWW/xhtml-lat1.ent
0 <!-- Portions (C) International Organization for Standardization 1986
1 Permission to copy in any form is granted for use with
2 conforming SGML systems and applications as defined in
3 ISO 8879, provided this notice is included in all copies.
4 -->
5 <!-- Character entity set. Typical invocation:
6 <!ENTITY % HTMLlat1 PUBLIC
7 "-//W3C//ENTITIES Latin 1 for XHTML//EN"
8 "http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent">
9 %HTMLlat1;
10 -->
11
12 <!ENTITY nbsp "&#160;"> <!-- no-break space = non-breaking space,
13 U+00A0 ISOnum -->
14 <!ENTITY iexcl "&#161;"> <!-- inverted exclamation mark, U+00A1 ISOnum -->
15 <!ENTITY cent "&#162;"> <!-- cent sign, U+00A2 ISOnum -->
16 <!ENTITY pound "&#163;"> <!-- pound sign, U+00A3 ISOnum -->
17 <!ENTITY curren "&#164;"> <!-- currency sign, U+00A4 ISOnum -->
18 <!ENTITY yen "&#165;"> <!-- yen sign = yuan sign, U+00A5 ISOnum -->
19 <!ENTITY brvbar "&#166;"> <!-- broken bar = broken vertical bar,
20 U+00A6 ISOnum -->
21 <!ENTITY sect "&#167;"> <!-- section sign, U+00A7 ISOnum -->
22 <!ENTITY uml "&#168;"> <!-- diaeresis = spacing diaeresis,
23 U+00A8 ISOdia -->
24 <!ENTITY copy "&#169;"> <!-- copyright sign, U+00A9 ISOnum -->
25 <!ENTITY ordf "&#170;"> <!-- feminine ordinal indicator, U+00AA ISOnum -->
26 <!ENTITY laquo "&#171;"> <!-- left-pointing double angle quotation mark
27 = left pointing guillemet, U+00AB ISOnum -->
28 <!ENTITY not "&#172;"> <!-- not sign = angled dash,
29 U+00AC ISOnum -->
30 <!ENTITY shy "&#173;"> <!-- soft hyphen = discretionary hyphen,
31 U+00AD ISOnum -->
32 <!ENTITY reg "&#174;"> <!-- registered sign = registered trade mark sign,
33 U+00AE ISOnum -->
34 <!ENTITY macr "&#175;"> <!-- macron = spacing macron = overline
35 = APL overbar, U+00AF ISOdia -->
36 <!ENTITY deg "&#176;"> <!-- degree sign, U+00B0 ISOnum -->
37 <!ENTITY plusmn "&#177;"> <!-- plus-minus sign = plus-or-minus sign,
38 U+00B1 ISOnum -->
39 <!ENTITY sup2 "&#178;"> <!-- superscript two = superscript digit two
40 = squared, U+00B2 ISOnum -->
41 <!ENTITY sup3 "&#179;"> <!-- superscript three = superscript digit three
42 = cubed, U+00B3 ISOnum -->
43 <!ENTITY acute "&#180;"> <!-- acute accent = spacing acute,
44 U+00B4 ISOdia -->
45 <!ENTITY micro "&#181;"> <!-- micro sign, U+00B5 ISOnum -->
46 <!ENTITY para "&#182;"> <!-- pilcrow sign = paragraph sign,
47 U+00B6 ISOnum -->
48 <!ENTITY middot "&#183;"> <!-- middle dot = Georgian comma
49 = Greek middle dot, U+00B7 ISOnum -->
50 <!ENTITY cedil "&#184;"> <!-- cedilla = spacing cedilla, U+00B8 ISOdia -->
51 <!ENTITY sup1 "&#185;"> <!-- superscript one = superscript digit one,
52 U+00B9 ISOnum -->
53 <!ENTITY ordm "&#186;"> <!-- masculine ordinal indicator,
54 U+00BA ISOnum -->
55 <!ENTITY raquo "&#187;"> <!-- right-pointing double angle quotation mark
56 = right pointing guillemet, U+00BB ISOnum -->
57 <!ENTITY frac14 "&#188;"> <!-- vulgar fraction one quarter
58 = fraction one quarter, U+00BC ISOnum -->
59 <!ENTITY frac12 "&#189;"> <!-- vulgar fraction one half
60 = fraction one half, U+00BD ISOnum -->
61 <!ENTITY frac34 "&#190;"> <!-- vulgar fraction three quarters
62 = fraction three quarters, U+00BE ISOnum -->
63 <!ENTITY iquest "&#191;"> <!-- inverted question mark
64 = turned question mark, U+00BF ISOnum -->
65 <!ENTITY Agrave "&#192;"> <!-- latin capital letter A with grave
66 = latin capital letter A grave,
67 U+00C0 ISOlat1 -->
68 <!ENTITY Aacute "&#193;"> <!-- latin capital letter A with acute,
69 U+00C1 ISOlat1 -->
70 <!ENTITY Acirc "&#194;"> <!-- latin capital letter A with circumflex,
71 U+00C2 ISOlat1 -->
72 <!ENTITY Atilde "&#195;"> <!-- latin capital letter A with tilde,
73 U+00C3 ISOlat1 -->
74 <!ENTITY Auml "&#196;"> <!-- latin capital letter A with diaeresis,
75 U+00C4 ISOlat1 -->
76 <!ENTITY Aring "&#197;"> <!-- latin capital letter A with ring above
77 = latin capital letter A ring,
78 U+00C5 ISOlat1 -->
79 <!ENTITY AElig "&#198;"> <!-- latin capital letter AE
80 = latin capital ligature AE,
81 U+00C6 ISOlat1 -->
82 <!ENTITY Ccedil "&#199;"> <!-- latin capital letter C with cedilla,
83 U+00C7 ISOlat1 -->
84 <!ENTITY Egrave "&#200;"> <!-- latin capital letter E with grave,
85 U+00C8 ISOlat1 -->
86 <!ENTITY Eacute "&#201;"> <!-- latin capital letter E with acute,
87 U+00C9 ISOlat1 -->
88 <!ENTITY Ecirc "&#202;"> <!-- latin capital letter E with circumflex,
89 U+00CA ISOlat1 -->
90 <!ENTITY Euml "&#203;"> <!-- latin capital letter E with diaeresis,
91 U+00CB ISOlat1 -->
92 <!ENTITY Igrave "&#204;"> <!-- latin capital letter I with grave,
93 U+00CC ISOlat1 -->
94 <!ENTITY Iacute "&#205;"> <!-- latin capital letter I with acute,
95 U+00CD ISOlat1 -->
96 <!ENTITY Icirc "&#206;"> <!-- latin capital letter I with circumflex,
97 U+00CE ISOlat1 -->
98 <!ENTITY Iuml "&#207;"> <!-- latin capital letter I with diaeresis,
99 U+00CF ISOlat1 -->
100 <!ENTITY ETH "&#208;"> <!-- latin capital letter ETH, U+00D0 ISOlat1 -->
101 <!ENTITY Ntilde "&#209;"> <!-- latin capital letter N with tilde,
102 U+00D1 ISOlat1 -->
103 <!ENTITY Ograve "&#210;"> <!-- latin capital letter O with grave,
104 U+00D2 ISOlat1 -->
105 <!ENTITY Oacute "&#211;"> <!-- latin capital letter O with acute,
106 U+00D3 ISOlat1 -->
107 <!ENTITY Ocirc "&#212;"> <!-- latin capital letter O with circumflex,
108 U+00D4 ISOlat1 -->
109 <!ENTITY Otilde "&#213;"> <!-- latin capital letter O with tilde,
110 U+00D5 ISOlat1 -->
111 <!ENTITY Ouml "&#214;"> <!-- latin capital letter O with diaeresis,
112 U+00D6 ISOlat1 -->
113 <!ENTITY times "&#215;"> <!-- multiplication sign, U+00D7 ISOnum -->
114 <!ENTITY Oslash "&#216;"> <!-- latin capital letter O with stroke
115 = latin capital letter O slash,
116 U+00D8 ISOlat1 -->
117 <!ENTITY Ugrave "&#217;"> <!-- latin capital letter U with grave,
118 U+00D9 ISOlat1 -->
119 <!ENTITY Uacute "&#218;"> <!-- latin capital letter U with acute,
120 U+00DA ISOlat1 -->
121 <!ENTITY Ucirc "&#219;"> <!-- latin capital letter U with circumflex,
122 U+00DB ISOlat1 -->
123 <!ENTITY Uuml "&#220;"> <!-- latin capital letter U with diaeresis,
124 U+00DC ISOlat1 -->
125 <!ENTITY Yacute "&#221;"> <!-- latin capital letter Y with acute,
126 U+00DD ISOlat1 -->
127 <!ENTITY THORN "&#222;"> <!-- latin capital letter THORN,
128 U+00DE ISOlat1 -->
129 <!ENTITY szlig "&#223;"> <!-- latin small letter sharp s = ess-zed,
130 U+00DF ISOlat1 -->
131 <!ENTITY agrave "&#224;"> <!-- latin small letter a with grave
132 = latin small letter a grave,
133 U+00E0 ISOlat1 -->
134 <!ENTITY aacute "&#225;"> <!-- latin small letter a with acute,
135 U+00E1 ISOlat1 -->
136 <!ENTITY acirc "&#226;"> <!-- latin small letter a with circumflex,
137 U+00E2 ISOlat1 -->
138 <!ENTITY atilde "&#227;"> <!-- latin small letter a with tilde,
139 U+00E3 ISOlat1 -->
140 <!ENTITY auml "&#228;"> <!-- latin small letter a with diaeresis,
141 U+00E4 ISOlat1 -->
142 <!ENTITY aring "&#229;"> <!-- latin small letter a with ring above
143 = latin small letter a ring,
144 U+00E5 ISOlat1 -->
145 <!ENTITY aelig "&#230;"> <!-- latin small letter ae
146 = latin small ligature ae, U+00E6 ISOlat1 -->
147 <!ENTITY ccedil "&#231;"> <!-- latin small letter c with cedilla,
148 U+00E7 ISOlat1 -->
149 <!ENTITY egrave "&#232;"> <!-- latin small letter e with grave,
150 U+00E8 ISOlat1 -->
151 <!ENTITY eacute "&#233;"> <!-- latin small letter e with acute,
152 U+00E9 ISOlat1 -->
153 <!ENTITY ecirc "&#234;"> <!-- latin small letter e with circumflex,
154 U+00EA ISOlat1 -->
155 <!ENTITY euml "&#235;"> <!-- latin small letter e with diaeresis,
156 U+00EB ISOlat1 -->
157 <!ENTITY igrave "&#236;"> <!-- latin small letter i with grave,
158 U+00EC ISOlat1 -->
159 <!ENTITY iacute "&#237;"> <!-- latin small letter i with acute,
160 U+00ED ISOlat1 -->
161 <!ENTITY icirc "&#238;"> <!-- latin small letter i with circumflex,
162 U+00EE ISOlat1 -->
163 <!ENTITY iuml "&#239;"> <!-- latin small letter i with diaeresis,
164 U+00EF ISOlat1 -->
165 <!ENTITY eth "&#240;"> <!-- latin small letter eth, U+00F0 ISOlat1 -->
166 <!ENTITY ntilde "&#241;"> <!-- latin small letter n with tilde,
167 U+00F1 ISOlat1 -->
168 <!ENTITY ograve "&#242;"> <!-- latin small letter o with grave,
169 U+00F2 ISOlat1 -->
170 <!ENTITY oacute "&#243;"> <!-- latin small letter o with acute,
171 U+00F3 ISOlat1 -->
172 <!ENTITY ocirc "&#244;"> <!-- latin small letter o with circumflex,
173 U+00F4 ISOlat1 -->
174 <!ENTITY otilde "&#245;"> <!-- latin small letter o with tilde,
175 U+00F5 ISOlat1 -->
176 <!ENTITY ouml "&#246;"> <!-- latin small letter o with diaeresis,
177 U+00F6 ISOlat1 -->
178 <!ENTITY divide "&#247;"> <!-- division sign, U+00F7 ISOnum -->
179 <!ENTITY oslash "&#248;"> <!-- latin small letter o with stroke,
180 = latin small letter o slash,
181 U+00F8 ISOlat1 -->
182 <!ENTITY ugrave "&#249;"> <!-- latin small letter u with grave,
183 U+00F9 ISOlat1 -->
184 <!ENTITY uacute "&#250;"> <!-- latin small letter u with acute,
185 U+00FA ISOlat1 -->
186 <!ENTITY ucirc "&#251;"> <!-- latin small letter u with circumflex,
187 U+00FB ISOlat1 -->
188 <!ENTITY uuml "&#252;"> <!-- latin small letter u with diaeresis,
189 U+00FC ISOlat1 -->
190 <!ENTITY yacute "&#253;"> <!-- latin small letter y with acute,
191 U+00FD ISOlat1 -->
192 <!ENTITY thorn "&#254;"> <!-- latin small letter thorn,
193 U+00FE ISOlat1 -->
194 <!ENTITY yuml "&#255;"> <!-- latin small letter y with diaeresis,
195 U+00FF ISOlat1 -->
+0
-80
src/doc/WWW/xhtml-special.ent
0 <!-- Special characters for XHTML -->
1
2 <!-- Character entity set. Typical invocation:
3 <!ENTITY % HTMLspecial PUBLIC
4 "-//W3C//ENTITIES Special for XHTML//EN"
5 "http://www.w3.org/TR/xhtml1/DTD/xhtml-special.ent">
6 %HTMLspecial;
7 -->
8
9 <!-- Portions (C) International Organization for Standardization 1986:
10 Permission to copy in any form is granted for use with
11 conforming SGML systems and applications as defined in
12 ISO 8879, provided this notice is included in all copies.
13 -->
14
15 <!-- Relevant ISO entity set is given unless names are newly introduced.
16 New names (i.e., not in ISO 8879 list) do not clash with any
17 existing ISO 8879 entity names. ISO 10646 character numbers
18 are given for each character, in hex. values are decimal
19 conversions of the ISO 10646 values and refer to the document
20 character set. Names are Unicode names.
21 -->
22
23 <!-- C0 Controls and Basic Latin -->
24 <!ENTITY quot "&#34;"> <!-- quotation mark, U+0022 ISOnum -->
25 <!ENTITY amp "&#38;#38;"> <!-- ampersand, U+0026 ISOnum -->
26 <!ENTITY lt "&#38;#60;"> <!-- less-than sign, U+003C ISOnum -->
27 <!ENTITY gt "&#62;"> <!-- greater-than sign, U+003E ISOnum -->
28 <!ENTITY apos "&#39;"> <!-- apostrophe = APL quote, U+0027 ISOnum -->
29
30 <!-- Latin Extended-A -->
31 <!ENTITY OElig "&#338;"> <!-- latin capital ligature OE,
32 U+0152 ISOlat2 -->
33 <!ENTITY oelig "&#339;"> <!-- latin small ligature oe, U+0153 ISOlat2 -->
34 <!-- ligature is a misnomer, this is a separate character in some languages -->
35 <!ENTITY Scaron "&#352;"> <!-- latin capital letter S with caron,
36 U+0160 ISOlat2 -->
37 <!ENTITY scaron "&#353;"> <!-- latin small letter s with caron,
38 U+0161 ISOlat2 -->
39 <!ENTITY Yuml "&#376;"> <!-- latin capital letter Y with diaeresis,
40 U+0178 ISOlat2 -->
41
42 <!-- Spacing Modifier Letters -->
43 <!ENTITY circ "&#710;"> <!-- modifier letter circumflex accent,
44 U+02C6 ISOpub -->
45 <!ENTITY tilde "&#732;"> <!-- small tilde, U+02DC ISOdia -->
46
47 <!-- General Punctuation -->
48 <!ENTITY ensp "&#8194;"> <!-- en space, U+2002 ISOpub -->
49 <!ENTITY emsp "&#8195;"> <!-- em space, U+2003 ISOpub -->
50 <!ENTITY thinsp "&#8201;"> <!-- thin space, U+2009 ISOpub -->
51 <!ENTITY zwnj "&#8204;"> <!-- zero width non-joiner,
52 U+200C NEW RFC 2070 -->
53 <!ENTITY zwj "&#8205;"> <!-- zero width joiner, U+200D NEW RFC 2070 -->
54 <!ENTITY lrm "&#8206;"> <!-- left-to-right mark, U+200E NEW RFC 2070 -->
55 <!ENTITY rlm "&#8207;"> <!-- right-to-left mark, U+200F NEW RFC 2070 -->
56 <!ENTITY ndash "&#8211;"> <!-- en dash, U+2013 ISOpub -->
57 <!ENTITY mdash "&#8212;"> <!-- em dash, U+2014 ISOpub -->
58 <!ENTITY lsquo "&#8216;"> <!-- left single quotation mark,
59 U+2018 ISOnum -->
60 <!ENTITY rsquo "&#8217;"> <!-- right single quotation mark,
61 U+2019 ISOnum -->
62 <!ENTITY sbquo "&#8218;"> <!-- single low-9 quotation mark, U+201A NEW -->
63 <!ENTITY ldquo "&#8220;"> <!-- left double quotation mark,
64 U+201C ISOnum -->
65 <!ENTITY rdquo "&#8221;"> <!-- right double quotation mark,
66 U+201D ISOnum -->
67 <!ENTITY bdquo "&#8222;"> <!-- double low-9 quotation mark, U+201E NEW -->
68 <!ENTITY dagger "&#8224;"> <!-- dagger, U+2020 ISOpub -->
69 <!ENTITY Dagger "&#8225;"> <!-- double dagger, U+2021 ISOpub -->
70 <!ENTITY permil "&#8240;"> <!-- per mille sign, U+2030 ISOtech -->
71 <!ENTITY lsaquo "&#8249;"> <!-- single left-pointing angle quotation mark,
72 U+2039 ISO proposed -->
73 <!-- lsaquo is proposed but not yet ISO standardized -->
74 <!ENTITY rsaquo "&#8250;"> <!-- single right-pointing angle quotation mark,
75 U+203A ISO proposed -->
76 <!-- rsaquo is proposed but not yet ISO standardized -->
77
78 <!-- Currency Symbols -->
79 <!ENTITY euro "&#8364;"> <!-- euro sign, U+20AC NEW -->
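The header comment of this entity file says each declaration carries the ISO 10646 number in hex (in the trailing comment) and its decimal conversion (in the entity value). A minimal sketch of checking that correspondence, assuming a few single-line declarations copied from the listing above with the diff line numbers dropped; the names `SAMPLE`, `DECL` and `check_entities` are hypothetical, and the pattern deliberately skips doubly escaped values such as `&#38;#38;` and declarations whose comment wraps onto a second line:

```python
import re

# Sample declarations copied from the listing above (diff line numbers dropped).
SAMPLE = '''
<!ENTITY quot "&#34;"> <!-- quotation mark, U+0022 ISOnum -->
<!ENTITY ndash "&#8211;"> <!-- en dash, U+2013 ISOpub -->
<!ENTITY euro "&#8364;"> <!-- euro sign, U+20AC NEW -->
'''

# Hypothetical pattern: entity name, decimal character reference, and the
# U+XXXX code point from the comment on the same line.
DECL = re.compile(r'<!ENTITY\s+(\w+)\s+"&#(\d+);">\s*<!--.*?U\+([0-9A-F]{4})')


def check_entities(text):
    """Return {name: (decimal_value, value_parsed_from_hex_comment)}."""
    return {
        m.group(1): (int(m.group(2)), int(m.group(3), 16))
        for m in DECL.finditer(text)
    }


entities = check_entities(SAMPLE)
for name, (decimal, from_hex) in entities.items():
    # The two numbers agree when the declaration and its comment are consistent.
    print(name, decimal, decimal == from_hex)
```

The same check could be run over a whole `.ent` file, though multi-line comments would need a pattern that spans line breaks.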
+0
-237
src/doc/WWW/xhtml-symbol.ent
0 <!-- Mathematical, Greek and Symbolic characters for XHTML -->
1
2 <!-- Character entity set. Typical invocation:
3 <!ENTITY % HTMLsymbol PUBLIC
4 "-//W3C//ENTITIES Symbols for XHTML//EN"
5 "http://www.w3.org/TR/xhtml1/DTD/xhtml-symbol.ent">
6 %HTMLsymbol;
7 -->
8
9 <!-- Portions (C) International Organization for Standardization 1986:
10 Permission to copy in any form is granted for use with
11 conforming SGML systems and applications as defined in
12 ISO 8879, provided this notice is included in all copies.
13 -->
14
15 <!-- Relevant ISO entity set is given unless names are newly introduced.
16 New names (i.e., not in ISO 8879 list) do not clash with any
17 existing ISO 8879 entity names. ISO 10646 character numbers
18 are given for each character, in hex. Values are decimal
19 conversions of the ISO 10646 values and refer to the document
20 character set. Names are Unicode names.
21 -->
22
23 <!-- Latin Extended-B -->
24 <!ENTITY fnof "&#402;"> <!-- latin small letter f with hook = function
25 = florin, U+0192 ISOtech -->
26
27 <!-- Greek -->
28 <!ENTITY Alpha "&#913;"> <!-- greek capital letter alpha, U+0391 -->
29 <!ENTITY Beta "&#914;"> <!-- greek capital letter beta, U+0392 -->
30 <!ENTITY Gamma "&#915;"> <!-- greek capital letter gamma,
31 U+0393 ISOgrk3 -->
32 <!ENTITY Delta "&#916;"> <!-- greek capital letter delta,
33 U+0394 ISOgrk3 -->
34 <!ENTITY Epsilon "&#917;"> <!-- greek capital letter epsilon, U+0395 -->
35 <!ENTITY Zeta "&#918;"> <!-- greek capital letter zeta, U+0396 -->
36 <!ENTITY Eta "&#919;"> <!-- greek capital letter eta, U+0397 -->
37 <!ENTITY Theta "&#920;"> <!-- greek capital letter theta,
38 U+0398 ISOgrk3 -->
39 <!ENTITY Iota "&#921;"> <!-- greek capital letter iota, U+0399 -->
40 <!ENTITY Kappa "&#922;"> <!-- greek capital letter kappa, U+039A -->
41 <!ENTITY Lambda "&#923;"> <!-- greek capital letter lamda,
42 U+039B ISOgrk3 -->
43 <!ENTITY Mu "&#924;"> <!-- greek capital letter mu, U+039C -->
44 <!ENTITY Nu "&#925;"> <!-- greek capital letter nu, U+039D -->
45 <!ENTITY Xi "&#926;"> <!-- greek capital letter xi, U+039E ISOgrk3 -->
46 <!ENTITY Omicron "&#927;"> <!-- greek capital letter omicron, U+039F -->
47 <!ENTITY Pi "&#928;"> <!-- greek capital letter pi, U+03A0 ISOgrk3 -->
48 <!ENTITY Rho "&#929;"> <!-- greek capital letter rho, U+03A1 -->
49 <!-- there is no Sigmaf, and no U+03A2 character either -->
50 <!ENTITY Sigma "&#931;"> <!-- greek capital letter sigma,
51 U+03A3 ISOgrk3 -->
52 <!ENTITY Tau "&#932;"> <!-- greek capital letter tau, U+03A4 -->
53 <!ENTITY Upsilon "&#933;"> <!-- greek capital letter upsilon,
54 U+03A5 ISOgrk3 -->
55 <!ENTITY Phi "&#934;"> <!-- greek capital letter phi,
56 U+03A6 ISOgrk3 -->
57 <!ENTITY Chi "&#935;"> <!-- greek capital letter chi, U+03A7 -->
58 <!ENTITY Psi "&#936;"> <!-- greek capital letter psi,
59 U+03A8 ISOgrk3 -->
60 <!ENTITY Omega "&#937;"> <!-- greek capital letter omega,
61 U+03A9 ISOgrk3 -->
62
63 <!ENTITY alpha "&#945;"> <!-- greek small letter alpha,
64 U+03B1 ISOgrk3 -->
65 <!ENTITY beta "&#946;"> <!-- greek small letter beta, U+03B2 ISOgrk3 -->
66 <!ENTITY gamma "&#947;"> <!-- greek small letter gamma,
67 U+03B3 ISOgrk3 -->
68 <!ENTITY delta "&#948;"> <!-- greek small letter delta,
69 U+03B4 ISOgrk3 -->
70 <!ENTITY epsilon "&#949;"> <!-- greek small letter epsilon,
71 U+03B5 ISOgrk3 -->
72 <!ENTITY zeta "&#950;"> <!-- greek small letter zeta, U+03B6 ISOgrk3 -->
73 <!ENTITY eta "&#951;"> <!-- greek small letter eta, U+03B7 ISOgrk3 -->
74 <!ENTITY theta "&#952;"> <!-- greek small letter theta,
75 U+03B8 ISOgrk3 -->
76 <!ENTITY iota "&#953;"> <!-- greek small letter iota, U+03B9 ISOgrk3 -->
77 <!ENTITY kappa "&#954;"> <!-- greek small letter kappa,
78 U+03BA ISOgrk3 -->
79 <!ENTITY lambda "&#955;"> <!-- greek small letter lamda,
80 U+03BB ISOgrk3 -->
81 <!ENTITY mu "&#956;"> <!-- greek small letter mu, U+03BC ISOgrk3 -->
82 <!ENTITY nu "&#957;"> <!-- greek small letter nu, U+03BD ISOgrk3 -->
83 <!ENTITY xi "&#958;"> <!-- greek small letter xi, U+03BE ISOgrk3 -->
84 <!ENTITY omicron "&#959;"> <!-- greek small letter omicron, U+03BF NEW -->
85 <!ENTITY pi "&#960;"> <!-- greek small letter pi, U+03C0 ISOgrk3 -->
86 <!ENTITY rho "&#961;"> <!-- greek small letter rho, U+03C1 ISOgrk3 -->
87 <!ENTITY sigmaf "&#962;"> <!-- greek small letter final sigma,
88 U+03C2 ISOgrk3 -->
89 <!ENTITY sigma "&#963;"> <!-- greek small letter sigma,
90 U+03C3 ISOgrk3 -->
91 <!ENTITY tau "&#964;"> <!-- greek small letter tau, U+03C4 ISOgrk3 -->
92 <!ENTITY upsilon "&#965;"> <!-- greek small letter upsilon,
93 U+03C5 ISOgrk3 -->
94 <!ENTITY phi "&#966;"> <!-- greek small letter phi, U+03C6 ISOgrk3 -->
95 <!ENTITY chi "&#967;"> <!-- greek small letter chi, U+03C7 ISOgrk3 -->
96 <!ENTITY psi "&#968;"> <!-- greek small letter psi, U+03C8 ISOgrk3 -->
97 <!ENTITY omega "&#969;"> <!-- greek small letter omega,
98 U+03C9 ISOgrk3 -->
99 <!ENTITY thetasym "&#977;"> <!-- greek theta symbol,
100 U+03D1 NEW -->
101 <!ENTITY upsih "&#978;"> <!-- greek upsilon with hook symbol,
102 U+03D2 NEW -->
103 <!ENTITY piv "&#982;"> <!-- greek pi symbol, U+03D6 ISOgrk3 -->
104
105 <!-- General Punctuation -->
106 <!ENTITY bull "&#8226;"> <!-- bullet = black small circle,
107 U+2022 ISOpub -->
108 <!-- bullet is NOT the same as bullet operator, U+2219 -->
109 <!ENTITY hellip "&#8230;"> <!-- horizontal ellipsis = three dot leader,
110 U+2026 ISOpub -->
111 <!ENTITY prime "&#8242;"> <!-- prime = minutes = feet, U+2032 ISOtech -->
112 <!ENTITY Prime "&#8243;"> <!-- double prime = seconds = inches,
113 U+2033 ISOtech -->
114 <!ENTITY oline "&#8254;"> <!-- overline = spacing overscore,
115 U+203E NEW -->
116 <!ENTITY frasl "&#8260;"> <!-- fraction slash, U+2044 NEW -->
117
118 <!-- Letterlike Symbols -->
119 <!ENTITY weierp "&#8472;"> <!-- script capital P = power set
120 = Weierstrass p, U+2118 ISOamso -->
121 <!ENTITY image "&#8465;"> <!-- black-letter capital I = imaginary part,
122 U+2111 ISOamso -->
123 <!ENTITY real "&#8476;"> <!-- black-letter capital R = real part symbol,
124 U+211C ISOamso -->
125 <!ENTITY trade "&#8482;"> <!-- trade mark sign, U+2122 ISOnum -->
126 <!ENTITY alefsym "&#8501;"> <!-- alef symbol = first transfinite cardinal,
127 U+2135 NEW -->
128 <!-- alef symbol is NOT the same as hebrew letter alef,
129 U+05D0 although the same glyph could be used to depict both characters -->
130
131 <!-- Arrows -->
132 <!ENTITY larr "&#8592;"> <!-- leftwards arrow, U+2190 ISOnum -->
133 <!ENTITY uarr "&#8593;"> <!-- upwards arrow, U+2191 ISOnum -->
134 <!ENTITY rarr "&#8594;"> <!-- rightwards arrow, U+2192 ISOnum -->
135 <!ENTITY darr "&#8595;"> <!-- downwards arrow, U+2193 ISOnum -->
136 <!ENTITY harr "&#8596;"> <!-- left right arrow, U+2194 ISOamsa -->
137 <!ENTITY crarr "&#8629;"> <!-- downwards arrow with corner leftwards
138 = carriage return, U+21B5 NEW -->
139 <!ENTITY lArr "&#8656;"> <!-- leftwards double arrow, U+21D0 ISOtech -->
140 <!-- Unicode does not say that lArr is the same as the 'is implied by' arrow
141 but also does not have any other character for that function. So lArr can
142 be used for 'is implied by' as ISOtech suggests -->
143 <!ENTITY uArr "&#8657;"> <!-- upwards double arrow, U+21D1 ISOamsa -->
144 <!ENTITY rArr "&#8658;"> <!-- rightwards double arrow,
145 U+21D2 ISOtech -->
146 <!-- Unicode does not say this is the 'implies' character but does not have
147 another character with this function so rArr can be used for 'implies'
148 as ISOtech suggests -->
149 <!ENTITY dArr "&#8659;"> <!-- downwards double arrow, U+21D3 ISOamsa -->
150 <!ENTITY hArr "&#8660;"> <!-- left right double arrow,
151 U+21D4 ISOamsa -->
152
153 <!-- Mathematical Operators -->
154 <!ENTITY forall "&#8704;"> <!-- for all, U+2200 ISOtech -->
155 <!ENTITY part "&#8706;"> <!-- partial differential, U+2202 ISOtech -->
156 <!ENTITY exist "&#8707;"> <!-- there exists, U+2203 ISOtech -->
157 <!ENTITY empty "&#8709;"> <!-- empty set = null set, U+2205 ISOamso -->
158 <!ENTITY nabla "&#8711;"> <!-- nabla = backward difference,
159 U+2207 ISOtech -->
160 <!ENTITY isin "&#8712;"> <!-- element of, U+2208 ISOtech -->
161 <!ENTITY notin "&#8713;"> <!-- not an element of, U+2209 ISOtech -->
162 <!ENTITY ni "&#8715;"> <!-- contains as member, U+220B ISOtech -->
163 <!ENTITY prod "&#8719;"> <!-- n-ary product = product sign,
164 U+220F ISOamsb -->
165 <!-- prod is NOT the same character as U+03A0 'greek capital letter pi' though
166 the same glyph might be used for both -->
167 <!ENTITY sum "&#8721;"> <!-- n-ary summation, U+2211 ISOamsb -->
168 <!-- sum is NOT the same character as U+03A3 'greek capital letter sigma'
169 though the same glyph might be used for both -->
170 <!ENTITY minus "&#8722;"> <!-- minus sign, U+2212 ISOtech -->
171 <!ENTITY lowast "&#8727;"> <!-- asterisk operator, U+2217 ISOtech -->
172 <!ENTITY radic "&#8730;"> <!-- square root = radical sign,
173 U+221A ISOtech -->
174 <!ENTITY prop "&#8733;"> <!-- proportional to, U+221D ISOtech -->
175 <!ENTITY infin "&#8734;"> <!-- infinity, U+221E ISOtech -->
176 <!ENTITY ang "&#8736;"> <!-- angle, U+2220 ISOamso -->
177 <!ENTITY and "&#8743;"> <!-- logical and = wedge, U+2227 ISOtech -->
178 <!ENTITY or "&#8744;"> <!-- logical or = vee, U+2228 ISOtech -->
179 <!ENTITY cap "&#8745;"> <!-- intersection = cap, U+2229 ISOtech -->
180 <!ENTITY cup "&#8746;"> <!-- union = cup, U+222A ISOtech -->
181 <!ENTITY int "&#8747;"> <!-- integral, U+222B ISOtech -->
182 <!ENTITY there4 "&#8756;"> <!-- therefore, U+2234 ISOtech -->
183 <!ENTITY sim "&#8764;"> <!-- tilde operator = varies with = similar to,
184 U+223C ISOtech -->
185 <!-- tilde operator is NOT the same character as the tilde, U+007E,
186 although the same glyph might be used to represent both -->
187 <!ENTITY cong "&#8773;"> <!-- approximately equal to, U+2245 ISOtech -->
188 <!ENTITY asymp "&#8776;"> <!-- almost equal to = asymptotic to,
189 U+2248 ISOamsr -->
190 <!ENTITY ne "&#8800;"> <!-- not equal to, U+2260 ISOtech -->
191 <!ENTITY equiv "&#8801;"> <!-- identical to, U+2261 ISOtech -->
192 <!ENTITY le "&#8804;"> <!-- less-than or equal to, U+2264 ISOtech -->
193 <!ENTITY ge "&#8805;"> <!-- greater-than or equal to,
194 U+2265 ISOtech -->
195 <!ENTITY sub "&#8834;"> <!-- subset of, U+2282 ISOtech -->
196 <!ENTITY sup "&#8835;"> <!-- superset of, U+2283 ISOtech -->
197 <!ENTITY nsub "&#8836;"> <!-- not a subset of, U+2284 ISOamsn -->
198 <!ENTITY sube "&#8838;"> <!-- subset of or equal to, U+2286 ISOtech -->
199 <!ENTITY supe "&#8839;"> <!-- superset of or equal to,
200 U+2287 ISOtech -->
201 <!ENTITY oplus "&#8853;"> <!-- circled plus = direct sum,
202 U+2295 ISOamsb -->
203 <!ENTITY otimes "&#8855;"> <!-- circled times = vector product,
204 U+2297 ISOamsb -->
205 <!ENTITY perp "&#8869;"> <!-- up tack = orthogonal to = perpendicular,
206 U+22A5 ISOtech -->
207 <!ENTITY sdot "&#8901;"> <!-- dot operator, U+22C5 ISOamsb -->
208 <!-- dot operator is NOT the same character as U+00B7 middle dot -->
209
210 <!-- Miscellaneous Technical -->
211 <!ENTITY lceil "&#8968;"> <!-- left ceiling = APL upstile,
212 U+2308 ISOamsc -->
213 <!ENTITY rceil "&#8969;"> <!-- right ceiling, U+2309 ISOamsc -->
214 <!ENTITY lfloor "&#8970;"> <!-- left floor = APL downstile,
215 U+230A ISOamsc -->
216 <!ENTITY rfloor "&#8971;"> <!-- right floor, U+230B ISOamsc -->
217 <!ENTITY lang "&#9001;"> <!-- left-pointing angle bracket = bra,
218 U+2329 ISOtech -->
219 <!-- lang is NOT the same character as U+003C 'less than sign'
220 or U+2039 'single left-pointing angle quotation mark' -->
221 <!ENTITY rang "&#9002;"> <!-- right-pointing angle bracket = ket,
222 U+232A ISOtech -->
223 <!-- rang is NOT the same character as U+003E 'greater than sign'
224 or U+203A 'single right-pointing angle quotation mark' -->
225
226 <!-- Geometric Shapes -->
227 <!ENTITY loz "&#9674;"> <!-- lozenge, U+25CA ISOpub -->
228
229 <!-- Miscellaneous Symbols -->
230 <!ENTITY spades "&#9824;"> <!-- black spade suit, U+2660 ISOpub -->
231 <!-- black here seems to mean filled as opposed to hollow -->
232 <!ENTITY clubs "&#9827;"> <!-- black club suit = shamrock,
233 U+2663 ISOpub -->
234 <!ENTITY hearts "&#9829;"> <!-- black heart suit = valentine,
235 U+2665 ISOpub -->
236 <!ENTITY diams "&#9830;"> <!-- black diamond suit, U+2666 ISOpub -->
+0
-1201
src/doc/WWW/xhtml1-transitional.dtd
0 <!--
1 Extensible HTML version 1.0 Transitional DTD
2
3 This is the same as HTML 4 Transitional except for
4 changes due to the differences between XML and SGML.
5
6 Namespace = http://www.w3.org/1999/xhtml
7
8 For further information, see: http://www.w3.org/TR/xhtml1
9
10 Copyright (c) 1998-2002 W3C (MIT, INRIA, Keio),
11 All Rights Reserved.
12
13 This DTD module is identified by the PUBLIC and SYSTEM identifiers:
14
15 PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
16 SYSTEM "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"
17
18 $Revision: 1.2 $
19 $Date: 2002/08/01 18:37:55 $
20
21 -->
22
23 <!--================ Character mnemonic entities =========================-->
24
25 <!ENTITY % HTMLlat1 PUBLIC
26 "-//W3C//ENTITIES Latin 1 for XHTML//EN"
27 "xhtml-lat1.ent">
28 %HTMLlat1;
29
30 <!ENTITY % HTMLsymbol PUBLIC
31 "-//W3C//ENTITIES Symbols for XHTML//EN"
32 "xhtml-symbol.ent">
33 %HTMLsymbol;
34
35 <!ENTITY % HTMLspecial PUBLIC
36 "-//W3C//ENTITIES Special for XHTML//EN"
37 "xhtml-special.ent">
38 %HTMLspecial;
39
40 <!--================== Imported Names ====================================-->
41
42 <!ENTITY % ContentType "CDATA">
43 <!-- media type, as per [RFC2045] -->
44
45 <!ENTITY % ContentTypes "CDATA">
46 <!-- comma-separated list of media types, as per [RFC2045] -->
47
48 <!ENTITY % Charset "CDATA">
49 <!-- a character encoding, as per [RFC2045] -->
50
51 <!ENTITY % Charsets "CDATA">
52 <!-- a space separated list of character encodings, as per [RFC2045] -->
53
54 <!ENTITY % LanguageCode "NMTOKEN">
55 <!-- a language code, as per [RFC3066] -->
56
57 <!ENTITY % Character "CDATA">
58 <!-- a single character, as per section 2.2 of [XML] -->
59
60 <!ENTITY % Number "CDATA">
61 <!-- one or more digits -->
62
63 <!ENTITY % LinkTypes "CDATA">
64 <!-- space-separated list of link types -->
65
66 <!ENTITY % MediaDesc "CDATA">
67 <!-- single or comma-separated list of media descriptors -->
68
69 <!ENTITY % URI "CDATA">
70 <!-- a Uniform Resource Identifier, see [RFC2396] -->
71
72 <!ENTITY % UriList "CDATA">
73 <!-- a space separated list of Uniform Resource Identifiers -->
74
75 <!ENTITY % Datetime "CDATA">
76 <!-- date and time information. ISO date format -->
77
78 <!ENTITY % Script "CDATA">
79 <!-- script expression -->
80
81 <!ENTITY % StyleSheet "CDATA">
82 <!-- style sheet data -->
83
84 <!ENTITY % Text "CDATA">
85 <!-- used for titles etc. -->
86
87 <!ENTITY % FrameTarget "NMTOKEN">
88 <!-- render in this frame -->
89
90 <!ENTITY % Length "CDATA">
91 <!-- nn for pixels or nn% for percentage length -->
92
93 <!ENTITY % MultiLength "CDATA">
94 <!-- pixel, percentage, or relative -->
95
96 <!ENTITY % Pixels "CDATA">
97 <!-- integer representing length in pixels -->
98
99 <!-- these are used for image maps -->
100
101 <!ENTITY % Shape "(rect|circle|poly|default)">
102
103 <!ENTITY % Coords "CDATA">
104 <!-- comma separated list of lengths -->
105
106 <!-- used for object, applet, img, input and iframe -->
107 <!ENTITY % ImgAlign "(top|middle|bottom|left|right)">
108
109 <!-- a color using sRGB: #RRGGBB as Hex values -->
110 <!ENTITY % Color "CDATA">
111
112 <!-- There are also 16 widely known color names with their sRGB values:
113
114 Black = #000000 Green = #008000
115 Silver = #C0C0C0 Lime = #00FF00
116 Gray = #808080 Olive = #808000
117 White = #FFFFFF Yellow = #FFFF00
118 Maroon = #800000 Navy = #000080
119 Red = #FF0000 Blue = #0000FF
120 Purple = #800080 Teal = #008080
121 Fuchsia= #FF00FF Aqua = #00FFFF
122 -->
123
124 <!--=================== Generic Attributes ===============================-->
125
126 <!-- core attributes common to most elements
127 id document-wide unique id
128 class space separated list of classes
129 style associated style info
130 title advisory title/amplification
131 -->
132 <!ENTITY % coreattrs
133 "id ID #IMPLIED
134 class CDATA #IMPLIED
135 style %StyleSheet; #IMPLIED
136 title %Text; #IMPLIED"
137 >
138
139 <!-- internationalization attributes
140 lang language code (backwards compatible)
141 xml:lang language code (as per XML 1.0 spec)
142 dir direction for weak/neutral text
143 -->
144 <!ENTITY % i18n
145 "lang %LanguageCode; #IMPLIED
146 xml:lang %LanguageCode; #IMPLIED
147 dir (ltr|rtl) #IMPLIED"
148 >
149
150 <!-- attributes for common UI events
151 onclick a pointer button was clicked
152 ondblclick a pointer button was double clicked
153 onmousedown a pointer button was pressed down
154 onmouseup a pointer button was released
155 onmouseover a pointer was moved onto the element (onmousemove: moved within it)
156 onmouseout a pointer was moved away from the element
157 onkeypress a key was pressed and released
158 onkeydown a key was pressed down
159 onkeyup a key was released
160 -->
161 <!ENTITY % events
162 "onclick %Script; #IMPLIED
163 ondblclick %Script; #IMPLIED
164 onmousedown %Script; #IMPLIED
165 onmouseup %Script; #IMPLIED
166 onmouseover %Script; #IMPLIED
167 onmousemove %Script; #IMPLIED
168 onmouseout %Script; #IMPLIED
169 onkeypress %Script; #IMPLIED
170 onkeydown %Script; #IMPLIED
171 onkeyup %Script; #IMPLIED"
172 >
173
174 <!-- attributes for elements that can get the focus
175 accesskey accessibility key character
176 tabindex position in tabbing order
177 onfocus the element got the focus
178 onblur the element lost the focus
179 -->
180 <!ENTITY % focus
181 "accesskey %Character; #IMPLIED
182 tabindex %Number; #IMPLIED
183 onfocus %Script; #IMPLIED
184 onblur %Script; #IMPLIED"
185 >
186
187 <!ENTITY % attrs "%coreattrs; %i18n; %events;">
188
189 <!-- text alignment for p, div, h1-h6. The default is
190 align="left" for ltr headings, "right" for rtl -->
191
192 <!ENTITY % TextAlign "align (left|center|right|justify) #IMPLIED">
193
194 <!--=================== Text Elements ====================================-->
195
196 <!ENTITY % special.extra
197 "object | applet | img | map | iframe">
198
199 <!ENTITY % special.basic
200 "br | span | bdo">
201
202 <!ENTITY % special
203 "%special.basic; | %special.extra;">
204
205 <!ENTITY % fontstyle.extra "big | small | font | basefont">
206
207 <!ENTITY % fontstyle.basic "tt | i | b | u
208 | s | strike ">
209
210 <!ENTITY % fontstyle "%fontstyle.basic; | %fontstyle.extra;">
211
212 <!ENTITY % phrase.extra "sub | sup">
213 <!ENTITY % phrase.basic "em | strong | dfn | code | q |
214 samp | kbd | var | cite | abbr | acronym">
215
216 <!ENTITY % phrase "%phrase.basic; | %phrase.extra;">
217
218 <!ENTITY % inline.forms "input | select | textarea | label | button">
219
220 <!-- these can occur at block or inline level -->
221 <!ENTITY % misc.inline "ins | del | script">
222
223 <!-- these can only occur at block level -->
224 <!ENTITY % misc "noscript | %misc.inline;">
225
226 <!ENTITY % inline "a | %special; | %fontstyle; | %phrase; | %inline.forms;">
227
228 <!-- %Inline; covers inline or "text-level" elements -->
229 <!ENTITY % Inline "(#PCDATA | %inline; | %misc.inline;)*">
230
231 <!--================== Block level elements ==============================-->
232
233 <!ENTITY % heading "h1|h2|h3|h4|h5|h6">
234 <!ENTITY % lists "ul | ol | dl | menu | dir">
235 <!ENTITY % blocktext "pre | hr | blockquote | address | center | noframes">
236
237 <!ENTITY % block
238 "p | %heading; | div | %lists; | %blocktext; | isindex |fieldset | table">
239
240 <!-- %Flow; mixes block and inline and is used for list items etc. -->
241 <!ENTITY % Flow "(#PCDATA | %block; | form | %inline; | %misc;)*">
242
243 <!--================== Content models for exclusions =====================-->
244
245 <!-- a elements use %Inline; excluding a -->
246
247 <!ENTITY % a.content
248 "(#PCDATA | %special; | %fontstyle; | %phrase; | %inline.forms; | %misc.inline;)*">
249
250 <!-- pre uses %Inline; excluding img, object, applet, big, small,
251 sub, sup, font, or basefont -->
252
253 <!ENTITY % pre.content
254 "(#PCDATA | a | %special.basic; | %fontstyle.basic; | %phrase.basic; |
255 %inline.forms; | %misc.inline;)*">
256
257 <!-- form uses %Flow; excluding form -->
258
259 <!ENTITY % form.content "(#PCDATA | %block; | %inline; | %misc;)*">
260
261 <!-- button uses %Flow; but excludes a, form, form controls, iframe -->
262
263 <!ENTITY % button.content
264 "(#PCDATA | p | %heading; | div | %lists; | %blocktext; |
265 table | br | span | bdo | object | applet | img | map |
266 %fontstyle; | %phrase; | %misc;)*">
267
268 <!--================ Document Structure ==================================-->
269
270 <!-- the namespace URI designates the document profile -->
271
272 <!ELEMENT html (head, body)>
273 <!ATTLIST html
274 %i18n;
275 id ID #IMPLIED
276 xmlns %URI; #FIXED 'http://www.w3.org/1999/xhtml'
277 >
278
279 <!--================ Document Head =======================================-->
280
281 <!ENTITY % head.misc "(script|style|meta|link|object|isindex)*">
282
283 <!-- content model is %head.misc; combined with a single
284 title and an optional base element in any order -->
285
286 <!ELEMENT head (%head.misc;,
287 ((title, %head.misc;, (base, %head.misc;)?) |
288 (base, %head.misc;, (title, %head.misc;))))>
289
290 <!ATTLIST head
291 %i18n;
292 id ID #IMPLIED
293 profile %URI; #IMPLIED
294 >
295
296 <!-- The title element is not considered part of the flow of text.
297 It should be displayed, for example as the page header or
298 window title. Exactly one title is required per document.
299 -->
300 <!ELEMENT title (#PCDATA)>
301 <!ATTLIST title
302 %i18n;
303 id ID #IMPLIED
304 >
305
306 <!-- document base URI -->
307
308 <!ELEMENT base EMPTY>
309 <!ATTLIST base
310 id ID #IMPLIED
311 href %URI; #IMPLIED
312 target %FrameTarget; #IMPLIED
313 >
314
315 <!-- generic metainformation -->
316 <!ELEMENT meta EMPTY>
317 <!ATTLIST meta
318 %i18n;
319 id ID #IMPLIED
320 http-equiv CDATA #IMPLIED
321 name CDATA #IMPLIED
322 content CDATA #REQUIRED
323 scheme CDATA #IMPLIED
324 >
325
326 <!--
327 Relationship values can be used in principle:
328
329 a) for document specific toolbars/menus when used
330 with the link element in document head e.g.
331 start, contents, previous, next, index, end, help
332 b) to link to a separate style sheet (rel="stylesheet")
333 c) to make a link to a script (rel="script")
334 d) by stylesheets to control how collections of
335 html nodes are rendered into printed documents
336 e) to make a link to a printable version of this document
337 e.g. a PostScript or PDF version (rel="alternate" media="print")
338 -->
339
340 <!ELEMENT link EMPTY>
341 <!ATTLIST link
342 %attrs;
343 charset %Charset; #IMPLIED
344 href %URI; #IMPLIED
345 hreflang %LanguageCode; #IMPLIED
346 type %ContentType; #IMPLIED
347 rel %LinkTypes; #IMPLIED
348 rev %LinkTypes; #IMPLIED
349 media %MediaDesc; #IMPLIED
350 target %FrameTarget; #IMPLIED
351 >
352
353 <!-- style info, which may include CDATA sections -->
354 <!ELEMENT style (#PCDATA)>
355 <!ATTLIST style
356 %i18n;
357 id ID #IMPLIED
358 type %ContentType; #REQUIRED
359 media %MediaDesc; #IMPLIED
360 title %Text; #IMPLIED
361 xml:space (preserve) #FIXED 'preserve'
362 >
363
364 <!-- script statements, which may include CDATA sections -->
365 <!ELEMENT script (#PCDATA)>
366 <!ATTLIST script
367 id ID #IMPLIED
368 charset %Charset; #IMPLIED
369 type %ContentType; #REQUIRED
370 language CDATA #IMPLIED
371 src %URI; #IMPLIED
372 defer (defer) #IMPLIED
373 xml:space (preserve) #FIXED 'preserve'
374 >
375
376 <!-- alternate content container for non script-based rendering -->
377
378 <!ELEMENT noscript %Flow;>
379 <!ATTLIST noscript
380 %attrs;
381 >
382
383 <!--======================= Frames =======================================-->
384
385 <!-- inline subwindow -->
386
387 <!ELEMENT iframe %Flow;>
388 <!ATTLIST iframe
389 %coreattrs;
390 longdesc %URI; #IMPLIED
391 name NMTOKEN #IMPLIED
392 src %URI; #IMPLIED
393 frameborder (1|0) "1"
394 marginwidth %Pixels; #IMPLIED
395 marginheight %Pixels; #IMPLIED
396 scrolling (yes|no|auto) "auto"
397 align %ImgAlign; #IMPLIED
398 height %Length; #IMPLIED
399 width %Length; #IMPLIED
400 >
401
402 <!-- alternate content container for non frame-based rendering -->
403
404 <!ELEMENT noframes %Flow;>
405 <!ATTLIST noframes
406 %attrs;
407 >
408
409 <!--=================== Document Body ====================================-->
410
411 <!ELEMENT body %Flow;>
412 <!ATTLIST body
413 %attrs;
414 onload %Script; #IMPLIED
415 onunload %Script; #IMPLIED
416 background %URI; #IMPLIED
417 bgcolor %Color; #IMPLIED
418 text %Color; #IMPLIED
419 link %Color; #IMPLIED
420 vlink %Color; #IMPLIED
421 alink %Color; #IMPLIED
422 >
423
424 <!ELEMENT div %Flow;> <!-- generic language/style container -->
425 <!ATTLIST div
426 %attrs;
427 %TextAlign;
428 >
429
430 <!--=================== Paragraphs =======================================-->
431
432 <!ELEMENT p %Inline;>
433 <!ATTLIST p
434 %attrs;
435 %TextAlign;
436 >
437
438 <!--=================== Headings =========================================-->
439
440 <!--
441 There are six levels of headings from h1 (the most important)
442 to h6 (the least important).
443 -->
444
445 <!ELEMENT h1 %Inline;>
446 <!ATTLIST h1
447 %attrs;
448 %TextAlign;
449 >
450
451 <!ELEMENT h2 %Inline;>
452 <!ATTLIST h2
453 %attrs;
454 %TextAlign;
455 >
456
457 <!ELEMENT h3 %Inline;>
458 <!ATTLIST h3
459 %attrs;
460 %TextAlign;
461 >
462
463 <!ELEMENT h4 %Inline;>
464 <!ATTLIST h4
465 %attrs;
466 %TextAlign;
467 >
468
469 <!ELEMENT h5 %Inline;>
470 <!ATTLIST h5
471 %attrs;
472 %TextAlign;
473 >
474
475 <!ELEMENT h6 %Inline;>
476 <!ATTLIST h6
477 %attrs;
478 %TextAlign;
479 >
480
481 <!--=================== Lists ============================================-->
482
483 <!-- Unordered list bullet styles -->
484
485 <!ENTITY % ULStyle "(disc|square|circle)">
486
487 <!-- Unordered list -->
488
489 <!ELEMENT ul (li)+>
490 <!ATTLIST ul
491 %attrs;
492 type %ULStyle; #IMPLIED
493 compact (compact) #IMPLIED
494 >
495
496 <!-- Ordered list numbering style
497
498 1 arabic numbers 1, 2, 3, ...
499 a lower alpha a, b, c, ...
500 A upper alpha A, B, C, ...
501 i lower roman i, ii, iii, ...
502 I upper roman I, II, III, ...
503
504 The style is applied to the sequence number which by default
505 is reset to 1 for the first list item in an ordered list.
506 -->
507 <!ENTITY % OLStyle "CDATA">
508
509 <!-- Ordered (numbered) list -->
510
511 <!ELEMENT ol (li)+>
512 <!ATTLIST ol
513 %attrs;
514 type %OLStyle; #IMPLIED
515 compact (compact) #IMPLIED
516 start %Number; #IMPLIED
517 >
518
519 <!-- single column list (DEPRECATED) -->
520 <!ELEMENT menu (li)+>
521 <!ATTLIST menu
522 %attrs;
523 compact (compact) #IMPLIED
524 >
525
526 <!-- multiple column list (DEPRECATED) -->
527 <!ELEMENT dir (li)+>
528 <!ATTLIST dir
529 %attrs;
530 compact (compact) #IMPLIED
531 >
532
533 <!-- LIStyle is constrained to: "(%ULStyle;|%OLStyle;)" -->
534 <!ENTITY % LIStyle "CDATA">
535
536 <!-- list item -->
537
538 <!ELEMENT li %Flow;>
539 <!ATTLIST li
540 %attrs;
541 type %LIStyle; #IMPLIED
542 value %Number; #IMPLIED
543 >
544
545 <!-- definition lists - dt for term, dd for its definition -->
546
547 <!ELEMENT dl (dt|dd)+>
548 <!ATTLIST dl
549 %attrs;
550 compact (compact) #IMPLIED
551 >
552
553 <!ELEMENT dt %Inline;>
554 <!ATTLIST dt
555 %attrs;
556 >
557
558 <!ELEMENT dd %Flow;>
559 <!ATTLIST dd
560 %attrs;
561 >
562
563 <!--=================== Address ==========================================-->
564
565 <!-- information on author -->
566
567 <!ELEMENT address (#PCDATA | %inline; | %misc.inline; | p)*>
568 <!ATTLIST address
569 %attrs;
570 >
571
572 <!--=================== Horizontal Rule ==================================-->
573
574 <!ELEMENT hr EMPTY>
575 <!ATTLIST hr
576 %attrs;
577 align (left|center|right) #IMPLIED
578 noshade (noshade) #IMPLIED
579 size %Pixels; #IMPLIED
580 width %Length; #IMPLIED
581 >
582
583 <!--=================== Preformatted Text ================================-->
584
585 <!-- content is %Inline; excluding
586 "img|object|applet|big|small|sub|sup|font|basefont" -->
587
588 <!ELEMENT pre %pre.content;>
589 <!ATTLIST pre
590 %attrs;
591 width %Number; #IMPLIED
592 xml:space (preserve) #FIXED 'preserve'
593 >
594
595 <!--=================== Block-like Quotes ================================-->
596
597 <!ELEMENT blockquote %Flow;>
598 <!ATTLIST blockquote
599 %attrs;
600 cite %URI; #IMPLIED
601 >
602
603 <!--=================== Text alignment ===================================-->
604
605 <!-- center content -->
606 <!ELEMENT center %Flow;>
607 <!ATTLIST center
608 %attrs;
609 >
610
611 <!--=================== Inserted/Deleted Text ============================-->
612
613 <!--
614 ins/del are allowed in block and inline content, but it's
615 inappropriate to include block content within an ins element
616 occurring in inline content.
617 -->
618 <!ELEMENT ins %Flow;>
619 <!ATTLIST ins
620 %attrs;
621 cite %URI; #IMPLIED
622 datetime %Datetime; #IMPLIED
623 >
624
625 <!ELEMENT del %Flow;>
626 <!ATTLIST del
627 %attrs;
628 cite %URI; #IMPLIED
629 datetime %Datetime; #IMPLIED
630 >
631
632 <!--================== The Anchor Element ================================-->
633
634 <!-- content is %Inline; except that anchors shouldn't be nested -->
635
636 <!ELEMENT a %a.content;>
637 <!ATTLIST a
638 %attrs;
639 %focus;
640 charset %Charset; #IMPLIED
641 type %ContentType; #IMPLIED
642 name NMTOKEN #IMPLIED
643 href %URI; #IMPLIED
644 hreflang %LanguageCode; #IMPLIED
645 rel %LinkTypes; #IMPLIED
646 rev %LinkTypes; #IMPLIED
647 shape %Shape; "rect"
648 coords %Coords; #IMPLIED
649 target %FrameTarget; #IMPLIED
650 >
651
652 <!--===================== Inline Elements ================================-->
653
654 <!ELEMENT span %Inline;> <!-- generic language/style container -->
655 <!ATTLIST span
656 %attrs;
657 >
658
659 <!ELEMENT bdo %Inline;> <!-- I18N BiDi over-ride -->
660 <!ATTLIST bdo
661 %coreattrs;
662 %events;
663 lang %LanguageCode; #IMPLIED
664 xml:lang %LanguageCode; #IMPLIED
665 dir (ltr|rtl) #REQUIRED
666 >
667
668 <!ELEMENT br EMPTY> <!-- forced line break -->
669 <!ATTLIST br
670 %coreattrs;
671 clear (left|all|right|none) "none"
672 >
673
674 <!ELEMENT em %Inline;> <!-- emphasis -->
675 <!ATTLIST em %attrs;>
676
677 <!ELEMENT strong %Inline;> <!-- strong emphasis -->
678 <!ATTLIST strong %attrs;>
679
680 <!ELEMENT dfn %Inline;> <!-- definitional -->
681 <!ATTLIST dfn %attrs;>
682
683 <!ELEMENT code %Inline;> <!-- program code -->
684 <!ATTLIST code %attrs;>
685
686 <!ELEMENT samp %Inline;> <!-- sample -->
687 <!ATTLIST samp %attrs;>
688
689 <!ELEMENT kbd %Inline;> <!-- something user would type -->
690 <!ATTLIST kbd %attrs;>
691
692 <!ELEMENT var %Inline;> <!-- variable -->
693 <!ATTLIST var %attrs;>
694
695 <!ELEMENT cite %Inline;> <!-- citation -->
696 <!ATTLIST cite %attrs;>
697
698 <!ELEMENT abbr %Inline;> <!-- abbreviation -->
699 <!ATTLIST abbr %attrs;>
700
701 <!ELEMENT acronym %Inline;> <!-- acronym -->
702 <!ATTLIST acronym %attrs;>
703
704 <!ELEMENT q %Inline;> <!-- inlined quote -->
705 <!ATTLIST q
706 %attrs;
707 cite %URI; #IMPLIED
708 >
709
710 <!ELEMENT sub %Inline;> <!-- subscript -->
711 <!ATTLIST sub %attrs;>
712
713 <!ELEMENT sup %Inline;> <!-- superscript -->
714 <!ATTLIST sup %attrs;>
715
716 <!ELEMENT tt %Inline;> <!-- fixed pitch font -->
717 <!ATTLIST tt %attrs;>
718
719 <!ELEMENT i %Inline;> <!-- italic font -->
720 <!ATTLIST i %attrs;>
721
722 <!ELEMENT b %Inline;> <!-- bold font -->
723 <!ATTLIST b %attrs;>
724
725 <!ELEMENT big %Inline;> <!-- bigger font -->
726 <!ATTLIST big %attrs;>
727
728 <!ELEMENT small %Inline;> <!-- smaller font -->
729 <!ATTLIST small %attrs;>
730
731 <!ELEMENT u %Inline;> <!-- underline -->
732 <!ATTLIST u %attrs;>
733
734 <!ELEMENT s %Inline;> <!-- strike-through -->
735 <!ATTLIST s %attrs;>
736
737 <!ELEMENT strike %Inline;> <!-- strike-through -->
738 <!ATTLIST strike %attrs;>
739
740 <!ELEMENT basefont EMPTY> <!-- base font size -->
741 <!ATTLIST basefont
742 id ID #IMPLIED
743 size CDATA #REQUIRED
744 color %Color; #IMPLIED
745 face CDATA #IMPLIED
746 >
747
748 <!ELEMENT font %Inline;> <!-- local change to font -->
749 <!ATTLIST font
750 %coreattrs;
751 %i18n;
752 size CDATA #IMPLIED
753 color %Color; #IMPLIED
754 face CDATA #IMPLIED
755 >
756
757 <!--==================== Object ======================================-->
758 <!--
759 object is used to embed objects as part of HTML pages.
760 param elements should precede other content. Parameters
761 can also be expressed as attribute/value pairs on the
762 object element itself when brevity is desired.
763 -->
764
765 <!ELEMENT object (#PCDATA | param | %block; | form | %inline; | %misc;)*>
766 <!ATTLIST object
767 %attrs;
768 declare (declare) #IMPLIED
769 classid %URI; #IMPLIED
770 codebase %URI; #IMPLIED
771 data %URI; #IMPLIED
772 type %ContentType; #IMPLIED
773 codetype %ContentType; #IMPLIED
774 archive %UriList; #IMPLIED
775 standby %Text; #IMPLIED
776 height %Length; #IMPLIED
777 width %Length; #IMPLIED
778 usemap %URI; #IMPLIED
779 name NMTOKEN #IMPLIED
780 tabindex %Number; #IMPLIED
781 align %ImgAlign; #IMPLIED
782 border %Pixels; #IMPLIED
783 hspace %Pixels; #IMPLIED
784 vspace %Pixels; #IMPLIED
785 >
786
787 <!--
788 param is used to supply a named property value.
789 In XML it would seem natural to follow RDF and support an
790 abbreviated syntax where the param elements are replaced
791 by attribute value pairs on the object start tag.
792 -->
793 <!ELEMENT param EMPTY>
794 <!ATTLIST param
795 id ID #IMPLIED
796 name CDATA #REQUIRED
797 value CDATA #IMPLIED
798 valuetype (data|ref|object) "data"
799 type %ContentType; #IMPLIED
800 >
801
802 <!--=================== Java applet ==================================-->
803 <!--
804 One of code or object attributes must be present.
805 Place param elements before other content.
806 -->
807 <!ELEMENT applet (#PCDATA | param | %block; | form | %inline; | %misc;)*>
808 <!ATTLIST applet
809 %coreattrs;
810 codebase %URI; #IMPLIED
811 archive CDATA #IMPLIED
812 code CDATA #IMPLIED
813 object CDATA #IMPLIED
814 alt %Text; #IMPLIED
815 name NMTOKEN #IMPLIED
816 width %Length; #REQUIRED
817 height %Length; #REQUIRED
818 align %ImgAlign; #IMPLIED
819 hspace %Pixels; #IMPLIED
820 vspace %Pixels; #IMPLIED
821 >
822
823 <!--=================== Images ===========================================-->
824
825 <!--
826 To avoid accessibility problems for people who aren't
827 able to see the image, you should provide a text
828 description using the alt and longdesc attributes.
829 In addition, avoid the use of server-side image maps.
830 -->
831
832 <!ELEMENT img EMPTY>
833 <!ATTLIST img
834 %attrs;
835 src %URI; #REQUIRED
836 alt %Text; #REQUIRED
837 name NMTOKEN #IMPLIED
838 longdesc %URI; #IMPLIED
839 height %Length; #IMPLIED
840 width %Length; #IMPLIED
841 usemap %URI; #IMPLIED
842 ismap (ismap) #IMPLIED
843 align %ImgAlign; #IMPLIED
844 border %Length; #IMPLIED
845 hspace %Pixels; #IMPLIED
846 vspace %Pixels; #IMPLIED
847 >
848
849 <!-- usemap points to a map element which may be in this document
850 or an external document, although the latter is not widely supported -->
851
852 <!--================== Client-side image maps ============================-->
853
854 <!-- These can be placed in the same document or grouped in a
855 separate document although this isn't yet widely supported -->
856
857 <!ELEMENT map ((%block; | form | %misc;)+ | area+)>
858 <!ATTLIST map
859 %i18n;
860 %events;
861 id ID #REQUIRED
862 class CDATA #IMPLIED
863 style %StyleSheet; #IMPLIED
864 title %Text; #IMPLIED
865 name CDATA #IMPLIED
866 >
867
868 <!ELEMENT area EMPTY>
869 <!ATTLIST area
870 %attrs;
871 %focus;
872 shape %Shape; "rect"
873 coords %Coords; #IMPLIED
874 href %URI; #IMPLIED
875 nohref (nohref) #IMPLIED
876 alt %Text; #REQUIRED
877 target %FrameTarget; #IMPLIED
878 >
879
880 <!--================ Forms ===============================================-->
881
882 <!ELEMENT form %form.content;> <!-- forms shouldn't be nested -->
883
884 <!ATTLIST form
885 %attrs;
886 action %URI; #REQUIRED
887 method (get|post) "get"
888 name NMTOKEN #IMPLIED
889 enctype %ContentType; "application/x-www-form-urlencoded"
890 onsubmit %Script; #IMPLIED
891 onreset %Script; #IMPLIED
892 accept %ContentTypes; #IMPLIED
893 accept-charset %Charsets; #IMPLIED
894 target %FrameTarget; #IMPLIED
895 >
896
897 <!--
898 Each label must not contain more than ONE field
899 Label elements shouldn't be nested.
900 -->
901 <!ELEMENT label %Inline;>
902 <!ATTLIST label
903 %attrs;
904 for IDREF #IMPLIED
905 accesskey %Character; #IMPLIED
906 onfocus %Script; #IMPLIED
907 onblur %Script; #IMPLIED
908 >
909
910 <!ENTITY % InputType
911 "(text | password | checkbox |
912 radio | submit | reset |
913 file | hidden | image | button)"
914 >
915
916 <!-- the name attribute is required for all but submit & reset -->
917
918 <!ELEMENT input EMPTY> <!-- form control -->
919 <!ATTLIST input
920 %attrs;
921 %focus;
922 type %InputType; "text"
923 name CDATA #IMPLIED
924 value CDATA #IMPLIED
925 checked (checked) #IMPLIED
926 disabled (disabled) #IMPLIED
927 readonly (readonly) #IMPLIED
928 size CDATA #IMPLIED
929 maxlength %Number; #IMPLIED
930 src %URI; #IMPLIED
931 alt CDATA #IMPLIED
932 usemap %URI; #IMPLIED
933 onselect %Script; #IMPLIED
934 onchange %Script; #IMPLIED
935 accept %ContentTypes; #IMPLIED
936 align %ImgAlign; #IMPLIED
937 >
938
939 <!ELEMENT select (optgroup|option)+> <!-- option selector -->
940 <!ATTLIST select
941 %attrs;
942 name CDATA #IMPLIED
943 size %Number; #IMPLIED
944 multiple (multiple) #IMPLIED
945 disabled (disabled) #IMPLIED
946 tabindex %Number; #IMPLIED
947 onfocus %Script; #IMPLIED
948 onblur %Script; #IMPLIED
949 onchange %Script; #IMPLIED
950 >
951
952 <!ELEMENT optgroup (option)+> <!-- option group -->
953 <!ATTLIST optgroup
954 %attrs;
955 disabled (disabled) #IMPLIED
956 label %Text; #REQUIRED
957 >
958
959 <!ELEMENT option (#PCDATA)> <!-- selectable choice -->
960 <!ATTLIST option
961 %attrs;
962 selected (selected) #IMPLIED
963 disabled (disabled) #IMPLIED
964 label %Text; #IMPLIED
965 value CDATA #IMPLIED
966 >
967
968 <!ELEMENT textarea (#PCDATA)> <!-- multi-line text field -->
969 <!ATTLIST textarea
970 %attrs;
971 %focus;
972 name CDATA #IMPLIED
973 rows %Number; #REQUIRED
974 cols %Number; #REQUIRED
975 disabled (disabled) #IMPLIED
976 readonly (readonly) #IMPLIED
977 onselect %Script; #IMPLIED
978 onchange %Script; #IMPLIED
979 >
980
981 <!--
982 The fieldset element is used to group form fields.
983 Only one legend element should occur in the content
984 and if present should only be preceded by whitespace.
985 -->
986 <!ELEMENT fieldset (#PCDATA | legend | %block; | form | %inline; | %misc;)*>
987 <!ATTLIST fieldset
988 %attrs;
989 >
990
991 <!ENTITY % LAlign "(top|bottom|left|right)">
992
993 <!ELEMENT legend %Inline;> <!-- fieldset label -->
994 <!ATTLIST legend
995 %attrs;
996 accesskey %Character; #IMPLIED
997 align %LAlign; #IMPLIED
998 >
999
1000 <!--
1001 Content is %Flow; excluding a, form, form controls, iframe
1002 -->
1003 <!ELEMENT button %button.content;> <!-- push button -->
1004 <!ATTLIST button
1005 %attrs;
1006 %focus;
1007 name CDATA #IMPLIED
1008 value CDATA #IMPLIED
1009 type (button|submit|reset) "submit"
1010 disabled (disabled) #IMPLIED
1011 >
1012
1013 <!-- single-line text input control (DEPRECATED) -->
1014 <!ELEMENT isindex EMPTY>
1015 <!ATTLIST isindex
1016 %coreattrs;
1017 %i18n;
1018 prompt %Text; #IMPLIED
1019 >
1020
1021 <!--======================= Tables =======================================-->
1022
1023 <!-- Derived from IETF HTML table standard, see [RFC1942] -->
1024
1025 <!--
1026 The border attribute sets the thickness of the frame around the
1027 table. The default units are screen pixels.
1028
1029 The frame attribute specifies which parts of the frame around
1030 the table should be rendered. The values are not the same as
1031 CALS to avoid a name clash with the valign attribute.
1032 -->
1033 <!ENTITY % TFrame "(void|above|below|hsides|lhs|rhs|vsides|box|border)">
1034
1035 <!--
1036 The rules attribute defines which rules to draw between cells:
1037
1038 If rules is absent then assume:
1039 "none" if border is absent or border="0" otherwise "all"
1040 -->
1041
1042 <!ENTITY % TRules "(none | groups | rows | cols | all)">
1043
1044 <!-- horizontal placement of table relative to document -->
1045 <!ENTITY % TAlign "(left|center|right)">
1046
1047 <!-- horizontal alignment attributes for cell contents
1048
1049 char alignment char, e.g. char=':'
1050 charoff offset for alignment char
1051 -->
1052 <!ENTITY % cellhalign
1053 "align (left|center|right|justify|char) #IMPLIED
1054 char %Character; #IMPLIED
1055 charoff %Length; #IMPLIED"
1056 >
1057
1058 <!-- vertical alignment attributes for cell contents -->
1059 <!ENTITY % cellvalign
1060 "valign (top|middle|bottom|baseline) #IMPLIED"
1061 >
1062
1063 <!ELEMENT table
1064 (caption?, (col*|colgroup*), thead?, tfoot?, (tbody+|tr+))>
1065 <!ELEMENT caption %Inline;>
1066 <!ELEMENT thead (tr)+>
1067 <!ELEMENT tfoot (tr)+>
1068 <!ELEMENT tbody (tr)+>
1069 <!ELEMENT colgroup (col)*>
1070 <!ELEMENT col EMPTY>
1071 <!ELEMENT tr (th|td)+>
1072 <!ELEMENT th %Flow;>
1073 <!ELEMENT td %Flow;>
1074
1075 <!ATTLIST table
1076 %attrs;
1077 summary %Text; #IMPLIED
1078 width %Length; #IMPLIED
1079 border %Pixels; #IMPLIED
1080 frame %TFrame; #IMPLIED
1081 rules %TRules; #IMPLIED
1082 cellspacing %Length; #IMPLIED
1083 cellpadding %Length; #IMPLIED
1084 align %TAlign; #IMPLIED
1085 bgcolor %Color; #IMPLIED
1086 >
1087
1088 <!ENTITY % CAlign "(top|bottom|left|right)">
1089
1090 <!ATTLIST caption
1091 %attrs;
1092 align %CAlign; #IMPLIED
1093 >
1094
1095 <!--
1096 colgroup groups a set of col elements. It allows you to group
1097 several semantically related columns together.
1098 -->
1099 <!ATTLIST colgroup
1100 %attrs;
1101 span %Number; "1"
1102 width %MultiLength; #IMPLIED
1103 %cellhalign;
1104 %cellvalign;
1105 >
1106
1107 <!--
1108 col elements define the alignment properties for cells in
1109 one or more columns.
1110
1111 The width attribute specifies the width of the columns, e.g.
1112
1113 width=64 width in screen pixels
1114 width=0.5* relative width of 0.5
1115
1116 The span attribute causes the attributes of one
1117 col element to apply to more than one column.
1118 -->
1119 <!ATTLIST col
1120 %attrs;
1121 span %Number; "1"
1122 width %MultiLength; #IMPLIED
1123 %cellhalign;
1124 %cellvalign;
1125 >
1126
1127 <!--
1128 Use thead to duplicate headers when breaking table
1129 across page boundaries, or for static headers when
1130 tbody sections are rendered in scrolling panel.
1131
1132 Use tfoot to duplicate footers when breaking table
1133 across page boundaries, or for static footers when
1134 tbody sections are rendered in scrolling panel.
1135
1136 Use multiple tbody sections when rules are needed
1137 between groups of table rows.
1138 -->
1139 <!ATTLIST thead
1140 %attrs;
1141 %cellhalign;
1142 %cellvalign;
1143 >
1144
1145 <!ATTLIST tfoot
1146 %attrs;
1147 %cellhalign;
1148 %cellvalign;
1149 >
1150
1151 <!ATTLIST tbody
1152 %attrs;
1153 %cellhalign;
1154 %cellvalign;
1155 >
1156
1157 <!ATTLIST tr
1158 %attrs;
1159 %cellhalign;
1160 %cellvalign;
1161 bgcolor %Color; #IMPLIED
1162 >
1163
1164 <!-- Scope is simpler than headers attribute for common tables -->
1165 <!ENTITY % Scope "(row|col|rowgroup|colgroup)">
1166
1167 <!-- th is for headers, td for data and for cells acting as both -->
1168
1169 <!ATTLIST th
1170 %attrs;
1171 abbr %Text; #IMPLIED
1172 axis CDATA #IMPLIED
1173 headers IDREFS #IMPLIED
1174 scope %Scope; #IMPLIED
1175 rowspan %Number; "1"
1176 colspan %Number; "1"
1177 %cellhalign;
1178 %cellvalign;
1179 nowrap (nowrap) #IMPLIED
1180 bgcolor %Color; #IMPLIED
1181 width %Length; #IMPLIED
1182 height %Length; #IMPLIED
1183 >
1184
1185 <!ATTLIST td
1186 %attrs;
1187 abbr %Text; #IMPLIED
1188 axis CDATA #IMPLIED
1189 headers IDREFS #IMPLIED
1190 scope %Scope; #IMPLIED
1191 rowspan %Number; "1"
1192 colspan %Number; "1"
1193 %cellhalign;
1194 %cellvalign;
1195 nowrap (nowrap) #IMPLIED
1196 bgcolor %Color; #IMPLIED
1197 width %Length; #IMPLIED
1198 height %Length; #IMPLIED
1199 >
1200
+0
-176
src/testdata/swissSmall
0 ID 100K_RAT STANDARD; PRT; 889 AA.
1 AC Q62671;
2 DT 01-NOV-1997 (Rel. 35, Created)
3 DT 01-NOV-1997 (Rel. 35, Last sequence update)
4 DT 15-JUL-1999 (Rel. 38, Last annotation update)
5 DE 100 KD PROTEIN (EC 6.3.2.-).
6 OS Rattus norvegicus (Rat).
7 OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Mammalia;
8 OC Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus.
9 RN [1]
10 RP SEQUENCE FROM N.A.
11 RC STRAIN=WISTAR; TISSUE=TESTIS;
12 RX MEDLINE; 92253337.
13 RA MUELLER D., REHBEIN M., BAUMEISTER H., RICHTER D.;
14 RT "Molecular characterization of a novel rat protein structurally
15 RT related to poly(A) binding proteins and the 70K protein of the U1
16 RT small nuclear ribonucleoprotein particle (snRNP).";
17 RL Nucleic Acids Res. 20:1471-1475(1992).
18 RN [2]
19 RP ERRATUM.
20 RA MUELLER D., REHBEIN M., BAUMEISTER H., RICHTER D.;
21 RL Nucleic Acids Res. 20:2624-2624(1992).
22 CC -!- FUNCTION: E3 UBIQUITIN-PROTEIN LIGASE WHICH ACCEPTS UBIQUITIN FROM
23 CC AN E2 UBIQUITIN-CONJUGATING ENZYME IN THE FORM OF A THIOESTER AND
24 CC THEN DIRECTLY TRANSFERS THE UBIQUITIN TO TARGETED SUBSTRATES (BY
25 CC SIMILARITY). THIS PROTEIN MAY BE INVOLVED IN MATURATION AND/OR
26 CC POST-TRANSCRIPTIONAL REGULATION OF MRNA.
27 CC -!- TISSUE SPECIFICITY: HIGHEST LEVELS FOUND IN TESTIS. ALSO PRESENT
28 CC IN LIVER, KIDNEY, LUNG AND BRAIN.
29 CC -!- DEVELOPMENTAL STAGE: IN EARLY POST-NATAL LIFE, EXPRESSION IN
30 CC THE TESTIS INCREASES TO REACH A MAXIMUM AROUND DAY 28.
31 CC -!- MISCELLANEOUS: A CYSTEINE RESIDUE IS REQUIRED FOR
32 CC UBIQUITIN-THIOLESTER FORMATION.
33 CC -!- SIMILARITY: CONTAINS AN HECT-TYPE E3 UBIQUITIN-PROTEIN LIGASE
34 CC DOMAIN.
35 CC -!- SIMILARITY: A CENTRAL REGION (AA 485-514) IS SIMILAR TO THE
36 CC C-TERMINAL DOMAINS OF MAMMALIAN AND YEAST POLY (A) RNA BINDING
37 CC PROTEINS (PABP).
38 CC -!- SIMILARITY: THE C-TERMINAL HALF SHOWS HIGH SIMILARITY TO
39 CC DROSOPHILA HYPERPLASMIC DISC PROTEIN AND SOME, TO HUMAN E6-AP.
40 CC -!- SIMILARITY: CONTAINS MIXED-CHARGE DOMAINS SIMILAR TO RNA-BINDING
41 CC PROTEINS.
42 CC --------------------------------------------------------------------------
43 CC This SWISS-PROT entry is copyright. It is produced through a collaboration
44 CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
45 CC the European Bioinformatics Institute. There are no restrictions on its
46 CC use by non-profit institutions as long as its content is in no way
47 CC modified and this statement is not removed. Usage by and for commercial
48 CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
49 CC or send an email to license@isb-sib.ch).
50 CC --------------------------------------------------------------------------
51 DR EMBL; X64411; CAA45756.1; -.
52 DR PFAM; PF00632; HECT; 1.
53 DR PFAM; PF00658; PABP; 1.
54 KW Ubiquitin conjugation; Ligase.
55 FT DOMAIN 77 88 ASP/GLU-RICH (ACIDIC).
56 FT DOMAIN 127 150 PRO-RICH.
57 FT DOMAIN 420 439 ARG/GLU-RICH (MIXED CHARGE).
58 FT DOMAIN 448 457 ARG/ASP-RICH (MIXED CHARGE).
59 FT DOMAIN 485 514 PABP-LIKE.
60 FT DOMAIN 579 590 ASP/GLU-RICH (ACIDIC).
61 FT DOMAIN 786 889 HECT DOMAIN.
62 FT DOMAIN 827 847 PRO-RICH.
63 FT BINDING 858 858 UBIQUITIN (BY SIMILARITY).
64 SQ SEQUENCE 889 AA; 100368 MW; DD7E6C7A CRC32;
65 MMSARGDFLN YALSLMRSHN DEHSDVLPVL DVCSLKHVAY VFQALIYWIK AMNQQTTLDT
66 PQLERKRTRE LLELGIDNED SEHENDDDTS QSATLNDKDD ESLPAETGQN HPFFRRSDSM
67 TFLGCIPPNP FEVPLAEAIP LADQPHLLQP NARKEDLFGR PSQGLYSSSA GSGKCLVEVT
68 MDRNCLEVLP TKMSYAANLK NVMNMQNRQK KAGEDQSMLA EEADSSKPGP SAHDVAAQLK
69 SSLLAEIGLT ESEGPPLTSF RPQCSFMGMV ISHDMLLGRW RLSLELFGRV FMEDVGAEPG
70 SILTELGGFE VKESKFRREM EKLRNQQSRD LSLEVDRDRD LLIQQTMRQL NNHFGRRCAT
71 TPMAVHRVKV TFKDEPGEGS GVARSFYTAI AQAFLSNEKL PNLDCIQNAN KGTHTSLMQR
72 LRNRGERDRE REREREMRRS SGLRAGSRRD RDRDFRRQLS IDTRPFRPAS EGNPSDDPDP
73 LPAHRQALGE RLYPRVQAMQ PAFASKITGM LLELSPAQLL LLLASEDSLR ARVEEAMELI
74 VAHGRENGAD SILDLGLLDS SEKVQENRKR HGSSRSVVDM DLDDTDDGDD NAPLFYQPGK
75 RGFYTPRPGK NTEARLNCFR NIGRILGLCL LQNELCPITL NRHVIKVLLG RKVNWHDFAF
76 FDPVMYESLR QLILASQSSD ADAVFSAMDL AFAVDLCKEE GGGQVELIPN GVNIPVTPQN
77 VYEYVRKYAE HRMLVVAEQP LHAMRKGLLD VLPKNSLEDL TAEDFRLLVN GCGEVNVQML
78 ISFTSFNDES GENAEKLLQF KRWFWSIVER MSMTERQDLV YFWTSSPSLP ASEEGFQPMP
79 SITIRPPDDQ HLPTANTCIS RLYVPLYSSK QILKQKLLLA IKTKNFGFV
80 //
81 ID 104K_THEPA STANDARD; PRT; 924 AA.
82 AC P15711;
83 DT 01-APR-1990 (Rel. 14, Created)
84 DT 01-APR-1990 (Rel. 14, Last sequence update)
85 DT 01-AUG-1992 (Rel. 23, Last annotation update)
86 DE 104 KD MICRONEME-RHOPTRY ANTIGEN.
87 OS Theileria parva.
88 OC Eukaryota; Alveolata; Apicomplexa; Piroplasmida; Theileriidae;
89 OC Theileria.
90 RN [1]
91 RP SEQUENCE FROM N.A.
92 RC STRAIN=MUGUGA;
93 RX MEDLINE; 90158697.
94 RA IAMS K.P., YOUNG J.R., NENE V., DESAI J., WEBSTER P.,
95 RA OLE-MOIYOI O.K., MUSOKE A.J.;
96 RT "Characterisation of the gene encoding a 104-kilodalton microneme-
97 RT rhoptry protein of Theileria parva.";
98 RL Mol. Biochem. Parasitol. 39:47-60(1990).
99 CC -!- SUBCELLULAR LOCATION: IN MICRONEME/RHOPTRY COMPLEXES.
100 CC -!- DEVELOPMENTAL STAGE: SPOROZOITE ANTIGEN.
101 CC --------------------------------------------------------------------------
102 CC This SWISS-PROT entry is copyright. It is produced through a collaboration
103 CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
104 CC the European Bioinformatics Institute. There are no restrictions on its
105 CC use by non-profit institutions as long as its content is in no way
106 CC modified and this statement is not removed. Usage by and for commercial
107 CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
108 CC or send an email to license@isb-sib.ch).
109 CC --------------------------------------------------------------------------
110 DR EMBL; M29954; AAA18217.1; -.
111 DR PIR; A44945; A44945.
112 KW Antigen; Sporozoite; Repeat.
113 FT DOMAIN 1 19 HYDROPHOBIC.
114 FT DOMAIN 905 924 HYDROPHOBIC.
115 SQ SEQUENCE 924 AA; 103625 MW; 4563AAA0 CRC32;
116 MKFLILLFNI LCLFPVLAAD NHGVGPQGAS GVDPITFDIN SNQTGPAFLT AVEMAGVKYL
117 QVQHGSNVNI HRLVEGNVVI WENASTPLYT GAIVTNNDGP YMAYVEVLGD PNLQFFIKSG
118 DAWVTLSEHE YLAKLQEIRQ AVHIESVFSL NMAFQLENNK YEVETHAKNG ANMVTFIPRN
119 GHICKMVYHK NVRIYKATGN DTVTSVVGFF RGLRLLLINV FSIDDNGMMS NRYFQHVDDK
120 YVPISQKNYE TGIVKLKDYK HAYHPVDLDI KDIDYTMFHL ADATYHEPCF KIIPNTGFCI
121 TKLFDGDQVL YESFNPLIHC INEVHIYDRN NGSIICLHLN YSPPSYKAYL VLKDTGWEAT
122 THPLLEEKIE ELQDQRACEL DVNFISDKDL YVAALTNADL NYTMVTPRPH RDVIRVSDGS
123 EVLWYYEGLD NFLVCAWIYV SDGVASLVHL RIKDRIPANN DIYVLKGDLY WTRITKIQFT
124 QEIKRLVKKS KKKLAPITEE DSDKHDEPPE GPGASGLPPK APGDKEGSEG HKGPSKGSDS
125 SKEGKKPGSG KKPGPAREHK PSKIPTLSKK PSGPKDPKHP RDPKEPRKSK SPRTASPTRR
126 PSPKLPQLSK LPKSTSPRSP PPPTRPSSPE RPEGTKIIKT SKPPSPKPPF DPSFKEKFYD
127 DYSKAASRSK ETKTTVVLDE SFESILKETL PETPGTPFTT PRPVPPKRPR TPESPFEPPK
128 DPDSPSTSPS EFFTPPESKR TRFHETPADT PLPDVTAELF KEPDVTAETK SPDEAMKRPR
129 SPSEYEDTSP GDYPSLPMKR HRLERLRLTT TEMETDPGRM AKDASGKPVK LKRSKSFDDL
130 TTVELAPEPK ASRIVVDDEG TEADDEETHP PEERQKTEVR RRRPPKKPSK SPRPSKPKKP
131 KKPDSAYIPS ILAILVVSLI VGIL
132 //
133 ID 108_LYCES STANDARD; PRT; 102 AA.
134 AC Q43495;
135 DT 15-JUL-1999 (Rel. 38, Created)
136 DT 15-JUL-1999 (Rel. 38, Last sequence update)
137 DT 15-JUL-1999 (Rel. 38, Last annotation update)
138 DE PROTEIN 108 PRECURSOR.
139 OS Lycopersicon esculentum (Tomato).
140 OC Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
141 OC euphyllophytes; Spermatophyta; Magnoliophyta; eudicotyledons;
142 OC core eudicots; Asteridae; euasterids I; Solanales; Solanaceae;
143 OC Solanum.
144 RN [1]
145 RP SEQUENCE FROM N.A.
146 RC STRAIN=CV. VF36; TISSUE=ANTHER;
147 RX MEDLINE; 94143497.
148 RA CHEN R., SMITH A.G.;
149 RT "Nucleotide sequence of a stamen- and tapetum-specific gene from
150 RT Lycopersicon esculentum.";
151 RL Plant Physiol. 101:1413-1413(1993).
152 CC -!- TISSUE SPECIFICITY: STAMEN- AND TAPETUM-SPECIFIC.
153 CC -!- SIMILARITY: BELONGS TO THE A9 / FIL1 FAMILY.
154 CC --------------------------------------------------------------------------
155 CC This SWISS-PROT entry is copyright. It is produced through a collaboration
156 CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
157 CC the European Bioinformatics Institute. There are no restrictions on its
158 CC use by non-profit institutions as long as its content is in no way
159 CC modified and this statement is not removed. Usage by and for commercial
160 CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
161 CC or send an email to license@isb-sib.ch).
162 CC --------------------------------------------------------------------------
163 DR EMBL; Z14088; CAA78466.1; -.
164 DR MENDEL; 8853; LYCes;1133;1.
165 KW Signal.
166 FT SIGNAL 1 30 POTENTIAL.
167 FT CHAIN 31 102 PROTEIN 108.
168 FT DISULFID 41 77 BY SIMILARITY.
169 FT DISULFID 51 66 BY SIMILARITY.
170 FT DISULFID 67 92 BY SIMILARITY.
171 FT DISULFID 79 99 BY SIMILARITY.
172 SQ SEQUENCE 102 AA; 10576 MW; AFA4875A CRC32;
173 MASVKSSSSS SSSSFISLLL LILLVIVLQS QVIECQPQQS CTASLTGLNV CAPFLVPGSP
174 TASTECCNAV QSINHDCMCN TMRIAAQIPA QCNLPPLSCS AN
175 //