Commit a1e48a616208581c838627f0fde67dd3abba1ac5 - tigr-glimmer

+5

-0

debian/NEWS.Debian

	0	With version 2.13-1, the binaries and man pages all have a prefix "tigr-"
	1	to avoid conflicts with other programs, with extract in particular.
	2
	3	-- Steffen Moeller <moeller@pzr.uni-rostock.de>, Thu, 10 Nov 2004 17:33:46 +0100
	4

+23

-0

debian/README.Debian

	0	tigr-glimmer for Debian
	1	-----------------------
	2
	3	The glimmer software of the TIGR institute was renamed to tigr-glimmer
	4	because of a name conflict with the GNOME library. The package works
	5	for me, most efforts went into the reformatting of the readme files
	6	for the man pages, feedback is welcome.
	7
	8	The upstream authors are very supportive of this debian package for
	9	their software and I thank them for this.
	10
	11	In version 2.13-1, the binaries and man pages all have a prefix "tigr-"
	12	to avoid conflicts with other programs, with extract in particular.
	13	This was changed in the current packaging of version 3.x in favour of
	14	putting the executables under /usr/lib/tigr-glimmer. A wrapper is
	15	provided that lets you call the binaries via
	16
	17	tigr-glimmer <binary>
	18
	19	(see man tigr-glimmer). Alternatively you might add this directory
	20	to your search PATH to call the binaries directly.
	21
	22	-- Steffen Moeller <moeller@debian.org>, Thu, 10 Nov 2004 12:33:46 +0100
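
The PATH alternative mentioned in the README can be sketched as follows. This is a minimal illustration, not part of the package: the helper name prepend_path is made up here, and the directory is the one named in the README.

```shell
#!/bin/sh
# Directory named in README.Debian; adjust if your installation differs.
GLIMMER_LIBDIR=/usr/lib/tigr-glimmer

# prepend_path DIR (hypothetical helper): put DIR at the front of PATH
# unless PATH already lists it.
prepend_path() {
    case ":${PATH}:" in
        *":$1:"*) ;;                 # already present, nothing to do
        *) PATH="$1:${PATH}" ;;
    esac
}

prepend_path "${GLIMMER_LIBDIR}"
export PATH
```

With the directory on PATH, a program such as long-orfs can be started by its plain name; without it, the wrapper form "tigr-glimmer long-orfs <arguments>" does the same job.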

+24

-0

debian/bin/tigr-glimmer

	0	#!/bin/sh
	1
	2	BINDIR=/usr/lib/tigr-glimmer
	3
	4	if [ $# -lt 1 ] ; then
	5	    echo "Usage: $0 <program>" 1>&2
	6	    echo " Existing programs are:" 1>&2
	7	    ls "${BINDIR}" 1>&2
	8	    exit 1
	9	fi
	10
	11	WRAPPER=$0
	12	PROGRAM=$1
	13	shift
	14	# The remaining arguments stay in "$@" so that their quoting is preserved.
	15
	16	if [ -x "${BINDIR}/${PROGRAM}" ]; then
	17	    exec "${BINDIR}/${PROGRAM}" "$@"
	18	else
	19	    echo "Error: ${PROGRAM} does not exist in Tigr Glimmer" 1>&2
	20	    echo " Existing programs are:" 1>&2
	21	    ls "${BINDIR}" 1>&2
	22	    exit 1
	23	fi

+20

-0

debian/bin/tigr-run-glimmer3

	0	#!/bin/sh
	1	echo "run Glimmer3"
	2	clear
	3	echo "Genome is " $1
	4	echo "Find non-overlapping orfs in tmp.coord"
	5	BINDIR="/usr/lib/tigr-glimmer"
	6	rm -f tmp.coord
	7	${BINDIR}/long-orfs $1 | ${BINDIR}/get-putative >tmp.coord
	8	echo "Extract training sequences to tmp.train"
	9	rm -f tmp.train
	10	${BINDIR}/extract $1 tmp.coord >tmp.train
	11	wc tmp.train
	12	echo "Build interpolated context model in tmp.model"
	13	rm -f tmp.model
	14	${BINDIR}/build-icm <tmp.train >tmp.model
	15	echo "Predict genes with Glimmer3 with coordinates in g3.coord"
	16	rm -f g3.coord
	17	# get-putative is not contained in version 3.x any more
	18	# ${BINDIR}/glimmer3 $1 tmp.model | ${BINDIR}/get-putative >g3.coord
	19	${BINDIR}/glimmer3 $1 tmp.model >g3.coord

+82

-0

debian/changelog

	0	tigr-glimmer (3.02-4) unstable; urgency=medium
	1
	2	* moved debian/upstream to debian/upstream/metadata
	3	* cme fix dpkg-control
	4	* Fix crashes reported by Mayhem
	5	Closes: #715701, #715702
	6
	7	-- Andreas Tille <tille@debian.org> Tue, 15 Dec 2015 10:17:14 +0100
	8
	9	tigr-glimmer (3.02-3) unstable; urgency=low
	10
	11	* debian/upstream: publication information
	12	* debian/source/format: 3.0 (quilt)
	13	* debian/control:
	14	- updated homepage URL
	15	- cme fix dpkg-control
	16	- canonical Vcs URLs
	17	- dropped cdbs + quilt from Build-Depends
	18	* debian/README.source: deleted because redundant
	19	* debian/rules: switch from cdbs to dh
	20	* Hardening by dropping Makefile patch in favour of providing
	21	options directly inside debian/rules
	22	* debian/copyright: DEP5
	23	* Verified current build log and noticed that -L/usr/lib is not used
	24	Closes: #722845
	25
	26	-- Andreas Tille <tille@debian.org> Tue, 05 Nov 2013 10:33:48 +0100
	27
	28	tigr-glimmer (3.02-2) unstable; urgency=low
	29
	30	* debian/control:
	31	- Fixed Vcs-Svn (missing svn/)
	32	- Updated Standards-Version to 3.8.1 (no changes needed)
	33	- Standards-Version: 3.8.3 (no changes needed)
	34	- debhelper (>= 7)
	35	* Fixed E-Mail address of upstream author in debian/copyright
	36	* Fix FTBFS on amd64
	37	Closes: #560442
	38	* Added README.source
	39
	40	-- Andreas Tille <tille@debian.org> Thu, 21 Jan 2010 22:52:45 +0100
	41
	42	tigr-glimmer (3.02-1) unstable; urgency=low
	43
	44	[ Charles Plessy ]
	45	* debian/watch:
	46	- Replaced by the new one written by Nelson (Closes: #385258)
	47
	48	[ Andreas Tille ]
	49	* New upstream version
	50	* Group maintenance by Debian-Med team
	51	- DM-Upload-Allowed: Yes
	52	- Vcs tags
	53	- Use correct address as Uploader: Steffen Moeller <moeller@debian.org>
	54	* Standards-Version: 3.7.3 (no changes needed)
	55	* debhelper >= 5
	56	* Moved Homepage from long description to control fields
	57	* Removed [Biology] from short description
	58
	59	-- Andreas Tille <tille@debian.org> Tue, 22 Apr 2008 11:59:07 +0200
	60
	61	tigr-glimmer (2.13-1.1) unstable; urgency=low
	62
	63	* Non-maintainer upload.
	64	* Fix GCC 4.3 compatibility, patch by Kumar Appaiah (Closes: #461691)
	65
	66	-- Moritz Muehlenhoff <jmm@debian.org> Thu, 20 Mar 2008 00:02:06 +0100
	67
	68	tigr-glimmer (2.13-1) unstable; urgency=low
	69
	70	* New upstream release - no significant changes for Linux users.
	71	* Resolves conflict for "extract" binary and man page (Closes:Bug#227790,Bug#274780).
	72
	73	-- Steffen Moeller <moeller@pzr.uni-rostock.de> Wed, 10 Nov 2004 11:58:46 +0100
	74
	75	tigr-glimmer (2.12-1) unstable; urgency=low
	76
	77	* Initial Release (Closes:#219453).
	78	* Added man pages to upstream release
	79
	80	-- Steffen Moeller <moeller@pzr.uni-rostock.de> Thu, 16 Oct 2003 17:33:46 +0200
	81

+1

-0

debian/compat

	0	9

+25

-0

debian/control

	0	Source: tigr-glimmer
	1	Maintainer: Debian Med Packaging Team <debian-med-packaging@lists.alioth.debian.org>
	2	Uploaders: Steffen Moeller <moeller@debian.org>,
	3	           Andreas Tille <tille@debian.org>
	4	Section: science
	5	Priority: optional
	6	Build-Depends: debhelper (>= 9),
	7	               docbook-to-man
	8	Standards-Version: 3.9.6
	9	Vcs-Browser: http://anonscm.debian.org/viewvc/debian-med/trunk/packages/tigr-glimmer/trunk/
	10	Vcs-Svn: svn://anonscm.debian.org/debian-med/trunk/packages/tigr-glimmer/trunk/
	11	Homepage: http://ccb.jhu.edu/software/glimmer/index.shtml
	12
	13	Package: tigr-glimmer
	14	Architecture: any
	15	Depends: ${shlibs:Depends},
	16	         ${misc:Depends}
	17	Description: Gene detection in archaea and bacteria
	18	 Developed by the TIGR institute, this software detects coding sequences in
	19	 bacteria and archaea.
	20	 .
	21	 Glimmer is a system for finding genes in microbial DNA, especially the
	22	 genomes of bacteria and archaea. Glimmer (Gene Locator and Interpolated
	23	 Markov Modeler) uses interpolated Markov models (IMMs) to identify the
	24	 coding regions and distinguish them from noncoding DNA.

+23

-0

debian/copyright

	0	Format: http://www.debian.org/doc/packaging-manuals/copyright-format/1.0/
	1	Upstream-Name: Glimmer
	2	Upstream-Contact: Art Delcher <adelcher@tigr.org>,
	3	                  Steven Salzberg <salzberg@tigr.org>
	4	Source: http://ccb.jhu.edu/software/glimmer/glimmer302b.tar.gz
	5
	6	Files: *
	7	Copyright: © 1999-2008 Art Delcher <adelcher@tigr.org>,
	8	           Steven Salzberg <salzberg@tigr.org>
	9	License: Artistic
	10	 THIS PACKAGE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR
	11	 IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED
	12	 WARRANTIES OF MERCHANTIBILITY AND FITNESS FOR A PARTICULAR PURPOSE.
	13	 On Debian systems, the complete text of the Artistic License
	14	 can be found in `/usr/share/common-licenses/Artistic'.
	15
	16	Files: debian/*
	17	Copyright: © 2003-2004 Steffen Moeller <moeller@debian.org>
	18	           © 2008 Charles Plessy <plessy@debian.org>
	19	           © 2008-2013 Andreas Tille <tille@debian.org>
	20	License: GPL-2+
	21	 On Debian systems, the complete text of the GNU General
	22	 Public License can be found in `/usr/share/common-licenses/GPL'.

+2

-0

debian/docs

	0	docs/notes.pdf
	1	debian/glimmer2_docs

+101

-0

debian/glimmer2_docs/README

	0	This file and all files in this release of the Glimmer system are
	1	copyright (c) 1999 and (c) 2000 by Arthur Delcher, Steven Salzberg,
	2	Simon Kasif, and Owen White. All rights reserved. Redistribution
	3	is not permitted without the express written permission of
	4	the authors.
	5
	6	Glimmer 2.0 is described in:
	7	A.L. Delcher, D. Harmon, S. Kasif, O. White, and S.L. Salzberg.
	8	Improved Microbial Gene Identification with Glimmer.
	9	Nucleic Acids Research, 27 (1999), 4636-4641.
	10	Please reference this paper if you use the system as part of any
	11	published research. Note that Glimmer 1.0 is described in
	12	S. Salzberg, A. Delcher, S. Kasif, and O. White.
	13	Microbial Gene Identification using Interpolated Markov Models.
	14	Nucleic Acids Research, 26:2 (1998), 544-548.
	15
	16	Quickstart: if you just want to run Glimmer 2.0 on your genome
	17	and you don't want to adjust any parameters (although we don't
	18	recommend this), you can simply compile this system and run
	19	it with the included run-glimmer2 script. E.g.:
	20	unix-prompt> make
	21	[various compilation messages appear]
	22	unix-prompt> run-glimmer2 mygenome
	23
	24	run-glimmer2 will create an Interpolated Markov Model of your genome
	25	and store it in a binary file called tmp.model. It will store
	26	the predicted gene coordinates in g2.coord. Along the way
	27	it will extract long ORFs and store them and their coordinates
	28	in tmp.train and tmp.coord.
	29
	30	Recommended: read the readmes.
	31
	32	Glimmer 1.0 had 4 readme files, and Glimmer 2.0 maintains that
	33	structure. The four main programs are:
	34	1. long-orfs
	35	2. extract
	36	3. build-icm
	37	4. glimmer2
	38	There are files called *.readme for each of these programs. Please
	39	read these first before emailing the authors with any questions.
	40
	41	Art Delcher, adelcher@tigr.org, was the primary programmer for
	42	most of the Glimmer 2.0 code, and he can answer most technical
	43	questions.
	44
	45	CHANGELOG, 7/31/00:
	46	- Weak scores are now only invoked with the -w option. Any weak-score
	47	gene is rejected automatically by an overlap with a regular gene.
	48	- Weak-scores genes and "voted" genes are now annotated by [Weak] and
	49	[Vote] in the final listing. Voted genes are those which have a
	50	significant number of relatively high-scoring subregions. Voted
	51	genes also are rejected automatically by overlaps with regular genes.
	52	- Weak scores are computed to be more independent of architecture-dependent
	53	floating-point features. (Previously, 64-bit machines would sometimes
	54	generate different results from 32-bit machines.)
	55	- Fixed bug in RNABin function that occurred when the gene
	56	started on the very last base of the genome. This function is
	57	now not called at all if the Choose_First_Start_Codon option is
	58	selected (which is the default).
	59	- Fixed problem that occurred on short pieces of genome when one
	60	frame (or more) had no stop codons.
	61	- An ignore option (-i) to specify a list of regions in which no predictions
	62	will be made, such as ribosomal RNAs. This feature has not yet been
	63	thoroughly tested.
	64
	65	CHANGELOG, 9 December 2002
	66	- Raw scores are now printed in the main listing and in []'s in
	67	the final list of putative genes
	68	- Add +S option to use a "stricter" independent (intergenic) model
	69	that discounts stop codons. Since only orfs (which have no stop
	70	codons) are ever scored, the independent model is at a disadvantage
	71	unless it also assumes that it is only scoring orfs. Thus, with the
	72	+S option, the independent score is done codon by codon.
	73	The probabilities of codons are initially set to what the
	74	previous independent model would be:
	75	The probability of a codon "atg", for example is:
	76	Pr[a] * Pr[t] * Pr[g]
	77	Then each of these is divided by the sum of the probabilities of the
	78	non-stop codons.
	79	- Add -L option to specify the name of a file containing a list
	80	of coordinates. The genes in these lists are scored separately by
	81	the ICM, output, and then the program stops (i.e., no
	82	overlapping/voting rules).
	83
	84	CHANGELOG, 5 February 2003
	85	- The strict independent (intergenic) model is now the only mode.
	86	The +S option is tolerated but has no effect.
	87
	88	CHANGELOG, 18 April 2003
	89	- Compute the optimal length for minimum "long" orfs, so that the
	90	program will return the largest number of orfs possible. The -g
	91	switch still works if specified, but I don't know why anyone would
	92	want to use that for a training set.
	93	- Change minimum overlap by default to be 0. This means that genes
	94	that overlap even by 1 base will be considered in conflict by Glimmer,
	95	and the program will try to adjust their start codons to remove the
	96	conflict or else delete one of the genes.
	97
	98	CHANGELOG, 7 October 2003
	99	- Fix bug on long-orfs.cc to avoid occasional array out-of-bounds
	100	error (detected on Mac OS X).

+60

-0

debian/glimmer2_docs/build-icm.readme

	0	// Copyright (c) 1997-99 by Arthur Delcher, Steven Salzberg, Simon
	1	// Kasif, and Owen White. All rights reserved. Redistribution
	2	// is not permitted without the express written permission of
	3	// the authors.
	4
	5	Program build-icm.c creates and outputs an interpolated Markov
	6	model (IMM) as described in the paper
	7	A.L. Delcher, D. Harmon, S. Kasif, O. White, and S.L. Salzberg.
	8	Improved Microbial Gene Identification with Glimmer.
	9	Nucleic Acids Research, 1999, in press.
	10	Please reference this paper if you use the system as part of any
	11	published research.
	12
	13	Input comes from the file named on the command-line. Format should be
	14	one string per line. Each line has an ID string followed by white space
	15	followed by the sequence itself. The script run-glimmer2 generates
	16	an input file in the correct format using the 'extract' program.
	17
	18	The IMM is constructed as follows: For a given context, say
	19	acgtta, we want to estimate the probability distribution of the
	20	next character. We shall do this as a linear combination of the
	21	observed probability distributions for this context and all of
	22	its suffixes, i.e., cgtta, gtta, tta, ta, a and empty. By
	23	observed distributions I mean the counts of the number of
	24	occurrences of these strings in the training set. The linear
	25	combination is determined by a set of probabilities, lambda, one
	26	for each context string. For context acgtta the linear combination
	27	coefficients are:
	28	lambda (acgtta)
	29	(1 - lambda (acgtta)) x lambda (cgtta)
	30	(1 - lambda (acgtta)) x (1 - lambda (cgtta)) x lambda (gtta)
	31	(1 - lambda (acgtta)) x (1 - lambda (cgtta)) x (1 - lambda (gtta)) x lambda (tta)
	32	:
	33	(1 - lambda (acgtta)) x (1 - lambda (cgtta)) x (1 - lambda (gtta))
	34	x (1 - lambda (tta)) x (1 - lambda (ta)) x (1 - lambda (a))
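
The coefficients listed above can be computed mechanically from the lambda values. The following is only a sketch in shell and awk, not code from build-icm: the function name imm_weights is made up, and the lambdas are passed from the longest context down to the shortest non-empty one, with the empty context receiving whatever weight is left.

```shell
#!/bin/sh
# imm_weights LAMBDA... (hypothetical helper)
# Arguments: lambda values ordered from the longest context to the
# shortest non-empty context. Prints one coefficient per context and,
# last, the coefficient of the empty context.
imm_weights() {
    printf '%s\n' "$@" | awk '
        BEGIN { rest = 1.0 }              # product of (1 - lambda) so far
        {
            printf "%.4f\n", rest * $1    # coefficient of this context
            rest *= (1 - $1)
        }
        END { printf "%.4f\n", rest }     # coefficient of the empty context
    '
}

# Two contexts with lambda 0.5 each: prints 0.5000, 0.2500, 0.2500
imm_weights 0.5 0.5
```

The printed coefficients always add up to 1, which is what makes the result a probability distribution again.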
	35
	36	We compute the lambda values for each context as follows:
	37	- If the number of observations in the training set is >= the constant
	38	SAMPLE_SIZE_BOUND, the lambda for that context is 1.0
	39	- Otherwise, do a chi-square test on the observations for this context
	40	compared to the distribution predicted for the one-character shorter
	41	suffix context.
	42	If the chi-square significance < 0.5, set the lambda for this context to 0.0
	43	Otherwise set the lambda for this context to:
	44	(chi-square significance) x (# observations) / SAMPLE_WEIGHT
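
The rule for a single lambda can be written down in a few lines. This is a sketch under stated assumptions: the readme names the constants SAMPLE_SIZE_BOUND and SAMPLE_WEIGHT but does not give their values, so the numbers below are placeholders, and the chi-square significance is taken as an input rather than computed. The function name context_lambda is made up.

```shell
#!/bin/sh
# Placeholder values (assumption): the readme does not state them.
SAMPLE_SIZE_BOUND=400
SAMPLE_WEIGHT=400

# context_lambda COUNT SIGNIFICANCE (hypothetical helper)
#   COUNT        number of observations of the context in the training set
#   SIGNIFICANCE chi-square significance against the one-character
#                shorter suffix context, computed elsewhere
context_lambda() {
    awk -v n="$1" -v sig="$2" \
        -v bound="$SAMPLE_SIZE_BOUND" -v weight="$SAMPLE_WEIGHT" 'BEGIN {
        if (n >= bound)      lam = 1.0               # enough data
        else if (sig < 0.5)  lam = 0.0               # not significant
        else                 lam = sig * n / weight  # scaled by sample size
        printf "%.4f\n", lam
    }'
}

context_lambda 500 0.1   # prints 1.0000
context_lambda 100 0.3   # prints 0.0000
context_lambda 100 0.8   # prints 0.2000
```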
	45
	46	To compile the program:
	47
	48	g++ build-icm.c -lm -o build-icm
	49
	50	Uses include files delcher.h context.h strarray.h gene.h
	51
	52	To run the program:
	53
	54	build-icm <train.seq >train.model
	55
	56	This will use the training data in train.seq to produce the file
	57	train.model, containing your IMM.
	58
	59

+55

-0

debian/glimmer2_docs/extract.readme

	0	// Copyright (c) 1997 by Arthur Delcher, Steven Salzberg, Simon
	1	// Kasif, and Owen White. All rights reserved. Redistribution
	2	// is not permitted without the express written permission of
	3	// the authors.
	4
	5	Program extract takes a FASTA format sequence file and a file
	6	with a list of start/stop positions in that file (e.g., as produced
	7	by the long-orfs program) and extracts and outputs the
	8	specified sequences.
	9
	10	The first command-line argument is the name of the sequence file,
	11	which must be in FASTA format.
	12
	13	The second command-line argument is the name of the coordinate file.
	14	It must contain a list of pairs of positions in the first file, one
	15	per line. The format of each entry is:
	16	<IDstring> <start position> <stop position>
	17	This file should contain no other information, so if you're using
	18	the output of glimmer or long-orfs , you'll have to cut off
	19	header lines.
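
Cutting off the header lines can be done with a small filter. The sketch below is an assumption about what such a filter could look like, not a tool shipped with Glimmer: it keeps every line whose second and third fields are plain integers and prints only the first three fields, so trailing notations are dropped as well. The function name coords_only and the sample input are made up.

```shell
#!/bin/sh
# coords_only (hypothetical helper): read text on stdin and print only
# "<IDstring> <start position> <stop position>" lines.
coords_only() {
    awk 'NF >= 3 && $2 ~ /^[0-9]+$/ && $3 ~ /^[0-9]+$/ { print $1, $2, $3 }'
}

# Made-up sample input: two header-like lines, then two coordinate lines.
printf '%s\n' \
    'Some header text' \
    'Putative Genes:' \
    '1 120 980' \
    '2 2500 1400 [Weak]' | coords_only
```

The filtered output could then be saved to a file and passed to extract as its second argument.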
	20
	21	The output of the program goes to the standard output and has one
	22	line for each line in the coordinate file. Each line contains
	23	the IDstring , followed by white space, followed by the substring
	24	of the sequence file specified by the coordinate pair. Specifically,
	25	the substring starts at the first position of the pair and ends at
	26	the second position (inclusive). If the first position is bigger
	27	than the second, then the DNA reverse complement of each position
	28	is generated. Start/stop pairs that "wrap around" the end of the
	29	genome are allowed.
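
The coordinate rule can be illustrated with a short sketch. It is not the extract program itself: the function name extract_region is made up, the sequence is passed as a lowercase string instead of a FASTA file, and pairs that wrap around the end of the genome are not handled.

```shell
#!/bin/sh
# extract_region SEQ START STOP (hypothetical helper)
# Coordinates are 1-based and inclusive. If START is bigger than STOP,
# the reverse complement of the region is printed.
extract_region() {
    awk -v seq="$1" -v a="$2" -v b="$3" 'BEGIN {
        if (a <= b) {                       # forward strand
            print substr(seq, a, b - a + 1)
            exit
        }
        fwd = substr(seq, b, a - b + 1)     # region on the forward strand
        comp["a"] = "t"; comp["c"] = "g"; comp["g"] = "c"; comp["t"] = "a"
        out = ""
        for (i = length(fwd); i >= 1; i--) {
            ch = substr(fwd, i, 1)
            out = out ((ch in comp) ? comp[ch] : "n")  # unknown base -> n
        }
        print out
    }'
}

extract_region acgtac 2 4   # prints cgt
extract_region acgtac 4 2   # prints acg
```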
	30
	31	There are two optional command-line arguments:
	32
	33	-skip makes the output omit the first 3 characters of each sequence,
	34	i.e., it skips over the start codon. This was the default
	35	behaviour of the previous version of the program.
	36
	37	-l n makes the output omit any sequences shorter than n characters.
	38	n includes the 3 skipped characters if the -skip switch
	39	is on.
	40
	41	To compile the program:
	42
	43	g++ extract.c -lm -o extract
	44
	45	Uses include file delcher.h
	46
	47
	48	To run the program:
	49
	50	extract genome.seq list.coord <options>
	51
	52	where genome.seq is a genome sequence in FASTA format and
	53	list.coord is a list of start/stop pairs
	54

+295

-0

debian/glimmer2_docs/glimmer2.readme

	0	// Copyright (c) 1997-99 by Arthur Delcher, Steven Salzberg, Simon
	1	// Kasif, and Owen White. All rights reserved. Redistribution
	2	// is not permitted without the express written permission of
	3	// the authors.
	4
	5	// Version 1.02 revised 25 Feb 98 to ignore the independent
	6	// (random) model for long orfs. The default
	7	// length for "long" in this case is set to the length at which
	8	// exactly 1 orf of this length would be expected per 1 million
	9	// bases given the gc content of the genome. This value also can be
	10	// set by command-line option -q .
	11
	12	// Version 1.03 revised 8 Feb 99 to make it easier to specify
	13	// start and stop codons.
	14
	15	// Version 1.04 revised 10 May 99 to add -l command-line switch
	16	// to both glimmer and long-orfs to regard genome as NOT
	17	// circular. Default is to regard it as circular.
	18	// Version 2.0 uses a tree-based IMM as described in the references
	19	// given in the README file. It also implements an extensive new
	20	// algorithm (see the paper) to adjust the start locations of genes
	21	// whose initial coordinates result in an overlap.
	22
	23	// Version: 2.01 31 Jul 98
	24	// Change probability model
	25	// Simplify wraparounds
	26	// Move start codons to eliminate overlaps
	27	// Discount independent model scores when
	28	// there are no overlaps
	29	// Uses Harmon's model
	30
	31	// Version: 2.03 9 Dec 2002
	32	// Include raw scores in output
	33	// Add strict option to use independent intergenic
	34	// model that discounts stop codons
	35	// Add option to score each entry from a list of coordinates
	36	// separately, without overlapping/voting rules
	37
	38	// Version: 2.10 5 Feb 2003
	39	// Strict option to use independent intergenic
	40	// model that discounts stop codons is only behaviour
	41
	42	// Version: 2.11 18 Apr 2003
	43	// Change long-orfs to automatically compute the
	44	// optimal value of ORF length in order to maximize
	45	// the amount of training data.
	46	Program glimmer takes two inputs: a sequence file (in FASTA format)
	47	and a collection of Markov models for genes as produced by the program
	48	build-icm . It outputs a list of all open reading frames (orfs) together
	49	with scores for each as a gene.
	50
	51	The first few lines of output specify the settings of various
	52	parameters in the program:
	53
	54	Minimum gene length is the length of the smallest fragment
	55	considered to be a gene. The length is measured from the first base
	56	of the start codon to the last base before the stop codon.
	57	This value can be specified when running the program with the -g option.
	58
	59	Minimum overlap length is a lower bound on the number of bases overlap
	60	between 2 genes that is considered a problem. Overlaps shorter than
	61	this are ignored.
	62
	63	Minimum overlap percent is another lower bound on the number of bases
	64	overlap that is considered a problem. Overlaps shorter than this
	65	percentage of both genes are ignored.
	66
	67	Threshold score is the minimum in-frame score for a fragment to be
	68	considered a potential gene.
	69
	70	Use independent scores indicates whether the last column that scores each
	71	fragment using independent base probabilities is present.
	72
	73	Use first start codon indicates whether the first possible start codon
	74	is used or not. If not, the function Choose_Start is called to
	75	choose the start codon. Currently it computes hybridization energy
	76	between the string Ribosome_Pattern and the region in front of
	77	the start codon, and if this is above a threshold, that start site
	78	is chosen. The ribosome pattern string can be set by the -s option.
	79	Presumably function Choose_Start should be modified to do something
	80	cleverer.
	81
	82	Currently used start codons are atg, gtg & ttg . These can be changed
	83	in the function Is_Start , but corresponding changes should be
	84	made in Choose_Start .
	85
	86
	87	The next portion of the output is the result for each orf:
	88
	89	Column 1 is an ID number for reference purposes. It is assigned
	90	sequentially starting with 1 to all orfs whose Gene Score is
	91	at least 90 . I'll make this a command-line option when I decide
	92	what letter to use.
	93
	94	Column 2 is the reading frame of the orf. Three forward (F1, F2 and F3)
	95	and three reverse (R1, R2 and R3). These correspond with the headings
	96	for the scores in columns 9-14.
	97
	98	Column 3 is the start position of the orf, i.e., the first base after
	99	the previous stop codon.
	100
	101	Column 4 is the position of the first base of the first start codon in
	102	the orf. Currently I use atg, ctg, gtg and ttg as start codons.
	103
	104	Column 5 is the position of the last base before the stop codon. Stop
	105	codons are taa, tag, and tga. Note that orfs in the reverse
	106	reading frames have their start position higher than the end position.
	107	The order in which orfs are listed is in increasing order by
	108	Max {OrfStart, End}, i.e., the highest numbered position in the orf,
	109	except for orfs that "wrap around" the end of the sequence.
	110
	111	Columns 6 and 7 are the lengths of the orf and gene, respectively, i.e.,
	112	1 + |OrfStart - End| and 1 + |GeneStart - End|.
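
As a quick check of the length formula, here is a sketch in plain shell arithmetic. The function name span_length is made up, and orfs that wrap around the end of the sequence are not covered.

```shell
#!/bin/sh
# span_length START END (hypothetical helper): prints 1 + |START - END|.
span_length() {
    d=$(( $1 - $2 ))
    if [ "$d" -lt 0 ]; then
        d=$(( 0 - d ))       # absolute value
    fi
    echo $(( d + 1 ))
}

span_length 101 400   # forward frame, prints 300
span_length 400 101   # reverse frame, prints 300
```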
	113
	114	Column 8 is the score for the gene region. It is the probability (as
	115	a percent) that the Markov model in the correct frame generated this
	116	sequence. This value matches the value in the corresponding column
	117	of frame scores--an orf in reading frame R1 has a Gene Score equal to
	118	the value in the R1 column of frame scores for that orf.
	119
	120	Columns 9-14 are the scores for the gene region in each of the 6 reading
	121	frames. It is the probability (as a percent) that the Markov model in
	122	that frame generated this sequence.
	123
	124	Column 15 is the probability as a percent that the gene sequence was generated
	125	by a model of independent probabilities for each base, and represents to
	126	some extent the probability that the sequence is "random".
	127
	128
	129	When two genes with ID numbers overlap by at least a sufficient
	130	amount (as determined by Min_Olap and Min_Olap_Percent ), a line
	131	beginning with *** is printed and scores for the overlap region
	132	are printed. If the frame of the high score of the overlap
	133	region matches the frame of the longer gene, then a message is
	134	printed that the shorter gene is rejected. Otherwise, a message
	135	is printed that both genes are "suspect". A suspect or reject
	136	message for any gene is only printed once, however.
	137
	138	A message is also printed if a gene with an ID number wholly contains another
	139	gene with an ID number. The longer "shadows" the shorter.
	140
	141
	142	At the end a list of "putative" gene positions is produced. The first
	143	column is the ID number, the second is the start position, the third
	144	is the end position. For "suspect" genes, a notation in [] 's follows:
	145
	146	[Bad Olap a b c] means that gene number a overlapped this one and
	147	was shorter but scored higher on the overlap region. b is the length
	148	of the overlap region and c is the score of this gene on the overlap
	149	region. There should be a [Shorter ...] notation with gene a
	150	giving its score.
	151
	152	[Shorter a b c] means that gene number a overlapped this one and
	153	was longer but scored lower on the overlap region. b is the length
	154	of the overlap region and c is the score of this gene on the overlap
	155	region. There should be a [Bad olap ...] notation with gene a
	156	giving its score.
	157
	158	[Shadowed by a] means that this gene was completely contained as part
	159	of gene a 's region, but in another frame.
	160
	161	[Delay by a b c d] means that this gene was tentatively rejected
	162	because of an overlap with gene b , but if the start codon is postponed
	163	by a positions, then this would be a valid gene. The start position
	164	reported for this gene includes the delay. c is the length of the overlap
	165	region that caused the rejection and d is the score in this gene's frame
	166	on that overlap region.
	167
	168	[Weak] means that this gene did not meet the regular scoring threshold,
	169	but if the independent model were ignored, its score would be high
	170	enough. Should only occur if the -w option is used.
	171
	172	[Vote] means that this gene did not meet the regular scoring threshold,
	173	but sufficiently many of its subranges had high enough scores to
	174	indicate it might be a gene.
	175
	176	Note that a gene marked as rejected may appear in this list. This can
	177	occur if the gene that caused the rejection was itself rejected. The
	178	actual algorithm to produce the list is as follows:
	179
	180	Consider the genes in decreasing order by length. If gene x is to
	181	be rejected because of an overlap with longer gene y that has not been
	182	rejected, then gene x is rejected and does not appear in the list.
	183	Otherwise, all notations for gene x that are not caused by rejected
	184	genes are reported.
	185
	186	I think a "delayed" gene might incorrectly be listed as causing a problem
	187	by the part of it that was eliminated by the delay. Probably the remaining
	188	portion should be reinserted into the sorted list based on its now-shorter
	189	length, and any notations caused by it should be re-checked to see if
	190	they're affected by shortening the gene. Let's save this for the next
	191	version.
	192
	193
	194
	195	Specifying Different Start and Stop Codons:
	196
	197	To specify different sets of start and stop codons, modify the file
	198	gene.h . Specifically, the functions:
	199
	200	Is_Forward_Start Is_Reverse_Start Is_Start
	201	Is_Forward_Stop Is_Reverse_Stop Is_Stop
	202
	203	are used to determine what is used for start and stop codons.
	204
	205	Is_Start and Is_Stop do simple string comparisons to specify
	206	which patterns are used. To add a new pattern, just add the comparison
	207	for it. To remove a pattern, comment out or delete the comparison
	208	for it.
	209
	210	The other four functions use a bit comparison to determine start and
	211	stop patterns. They represent a codon as a 12-bit pattern, with 4 bits
	212	for each base, one bit for each possible value of the bases, T, G, C
	213	or A. Thus the bit pattern 0010 0101 1100 represents the base
	214	pattern [C] [A or G] [G or T]. By doing bit operations (& | ~) and
	215	comparisons, more complicated patterns involving ambiguous reads
	216	can be tested efficiently. Simple patterns can be tested as in
	217	the current code.
	218
	219	For example, to insert an additional start codon of CAT requires 3 changes:
	220	1. The line
	221	|| (Codon & 0x218) == Codon
	222	should be inserted into Is_Forward_Start , since 0x218 = 0010 0001 1000
	223	represents CAT.
	224	2. The line
	225	|| (Codon & 0x184) == Codon
	226	should be inserted into Is_Reverse_Start , since 0x184 = 0001 1000 0100
	227	represents ATG, which is the reverse-complement of CAT. Alternately,
	228	the #define constant ATG_MASK could be used.
	229	3. The line
	230	|| strncmp (S, "cat", 3) == 0
	231	should be inserted into Is_Start .
	232	If not automatically using the first start codon, some changes might
	233	also be made to the function Choose_Start .
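
The 12-bit codon encoding can be checked with a short sketch. It only illustrates the bit layout described above (4 bits per base, one bit each for T, G, C and A) and is not code from gene.h; the function names base_bits, codon_mask and matches_pattern are made up.

```shell
#!/bin/sh
# base_bits BASE (hypothetical helper): one bit per base, order T, G, C, A.
base_bits() {
    case "$1" in
        [Tt]) echo 8 ;;   # 1000
        [Gg]) echo 4 ;;   # 0100
        [Cc]) echo 2 ;;   # 0010
        [Aa]) echo 1 ;;   # 0001
        *) return 1 ;;    # anything else is not handled in this sketch
    esac
}

# codon_mask CODON: print the 12-bit pattern of a three-letter codon in hex.
codon_mask() {
    b1=$(base_bits "$(printf '%s' "$1" | cut -c1)") || return 1
    b2=$(base_bits "$(printf '%s' "$1" | cut -c2)") || return 1
    b3=$(base_bits "$(printf '%s' "$1" | cut -c3)") || return 1
    printf '0x%03x\n' $(( (b1 << 8) | (b2 << 4) | b3 ))
}

# matches_pattern CODON_BITS PATTERN_BITS: succeeds when every bit of the
# codon is also set in the pattern, i.e. (Codon & pattern) == Codon.
matches_pattern() {
    [ $(( $1 & $2 )) -eq $(( $1 )) ]
}

codon_mask CAT   # prints 0x218
codon_mask ATG   # prints 0x184
```

For instance, matches_pattern 0x218 0x25C succeeds, because 0x25C is the pattern [C] [A or G] [G or T] from the text and CAT fits it.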
	234
	235
	236
	237	To compile the program:
	238
	239	Use the Makefile. It will put the executables in a bin subdirectory.
	240
	241	To compile just this program use:
	242
	243	g++ glimmer2.c -lm -o glimmer
	244
	245	Uses include files delcher.h context.h strarray.h gene.h
	246
	247
	248	To run the program:
	249
	250	First run build-icm on a set of sequences to make the Markov models.
	251
	252	build-icm <train.seq >train.model
	253
	254	This will produce a file train.model. You can call this file anything
	255	you like, train.model, myicm, itsrainingtoday, etc.
	256
	257	Then run glimmer2
	258
	259	glimmer2 hflu.seq train.model
	260
	261	Options can be specified after the 2nd file name
	262
	263	glimmer2 hflu.seq train.model <options>
	264
	265	Options are:
	266	-f Use ribosome-binding energy to choose start codon. This is
	267	not fully tested and likely to be buggy. Better not to use it.
	268	+f Use first codon in orf as start codon
	269	-g n Set minimum gene length to n
	270	-i s Ignore bases within the coordinates listed in file s. File s
	271	should consist of one base pair per line (no tags), and the ignore
	272	region should be a multiple of three bases long. [Somewhat buggy]
	273	-l Regard the genome as linear (not circular), i.e., do not allow
	274	genes to "wrap around" the end of the genome.
	275	This option works on both glimmer and long-orfs .
	276	The default behavior is to regard the genome as circular.
	277	-o n Set minimum overlap length to n. Overlaps shorter than this
	278	are ignored.
	279	-p n Set minimum overlap percentage to n%. Overlaps shorter than
	280	this percentage of both strings are ignored.
	281	-q n If using independent model scores (+r option), it will only
	282	apply to orfs shorter than n . The default value for n
	283	has an expectation of one orf that length or longer occurring
	284	per million bases in a random genome with the same gc content
	285	-r Don't use independent probability score column
	286	+r Use independent probability score column
	287	-s s Use string s as the ribosome binding pattern to find start codons.
	288	Not fully tested and known to have bugs.
	289	-t n Set threshold score for calling as gene to n. If the in-frame
	290	score >= n, then the region is given a number and considered
	291	a potential gene.
	292	-w n Use "weak" scores on potential genes at least n bases long.
	293	Weak scores ignore the independent model.
	294	-X Allow orfs extending off ends of sequence to be scored

+140

-0

debian/glimmer2_docs/long-orfs.readme

	0	// Copyright (c) 1997-99 by Arthur Delcher, Steven Salzberg, Simon
	1	// Kasif, and Owen White. All rights reserved. Redistribution
	2	// is not permitted without the express written permission of
	3	// the authors.
	4	// Version: 1.1 April 2003 (S. Salzberg)
	5	// Compute the optimal length for minimum "long"
	6	// orfs, so that the program will return the largest
	7	// number of orfs possible. The -g switch still works
	8	// if specified, but I don't know why anyone would want
	9	// to use that for a training set.
	10	// Also, change min overlap by default to be 0.
	11	// Version 1.04 revised 10 May 99 to add -l command-line switch
	12	// to both glimmer and long-orfs to regard genome as NOT
	13	// circular. Default is to regard it as circular.
	14
	15	Program long-orfs takes a sequence file (in FASTA format) and
	16	outputs a list of all long "potential genes" in it that do not
	17	overlap by too much. By "potential gene" I mean the portion of
	18	an orf from the first start codon to the stop codon at the end.
	19
	20	The first few lines of output specify the settings of various
	21	parameters in the program:
	22
	23	Minimum gene length is the length of the smallest fragment
	24	considered to be a gene. The length is measured from the first base
	25	of the start codon to the last base before the stop codon.
	26	This value can be specified when running the program with the -g option.
	27	By default, the program now (April 2003) will compute an optimal length
	28	for this parameter, where "optimal" is the value that produces the
	29	greatest number of long ORFs, thereby increasing the amount of data
	30	used for training.
	31
	32	Minimum overlap length is a lower bound on the number of bases overlap
	33	between 2 genes that is considered a problem. Overlaps shorter than
	34	this are ignored.
	35
	36	Minimum overlap percent is another lower bound on the number of bases
	37	overlap that is considered a problem. Overlaps shorter than this
	38	percentage of both genes are ignored.
	39
	40	The next portion of the output is a list of potential genes:
	41
	42	Column 1 is an ID number for reference purposes. It is assigned
	43	sequentially starting with 1 to all long potential genes. If
	44	overlapping genes are eliminated, gaps in the numbers will occur.
	45	The ID prefix is specified in the constant ID_PREFIX .
	46
	47	Column 2 is the position of the first base of the first start codon in
	48	the orf. Currently I use atg and gtg as start codons. This is
	49	easily changed in the function Is_Start () .
	50
	51	Column 3 is the position of the last base before the stop codon. Stop
	52	codons are taa, tag, and tga. Note that orfs in the reverse
	53	reading frames have their start position higher than the end position.
	54	The order in which orfs are listed is in increasing order by
	55	Max {OrfStart, End}, i.e., the highest numbered position in the orf,
	56	except for orfs that "wrap around" the end of the sequence.
	57
	58	When two genes with ID numbers overlap by at least a sufficient
	59	amount (as determined by Min_Olap and Min_Olap_Percent ), they
	60	are eliminated and do not appear in the output.
	61
	62	The final output of the program (sent to the standard error file so
	63	it does not show up when output is redirected to a file) is the
	64	length of the longest orf found.
	65
	66
	67
	68	Specifying Different Start and Stop Codons:
	69
	70	To specify different sets of start and stop codons, modify the file
	71	gene.h . Specifically, the functions:
	72
	73	Is_Forward_Start Is_Reverse_Start Is_Start
	74	Is_Forward_Stop Is_Reverse_Stop Is_Stop
	75
	76	are used to determine what is used for start and stop codons.
	77
	78	Is_Start and Is_Stop do simple string comparisons to specify
	79	which patterns are used. To add a new pattern, just add the comparison
	80	for it. To remove a pattern, comment out or delete the comparison
	81	for it.
	82
	83	The other four functions use a bit comparison to determine start and
	84	stop patterns. They represent a codon as a 12-bit pattern, with 4 bits
	85	for each base, one bit for each possible value of the bases, T, G, C
	86	or A. Thus the bit pattern 0010 0101 1100 represents the base
	87	pattern [C] [A or G] [G or T]. By doing bit operations (& \| ~) and
	88	comparisons, more complicated patterns involving ambiguous reads
	89	can be tested efficiently. Simple patterns can be tested as in
	90	the current code.
	91
	92	For example, to insert an additional start codon of CAT requires 3 changes:
	93	1. The line
	94	\|\| (Codon & 0x218) == Codon
	95	should be inserted into Is_Forward_Start , since 0x218 = 0010 0001 1000
	96	represents CAT.
	97	2. The line
	98	\|\| (Codon & 0x184) == Codon
	99	should be inserted into Is_Reverse_Start , since 0x184 = 0001 1000 0100
	100	represents ATG, which is the reverse-complement of CAT. Alternately,
	101	the #define constant ATG_MASK could be used.
	102	3. The line
	103	\|\| strncmp (S, "cat", 3) == 0
	104	should be inserted into Is_Start .
	105
	106
	107
	108	To compile the program:
	109
	110	g++ long-orfs.c -lm -o long-orfs
	111
	112	Uses include files delcher.h gene.h
	113
	114
	115	To run the program:
	116
	117	long-orfs genome.seq
	118
	119	where genome.seq is a genome sequence in FASTA format.
	120
	121	Options can be specified after the genome file name
	122
	123	long-orfs genome.seq <options>
	124
	125	Options are:
	126	-g n Set minimum gene length to n. Default is to compute an
	127	optimal value automatically. Don't change this unless you
	128	know what you're doing.
	129	-l Regard the genome as linear (not circular), i.e., do not allow
	130	genes to "wrap around" the end of the genome.
	131	This option works on both glimmer and long-orfs .
	132	The default behavior is to regard the genome as circular.
	133	-o n Set maximum overlap length to n. Overlaps shorter than this
	134	are permitted. (Default is 0 bp.)
	135	-p n Set maximum overlap percentage to n%. Overlaps shorter than
	136	this percentage of both strings are ignored. (Default is 10%.)
	137
	138	If you DON'T want to eliminate overlapping genes, just use the -p 100
	139	option.
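
The 12-bit codon patterns described in long-orfs.readme can be sketched in a few lines of Python. This is only an illustration of the encoding, not the code in gene.h: the names `BASE_BITS`, `encode_codon` and `matches` are hypothetical, and the bit order (T, G, C, A from high to low within each 4-bit group) is inferred from the readme's own examples, where 0x218 stands for CAT and 0x184 for ATG.

```python
# Hypothetical sketch of the 12-bit codon encoding from long-orfs.readme.
# Assumed bit order inside each 4-bit group: T=1000, G=0100, C=0010, A=0001.
BASE_BITS = {"t": 0b1000, "g": 0b0100, "c": 0b0010, "a": 0b0001}

CAT_MASK = 0x218  # 0010 0001 1000, as given in the readme
ATG_MASK = 0x184  # 0001 1000 0100, as given in the readme


def encode_codon(codon):
    """Pack a 3-letter codon into a 12-bit value, 4 bits per base."""
    value = 0
    for base in codon.lower():
        value = (value << 4) | BASE_BITS[base]
    return value


def matches(codon_bits, mask):
    """Same comparison as the readme's `(Codon & mask) == Codon`."""
    return (codon_bits & mask) == codon_bits


# The readme's ambiguous pattern [C] [A or G] [G or T].
AMBIGUOUS_MASK = 0b0010_0101_1100
```

With this encoding, `matches(encode_codon("cag"), AMBIGUOUS_MASK)` holds while `matches(encode_codon("cct"), AMBIGUOUS_MASK)` does not, because the middle base C is not among the allowed values.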

+124

-0

debian/glimmer2_mans/tigr-anomaly.sgml

	0	<!doctype refentry PUBLIC "-//OASIS//DTD DocBook V4.1//EN" [
	1
	2	<!-- Process this file with docbook-to-man to generate an nroff manual
	3	page: `docbook-to-man manpage.sgml > manpage.1'. You may view
	4	the manual page with: `docbook-to-man manpage.sgml \| nroff -man \|
	5	less'. A typical entry in a Makefile or Makefile.am is:
	6
	7	manpage.1: manpage.sgml
	8	docbook-to-man $< > $@
	9
	10
	11	The docbook-to-man binary is found in the docbook-to-man package.
	12	Please remember that if you create the nroff version in one of the
	13	debian/rules file targets (such as build), you will need to include
	14	docbook-to-man in your Build-Depends control field.
	15
	16	-->
	17
	18	<!-- Fill in your name for FIRSTNAME and SURNAME. -->
	19	<!ENTITY dhfirstname "<firstname>Steffen</firstname>">
	20	<!ENTITY dhsurname "<surname>Möller</surname>">
	21	<!-- Please adjust the date whenever revising the manpage. -->
	22	<!ENTITY dhdate "<date>November 10, 2004</date>">
	23	<!ENTITY dhsection "<manvolnum>1</manvolnum>">
	24	<!ENTITY dhemail "<email>moeller@debian.org</email>">
	25	<!ENTITY dhusername "Steffen Moeller">
	26	<!ENTITY dhucpackage "<refentrytitle>TIGR-GLIMMER</refentrytitle>">
	27	<!ENTITY dhpackage "tigr-glimmer">
	28
	29	<!ENTITY debian "<productname>Debian</productname>">
	30	<!ENTITY gnu "<acronym>GNU</acronym>">
	31	<!ENTITY gpl "&gnu; <acronym>GPL</acronym>">
	32	]>
	33
	34	<refentry>
	35	<refentryinfo>
	36	<address>
	37	&dhemail;
	38	</address>
	39	<author>
	40	&dhfirstname;
	41	&dhsurname;
	42	</author>
	43	<copyright>
	44	<year>2003</year>
	45	<holder>&dhusername;</holder>
	46	</copyright>
	47	&dhdate;
	48	</refentryinfo>
	49	<refmeta>
	50	&dhucpackage;
	51
	52	&dhsection;
	53	</refmeta>
	54	<refnamediv>
	55	<refname>&dhpackage;</refname>
	56
	57	<refpurpose>
	58	The program lacks a description
	59	</refpurpose>
	60	</refnamediv>
	61	<refsynopsisdiv>
	62	<cmdsynopsis>
	63	<command>tigr-anomaly</command>
	64	<arg>dna-file</arg>
	65	<arg>coord-file</arg>
	66	</cmdsynopsis>
	67	</refsynopsisdiv>
	68	<refsect1>
	69	<title>DESCRIPTION</title>
	70	<para>
	71	</para>
	72
	73	</refsect1>
	74	<refsect1>
	75	<title>OPTIONS</title>
	76	</refsect1>
	77	<refsect1>
	78	<title>SEE ALSO</title>
	79	<para>
	80	tigr-glimmer3 (1),
	81	tigr-adjust (1),
	82	tigr-anomaly (1),
	83	tigr-build-icm (1),
	84	tigr-check (1),
	85	tigr-codon-usage (1),
	86	tigr-compare-lists (1),
	87	tigr-extract (1),
	88	tigr-generate (1),
	89	tigr-get-len (1),
	90	tigr-get-putative (1),
	91	</para>
	92	<para>
	93	http://www.tigr.org/software/glimmer/
	94	</para>
	95
	96	<para>Please see the readme in /usr/share/doc/tigr-glimmer for a description on how to use Glimmer3.</para>
	97	</refsect1>
	98	<refsect1>
	99	<title>AUTHOR</title>
	100
	101	<para>This manual page was quickly copied from the glimmer web site by &dhusername; &dhemail; for
	102	the &debian; system.
	103	</para>
	104
	105	</refsect1>
	106	</refentry>
	107
	108	<!-- Keep this comment at the end of the file
	109	Local variables:
	110	mode: sgml
	111	sgml-omittag:t
	112	sgml-shorttag:t
	113	sgml-minimize-attributes:nil
	114	sgml-always-quote-attributes:t
	115	sgml-indent-step:2
	116	sgml-indent-data:t
	117	sgml-parent-document:nil
	118	sgml-default-dtd-file:nil
	119	sgml-exposed-tags:nil
	120	sgml-local-catalogs:nil
	121	sgml-local-ecat-files:nil
	122	End:
	123	-->

+162

-0

debian/glimmer2_mans/tigr-build-icm.sgml

	0	<!doctype refentry PUBLIC "-//OASIS//DTD DocBook V4.1//EN" [
	1
	2	<!-- Process this file with docbook-to-man to generate an nroff manual
	3	page: `docbook-to-man manpage.sgml > manpage.1'. You may view
	4	the manual page with: `docbook-to-man manpage.sgml \| nroff -man \|
	5	less'. A typical entry in a Makefile or Makefile.am is:
	6
	7	manpage.1: manpage.sgml
	8	docbook-to-man $< > $@
	9
	10
	11	The docbook-to-man binary is found in the docbook-to-man package.
	12	Please remember that if you create the nroff version in one of the
	13	debian/rules file targets (such as build), you will need to include
	14	docbook-to-man in your Build-Depends control field.
	15
	16	-->
	17
	18	<!-- Fill in your name for FIRSTNAME and SURNAME. -->
	19	<!ENTITY dhfirstname "<firstname>Steffen</firstname>">
	20	<!ENTITY dhsurname "<surname>Möller</surname>">
	21	<!-- Please adjust the date whenever revising the manpage. -->
	22	<!ENTITY dhdate "<date>November 10, 2004</date>">
	23	<!ENTITY dhsection "<manvolnum>1</manvolnum>">
	24	<!ENTITY dhemail "<email>moeller@debian.org</email>">
	25	<!ENTITY dhusername "Steffen Moeller">
	26	<!ENTITY dhucpackage "<refentrytitle>TIGR-GLIMMER</refentrytitle>">
	27	<!ENTITY dhpackage "tigr-glimmer">
	28
	29	<!ENTITY debian "<productname>Debian</productname>">
	30	<!ENTITY gnu "<acronym>GNU</acronym>">
	31	<!ENTITY gpl "&gnu; <acronym>GPL</acronym>">
	32	]>
	33
	34	<refentry>
	35	<refentryinfo>
	36	<address>
	37	&dhemail;
	38	</address>
	39	<author>
	40	&dhfirstname;
	41	&dhsurname;
	42	</author>
	43	<copyright>
	44	<year>2003</year>
	45	<holder>&dhusername;</holder>
	46	</copyright>
	47	&dhdate;
	48	</refentryinfo>
	49	<refmeta>
	50	&dhucpackage;
	51
	52	&dhsection;
	53	</refmeta>
	54	<refnamediv>
	55	<refname>&dhpackage;</refname>
	56	<refpurpose>Creates and outputs an interpolated Markov model (IMM)</refpurpose>
	57	</refnamediv>
	58	<refsynopsisdiv>
	59	<cmdsynopsis>
	60	<command>tigr-build-icm</command>
	61	</cmdsynopsis>
	62	</refsynopsisdiv>
	63	<refsect1>
	64	<title>DESCRIPTION</title>
	65	<para>
	66	Program build-icm.c creates and outputs an interpolated Markov
	67	model (IMM) as described in the paper
	68	A.L. Delcher, D. Harmon, S. Kasif, O. White, and S.L. Salzberg.
	69	Improved Microbial Gene Identification with Glimmer.
	70	Nucleic Acids Research, 1999, in press.
	71	Please reference this paper if you use the system as part of any
	72	published research.
	73	</para><para>
	74	Input comes from the file named on the command-line. Format should be
	75	one string per line. Each line has an ID string followed by white space
	76	followed by the sequence itself. The script run-glimmer3 generates
	77	an input file in the correct format using the 'extract' program.
	78	</para><para>
	79	The IMM is constructed as follows: For a given context, say
	80	acgtta, we want to estimate the probability distribution of the
	81	next character. We shall do this as a linear combination of the
	82	observed probability distributions for this context and all of
	83	its suffixes, i.e., cgtta, gtta, tta, ta, a and empty. By
	84	observed distributions I mean the counts of the number of
	85	occurrences of these strings in the training set. The linear
	86	combination is determined by a set of probabilities, lambda, one
	87	for each context string. For context acgtta the linear combination
	88	coefficients are:
	89	</para><para>
	90	lambda (acgtta)
	91	(1 - lambda (acgtta)) x lambda (cgtta)
	92	(1 - lambda (acgtta)) x (1 - lambda (cgtta)) x lambda (gtta)
	93	(1 - lambda (acgtta)) x (1 - lambda (cgtta)) x (1 - lambda (gtta)) x lambda (tta)
	94	(1 - lambda (acgtta)) x (1 - lambda (cgtta)) x (1 - lambda (gtta))
	95	x (1 - lambda (tta)) x (1 - lambda (ta)) x (1 - lambda (a))
	96	</para><para>
	97	We compute the lambda values for each context as follows:
	98	- If the number of observations in the training set is >= the constant
	99	SAMPLE_SIZE_BOUND, the lambda for that context is 1.0
	100	- Otherwise, do a chi-square test on the observations for this context
	101	compared to the distribution predicted for the one-character shorter
	102	suffix context.
	103	If the chi-square significance < 0.5, set the lambda for this context to 0.0
	104	Otherwise set the lambda for this context to:
	105	(chi-square significance) x (# observations) / SAMPLE_WEIGHT
	106	</para><para>
	107	To run the program:
	108	</para><para>
	109	build-icm <train.seq > train.model
	110	</para><para>
	111	This will use the training data in train.seq to produce the file
	112	train.model, containing your IMM.
	113	</para>
	114	</refsect1>
	115	<refsect1>
	116	<title>SEE ALSO</title>
	117	<para>
	118	tigr-glimmer3 (1),
	119	tigr-long-orfs (1),
	120	tigr-adjust (1),
	121	tigr-anomaly (1),
	122	tigr-extract (1),
	123	tigr-check (1),
	124	tigr-codon-usage (1),
	125	tigr-compare-lists (1),
	126	tigr-extract (1),
	127	tigr-generate (1),
	128	tigr-get-len (1),
	129	tigr-get-putative (1),
	130	</para>
	131	<para>http://www.tigr.org/software/glimmer/</para>
	132	<para>Please see the readme in /usr/share/doc/tigr-glimmer for a description on how to use Glimmer3.</para>
	133	</refsect1>
	134	<refsect1>
	135	<title>AUTHOR</title>
	136
	137	<para>This manual page was quickly copied from the glimmer web site and readme file by &dhusername; &dhemail; for
	138	the &debian; system.
	139	</para>
	140
	141	</refsect1>
	142	</refentry>
	143
	144	<!-- Keep this comment at the end of the file
	145	Local variables:
	146	mode: sgml
	147	sgml-omittag:t
	148	sgml-shorttag:t
	149	sgml-minimize-attributes:nil
	150	sgml-always-quote-attributes:t
	151	sgml-indent-step:2
	152	sgml-indent-data:t
	153	sgml-parent-document:nil
	154	sgml-default-dtd-file:nil
	155	sgml-exposed-tags:nil
	156	sgml-local-catalogs:nil
	157	sgml-local-ecat-files:nil
	158	End:
	159	-->
	160
	161
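
The lambda rule and the linear combination described in the tigr-build-icm page can be sketched as follows. This is a minimal illustration under stated assumptions, not the code of build-icm.c: the function names are hypothetical, the values of SAMPLE_SIZE_BOUND and SAMPLE_WEIGHT are not given in the page and are therefore passed in as parameters, the chi-square significance is taken as an already computed input, and the empty context is assumed to have lambda 1 so that the coefficients sum to 1.

```python
def context_lambda(n_obs, chi2_significance, sample_size_bound, sample_weight):
    """Lambda for one context, following the rule in the man page.

    sample_size_bound and sample_weight stand in for the constants
    SAMPLE_SIZE_BOUND and SAMPLE_WEIGHT, whose values are not stated there.
    """
    if n_obs >= sample_size_bound:
        return 1.0
    if chi2_significance < 0.5:
        return 0.0
    return chi2_significance * n_obs / sample_weight


def interpolation_weights(lambdas):
    """Coefficients of the linear combination.

    lambdas is ordered from the longest context (e.g. acgtta) down to the
    empty context. Each weight is that context's lambda multiplied by
    (1 - lambda) of every longer context.
    """
    weights = []
    remaining = 1.0
    for lam in lambdas:
        weights.append(remaining * lam)
        remaining *= 1.0 - lam
    return weights
```

For example, `interpolation_weights([0.5, 0.5, 1.0])` gives `[0.5, 0.25, 0.25]`: half of the estimate comes from the longest context, and the rest is split between the two shorter ones.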

+165

-0

debian/glimmer2_mans/tigr-extract.sgml

	0	<!doctype refentry PUBLIC "-//OASIS//DTD DocBook V4.1//EN" [
	1
	2	<!-- Process this file with docbook-to-man to generate an nroff manual
	3	page: `docbook-to-man manpage.sgml > manpage.1'. You may view
	4	the manual page with: `docbook-to-man manpage.sgml \| nroff -man \|
	5	less'. A typical entry in a Makefile or Makefile.am is:
	6
	7	manpage.1: manpage.sgml
	8	docbook-to-man $< > $@
	9
	10
	11	The docbook-to-man binary is found in the docbook-to-man package.
	12	Please remember that if you create the nroff version in one of the
	13	debian/rules file targets (such as build), you will need to include
	14	docbook-to-man in your Build-Depends control field.
	15
	16	-->
	17
	18	<!-- Fill in your name for FIRSTNAME and SURNAME. -->
	19	<!ENTITY dhfirstname "<firstname>Steffen</firstname>">
	20	<!ENTITY dhsurname "<surname>Möller</surname>">
	21	<!-- Please adjust the date whenever revising the manpage. -->
	22	<!ENTITY dhdate "<date>November 10, 2004</date>">
	23	<!ENTITY dhsection "<manvolnum>1</manvolnum>">
	24	<!ENTITY dhemail "<email>moeller@debian.org</email>">
	25	<!ENTITY dhusername "Steffen Moeller">
	26	<!ENTITY dhucpackage "<refentrytitle>TIGR-GLIMMER</refentrytitle>">
	27	<!ENTITY dhpackage "tigr-glimmer">
	28
	29	<!ENTITY debian "<productname>Debian</productname>">
	30	<!ENTITY gnu "<acronym>GNU</acronym>">
	31	<!ENTITY gpl "&gnu; <acronym>GPL</acronym>">
	32	]>
	33
	34	<refentry>
	35	<refentryinfo>
	36	<address>
	37	&dhemail;
	38	</address>
	39	<author>
	40	&dhfirstname;
	41	&dhsurname;
	42	</author>
	43	<copyright>
	44	<year>2003</year>
	45	<holder>&dhusername;</holder>
	46	</copyright>
	47	&dhdate;
	48	</refentryinfo>
	49	<refmeta>
	50	&dhucpackage;
	51
	52	&dhsection;
	53	</refmeta>
	54	<refnamediv>
	55	<refname>&dhpackage;</refname>
	56
	57	<refpurpose>
	58	Extract sequences at given start/stop positions from a genome sequence
	59	</refpurpose>
	60	</refnamediv>
	61	<refsynopsisdiv>
	62	<cmdsynopsis>
	63	<command>tigr-extract</command>
	64	<arg>genome-file <option><replaceable>options</replaceable></option></arg>
	65	</cmdsynopsis>
	66	</refsynopsisdiv>
	67	<refsect1>
	68	<title>DESCRIPTION</title>
	69	<para>
	70	Program extract takes a FASTA format sequence file and a file
	71	with a list of start/stop positions in that file (e.g., as produced
	72	by the long-orfs program) and extracts and outputs the
	73	specified sequences.
	74	</para><para>
	75	The first command-line argument is the name of the sequence file,
	76	which must be in FASTA format.
	77	</para><para>
	78	The second command-line argument is the name of the coordinate file.
	79	It must contain a list of pairs of positions in the first file, one
	80	per line. The format of each entry is:
	81	</para><para> <IDstring> <start position> <stop position>
	82	</para><para>This file should contain no other information, so if you're using
	83	the output of glimmer or long-orfs , you'll have to cut off
	84	header lines.
	85	</para><para>
	86	The output of the program goes to the standard output and has one
	87	line for each line in the coordinate file. Each line contains
	88	the IDstring , followed by white space, followed by the substring
	89	of the sequence file specified by the coordinate pair. Specifically,
	90	the substring starts at the first position of the pair and ends at
	91	the second position (inclusive). If the first position is bigger
	92	than the second, then the DNA reverse complement of each position
	93	is generated. Start/stop pairs that "wrap around" the end of the
	94	genome are allowed.
	95	</para>
	96	</refsect1>
	97	<refsect1>
	98	<title>OPTIONS</title>
	99	<variablelist>
	100	<varlistentry>
	101	<term><option>-skip</option></term>
	102	<listitem>
	103	<para> makes the output omit the first 3 characters of each sequence, i.e., it skips over the start codon. This was the behaviour of the previous version of the program.</para>
	104	</listitem>
	105	</varlistentry>
	106	<varlistentry>
	107	<term><option>-l</option></term><listitem><para>
	108	makes the output omit any sequences shorter than n characters.
	109	n includes the 3 skipped characters if the -skip switch
	110	is on.
	111	</para></listitem>
	112	</varlistentry>
	113	</variablelist>
	114	</refsect1>
	115	<refsect1>
	116	<title>SEE ALSO</title>
	117	<para>
	118	tigr-glimmer3 (1),
	119	tigr-long-orfs (1),
	120	tigr-adjust (1),
	121	tigr-anomaly (1),
	122	tigr-build-icm (1),
	123	tigr-check (1),
	124	tigr-codon-usage (1),
	125	tigr-compare-lists (1),
	126	tigr-extract (1),
	127	tigr-generate (1),
	128	tigr-get-len (1),
	129	tigr-get-putative (1),
	130	</para>
	131	<para>
	132	http://www.tigr.org/software/glimmer/
	133	</para>
	134
	135	<para>Please see the readme in /usr/share/doc/tigr-glimmer for a description on how to use Glimmer3.</para>
	136	</refsect1>
	137	<refsect1>
	138	<title>AUTHOR</title>
	139
	140	<para>This manual page was quickly copied from the glimmer web site by &dhusername; &dhemail; for
	141	the &debian; system.
	142	</para>
	143
	144	</refsect1>
	145	</refentry>
	146
	147	<!-- Keep this comment at the end of the file
	148	Local variables:
	149	mode: sgml
	150	sgml-omittag:t
	151	sgml-shorttag:t
	152	sgml-minimize-attributes:nil
	153	sgml-always-quote-attributes:t
	154	sgml-indent-step:2
	155	sgml-indent-data:t
	156	sgml-parent-document:nil
	157	sgml-default-dtd-file:nil
	158	sgml-exposed-tags:nil
	159	sgml-local-catalogs:nil
	160	sgml-local-ecat-files:nil
	161	End:
	162	-->
	163
	164
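
The coordinate handling described in the tigr-extract page can be sketched like this. It is a simplified illustration, not the extract program itself: the function name `extract_region` and its `skip_start` parameter are hypothetical, positions are assumed to be 1-based and inclusive, and pairs that "wrap around" the end of the genome are deliberately not handled.

```python
# Translation table for complementing DNA bases (both cases).
COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def extract_region(genome, start, stop, skip_start=False):
    """Return the substring between two 1-based, inclusive positions.

    If start is bigger than stop, the reverse complement of the region
    is returned. skip_start mimics the -skip switch by dropping the
    first 3 characters (the start codon). Wraparound is not handled.
    """
    if start <= stop:
        seq = genome[start - 1:stop]
    else:
        seq = genome[stop - 1:start].translate(COMPLEMENT)[::-1]
    return seq[3:] if skip_start else seq
```

For the toy sequence `aaatgcccgggtaa`, the pair (3, 8) yields `atgccc` and the reversed pair (8, 3) yields its reverse complement `gggcat`.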

+246

-0

debian/glimmer2_mans/tigr-glimmer3.sgml

	0	<!doctype refentry PUBLIC "-//OASIS//DTD DocBook V4.1//EN" [
	1
	2	<!-- Process this file with docbook-to-man to generate an nroff manual
	3	page: `docbook-to-man manpage.sgml > manpage.1'. You may view
	4	the manual page with: `docbook-to-man manpage.sgml \| nroff -man \|
	5	less'. A typical entry in a Makefile or Makefile.am is:
	6
	7	manpage.1: manpage.sgml
	8	docbook-to-man $< > $@
	9
	10
	11	The docbook-to-man binary is found in the docbook-to-man package.
	12	Please remember that if you create the nroff version in one of the
	13	debian/rules file targets (such as build), you will need to include
	14	docbook-to-man in your Build-Depends control field.
	15
	16	-->
	17
	18	<!-- Fill in your name for FIRSTNAME and SURNAME. -->
	19	<!ENTITY dhfirstname "<firstname>Steffen</firstname>">
	20	<!ENTITY dhsurname "<surname>Möller</surname>">
	21	<!-- Please adjust the date whenever revising the manpage. -->
	22	<!ENTITY dhdate "<date>November 10, 2004</date>">
	23	<!ENTITY dhsection "<manvolnum>1</manvolnum>">
	24	<!ENTITY dhemail "<email>moeller@debian.org</email>">
	25	<!ENTITY dhusername "Steffen Moeller">
	26	<!ENTITY dhucpackage "<refentrytitle>TIGR-GLIMMER</refentrytitle>">
	27	<!ENTITY dhpackage "tigr-glimmer">
	28
	29	<!ENTITY debian "<productname>Debian</productname>">
	30	<!ENTITY gnu "<acronym>GNU</acronym>">
	31	<!ENTITY gpl "&gnu; <acronym>GPL</acronym>">
	32	]>
	33
	34	<refentry>
	35	<refentryinfo>
	36	<address>
	37	&dhemail;
	38	</address>
	39	<author>
	40	&dhfirstname;
	41	&dhsurname;
	42	</author>
	43	<copyright>
	44	<year>2003</year>
	45	<holder>&dhusername;</holder>
	46	</copyright>
	47	&dhdate;
	48	</refentryinfo>
	49	<refmeta>
	50	&dhucpackage;
	51
	52	&dhsection;
	53	</refmeta>
	54	<refnamediv>
	55	<refname>&dhpackage;</refname>
	56	<refpurpose>
	57	Find/Score potential genes in genome-file using the probability model in icm-file
	58	</refpurpose>
	59	</refnamediv>
	60	<refsynopsisdiv>
	61	<cmdsynopsis>
	62	<command>tigr-glimmer3</command>
	63	<arg><option><replaceable>genome-file</replaceable></option></arg>
	64	<arg><option><replaceable>icm-file</replaceable></option></arg>
	65	<arg><option><replaceable>[options]</replaceable></option></arg>
	66	</cmdsynopsis>
	67	</refsynopsisdiv>
	68	<refsect1>
	69	<title>DESCRIPTION</title>
	70	<para>
	71	<command>&dhpackage;</command> is a system for finding genes in microbial DNA, especially the genomes of bacteria and archaea. <command>&dhpackage;</command> (Gene Locator and Interpolated Markov Modeler) uses interpolated Markov models (IMMs) to identify the coding regions and distinguish them from noncoding DNA. The IMM approach, described in our Nucleic Acids Research paper on <command>&dhpackage;</command> 1.0 and in our subsequent paper on <command>&dhpackage;</command> 2.0, uses a combination of Markov models from 1st through 8th-order, weighting each model according to its predictive power. <command>&dhpackage;</command> 1.0 and 2.0 use 3-periodic nonhomogenous Markov models in their IMMs.
	72	</para><para>
	73	<command>&dhpackage;</command> is the primary microbial gene finder at TIGR, and has been used to annotate the complete genomes of B. burgdorferi (Fraser et al., Nature, Dec. 1997), T. pallidum (Fraser et al., Science, July 1998), T. maritima, D. radiodurans, M. tuberculosis, and non-TIGR projects including C. trachomatis, C. pneumoniae, and others. Its analyses of some of these genomes and others are available at the TIGR microbial database site.
	74	</para><para>
	75	A special version of <command>&dhpackage;</command> designed for small eukaryotes, GlimmerM, was used to find the genes in chromosome 2 of the malaria parasite, P. falciparum. GlimmerM is described in S.L. Salzberg, M. Pertea, A.L. Delcher, M.J. Gardner, and H. Tettelin, "Interpolated Markov models for eukaryotic gene finding," Genomics 59 (1999), 24-31. The GlimmerM site (http://www.tigr.org/software/glimmerm/) includes information on how to download the GlimmerM system.
	76	</para><para>
	77	The <command>&dhpackage;</command> system consists of two main programs. The first of these is the training program, build-imm. This program takes an input set of sequences and builds and outputs the IMM for them. These sequences can be complete genes or just partial orfs. For a new genome, this training data can consist of those genes with strong database hits as well as very long open reading frames that are statistically almost certain to be genes. The second program is glimmer, which uses this IMM to identify putative genes in an entire genome. <command>&dhpackage;</command> automatically resolves conflicts between most overlapping genes by choosing one of them. It also identifies genes that are suspected to truly overlap, and flags these for closer inspection by the user. These ``suspect'' gene candidates have been a very small percentage of the total for all the genomes analyzed thus far.
	78	</para>
	79	</refsect1>
	80	<refsect1>
	81	<title>OPTIONS</title>
	82	<variablelist>
	83	<varlistentry>
	84	<term><option>-C <replaceable>n</replaceable></option></term>
	85	<listitem>
	86	<para>Use n as GC percentage of independent model</para>
	87	<para>Note: n should be a percentage, e.g., -C 45.2</para>
	88	</listitem>
	89	</varlistentry>
	90	<varlistentry>
	91	<term><option>-f</option></term><listitem><para>Use ribosome-binding energy to choose start codon</para></listitem>
	92	</varlistentry>
	93	<varlistentry>
	94	<term><option>+f</option></term><listitem><para>Use first codon in orf as start codon</para></listitem>
	95	</varlistentry>
	96	<varlistentry>
	97	<term><option>-g <replaceable>n</replaceable></option></term><listitem><para>Set minimum gene length to n</para></listitem>
	98	</varlistentry>
	99	<varlistentry>
	100	<term><option>-i <replaceable>filename</replaceable></option></term>
	101	<listitem>
	102	<para>Use <option><replaceable>filename</replaceable></option>
	103	to select regions of bases that are off
	104	limits, so that no bases within that area will be examined
	105	</para>
	106	</listitem>
	107	</varlistentry>
	108	<varlistentry>
	109	<term><option>-l</option></term>
	110	<listitem><para>Assume linear rather than circular genome, i.e., no wraparound</para></listitem>
	111	</varlistentry>
	112	<varlistentry>
	113	<term><option>-L <replaceable>filename</replaceable></option></term>
	114	<listitem><para>Use filename to specify a list of orfs that should
	115	be scored separately, with no overlap rules
	116	</para></listitem>
	117	</varlistentry>
	118	<varlistentry>
	119	<term><option>-M</option></term>
	120	<listitem><para>Input is a multifasta file of separate genes to be scored
	121	separately, with no overlap rules
	122	</para>
	123	</listitem>
	124	</varlistentry>
	125	<varlistentry>
	126	<term><option>-o <replaceable>n</replaceable></option></term>
	127	<listitem>
	128	<para>Set minimum overlap length to n. Overlaps shorter than this
	129	are ignored.
	130	</para></listitem>
	131	</varlistentry>
	132	<varlistentry>
	133	<term><option>-p <replaceable>n</replaceable></option></term>
	134	<listitem>
	135	<para>
	136	Set minimum overlap percentage to n%. Overlaps shorter than this percentage of both strings are ignored.
	137	</para>
	138	</listitem>
	139	</varlistentry>
	140	<varlistentry>
	141	<term><option>-q <replaceable>n</replaceable></option></term>
	142	<listitem>
	143	<para>Set the maximum length orf that can be rejected because of
	144	the independent probability score column to (n - 1)
	145	</para>
	146	</listitem>
	147	</varlistentry>
	148	<varlistentry>
	149	<term><option>-r</option></term>
	150	<listitem>
	151	<para>
	152	Don't use independent probability score column
	153	</para>
	154	</listitem>
	155	</varlistentry>
	156	<varlistentry>
	157	<term><option>+r</option></term>
	158	<listitem><para>
	159	Use independent probability score column
	160	</para>
	161	</listitem>
	162	</varlistentry>
	163	<varlistentry>
	169	<term><option>-s <replaceable>s</replaceable></option></term>
	170	<listitem><para> Use string s as the ribosome binding pattern to find start codons.</para>
	171	</listitem>
	172	</varlistentry>
	173	<varlistentry>
	174	<term><option>+S</option></term>
	175	<listitem>
	176	<para>
	177	Use the stricter independent intergenic model that doesn't
	178	give probabilities to in-frame stop codons. (This option is obsolete
	179	since this is now the only behaviour.)
	180	</para> </listitem>
	181	</varlistentry>
	182	<varlistentry>
	183	<term><option>-t <replaceable>n</replaceable></option></term>
	184	<listitem><para>
	185	Set threshold score for calling as gene to n. If the in-frame
	186	score >= n, then the region is given a number and considered
	187	a potential gene.
	188	</para> </listitem>
	189	</varlistentry>
	190	<varlistentry>
	191	<term><option>-w <replaceable>n</replaceable> </option></term>
	192	<listitem><para>
	193	Use "weak" scores on tentative genes n or longer. Weak
	194	scores ignore the independent probability score.
	195	</para></listitem>
	196	</varlistentry>
	197	</variablelist>
	198	</refsect1>
	199	<refsect1>
	200	<title>SEE ALSO</title>
	201	<para>
	202	tigr-adjust (1),
	203	tigr-anomaly (1),
	204	tigr-build-icm (1),
	205	tigr-check (1),
	206	tigr-codon-usage (1),
	207	tigr-compare-lists (1),
	208	tigr-extract (1),
	209	tigr-generate (1),
	210	tigr-get-len (1),
	211	tigr-get-putative (1),
	212	tigr-glimmer3 (1),
	213	tigr-long-orfs (1)
	214	</para>
	215	<para>
	216	http://www.tigr.org/software/glimmer/
	217	</para>
	218	<para>Please see the readme in /usr/share/doc/tigr-glimmer for a description on how to use Glimmer.</para>
	219	</refsect1>
	220	<refsect1>
	221	<title>AUTHOR</title>
	222	<para>This manual page was quickly copied from the glimmer web site by &dhusername; &dhemail; for
	223	the &debian; system.
	224	</para>
	225	</refsect1>
	226	</refentry>
	227
	228	<!-- Keep this comment at the end of the file
	229	Local variables:
	230	mode: sgml
	231	sgml-omittag:t
	232	sgml-shorttag:t
	233	sgml-minimize-attributes:nil
	234	sgml-always-quote-attributes:t
	235	sgml-indent-step:2
	236	sgml-indent-data:t
	237	sgml-parent-document:nil
	238	sgml-default-dtd-file:nil
	239	sgml-exposed-tags:nil
	240	sgml-local-catalogs:nil
	241	sgml-local-ecat-files:nil
	242	End:
	243	-->
	244
	245

+238

-0

debian/glimmer2_mans/tigr-long-orfs.sgml

	0	<!doctype refentry PUBLIC "-//OASIS//DTD DocBook V4.1//EN" [
	1
	2	<!-- Process this file with docbook-to-man to generate an nroff manual
	3	page: `docbook-to-man manpage.sgml > manpage.1'. You may view
	4	the manual page with: `docbook-to-man manpage.sgml \| nroff -man \|
	5	less'. A typical entry in a Makefile or Makefile.am is:
	6
	7	manpage.1: manpage.sgml
	8	docbook-to-man $< > $@
	9
	10
	11	The docbook-to-man binary is found in the docbook-to-man package.
	12	Please remember that if you create the nroff version in one of the
	13	debian/rules file targets (such as build), you will need to include
	14	docbook-to-man in your Build-Depends control field.
	15
	16	-->
	17
	18	<!-- Fill in your name for FIRSTNAME and SURNAME. -->
	19	<!ENTITY dhfirstname "<firstname>Steffen</firstname>">
	20	<!ENTITY dhsurname "<surname>Möller</surname>">
	21	<!-- Please adjust the date whenever revising the manpage. -->
	22	<!ENTITY dhdate "<date>November 10, 2004</date>">
	23	<!ENTITY dhsection "<manvolnum>1</manvolnum>">
	24	<!ENTITY dhemail "<email>moeller@debian.org</email>">
	25	<!ENTITY dhusername "Steffen Moeller">
	26	<!ENTITY dhucpackage "<refentrytitle>LONG-ORFS</refentrytitle>">
	27	<!ENTITY dhpackage "long-orfs">
	28
	29	<!ENTITY debian "<productname>Debian</productname>">
	30	<!ENTITY gnu "<acronym>GNU</acronym>">
	31	<!ENTITY gpl "&gnu; <acronym>GPL</acronym>">
	32	]>
	33
	34	<refentry>
	35	<refentryinfo>
	36	<address>
	37	&dhemail;
	38	</address>
	39	<author>
	40	&dhfirstname;
	41	&dhsurname;
	42	</author>
	43	<copyright>
	44	<year>2003</year>
	45	<holder>&dhusername;</holder>
	46	</copyright>
	47	&dhdate;
	48	</refentryinfo>
	49	<refmeta>
	50	&dhucpackage;
	51
	52	&dhsection;
	53	</refmeta>
	54	<refnamediv>
	55	<refname>&dhpackage;</refname>
	56
	57	<refpurpose>
	58	Find/Score potential genes in genome-file using
	59	the probability model in icm-file
	60	</refpurpose>
	61	</refnamediv>
	62	<refsynopsisdiv>
	63	<cmdsynopsis>
	64	<command>tigr-long-orfs</command>
	65	<arg>genome-file <option><replaceable>options</replaceable></option></arg>
	66	</cmdsynopsis>
	67	</refsynopsisdiv>
	68	<refsect1>
	69	<title>DESCRIPTION</title>
	70	<para>
	71	Program long-orfs takes a sequence file (in FASTA format) and
	72	outputs a list of all long "potential genes" in it that do not
	73	overlap by too much. By "potential gene" I mean the portion of
	74	an orf from the first start codon to the stop codon at the end.
	75	</para><para>
	76	The first few lines of output specify the settings of various
	77	parameters in the program:
	78	</para><para>
	79	Minimum gene length is the length of the smallest fragment
	80	considered to be a gene. The length is measured from the first base
	81	of the start codon to the last base before the stop codon.
	82	This value can be specified when running the program with the -g option.
	83	By default, the program now (April 2003) will compute an optimal length
	84	for this parameter, where "optimal" is the value that produces the
	85	greatest number of long ORFs, thereby increasing the amount of data
	86	used for training.
	87	</para><para>
	88	Minimum overlap length is a lower bound on the number of bases overlap
	89	between 2 genes that is considered a problem. Overlaps shorter than
	90	this are ignored.
	91	</para><para>
	92	Minimum overlap percent is another lower bound on the number of bases
	93	overlap that is considered a problem. Overlaps shorter than this
	94	percentage of both genes are ignored.
	95	</para><para>
	96	The next portion of the output is a list of potential genes:
	97	</para><para>
	98	Column 1 is an ID number for reference purposes. It is assigned
	99	sequentially starting with 1 to all long potential genes. If
	100	overlapping genes are eliminated, gaps in the numbers will occur.
	101	The ID prefix is specified in the constant ID_PREFIX .
	102	</para><para>
	103	Column 2 is the position of the first base of the first start codon in
	104	the orf. Currently I use atg, and gtg as start codons. This is
	105	easily changed in the function Is_Start () .
	106	</para><para>
	107	Column 3 is the position of the last base before the stop codon. Stop
	108	codons are taa, tag, and tga. Note that orfs in the reverse
	109	reading frames have their start position higher than the end position.
	110	The order in which orfs are listed is in increasing order by
	111	Max {OrfStart, End}, i.e., the highest numbered position in the orf,
	112	except for orfs that "wrap around" the end of the sequence.
	113	</para><para>
	114	When two genes with ID numbers overlap by at least a sufficient
	115	amount (as determined by Min_Olap and Min_Olap_Percent ), they
	116	are eliminated and do not appear in the output.
	117	</para><para>
	118	The final output of the program (sent to the standard error file so
	119	it does not show up when output is redirected to a file) is the
	120	length of the longest orf found.
	121	</para><para>
	122
	123
	124	Specifying Different Start and Stop Codons:
	125	</para><para>
	126	To specify different sets of start and stop codons, modify the file
	127	gene.h . Specifically, the functions:
	128	</para><para>
	129	Is_Forward_Start Is_Reverse_Start Is_Start
	130	Is_Forward_Stop Is_Reverse_Stop Is_Stop
	131	</para><para>
	132	are used to determine what is used for start and stop codons.
	133	</para><para>
	134	Is_Start and Is_Stop do simple string comparisons to specify
	135	which patterns are used. To add a new pattern, just add the comparison
	136	for it. To remove a pattern, comment out or delete the comparison
	137	for it.
	138	</para><para>
	139	The other four functions use a bit comparison to determine start and
	140	stop patterns. They represent a codon as a 12-bit pattern, with 4 bits
	141	for each base, one bit for each possible value of the bases, T, G, C
	142	or A. Thus the bit pattern 0010 0101 1100 represents the base
	143	pattern [C] [A or G] [G or T]. By doing bit operations (& | ~) and
	144	comparisons, more complicated patterns involving ambiguous reads
	145	can be tested efficiently. Simple patterns can be tested as in
	146	the current code.
	147	</para><para>
	148	For example, to insert an additional start codon of CAT requires 3 changes:
	149	1. The line
	150	|| (Codon & 0x218) == Codon
	151	should be inserted into Is_Forward_Start , since 0x218 = 0010 0001 1000
	152	represents CAT.
	153	2. The line
	154	|| (Codon & 0x184) == Codon
	155	should be inserted into Is_Reverse_Start , since 0x184 = 0001 1000 0100
	156	represents ATG, which is the reverse-complement of CAT. Alternately,
	157	the #define constant ATG_MASK could be used.
	158	3. The line
	159	|| strncmp (S, "cat", 3) == 0
	160	should be inserted into Is_Start .
	161	</para>
	162
	163	</refsect1>
	164	<refsect1>
	165	<title>OPTIONS</title>
	166	<variablelist>
	167	<varlistentry>
	168	<term><option>-g <replaceable>n</replaceable></option></term>
	169	<listitem>
	170	<para> Set minimum gene length to n. Default is to compute an
	171	optimal value automatically. Don't change this unless you
	172	know what you're doing.</para>
	173	</listitem>
	174	</varlistentry>
	175	<varlistentry>
	176	<term><option>-l</option></term><listitem><para>Regard the genome as linear (not circular), i.e., do not allow
	177	genes to "wrap around" the end of the genome.
	178	This option works on both glimmer and long-orfs .
	179	The default behavior is to regard the genome as circular.</para></listitem>
	180	</varlistentry>
	181	<varlistentry>
	182	<term><option>-o <replaceable>n</replaceable></option></term><listitem><para>Set maximum overlap length to n. Overlaps shorter than this
	183	are permitted. (Default is 0 bp.)</para></listitem>
	184	</varlistentry>
	185	<varlistentry>
	186	<term><option>-p <replaceable>n</replaceable></option></term><listitem><para>Set maximum overlap percentage to n%. Overlaps shorter than
	187	this percentage of both strings are ignored. (Default is 10%.)</para></listitem>
	188	</varlistentry>
	189	</variablelist>
	190	</refsect1>
	191	<refsect1>
	192	<title>SEE ALSO</title>
	193	<para>
	194	tigr-glimmer3 (1),
	195	tigr-adjust (1),
	196	tigr-anomaly (1),
	197	tigr-build-icm (1),
	198	tigr-check (1),
	199	tigr-codon-usage (1),
	200	tigr-compare-lists (1),
	201	tigr-extract (1),
	202	tigr-generate (1),
	203	tigr-get-len (1),
	204	tigr-get-putative (1)
	205	</para>
	206	<para>
	207	http://www.tigr.org/software/glimmer/
	208	</para>
	209
	210	<para>Please see the readme in /usr/share/doc/tigr-glimmer for a description on how to use Glimmer3.</para>
	211	</refsect1>
	212	<refsect1>
	213	<title>AUTHOR</title>
	214
	215	<para>This manual page was quickly copied from the glimmer web site by &dhusername; &dhemail; for
	216	the &debian; system.
	217	</para>
	218
	219	</refsect1>
	220	</refentry>
	221
	222	<!-- Keep this comment at the end of the file
	223	Local variables:
	224	mode: sgml
	225	sgml-omittag:t
	226	sgml-shorttag:t
	227	sgml-minimize-attributes:nil
	228	sgml-always-quote-attributes:t
	229	sgml-indent-step:2
	230	sgml-indent-data:t
	231	sgml-parent-document:nil
	232	sgml-default-dtd-file:nil
	233	sgml-exposed-tags:nil
	234	sgml-local-catalogs:nil
	235	sgml-local-ecat-files:nil
	236	End:
	237	-->
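
The 12-bit codon encoding described in the tigr-long-orfs man page above can be sketched roughly as follows. This is an illustration only, not the code from gene.h: the names `base_bits`, `encode_codon` and `matches` are hypothetical, and only the bit layout (one bit each for T, G, C, A per base) and the `(Codon & mask) == Codon` comparison are taken from the man page text.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// One bit per possible base value, in the order T, G, C, A given in the man page.
constexpr std::uint16_t T_BIT = 0x8;
constexpr std::uint16_t G_BIT = 0x4;
constexpr std::uint16_t C_BIT = 0x2;
constexpr std::uint16_t A_BIT = 0x1;

// Hypothetical helper: 4-bit value of one unambiguous base, 0 for anything else.
std::uint16_t base_bits(char b) {
    switch (b) {
        case 't': case 'T': return T_BIT;
        case 'g': case 'G': return G_BIT;
        case 'c': case 'C': return C_BIT;
        case 'a': case 'A': return A_BIT;
        default:            return 0;
    }
}

// Hypothetical helper: pack a three-letter codon into 12 bits,
// first base in the highest four bits.
std::uint16_t encode_codon(const std::string& codon) {
    assert(codon.size() == 3);
    return static_cast<std::uint16_t>(
        (base_bits(codon[0]) << 8) | (base_bits(codon[1]) << 4) | base_bits(codon[2]));
}

// The comparison quoted in the man page: true when the codon sets no bit
// outside the mask. An all-zero codon passes trivially, so callers that can
// see unrecognized bases would need a separate check.
bool matches(std::uint16_t codon, std::uint16_t mask) {
    return (codon & mask) == codon;
}
```

With these definitions, `encode_codon("cat")` gives 0x218 and `encode_codon("atg")` gives 0x184, the two values quoted in the man page, and a mask such as 0x25C accepts any codon of the form [C] [A or G] [G or T].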

+120

-0

debian/glimmer2_mans/tigr-run-glimmer3.sgml

	0	<!doctype refentry PUBLIC "-//OASIS//DTD DocBook V4.1//EN" [
	1
	2	<!-- Process this file with docbook-to-man to generate an nroff manual
	3	page: `docbook-to-man manpage.sgml > manpage.1'. You may view
	4	the manual page with: `docbook-to-man manpage.sgml | nroff -man |
	5	less'. A typical entry in a Makefile or Makefile.am is:
	6
	7	manpage.1: manpage.sgml
	8	docbook-to-man $< > $@
	9
	10
	11	The docbook-to-man binary is found in the docbook-to-man package.
	12	Please remember that if you create the nroff version in one of the
	13	debian/rules file targets (such as build), you will need to include
	14	docbook-to-man in your Build-Depends control field.
	15
	16	-->
	17
	18	<!-- Fill in your name for FIRSTNAME and SURNAME. -->
	19	<!ENTITY dhfirstname "<firstname>Steffen</firstname>">
	20	<!ENTITY dhsurname "<surname>Möller</surname>">
	21	<!-- Please adjust the date whenever revising the manpage. -->
	22	<!ENTITY dhdate "<date>November 10, 2004</date>">
	23	<!ENTITY dhsection "<manvolnum>1</manvolnum>">
	24	<!ENTITY dhemail "<email>moeller@debian.org</email>">
	25	<!ENTITY dhusername "Steffen Moeller">
	26	<!ENTITY dhucpackage "<refentrytitle>TIGR-GLIMMER</refentrytitle>">
	27	<!ENTITY dhpackage "tigr-glimmer">
	28
	29	<!ENTITY debian "<productname>Debian</productname>">
	30	<!ENTITY gnu "<acronym>GNU</acronym>">
	31	<!ENTITY gpl "&gnu; <acronym>GPL</acronym>">
	32	]>
	33
	34	<refentry>
	35	<refentryinfo>
	36	<address>
	37	&dhemail;
	38	</address>
	39	<author>
	40	&dhfirstname;
	41	&dhsurname;
	42	</author>
	43	<copyright>
	44	<year>2003</year>
	45	<holder>&dhusername;</holder>
	46	</copyright>
	47	&dhdate;
	48	</refentryinfo>
	49	<refmeta>
	50	&dhucpackage;
	51
	52	&dhsection;
	53	</refmeta>
	54	<refnamediv>
	55	<refname>&dhpackage;</refname>
	56
	57	<refpurpose>
	58	Apply the suite of programs within glimmer3 to a prokaryotic or archaeal genome.
	59	</refpurpose>
	60	</refnamediv>
	61	<refsynopsisdiv>
	62	<cmdsynopsis>
	63	<command>tigr-run-glimmer3</command>
	64	</cmdsynopsis>
	65	</refsynopsisdiv>
	66	<refsect1>
	67	<title>DESCRIPTION</title>
	68	<para>
	69	A shell script that wraps a set of tigr-* utilities of the glimmer package to retrieve coding regions.
	70	</para>
	71	</refsect1>
	72	<refsect1>
	73	<title>SEE ALSO</title>
	74	<para>
	75	tigr-glimmer3 (1),
	76	tigr-adjust (1),
	77	tigr-anomaly (1),
	78	tigr-build-icm (1),
	79	tigr-check (1),
	80	tigr-codon-usage (1),
	81	tigr-compare-lists (1),
	82	tigr-extract (1),
	83	tigr-generate (1),
	84	tigr-get-len (1),
	85	tigr-get-putative (1),
	86	tigr-long-orfs (1)
	87	</para>
	88	<para>
	89	http://www.tigr.org/software/glimmer/
	90	</para>
	91
	92	<para>Please see the readme in /usr/share/doc/tigr-glimmer for a description on how to use Glimmer3.</para>
	93	</refsect1>
	94	<refsect1>
	95	<title>AUTHOR</title>
	96
	97	<para>This manual page was quickly copied from the glimmer web site by &dhusername; &dhemail; for
	98	the &debian; system.
	99	</para>
	100
	101	</refsect1>
	102	</refentry>
	103
	104	<!-- Keep this comment at the end of the file
	105	Local variables:
	106	mode: sgml
	107	sgml-omittag:t
	108	sgml-shorttag:t
	109	sgml-minimize-attributes:nil
	110	sgml-always-quote-attributes:t
	111	sgml-indent-step:2
	112	sgml-indent-data:t
	113	sgml-parent-document:nil
	114	sgml-default-dtd-file:nil
	115	sgml-exposed-tags:nil
	116	sgml-local-catalogs:nil
	117	sgml-local-ecat-files:nil
	118	End:
	119	-->

+2

-0

debian/install

	0	bin/* usr/lib/tigr-glimmer
	1	debian/bin/* usr/bin

+2

-0

debian/manpages

	0	debian/*.1
	1	debian/glimmer2_mans/*.1

+175

-0

debian/patches/10_gcc4.3.patch

	0	Author: Kumar Appaiah
	1	Description: Fix #461691
	2
	3	--- a/src/Common/delcher.cc
	4	+++ b/src/Common/delcher.cc
	5	@@ -9,6 +9,7 @@
	6
	7	#include "delcher.hh"
	8
	9	+#include <cstring>
	10
	11	const int COMMATIZE_BUFF_LEN = 50;
	12	// Length of buffer for creating string with commas
	13	--- a/src/Common/fasta.cc
	14	+++ b/src/Common/fasta.cc
	15	@@ -9,7 +9,7 @@
	16
	17	#include "fasta.hh"
	18
	19	-
	20	+#include <cstring>
	21
	22	void Fasta_Print
	23	(FILE * fp, const char * s, const char * hdr, int fasta_width)
	24	--- a/src/Common/gene.cc
	25	+++ b/src/Common/gene.cc
	26	@@ -10,6 +10,7 @@
	27	#include "delcher.hh"
	28	#include "gene.hh"
	29
	30	+#include <cstring>
	31
	32	static const char COMPLEMENT_TABLE []
	33	= "nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn"
	34	--- a/src/Glimmer/anomaly.cc
	35	+++ b/src/Glimmer/anomaly.cc
	36	@@ -12,6 +12,7 @@
	37
	38	#include "anomaly.hh"
	39
	40	+#include <cstring>
	41
	42	// Global variables
	43
	44	--- a/src/ICM/icm.cc
	45	+++ b/src/ICM/icm.cc
	46	@@ -15,6 +15,8 @@
	47
	48	#include "icm.hh"
	49
	50	+#include <cstring>
	51	+
	52	using namespace std;
	53
	54	extern int Verbose;
	55	--- a/src/Util/entropy-score.cc
	56	+++ b/src/Util/entropy-score.cc
	57	@@ -9,7 +9,7 @@
	58	// regions in it by entropy distance. Results are output
	59	// to stdout .
	60
	61	-
	62	+#include <cstring>
	63
	64	#include "entropy-score.hh"
	65
	66	--- a/src/Glimmer/glimmer3.cc
	67	+++ b/src/Glimmer/glimmer3.cc
	68	@@ -12,11 +12,10 @@
	69	// Copyright (c) 2006 University of Maryland Center for Bioinformatics
	70	// & Computational Biology
	71
	72	-
	73	+#include <cstring>
	74
	75	#include "glimmer3.hh"
	76
	77	-
	78	static int For_Edwin = 0;
	79
	80
	81	--- a/src/ICM/build-icm.cc
	82	+++ b/src/ICM/build-icm.cc
	83	@@ -13,6 +13,7 @@
	84
	85	#include "build-icm.hh"
	86
	87	+#include <cstring>
	88
	89	static int Genbank_Xlate_Code = 0;
	90	// Holds the Genbank translation table number that determines
	91	--- a/src/Util/extract.cc
	92	+++ b/src/Util/extract.cc
	93	@@ -9,7 +9,7 @@
	94	// sequences specified by coordinates. The resulting sequences
	95	// are output (in multifasta or two-string format) to stdout.
	96
	97	-
	98	+#include <cstring>
	99
	100	#include "extract.hh"
	101
	102	--- a/src/Glimmer/glimmer2.cc
	103	+++ b/src/Glimmer/glimmer2.cc
	104	@@ -37,6 +37,7 @@
	105	#include "delcher.h"
	106	#include "gene.h"
	107
	108	+#include <cstring>
	109
	110	const int DEFAULT_MIN_GENE_LEN = 90;
	111	const double DEFAULT_MIN_OLAP_PERCENT = 0.10;
	112	--- a/src/Glimmer/long-orfs.cc
	113	+++ b/src/Glimmer/long-orfs.cc
	114	@@ -15,7 +15,7 @@
	115
	116	#include "long-orfs.hh"
	117
	118	-
	119	+#include <cstring>
	120
	121	// External variables
	122
	123	--- a/src/ICM/build-fixed.cc
	124	+++ b/src/ICM/build-fixed.cc
	125	@@ -12,6 +12,7 @@
	126
	127	#include "build-fixed.hh"
	128
	129	+#include <cstring>
	130
	131	static FILE * Index_File_fp = NULL;
	132	// File containing a list of subscripts of strings to train model
	133	--- a/src/ICM/score-fixed.cc
	134	+++ b/src/ICM/score-fixed.cc
	135	@@ -8,6 +8,7 @@
	136
	137	#include "score-fixed.hh"
	138
	139	+#include <cstring>
	140
	141	static char * Pos_Model_Path;
	142	// Name of file containing the positive model
	143	--- a/src/Util/multi-extract.cc
	144	+++ b/src/Util/multi-extract.cc
	145	@@ -10,7 +10,7 @@
	146	// resulting sequences are output (in multifasta or two-string format)
	147	// to stdout.
	148
	149	-
	150	+#include <cstring>
	151
	152	#include "multi-extract.hh"
	153
	154	--- a/src/Util/start-codon-distrib.cc
	155	+++ b/src/Util/start-codon-distrib.cc
	156	@@ -17,6 +17,7 @@
	157
	158	#include "start-codon-distrib.hh"
	159
	160	+#include <cstring>
	161
	162	// External variables
	163
	164	--- a/src/Util/uncovered.cc
	165	+++ b/src/Util/uncovered.cc
	166	@@ -10,7 +10,7 @@
	167	// specified in the file named as the second command-line argument.
	168	// Output is a multifasta file sent to stdout.
	169
	170	-
	171	+#include <cstring>
	172
	173	#include "uncovered.hh"
	174

+25

-0

debian/patches/10_gcc4.4.patch

	0	Author: Andreas Tille <tille@debian.org>
	1	Description: Fix FTBFS #560442
	2
	3	--- a/src/Common/gene.cc
	4	+++ b/src/Common/gene.cc
	5	@@ -444,7 +444,7 @@ int Char_Sub
	6	// Return a subscript corresponding to character ch .
	7
	8	{
	9	- char * p;
	10	+ const char * p;
	11
	12	p = strchr (CONVERSION_STRING, tolower (ch));
	13	if (p == NULL)
	14	--- a/src/ICM/icm.cc
	15	+++ b/src/ICM/icm.cc
	16	@@ -1983,7 +1983,7 @@ int Subscript
	17	// model) for character ch .
	18
	19	{
	20	- char * p;
	21	+ const char * p;
	22
	23	p = strchr (ALPHA_STRING, tolower (Filter (ch)));
	24	if (p == NULL)
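
The reason for the `const char * p` change can be illustrated with a small sketch. In C++, the overload of `strchr` that takes a `const char *` also returns a `const char *`, so assigning its result to a plain `char *` is rejected by compilers whose headers declare that overload. The function name `char_sub`, the `alphabet` parameter, and the -1 return value below are assumptions for illustration; they are not taken from gene.cc or icm.cc.

```cpp
#include <cassert>
#include <cctype>
#include <cstring>

// Hypothetical lookup: return the index of ch (case-insensitive) in alphabet,
// or -1 if it is not present. The -1 convention is an assumption of this sketch.
int char_sub(const char* alphabet, char ch) {
    assert(alphabet != nullptr);
    if (ch == '\0')
        return -1;  // strchr would otherwise match the terminating NUL

    // The result must be const char *: the searched string is const.
    const char* p =
        std::strchr(alphabet, std::tolower(static_cast<unsigned char>(ch)));
    if (p == nullptr)
        return -1;
    return static_cast<int>(p - alphabet);
}
```

For example, `char_sub("acgt", 'G')` returns 2, and a character outside the alphabet returns -1.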

+140

-0

debian/patches/mayhem.patch

	0	Author: Andreas Tille <tille@debian.org>
	1	Last-Update: Mon, 14 Dec 2015 16:44:19 +0100
	2	Bug-Debian: http://bugs.debian.org/715701,
	3	http://bugs.debian.org/715702
	4	Description: Fix crashes reported by Mayhem
	5	See http://www.drpaulcarter.com/cs/common-c-errors.php#4.1
	6	to make fgetc() safer. However, the original problem is
	7	that for empty strings no space at all is allocated. This is
	8	now done in advance.
	9
	10	--- a/src/ICM/build-fixed.cc
	11	+++ b/src/ICM/build-fixed.cc
	12	@@ -234,20 +234,24 @@ static int Read_String
	13	{
	14	int ch, ct;
	15
	16	- while ((ch = fgetc (fp)) != EOF && ch != '>')
	17	+ while ((ch = fgetc (fp)) != EOF && ch != ((int) '>'))
	18	;
	19
	20	if (ch == EOF)
	21	return FALSE;
	22
	23	ct = 0;
	24	- while ((ch = fgetc (fp)) != EOF && ch != '\n' && isspace (ch))
	25	+ while ((ch = fgetc (fp)) != EOF && ch != ((int) '\n') && isspace (ch))
	26	;
	27	if (ch == EOF)
	28	return FALSE;
	29	- if (ch != '\n' && ! isspace (ch))
	30	+ if (ch != ((int) '\n') && ! isspace (ch))
	31	ungetc (ch, fp);
	32	- while ((ch = fgetc (fp)) != EOF && ch != '\n')
	33	+ if (tag_size == 0 ) {
	34	+ tag_size += INCR_SIZE;
	35	+ tag = (char *) Safe_realloc (tag, tag_size);
	36	+ }
	37	+ while ((ch = fgetc (fp)) != EOF && ch != ((int) '\n'))
	38	{
	39	if (ct >= tag_size - 1)
	40	{
	41	@@ -259,7 +263,11 @@ static int Read_String
	42	tag [ct ++] = '\0';
	43
	44	ct = 0;
	45	- while ((ch = fgetc (fp)) != EOF && ch != '>')
	46	+ if (s_size == 0) {
	47	+ s_size += INCR_SIZE;
	48	+ s = (char *) Safe_realloc (s, s_size);
	49	+ }
	50	+ while ((ch = fgetc (fp)) != EOF && ch != ((int) '>'))
	51	{
	52	if (isspace (ch))
	53	continue;
	54	--- a/src/ICM/build-icm.cc
	55	+++ b/src/ICM/build-icm.cc
	56	@@ -271,20 +271,24 @@ static int Read_String
	57	{
	58	int ch, ct;
	59
	60	- while ((ch = fgetc (fp)) != EOF && ch != '>')
	61	+ while ((ch = fgetc (fp)) != EOF && ch != ((int) '>'))
	62	;
	63
	64	if (ch == EOF)
	65	return FALSE;
	66
	67	ct = 0;
	68	- while ((ch = fgetc (fp)) != EOF && ch != '\n' && isspace (ch))
	69	+ while ((ch = fgetc (fp)) != EOF && ch != ((int) '\n') && isspace (ch))
	70	;
	71	if (ch == EOF)
	72	return FALSE;
	73	if (ch != '\n' && ! isspace (ch))
	74	ungetc (ch, fp);
	75	- while ((ch = fgetc (fp)) != EOF && ch != '\n')
	76	+ if (tag_size == 0) {
	77	+ tag_size += INCR_SIZE;
	78	+ tag = (char *) Safe_realloc (tag, tag_size);
	79	+ }
	80	+ while ((ch = fgetc (fp)) != EOF && ch != ((int) '\n'))
	81	{
	82	if (ct >= tag_size - 1)
	83	{
	84	@@ -296,7 +300,11 @@ static int Read_String
	85	tag [ct ++] = '\0';
	86
	87	ct = 0;
	88	- while ((ch = fgetc (fp)) != EOF && ch != '>')
	89	+ if (s_size == 0) {
	90	+ s_size += INCR_SIZE;
	91	+ s = (char *) Safe_realloc (s, s_size);
	92	+ }
	93	+ while ((ch = fgetc (fp)) != EOF && ch != ((int) '>'))
	94	{
	95	if (isspace (ch))
	96	continue;
	97	--- a/src/ICM/score-fixed.cc
	98	+++ b/src/ICM/score-fixed.cc
	99	@@ -163,20 +163,24 @@ int Read_String
	100	{
	101	int ch, ct;
	102
	103	- while ((ch = fgetc (fp)) != EOF && ch != '>')
	104	+ while ((ch = fgetc (fp)) != EOF && ch != ((int) '>'))
	105	;
	106
	107	if (ch == EOF)
	108	return FALSE;
	109
	110	ct = 0;
	111	- while ((ch = fgetc (fp)) != EOF && ch != '\n' && isspace (ch))
	112	+ while ((ch = fgetc (fp)) != EOF && ch != ((int) '\n') && isspace (ch))
	113	;
	114	if (ch == EOF)
	115	return FALSE;
	116	if (ch != '\n' && ! isspace (ch))
	117	ungetc (ch, fp);
	118	- while ((ch = fgetc (fp)) != EOF && ch != '\n')
	119	+ if (tag_size == 0 ) {
	120	+ tag_size += INCR_SIZE;
	121	+ tag = (char *) Safe_realloc (tag, tag_size);
	122	+ }
	123	+ while ((ch = fgetc (fp)) != EOF && ch != ((int) '\n'))
	124	{
	125	if (ct >= tag_size - 1)
	126	{
	127	@@ -188,7 +192,11 @@ int Read_String
	128	tag [ct ++] = '\0';
	129
	130	ct = 0;
	131	- while ((ch = fgetc (fp)) != EOF && ch != '>')
	132	+ if (s_size == 0) {
	133	+ s_size += INCR_SIZE;
	134	+ s = (char *) Safe_realloc (s, s_size);
	135	+ }
	136	+ while ((ch = fgetc (fp)) != EOF && ch != ((int) '>'))
	137	{
	138	if (isspace (ch))
	139	continue;
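
The two ideas in the mayhem.patch description (keep the `fgetc` result in an `int`, and allocate the buffer before the read loop so that an empty string still has room for its terminator) can be sketched as below. This is a simplified stand-in, not the patched `Read_String`: the function `read_line`, its return convention, and the value of `INCR_SIZE` are assumptions, and plain `realloc` is used in place of `Safe_realloc`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>

const std::size_t INCR_SIZE = 16;  // hypothetical growth step

// Read one line from fp into buf (without the newline), growing buf as needed.
// Returns false on allocation failure or when EOF is hit before any character.
bool read_line(std::FILE* fp, char*& buf, std::size_t& size) {
    assert(fp != nullptr);

    // Allocate up front: without this, an empty line would write the
    // terminating '\0' into a buffer that was never allocated.
    if (size == 0) {
        char* fresh = static_cast<char*>(std::realloc(buf, INCR_SIZE));
        if (fresh == nullptr)
            return false;
        buf = fresh;
        size = INCR_SIZE;
    }

    std::size_t ct = 0;
    int ch;  // int, not char, so that EOF stays distinguishable from data
    while ((ch = std::fgetc(fp)) != EOF && ch != '\n') {
        if (ct >= size - 1) {
            char* grown = static_cast<char*>(std::realloc(buf, size + INCR_SIZE));
            if (grown == nullptr)
                return false;
            buf = grown;
            size += INCR_SIZE;
        }
        buf[ct++] = static_cast<char>(ch);
    }
    buf[ct] = '\0';
    return ch != EOF || ct > 0;
}
```

Called with `buf == nullptr` and `size == 0` on input that starts with an empty line, the function returns an empty, properly terminated string instead of writing through a null pointer.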

+4

-0

debian/patches/series

	0	10_gcc4.3.patch
	1	10_gcc4.4.patch
	2
	3	mayhem.patch

+25

-0

debian/rules

	0	#!/usr/bin/make -f
	1
	2	MANPAGES=debian/glimmer2_mans/tigr-anomaly.1 \
	3	debian/glimmer2_mans/tigr-build-icm.1 \
	4	debian/glimmer2_mans/tigr-extract.1 \
	5	debian/glimmer2_mans/tigr-glimmer3.1 \
	6	debian/glimmer2_mans/tigr-long-orfs.1 \
	7	debian/glimmer2_mans/tigr-run-glimmer3.1
	8
	9	.SUFFIXES: .1 .sgml
	10
	11	.sgml.1:
	12	docbook-to-man $< > $@
	13
	14	%:
	15	dh $@
	16
	17	override_dh_clean:
	18	dh_clean $(MANPAGES)
	19	cd src; make clean
	20	rm -f bin/* lib/* obj/*
	21
	22	override_dh_auto_build: $(MANPAGES)
	23	# dh_auto_build
	24	cd src; make CFLAGS="$(CFLAGS)" CPPFLAGS="$(CPPFLAGS)" CXXFLAGS="$(CXXFLAGS)" LDFLAGS="$(LDFLAGS)"

+1

-0

debian/source/format

	0	3.0 (quilt)

+39

-0

debian/tigr-glimmer.1

	0	.TH TIGR-GLIMMER 1 "April 16, 2008"
	1	.SH NAME
	2	tigr-glimmer \- runs various programs of the TIGR Glimmer suite
	3	.SH SYNOPSIS
	4	.B tigr-glimmer
	5	.B program
	6	[arguments]
	7	.SH DESCRIPTION
	8	This manual page documents briefly the
	9	.B tigr-glimmer
	10	wrapper to the TIGR Glimmer programs.
	11	This manual page was written for the Debian GNU/Linux distribution
	12	because upstream does not provide this wrapper and it was invented
	13	for Debian to avoid conflicts with other packages that might cause
	14	namespace pollution.
	15	.PP
	16	\fBtigr-glimmer\fP is just a wrapper that invokes the various programs in
	17	the TIGR Glimmer software package. You can get more detailed documentation
	18	in /usr/share/doc/tigr-glimmer. Please note that the documentation there
	19	belongs to the former version, Glimmer 2. Glimmer 3 has
	20	some features that are described in the notes.pdf document inside
	21	the documentation directory.
	22	.PP
	23	The following programs are included: anomaly, build-fixed, build-icm,
	24	entropy-profile, entropy-score, extract, glimmer3, long-orfs, multi-extract,
	25	score-fixed, start-codon-distrib, test, uncovered and window-acgt.
	26	.SH OPTIONS
	27	There are no options.
	28	.SH EXAMPLES
	29	.IP tigr-glimmer\ build-icm
	30	.IP tigr-glimmer\ long-orfs
	31	.SH SEE ALSO
	32	For the previously packaged version Glimmer2 some text files from
	33	the documentation were turned to man pages for the Debian GNU/Linux
	34	distribution by Steffen Moeller <moeller@debian.org>
	35	.br
	36	.SH AUTHORS
	37	This manual page was written by Andreas Tille <tille@debian.org>, for
	38	the Debian GNU/Linux system (but may be used by others).

+12

-0

debian/upstream/metadata

	0	Reference:
	1	Author: Steven L. Salzberg and Arthur L. Delcher and S. Kasif and O. White
	2	Title: Microbial gene identification using interpolated Markov models
	3	Journal: Nucleic Acids Research
	4	Year: 1998
	5	Volume: 26
	6	Number: 2
	7	Pages: 544-8
	8	DOI: 10.1093/nar/26.2.544
	9	PMID: 9421513
	10	URL: http://nar.oxfordjournals.org/content/26/2/544
	11	eprint: http://nar.oxfordjournals.org/content/26/2/544.full.pdf+html

+3

-0

debian/watch

	0	version=3
	1	opts="dversionmangle=s/\.//" \
	2	http://www.cbcb.umd.edu/software/glimmer/ glimmer(.*)\.tar.gz