<!doctype refentry PUBLIC "-//OASIS//DTD DocBook V4.1//EN" [

<!-- Process this file with docbook-to-man to generate an nroff manual
     page: `docbook-to-man manpage.sgml > manpage.1'.  You may view
     the manual page with: `docbook-to-man manpage.sgml | nroff -man |
     less'.  A typical entry in a Makefile or Makefile.am is:

manpage.1: manpage.sgml
	docbook-to-man $< > $@

    
	The docbook-to-man binary is found in the docbook-to-man package.
	Please remember that if you create the nroff version in one of the
	debian/rules file targets (such as build), you will need to include
	docbook-to-man in your Build-Depends control field.

  -->

  <!-- Fill in your name for FIRSTNAME and SURNAME. -->
  <!ENTITY dhfirstname "<firstname>Steffen</firstname>">
  <!ENTITY dhsurname   "<surname>Möller</surname>">
  <!-- Please adjust the date whenever revising the manpage. -->
  <!ENTITY dhdate      "<date>November 10, 2004</date>">
  <!ENTITY dhsection   "<manvolnum>1</manvolnum>">
  <!ENTITY dhemail     "<email>moeller@debian.org</email>">
  <!ENTITY dhusername  "Steffen Moeller">
  <!ENTITY dhucpackage "<refentrytitle>TIGR-GLIMMER</refentrytitle>">
  <!ENTITY dhpackage   "tigr-glimmer">

  <!ENTITY debian      "<productname>Debian</productname>">
  <!ENTITY gnu         "<acronym>GNU</acronym>">
  <!ENTITY gpl         "&gnu; <acronym>GPL</acronym>">
]>

<refentry>
  <refentryinfo>
    <address>
      &dhemail;
    </address>
    <author>
      &dhfirstname;
      &dhsurname;
    </author>
    <copyright>
      <year>2003</year>
      <holder>&dhusername;</holder>
    </copyright>
    &dhdate;
  </refentryinfo>
  <refmeta>
    &dhucpackage;

    &dhsection;
  </refmeta>
  <refnamediv>
    <refname>&dhpackage;</refname>
    <refpurpose>
Find and score potential genes in a genome file using the probability model in an ICM file
</refpurpose>
  </refnamediv>
  <refsynopsisdiv>
    <cmdsynopsis>
      <command>tigr-glimmer3</command>
	<arg choice="plain"><replaceable>genome-file</replaceable></arg>
	<arg choice="plain"><replaceable>icm-file</replaceable></arg>
        <arg><replaceable>options</replaceable></arg>
    </cmdsynopsis>
  </refsynopsisdiv>
  <refsect1>
    <title>DESCRIPTION</title>
<para>
<command>&dhpackage;</command> is a system for finding genes in microbial DNA, especially the genomes of bacteria and archaea. <command>&dhpackage;</command> (Glimmer, the Gene Locator and Interpolated Markov Modeler) uses interpolated Markov models (IMMs) to identify coding regions and distinguish them from noncoding DNA. The IMM approach, described in our Nucleic Acids Research paper on <command>&dhpackage;</command> 1.0 and in our subsequent paper on <command>&dhpackage;</command> 2.0, combines Markov models from 1st- through 8th-order, weighting each model according to its predictive power. <command>&dhpackage;</command> 1.0 and 2.0 use 3-periodic nonhomogeneous Markov models in their IMMs.
</para><para>
<command>&dhpackage;</command> is the primary microbial gene finder at TIGR, and has been used to annotate the complete genomes of B. burgdorferi (Fraser et al., Nature, Dec. 1997), T. pallidum (Fraser et al., Science, July 1998), T. maritima, D. radiodurans, M. tuberculosis, and non-TIGR projects including C. trachomatis, C. pneumoniae, and others. Its analyses of some of these genomes and others are available at the TIGR microbial database site.
</para><para>
A special version of <command>&dhpackage;</command> designed for small eukaryotes, GlimmerM, was used to find the genes in chromosome 2 of the malaria parasite, P. falciparum. GlimmerM is described in S.L. Salzberg, M. Pertea, A.L. Delcher, M.J. Gardner, and H. Tettelin, "Interpolated Markov models for eukaryotic gene finding," Genomics 59 (1999), 24-31.  The GlimmerM site at http://www.tigr.org/software/glimmerm/ includes information on how to download the GlimmerM system.
</para><para>
The <command>&dhpackage;</command> system consists of two main programs. The first of these is the training program, build-imm. This program takes an input set of sequences and builds and outputs the IMM for them. These sequences can be complete genes or just partial orfs. For a new genome, this training data can consist of those genes with strong database hits as well as very long open reading frames that are statistically almost certain to be genes. The second program is glimmer, which uses this IMM to identify putative genes in an entire genome. <command>&dhpackage;</command> automatically resolves conflicts between most overlapping genes by choosing one of them. It also identifies genes that are suspected to truly overlap, and flags these for closer inspection by the user. These "suspect" gene candidates have been a very small percentage of the total for all the genomes analyzed thus far.
</para>
  </refsect1>
  <refsect1>
    <title>OPTIONS</title>
    <variablelist>
      <varlistentry>
        <term><option>-C <replaceable>n</replaceable></option></term>
	<listitem>
	<para>Use n as GC percentage of independent model</para>
        <para>Note:  n should be a percentage, e.g., -C 45.2</para>
	</listitem>
      </varlistentry>
      <varlistentry>
       <term><option>-f</option></term><listitem><para>Use ribosome-binding energy to choose start codon</para></listitem>
      </varlistentry>
      <varlistentry>
       <term><option>+f</option></term><listitem><para>Use first codon in orf as start codon</para></listitem>
      </varlistentry>
      <varlistentry>
       <term><option>-g <replaceable>n</replaceable></option></term><listitem><para>Set minimum gene length to n</para></listitem>
      </varlistentry>
      <varlistentry>
<term><option>-i <replaceable>filename</replaceable></option></term>
	<listitem>
	<para>Use <replaceable>filename</replaceable>
	to select regions of bases that are off
        limits, so that no bases within those regions will be examined
	</para>
	</listitem>
      </varlistentry>
      <varlistentry>
      <term><option>-l</option></term>
      <listitem><para>Assume linear rather than circular genome, i.e., no wraparound</para></listitem>
      </varlistentry>
      <varlistentry>
	<term><option>-L <replaceable>filename</replaceable></option></term>
	<listitem><para>Use filename to specify a list of orfs that should
        be scored separately, with no overlap rules
	</para></listitem>
      </varlistentry>
      <varlistentry>
	<term><option>-M</option></term>
	<listitem><para>Input is a multifasta file of separate genes to be scored
        separately, with no overlap rules
      </para>
      </listitem>
</varlistentry>
<varlistentry>
	<term><option>-o <replaceable>n</replaceable></option></term>
	<listitem>
	 <para>Set minimum overlap length to n.  Overlaps shorter than this
        are ignored.
      </para></listitem>
</varlistentry>
<varlistentry>
	<term><option>-p <replaceable>n</replaceable></option></term>
	<listitem>
	<para>
	Set minimum overlap percentage to n%.  Overlaps shorter than this percentage of *both* strings are ignored.
	</para>
	</listitem>
</varlistentry>
<varlistentry>
	<term><option>-q <replaceable>n</replaceable></option></term>
      <listitem>
      <para>Set the maximum length orf that can be rejected because of
        the independent probability score column to (n - 1)
      </para>
	</listitem>
</varlistentry>
<varlistentry>
      <term><option>-r</option></term>
	<listitem>
      <para>
 Don't use independent probability score column
      </para>
	</listitem>
</varlistentry>
<varlistentry>
	<term><option>+r</option></term>
	<listitem><para>
Use independent probability score column
      </para>
	</listitem>
</varlistentry>
<varlistentry>
	<term><option>-s <replaceable>s</replaceable></option></term>
	<listitem><para> Use string s as the ribosome binding pattern to find start codons.</para>
	</listitem>
</varlistentry>
<varlistentry>
      <term><option>+S</option></term>
	<listitem>
      <para>
 Use the stricter independent intergenic model that doesn't
        give probabilities to in-frame stop codons.  (Option is obsolete
        since this is now the only behaviour.)
      </para> </listitem>
</varlistentry>
<varlistentry>
	<term><option>-t <replaceable>n</replaceable></option></term>
	<listitem><para>
 Set threshold score for calling as gene to n.  If the in-frame
        score >= n, then the region is given a number and considered
        a potential gene.
      </para> </listitem>
</varlistentry>
<varlistentry>
	<term><option>-w <replaceable>n</replaceable> </option></term>
	<listitem><para>
   Use "weak" scores on tentative genes n or longer.  Weak
        scores ignore the independent probability score.
	</para></listitem>
      </varlistentry>
    </variablelist>
  </refsect1>
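  <refsect1>
    <title>EXAMPLES</title>
    <para>The following is a minimal sketch of an invocation, assuming the
      argument order shown in the synopsis above. The file names
      <filename>genome.fasta</filename> and <filename>genome.icm</filename>
      are hypothetical placeholders, and the value 150 is chosen only for
      illustration. The command treats the genome as linear
      (<option>-l</option>) and sets the minimum gene length to 150
      (<option>-g</option>):
    </para>
    <screen>
tigr-glimmer3 genome.fasta genome.icm -l -g 150
    </screen>
    <para>The ICM file is assumed to have been produced beforehand by the
      training program described above.
    </para>
  </refsect1>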
  <refsect1>
    <title>SEE ALSO</title>
    <para>
tigr-adjust (1),
tigr-anomaly (1),
tigr-build-icm (1),
tigr-check (1),
tigr-codon-usage (1),
tigr-compare-lists (1),
tigr-extract (1),
tigr-generate (1),
tigr-get-len (1),
tigr-get-putative (1),
tigr-glimmer3 (1),
tigr-long-orfs (1)
</para>
<para>
http://www.tigr.org/software/glimmer/
</para>
    <para>Please see the readme in /usr/share/doc/glimmer for a description of how to use Glimmer.</para>
  </refsect1>
  <refsect1>
    <title>AUTHOR</title>
    <para>This manual page was quickly copied from the glimmer web site by &dhusername; &dhemail; for
      the &debian; system.
    </para>
  </refsect1>
</refentry>

<!-- Keep this comment at the end of the file
Local variables:
mode: sgml
sgml-omittag:t
sgml-shorttag:t
sgml-minimize-attributes:nil
sgml-always-quote-attributes:t
sgml-indent-step:2
sgml-indent-data:t
sgml-parent-document:nil
sgml-default-dtd-file:nil
sgml-exposed-tags:nil
sgml-local-catalogs:nil
sgml-local-ecat-files:nil
End:
-->