<!doctype refentry PUBLIC "-//OASIS//DTD DocBook V4.1//EN" [

<!-- Process this file with docbook-to-man to generate an nroff manual
     page: `docbook-to-man manpage.sgml > manpage.1'.  You may view
     the manual page with: `docbook-to-man manpage.sgml | nroff -man |
     less'.  A typical entry in a Makefile or Makefile.am is:

manpage.1: manpage.sgml
	docbook-to-man $< > $@

    
	The docbook-to-man binary is found in the docbook-to-man package.
	Please remember that if you create the nroff version in one of the
	debian/rules file targets (such as build), you will need to include
	docbook-to-man in your Build-Depends control field.

  -->

  <!-- Fill in your name for FIRSTNAME and SURNAME. -->
  <!ENTITY dhfirstname "<firstname>Steffen</firstname>">
  <!ENTITY dhsurname   "<surname>Möller</surname>">
  <!-- Please adjust the date whenever revising the manpage. -->
  <!ENTITY dhdate      "<date>November 10, 2004</date>">
  <!ENTITY dhsection   "<manvolnum>1</manvolnum>">
  <!ENTITY dhemail     "<email>moeller@pzr.uni-rostock.de</email>">
  <!ENTITY dhusername  "Steffen Moeller">
  <!ENTITY dhucpackage "<refentrytitle>TIGR-GLIMMER</refentrytitle>">
  <!ENTITY dhpackage   "tigr-glimmer">

  <!ENTITY debian      "<productname>Debian</productname>">
  <!ENTITY gnu         "<acronym>GNU</acronym>">
  <!ENTITY gpl         "&gnu; <acronym>GPL</acronym>">
]>

<refentry>
  <refentryinfo>
    <address>
      &dhemail;
    </address>
    <author>
      &dhfirstname;
      &dhsurname;
    </author>
    <copyright>
      <year>2003</year>
      <holder>&dhusername;</holder>
    </copyright>
    &dhdate;
  </refentryinfo>
  <refmeta>
    &dhucpackage;

    &dhsection;
  </refmeta>
  <refnamediv>
    <refname>&dhpackage;</refname>
    <refpurpose>Creates and outputs an interpolated Markov model (IMM)</refpurpose>
  </refnamediv>
  <refsynopsisdiv>
    <cmdsynopsis>
      <command>tigr-build-icm</command>
    </cmdsynopsis>
  </refsynopsisdiv>
  <refsect1>
    <title>DESCRIPTION</title>
<para>
The build-icm program creates and outputs an interpolated Markov
model (IMM) as described in the paper
  A.L. Delcher, D. Harmon, S. Kasif, O. White, and S.L. Salzberg.
  Improved Microbial Gene Identification with Glimmer.
  Nucleic Acids Research, 1999, in press.
Please cite this paper if you use the system as part of any
published research.
</para><para>
Input comes from the file named on the command line.  The file should
contain one string per line.  Each line consists of an ID string, followed
by white space, followed by the sequence itself.  The script run-glimmer3
generates an input file in the correct format using the 'extract' program.
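</para><para>
As a minimal sketch of that line format, the following Python fragment
splits one training line into its ID and its sequence.  The function name
parse_training_line and the sample ID are hypothetical and are not part of
Glimmer; the fragment only illustrates the layout described above.
</para>
<programlisting>
```python
def parse_training_line(line):
    """Split one training line into an (ID, sequence) pair.

    Hypothetical helper for illustration; not part of Glimmer.
    """
    fields = line.split(None, 1)  # split at the first run of white space
    if len(fields) != 2:
        raise ValueError("expected an ID, white space, then a sequence")
    seq_id, sequence = fields
    return seq_id, sequence.strip()


print(parse_training_line("orf00001  acgtta\n"))
```
</programlisting>
<para>
The sketch does not check the sequence alphabet; it only separates the
two fields.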
</para><para>
The IMM is constructed as follows: For a given context, say
acgtta, we want to estimate the probability distribution of the
next character.  We do this as a linear combination of the
observed probability distributions for this context and all of
its suffixes, i.e., cgtta, gtta, tta, ta, a and the empty context.
Here, the observed distributions are the counts of
occurrences of these strings in the training set.  The linear
combination is determined by a set of probabilities, lambda, one
for each context string.  For context acgtta the linear combination
coefficients are:
</para><para>
    lambda (acgtta)
    (1 - lambda (acgtta)) x lambda (cgtta)
    (1 - lambda (acgtta)) x (1 - lambda (cgtta)) x lambda (gtta)
    (1 - lambda (acgtta)) x (1 - lambda (cgtta)) x (1 - lambda (gtta)) x lambda (tta)
    ... (and likewise for ta and a; the coefficient for the empty context is:)
    (1 - lambda (acgtta)) x (1 - lambda (cgtta)) x (1 - lambda (gtta))
             x (1 - lambda (tta))  x (1 - lambda (ta))  x (1 - lambda (a))
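</para><para>
The coefficient pattern above can be sketched in Python as follows.  This
is an illustration only, not the code in build-icm.c: the function name
suffix_weights, the dictionary lam and the lambda values in the example
are all assumptions made for the sketch.
</para>
<programlisting>
```python
def suffix_weights(context, lam):
    """Return (suffix, coefficient) pairs for a context and its suffixes.

    lam maps each non-empty context string to its lambda value.
    Hypothetical helper for illustration; not part of Glimmer.
    """
    weights = []
    remaining = 1.0  # product of (1 - lambda) over all longer contexts
    for start in range(len(context)):
        suffix = context[start:]
        weights.append((suffix, remaining * lam[suffix]))
        remaining *= 1.0 - lam[suffix]
    weights.append(("", remaining))  # the empty context takes what is left
    return weights


# Made-up lambda values for the two-character context "ta".
example = suffix_weights("ta", {"ta": 0.5, "a": 0.2})
for suffix, weight in example:
    print(repr(suffix), weight)
```
</programlisting>
<para>
Because the empty context receives whatever weight is left over, the
coefficients always sum to 1.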
</para><para>
We compute the lambda value for each context as follows:
  - If the number of observations in the training set is &gt;= the constant
    SAMPLE_SIZE_BOUND, the lambda for that context is 1.0.
  - Otherwise, do a chi-square test on the observations for this context
    compared to the distribution predicted for the one-character shorter
    suffix context.
    If the chi-square significance is &lt; 0.5, set the lambda for this context to 0.0.
    Otherwise, set the lambda for this context to:
       (chi-square significance) x (# observations) / SAMPLE_WEIGHT
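</para><para>
The rule above can be sketched in Python as follows.  The chi-square
significance is taken as an input rather than computed, and the values
given to SAMPLE_SIZE_BOUND and SAMPLE_WEIGHT are placeholders assumed for
the sketch, not values confirmed by this manual page.  The function name
context_lambda is likewise hypothetical.
</para>
<programlisting>
```python
SAMPLE_SIZE_BOUND = 400  # placeholder value, assumed for this sketch
SAMPLE_WEIGHT = 400.0    # placeholder value, assumed for this sketch


def context_lambda(num_observations, chi_square_significance):
    """Apply the lambda rule to one context.

    Hypothetical helper for illustration; not part of Glimmer.
    """
    if num_observations >= SAMPLE_SIZE_BOUND:
        return 1.0  # enough data: rely on this context alone
    if 0.5 > chi_square_significance:
        return 0.0  # significance below 0.5: ignore this context
    return chi_square_significance * num_observations / SAMPLE_WEIGHT


print(context_lambda(1000, 0.1))
print(context_lambda(100, 0.3))
print(context_lambda(100, 0.8))
```
</programlisting>
<para>
With these placeholder constants, a context seen 100 times with a
significance of 0.8 receives a lambda of about 0.2.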
</para><para>
To run the program:
</para><para>
      build-icm &lt; train.seq &gt; train.model
</para><para>
  This will use the training data in train.seq to produce the file
  train.model, containing your IMM.
</para>
  </refsect1>
  <refsect1>
    <title>SEE ALSO</title>
    <para>
tigr-glimmer3 (1),
tigr-long-orfs (1),
tigr-adjust (1),
tigr-anomaly (1),
tigr-extract (1),
tigr-check (1),
tigr-codon-usage (1),
tigr-compare-lists (1),
tigr-generate (1),
tigr-get-len (1),
tigr-get-putative (1)
    </para>
    <para>http://www.tigr.org/software/glimmer/</para>
    <para>Please see the readme in /usr/share/doc/tigr-glimmer for a description of how to use Glimmer3.</para>
  </refsect1>
  <refsect1>
    <title>AUTHOR</title>

    <para>This manual page was quickly copied from the glimmer web site and readme file by &dhusername; &dhemail; for
      the &debian; system.
    </para>

  </refsect1>
</refentry>

<!-- Keep this comment at the end of the file
Local variables:
mode: sgml
sgml-omittag:t
sgml-shorttag:t
sgml-minimize-attributes:nil
sgml-always-quote-attributes:t
sgml-indent-step:2
sgml-indent-data:t
sgml-parent-document:nil
sgml-default-dtd-file:nil
sgml-exposed-tags:nil
sgml-local-catalogs:nil
sgml-local-ecat-files:nil
End:
-->