Codebase list diamond-aligner / debian/0.9.23+dfsg-2
debian/0.9.23+dfsg-2

Tree @debian/0.9.23+dfsg-2 (Download .tar.gz)

Introduction
============

DIAMOND is a sequence aligner for protein and translated DNA searches,
designed for high performance analysis of big sequence data. The key
features are:

-   Pairwise alignment of proteins and translated DNA at 500x-20,000x
    speed of BLAST.
-   Frameshift alignments for long read analysis.
-   Low resource requirements and suitable for running on standard
    desktops or laptops.
-   Various output formats, including BLAST pairwise, tabular and XML,
    as well as taxonomic classification.

Keep posted about new developments by following me on
[Twitter](https://twitter.com/bbuchfink).

[![Build Status](https://travis-ci.org/bbuchfink/diamond.svg?branch=master)](https://travis-ci.org/bbuchfink/diamond)
[![image](https://anaconda.org/bioconda/diamond/badges/version.svg)](https://anaconda.org/bioconda/diamond)
![image](https://anaconda.org/bioconda/diamond/badges/platforms.svg)
[![image](https://anaconda.org/bioconda/diamond/badges/downloads.svg)](https://anaconda.org/bioconda/diamond)

Quick start guide
=================

Please read the
[manual](https://github.com/bbuchfink/diamond/raw/master/diamond_manual.pdf)
for detailed installation and usage instructions. This demonstrates a
quick example for setting up and using the program on Linux.

Installing the software on your system may be done by downloading it in
binary format for immediate use:

    wget http://github.com/bbuchfink/diamond/releases/download/v0.9.23/diamond-linux64.tar.gz
    tar xzf diamond-linux64.tar.gz

The extracted `diamond` binary file should be moved to a directory
contained in your executable search path (PATH environment variable).

To now run an alignment task, we assume to have a protein database file
in FASTA format named `nr.faa` and a file of DNA reads that we want to
align named `reads.fna`.

In order to set up a reference database for DIAMOND, the `makedb`
command needs to be executed with the following command line:

    $ diamond makedb --in nr.faa -d nr

This will create a binary DIAMOND database file with the specified name
(`nr.dmnd`). The alignment task may then be initiated using the `blastx`
command like this:

    $ diamond blastx -d nr -q reads.fna -o matches.m8

The output file here is specified with the `–o` option and named
`matches.m8`. By default, it is generated in BLAST tabular format.

**Note**:

-   The program may use quite a lot of memory and also temporary
    disk space. Should the program fail due to running out of either
    one, you need to set a lower value for the block size parameter
    `-b` (see the [manual](https://github.com/bbuchfink/diamond/raw/master/diamond_manual.pdf)).
-   The default (fast) mode was mainly designed for short reads. For
    longer sequences, the sensitive modes (options `--sensitive` or
    `--more-sensitive`) are recommended.
-   The runtime of the program is not linear in the size of the
    query file and it is much more efficient for large query files
    (\> 1 million sequences) than for smaller ones.
-   Low complexity masking is applied to the query and reference
    sequences by default. Masked residues appear in the output as X.
-   The default e-value cutoff of DIAMOND is 0.001 while that of
    BLAST is 10, so by default the program will search a lot more
    stringently than BLAST and not report weak hits.

About
=====

DIAMOND is developed by Benjamin Buchfink at the Detlef Weigel lab, Max
Planck Institute for Developmental Biology, Tübingen, Germany.

\[[Email](mailto:buchfink@gmail.com)\]
\[[Twitter](https://twitter.com/bbuchfink)\] \[[Google
Scholar](https://scholar.google.de/citations?user=kjPIF1cAAAAJ)\]
\[[MPI-EBIO](http://eb.tuebingen.mpg.de/)\]

Publication:

-   Buchfink B, Xie C, Huson DH, \"Fast and sensitive protein alignment
    using DIAMOND\", *Nature Methods* **12**, 59-60 (2015).
    [doi:10.1038/nmeth.3176](https://doi.org/10.1038/nmeth.3176)