Clustal
Clustal Omega | |
---|---|
Developer | Des Higgins, Fabian Sievers (Conway Institute, UCD) |
Current version | 1.2.1 (February 28, 2014) |
Operating system | Unix, Linux, Mac, MS-Windows |
Programming language | C++ |
Category | Bioinformatics tool |
License | GNU General Public License, version 2 |
Website | www.clustal.org/omega/ |
Clustal | |
---|---|
Developer | Gibson T. (EMBL), Thompson J. (CNRS), Higgins D. (UCD) |
Current version | 2.1 (November 17, 2010) |
Operating system | Unix, Linux, macOS, Windows |
Programming language | C++ |
Category | Bioinformatics tool |
License | LGPL from version 2.1, previously free for academic users |
Website | www.clustal.org |
Clustal is a widely used computer program for multiple sequence alignment. The current version of ClustalW and ClustalX is 2.1. There are three variants of the program:
- ClustalW: a command line program.
- ClustalX: a version with a graphical user interface. The program is available for Windows, Mac OS and Unix/Linux.
- Clustal Omega: a command line program. It can align very large numbers of sequences (more than 100,000) quickly and with high quality.
Input / output
The program can process a wide range of input formats, including NBRF/PIR, FASTA, EMBL/Swiss-Prot or UniProt, Clustal, GCG/MSF, GCG9 RSF and GDE.
The output can be written in the following formats: Clustal, NBRF/PIR, GCG/MSF, PHYLIP, GDE and NEXUS.
Multiple sequence alignment
Clustal performs three main steps:
- pairwise alignment of the sequences,
- construction of a phylogenetic tree (or use of a user-supplied one),
- use of the phylogenetic tree to guide the multiple alignment.
These steps are performed automatically when you select Do Complete Alignment. Further options are Do Alignment from guide tree (carry out the alignment using an existing guide tree) and Produce guide tree only (only create the guide tree).
Profile alignments
Pairwise alignments are calculated for all sequences against all others, and the matches are stored in a matrix. This matrix is then converted into a distance matrix, in which each value reflects the evolutionary distance between a pair of sequences.
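The all-against-all step can be illustrated with a minimal Python sketch. It is not Clustal's implementation: it assumes a plain Needleman-Wunsch global alignment with a linear gap penalty and takes the distance to be one minus the fraction of identical residues among the aligned (ungapped) positions. The function names and scoring values are hypothetical and chosen only for illustration.

```python
from itertools import combinations

def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment with a linear gap penalty (illustrative scores)."""
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))

def identity_distance(a, b):
    """Distance = 1 - (identical positions / positions where neither sequence has a gap)."""
    x, y = global_align(a, b)
    aligned = sum(1 for p, q in zip(x, y) if p != "-" and q != "-")
    matches = sum(1 for p, q in zip(x, y) if p == q and p != "-")
    return 1.0 - matches / aligned if aligned else 1.0

def distance_matrix(seqs):
    """Compute all-against-all distances for a dict {name: sequence}."""
    names = list(seqs)
    d = {name: {name: 0.0} for name in names}
    for p, q in combinations(names, 2):
        d[p][q] = d[q][p] = identity_distance(seqs[p], seqs[q])
    return d

toy = {"s1": "ACGT", "s2": "ACGT", "s3": "ACGA"}
for name, row in distance_matrix(toy).items():
    print(name, row)
```

Identical sequences get a distance of 0, and the distance grows as the fraction of mismatching aligned positions increases. Real implementations use substitution matrices and affine gap penalties instead of the fixed scores assumed here.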
From this distance matrix, a guide tree (a phylogenetic tree) is constructed with the neighbor-joining clustering algorithm. The tree specifies the order in which pairs of sequences are aligned and combined with previous alignments. Sequences are aligned progressively at each branch point, starting with the pair of sequences that are closest to each other.
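How a distance matrix determines the joining order can be sketched with a textbook version of neighbor joining. The Python code below is an illustration under simplifying assumptions, not Clustal's code: it records only the order in which clusters are joined, ignores branch lengths, and breaks ties by taking the first pair encountered. The function name `neighbor_joining_order` and the toy distances are hypothetical.

```python
def neighbor_joining_order(names, dist):
    """Return the pairs of clusters in the order neighbor joining merges them.

    dist is a nested dict with dist[a][b] == dist[b][a] for all distinct a, b.
    """
    d = {a: dict(dist[a]) for a in names}  # working copy, extended with new nodes
    nodes = list(names)
    joins = []
    while len(nodes) > 2:
        n = len(nodes)
        # Sum of distances from each node to all other active nodes
        totals = {a: sum(d[a][b] for b in nodes if b != a) for a in nodes}
        best, best_q = None, None
        for i in range(n):
            for j in range(i + 1, n):
                a, b = nodes[i], nodes[j]
                # Q criterion: small values mark pairs that are close to each
                # other and far from everything else
                q = (n - 2) * d[a][b] - totals[a] - totals[b]
                if best_q is None or q < best_q:
                    best, best_q = (a, b), q
        a, b = best
        new = f"({a},{b})"
        d[new] = {new: 0.0}
        for k in nodes:
            if k not in (a, b):
                # Distance from the new internal node to every remaining node
                d[new][k] = d[k][new] = 0.5 * (d[a][k] + d[b][k] - d[a][b])
        nodes = [k for k in nodes if k not in (a, b)] + [new]
        joins.append((a, b))
    joins.append((nodes[0], nodes[1]))  # the last two clusters are joined at the end
    return joins

# Toy distances between five sequences (made-up values for illustration)
names = ["a", "b", "c", "d", "e"]
pairs = {("a", "b"): 5, ("a", "c"): 9, ("a", "d"): 9, ("a", "e"): 8,
         ("b", "c"): 10, ("b", "d"): 10, ("b", "e"): 9,
         ("c", "d"): 8, ("c", "e"): 7, ("d", "e"): 3}
dist = {x: {} for x in names}
for (x, y), value in pairs.items():
    dist[x][y] = dist[y][x] = value

for step, pair in enumerate(neighbor_joining_order(names, dist), start=1):
    print(step, pair)
```

In a progressive alignment, each recorded join corresponds to one alignment step: two sequences, a sequence and an existing alignment, or two existing alignments are combined.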
Settings
Users can align sequences with the default settings, but it often makes sense to adjust the parameters on a case-by-case basis.
The main parameters are the gap opening penalty and the gap extension penalty (see sequence alignment).
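The interplay of the two penalties can be shown with a small Python sketch of an affine gap cost. The convention used here (the first gap position costs the opening penalty, every further position costs the extension penalty) is one of several in use, and the numeric values are illustrative assumptions rather than documented Clustal defaults.

```python
def affine_gap_cost(length, gap_open=10.0, gap_extend=0.1):
    """Cost of a single gap of the given length under an affine penalty.

    Assumed convention: the first position costs gap_open, each additional
    position costs gap_extend. Parameter values are illustrative only.
    """
    if length <= 0:
        return 0.0
    return gap_open + gap_extend * (length - 1)

# One long gap is much cheaper than several short gaps of the same total length
one_long_gap = affine_gap_cost(4)
four_short_gaps = 4 * affine_gap_cost(1)
print(one_long_gap, four_short_gaps)
```

Raising the gap opening penalty makes the alignment contain fewer gaps; raising the gap extension penalty makes the gaps shorter.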
Accelerated version
An FPGA-based version of the ClustalW algorithm is offered by Progeniq and has a processing speed twenty times higher than that of the software implementation.
Sources
- J. D. Thompson et al. (1997): The ClustalX windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. In: Nucleic Acids Research. Vol. 25, pp. 4876-4882. PMID 9396791
- R. Chenna et al. (2003): Multiple sequence alignment with the Clustal series of programs. In: Nucleic Acids Research. Vol. 31, pp. 3497-3500. PMID 12824352
- M. A. Larkin et al. (2007): Clustal W and Clustal X version 2.0. In: Bioinformatics. Vol. 23, pp. 2947-2948. PMID 17846036
- F. Sievers et al. (2011): Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. In: Molecular Systems Biology. Vol. 7, October 11, 2011. doi:10.1038/msb.2011.75
Web links
- EBI: ClustalW
- Clustal Homepage
- Progeniq Pte Ltd, White Paper - Accelerating Intensive Applications at 10x-50x Speedup to Remove Bottlenecks in Computational Workflows
- Progeniq BoostServe, 1000 CPU cores
References
- See file COPYING in the source archive, accessed January 15, 2014