Man page - clustalw(1)

CLUSTALW

NAME

clustalw - Multiple alignment of nucleic acid and protein sequences

SYNOPSIS

	clustalw [ -infile ] file.ext [ OPTIONS ]
	clustalw [ -help | -fullhelp ]

DESCRIPTION

Clustal W is a general purpose multiple alignment program for DNA or proteins.

The program performs simultaneous alignment of many nucleotide or amino acid sequences. It is typically run interactively, providing menus and online help. To use it in command-line (batch) mode instead, you must supply options on the command line, the minimum being -infile.
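
As a minimal sketch of batch mode, the Python snippet below assembles a non-interactive command line from options documented on this page (-infile, -align, -outfile) and runs it only when a clustalw executable and the input file are both present. The function names, the file names and the use of Python's subprocess module are assumptions made for illustration; they are not part of Clustal W.

```python
import os
import shutil
import subprocess


def build_align_command(infile, outfile=None):
    """Return the argument list for a non-interactive full alignment."""
    argv = ["clustalw", f"-infile={infile}", "-align"]
    if outfile is not None:
        argv.append(f"-outfile={outfile}")
    return argv


def run_if_possible(argv, infile):
    """Run the command only if the program is on PATH and the input exists."""
    if shutil.which(argv[0]) is None or not os.path.exists(infile):
        return None  # nothing was executed
    return subprocess.run(argv, check=False).returncode


# "seqs.fasta" and "seqs.aln" are hypothetical file names.
cmd = build_align_command("seqs.fasta", "seqs.aln")
print(" ".join(cmd))
run_if_possible(cmd, "seqs.fasta")
```

Passing the options as a list avoids shell quoting problems when file names contain spaces.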

OPTIONS

DATA (sequences)

-infile=file.ext

Input sequences.

-profile1=file.ext and -profile2=file.ext

Profiles (old alignments).

VERBS (do things)

-options

List the command line parameters.

-help or -check

Outline the command line parameters.

-fullhelp

Output full help content.

-align

Do full multiple alignment.

-tree

Calculate NJ tree.

-pim

Output percent identity matrix (while calculating the tree).

-bootstrap=n

Bootstrap an NJ tree (n = number of bootstraps; default = 1000).

-convert

Output the input sequences in a different file format.
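
To illustrate how a verb combines with parameters, here is a hedged sketch that builds a format-conversion command from -convert, -output and -outfile. The helper name and file names are hypothetical, and the check accepts only the format names listed on this page; the installed program may accept others.

```python
# Format names as listed for -output on this page (an assumption: the
# installed program may support additional formats).
LISTED_FORMATS = {"GCG", "GDE", "PHYLIP", "PIR", "NEXUS"}


def build_convert_command(infile, outfile, fmt):
    """Return the argument list that rewrites infile in another format."""
    fmt = fmt.upper()
    if fmt not in LISTED_FORMATS:
        raise ValueError(f"format not listed on the man page: {fmt}")
    return [
        "clustalw",
        f"-infile={infile}",
        "-convert",
        f"-output={fmt}",
        f"-outfile={outfile}",
    ]


# Hypothetical file names, for illustration only.
print(" ".join(build_convert_command("seqs.aln", "seqs.phy", "phylip")))
```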

PARAMETERS (set things)

General settings:

-interactive

Read command line, then enter normal interactive menus.

-quicktree

Use FAST algorithm for the alignment guide tree.

-type=

PROTEIN or DNA sequences.

-negative

Protein alignment with negative values in matrix.

-outfile=

Sequence alignment file name.

-output=

GCG, GDE, PHYLIP, PIR or NEXUS.

-outputorder=

INPUT or ALIGNED.

-case=

LOWER or UPPER (for GDE output only).

-seqnos=

OFF or ON (for Clustal output only).

-seqnos_range=

OFF or ON (NEW: for all output formats).

-range=m,n

Sequence range to write, from position m to position m+n.

-maxseqlen=n

Maximum allowed input sequence length.

-quiet

Reduce console output to minimum.

-stats=file

Log some alignment statistics to file.
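
Because -range takes a start position and an offset rather than a start and an end, a small conversion can help. The sketch below assumes the semantics stated above (the range runs from m to m+n); the helper name is hypothetical.

```python
def range_option(first, last):
    """Build a -range option covering positions first..last.

    Assumes, as described on this page, that -range=m,n writes
    positions m to m+n, so n is the offset last - first.
    """
    if last < first:
        raise ValueError("last must not be smaller than first")
    return f"-range={first},{last - first}"


print(range_option(10, 60))  # prints -range=10,50
```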

Fast Pairwise Alignments:

-ktuple=n

Word size.

-topdiags=n

Number of best diagonals.

-window=n

Window around best diagonals.

-pairgap=n

Gap penalty.

-score=

PERCENT or ABSOLUTE.
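
The fast pairwise parameters only matter when the FAST guide-tree algorithm is selected with -quicktree. The following sketch combines them in one command; the numeric values are arbitrary illustrations rather than recommended settings, the helper name is hypothetical, and the `=` form of -score is assumed by analogy with the other valued options.

```python
def build_quicktree_command(infile, ktuple, topdiags, window, pairgap,
                            score="PERCENT"):
    """Return the argument list for an alignment using fast pairwise scoring."""
    if score not in ("PERCENT", "ABSOLUTE"):
        raise ValueError("score must be PERCENT or ABSOLUTE")
    return [
        "clustalw",
        f"-infile={infile}",
        "-align",
        "-quicktree",
        f"-ktuple={ktuple}",
        f"-topdiags={topdiags}",
        f"-window={window}",
        f"-pairgap={pairgap}",
        f"-score={score}",
    ]


# Hypothetical input file and illustrative values only.
print(" ".join(build_quicktree_command("seqs.fasta", 1, 5, 5, 3)))
```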

Slow Pairwise Alignments:

-pwmatrix=

Protein weight matrix: BLOSUM, PAM, GONNET, ID or filename.

-pwdnamatrix=

DNA weight matrix: IUB, CLUSTALW or filename.

-pwgapopen=f

Gap opening penalty.

-pwgapext=f

Gap extension penalty.

Multiple Alignments:

-newtree=

File for new guide tree.

-usetree=

File for old guide tree.

-matrix=

Protein weight matrix: BLOSUM, PAM, GONNET, ID or filename.

-dnamatrix=

DNA weight matrix: IUB, CLUSTALW or filename.

-gapopen=f

Gap opening penalty.

-gapext=f

Gap extension penalty.

-endgaps

No end gap separation penalty.

-gapdist=n

Gap separation penalty range.

-nogap

Residue-specific gaps off.

-nohgap

Hydrophilic gaps off.

-hgapresidues=

List of hydrophilic residues.

-maxdiv=n

Percent identity for delay.

-type=

PROTEIN or DNA.

-transweight=f

Transitions weighting.

-iteration=

NONE or TREE or ALIGNMENT.

-numiter=n

Maximum number of iterations to perform.
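
The options fall into two shapes: bare switches such as -align or -nohgap, and valued settings such as -gapopen=f. A minimal sketch of turning a settings table into a command line follows; the function name, the file name and the numeric values are assumptions chosen for illustration, not defaults of the program.

```python
def to_argv(options):
    """Convert a dict of option names to a clustalw argument list.

    True becomes a bare switch (-name); False or None is omitted;
    any other value becomes -name=value.
    """
    argv = ["clustalw"]
    for name, value in options.items():
        if value is True:
            argv.append(f"-{name}")
        elif value is not None and value is not False:
            argv.append(f"-{name}={value}")
    return argv


# Hypothetical file name and illustrative values only.
settings = {
    "infile": "seqs.fasta",
    "align": True,
    "type": "PROTEIN",
    "matrix": "GONNET",
    "gapopen": 10.0,
    "gapext": 0.2,
    "nohgap": False,      # left out of the command line
    "iteration": "TREE",
    "numiter": 3,
}
print(" ".join(to_argv(settings)))
```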

Profile Alignments:

-profile

Merge two alignments by profile alignment.

-newtree1=

File for new guide tree for profile1.

-newtree2=

File for new guide tree for profile2.

-usetree1=

File for old guide tree for profile1.

-usetree2=

File for old guide tree for profile2.

Sequence to Profile Alignments:

-sequences

Sequentially add profile2 sequences to profile1 alignment.

-newtree=

File for new guide tree.

-usetree=

File for old guide tree.
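
Profile mode takes its input from -profile1 and -profile2 instead of -infile, and the verb decides how the two are combined. The sketch below shows both variants; the helper name and file names are hypothetical.

```python
def build_profile_command(profile1, profile2, outfile, add_sequentially=False):
    """Return the argument list for a profile-based alignment.

    add_sequentially=False merges the two alignments (-profile);
    add_sequentially=True adds the profile2 sequences one at a time
    to the profile1 alignment (-sequences).
    """
    verb = "-sequences" if add_sequentially else "-profile"
    return [
        "clustalw",
        f"-profile1={profile1}",
        f"-profile2={profile2}",
        verb,
        f"-outfile={outfile}",
    ]


# Hypothetical file names, for illustration only.
print(" ".join(build_profile_command("family.aln", "new.aln", "merged.aln")))
print(" ".join(build_profile_command("family.aln", "new.fasta", "merged.aln",
                                     add_sequentially=True)))
```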

Structure Alignments:

-nosecstr1

Do not use secondary structure-gap penalty mask for profile 1.

-nosecstr2

Do not use secondary structure-gap penalty mask for profile 2.

-secstrout=STRUCTURE or MASK or BOTH or NONE

Output in alignment file.

-helixgap=n

Gap penalty for helix core residues.

-strandgap=n

Gap penalty for strand core residues.

-loopgap=n

Gap penalty for loop regions.

-terminalgap=n

Gap penalty for structure termini.

-helixendin=n

Number of residues inside helix to be treated as terminal.

-helixendout=n

Number of residues outside helix to be treated as terminal.

-strandendin=n

Number of residues inside strand to be treated as terminal.

-strandendout=n

Number of residues outside strand to be treated as terminal.

Trees:

-outputtree=nj OR phylip OR dist OR nexus

-seed=n

Seed number for bootstraps.

-kimura

Use Kimura's correction.

-tossgaps

Ignore positions with gaps.

-bootlabels=node

Position of bootstrap values in tree display.

-clustering=

NJ or UPGMA.
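
The tree options modify the -tree and -bootstrap verbs. As a sketch under the assumption that the input file already holds an alignment, the helper below (a hypothetical name) builds either a plain NJ tree command or a bootstrapped one with a fixed seed.

```python
# Tree output formats as listed for -outputtree on this page.
TREE_FORMATS = {"nj", "phylip", "dist", "nexus"}


def build_tree_command(infile, bootstraps=None, seed=None,
                       kimura=False, tossgaps=False, outputtree=None):
    """Return the argument list for a tree or bootstrap calculation."""
    argv = ["clustalw", f"-infile={infile}"]
    # -tree calculates a tree; -bootstrap=n also bootstraps it n times.
    argv.append("-tree" if bootstraps is None else f"-bootstrap={bootstraps}")
    if seed is not None:
        argv.append(f"-seed={seed}")
    if kimura:
        argv.append("-kimura")
    if tossgaps:
        argv.append("-tossgaps")
    if outputtree is not None:
        if outputtree not in TREE_FORMATS:
            raise ValueError(f"tree format not listed: {outputtree}")
        argv.append(f"-outputtree={outputtree}")
    return argv


# Hypothetical file name; 1000 is the default stated for -bootstrap.
print(" ".join(build_tree_command("seqs.aln", bootstraps=1000, seed=42,
                                  kimura=True, outputtree="phylip")))
```

Fixing the seed makes a bootstrap run repeatable, which is useful when comparing parameter choices.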

BUGS

The Clustal bug tracking system can be found at http://bioinf.ucd.ie/bugzilla/buglist.cgi?quicksearch=clustal.

REFERENCES

• Larkin MA, Blackshields G, Brown NP, Chenna R, McGettigan PA, McWilliam H, Valentin F, Wallace IM, Wilm A, Lopez R, Thompson JD, Gibson TJ, Higgins DG. (2007). Clustal W and Clustal X version 2.0. [1] Bioinformatics, 23, 2947-2948.

• Chenna R, Sugawara H, Koike T, Lopez R, Gibson TJ, Higgins DG, Thompson JD. (2003). Multiple sequence alignment with the Clustal series of programs. [2] Nucleic Acids Res., 31, 3497-3500.

• Jeanmougin F, Thompson JD, Gouy M, Higgins DG, Gibson TJ. (1998). Multiple sequence alignment with Clustal X. [3] Trends Biochem Sci., 23, 403-405.

• Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins DG. (1997). The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. [4] Nucleic Acids Res., 25, 4876-4882.

• Higgins DG, Thompson JD, Gibson TJ. (1996). Using CLUSTAL for multiple sequence alignments. [5] Methods Enzymol., 266, 383-402.

• Thompson JD, Higgins DG, Gibson TJ. (1994). CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. [6] Nucleic Acids Res., 22, 4673-4680.

• Higgins DG. (1994). CLUSTAL V: multiple alignment of DNA and protein sequences. [7] Methods Mol Biol., 25, 307-318.

• Higgins DG, Bleasby AJ, Fuchs R. (1992). CLUSTAL V: improved software for multiple sequence alignment. [8] Comput. Appl. Biosci., 8, 189-191.

• Higgins DG, Sharp PM. (1989). Fast and sensitive multiple sequence alignments on a microcomputer. [9] Comput. Appl. Biosci., 5, 151-153.

• Higgins DG, Sharp PM. (1988). CLUSTAL: a package for performing multiple sequence alignment on a microcomputer. [10] Gene, 73, 237-244.

AUTHORS

Des Higgins

Julie Thompson

Toby Gibson

Charles Plessy <plessy@debian.org>

Prepared this manpage in DocBook XML for the Debian distribution.

COPYRIGHT

This program is free software: you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public License along with this program. If not, see http://www.gnu.org/licenses/, or on Debian systems, /usr/share/common-licenses/LGPL-3.

This manual page and its XML source can be used, modified, and redistributed as if they were in the public domain.

NOTES

1. Clustal W and Clustal X version 2.0.
http://www.ncbi.nlm.nih.gov/pubmed/17846036

2. Multiple sequence alignment with the Clustal series of programs.
http://www.ncbi.nlm.nih.gov/pubmed/12824352

3. Multiple sequence alignment with Clustal X
http://www.ncbi.nlm.nih.gov/pubmed/9810230

4. The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools.
http://www.ncbi.nlm.nih.gov/pubmed/9396791

5. Using CLUSTAL for multiple sequence alignments.
http://www.ncbi.nlm.nih.gov/pubmed/8743695

6. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice.
http://www.ncbi.nlm.nih.gov/pubmed/7984417

7. CLUSTAL V: multiple alignment of DNA and protein sequences.
http://www.ncbi.nlm.nih.gov/pubmed/8004173

8. CLUSTAL V: improved software for multiple sequence alignment.
http://www.ncbi.nlm.nih.gov/pubmed/1591615

9. Fast and sensitive multiple sequence alignments on a microcomputer.
http://www.ncbi.nlm.nih.gov/pubmed/2720464

10. CLUSTAL: a package for performing multiple sequence alignment on a microcomputer.
http://www.ncbi.nlm.nih.gov/pubmed/3243435