Man page - alignment-thin(1)

NAME
SYNOPSIS
DESCRIPTION
GENERAL OPTIONS:
SEQUENCE FILTERING OPTIONS:
COLUMN FILTERING OPTIONS:
OUTPUT OPTIONS:
EXAMPLES:
REPORTING BUGS:
AUTHORS

NAME

alignment-thin - Remove sequences or columns from an alignment.

SYNOPSIS

alignment-thin alignment-file [OPTIONS]

DESCRIPTION

Remove sequences or columns from an alignment.

GENERAL OPTIONS:

-h, --help

Print usage information.

-V, --verbose

Output more log messages on stderr.

SEQUENCE FILTERING OPTIONS:

-p arg, --protect arg

Sequences that cannot be removed (comma-separated).

-k arg, --keep arg

Remove sequences not in comma-separated list arg.

-r arg, --remove arg

Remove sequences in comma-separated list arg.

-l arg, --longer-than arg

Remove sequences not longer than arg.

-s arg, --shorter-than arg

Remove sequences not shorter than arg.

-c arg, --cutoff arg

Remove similar sequences with #mismatches < cutoff.

-d arg, --down-to arg

Remove similar sequences down to arg sequences.

--remove-gappy arg

Remove arg outlier sequences, defined as sequences that are missing too many conserved sites.

--conserved arg (=0.75)

Fraction of sequences that must contain a letter for it to be considered conserved.
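
The relationship between --conserved and --remove-gappy can be sketched in Python. This is an illustration under stated assumptions, not the implementation used by alignment-thin: it assumes that "-" is the only gap character, that a column counts as conserved when at least the given fraction of sequences has a letter there, and that outliers are ranked by how many conserved columns they lack. All function names are hypothetical.

```python
GAP = "-"  # assumption: '-' is the only gap character


def conserved_columns(alignment, fraction=0.75):
    """Return indices of columns where at least `fraction` of sequences have a letter."""
    n_seqs = len(alignment)
    n_cols = len(next(iter(alignment.values())))
    conserved = []
    for col in range(n_cols):
        letters = sum(1 for seq in alignment.values() if seq[col] != GAP)
        if letters >= fraction * n_seqs:
            conserved.append(col)
    return conserved


def missing_conserved(alignment, fraction=0.75):
    """Count, for each sequence, the conserved columns in which it has a gap."""
    cols = conserved_columns(alignment, fraction)
    return {name: sum(1 for c in cols if seq[c] == GAP)
            for name, seq in alignment.items()}


def remove_gappiest(alignment, n_remove, fraction=0.75, protect=()):
    """Drop the n_remove unprotected sequences that miss the most conserved columns.

    Assumption: ranking by missing-column count is only one plausible
    reading of "missing too many conserved sites".
    """
    missing = missing_conserved(alignment, fraction)
    candidates = [name for name in alignment if name not in protect]
    # sorted() is stable, so tied sequences keep their input order.
    worst = sorted(candidates, key=lambda name: missing[name], reverse=True)[:n_remove]
    return {name: seq for name, seq in alignment.items() if name not in worst}


# Toy alignment: names mapped to equal-length aligned sequences.
alignment = {
    "seq1": "ACGTACGT",
    "seq2": "ACGTAC-T",
    "seq3": "AC--A--T",
    "seq4": "ACGTACGT",
}
print(missing_conserved(alignment))
print(list(remove_gappiest(alignment, 1)))
```

In this toy alignment the seventh column has letters in only two of four sequences, so it is not conserved at the default fraction and a gap there does not count against a sequence.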

COLUMN FILTERING OPTIONS:

-K arg, --keep-columns arg

Keep columns from this sequence.

-m arg, --min-letters arg

Remove columns with fewer than arg letters.

-u arg, --remove-unique arg

Remove insertions in a single sequence if longer than arg letters.

-e, --erase-empty-columns

Remove columns with no characters (all gaps).
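
The column filters --min-letters and --erase-empty-columns can be pictured with a short Python sketch. It is not the tool's own code: it assumes that "-" is the only gap character and that a "letter" is any non-gap character. The function names are hypothetical.

```python
GAP = "-"  # assumption: '-' is the only gap character


def letters_per_column(alignment):
    """Count the non-gap characters in each column."""
    seqs = list(alignment.values())
    return [sum(1 for s in seqs if s[c] != GAP) for c in range(len(seqs[0]))]


def keep_columns(alignment, min_letters):
    """Remove columns with fewer than `min_letters` letters."""
    counts = letters_per_column(alignment)
    keep = [c for c, n in enumerate(counts) if n >= min_letters]
    return {name: "".join(seq[c] for c in keep)
            for name, seq in alignment.items()}


alignment = {"seq1": "AC-GT-", "seq2": "A--GT-", "seq3": "ACTG--"}

# A threshold of 1 removes only all-gap columns, as --erase-empty-columns describes.
print(keep_columns(alignment, 1))
# A threshold of 3 keeps only columns where every sequence has a letter.
print(keep_columns(alignment, 3))
```

Under these assumptions, erasing empty columns is the special case of the minimum-letters filter with a threshold of one.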

OUTPUT OPTIONS:

-S, --sort

Sort partially ordered columns to group similar gaps.

-L, --show-lengths

Just print out sequence lengths.

-N, --show-names

Just print out sequence names.

-F arg, --find-dups arg

For each sequence, find the closest other sequence.
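
The idea of a "closest other sequence" can be illustrated in Python. This sketch rests on assumptions that this page does not confirm: distance is taken to be the number of columns where both sequences have a letter and the letters differ, "-" is the only gap character, and the role of the option's arg is left out because it is not described here. The function names are hypothetical.

```python
GAP = "-"  # assumption: '-' is the only gap character


def mismatches(a, b):
    """Count columns where both sequences have a letter and the letters differ."""
    return sum(1 for x, y in zip(a, b) if x != GAP and y != GAP and x != y)


def closest_other(alignment):
    """Map each sequence name to (nearest other name, mismatch count)."""
    result = {}
    for name, seq in alignment.items():
        others = ((mismatches(seq, other), other_name)
                  for other_name, other in alignment.items()
                  if other_name != name)
        # min() on (distance, name) tuples breaks ties by name.
        distance, nearest = min(others)
        result[name] = (nearest, distance)
    return result


alignment = {"seq1": "ACGTAC", "seq2": "ACGTAA", "seq3": "TTGT-C"}
for name, (nearest, distance) in closest_other(alignment).items():
    print(name, nearest, distance)
```

The same mismatch count is one plausible way to read the threshold used by --cutoff, though how the tool treats gaps when counting differences is an assumption here.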

EXAMPLES:

Remove columns without a minimum number of letters:

% alignment-thin --min-letters=5 file.fasta > file-thinned.fasta

Remove sequences by name:

% alignment-thin --remove=seq1,seq2 file.fasta > file2.fasta

% alignment-thin --keep=seq1,seq2 file.fasta > file2.fasta

Remove short sequences:

% alignment-thin --longer-than=250 file.fasta > file-long.fasta

Remove similar sequences with <= 5 differences from the closest other sequence:

% alignment-thin --cutoff=5 file.fasta > more-than-5-differences.fasta

Remove similar sequences until we have the right number of sequences:

% alignment-thin --down-to=30 file.fasta > file-30taxa.fasta

Remove dissimilar sequences that are missing conserved columns:

% alignment-thin --remove-gappy=10 file.fasta > file2.fasta

Protect some sequences from being removed:

% alignment-thin --down-to=30 file.fasta --protect=seq1,seq2 > file2.fasta

% alignment-thin --down-to=30 file.fasta --protect=@filename > file2.fasta

REPORTING BUGS:

BAli-Phy online help: http://www.bali-phy.org/docs.php

Please send bug reports to bali-phy-users@googlegroups.com.

AUTHORS

Benjamin Redelings.