Man page - splazers(1)

SPLAZERS

NAME
SYNOPSIS
DESCRIPTION
REQUIRED ARGUMENTS
OPTIONS
Main Options:
Output Format Options:
Split Mapping Options:
Filtration Options:
Verification Options:

NAME

splazers - Split-map read sequences

SYNOPSIS

splazers [OPTIONS] <GENOME FILE> <READS FILE>
splazers [OPTIONS] <GENOME FILE> <READS FILE 1> <READS FILE 2>

DESCRIPTION

SplazerS uses a prefix-suffix mapping strategy to split-map read sequences. If a SAM file of mapped reads is given as input, all unmapped but anchored reads are split-mapped onto their anchoring target regions (specify option -an). If a FASTA/FASTQ file of reads is given, reads are split-mapped onto the whole reference sequence.

(c) Copyright 2010 by Anne-Katrin Emde.

REQUIRED ARGUMENTS

ARGUMENT 0 INPUT_FILE

A reference genome file. Valid filetypes are: .sam[.*], .raw[.*], .gbk[.*], .frn[.*], .fq[.*], .fna[.*], .ffn[.*], .fastq[.*], .fasta[.*], .faa[.*], .fa[.*], .embl[.*], and .bam, where * is any of the following extensions: gz, bz2, and bgzf for transparent (de)compression.

READS List of INPUT_FILE's

Either one (single-end) or two (paired-end) read files. Valid filetypes are: .sam[.*], .raw[.*], .gbk[.*], .frn[.*], .fq[.*], .fna[.*], .ffn[.*], .fastq[.*], .fasta[.*], .faa[.*], .fa[.*], .embl[.*], and .bam, where * is any of the following extensions: gz, bz2, and bgzf for transparent (de)compression.
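
The filetype rule above can be checked before starting a run. The following is a minimal sketch, not SplazerS's own validation: the function name has_valid_extension is hypothetical, and matching is assumed to be case-sensitive and based on the filename suffix only.

```python
import re

# Base extensions and compression suffixes, copied from the lists above.
BASE_EXTENSIONS = ("sam", "raw", "gbk", "frn", "fq", "fna", "ffn",
                   "fastq", "fasta", "faa", "fa", "embl")
COMPRESSION_SUFFIXES = ("gz", "bz2", "bgzf")

# A base extension with an optional compression suffix, or plain ".bam".
_VALID_SUFFIX = re.compile(
    r"\.(?:(?:{base})(?:\.(?:{comp}))?|bam)$".format(
        base="|".join(BASE_EXTENSIONS), comp="|".join(COMPRESSION_SUFFIXES)
    )
)


def has_valid_extension(filename: str) -> bool:
    """Return True if the filename ends in one of the listed filetypes."""
    return _VALID_SUFFIX.search(filename) is not None
```

Note that .bam is listed without the [.*] suffix, so this sketch rejects names such as mapped.bam.gz.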

OPTIONS

-h, --help

Display the help message.

--version

Display version information.

Main Options:

-o, --output OUTPUT_FILE

Change output filename. Default: <READS FILE>.result.

-f, --forward

Only compute forward matches.

-r, --reverse

Only compute reverse complement matches.

-i, --percent-identity DOUBLE

Percent identity threshold. In range [50..100]. Default: 92.
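
To get a feel for what a percent identity threshold means for a given read, it can be converted into an error budget. This is only a sketch under an assumption: the budget is taken to be floor(read length * (100 - identity) / 100). The rounding rule and the function name max_errors are not taken from this manual.

```python
import math


def max_errors(read_length: int, percent_identity: float = 92.0) -> int:
    """Error budget for one read (assumed rule: round down)."""
    if not 50.0 <= percent_identity <= 100.0:
        raise ValueError("percent identity must lie in [50..100]")
    # Multiply first, then divide, to keep common cases exact in floating point.
    return math.floor(read_length * (100.0 - percent_identity) / 100.0)
```

Under this assumption, a 100 bp read at the default of 92 would tolerate 8 errors, and a 36 bp read would tolerate 2.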

-rr, --recognition-rate DOUBLE

Set the percent recognition rate. In range [80..100]. Default: 99.

-pd, --param-dir STRING

Read user-computed parameter files in the directory <DIR>.

-id, --indels

Allow indels. Default: mismatches only.

-ll, --library-length INTEGER

Paired-end library length. In range [1..inf]. Default: 220.

-le, --library-error INTEGER

Paired-end library length tolerance. In range [0..inf]. Default: 50.
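
The two paired-end options together describe a window of accepted library lengths. The sketch below assumes the tolerance is applied symmetrically around the library length. That interpretation, and the names insert_size_window and is_within_library, are assumptions made for illustration.

```python
def insert_size_window(library_length: int = 220, library_error: int = 50):
    """Return (lowest, highest) accepted library length, assuming symmetry."""
    if library_length < 1 or library_error < 0:
        raise ValueError("library length must be >= 1 and tolerance >= 0")
    # A library length cannot be negative, so clamp the lower bound at 0.
    return max(0, library_length - library_error), library_length + library_error


def is_within_library(observed_length: int, library_length: int = 220,
                      library_error: int = 50) -> bool:
    """True if an observed pair distance falls inside the assumed window."""
    low, high = insert_size_window(library_length, library_error)
    return low <= observed_length <= high
```

With the defaults, the assumed window is 170 to 270.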

-m, --max-hits INTEGER

Output only <NUM> of the best hits. In range [1..inf]. Default: 100.

--unique

Output only unique best matches (-m 1 -dr 0 -pa).
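
The shorthand above can be made explicit when assembling an argument list in a wrapper script. This is an illustrative helper only; expand_unique is a hypothetical name and not part of SplazerS.

```python
def expand_unique(args):
    """Replace --unique by the equivalent flags given above: -m 1 -dr 0 -pa."""
    expanded = []
    for arg in args:
        if arg == "--unique":
            expanded.extend(["-m", "1", "-dr", "0", "-pa"])
        else:
            expanded.append(arg)
    return expanded
```

For example, expand_unique(["splazers", "--unique", "genome.fa", "reads.fq"]) yields an argument list in which --unique is spelled out as the three separate options.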

-tr, --trim-reads INTEGER

Trim reads to given length. In range [14..inf]. Default: off.

-mcl, --min-clipped-len INTEGER

Minimum read length for read clipping. In range [1..inf]. Default: 0.

-qih, --quality-in-header

Quality string in FASTA header.

-ou, --outputUnmapped OUTPUT_FILE

Output filename for unmapped reads.

-v, --verbose

Verbose mode.

-vv, --vverbose

Very verbose mode.

Output Format Options:

-a, --alignment

Dump the alignment for each match.

-pa, --purge-ambiguous

Purge reads with more than max-hits best matches.

-dr, --distance-range INTEGER

Only consider matches with at most NUM more errors compared to the best. Default: output all.
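
One possible reading of how -m, -pa and -dr interact for a single read is sketched below. Hits are represented only by their error counts, and the order in which the three filters are applied is an assumption. The function select_hits is hypothetical and is not SplazerS code.

```python
def select_hits(error_counts, max_hits=100, distance_range=None,
                purge_ambiguous=False):
    """Filter the hits of one read, each hit given as its number of errors."""
    if not error_counts:
        return []
    ranked = sorted(error_counts)
    best = ranked[0]
    # -pa: drop the read entirely if it has more than max_hits best matches.
    if purge_ambiguous and ranked.count(best) > max_hits:
        return []
    # -dr NUM: keep only hits with at most NUM more errors than the best.
    if distance_range is not None:
        ranked = [e for e in ranked if e <= best + distance_range]
    # -m NUM: report at most NUM of the best hits.
    return ranked[:max_hits]
```

Under this reading, a read with hits at 0, 0 and 1 errors is purged when max_hits is 1 and -pa is set, because its best score is not unique.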

-of, --output-format INTEGER

Set output format. 0 = RazerS, 1 = Enhanced Fasta, 2 = Eland, 3 = GFF, 4 = SAM. In range [0..4].

-gn, --genome-naming INTEGER

Select how genomes are named. 0 = use Fasta id, 1 = enumerate beginning with 1. In range [0..1]. Default: 0.

-rn, --read-naming INTEGER

Select how reads are named. 0 = use Fasta id, 1 = enumerate beginning with 1. In range [0..1]. Default: 0.

-so, --sort-order INTEGER

Select how matches are sorted. 0 = read number, 1 = genome position. In range [0..1]. Default: 0.

-pf, --position-format INTEGER

Select begin/end position numbering (see Coordinate section below). 0 = gap space, 1 = position space. In range [0..1]. Default: 0.
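
The two numbering schemes can be converted into each other. The sketch below rests on an assumption that this page does not spell out: gap space counts the gaps between characters starting at 0 (begin inclusive, end exclusive), and position space counts characters starting at 1 (both ends inclusive). The function names are hypothetical.

```python
def gap_to_position(begin: int, end: int):
    """Convert assumed gap-space [begin, end) to position-space [begin, end]."""
    return begin + 1, end


def position_to_gap(begin: int, end: int):
    """Convert assumed position-space [begin, end] to gap-space [begin, end)."""
    return begin - 1, end
```

Under this assumption, the substring AB of XABY spans 1/3 in gap space and 2/3 in position space.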

Split Mapping Options:

-sm, --split-mapping INTEGER

Minimum match length for prefix/suffix mapping (to disable split mapping, set to 0). Default: 18.

-maxG, --max-gap INTEGER

Maximum length of middle gap. Default: 10000.

-minG, --min-gap INTEGER

Minimum length of middle gap (for edit distance mapping, about 10% of read length is recommended). Default: 0.
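
The three length constraints above can be combined into a simple acceptance test for a candidate split. This is a sketch of the constraints as described, not of the SplazerS algorithm. The names is_valid_split and recommended_min_gap are hypothetical, and "about 10%" is taken to mean integer division by 10.

```python
def is_valid_split(prefix_len: int, suffix_len: int, gap_len: int,
                   min_match: int = 18, min_gap: int = 0,
                   max_gap: int = 10000) -> bool:
    """Check a candidate split against -sm, -minG and -maxG."""
    if min_match == 0:
        # -sm 0 disables split mapping, so no split is accepted.
        return False
    return (prefix_len >= min_match
            and suffix_len >= min_match
            and min_gap <= gap_len <= max_gap)


def recommended_min_gap(read_length: int) -> int:
    """Roughly 10% of the read length, as suggested for edit distance mapping."""
    return read_length // 10
```

For a 100 bp read mapped with indels allowed, recommended_min_gap(100) suggests a -minG value of 10.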

-ep, --errors-prefix INTEGER

Maximum number of errors in prefix match. Default: 1.

-es, --errors-suffix INTEGER

Maximum number of errors in suffix match. Default: 1.

-gl, --genome-len INTEGER

Genome length in Mb, for computation of expected number of random matches. In range [-inf..10000]. Default: 3000.

-an, --anchored

Anchored split mapping: only unmapped reads with mapped mates are considered. Requires the reads to be given in SAM format.
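
Because -an only makes sense with SAM input, a wrapper script might guard against passing it with other read formats. The helper below is a hypothetical sketch that only assembles an argument list in the order shown in the SYNOPSIS. It does not run SplazerS, and the file names in the usage note are placeholders.

```python
SAM_SUFFIXES = (".sam", ".sam.gz", ".sam.bz2", ".sam.bgzf")


def build_command(genome, reads, anchored=False, output=None):
    """Assemble a splazers argument list: options, genome file, read file(s)."""
    reads = list(reads)
    if len(reads) not in (1, 2):
        raise ValueError("expected one (single-end) or two (paired-end) read files")
    if anchored and not all(r.endswith(SAM_SUFFIXES) for r in reads):
        raise ValueError("-an requires the reads to be given in SAM format")
    command = ["splazers"]
    if anchored:
        command.append("-an")
    if output is not None:
        command.extend(["-o", output])
    return command + [genome] + reads
```

For example, build_command("genome.fa", ["mapped.sam"], anchored=True) gives the list splazers, -an, genome.fa, mapped.sam, which could then be handed to subprocess.run.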

-pc, --penalty-c INTEGER

Percent of read length, used as penalty for split-gap. Default: 2.

Filtration Options:

-oc, --overabundance-cut INTEGER

Set k-mer overabundance cut ratio. In range [0..1].

-rl, --repeat-length INTEGER

Skip simple-repeats of length <NUM>. In range [1..inf]. Default: 1000.

-tl, --taboo-length INTEGER

Set taboo length. In range [1..inf]. Default: 1.

-lm, --low-memory

Decrease memory usage at the expense of runtime.

Verification Options:

-mN, --match-N

N matches all other characters. Default: N matches nothing.
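
The effect of -mN on verification can be pictured as a change in the character comparison. The sketch below assumes that "N matches nothing" also covers N against N, and it ignores case handling. bases_match and count_mismatches are hypothetical names.

```python
def bases_match(a: str, b: str, match_n: bool = False) -> bool:
    """Compare two characters; N is a wildcard only when match_n is set."""
    if a == "N" or b == "N":
        # Default: N matches nothing (assumed to include N against N).
        return match_n
    return a == b


def count_mismatches(read: str, reference: str, match_n: bool = False) -> int:
    """Count mismatching columns of two equal-length sequences."""
    if len(read) != len(reference):
        raise ValueError("sequences must have equal length")
    return sum(not bases_match(a, b, match_n) for a, b in zip(read, reference))
```

Under these assumptions, the read ACNT against ACGT costs one error by default and none with -mN.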

-ed, --error-distr STRING

Write error distribution to FILE.