Man page - gt-extractfeat(1)

GT-EXTRACTFEAT

NAME

gt-extractfeat - Extract features given in a GFF3 file from a sequence file.

SYNOPSIS

gt extractfeat [option ...] [GFF3_file]
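
As a rough illustration of the synopsis, the sketch below assembles one possible command line from Python. It only uses options documented in this manual (-type, -seqfile, -coords). The file names annotation.gff3 and genome.fa and the helper extractfeat_command are hypothetical, and the command is printed rather than executed.

```python
import shlex


def extractfeat_command(gff3_file, seqfile, feature_type, extra_options=()):
    """Build an argument list following the synopsis: options first, GFF3 file last.

    Hypothetical helper for illustration; it is not part of GenomeTools.
    """
    argv = ["gt", "extractfeat", "-type", feature_type, "-seqfile", seqfile]
    argv.extend(extra_options)
    argv.append(gff3_file)
    return argv


# File names are placeholders; substitute your own annotation and sequence files.
cmd = extractfeat_command("annotation.gff3", "genome.fa", "exon", ["-coords", "yes"])
print(shlex.join(cmd))
```

The resulting list could be passed to subprocess.run on a system where gt is installed.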

DESCRIPTION

-type [ string ]

set type of features to extract (default: undefined)

-join [ yes|no ]

join feature sequences in the same subgraph into a single one (default: no)

-translate [ yes|no ]

translate the features (of a DNA sequence) into protein (default: no)

-seqid [ yes|no ]

add sequence ID of extracted features to FASTA descriptions (default: no)

-target [ yes|no ]

add target ID(s) of extracted features to FASTA descriptions (default: no)

-coords [ yes|no ]

add location of extracted features to FASTA descriptions (default: no)

-retainids [ yes|no ]

use ID attributes of extracted features as FASTA descriptions (default: no)

-gcode [ value ]

specify genetic code to use (default: 1)

-seqfile [ filename ]

set the sequence file from which to take the sequences (default: undefined)

-encseq [ filename ]

set the encoded sequence indexname from which to take the sequences (default: undefined)

-seqfiles

set the sequence files from which to extract the features; use -- to terminate the list of sequence files

-matchdesc [ yes|no ]

search the sequence descriptions from the input files for the desired sequence IDs (in GFF3), reporting the first match (default: no)

-matchdescstart [ yes|no ]

exactly match the sequence descriptions from the input files for the desired sequence IDs (in GFF3) from the beginning to the first whitespace (default: no)

-usedesc [ yes|no ]

use sequence descriptions to map the sequence IDs (in GFF3) to actual sequence entries. If a description contains a sequence range (e.g., III:1000001..2000000), the first part is used as sequence ID (III) and the first range position as offset (1000001) (default: no)

-regionmapping [ string ]

set file containing sequence-region to sequence file mapping (default: undefined)

-v [ yes|no ]

be verbose (default: no)

-width [ value ]

set output width for FASTA sequence printing (0 disables formatting) (default: 0)

-o [ filename ]

redirect output to specified file (default: undefined)

-gzip [ yes|no ]

write gzip compressed output file (default: no)

-bzip2 [ yes|no ]

write bzip2 compressed output file (default: no)

-force [ yes|no ]

force writing to output file (default: no)

-help

display help and exit

-version

display version information and exit
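
The description handling outlined for -usedesc above can be sketched as follows. This is a minimal illustration and not the GenomeTools implementation: the regular expression, the helper names parse_description and to_local, and the coordinate conversion (GFF3 position minus offset plus one) are assumptions based only on the option text.

```python
import re

# Assumed shape of a range description: "<seqid>:<start>..<end>", e.g. "III:1000001..2000000"
RANGE_DESC = re.compile(r"^(?P<seqid>[^\s:]+):(?P<start>\d+)\.\.(?P<end>\d+)")


def parse_description(desc):
    """Return (sequence ID, offset) for a FASTA description line.

    With a range, the first part is the sequence ID and the first range
    position is the offset. Without a range, this sketch falls back to the
    first whitespace-delimited token and an offset of 1 (an assumption).
    """
    match = RANGE_DESC.match(desc)
    if match is None:
        tokens = desc.split()
        return (tokens[0] if tokens else ""), 1
    return match.group("seqid"), int(match.group("start"))


def to_local(gff3_position, offset):
    """Map a 1-based GFF3 coordinate to a 1-based position in the stored entry.

    Assumption: the entry's first base corresponds to coordinate `offset`.
    """
    return gff3_position - offset + 1


seqid, offset = parse_description("III:1000001..2000000")
print(seqid, offset, to_local(1000500, offset))
```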

Genetic code numbers for option -gcode:

1: Standard
2: Vertebrate Mitochondrial
3: Yeast Mitochondrial
4: Mold Mitochondrial; Protozoan Mitochondrial; Coelenterate Mitochondrial; Mycoplasma; Spiroplasma
5: Invertebrate Mitochondrial
6: Ciliate Nuclear; Dasycladacean Nuclear; Hexamita Nuclear
9: Echinoderm Mitochondrial; Flatworm Mitochondrial
10: Euplotid Nuclear
11: Bacterial, Archaeal and Plant Plastid
12: Alternative Yeast Nuclear
13: Ascidian Mitochondrial
14: Alternative Flatworm Mitochondrial
15: Blepharisma Macronuclear
16: Chlorophycean Mitochondrial
21: Trematode Mitochondrial
22: Scenedesmus obliquus Mitochondrial
23: Thraustochytrium Mitochondrial
24: Pterobranchia Mitochondrial
25: Candidate Division SR1 and Gracilibacteria
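
To show why the choice of genetic code matters for -translate, the sketch below translates the same DNA string with code 1 (Standard) and code 2 (Vertebrate Mitochondrial). It is an illustration only, not how gt performs translation: the translate function is hypothetical, only two tables are included, and details such as stop-codon handling or ambiguous bases in GenomeTools are not covered by this manual. The amino-acid strings follow the NCBI convention with bases ordered T, C, A, G.

```python
BASES = "TCAG"

# Amino acids for the 64 codons in TCAG order (TTT, TTC, TTA, TTG, TCT, ...).
CODON_TABLES = {
    1: "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG",
    2: "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG",
}


def translate(dna, gcode=1):
    """Translate DNA in frame 1 using the given genetic code number.

    Unknown codons become 'X'; trailing bases that do not fill a codon are
    ignored. Both choices are assumptions of this sketch.
    """
    amino_acids = CODON_TABLES[gcode]
    table = {
        b1 + b2 + b3: amino_acids[16 * i + 4 * j + k]
        for i, b1 in enumerate(BASES)
        for j, b2 in enumerate(BASES)
        for k, b3 in enumerate(BASES)
    }
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(table.get(dna[pos:pos + 3], "X") for pos in range(0, usable, 3))


print(translate("ATGTGAAGA", gcode=1))  # TGA is a stop codon, AGA is Arg
print(translate("ATGTGAAGA", gcode=2))  # TGA is Trp, AGA is a stop codon
```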

File format for option -regionmapping:

The file supplied to option -regionmapping defines a “mapping”. A mapping maps the sequence-region entries given in the GFF3_file to a sequence file containing the corresponding sequence. Mappings can be defined in one of the following two forms:

mapping = {
  chr1 = "hs_ref_chr1.fa.gz",
  chr2 = "hs_ref_chr2.fa.gz"
}

or

function mapping(sequence_region)
  return "hs_ref_"..sequence_region..".fa.gz"
end

The first form defines a Lua (http://www.lua.org) table named “mapping” which maps each sequence region to the corresponding sequence file. The second one defines a Lua function “mapping”, which has to return the sequence file name when it is called with the sequence_region as argument.
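
When there are many sequence regions, a mapping file in the first (table) form could be generated rather than written by hand. The following is a minimal sketch under stated assumptions: the helpers lua_quote and lua_mapping_table are hypothetical and not part of GenomeTools, and keys are written in Lua's bracketed string form (["chr1"] = ...), which is equivalent to the bare-name form shown above but also works for region names that are not valid Lua identifiers.

```python
def lua_quote(text):
    """Return text as a double-quoted Lua string literal (escapes \\ and ")."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def lua_mapping_table(region_to_file):
    """Render a dict of sequence region -> sequence file as a Lua 'mapping' table."""
    entries = [
        f"  [{lua_quote(region)}] = {lua_quote(path)}"
        for region, path in sorted(region_to_file.items())
    ]
    return "mapping = {\n" + ",\n".join(entries) + "\n}\n"


# Region and file names are taken from the example above.
text = lua_mapping_table({
    "chr1": "hs_ref_chr1.fa.gz",
    "chr2": "hs_ref_chr2.fa.gz",
})
print(text, end="")
```

The printed text could be saved to a file and passed to -regionmapping.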

REPORTING BUGS

Report bugs to https://github.com/genometools/genometools/issues.