Man page - ccextractor(1)
NAME
CCExtractor - A closed caption software decoder
SYNOPSIS
ccextractor [options] inputfile1 [inputfile2...] [-o outputfilename] [-o1 outputfilename1] [-o2 outputfilename2]
DESCRIPTION
Extracts closed captions from MPEG files.
(DVB, .TS, ReplayTV 4000 and 5000, dvr-ms, bttv, Tivo and Dish Network are known to work).
ccextractor reads a video stream looking for closed captions (subtitles).
It can do two things:
- Save the data to a "raw", unprocessed file which you can later use as input for other tools, such as McPoodle's excellent suite.
- Generate a subtitles file (.srt, .smi, or .txt) which you can directly use with your favourite player.
OPTIONS
File name related options:
inputfile: file(s) to process
-o outputfilename: Use -o parameters to define the output file names if you don't like the default ones.
Default: (same as infile plus _1 or _2 when needed and .raw or .srt extension).
-o or -o1 -> Name of the first (maybe only) output file.
-o2 -> Name of the second output file, when it applies.
-cf filename: Write "clean" data to a file.
Clean means the ES without TS or PES headers.
You can pass as many input files as you need. They will be processed in order.
If a file name is suffixed by +, ccextractor will try to follow a numerical sequence.
For example, DVD001.VOB+ means DVD001.VOB, DVD002.VOB and so on until there are no more files.
Output will be one single file (either raw or srt). Use this if you made your recording in several cuts (to skip commercials for example) but you want one subtitle file with contiguous timing.
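The numerical sequence rule can be pictured with a short Python sketch. It is only an illustration of the behaviour described above, not CCExtractor's own code: the helper name expand_sequence is hypothetical, and the sketch assumes the counter is the run of digits at the end of the file name, just before the extension.

```python
import re
from pathlib import Path


def expand_sequence(first_name):
    """Return first_name plus its numbered successors that exist on disk.

    Hypothetical helper: assumes the counter is the trailing run of digits
    in the file stem (the "001" in DVD001.VOB).
    """
    path = Path(first_name)
    match = re.search(r"(\d+)$", path.stem)
    if match is None:
        # No counter in the name: there is nothing to follow.
        return [str(path)] if path.exists() else []

    prefix = path.stem[:match.start()]
    width = len(match.group(1))      # keep the zero padding (001, 002, ...)
    number = int(match.group(1))

    found = []
    while True:
        candidate = path.with_name(f"{prefix}{number:0{width}d}{path.suffix}")
        if not candidate.exists():
            break                    # first missing number ends the sequence
        found.append(str(candidate))
        number += 1
    return found
```

With DVD001.VOB, DVD002.VOB and DVD004.VOB in a directory, the sketch stops after DVD002.VOB because DVD003.VOB is missing.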
Options that affect what will be processed:
-1 , -2 , -12 : Output Field 1 data, Field 2 data, or both
Default: -1
-cc2 : When in srt/sami mode, process captions in channel 2 instead of channel 1.
In general, if you want English subtitles you don't need to use these options, as they are broadcast in field 1, channel 1.
If you want the second language (usually Spanish) you may need to try -2, or -cc2, or both.
Input formats:
With the exception of McPoodle's raw format, which is just the closed caption data with no other info, CCExtractor can usually detect the input format correctly.
To force a specific format:
-in=format
Where format is one of these:
ts -> For Transport Streams.
ps -> For Program Streams.
es -> For Elementary Streams.
asf -> ASF container (such as DVR-MS).
bin -> CCExtractor's own binary format.
raw -> For McPoodle's raw files.
-ts , -ps , -es and -asf (or --dvr-ms ) can be used as shortcuts.
Output formats:
-out=format
Where format is one of these:
srt -> SubRip (default, so not actually needed).
sami -> MS Synchronized Accessible Media Interface.
bin -> CC data in CCExtractor's own binary format.
raw -> CC data in McPoodle's Broadcast format.
dvdraw -> CC data in McPoodle's DVD format.
txt -> Transcript (no time codes, no roll-up captions, just the plain transcription).
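For scripted use, the format switches can be assembled programmatically. The Python sketch below only builds an argument list from the -in=format, -out=format and -o options documented here; it does not run anything. The function name build_command is hypothetical, and it assumes the executable is reachable as "ccextractor".

```python
INPUT_FORMATS = {"ts", "ps", "es", "asf", "bin", "raw"}
OUTPUT_FORMATS = {"srt", "sami", "bin", "raw", "dvdraw", "txt"}


def build_command(input_files, out_format="srt", in_format=None, output=None):
    """Build a ccextractor argument list (hypothetical helper).

    Assumes the program is on PATH under the name "ccextractor".
    """
    if out_format not in OUTPUT_FORMATS:
        raise ValueError(f"unknown output format: {out_format}")
    if in_format is not None and in_format not in INPUT_FORMATS:
        raise ValueError(f"unknown input format: {in_format}")

    argv = ["ccextractor"]
    if in_format is not None:
        argv.append(f"-in={in_format}")   # only needed when autodetection fails
    argv.append(f"-out={out_format}")
    argv.extend(input_files)
    if output is not None:
        argv.extend(["-o", output])
    return argv
```

The resulting list can be handed to subprocess.run() as-is, which avoids shell quoting problems with file names that contain spaces.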
Options that affect how input files will be processed:
-gt --goptime : Use GOP for timing instead of PTS.
This only applies to Program or Transport Streams with MPEG2 data and overrides the default PTS timing.
GOP timing is always used for Elementary Streams.
-fp --fixpadding : Fix padding.
Some cards (or providers, or whatever) seem to send 0000 as CC padding instead of 8080.
If you get bad timing, this might solve it.
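As a rough picture of what fixing the padding means, the Python sketch below rewrites 0000 byte pairs as 8080. It is a simplification, not CCExtractor's implementation: the function name fix_padding is hypothetical, and the input is assumed to be a flat sequence of two-byte caption pairs with no surrounding container structure.

```python
def fix_padding(cc_pairs):
    """Replace 00 00 padding pairs with 80 80 (hypothetical helper).

    Assumes cc_pairs is a bytes object made only of aligned two-byte
    caption pairs; real streams wrap these pairs in other structures.
    """
    if len(cc_pairs) % 2:
        raise ValueError("caption data must contain whole byte pairs")

    fixed = bytearray()
    for offset in range(0, len(cc_pairs), 2):
        pair = cc_pairs[offset:offset + 2]
        # Only a complete, aligned 00 00 pair counts as broken padding.
        fixed += b"\x80\x80" if pair == b"\x00\x00" else pair
    return bytes(fixed)
```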
-90090 : Use 90090 (instead of 90000) as MPEG clock frequency.
(reported to be needed at least by Panasonic DMR-ES15 DVD Recorder)
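The effect of the clock frequency is plain arithmetic: a time stamp in ticks is divided by the clock rate to get seconds. The Python sketch below shows how the same tick count maps to different times under the two rates; the function name pts_to_seconds is hypothetical.

```python
def pts_to_seconds(ticks, clock_hz=90000):
    """Convert a time stamp in clock ticks to seconds (hypothetical helper)."""
    return ticks / clock_hz


# The same tick count read with both clock rates.
ticks = 8108100
with_default_clock = pts_to_seconds(ticks)           # about 90.09 seconds
with_90090_clock = pts_to_seconds(ticks, 90090)      # 90.0 seconds
```

The difference is about one part in a thousand, so it is small near the start of a recording and grows to several seconds over a long one.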
-ve --videoedited :
By default, ccextractor will process input files in sequence as if they were all one large file (i.e. split by a generic, non video-aware tool).
If you are processing video that was split with an editing tool, use -ve so ccextractor doesn't try to rebuild the original timing.
-s --stream [secs]: Consider the file as a continuous stream.
That is, the file is growing as ccextractor processes it, so don't try to figure out its size and don't terminate processing when reaching the current end (i.e. wait for more data to arrive).
If the optional parameter secs is present, it means the number of seconds without any new data after which ccextractor should exit.
Use this parameter if you want to process a live stream but not kill ccextractor externally.
Note: If -s is used then only one input file is allowed.
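The stream behaviour (keep waiting at the current end of file, give up after secs seconds without new data) follows a common pattern for reading growing files. The Python sketch below illustrates that pattern under stated assumptions; it is not how ccextractor is implemented, and the name read_growing_file and its parameters are hypothetical.

```python
import time


def read_growing_file(path, idle_timeout=None, chunk_size=4096, poll_interval=0.1):
    """Yield chunks from a file that may still be growing (hypothetical helper).

    idle_timeout plays the role of the optional secs parameter: None means
    wait forever, a number means stop after that many seconds with no new data.
    """
    last_data = time.monotonic()
    with open(path, "rb") as stream:
        while True:
            chunk = stream.read(chunk_size)
            if chunk:
                last_data = time.monotonic()
                yield chunk
                continue
            # At the current end of file: decide whether to keep waiting.
            idle = time.monotonic() - last_data
            if idle_timeout is not None and idle >= idle_timeout:
                return
            time.sleep(poll_interval)
```

With idle_timeout left at None the generator never finishes by itself, which mirrors the need to stop ccextractor externally when -s is given without secs.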
-myth : Force MythTV code branch.
-nomyth : Disable MythTV code branch.
The MythTV branch is needed for analog captures where the closed caption data is stored in the VBI, such as those with bttv cards (Hauppauge 250 for example).
This is detected automatically so you don't need to worry about this unless autodetection doesn't work for you.
-wtvconvertfix :
This switch works around a bug in Windows 7's built-in software to convert *.wtv to *.dvr-ms.
For analog NTSC recordings the CC information is marked as digital captions.
Use this switch only when needed.
Options that affect what kind of output will be produced:
-unicode : Encode subtitles in Unicode instead of Latin-1.
-utf8 : Encode subtitles in UTF-8 instead of Latin-1.
-nofc --nofontcolor : For .srt/.sami, don't add font color tags.
-trim : Trim lines.
-dc --defaultcolor : Select a different default color (instead of white).
This causes all output in .srt/.smi files to have a font tag, which makes the files larger.
Add the color you want in RGB, such as -dc #FF0000 for red.
-sc --sentencecap : Sentence capitalization.
Use if you hate ALL CAPS in subtitles.
--capfile -caf file: Add the contents of "file" to the list of words that must be capitalized.
For example, if file is a plain text file that contains
Tony
Alan
then whenever those words are found they will be written exactly as they appear in the file.
Use one line per word. Lines starting with # are considered comments and discarded.
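The capitalization file format is simple enough to sketch in Python. The code below is an illustration of the rules just described (one word per line, # starts a comment, words are written exactly as they appear in the file), not CCExtractor's own code. The function names are hypothetical, and skipping blank lines and matching words case-insensitively are assumptions.

```python
import re


def load_cap_words(lines):
    """Parse capitalization-file lines into a lookup table (hypothetical helper)."""
    words = {}
    for line in lines:
        if line.startswith("#"):
            continue                      # comment line, discarded
        word = line.strip()
        if not word:
            continue                      # assumption: blank lines are ignored
        words[word.lower()] = word        # remember the exact spelling
    return words


def apply_cap_words(text, words):
    """Rewrite every listed word exactly as it appears in the file."""
    def fix(match):
        found = match.group()
        return words.get(found.lower(), found)

    return re.sub(r"[A-Za-z]+", fix, text)
```

For instance, with the two-name file above, apply_cap_words("tony met ALAN today", words) returns "Tony met Alan today".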
Options that affect how ccextractor reads and writes (buffering):
-bi --bufferinput : Forces input buffering.
-nobi -nobufferinput : Disables input buffering.
Note: -bo is only used when writing raw files, not .srt or .sami
Options that affect the built-in closed caption decoder:
-dru : Direct Roll-Up.
When in roll-up mode, write character by character instead of line by line.
Note that this produces (much) larger files.
-noru --norollup :
If you hate the repeated lines caused by the roll-up emulation, you can have ccextractor write only one line at a time, getting rid of these repeated lines.
Options that affect timing:
-delay ms: For srt/sami, add this number of milliseconds to all times.
For example, -delay 400 makes subtitles appear 400ms late.
You can also use negative numbers to make subs appear early.
Notes on times:
-startat and -endat times are used first, then -delay.
So if you use -srt -startat 3:00 -endat 5:00 -delay 120000, ccextractor will generate a .srt file with only data from 3:00 to 5:00 in the input file(s) and then add that (huge) delay, which would make the final file start at 5:00 and end at 7:00.
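The order of operations in the note above (select by -startat and -endat first, then shift by -delay) can be checked with a few lines of Python. This is a sketch of the arithmetic only: the function name select_and_delay is hypothetical, times are in milliseconds, and treating both boundaries as inclusive is an assumption.

```python
def select_and_delay(cue_times_ms, start_ms, end_ms, delay_ms):
    """Keep cues inside the window, then shift them (hypothetical helper).

    Assumption: both window boundaries are inclusive.
    """
    # Step 1: the window is applied to the original times.
    kept = [t for t in cue_times_ms if start_ms <= t <= end_ms]
    # Step 2: the delay is added afterwards.
    return [t + delay_ms for t in kept]


# Cues at 2:59, 3:00, 4:00, 5:00 and 5:01, window 3:00 to 5:00, delay 120000 ms.
cues = [179000, 180000, 240000, 300000, 301000]
shifted = select_and_delay(cues, 180000, 300000, 120000)
# shifted == [300000, 360000, 420000], i.e. 5:00, 6:00 and 7:00
```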
Options that affect what segment of the input file(s) to process:
-startat time: Only write caption information that starts after the given time.
Time can be seconds, MM:SS or HH:MM:SS.
For example, -startat 3:00 means "start writing from minute 3".
-endat time: Stop processing after the given time (same format as -startat).
The -startat and -endat options are honored in all output formats.
In all formats with timing information the times are unchanged.
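The three accepted time notations (seconds, MM:SS, HH:MM:SS) all reduce to a number of seconds. A minimal Python sketch of that conversion follows; the function name parse_time is hypothetical and the code is not taken from CCExtractor.

```python
def parse_time(text):
    """Convert "SS", "MM:SS" or "HH:MM:SS" to seconds (hypothetical helper)."""
    parts = text.split(":")
    if not 1 <= len(parts) <= 3:
        raise ValueError(f"unsupported time format: {text!r}")

    seconds = 0
    for part in parts:
        # Each further field is worth 60 times less than the one before it.
        seconds = seconds * 60 + int(part)
    return seconds
```

So parse_time("3:00") gives 180 and parse_time("1:02:03") gives 3723.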
-scr --screenfuls num: Write "num" screenfuls and terminate processing.
Adding start and end credits:
CCExtractor can _try_ to add a custom message (for credits for example) at the start and end of the file, looking for a window where there are no captions.
If there is no such window, then no text will be added.
The start window must be between the times given and must have enough time to display the message for at least the specified time.
--startcreditstext txt: Write this text as start credits.
If there are several lines, separate them with the characters \n.
For example Line1\nLine 2.
--startcreditsnotbefore time: Don't display the start credits before this time (S, or MM:SS).
Default: 0
--startcreditsnotafter time: Don't display the start credits after this time (S, or MM:SS).
Default: 5:00
--startcreditsforatleast time: Start credits need to be displayed for at least this time (S, or MM:SS).
Default: 2
--startcreditsforatmost time: Start credits should be displayed for at most this time (S, or MM:SS).
Default: 5
--endcreditstext txt: Write this text as end credits.
If there are several lines, separate them with the characters \n.
For example Line1\nLine 2.
--endcreditsforatleast time: End credits need to be displayed for at least this time (S, or MM:SS).
Default: 2
--endcreditsforatmost time: End credits should be displayed for at most this time (S, or MM:SS).
Default: 5
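To make the start-credits rules concrete, here is one way the search for a caption-free window could work, written as a Python sketch. It is a guess at the logic based only on the option descriptions, not CCExtractor's algorithm. The function name find_start_credit_slot is hypothetical, times are in seconds, the default values mirror the defaults listed above, and requiring the whole message to fit before the "not after" time is an assumption.

```python
def find_start_credit_slot(captions, not_before=0, not_after=300,
                           at_least=2, at_most=5):
    """Return (start, end) for the start credits, or None (hypothetical helper).

    captions is a list of (start, end) pairs in seconds, sorted by start time.
    Assumption: the whole message must fit between not_before and not_after.
    """
    cursor = not_before                       # earliest moment still free
    sentinel = (float("inf"), float("inf"))   # stands for "no more captions"
    for start, end in list(captions) + [sentinel]:
        gap_end = min(start, not_after)
        if gap_end - cursor >= at_least:
            # Long enough: show the credits, but no longer than at_most.
            return cursor, min(cursor + at_most, gap_end)
        cursor = max(cursor, end)
        if cursor >= not_after:
            break
    return None                               # no window: no text is added
```

For captions at (0, 10), (10.5, 20) and (25, 30) the first usable gap is 20 to 25 seconds, so the sketch returns (20, 25). With no captions at all it returns (0, 5).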
Options that affect debug data:
-debug : Show lots of debugging output.
-608 : Print debug traces from the EIA-608 decoder.
If you need to submit a bug report, please send the output from this option.
-708 : Print debug information from the EIA-708 (DTV) decoder.
(currently in development and useless)
-goppts : Enable lots of time stamp output.
-vides : Print debug info about the analysed elementary video stream.
-cbraw : Print debug trace with the raw 608/708 data with time stamps.
-nosync : Disable the syncing code.
Only useful for debugging purposes.
-fullbin : Disable the removal of trailing padding blocks when exporting to bin format.
Only useful for debugging purposes.
-parsedebug : Print debug info about the parsed container file.
(Only for TS/ASF files at the moment.)
Communication with other programs and console output:
--gui_mode_reports : Report progress and interesting events to stderr in an easy-to-parse format.
This is intended to be used by other programs. See docs directory for details.
--no_progress_bar : Suppress the output of the progress bar.
SEE ALSO
Originally based on McPoodle's tools.
Check his page for lots of technical details on closed captions.
(http://www.geocities.com/mcpoodle43/SCC_TOOLS/DOCS/SCC_TOOLS.HTML)
This tool's home page:
http://ccextractor.sourceforge.net