Man page - ccextractor(1)
| CCEXTRACTOR(1) | User Commands | CCEXTRACTOR(1) |
NAME
CCExtractor - A closed caption software decoder
SYNOPSIS
ccextractor [options] inputfile1 [inputfile2...] [-o outputfilename] [-o1 outputfilename1] [-o2 outputfilename2]
DESCRIPTION
Extracts closed captions from MPEG files.
(DVB, .TS, ReplayTV 4000 and 5000, dvr-ms, bttv, Tivo and Dish Network are known to work).
ccextractor reads a video stream looking for closed captions (subtitles).
It can do two things:
- Save the data to a "raw", unprocessed file which you can later use as input for other tools, such as McPoodle's excellent suite.
- Generate a subtitles file (.srt, .smi, or .txt) which you can directly use with your favourite player.
OPTIONS
File name related options:
- Use -o parameters to define output filename if you don't like the default ones.
- Default: (same as infile plus _1 or _2 when needed and .raw or .srt extension).
- -o or -o1 -> Name of the first (maybe only) output file.
- -o2 -> Name of the second output file, when it applies.
- Clean means the ES without TS or PES headers.
You can pass as many input files as you need. They will be
processed in order.
If a file name is suffixed by +, ccextractor will try to follow a
numerical sequence.
For example, DVD001.VOB+ means DVD001.VOB, DVD002.VOB and
so on until there are no more files.
Output will be one single file (either raw or srt). Use this if you made your
recording in several cuts (to skip commercials for example) but you want one
subtitle file with contiguous timing.
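The + suffix described above can be pictured with a short Python sketch. It is not ccextractor's own code: the function name expand_sequence is hypothetical, and it assumes the counter is the last run of digits before the file extension and that the sequence stops at the first missing file.

```python
import os
import re

def expand_sequence(name):
    """Expand a name such as 'DVD001.VOB+' into the files that exist on disk."""
    if not name.endswith("+"):
        return [name]
    first = name[:-1]
    directory, base = os.path.split(first)
    stem, ext = os.path.splitext(base)
    # Assumption: the counter is the last run of digits in the stem.
    match = re.fullmatch(r"(.*?)(\d+)(\D*)", stem)
    if match is None:
        return [first]
    prefix, digits, suffix = match.groups()
    number = int(digits)
    files = []
    while True:
        # Keep the zero padding of the first file (001, 002, ...).
        candidate = os.path.join(
            directory, f"{prefix}{number:0{len(digits)}d}{suffix}{ext}"
        )
        if not os.path.exists(candidate):
            break  # stop at the first gap in the sequence
        files.append(candidate)
        number += 1
    return files
```

With DVD001.VOB and DVD002.VOB present and DVD003.VOB missing, expand_sequence("DVD001.VOB+") returns the first two names in order.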
Options that affect what will be processed:
- -1, -2, -12: Output Field 1 data, Field 2 data, or both
- Default: -1
In general, if you want English subtitles you don't need to use
these options
as they are broadcast in field 1, channel 1.
If you want the second language (usually Spanish) you may need to try
-2, or -cc2, or both.
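As an illustration of how the field and output name options combine, the following Python sketch only assembles an argument list in the order given in the SYNOPSIS; it does not run ccextractor. The helper name build_command and the file names are hypothetical.

```python
import shlex

def build_command(inputs, both_fields=False, out1=None, out2=None):
    """Assemble a ccextractor argument list: options, inputs, then output names."""
    argv = ["ccextractor"]
    if both_fields:
        argv.append("-12")          # output Field 1 and Field 2 data
    argv.extend(inputs)             # input files are processed in order
    if out1 is not None:
        argv += ["-o1", out1]       # name of the first output file
    if out2 is not None:
        argv += ["-o2", out2]       # name of the second output file
    return argv

command = build_command(["show.ts"], both_fields=True,
                        out1="show_english.srt", out2="show_second.srt")
print(shlex.join(command))
```

The resulting list can be handed to subprocess.run when ccextractor is installed.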
Input formats:
With the exception of McPoodle's raw format, which is just the
closed caption data with no other info,
CCExtractor can usually detect the input format correctly.
To force a specific format:
Where format is one of these:
- ts -> For Transport Streams.
- ps -> For Program Streams.
- es -> For Elementary Streams.
- asf -> ASF container (such as DVR-MS).
- bin -> CCExtractor's own binary format.
- raw -> For McPoodle's raw files.
-ts, -ps, -es and -asf (or --dvr-ms) can be used as shorts.
Output formats:
Where format is one of these:
- srt -> SubRip (default, so not actually needed).
- sami -> MS Synchronized Accessible Media Interchange.
- bin -> CC data in CCExtractor's own binary format.
- raw -> CC data in McPoodle's Broadcast format.
- dvdraw -> CC data in McPoodle's DVD format.
- txt -> Transcript (no time codes, no roll-up captions, just the plain transcription).
Options that affect how input files will be processed:
- This only applies to Program or Transport Streams with MPEG2 data and
overrides the default PTS timing.
GOP timing is always used for Elementary Streams.
- Some cards (or providers, or whatever) seem to send 0000 as CC padding
instead of 8080.
If you get bad timing, this might solve it.
- -90090: Use 90090 (instead of 90000) as MPEG clock frequency.
- (reported to be needed at least by Panasonic DMR-ES15 DVD Recorder)
- By default, ccextractor will process input files in sequence
as if they were all one large file (i.e. split by a generic, non video-aware tool).
If you are processing video that was split with an editing tool,
use -ve so ccextractor doesn't try to rebuild the original timing.
- That is, growing as ccextractor processes it,
so don't try to figure out its size and don't terminate processing
when reaching the current end (i.e. wait for more data to arrive).
If the optional parameter secs is present, it means the number of seconds
without any new data after which ccextractor should exit.
Use this parameter if you want to process a live stream but not kill ccextractor externally.
Note: If -s is used then only one input file is allowed.
- The MythTV branch is needed for analog captures where the closed caption
data is stored in the VBI, such as those with bttv cards (Hauppauge 250 for
example).
This is detected automatically so you don't need to worry about this unless autodetection doesn't work for you.
- This switch works around a bug in Windows 7's built in software to convert
*.wtv to *.dvr-ms.
For analog NTSC recordings the CC information is marked as digital captions.
Use this switch only when needed.
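The live-stream behaviour described for -s (keep waiting at the current end of a growing file, and optionally exit after secs seconds without new data) can be sketched in Python as below. This is an illustration of the idea only, not ccextractor's implementation; the names follow, idle_secs, chunk_size and poll_interval are hypothetical, and polling with sleep is an assumption.

```python
import time

def follow(path, idle_secs=None, chunk_size=4096, poll_interval=0.1):
    """Yield chunks from a file that may still be growing.

    With idle_secs=None the generator never finishes on its own, which
    mirrors having to stop the reader externally.
    """
    last_data = time.monotonic()
    with open(path, "rb") as stream:
        while True:
            chunk = stream.read(chunk_size)
            if chunk:
                last_data = time.monotonic()
                yield chunk
                continue
            # Current end reached: wait for more data instead of stopping,
            # unless nothing new has arrived for idle_secs seconds.
            if idle_secs is not None and time.monotonic() - last_data >= idle_secs:
                return
            time.sleep(poll_interval)
```

For example, b"".join(follow("capture.ts", idle_secs=30)) would collect data until the (hypothetical) capture.ts has been quiet for 30 seconds.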
Options that affect what kind of output will be produced:
- -unicode: Encode subtitles in Unicode instead of Latin-1
- -utf8: Encode subtitles in UTF-8 instead of Latin-1
- -nofc --nofontcolor: For .srt/.sami, don't add font color tags.
- -trim: Trim lines.
- -dc --defaultcolor: Select a different default color (instead of white).
- This causes all output in .srt/.smi files to have a font tag, which makes
the files larger.
Add the color you want in RGB, such as -dc #FF0000 for red.
- Use if you hate ALL CAPS in subtitles.
- For example, if file is a plain text file that contains:
Tony
Alan
Whenever those words are found they will be written exactly as they appear in the file.
Use one line per word. Lines starting with # are considered comments and discarded.
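The capitalization file format above (one word per line, # lines discarded) is simple enough to sketch in Python. The functions load_cap_words and apply_cap_words are hypothetical names, and matching whole words case-insensitively is an assumption about how "found" is decided; blank lines are also skipped here.

```python
import re

def load_cap_words(lines):
    """Read one word per line, discarding blank lines and # comments."""
    words = []
    for line in lines:
        word = line.strip()
        if not word or word.startswith("#"):
            continue
        words.append(word)
    return words

def apply_cap_words(text, words):
    """Rewrite every occurrence of each word exactly as it appears in the list."""
    for word in words:
        # Assumption: match whole words, ignoring case.
        pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
        text = pattern.sub(lambda _match, w=word: w, text)
    return text

words = load_cap_words(["# character names", "Tony", "Alan"])
fixed = apply_cap_words("tony said hello to ALAN.", words)
```

Here fixed holds the sentence with both names written as listed, while the other words are left untouched.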
Options that affect how ccextractor reads and writes (buffering):
Note: -bo is only used when writing raw files, not .srt or .sami
Options that affect the built-in closed caption decoder:
- When in roll-up mode, write character by character instead of line by
line.
Note that this produces (much) larger files.
- If you hate the repeated lines caused by the roll-up emulation,
you can have ccextractor write only one line at a time, getting rid of these repeated lines.
Options that affect timing:
- For example, -delay 400 makes subtitles appear 400ms late.
You can also use negative numbers to make subs appear early.
Notes on times: -startat and -endat times are used
first, then -delay.
So if you use -srt -startat 3:00 -endat 5:00
-delay 120000,
ccextractor will generate a .srt file, with only data from 3:00 to 5:00 in the
input file(s)
and then add that (huge) delay, which would make the final file start at 5:00
and end at 7:00.
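The order of operations in the note above (select by -startat and -endat first, then shift by -delay) can be checked with a small Python sketch. The caption tuples and the function name select_then_delay are made up for illustration, and treating both boundaries as inclusive on the caption start time is an assumption.

```python
def select_then_delay(captions, startat_ms, endat_ms, delay_ms):
    """captions is a list of (start_ms, end_ms, text) tuples."""
    # Step 1: keep only captions that start inside the requested segment.
    kept = [c for c in captions if startat_ms <= c[0] <= endat_ms]
    # Step 2: apply the delay to what is left.
    return [(start + delay_ms, end + delay_ms, text) for start, end, text in kept]

captions = [
    (60_000, 62_000, "before the segment"),
    (180_000, 182_000, "first kept line"),
    (299_000, 300_000, "last kept line"),
]
# Equivalent of -startat 3:00 -endat 5:00 -delay 120000
shifted = select_then_delay(captions, 180_000, 300_000, 120_000)
```

The first kept caption moves from 3:00 (180000 ms) to 5:00 (300000 ms), and the last one ends at 7:00 (420000 ms), matching the example in the text.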
Options that affect what segment of the input file(s) to process:
- Time can be seconds, MM:SS or HH:MM:SS.
For example, -startat 3:00 means 'start writing from minute 3'.
The -startat and -endat options are honored in all
output formats.
In all formats with timing information the times are unchanged.
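The three accepted time notations (seconds, MM:SS or HH:MM:SS) reduce to the same conversion. A minimal Python sketch follows; parse_time is a hypothetical helper and not part of ccextractor.

```python
def parse_time(value):
    """Convert 'S', 'MM:SS' or 'HH:MM:SS' into a number of seconds."""
    parts = value.split(":")
    if not 1 <= len(parts) <= 3:
        raise ValueError(f"unsupported time format: {value!r}")
    seconds = 0
    for part in parts:
        # Each colon moves the running total up one unit (x60).
        seconds = seconds * 60 + int(part)
    return seconds
```

So parse_time("3:00") gives 180, the 'minute 3' of the -startat example, and parse_time("1:02:03") gives 3723.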
-scr --screenfuls num: Write 'num' screenfuls and terminate processing.
Adding start and end credits:
CCExtractor can _try_ to add a custom message (for credits for
example)
at the start and end of the file, looking for a window where there are no
captions.
If there is no such window, then no text will be added.
The start window must be between the times given and must have enough time
to display the message for at least the specified time.
- If there are several lines, separate them with the characters \n.
- For example Line1\nLine 2.
- Default: 0
- Default: 5:00
- Default: 2
- --startcreditsforatmost time: Start credits should be displayed for at most this time (S, or MM:SS).
- Default: 5
- If there are several lines, separate them with the characters \n.
- For example Line1\nLine 2.
- Default: 2
- Default: 5
Options that affect debug data:
- -debug: Show lots of debugging output.
- -608: Print debug traces from the EIA-608 decoder.
- If you need to submit a bug report, please send the output from this option.
- -708: Print debug information from the EIA-708 (DTV) decoder.
- (currently in development and useless)
- -goppts: Enable lots of time stamp output.
- -vides: Print debug info about the analysed elementary video stream.
- -cbraw: Print debug trace with the raw 608/708 data with time stamps.
- -nosync: Disable the syncing code.
- Only useful for debugging purposes.
- Only useful for debugging purposes.
- (Only for TS/ASF files at the moment.)
Communication with other programs and console output:
- This is intended to be used by other programs. See docs directory for details.
--no_progress_bar: Suppress the output of the progress bar
SEE ALSO
Originally based on McPoodle's tools.
- Check his page for lots of information on closed captions technical details.
- (http://www.geocities.com/mcpoodle43/SCC_TOOLS/DOCS/SCC_TOOLS.HTML)
This tool home page:
- http://ccextractor.sourceforge.net
| June 2011 | CCExtractor 0.57, Carlos Fernandez Sanz, Volker Quetschke. |