Man page - unicode::linebreakc_callback_base(3)
Packages containing this manual
- unicode_convert_tocbuf_toutf8_init(3)
- unicode::wordbreak(3)
- unicode_convert_uc(3)
- unicode_script(3)
- unicode_decomposition_init(3)
- unicode_wbscan_next(3)
- unicode_isgraph(3)
- unicode_bidi_direction(3)
- unicode::bidi_embed_paragraph_level(3)
- unicode_convert_tou_init(3)
- unicode_derived_grapheme_extend_lookup(3)
- unicode_bidi_calc_types(3)
- unicode_isdigit(3)
- unicode_convert_deinit(3)
- unicode_grapheme_break_init(3)
- unicode_bidi_cleanup(3)
- unicode_html40ent_lookup(3)
- unicode::linebreak_callback_save_buf(3)
- unicode_convert_tocbuf_fromutf8_init(3)
- unicode_emoji_modifier_base(3)
- unicode_derived_case_ignorable_lookup(3)
- unicode::bidi_cleanup(3)
- unicode_emoji_extended_pictographic(3)
- unicode::bidi_get_direction(3)
- unicode_convert_fromu_init(3)
- unicode_convert_init(3)
- unicode_category_lookup(3)
- unicode_composition_deinit(3)
- unicode_wb_next(3)
- unicode_bidi_setbnl(3)
- unicode_default_chset(3)
- unicode_derived_xid_start_lookup(3)
- unicode::bidi_calc_types(3)
- unicode_lbc_set_opts(3)
- unicode_wbscan_end(3)
- unicode::bidi_reorder(3)
- unicode_word_break(3)
- unicode_convert_fromutf8(3)
- unicode_isalpha(3)
- unicode::bidi(3)
- unicode_derived_cased_lookup(3)
- unicode_emoji(3)
- unicode_grapheme_break_next(3)
- unicode_derived_changes_when_casefolded_lookup(3)
- unicode_emoji_presentation(3)
- unicode_grapheme_break(3)
- unicode::bidi_logical_order(3)
- unicode_isblank(3)
- unicode::iconvert::convert(3)
- unicode_lbc_next(3)
- unicode::decompose_default_reallocate(3)
- unicode_uc(3)
- unicode_convert_toutf8(3)
- unicode_derived_grapheme_base_lookup(3)
- unicode_derived_incb_lookup(3)
- unicode_lb_set_opts(3)
- unicode::iconvert::convert_tocase(3)
- unicode_derived_math_lookup(3)
- unicode_composition_apply(3)
- unicode_bidi_type(3)
- unicode_convert_tocase(3)
- unicode::linebreakc_callback_base(3)
- unicode_decompose_reallocate_size(3)
- unicode_derived_changes_when_lowercased_lookup(3)
- unicode_derived_id_start_lookup(3)
- unicode_wb_end(3)
- unicode::tolower(3)
- unicode_wb_init(3)
- unicode::decompose(3)
- unicode_bidi(3)
- unicode_u_ucs2_native(3)
- unicode::wordbreak_callback_base(3)
- unicode_lbc_init(3)
- unicode_lc(3)
- unicode::linebreakc_iter(3)
- unicode_bidi_logical_order(3)
- unicode_tc(3)
- unicode::ucs_2(3)
- unicode_decompose(3)
- unicode::compose_default_callback(3)
- unicode_emoji_lookup(3)
- unicode_bidi_embed(3)
- unicode_ccc(3)
- unicode_composition_init(3)
- unicode::canonical(3)
- unicode_lb_init(3)
- unicode_convert_tobuf(3)
- unicode_derived_default_ignorable_code_point_lookup(3)
- unicode_ispunct(3)
- unicode_emoji_component(3)
- unicode_derived_changes_when_titlecased_lookup(3)
- unicode_bidi_mirror(3)
- unicode_bidi_reorder(3)
- unicode_bidi_calc(3)
- unicode_general_category_lookup(3)
- unicode_derived_lowercase_lookup(3)
- unicode::bidi_combinings(3)
- unicode_bidi_cleaned_size(3)
- unicode::compose(3)
- unicode_lb_next_cnt(3)
- unicode_convert_tou_tobuf(3)
- unicode_convert_tocbuf_init(3)
- unicode_grapheme_break_deinit(3)
- unicode_compose(3)
- unicode::iconvert::tou(3)
- unicode_bidi_embed_paragraph_level(3)
- unicode_lbc_next_cnt(3)
- unicode_locale_chset(3)
- unicode::linebreak_callback_base(3)
- unicode_isspace(3)
- unicode_bidi_needs_embed(3)
- unicode_emoji_modifier(3)
- unicode_islower(3)
- unicode_convert(3)
- unicode_derived_xid_continue_lookup(3)
- unicode::iso_8859_1(3)
- unicode_lb_end(3)
- unicode_derived_id_continue_lookup(3)
- unicode::ucs_4(3)
- courier-unicode(7)
- unicode::toupper(3)
- unicode_lbc_end(3)
- unicode_wb_next_cnt(3)
- unicode::linebreak_iter(3)
- unicode_derived_changes_when_casemapped_lookup(3)
- unicode_bidi_combinings(3)
- unicode_bidi_calc_levels(3)
- unicode_line_break(3)
- unicode_lb_next(3)
- unicode_derived_core_properties(3)
- unicode_isupper(3)
- unicode_convert_fromu_tobuf(3)
- unicode_derived_uppercase_lookup(3)
- unicode_canonical(3)
- unicode::bidi_calc(3)
- unicode::bidi_needs_embed(3)
- unicode_isalnum(3)
- unicode_wbscan_init(3)
- unicode_decomposition_deinit(3)
- unicode_u_ucs4_native(3)
- unicode::bidi_embed(3)
- unicode_bidi_bracket_type(3)
- unicode_derived_grapheme_link_lookup(3)
- unicode::iconvert::fromu(3)
- unicode::utf_8(3)
- unicode_derived_changes_when_uppercased_lookup(3)
- unicode::bidi_override(3)
apt-get install libcourier-unicode-dev
Manual
| UNICODE::LINEBREAK(3) | Courier Unicode Library | UNICODE::LINEBREAK(3) |
NAME
unicode::linebreak_callback_base, unicode::linebreak_callback_save_buf, unicode::linebreakc_callback_base, unicode::linebreak_iter, unicode::linebreakc_iter - unicode line-breaking rules
SYNOPSIS
#include <courier-unicode.h>
class linebreak : public unicode::linebreak_callback_base {
public:
    using unicode::linebreak_callback_base::operator<<;
    using unicode::linebreak_callback_base::operator();
    int callback(int linebreak_code)
    {
        // ...
        return 0; // 0 continues; a non-zero value reports an error
    }
};
char32_t c;
std::u32string buf;
linebreak compute_linebreak;
compute_linebreak.set_opts(UNICODE_LB_OPT_SYBREAK);
compute_linebreak << c;
compute_linebreak(buf);
compute_linebreak(buf.begin(), buf.end());
compute_linebreak.finish();
// ...
unicode::linebreak_callback_save_buf linebreaks;
std::list<int> lb=linebreaks.lb_buf;
class linebreakc : public unicode::linebreakc_callback_base {
public:
    using unicode::linebreakc_callback_base::operator<<;
    using unicode::linebreakc_callback_base::operator();
    int callback(int linebreak_code, char32_t ch)
    {
        // ...
        return 0; // 0 continues; a non-zero value reports an error
    }
};
// ...
std::u32string buf;
typedef unicode::linebreak_iter<std::u32string::const_iterator> iter_t;
iter_t beg_iter(buf.begin(), buf.end()), end_iter;
beg_iter.set_opts(UNICODE_LB_OPT_SYBREAK);
std::vector<int> linebreaks;
std::copy(beg_iter, end_iter, std::back_insert_iterator<std::vector<int>>(linebreaks));
// ...
typedef unicode::linebreakc_iter<std::u32string::const_iterator> iter_t;
iter_t beg_iter(buf.begin(), buf.end()), end_iter;
beg_iter.set_opts(UNICODE_LB_OPT_SYBREAK);
std::vector<std::pair<int, char32_t>> linebreaks;
std::copy(beg_iter, end_iter, std::back_insert_iterator<std::vector<std::pair<int, char32_t>>>(linebreaks));
DESCRIPTION
unicode::linebreak_callback_base is a C++ binding for the unicode line-breaking rule implementation described in unicode_line_break(3).
Subclass unicode::linebreak_callback_base and implement callback(), a virtual method inherited from unicode::linebreak_callback_base. callback() receives the output values from the line-breaking algorithm: one of UNICODE_LB_MANDATORY, UNICODE_LB_NONE, or UNICODE_LB_ALLOWED for each unicode character.
callback() should return 0. A non-zero return value reports an error and stops the line-breaking algorithm. See unicode_line_break(3) for more information.
The alternate unicode::linebreakc_callback_base interface uses a virtual callback() that receives two parameters: the line-break code value and the corresponding unicode character.
The input unicode characters for the line-breaking algorithm are provided either by the << operator, one unicode character at a time, or by the () operator, which takes either a container or a beginning and an ending iterator value for an input sequence of unicode characters. finish() indicates the end of the unicode character sequence.
set_opts() sets line-breaking options (see unicode_lb_set_opts(3) for more information).
unicode::linebreak_callback_save_buf is a subclass that implements callback() by saving the line-break codes into a std::list, its lb_buf member.
The linebreak_iter template implements an input iterator over ints. The template parameter is an input iterator over unicode chars. The constructor's parameters are a beginning and an ending iterator value for a sequence of char32_t. This constructs the beginning iterator value for a sequence of ints consisting of line-break values (UNICODE_LB_MANDATORY, UNICODE_LB_NONE, or UNICODE_LB_ALLOWED) corresponding to each char32_t in the underlying sequence. The default constructor creates the ending iterator value for the sequence.
The iterator implements a set_opts() method that sets the options for the line-breaking algorithm.
The linebreakc_iter template implements a similar input iterator, except that it iterates over std::pair values, each holding a line-breaking value and the corresponding char32_t from the underlying input sequence.
SEE ALSO
courier-unicode(7), unicode_line_break(3).
AUTHOR
Sam Varshavchik
| 05/18/2024 | Courier Unicode Library |