ON_UnicodeShortCodePoint is a tool for working with Unicode code points with values <= 0xFFFF. Note that valid Unicode code point values can be as large as 0x10FFFD. (0x10FFFE and 0x10FFFF are specified as <not a character> by the Unicode Standard code chart https://www.unicode.org/charts/PDF/U10FF80.pdf.) This class is used when converting between Unicode and BIG5 encodings and in other settings where Unicode code points > 0xFFFF are not encountered and the 2-byte size of ON_UnicodeShortCodePoint is appreciably more efficient than the 4-byte size of an unsigned int.
#include <opennurbs_unicode.h>
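The size argument above can be illustrated with a small sketch. ShortCodePoint, short_table_bytes, and wide_table_bytes are hypothetical names used only for this illustration; they are not part of opennurbs, and std::uint32_t stands in for a 4-byte unsigned int.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a 2-byte code point wrapper.
// This is NOT the opennurbs class, only a model of its storage size.
struct ShortCodePoint {
  std::uint16_t value = 0;
};

static_assert(sizeof(ShortCodePoint) == 2, "wrapper is expected to stay 2 bytes");

// Bytes needed to store `count` code points as 2-byte values.
std::size_t short_table_bytes(std::size_t count) {
  return count * sizeof(ShortCodePoint);
}

// Bytes needed to store `count` code points as 4-byte values.
std::size_t wide_table_bytes(std::size_t count) {
  return count * sizeof(std::uint32_t);
}

// A table with one entry per value in 0x0000..0xFFFF has 65536 entries.
void example_sizes() {
  assert(short_table_bytes(65536) == 131072);
  assert(wide_table_bytes(65536) == 262144);
}
```

For large conversion tables, such as those used for BIG5, the 2-byte representation halves the storage in this model.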
◆ ON_UnicodeShortCodePoint() [1/2]
ON_UnicodeShortCodePoint::ON_UnicodeShortCodePoint() = default
◆ ~ON_UnicodeShortCodePoint()
ON_UnicodeShortCodePoint::~ON_UnicodeShortCodePoint() = default
◆ ON_UnicodeShortCodePoint() [2/2]
◆ Compare()
◆ Create()
Creates a Unicode code point with the specified code point value.
- Parameters
-
unicode_code_point | A valid Unicode code point. |
- Returns
- If unicode_code_point is a valid Unicode code point <= 0xFFFE, or is the Unicode byte order mark (0xFFFE), an instance with that code point value is returned. Otherwise ON_UnicodeShortCodePoint::Error is returned. Notes:
- Valid Unicode code points can be as large as 0x10FFFD and ON_UnicodeShortCodePoint cannot accommodate code points >= U+10000.
- Values >= 0xD800 and < 0xE000 are not valid Unicode code points. These values are used in UTF-16 surrogate pair encodings of code points >= U+10000.
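The range rules in the notes above can be sketched as standalone checks. The helper names below are hypothetical and are not opennurbs functions. The sketch assumes the only values rejected at or below 0xFFFE are the surrogates; the actual Create() may apply further checks.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true if v lies in the UTF-16 surrogate range
// [0xD800, 0xE000), which contains no valid Unicode code points.
bool is_surrogate(unsigned int v) {
  return v >= 0xD800u && v < 0xE000u;
}

// Hypothetical helper. Assumption: a value fits a short code point when it
// is <= 0xFFFE and is not a surrogate.
bool fits_short_code_point(unsigned int v) {
  return v <= 0xFFFEu && !is_surrogate(v);
}

// Splits a code point in [0x10000, 0x10FFFF] into a UTF-16 surrogate pair.
// This shows why the surrogate range is reserved.
void to_surrogate_pair(unsigned int cp, std::uint16_t& high, std::uint16_t& low) {
  const unsigned int offset = cp - 0x10000u;
  high = static_cast<std::uint16_t>(0xD800u + (offset >> 10));
  low = static_cast<std::uint16_t>(0xDC00u + (offset & 0x3FFu));
}

void example_create_rules() {
  assert(fits_short_code_point(0x0041u));    // 'A'
  assert(fits_short_code_point(0xFFFEu));    // accepted per the text above
  assert(!fits_short_code_point(0xD800u));   // surrogate
  assert(!fits_short_code_point(0x10000u));  // needs more than 16 bits
  std::uint16_t hi = 0, lo = 0;
  to_surrogate_pair(0x1F600u, hi, lo);
  assert(hi == 0xD83Du && lo == 0xDE00u);
}
```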
◆ CreateFromBig5() [1/2]
Find a Unicode code point with the same character as big5_code_point.
- Parameters
-
big5_code_point | |
not_available | Value to return when big5_code_point is valid but does not map to a Unicode code point. |
- Returns
- If there is a corresponding Unicode code point, that code point is returned. Otherwise, if big5_code_point is valid, not_available is returned. Otherwise ON_UnicodeShortCodePoint::Error is returned.
◆ CreateFromBig5() [2/2]
Find a Unicode code point with the same character as big5_code_point.
- Parameters
-
big5_code_point | |
not_available | Value to return when big5_code_point is valid but does not map to a Unicode code point. |
- Returns
- If there is a corresponding Unicode code point, that code point is returned. Otherwise, if big5_code_point is valid, not_available is returned. Otherwise ON_UnicodeShortCodePoint::Error is returned.
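The three-way return rule for CreateFromBig5() can be modeled with a toy lookup. Everything in this sketch is hypothetical: the function names, the two-entry table, the value chosen for kError, and the well-formedness rule (lead byte 0x81..0xFE, trail byte 0x40..0x7E or 0xA1..0xFE). It only mirrors the documented control flow, not the opennurbs implementation or its tables.

```cpp
#include <cassert>
#include <cstdint>

// Assumed sentinel for this sketch; the documentation does not state the
// value held by ON_UnicodeShortCodePoint::Error.
constexpr std::uint16_t kError = 0xFFFFu;

// Assumption: ASCII values are valid single-byte BIG5, and a 2-byte value is
// well formed when its lead and trail bytes fall in the ranges noted above.
bool is_wellformed_big5(unsigned int v) {
  if (v <= 0x7Fu) return true;
  if (v > 0xFFFFu) return false;
  const unsigned int lead = v >> 8;
  const unsigned int trail = v & 0xFFu;
  const bool lead_ok = lead >= 0x81u && lead <= 0xFEu;
  const bool trail_ok = (trail >= 0x40u && trail <= 0x7Eu) ||
                        (trail >= 0xA1u && trail <= 0xFEu);
  return lead_ok && trail_ok;
}

// Mapped -> Unicode value; well formed but unmapped -> not_available;
// malformed -> kError.
std::uint16_t unicode_from_big5(unsigned int big5, std::uint16_t not_available) {
  if (!is_wellformed_big5(big5)) return kError;
  if (big5 <= 0x7Fu) return static_cast<std::uint16_t>(big5);  // ASCII maps to itself
  switch (big5) {  // toy table with two entries only
    case 0xA440u: return 0x4E00u;  // BIG5 0xA440 -> U+4E00
    case 0xA4A4u: return 0x4E2Du;  // BIG5 0xA4A4 -> U+4E2D
    default: return not_available;
  }
}

void example_big5() {
  assert(unicode_from_big5(0xA440u, 0xFFFDu) == 0x4E00u);  // mapped
  assert(unicode_from_big5(0x8140u, 0xFFFDu) == 0xFFFDu);  // not in toy table
  assert(unicode_from_big5(0xA4FFu, 0xFFFDu) == kError);   // bad trail byte
}
```

Passing the replacement character U+FFFD as not_available, as in example_big5(), is one way to keep unmapped characters visible in the converted text.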
◆ IsASCII()
bool ON_UnicodeShortCodePoint::IsASCII(bool bNullIsASCII) const
Unicode is an extension of the ASCII encoding, and code points <= 0x7F are valid Unicode code points.
- Parameters
-
- Returns
- True if the code point value is <= 0x7F.
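A minimal sketch of the ASCII test follows. The bNullIsASCII parameter is not described above, so this sketch assumes it is the value returned when the code point is 0, by analogy with bNullIsValid elsewhere on this page. is_ascii is a hypothetical free function, not the member function itself.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the ASCII test.
// Assumption: null_is_ascii is the result for the value 0.
bool is_ascii(std::uint16_t v, bool null_is_ascii) {
  if (v == 0) return null_is_ascii;
  return v <= 0x7Fu;
}

void example_is_ascii() {
  assert(is_ascii(0x41u, false));   // 'A'
  assert(is_ascii(0x7Fu, false));   // last ASCII value
  assert(!is_ascii(0x80u, true));   // first non-ASCII value
  assert(is_ascii(0u, true));       // null, caller chose true
  assert(!is_ascii(0u, false));     // null, caller chose false
}
```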
◆ IsByteOrderMark()
bool ON_UnicodeShortCodePoint::IsByteOrderMark() const
The Unicode code point value 0xFFFE is not a character and is used as a byte order mark.
- Returns
- True if this Unicode code point value is 0xFFFE.
◆ IsNull()
bool ON_UnicodeShortCodePoint::IsNull() const
Determine if this Unicode code point is 0.
- Returns
- True if the code point value is 0.
◆ IsPrivateUse()
bool ON_UnicodeShortCodePoint::IsPrivateUse() const
Unicode code points are separated into standard and private use (user defined) code points.
- Returns
- True if this Unicode code point is a private use (user defined) code point.
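For 16-bit code points, the Unicode Standard defines a single Private Use Area, U+E000 through U+F8FF. The sketch below tests that range. is_bmp_private_use is a hypothetical name, and whether IsPrivateUse() uses exactly this range is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true if v is in the Basic Multilingual Plane
// Private Use Area, U+E000..U+F8FF (inclusive).
bool is_bmp_private_use(std::uint16_t v) {
  return v >= 0xE000u && v <= 0xF8FFu;
}

void example_private_use() {
  assert(is_bmp_private_use(0xE000u));   // first private use value
  assert(is_bmp_private_use(0xF8FFu));   // last private use value
  assert(!is_bmp_private_use(0xF900u));  // CJK compatibility ideographs begin here
  assert(!is_bmp_private_use(0xDFFFu));  // surrogate, not private use
}
```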
◆ IsReplacementCharacter()
bool ON_UnicodeShortCodePoint::IsReplacementCharacter() const
The Unicode code point U+FFFD is called the replacement character. It is typically a light question mark with a dark diamond background. It is often used to indicate a character is unknown, unavailable, does not exist, or an error occurred when creating that code point.
- Returns
- True if this Unicode code point is U+FFFD.
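One common use of the replacement character is to stand in for values that cannot be represented. The sketch below narrows 32-bit values to 16-bit storage and substitutes U+FFFD for surrogates and for values above 0xFFFF. to_short_code_points is a hypothetical helper, and the substitution policy is an assumption rather than documented opennurbs behavior.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

constexpr std::uint16_t kReplacementCharacter = 0xFFFDu;

// Hypothetical helper: copies code points into 16-bit storage and
// substitutes U+FFFD for surrogates and for values above 0xFFFF.
std::vector<std::uint16_t> to_short_code_points(const std::vector<unsigned int>& in) {
  std::vector<std::uint16_t> out;
  out.reserve(in.size());
  for (unsigned int v : in) {
    const bool surrogate = v >= 0xD800u && v < 0xE000u;
    const bool too_large = v > 0xFFFFu;
    const std::uint16_t narrowed = (surrogate || too_large)
                                       ? kReplacementCharacter
                                       : static_cast<std::uint16_t>(v);
    out.push_back(narrowed);
  }
  return out;
}

void example_replacement() {
  const std::vector<std::uint16_t> r =
      to_short_code_points({0x41u, 0x1F600u, 0xD800u, 0x4E00u});
  assert(r.size() == 4);
  assert(r[0] == 0x41u);
  assert(r[1] == kReplacementCharacter);  // above 0xFFFF
  assert(r[2] == kReplacementCharacter);  // surrogate
  assert(r[3] == 0x4E00u);
}
```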
◆ IsStandard()
bool ON_UnicodeShortCodePoint::IsStandard(bool bNullIsValid) const
Unicode code points are separated into standard and private use (user defined) code points.
- Parameters
-
bNullIsValid | Value to return if the code point value is 0. |
- Returns
- True if this Unicode code point is a standard code point.
◆ IsValid()
bool ON_UnicodeShortCodePoint::IsValid(bool bNullIsValid, bool bByteOrderMarkIsValid) const
Determine if the Unicode code point is valid.
- Parameters
-
bNullIsValid | Value to return if the code point value is 0. |
bByteOrderMarkIsValid | Value to return if the code point value is 0xFFFE. |
- Returns
- True if the code point value is a valid Unicode code point.
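The way the two flags could interact with the validity test is sketched below. is_valid is a hypothetical free function. The special cases for 0 and 0xFFFE follow the parameter descriptions above; rejecting surrogates follows the notes for Create(). Treating 0xFFFF as invalid is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the validity test with caller-selected results
// for the null value and the byte order mark value.
bool is_valid(std::uint16_t v, bool null_is_valid, bool bom_is_valid) {
  if (v == 0u) return null_is_valid;                // caller decides for 0
  if (v == 0xFFFEu) return bom_is_valid;            // caller decides for 0xFFFE
  if (v >= 0xD800u && v < 0xE000u) return false;    // UTF-16 surrogates
  return v != 0xFFFFu;                              // assumption: 0xFFFF is rejected
}

void example_is_valid() {
  assert(is_valid(0x41u, false, false));
  assert(is_valid(0u, true, false));
  assert(!is_valid(0u, false, false));
  assert(is_valid(0xFFFEu, false, true));
  assert(!is_valid(0xFFFEu, true, false));
  assert(!is_valid(0xD800u, true, true));
}
```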
◆ operator=()
◆ UnicodeCodePoint()
unsigned int ON_UnicodeShortCodePoint::UnicodeCodePoint() const
◆ ByteOrderMark
◆ Error
◆ Null
ON_UnicodeShortCodePoint::Null has a code point value of 0.
◆ ReplacementCharacter