=encoding UTF-8


=head1 NAME

BH_Unicode - Working with Unicode and UTF encodings


=head1 SYNTAX

 #include <BH/Unicode.h>

 cc prog.c -o prog -lbh


=head1 DESCRIPTION

The BH_Unicode library provides functions for working with the UTF-8, UTF-16,
and UTF-32 Unicode encodings. It can convert code points to uppercase and
lowercase, and encode or decode one code point at a time in each of these
encodings.


=head1 API CALLS


=head2 BH_UnicodeLower

 uint32_t BH_UnicodeLower(uint32_t unit);

Converts the Unicode code point I<unit> to lowercase.

Only characters in the Basic Multilingual Plane (the first 65,536 code points)
are converted.


=head2 BH_UnicodeUpper

 uint32_t BH_UnicodeUpper(uint32_t unit);

Converts the Unicode code point I<unit> to uppercase.

Only characters in the Basic Multilingual Plane (the first 65,536 code points)
are converted.
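
Both case functions take a decoded code point, not encoded bytes, so text has
to be decoded before it is converted. The following is a minimal sketch of
applying such a function to an array of code points. C<map_code_points> and
C<ascii_upper_stand_in> are hypothetical names introduced for illustration;
the stand-in handles ASCII letters only and exists so that the example
compiles without the library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same shape as BH_UnicodeLower and BH_UnicodeUpper. */
typedef uint32_t (*case_map_fn)(uint32_t unit);

/* Hypothetical stand-in: uppercases ASCII letters, leaves the rest as is. */
uint32_t ascii_upper_stand_in(uint32_t unit)
{
    if (unit >= 'a' && unit <= 'z')
        return unit - ('a' - 'A');
    return unit;
}

/* Applies map to each of the count code points in units, in place. */
void map_code_points(uint32_t *units, size_t count, case_map_fn map)
{
    size_t i;

    assert(units != NULL && map != NULL);
    for (i = 0; i < count; i++)
        units[i] = map(units[i]);
}
```

With the library linked, C<BH_UnicodeUpper> or C<BH_UnicodeLower> could be
passed as I<map>, since they have the same signature.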


=head2 BH_UnicodeDecodeUtf8

 size_t BH_UnicodeDecodeUtf8(const char *string,
                             size_t size,
                             uint32_t *unit);

Decodes one UTF-8 sequence from I<string> (I<size> bytes long) and writes the
resulting code point to I<unit>.

Invalid UTF-8 sequences are reported by setting I<unit> to -1, that is,
C<(uint32_t)-1>.

On success, the function returns the number of bytes read. It returns 0 if
I<string> contains only part of a sequence.
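
The return value and the -1 marker together let a caller walk a buffer one
sequence at a time. The sketch below shows such a loop under stated
assumptions. C<count_code_points> and C<stand_in_decode_utf8> are hypothetical
names; the stand-in follows the contract described above but is not the
library's implementation, and it assumes that an invalid sequence consumes
exactly one byte, which this manual does not specify.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INVALID_UNIT ((uint32_t)-1)   /* "code -1" stored in a uint32_t */

/* Same shape as the BH_UnicodeDecode* functions. */
typedef size_t (*decode_fn)(const char *string, size_t size, uint32_t *unit);

/* Hypothetical stand-in decoder. Assumption: invalid input consumes 1 byte. */
size_t stand_in_decode_utf8(const char *string, size_t size, uint32_t *unit)
{
    static const uint32_t min_cp[5] = { 0, 0, 0x80, 0x800, 0x10000 };
    const unsigned char *s = (const unsigned char *)string;
    size_t need, i;
    uint32_t cp;

    if (size == 0)
        return 0;                              /* nothing to read yet */
    if (s[0] < 0x80) { *unit = s[0]; return 1; }
    else if ((s[0] & 0xE0) == 0xC0) { need = 2; cp = s[0] & 0x1F; }
    else if ((s[0] & 0xF0) == 0xE0) { need = 3; cp = s[0] & 0x0F; }
    else if ((s[0] & 0xF8) == 0xF0) { need = 4; cp = s[0] & 0x07; }
    else { *unit = INVALID_UNIT; return 1; }   /* bad lead byte */

    for (i = 1; i < need; i++) {
        if (i == size)
            return 0;                          /* sequence is truncated */
        if ((s[i] & 0xC0) != 0x80) { *unit = INVALID_UNIT; return 1; }
        cp = (cp << 6) | (s[i] & 0x3F);
    }
    /* Reject overlong forms, surrogates and values above U+10FFFF. */
    if (cp < min_cp[need] || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
        *unit = INVALID_UNIT;
        return 1;
    }
    *unit = cp;
    return need;
}

/* Returns the number of valid code points in buf. Stores the number of
 * invalid sequences in *invalid and the number of consumed bytes in *used. */
size_t count_code_points(const char *buf, size_t size, decode_fn decode,
                         size_t *invalid, size_t *used)
{
    size_t pos = 0, count = 0, n;
    uint32_t unit;

    assert(buf != NULL && decode != NULL && invalid != NULL && used != NULL);
    *invalid = 0;
    while (pos < size) {
        n = decode(buf + pos, size - pos, &unit);
        if (n == 0)
            break;                 /* partial sequence: wait for more input */
        if (unit == INVALID_UNIT)
            (*invalid)++;
        else
            count++;
        pos += n;
    }
    *used = pos;
    return count;
}
```

Bytes past I<*used> belong to an incomplete sequence and would normally be
kept and retried once more input arrives.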


=head2 BH_UnicodeEncodeUtf8

 size_t BH_UnicodeEncodeUtf8(uint32_t unit,
                             char *string);

Encodes the code point I<unit> as a UTF-8 sequence and writes it to I<string>.

I<string> must have room for at least 4 bytes.

On success, the function returns the number of bytes written; otherwise it
returns 0.
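
The 4-byte requirement follows from the encoding itself: a UTF-8 sequence is
between 1 and 4 bytes long. The sketch below is a standalone encoder with the
same call shape, shown only to illustrate that layout. C<sketch_encode_utf8>
is a hypothetical name, not the library's implementation, and returning 0 for
surrogates and for values above U+10FFFF is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch: writes 1 to 4 bytes to string, returns the count.
 * Assumption: surrogates and values above U+10FFFF yield 0. */
size_t sketch_encode_utf8(uint32_t unit, char *string)
{
    assert(string != NULL);
    if (unit < 0x80) {                       /* 0xxxxxxx */
        string[0] = (char)unit;
        return 1;
    }
    if (unit < 0x800) {                      /* 110xxxxx 10xxxxxx */
        string[0] = (char)(0xC0 | (unit >> 6));
        string[1] = (char)(0x80 | (unit & 0x3F));
        return 2;
    }
    if (unit >= 0xD800 && unit <= 0xDFFF)
        return 0;                            /* surrogates are not encodable */
    if (unit < 0x10000) {                    /* 1110xxxx 10xxxxxx 10xxxxxx */
        string[0] = (char)(0xE0 | (unit >> 12));
        string[1] = (char)(0x80 | ((unit >> 6) & 0x3F));
        string[2] = (char)(0x80 | (unit & 0x3F));
        return 3;
    }
    if (unit <= 0x10FFFF) {                  /* 11110xxx followed by 3 bytes */
        string[0] = (char)(0xF0 | (unit >> 18));
        string[1] = (char)(0x80 | ((unit >> 12) & 0x3F));
        string[2] = (char)(0x80 | ((unit >> 6) & 0x3F));
        string[3] = (char)(0x80 | (unit & 0x3F));
        return 4;
    }
    return 0;                                /* beyond the Unicode range */
}
```

A caller appending to a larger buffer would typically check that 4 bytes
remain before each call and advance the write position by the return value.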


=head2 BH_UnicodeDecodeUtf16LE

 size_t BH_UnicodeDecodeUtf16LE(const char *string,
                                size_t size,
                                uint32_t *unit);

Decodes one UTF-16LE sequence from I<string> (I<size> bytes long) and writes
the resulting code point to I<unit>.

Invalid UTF-16LE sequences are reported by setting I<unit> to -1, that is,
C<(uint32_t)-1>.

On success, the function returns the number of bytes read. It returns 0 if
I<string> contains only part of a sequence.


=head2 BH_UnicodeDecodeUtf16BE

 size_t BH_UnicodeDecodeUtf16BE(const char *string,
                                size_t size,
                                uint32_t *unit);

Decodes one UTF-16BE sequence from I<string> (I<size> bytes long) and writes
the resulting code point to I<unit>.

Invalid UTF-16BE sequences are reported by setting I<unit> to -1, that is,
C<(uint32_t)-1>.

On success, the function returns the number of bytes read. It returns 0 if
I<string> contains only part of a sequence.
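
The LE and BE decoders differ only in the byte order of each 16-bit code
unit; code points above U+FFFF occupy two units (a surrogate pair), which is
why a decoder can see only part of a sequence. The sketch below illustrates
both points. C<sketch_decode_utf16> and its I<big_endian> flag are
hypothetical and not part of the library, and consuming 2 bytes for an
invalid unit is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INVALID_UNIT ((uint32_t)-1)   /* "code -1" stored in a uint32_t */

/* Reads one 16-bit code unit from p in the requested byte order. */
static uint32_t read_u16(const unsigned char *p, int big_endian)
{
    return big_endian ? (((uint32_t)p[0] << 8) | p[1])
                      : (((uint32_t)p[1] << 8) | p[0]);
}

/* Hypothetical sketch. Returns bytes read, or 0 if the input is partial.
 * Assumption: an unpaired surrogate consumes 2 bytes. */
size_t sketch_decode_utf16(const char *string, size_t size, int big_endian,
                           uint32_t *unit)
{
    const unsigned char *s = (const unsigned char *)string;
    uint32_t hi, lo;

    assert(string != NULL && unit != NULL);
    if (size < 2)
        return 0;                          /* not even one code unit */
    hi = read_u16(s, big_endian);
    if (hi < 0xD800 || hi > 0xDFFF) {      /* ordinary BMP code point */
        *unit = hi;
        return 2;
    }
    if (hi >= 0xDC00) {                    /* low surrogate without a high one */
        *unit = INVALID_UNIT;
        return 2;
    }
    if (size < 4)
        return 0;                          /* high surrogate, pair incomplete */
    lo = read_u16(s + 2, big_endian);
    if (lo < 0xDC00 || lo > 0xDFFF) {      /* high surrogate not followed by low */
        *unit = INVALID_UNIT;
        return 2;
    }
    *unit = 0x10000 + ((hi - 0xD800) << 10) + (lo - 0xDC00);
    return 4;
}
```

For example, U+1F600 is the pair D83D DE00, stored as the bytes
3D D8 00 DE in UTF-16LE and D8 3D DE 00 in UTF-16BE.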


=head2 BH_UnicodeEncodeUtf16LE

 size_t BH_UnicodeEncodeUtf16LE(uint32_t unit,
                                char *string);

Encodes the code point I<unit> as a UTF-16LE sequence and writes it to
I<string>.

I<string> must have room for at least 4 bytes.

On success, the function returns the number of bytes written; otherwise it
returns 0.


=head2 BH_UnicodeEncodeUtf16BE

 size_t BH_UnicodeEncodeUtf16BE(uint32_t unit,
                                char *string);

Encodes the code point I<unit> as a UTF-16BE sequence and writes it to
I<string>.

I<string> must have room for at least 4 bytes.

On success, the function returns the number of bytes written; otherwise it
returns 0.
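
A UTF-16 sequence is either 2 bytes (one code unit) or 4 bytes (a surrogate
pair), so 4 bytes of space always suffice. The sketch below shows the
arithmetic behind the pair. C<sketch_encode_utf16> and its I<big_endian> flag
are hypothetical and not part of the library, and returning 0 for surrogates
and for values above U+10FFFF is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stores one 16-bit code unit at p in the requested byte order. */
static void write_u16(uint32_t value, int big_endian, char *p)
{
    char hi = (char)((value >> 8) & 0xFF);
    char lo = (char)(value & 0xFF);

    p[0] = big_endian ? hi : lo;
    p[1] = big_endian ? lo : hi;
}

/* Hypothetical sketch: writes 2 or 4 bytes to string, returns the count.
 * Assumption: surrogates and values above U+10FFFF yield 0. */
size_t sketch_encode_utf16(uint32_t unit, int big_endian, char *string)
{
    uint32_t v;

    assert(string != NULL);
    if (unit > 0x10FFFF || (unit >= 0xD800 && unit <= 0xDFFF))
        return 0;
    if (unit < 0x10000) {                  /* fits in one code unit */
        write_u16(unit, big_endian, string);
        return 2;
    }
    v = unit - 0x10000;                    /* 20 bits split into 10 + 10 */
    write_u16(0xD800 + (v >> 10), big_endian, string);
    write_u16(0xDC00 + (v & 0x3FF), big_endian, string + 2);
    return 4;
}
```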


=head2 BH_UnicodeDecodeUtf32LE

 size_t BH_UnicodeDecodeUtf32LE(const char *string,
                                size_t size,
                                uint32_t *unit);

Decodes one UTF-32LE sequence from I<string> (I<size> bytes long) and writes
the resulting code point to I<unit>.

Invalid UTF-32LE sequences are reported by setting I<unit> to -1, that is,
C<(uint32_t)-1>.

On success, the function returns the number of bytes read. It returns 0 if
I<string> contains only part of a sequence.


=head2 BH_UnicodeDecodeUtf32BE

 size_t BH_UnicodeDecodeUtf32BE(const char *string,
                                size_t size,
                                uint32_t *unit);

Decodes one UTF-32BE sequence from I<string> (I<size> bytes long) and writes
the resulting code point to I<unit>.

Invalid UTF-32BE sequences are reported by setting I<unit> to -1, that is,
C<(uint32_t)-1>.

On success, the function returns the number of bytes read. It returns 0 if
I<string> contains only part of a sequence.


=head2 BH_UnicodeEncodeUtf32LE

 size_t BH_UnicodeEncodeUtf32LE(uint32_t unit,
                                char *string);

Encodes the code point I<unit> as a UTF-32LE sequence and writes it to
I<string>.

I<string> must have room for at least 4 bytes.

On success, the function returns the number of bytes written; otherwise it
returns 0.


=head2 BH_UnicodeEncodeUtf32BE

 size_t BH_UnicodeEncodeUtf32BE(uint32_t unit,
                                char *string);

Encodes the code point I<unit> as a UTF-32BE sequence and writes it to
I<string>.

I<string> must have room for at least 4 bytes.

On success, the function returns the number of bytes written; otherwise it
returns 0.
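
UTF-32 stores every code point in exactly 4 bytes, so the LE and BE variants
differ only in byte order. The sketch below shows an encode and decode pair
for both orders. C<sketch_encode_utf32>, C<sketch_decode_utf32> and the
I<big_endian> flag are hypothetical and not part of the library; treating
surrogates and values above U+10FFFF as invalid is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INVALID_UNIT ((uint32_t)-1)   /* "code -1" stored in a uint32_t */

/* Assumption: only U+0000..U+10FFFF without surrogates is acceptable. */
static int is_scalar_value(uint32_t unit)
{
    return unit <= 0x10FFFF && !(unit >= 0xD800 && unit <= 0xDFFF);
}

/* Hypothetical sketch: writes 4 bytes to string, or returns 0. */
size_t sketch_encode_utf32(uint32_t unit, int big_endian, char *string)
{
    int i;

    assert(string != NULL);
    if (!is_scalar_value(unit))
        return 0;
    for (i = 0; i < 4; i++) {
        /* Big endian puts the most significant byte first. */
        int shift = big_endian ? 8 * (3 - i) : 8 * i;
        string[i] = (char)((unit >> shift) & 0xFF);
    }
    return 4;
}

/* Hypothetical sketch: reads 4 bytes, or returns 0 if fewer are available. */
size_t sketch_decode_utf32(const char *string, size_t size, int big_endian,
                           uint32_t *unit)
{
    const unsigned char *s = (const unsigned char *)string;
    uint32_t cp = 0;
    int i;

    assert(string != NULL && unit != NULL);
    if (size < 4)
        return 0;                          /* partial sequence */
    for (i = 0; i < 4; i++) {
        int shift = big_endian ? 8 * (3 - i) : 8 * i;
        cp |= (uint32_t)s[i] << shift;
    }
    *unit = is_scalar_value(cp) ? cp : INVALID_UNIT;
    return 4;
}
```

For example, U+1F600 is stored as the bytes 00 F6 01 00 in UTF-32LE and
00 01 F6 00 in UTF-32BE.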


=head1 SEE ALSO

L<BH>