Decode base32-encoded data.
strb = base32decode(strt)
base32decode(strt) decodes the contents of string strt which represents data encoded with base32. Characters which are not 'A'-'Z', '2'-'7', or '=' are ignored. Decoding stops at the end of the string or when '=' is reached.
Encode data using base32.
strt = base32encode(strb)
base32encode(strb) encodes the contents of string strb which represents binary data. The result contains only characters 'A'-'Z', '2'-'7', and '=' (padding), with a linefeed every 56 characters. It is suitable for transmission or storage on media which accept only uppercase letters and digits; the digits '0' and '1', which are easily mistaken for letters, are not used.
Each character of encoded data represents 5 bits of binary data; i.e. one needs eight characters for five bytes. The five bits represent 32 different values, encoded with the characters 'A' to 'Z' and '2' to '7' in this order. When the length of the binary data is not a multiple of 5, the encoded data are padded with 6, 4, 3 or 1 characters '=' (for 1, 2, 3 or 4 remaining bytes respectively) to reach a multiple of 8.
Base32 encoding is an Internet standard described in RFC 4648.
s = base32encode(char(0:10))
  s =
    AAAQEAYEAUDAOCAJBI======
d = double(base32decode(s))
  d =
    0 1 2 3 4 5 6 7 8 9 10
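As an independent cross-check outside LME, the example can be reproduced with the Base32 codec of Python's standard library, which also follows RFC 4648. This is only a sketch: it assumes the LME output contains no linefeed for such a short input, and the names data, encoded, decoded and lengths are illustrative.

```python
import base64

# Same input as the LME example: the 11 bytes 0, 1, ..., 10.
data = bytes(range(11))

encoded = base64.b32encode(data).decode("ascii")
decoded = list(base64.b32decode(encoded))

# Encoded length for inputs of 1 to 10 bytes: always a multiple of 8,
# because each group of up to 5 bytes is padded to 8 characters.
lengths = [len(base64.b32encode(bytes(n))) for n in range(1, 11)]

print(encoded)
print(decoded)
print(lengths)
```

The encoded string should be identical to the one returned by base32encode above, since both follow the same standard.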
Decode base64-encoded data.
strb = base64decode(strt)
base64decode(strt) decodes the contents of string strt which represents data encoded with base64. Characters which are not 'A'-'Z', 'a'-'z', '0'-'9', '+', '/', or '=' are ignored. Decoding stops at the end of the string or when '=' is reached.
Encode data using base64.
strt = base64encode(strb)
base64encode(strb) encodes the contents of string strb which represents binary data. The result contains only characters 'A'-'Z', 'a'-'z', '0'-'9', '+', '/', and '=', with a linefeed every 60 characters. It is suitable for transmission or storage on media which accept only text.
Each character of encoded data represents 6 bits of binary data; i.e. one needs four characters for three bytes. The six bits represent 64 different values, encoded with the characters 'A' to 'Z', 'a' to 'z', '0' to '9', '+', and '/' in this order. When the binary data have a length which is not a multiple of 3, encoded data are padded with one or two characters '=' to have a multiple of 4.
Base64 encoding is an Internet standard described in RFC 2045.
s = base64encode(char(0:10))
  s =
    AAECAwQFBgcICQo=
double(base64decode(s))
  0 1 2 3 4 5 6 7 8 9 10
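The same round trip can be checked outside LME with Python's standard base64 module. This is a sketch for comparison only; the variable names are illustrative, and the padding list simply counts the '=' characters produced for inputs of 1, 2 and 3 bytes.

```python
import base64

# Same input as the LME example: the 11 bytes 0, 1, ..., 10.
data = bytes(range(11))

encoded = base64.b64encode(data).decode("ascii")
decoded = list(base64.b64decode(encoded))

# Number of '=' characters for inputs of 1, 2 and 3 bytes.
padding = [base64.b64encode(bytes(n)).decode("ascii").count("=") for n in (1, 2, 3)]

print(encoded)
print(decoded)
print(padding)
```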
Convert an array to a character array (string).
s = char(A) S = char(s1, s2, ...)
char(A) converts the elements of matrix A to characters, resulting in a string of the same size. Characters are stored in unsigned 16-bit words. The shape of A is preserved. Even if most functions ignore the string shape, you can force a row vector with char(A(:).').
char(s1,s2,...) concatenates vertically the arrays given as arguments to produce a string matrix. If the strings do not have the same number of columns, blanks are added to the right.
char(65:70)
  ABCDEF
char([65, 66; 67, 68](:).')
  ACBD
char('ab','cde')
  ab
  cde
char('abc',['de';'fg'])
  abc
  de
  fg
setstr, uint16, operator :, operator .', ischar, logical, double, single
Remove trailing blank characters from a string.
s2 = deblank(s1)
deblank(s1) removes the trailing blank characters from string s1. Blank characters are spaces (code 32), tabulators (code 9), carriage returns (code 13), line feeds (code 10), and null characters (code 0).
double(' \tAB  CD\r\n\0')
  32 9 65 66 32 32 67 68 13 10 0
double(deblank(' \tAB  CD\r\n\0'))
  32 9 65 66 32 32 67 68
HMAC authentication hash.
hash = hmac(hashtype, key, data) hash = hmac(hashtype, key, data, type=t)
hmac(hashtype,key,data) calculates the authentication hash of data with secret key key and the method specified by hashtype: 'md5', 'sha1', 'sha224', 'sha256', 'sha384', or 'sha512'. Both arguments data and key can be strings (char arrays) which are converted to UTF-8, or int8 or uint8 arrays. The key can be up to 64 bytes; longer keys are truncated. The result is a string of hexadecimal digits whose length depends on the hash method, from 32 for HMAC-MD5 to 128 for HMAC-SHA512.
Named argument type can change the output type. It can be 'uint8' for a uint8 array of 16 or 20 bytes (raw HMAC-MD5 or HMAC-SHA1 hash result), 'hex' for its representation as a string of 32 or 40 hexadecimal digits (default), or 'base64' for its conversion to Base64 in a string of 24 or 28 characters.
HMAC is an Internet standard described in RFC 2104.
HMAC-MD5 of 'Authenticated message' using secret key 'secret':
hmac('md5', 'secret', 'Authenticated message') 4f557b1f67bc4790e6e9568e2f458cf0
Same result computed explicitly, with the notations of RFC 2104: B is the block length, L is the hash length (16 for HMAC-MD5 or 20 for HMAC-SHA1), K is the key padded with zeros to have size B, and H is the hash function, defined here to produce a uint8 hash instead of a hexadecimal string like the LME functions md5 or sha1.
B = 64;
L = 16;
H = @(a) uint8(sscanf(md5(a), '%2x')');
key = uint8('secret');
data = uint8('Authenticated message');
K = [key, zeros(1, B - length(key), 'uint8')];
hash = H([bitxor(K, 0x5cuint8), H([bitxor(K, 0x36uint8), data])]);
sprintf('%.2x', hash)
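The explicit construction can also be checked against an independent HMAC implementation. The following Python sketch mirrors the RFC 2104 steps (zero-padded key, inner pad 0x36, outer pad 0x5c) and compares the result with Python's standard hmac module. The function name hmac_md5_manual is hypothetical, and the sketch assumes a key no longer than the block length B.

```python
import hashlib
import hmac

def hmac_md5_manual(key: bytes, data: bytes) -> str:
    """HMAC-MD5 following RFC 2104, for keys of at most B = 64 bytes."""
    B = 64
    if len(key) > B:
        raise ValueError("this sketch only handles keys of at most 64 bytes")
    K = key.ljust(B, b"\0")                    # key padded with zeros to size B
    ipad = bytes(k ^ 0x36 for k in K)          # inner padding
    opad = bytes(k ^ 0x5C for k in K)          # outer padding
    inner = hashlib.md5(ipad + data).digest()  # H(K xor ipad, data)
    return hashlib.md5(opad + inner).hexdigest()

manual = hmac_md5_manual(b"secret", b"Authenticated message")
library = hmac.new(b"secret", b"Authenticated message", hashlib.md5).hexdigest()
print(manual)
print(manual == library)
```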
Simple implementation of the HOTP and TOTP password algorithms (RFC 4226 and 6238) often used for two-factor authentication, with their default parameter values. The password is assumed to be base32-encoded.
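function n = hotp(pass, cnt)
  k = uint8(base32decode(pass));
  // counter as 8 bytes, big-endian
  c = bwrite(cnt, 'uint64;b');
  // or c = bwrite([floor(cnt/2^32), mod(cnt,2^32)], 'uint32;b');
  hs = hmac('sha1', k, c, type='uint8');
  // dynamic truncation: offset given by the low 4 bits of the last byte
  ob = mod(hs(20), 16);
  dt = mod(sread(hs(ob + (1:4)), [], 'uint32;b'), 2^31);
  // keep 6 decimal digits
  n = mod(dt, 1e6);

function n = totp(pass)
  // number of 30-second time steps since the POSIX epoch
  t = floor(posixtime / 30);
  n = hotp(pass, t);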
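For comparison, here is a Python sketch of the same HOTP computation, checked against the test values published in Appendix D of RFC 4226 (ASCII secret 12345678901234567890, counters 0, 1 and 2). It is an independent cross-check rather than a translation validated against LME; the parameter name digits is an addition for illustration.

```python
import base64
import hashlib
import hmac
import struct

def hotp(pass_b32: str, cnt: int, digits: int = 6) -> int:
    k = base64.b32decode(pass_b32)             # base32-encoded secret
    c = struct.pack(">Q", cnt)                 # counter as 8 bytes, big-endian
    hs = hmac.new(k, c, hashlib.sha1).digest()
    ob = hs[19] & 0x0F                         # offset: low 4 bits of the last byte
    dt = struct.unpack(">I", hs[ob:ob + 4])[0] & 0x7FFFFFFF
    return dt % 10 ** digits

# RFC 4226 test secret, base32-encoded as the function expects.
secret = base64.b32encode(b"12345678901234567890").decode("ascii")
codes = [hotp(secret, i) for i in range(3)]
print(codes)
```

Codes are returned as integers, so a value below 100000 must be left-padded with zeros before being displayed as a 6-digit password.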
Simple implementation of the PBKDF2 key stretching algorithm (RFC 2898):
function dk = pbkdf2_hmac(hashtype, p, salt, c, dkLen)
  hLen = length(hmac(hashtype, '', '')) / 2;
  dk = uint8([]);
  for i = 1:ceil(dkLen / hLen)
    u = hmac(hashtype, p, [salt, bwrite(i, 'uint32;b')], type='uint8');
    f = u;
    for j = 2:c
      u = hmac(hashtype, p, u, type='uint8');
      f = bitxor(f, u);
    end
    dk = [dk, f];
  end
  dk = dk(1:dkLen);
Test of PBKDF2-HMAC-SHA1 with values provided in RFC 6070 (output format is switched to hexadecimal for easier comparison):
format int x
pbkdf2_hmac('sha1', 'password', 'salt', 4096, 20)
  0x4b 0x0 0x79 0x1 0xb7 0x65 0x48 0x9a 0xbe 0xad 0x49 0xd9 0x26 0xf7 0x21 0xd0 0x65 0xa4 0x29 0xc1
format
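The same RFC 6070 test vector can be verified outside LME. The Python sketch below follows the same block structure as the function above and compares its output with hashlib.pbkdf2_hmac from the standard library. The function name and argument names are chosen to mirror the LME code and are otherwise arbitrary.

```python
import hashlib
import hmac
import struct

def pbkdf2_hmac(hashtype, p, salt, c, dk_len):
    h_len = hashlib.new(hashtype).digest_size
    dk = b""
    # One block per h_len bytes of output; block indices start at 1.
    for i in range(1, -(-dk_len // h_len) + 1):
        u = hmac.new(p, salt + struct.pack(">I", i), hashtype).digest()
        f = u
        for _ in range(2, c + 1):
            u = hmac.new(p, u, hashtype).digest()
            f = bytes(a ^ b for a, b in zip(f, u))  # running XOR of all u
        dk += f
    return dk[:dk_len]

derived = pbkdf2_hmac("sha1", b"password", b"salt", 4096, 20)
reference = hashlib.pbkdf2_hmac("sha1", b"password", b"salt", 4096, 20)
print(derived.hex())
print(derived == reference)
```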
Test for a string object.
b = ischar(obj)
ischar(obj) is true if the object obj is a character string, false otherwise. Strings can have more than one line.
ischar('abc')
  true
ischar(0)
  false
ischar([])
  false
ischar('')
  true
ischar(['abc';'def'])
  true
isletter, isspace, isnumeric, islogical, isinteger, islist, isstruct, setstr, char
Test for decimal digit characters.
b = isdigit(s)
For each character of string s, isdigit(s) is true if it is a digit ('0' to '9') and false otherwise. The result is a logical array with the same size as the input argument.
isdigit('a123bAB12* ') F T T T F F F T T F F
isletter, isspace, lower, upper, ischar
Test for letter characters.
b = isletter(s)
For each character of string s, isletter(s) is true if it is an ASCII letter (a-z or A-Z) and false otherwise. The result is a logical array with the same size as the input argument.
isletter gives false for letters outside the 7-bit ASCII range; unicodeclass should be used for Unicode-aware tests.
isletter('abAB12*')
  T T T T F F F
isdigit, isspace, lower, upper, ischar, unicodeclass
Test for space characters.
b = isspace(s)
For each character of string s, isspace(s) is true if it is a space, a tabulator, a carriage return or a line feed, and false otherwise. The result is a logical array with the same size as the input argument.
isspace('a\tb c\nd') F T F T F T F
Convert LaTeX equation to MathML.
str = latex2mathml(tex) str = latex2mathml(tex, mml1, mml2, ...) str = latex2mathml(..., displaymath=b)
latex2mathml(tex) converts the LaTeX equation in string tex to MathML. LaTeX equations may be enclosed between dollars or double-dollars, but this is not mandatory. In string literals, backslash and tick characters must be escaped as \\ and \' respectively.
With additional arguments, which must be strings containing MathML, each parameter #i in argument tex (#1, #2, ...) is replaced with argument i+1.
The following LaTeX features are supported:
LaTeX features not enumerated above, such as definitions and nested text and equations, are not supported.
latex2mathml also has features which are missing in LaTeX. Unicode is used for both LaTeX input and MathML output. Some semantics are recognized to build subexpressions which are revealed in the resulting MathML. For instance, in x+(y+z)w, (y+z) is a subexpression; so is (y+z)w with an implicit multiplication (resulting in the <mo>⁢</mo> MathML operator), used as the second operand of the addition. LaTeX code (like mathematical notation) is sometimes ambiguous and is not always converted to the expected MathML (e.g. a(b+c) is converted to a function call while the same notation could mean the product of a and b+c), but this should not have any visible effect when the MathML is typeset.
Operators can be used as freely as in LaTeX. Missing operands result in <none/>, as if there were an empty pair of braces {}. Consecutive terms are joined with implicit multiplications.
Named argument displaymath specifies whether the vertical space is tight, like in inline equations surrounded by text (false), or unconstrained, as rendered in separate lines (true). It affects the position of some limits. The default is true.
latex2mathml('xy^2')
  <mrow><mi>x</mi><mo>⁢</mo><msup><mi>y</mi><mn>2</mn></msup></mrow>
mml = latex2mathml('\\frac{x_3+5}{x_1+x_2}');
mml = latex2mathml('$\\root n \\of x$');
mml = latex2mathml('\\pmatrix{x & \\sqrt y \\cr \\sin\\phi & \\hat\\ell}');
mml = latex2mathml('\\dot x = #1', mathml([1,2;3,0], false));
mml = latex2mathml('\\lim_{x \\rightarrow 0} f(x)', displaymath=true)
mml = latex2mathml('\\lim_{x \\rightarrow 0} f(x)', displaymath=false)
Convert all uppercase letters to lowercase.
s2 = lower(s1)
lower(s1) converts all the uppercase letters of string s1 to lowercase, according to the Unicode Character Database.
lower('abcABC123') abcabc123
Conversion to MathML.
str = mathml(x) str = mathml(x, false) str = mathml(..., Format=f, NPrec=n)
mathml(x) converts its argument x to MathML presentation, returned as a string.
By default, the MathML top-level element is <math>. If the result is to be used as a MathML subelement of a larger equation, a second input argument equal to the logical value false can be specified to suppress <math>.
By default, mathml converts numbers like format '%g' of sprintf. Named arguments can override this: Format is a single-letter format recognized by sprintf and NPrec is the precision (number of decimals).
mathml(pi)
  <math>
  <mn>3.1416</mn>
  </math>
mathml(1e-6, Format='e', NPrec=2)
  <math>
  <mrow><mn>1.00</mn><mo>·</mo><msup><mn>10</mn><mn>-6</mn></msup></mrow>
  </math>
mathmlpoly, latex2mathml, sprintf
Conversion of a polynomial to MathML.
str = mathmlpoly(pol) str = mathmlpoly(pol, var) str = mathmlpoly(..., power) str = mathmlpoly(..., false) str = mathmlpoly(..., Format=f, NPrec=n)
mathmlpoly(pol) converts polynomial coefficients pol to MathML presentation, returned as a string. The polynomial is given as a vector of coefficients, with the highest power first; e.g., [1,2,5,3] represents x^3+2x^2+5x+3.
By default, the name of the variable is x. An optional second argument can specify another name as a string, such as 'y', or a MathML fragment beginning with a less-than character, such as '<mn>3</mn>'.
Powers can be specified explicitly with an additional argument, a vector which must have the same length as the polynomial coefficients. Negative and fractional numbers are allowed; the imaginary part, if any, is ignored.
By default, the MathML top-level element is <math>. If the result is to be used as a MathML subelement of a larger equation, an additional input argument (the last unnamed argument) equal to the logical value false can be specified to suppress <math>.
Named arguments Format and NPrec have the same effect as with mathml.
Simple third-order polynomial:
mathmlpoly([1,2,5,3])
Polynomial with negative powers of variable q:
c = [1, 2.3, 4.5, -2]; mathmlpoly(c, 'q', -(0:numel(c)-1))
Rational fraction:
str = sprintf('<mfrac>%s%s</mfrac>', mathmlpoly(num, false), mathmlpoly(den, false));
Calculate MD5 digest.
digest = md5(strb) digest = md5(fd) digest = md5(..., type=t)
md5(strb) calculates the MD5 digest of strb which represents binary data. strb can be a string (only the least-significant byte of each character is considered) or an array of bytes of class uint8 or int8. The result is a string of 32 hexadecimal digits. It is believed to be hard to create the input to get a given digest, or to create two inputs with the same digest.
md5(fd) calculates the MD5 digest of the bytes read from file descriptor fd until the end of the file. The file is left open.
Named argument type can change the output type. It can be 'uint8' for a uint8 array of 16 bytes (raw MD5 hash result), 'hex' for its representation as a string of 32 hexadecimal digits (default), or 'base64' for its conversion to Base64 in a string of 24 characters.
MD5 digest is an Internet standard described in RFC 1321.
MD5 of the three characters 'a', 'b', and 'c':
md5('abc') 900150983cd24fb0d6963f7d28e17f72
This can be compared to the result of the command-line tool md5 found on many Unix systems:
$ echo -n abc | md5 900150983cd24fb0d6963f7d28e17f72
The following statements calculate the digest of the file 'somefile':
fd = fopen('somefile'); digest = md5(fd); fclose(fd);
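For comparison outside LME, a file digest can be computed in Python by feeding the file to hashlib in chunks. This is a sketch: the function name md5_of_file and the chunk size are arbitrary choices, and the temporary file containing the three bytes 'abc' stands in for 'somefile'. Unlike md5(fd), the function opens and closes the file itself.

```python
import hashlib
import os
import tempfile

def md5_of_file(path, chunk_size=65536):
    """Return the hexadecimal MD5 digest of the file at path."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read until end of file; an empty chunk ends the iteration.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Temporary file holding the bytes 'abc', removed after use.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"abc")
file_digest = md5_of_file(tmp.name)
os.remove(tmp.name)
print(file_digest)
```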
Regular expression match.
(startIx, endIx, grExt) = regexp(str, re) (startIx, endIx, grExt) = regexpi(str, re)
regexp(str,re) matches regular expression re in string str. A regular expression is a string which contains meta-characters to match classes of characters, repetitions and alternatives, as described below.
Once a match is found, the remaining part of str is parsed from the end of the previous match to find more matches. The result of regexp is an array of start indices in str and an array of corresponding end indices. Empty matches have a length endIx-startIx+1=0.
The third output argument, if present, is set to a list whose items correspond to matches. Items are arrays of size ng-by-2, where ng is the number of groups. Each row corresponds to a group, i.e. a subexpression in parentheses in the regular expression; the first column contains the index of the first character in str and the second column contains the index of the last character.
regexpi is similar to regexp, except that letter case is ignored.
The following regular expression elements are recognized:
Quantifiers ?, * and +, and their lazy and possessive versions (suffixed with ? or + respectively) have the highest priority. Priority can be changed with parentheses, e.g. (abc)* or (a|bc)d.
Simple match without metacharacter:
(startIx, endIx) = regexp('Some random string', 'om') startIx = 2 10 endIx = 3 11
Dot to match any character:
regexp('Some random string', 'S..e') 1
Anchor to end of string:
regexp('Some random string', '..$') 17
Repetition:
regexp('Some random string', 'r.*m') 6
By default, repetitions are greedy (as many as possible):
(startIx, endIx) = regexp('Some random string', '.*m') startIx = 1 endIx = 11
Lazy repetition (as few as possible):
(startIx, endIx) = regexp('Some random string', '.*?m') startIx = 1 4 endIx = 3 11
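The difference between greedy and lazy repetition can be reproduced with Python's re module, whose syntax for these two quantifiers is the same. The sketch below defines a hypothetical helper named regexp which converts Python's 0-based, end-exclusive spans to 1-based, end-inclusive indices like those shown above; other details of Python's regular expressions may differ from LME's.

```python
import re

def regexp(s, pattern, flags=0):
    """Return 1-based inclusive start and end indices of all matches."""
    start_ix, end_ix = [], []
    for m in re.finditer(pattern, s, flags):
        start_ix.append(m.start() + 1)  # 0-based start -> 1-based start
        end_ix.append(m.end())          # exclusive 0-based end == inclusive 1-based end
    return start_ix, end_ix

text = "Some random string"
simple = regexp(text, r"om")
greedy = regexp(text, r".*m")
lazy = regexp(text, r".*?m")
print(simple)
print(greedy)
print(lazy)
```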
Possessive repetitions keep the largest number of repetitions which provides a match regardless of subsequent failures:
(startIx, endIx) = regexp('Some random string', '.*m ')
  startIx =
    1
  endIx =
    12
(startIx, endIx) = regexp('Some random string', '.*+m ')
  startIx =
    []
  endIx =
    []
Since backslash is an escape character in LME strings, it must be escaped itself:
(startIx, endIx) = regexp('Some random string', '\\b\\w.+?\\b') startIx = 1 6 13 endIx = 4 11 18
Reference to a captured group:
(startIx, endIx) = regexp('xx-ab-ab', '(.+)-\\1') startIx = 4 endIx = 8
Positive lookahead to find words followed by a colon without picking the colon itself:
(startIx, endIx) = regexp('mailto:foo@example.com', '\\b\\w+(?=:)') startIx = 1 endIx = 6
Group (the extent of the whole match is ignored using placeholder output arguments ~):
(~, ~, grExt) = regexp('Regexp are fun', '\\b(\\w+)\\s+(\\w+)\\s+(\\w+)\\b');
grExt{1}
   1  6
   8 10
  12 14
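The extent of each group can be obtained in a similar way with Python's re module. This sketch builds one row per group, with the 1-based index of the first and last character; the name gr_ext is illustrative and only the first match is considered.

```python
import re

m = re.search(r"\b(\w+)\s+(\w+)\s+(\w+)\b", "Regexp are fun")

# One [first, last] row per group, converted to 1-based inclusive indices.
gr_ext = [[m.start(g) + 1, m.end(g)] for g in range(1, m.re.groups + 1)]
print(gr_ext)
```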
Match ignoring case:
regexpi('Some random string', 'some') 1
Case-explicit character classes are still case-significant, but character enumerations or ranges are not:
regexpi('Some random string', '^[[:lower:]]')
  []
regexpi('Some random string', '^[a-z]')
  1
Conversion of an array to a string.
str = setstr(A)
setstr(A) converts the elements of array A to characters, resulting in a string of the same size. Characters are stored in unsigned 16-bit words.
setstr(65:75) ABCDEFGHIJK
Calculate SHA-1 or SHA-2 digest.
digest = sha1(strb) digest = sha1(fd) digest = sha1(..., type=t) digest = sha2(...) digest = sha2(..., variant=v)
sha1(strb) calculates the SHA-1 digest of strb which represents binary data. strb can be a string (only the least-significant byte of each character is considered) or an array of bytes of class uint8 or int8. The result is a string of 40 hexadecimal digits. It is believed to be hard to create the input to get a given digest, or to create two inputs with the same digest.
sha1(fd) calculates the SHA-1 digest of the bytes read from file descriptor fd until the end of the file. The file is left open.
Named argument type can change the output type. It can be 'uint8' for a uint8 array of 20 bytes (raw SHA-1 hash result), 'hex' for its representation as a string of 40 hexadecimal digits (default), or 'base64' for its conversion to Base64 in a string of 28 characters.
SHA-1 digest is an Internet standard described in RFC 3174.
sha2 calculates the SHA-256 digest, a 256-bit variant of the SHA-2 hash algorithm. Its arguments are the same as those of sha1. In addition, named argument variant can specify one of the supported SHA-2 variants: 224, 256 (default), 384, or 512.
SHA-1 digest of the three characters 'a', 'b', and 'c':
sha1('abc') a9993e364706816aba3e25717850c26c9cd0d89d
SHA-224 digest of the empty message '':
sha2('', variant=224) d14a028c2a3a2bc9476102bb288234c415a2b01f828ea62ac5b3e42f
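Both digests can be cross-checked with Python's hashlib, and the three output forms described for named argument type (raw bytes, hexadecimal, Base64) can be compared by length. This is a sketch for verification only; the variable names are illustrative.

```python
import base64
import hashlib

sha1_abc = hashlib.sha1(b"abc").hexdigest()
sha224_empty = hashlib.sha224(b"").hexdigest()

# The three representations of the SHA-1 digest of 'abc'.
raw = hashlib.sha1(b"abc").digest()         # 20 raw bytes
b64 = base64.b64encode(raw).decode("ascii")  # 28 Base64 characters

print(sha1_abc)
print(sha224_empty)
print(len(raw), len(sha1_abc), len(b64))
```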
Split a string.
L = split(string, separator)
split(string,separator) finds the substrings of string separated by separator and returns them as a list. Empty substrings are discarded. separator is a string of one or more characters.
split('abc;de;f', ';')
  {'abc', 'de', 'f'}
split('++a+++b+++','++')
  {'a', '+b', '+'}
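The second example is easier to follow when the scan is spelled out: separators are searched from left to right without overlap, and the empty pieces are then dropped. The Python sketch below reproduces both results under that assumption; the function name split mirrors the LME one but is otherwise unrelated to it.

```python
def split(string, separator):
    """Split on a multi-character separator and discard empty substrings."""
    return [part for part in string.split(separator) if part]

print("++a+++b+++".split("++"))   # pieces before empty ones are discarded
print(split("abc;de;f", ";"))
print(split("++a+++b+++", "++"))
```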
String comparison.
b = strcmp(s1, s2) b = strcmp(s1, s2, n)
strcmp(s1, s2) is true if the strings s1 and s2 are equal (i.e. same length and corresponding characters are equal). strcmp(s1, s2, n) compares the strings up to the nth character. Note that this function does not return the same result as the strcmp function of the standard C library.
strcmp('abc','abc')
  true
strcmp('abc','def')
  false
strcmp('abc','abd',2)
  true
strcmp('abc','abd',5)
  false
strcmpi, operator ===, operator ~==, operator ==, strfind, strmatch
String comparison ignoring letter case.
b = strcmpi(s1, s2) b = strcmpi(s1, s2, n)
strcmpi compares strings for equality, ignoring letter case. In every other respect, it behaves like strcmp.
strcmpi('abc','aBc') true strcmpi('Abc','abd',2) true
strcmp, operator ===, operator ~==, operator ==, strfind, strmatch
Find a substring in a string.
pos = strfind(str, sub)
strfind(str,sub) finds occurrences of string sub in string str and returns a vector of the positions of all occurrences, or the empty vector [] if there is none. Occurrences may overlap.
strfind('ababcdbaaab','ab')
  1 3 10
strfind('ababcdbaaab','ac')
  []
strfind('aaaaaa','aaa')
  1 2 3 4
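Because occurrences may overlap, the search must test every starting position rather than resume after the end of the previous occurrence. A minimal Python sketch of that behavior, with 1-based positions (the function name strfind mirrors the LME one for readability only):

```python
def strfind(s, sub):
    """Return the 1-based positions of all, possibly overlapping, occurrences."""
    return [i + 1 for i in range(len(s) - len(sub) + 1) if s.startswith(sub, i)]

print(strfind("ababcdbaaab", "ab"))
print(strfind("ababcdbaaab", "ac"))
print(strfind("aaaa", "aa"))
```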
find, strcmp, strrep, split, strmatch, strtok
String match.
i = strmatch(str, strMatrix) i = strmatch(str, strList) i = strmatch(..., 'exact')
strmatch(str,strMatrix) compares string str with each row of the character matrix strMatrix; it returns the index of the first row whose beginning is equal to str, or 0 if no match is found. Case is significant.
strmatch(str,strList) compares string str with each element of list or cell array strList, which must be strings.
With a third argument, which must be the string 'exact', str must match the complete row or element of the second argument, not only the beginning.
strmatch('abc',['axyz';'uabc';'abcd';'efgh'])
  3
strmatch('abc',['axyz';'uabc';'abcd';'efgh'],'exact')
  0
strmatch('abc',{'ABC','axyz','abcdefg','ab','abcd'})
  3
Replace a substring in a string.
newstr = strrep(str, sub, repl)
strrep(str,sub,repl) replaces all occurrences of string sub in string str with string repl.
strrep('ababcdbaaab','ab','X')
  'XXcdbaaX'
strrep('aaaaaaa','aaa','12345')
  '1234512345a'
strfind, strcmp, strmatch, strtok
Token search in string.
(token, remainder) = strtok(str) (token, remainder) = strtok(str, separators)
strtok(str) gives the first token in string str. A token is defined as a substring delimited by separators or by the beginning or end of the string; by default, separators are spaces, tabulators, carriage returns and line feeds. If no token is found (i.e. if str is empty or contains only separator characters), the result is the empty string.
The optional second output is set to what immediately follows the token, including separators. If no token is found, it is the same as str.
An optional second input argument contains the separators in a string.
Strings are displayed with quotes to show clearly the separators.
strtok(' ab cde ')
  'ab'
(t, r) = strtok(' ab cde ')
  t =
    'ab'
  r =
    ' cde '
(t, r) = strtok('2, 5, 3', ', ')
  t =
    '2'
  r =
    ', 5, 3'
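The token rule described above (skip leading separators, then take characters up to the next separator) can be sketched in Python as follows. This is an illustration of the described behavior, not the LME implementation; the function and parameter names are chosen to mirror the text.

```python
def strtok(s, separators=" \t\r\n"):
    """Return (token, remainder) following the rule described above."""
    i = 0
    while i < len(s) and s[i] in separators:      # skip leading separators
        i += 1
    j = i
    while j < len(s) and s[j] not in separators:  # scan the token
        j += 1
    token = s[i:j]
    if not token:
        return "", s          # no token: the remainder is the whole string
    return token, s[j:]       # remainder keeps the separators after the token

print(strtok(" ab cde "))
print(strtok("2, 5, 3", ", "))
print(strtok("   "))
```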
Remove leading and trailing blank characters from a string.
s2 = strtrim(s1)
strtrim(s1) removes the leading and trailing blank characters from string s1. Blank characters are spaces (code 32), tabulators (code 9), carriage returns (code 13), line feeds (code 10), and null characters (code 0).
double(' \tAB  CD\r\n\0')
  32 9 65 66 32 32 67 68 13 10 0
double(strtrim(' \tAB  CD\r\n\0'))
  65 66 32 32 67 68
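The set of blank characters listed above can be passed explicitly to Python's strip and rstrip to reproduce the effect of strtrim and of trailing-only removal. This sketch assumes the test string contains two spaces between AB and CD, as the character codes indicate; the constant name BLANKS is illustrative.

```python
# Space, tabulator, carriage return, line feed and null character.
BLANKS = " \t\r\n\0"

s = " \tAB  CD\r\n\0"
trimmed = s.strip(BLANKS)           # leading and trailing blanks removed
trailing_only = s.rstrip(BLANKS)    # trailing blanks removed

print([ord(c) for c in s])
print([ord(c) for c in trimmed])
print([ord(c) for c in trailing_only])
```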
Unicode character class.
cls = unicodeclass(c)
unicodeclass(c) gives the Unicode character class (General_Category property in the Unicode Character Database) of its argument c, which must be a single-character string. The result is one of the following two-character strings:
Class | Description | Class | Description |
---|---|---|---|
'Lu' | Letter, Uppercase | 'Pi' | Punctuation, Initial Quote |
'Ll' | Letter, Lowercase | 'Pf' | Punctuation, Final Quote |
'Lt' | Letter, Titlecase | 'Po' | Punctuation, Other |
'Lm' | Letter, Modifier | 'Sm' | Symbol, Math |
'Lo' | Letter, Other | 'Sc' | Symbol, Currency |
'Mn' | Mark, Non-Spacing | 'Sk' | Symbol, Modifier |
'Mc' | Mark, Spacing Combining | 'So' | Symbol, Other |
'Me' | Mark, Enclosing | 'Zs' | Separator, Space |
'Nd' | Number, Decimal Digit | 'Zl' | Separator, Line |
'Nl' | Number, Letter | 'Zp' | Separator, Paragraph |
'No' | Number, Other | 'Cc' | Other, Control |
'Pc' | Punctuation, Connector | 'Cf' | Other, Format |
'Pd' | Punctuation, Dash | 'Cs' | Other, Surrogate |
'Ps' | Punctuation, Open | 'Co' | Other, Private Use |
'Pe' | Punctuation, Close | 'Cn' | Other, Not Assigned |
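The same two-letter General_Category codes are exposed by Python's unicodedata module, which can serve as a quick reference for the class of a given character. The sketch below is a cross-check outside LME; results for recently added characters depend on the version of the Unicode Character Database bundled with each implementation.

```python
import unicodedata

samples = ["A", "a", "1", " ", "+", "$", "(", "-", "_", "\n"]
classes = {c: unicodedata.category(c) for c in samples}
for c, cls in classes.items():
    print(repr(c), cls)
```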
Convert all lowercase letters to uppercase.
s2 = upper(s1)
upper(s1) converts all the lowercase letters of string s1 to uppercase, according to the Unicode Character Database.
upper('abcABC123') ABCABC123
Decode Unicode characters encoded with UTF-32.
str = utf32decode(b)
utf32decode(b) decodes the contents of uint32 or int32 array b which represents Unicode characters encoded with UTF-32 (basically, Unicode code points). The result is a standard character array with a single row, usually encoded with UTF-16. Invalid codes are ignored.
If all the codes in b correspond to the Basic Multilingual Plane (16 bits, and not in the surrogate range 0xd800-0xdfff), the result is equivalent to char(b).
Encode a string of Unicode characters using UTF-32.
b = utf32encode(str)
utf32encode(str) encodes the contents of character array str using UTF-32. Each Unicode character in str, made of 1 or 2 UTF-16 words, corresponds to one UTF-32 code. The result is an array of unsigned 32-bit integers.
If all the characters in str correspond to the Basic Multilingual Plane (16 bits, and no surrogate pairs), the result is equivalent to uint32(str).
utf32encode('abc')
  1x3 uint32 array
    97 98 99
str = utf32decode(65872uint32);
double(str)
  55296 56656
utf32encode(str)
  65872uint32
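The two 16-bit words obtained for code point 65872 follow from the UTF-16 surrogate pair rule: subtract 0x10000, then add the high 10 bits to 0xD800 and the low 10 bits to 0xDC00. A Python sketch of that arithmetic, cross-checked against Python's own UTF-16 encoder (the function name utf16_words is hypothetical):

```python
import struct

def utf16_words(code_point):
    """Return the UTF-16 words (1 or 2) encoding a Unicode code point."""
    if code_point < 0x10000:
        return [code_point]               # Basic Multilingual Plane: one word
    v = code_point - 0x10000
    return [0xD800 + (v >> 10),           # high surrogate: upper 10 bits
            0xDC00 + (v & 0x3FF)]         # low surrogate: lower 10 bits

words = utf16_words(65872)
# Cross-check with the standard encoder (big-endian, no byte order mark).
check = list(struct.unpack(">2H", chr(65872).encode("utf-16-be")))
print(words)
print(check)
```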
Decode Unicode characters encoded with UTF-8.
str = utf8decode(b)
utf8decode(b) decodes the contents of uint8 or int8 array b which represents Unicode characters encoded with UTF-8. Each Unicode character corresponds to up to 4 bytes of UTF-8 code. The result is a standard character array with a single row; characters are usually encoded as UTF-16, with 1 or 2 words per character. Invalid codes (for example when the beginning of the decoded data does not correspond to a character boundary) are ignored.
Encode a string of Unicode characters using UTF-8.
b = utf8encode(str)
utf8encode(str) encodes the contents of character array str using UTF-8. Each Unicode character in str corresponds to up to 4 bytes of UTF-8 code. The result is an array of unsigned 8-bit integers.
If the input string contains codes which are not valid Unicode characters, the output is invalid.
b = utf8encode(['abc', 200, 2000, 20000])
  b =
    1x10 uint8 array
      97 98 99 195 136 223 144 228 184 160
str = utf8decode(b);
double(str)
  97 98 99 200 2000 20000
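The byte sequence in this example shows the three shortest UTF-8 forms: one byte for codes below 128, two bytes for 200 and 2000, and three bytes for 20000. It can be reproduced with Python's built-in codec as a cross-check; the variable names are illustrative.

```python
codes = [97, 98, 99, 200, 2000, 20000]
text = "".join(chr(c) for c in codes)

encoded = list(text.encode("utf-8"))
decoded = [ord(ch) for ch in bytes(encoded).decode("utf-8")]

# Number of UTF-8 bytes used by each character.
sizes = [len(chr(c).encode("utf-8")) for c in codes]

print(encoded)
print(decoded)
print(sizes)
```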