15.8. String Functions and Operators
String Operators
The || operator performs concatenation.
String Functions
Note
These functions assume that the input strings contain valid UTF-8 encoded
Unicode code points. There are no explicit checks for valid UTF-8 and
the functions may return incorrect results on invalid UTF-8.
Invalid UTF-8 data can be corrected with from_utf8().
Additionally, the functions operate on Unicode code points and not user visible characters (or grapheme clusters). Some languages combine multiple code points into a single user-perceived character, the basic unit of a writing system for a language, but the functions will treat each code point as a separate unit.
The lower() and upper() functions do not perform
locale-sensitive, context-sensitive, or one-to-many mappings required for
some languages. Specifically, this will return incorrect results for
Lithuanian, Turkish and Azeri.
-
chr(n) → varchar Returns the Unicode code point
nas a single character string.
-
concat(string1, ..., stringN) → varchar Returns the concatenation of
string1,string2,...,stringN. This function provides the same functionality as the SQL-standard concatenation operator (||).
-
length(string) → bigint Returns the length of
stringin characters.
-
levenshtein_distance(string1, string2) → bigint Returns the Levenshtein edit distance of
string1andstring2, i.e. the minimum number of single-character edits (insertions, deletions or substitutions) needed to changestring1intostring2.
-
lower(string) → varchar Converts
stringto lowercase.
-
lpad(string, size, padstring) → varchar Left pads
stringtosizecharacters withpadstring. Ifsizeis less than the length ofstring, the result is truncated tosizecharacters.sizemust not be negative andpadstringmust be non-empty.
-
ltrim(string) → varchar Removes leading whitespace from
string.
-
replace(string, search) → varchar Removes all instances of
searchfromstring.
-
replace(string, search, replace) → varchar Replaces all instances of
searchwithreplaceinstring.
-
reverse(string) → varchar Returns
stringwith the characters in reverse order.
-
rpad(string, size, padstring) → varchar Right pads
stringtosizecharacters withpadstring. Ifsizeis less than the length ofstring, the result is truncated tosizecharacters.sizemust not be negative andpadstringmust be non-empty.
-
rtrim(string) → varchar Removes trailing whitespace from
string.
-
split(string, delimiter) → array<varchar> Splits
stringondelimiterand returns an array.
-
split(string, delimiter, limit) → array<varchar> Splits
stringondelimiterand returns an array of size at mostlimit. The last element in the array always contain everything left in thestring.limitmust be a positive number.
-
split_part(string, delimiter, index) → varchar Splits
stringondelimiterand returns the fieldindex. Field indexes start with1. If the index is larger than than the number of fields, then null is returned.
-
split_to_map(string, entryDelimiter, keyValueDelimiter) → map<varchar, varchar> Splits
stringbyentryDelimiterandkeyValueDelimiterand returns a map.entryDelimitersplitsstringinto key-value pairs.keyValueDelimitersplits each pair into key and value.
-
strpos(string, substring) → bigint Returns the starting position of the first instance of
substringinstring. Positions start with1. If not found,0is returned.
-
position(substring IN string) → bigint Returns the starting position of the first instance of
substringinstring. Positions start with1. If not found,0is returned.
-
substr(string, start) → varchar Returns the rest of
stringfrom the starting positionstart. Positions start with1. A negative starting position is interpreted as being relative to the end of the string.
-
substr(string, start, length) → varchar Returns a substring from
stringof lengthlengthfrom the starting positionstart. Positions start with1. A negative starting position is interpreted as being relative to the end of the string.
-
trim(string) → varchar Removes leading and trailing whitespace from
string.
-
upper(string) → varchar Converts
stringto uppercase.
Unicode Functions
-
normalize(string) → varchar Transforms
stringwith NFC normalization form.
-
normalize(string, form) → varchar Transforms
stringwith the specified normalization form.formmust be be one of the following keywords:Form Description NFDCanonical Decomposition NFCCanonical Decomposition, followed by Canonical Composition NFKDCompatibility Decomposition NFKCCompatibility Decomposition, followed by Canonical Composition Note
This SQL-standard function has special syntax and requires specifying
formas a keyword, not as a string.
-
to_utf8(string) → varbinary Encodes
stringinto a UTF-8 varbinary representation.
-
from_utf8(binary) → varchar Decodes a UTF-8 encoded string from
binary. Invalid UTF-8 sequences are replaced with the Unicode replacement characterU+FFFD.
-
from_utf8(binary, replace) → varchar Decodes a UTF-8 encoded string from
binary. Invalid UTF-8 sequences are replaced with replace. The replacement string replace must either be a single character or empty (in which case invalid characters are removed).