14.8. String Functions and Operators
String Operators
The || operator performs concatenation.
String Functions
Note
These functions assume that the input strings contain valid UTF-8 encoded
Unicode code points.  There are no explicit checks for valid UTF-8 and
the functions may return incorrect results on invalid UTF-8.
Invalid UTF-8 data can be corrected with from_utf8().
Additionally, the functions operate on Unicode code points and not user visible characters (or grapheme clusters). Some languages combine multiple code points into a single user-perceived character, the basic unit of a writing system for a language, but the functions will treat each code point as a separate unit.
The lower() and upper() functions do not perform
locale-sensitive, context-sensitive, or one-to-many mappings required for
some languages. Specifically, this will return incorrect results for
Lithuanian, Turkish and Azeri.
- 
chr(n) → varchar
- Returns the Unicode code point - nas a single character string.
- 
concat(string1, ..., stringN) → varchar
- Returns the concatenation of - string1,- string2,- ...,- stringN. This function provides the same functionality as the SQL-standard concatenation operator (- ||).
- 
length(string) → bigint
- Returns the length of - stringin characters.
- 
lower(string) → varchar
- Converts - stringto lowercase.
- 
lpad(string, size, padstring) → varchar
- Left pads - stringto- sizecharacters with- padstring. If- sizeis less than the length of- string, the result is truncated to- sizecharacters.- sizemust not be negative and- padstringmust be non-empty.
- 
ltrim(string) → varchar
- Removes leading whitespace from - string.
- 
replace(string, search) → varchar
- Removes all instances of - searchfrom- string.
- 
replace(string, search, replace) → varchar
- Replaces all instances of - searchwith- replacein- string.
- 
reverse(string) → varchar
- Returns - stringwith the characters in reverse order.
- 
rpad(string, size, padstring) → varchar
- Right pads - stringto- sizecharacters with- padstring. If- sizeis less than the length of- string, the result is truncated to- sizecharacters.- sizemust not be negative and- padstringmust be non-empty.
- 
rtrim(string) → varchar
- Removes trailing whitespace from - string.
- 
split(string, delimiter) → array<varchar>
- Splits - stringon- delimiterand returns an array.
- 
split(string, delimiter, limit) → array<varchar>
- Splits - stringon- delimiterand returns an array of size at most- limit. The last element in the array always contain everything left in the- string.- limitmust be a positive number.
- 
split_part(string, delimiter, index) → varchar
- Splits - stringon- delimiterand returns the field- index. Field indexes start with- 1. If the index is larger than than the number of fields, then null is returned.
- 
split_to_map(string, entryDelimiter, keyValueDelimiter) → map<varchar, varchar>
- Splits - stringby- entryDelimiterand- keyValueDelimiterand returns a map.- entryDelimitersplits- stringinto key-value pairs.- keyValueDelimitersplits each pair into key and value.
- 
strpos(string, substring) → bigint
- Returns the starting position of the first instance of - substringin- string. Positions start with- 1. If not found,- 0is returned.
- 
position(substring IN string) → bigint
- Returns the starting position of the first instance of - substringin- string. Positions start with- 1. If not found,- 0is returned.
- 
substr(string, start) → varchar
- Returns the rest of - stringfrom the starting position- start. Positions start with- 1. A negative starting position is interpreted as being relative to the end of the string.
- 
substr(string, start, length) → varchar
- Returns a substring from - stringof length- lengthfrom the starting position- start. Positions start with- 1. A negative starting position is interpreted as being relative to the end of the string.
- 
trim(string) → varchar
- Removes leading and trailing whitespace from - string.
- 
upper(string) → varchar
- Converts - stringto uppercase.
Unicode Functions
- 
normalize(string) → varchar
- Transforms - stringwith NFC normalization form.
- 
normalize(string, form) → varchar
- Transforms - stringwith the specified normalization form.- formmust be be one of the following keywords:- Form - Description - NFD- Canonical Decomposition - NFC- Canonical Decomposition, followed by Canonical Composition - NFKD- Compatibility Decomposition - NFKC- Compatibility Decomposition, followed by Canonical Composition - Note - This SQL-standard function has special syntax and requires specifying - formas a keyword, not as a string.
- 
to_utf8(string) → varbinary
- Encodes - stringinto a UTF-8 varbinary representation.
- 
from_utf8(binary) → varchar
- Decodes a UTF-8 encoded string from - binary. Invalid UTF-8 sequences are replaced with the Unicode replacement character- U+FFFD.
- 
from_utf8(binary, replace) → varchar
- Decodes a UTF-8 encoded string from - binary. Invalid UTF-8 sequences are replaced with replace. The replacement string replace must either be a single character or empty (in which case invalid characters are removed).