Spark SQL: Examples of All Functions (Spark 2.x)
!
! expr - Logical not.

%
expr1 % expr2 - Returns the remainder after expr1/expr2.
Examples:
> SELECT 2 % 1.8;
 0.2
> SELECT MOD(2, 1.8);
 0.2

&
expr1 & expr2 - Returns the result of bitwise AND of expr1 and expr2.
Examples:
> SELECT 3 & 5;
 1

*
expr1 * expr2 - Returns expr1*expr2.
Examples:
> SELECT 2 * 3;
 6

+
expr1 + expr2 - Returns expr1+expr2.
Examples:
> SELECT 1 + 2;
 3

-
expr1 - expr2 - Returns expr1-expr2.
Examples:
> SELECT 2 - 1;
 1

/
expr1 / expr2 - Returns expr1/expr2. It always performs floating point division.
Examples:
> SELECT 3 / 2;
 1.5
> SELECT 2L / 2L;
 1.0

<
expr1 < expr2 - Returns true if expr1 is less than expr2.

<=
expr1 <= expr2 - Returns true if expr1 is less than or equal to expr2.

<=>
expr1 <=> expr2 - Returns the same result as the EQUAL(=) operator for non-null operands, but returns true if both are null, and false if one of them is null.

=
expr1 = expr2 - Returns true if expr1 equals expr2, or false otherwise.

==
expr1 == expr2 - Returns true if expr1 equals expr2, or false otherwise.

>
expr1 > expr2 - Returns true if expr1 is greater than expr2.

>=
expr1 >= expr2 - Returns true if expr1 is greater than or equal to expr2.

^
expr1 ^ expr2 - Returns the result of bitwise exclusive OR of expr1 and expr2.
Examples:
> SELECT 3 ^ 5;
 2

abs
abs(expr) - Returns the absolute value of the numeric value.
Examples:
> SELECT abs(-1);
 1

acos
acos(expr) - Returns the inverse cosine (a.k.a. arccosine) of expr if -1<=expr<=1 or NaN otherwise.
Examples:
> SELECT acos(1);
 0.0
> SELECT acos(2);
 NaN

add_months
add_months(start_date, num_months) - Returns the date that is num_months after start_date.
Examples:
> SELECT add_months('2016-08-31', 1);
 2016-09-30

and
expr1 and expr2 - Logical AND.

approx_count_distinct
approx_count_distinct(expr[, relativeSD]) - Returns the estimated cardinality by HyperLogLog++. relativeSD defines the maximum estimation error allowed.
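The difference between the = and <=> operators described above only shows up when an operand is null. The following is a minimal Python sketch of the two behaviours, using None to stand in for SQL NULL. The helper names sql_equal and sql_null_safe_equal are hypothetical and are not part of Spark; the rule that = yields NULL when either operand is NULL is standard SQL behaviour rather than something stated in the descriptions above.

```python
from typing import Any, Optional


def sql_equal(a: Any, b: Any) -> Optional[bool]:
    """Model of `a = b`: the result is NULL (None) if either operand is NULL."""
    if a is None or b is None:
        return None
    return a == b


def sql_null_safe_equal(a: Any, b: Any) -> bool:
    """Model of `a <=> b`: never returns NULL."""
    if a is None and b is None:
        return True   # both null -> true
    if a is None or b is None:
        return False  # exactly one null -> false
    return a == b     # non-null operands behave like `=`


if __name__ == "__main__":
    for pair in [(1, 1), (1, 2), (1, None), (None, None)]:
        print(pair, sql_equal(*pair), sql_null_safe_equal(*pair))
```

In practice this is why <=> is the usual choice for join keys or comparisons where null on both sides should count as a match.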
approx_percentile
approx_percentile(col, percentage [, accuracy]) - Returns the approximate percentile value of numeric column col at the given percentage. The value of percentage must be between 0.0 and 1.0. The accuracy parameter (default: 10000) is a positive numeric literal which controls approximation accuracy at the cost of memory. A higher value of accuracy yields better accuracy; 1.0/accuracy is the relative error of the approximation. When percentage is an array, each value of the percentage array must be between 0.0 and 1.0. In this case, returns the approximate percentile array of column col at the given percentage array.
Examples:
> SELECT approx_percentile(10.0, array(0.5, 0.4, 0.1), 100);
 [10.0,10.0,10.0]
> SELECT approx_percentile(10.0, 0.5, 100);
 10.0

array
array(expr, ...) - Returns an array with the given elements.
Examples:
> SELECT array(1, 2, 3);
 [1,2,3]

array_contains
array_contains(array, value) - Returns true if the array contains the value.
Examples:
> SELECT array_contains(array(1, 2, 3), 2);
 true

ascii
ascii(str) - Returns the numeric value of the first character of str.
Examples:
> SELECT ascii('222');
 50
> SELECT ascii(2);
 50

asin
asin(expr) - Returns the inverse sine (a.k.a. arcsine) of expr if -1<=expr<=1 or NaN otherwise.
Examples:
> SELECT asin(0);
 0.0
> SELECT asin(2);
 NaN

assert_true
assert_true(expr) - Throws an exception if expr is not true.
Examples:
> SELECT assert_true(0 < 1);
 NULL

atan
atan(expr) - Returns the inverse tangent (a.k.a. arctangent).
Examples:
> SELECT atan(0);
 0.0

atan2
atan2(expr1, expr2) - Returns the angle in radians between the positive x-axis of a plane and the point given by the coordinates (expr1, expr2).
Examples:
> SELECT atan2(0, 0);
 0.0

avg
avg(expr) - Returns the mean calculated from values of a group.

base64
base64(bin) - Converts the argument from a binary bin to a base 64 string.
Examples:
> SELECT base64('Spark SQL');
 U3BhcmsgU1FM

bigint
bigint(expr) - Casts the value expr to the target data type bigint.
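A few of the example results above (ascii, base64 and atan2) can be cross-checked without a Spark session. The sketch below uses only the Python standard library; the helper names ascii_code and to_base64 are hypothetical, and the code mirrors the documented results rather than Spark's implementation.

```python
import base64
import math


def ascii_code(s: str) -> int:
    """Numeric value of the first character, mirroring ascii(str)."""
    return ord(s[0])


def to_base64(data: bytes) -> str:
    """Base 64 text of a byte string, mirroring base64(bin)."""
    return base64.b64encode(data).decode("ascii")


if __name__ == "__main__":
    print(ascii_code("222"))        # first character is '2'
    print(to_base64(b"Spark SQL"))
    print(math.atan2(0, 0))
```

One difference to keep in mind: Python's math.acos and math.asin raise ValueError for arguments outside [-1, 1], whereas the acos and asin entries above return NaN.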
bin
bin(expr) - Returns the string representation of the long value expr represented in binary.
Examples:
> SELECT bin(13);
 1101
> SELECT bin(-13);
 1111111111111111111111111111111111111111111111111111111111110011
> SELECT bin(13.3);
 1101

binary
binary(expr) - Casts the value expr to the target data type binary.

bit_length
bit_length(expr) - Returns the bit length of expr or number of bits in binary data.
Examples:
> SELECT bit_length('Spark SQL');
 72

boolean
boolean(expr) - Casts the value expr to the target data type boolean.

bround
bround(expr, d) - Returns expr rounded to d decimal places using HALF_EVEN rounding mode.
Examples:
> SELECT bround(2.5, 0);
 2.0

cast
cast(expr AS type) - Casts the value expr to the target data type type.
Examples:
> SELECT cast('10' as int);
 10

cbrt
cbrt(expr) - Returns the cube root of expr.
Examples:
> SELECT cbrt(27.0);
 3.0

ceil
ceil(expr) - Returns the smallest integer not smaller than expr.
Examples:
> SELECT ceil(-0.1);
 0
> SELECT ceil(5);
 5

ceiling
ceiling(expr) - Returns the smallest integer not smaller than expr.
Examples:
> SELECT ceiling(-0.1);
 0
> SELECT ceiling(5);
 5

char
char(expr) - Returns the ASCII character having the binary equivalent to expr. If n is larger than 256 the result is equivalent to chr(n % 256).
Examples:
> SELECT char(65);
 A

char_length
char_length(expr) - Returns the character length of expr or number of bytes in binary data.
Examples:
> SELECT char_length('Spark SQL');
 9
> SELECT CHAR_LENGTH('Spark SQL');
 9
> SELECT CHARACTER_LENGTH('Spark SQL');
 9

character_length
character_length(expr) - Returns the character length of expr or number of bytes in binary data.
Examples:
> SELECT character_length('Spark SQL');
 9
> SELECT CHAR_LENGTH('Spark SQL');
 9
> SELECT CHARACTER_LENGTH('Spark SQL');
 9

chr
chr(expr) - Returns the ASCII character having the binary equivalent to expr. If n is larger than 256 the result is equivalent to chr(n % 256).
Examples:
> SELECT chr(65);
 A

coalesce
coalesce(expr1, expr2, ...)
- Returns the first non-null argument if exists. Otherwise, null.
Examples:
> SELECT coalesce(NULL, 1, NULL);
 1

collect_list
collect_list(expr) - Collects and returns a list of non-unique elements.

collect_set
collect_set(expr) - Collects and returns a set of unique elements.

concat
concat(str1, str2, ..., strN) - Returns the concatenation of str1, str2, ..., strN.
Examples:
> SELECT concat('Spark', 'SQL');
 SparkSQL

concat_ws
concat_ws(sep, [str | array(str)]+) - Returns the concatenation of the strings separated by sep.
Examples:
> SELECT concat_ws(' ', 'Spark', 'SQL');
 Spark SQL

conv
conv(num, from_base, to_base) - Converts num from from_base to to_base.
Examples:
> SELECT conv('100', 2, 10);
 4
> SELECT conv(-10, 16, -10);
 -16

corr
corr(expr1, expr2) - Returns the Pearson coefficient of correlation between a set of number pairs.

cos
cos(expr) - Returns the cosine of expr.
Examples:
> SELECT cos(0);
 1.0

cosh
cosh(expr) - Returns the hyperbolic cosine of expr.
Examples:
> SELECT cosh(0);
 1.0

cot
cot(expr) - Returns the cotangent of expr.
Examples:
> SELECT cot(1);
 0.6420926159343306

count
count(*) - Returns the total number of retrieved rows, including rows containing null.
count(expr) - Returns the number of rows for which the supplied expression is non-null.
count(DISTINCT expr[, expr...]) - Returns the number of rows for which the supplied expression(s) are unique and non-null.

count_min_sketch
count_min_sketch(col, eps, confidence, seed) - Returns a count-min sketch of a column with the given eps, confidence and seed. The result is an array of bytes, which can be deserialized to a CountMinSketch before usage. Count-min sketch is a probabilistic data structure used for cardinality estimation using sub-linear space.

covar_pop
covar_pop(expr1, expr2) - Returns the population covariance of a set of number pairs.

covar_samp
covar_samp(expr1, expr2) - Returns the sample covariance of a set of number pairs.
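The corr, covar_pop and covar_samp entries above differ only in how the summed products of deviations are normalised. The following Python sketch shows the textbook formulas under that assumption; the function names covariance and pearson_corr are hypothetical, and Spark's own aggregates use a different (streaming) implementation and also handle nulls and empty groups, which this sketch does not.

```python
import math
from typing import Sequence


def covariance(xs: Sequence[float], ys: Sequence[float], sample: bool) -> float:
    """Population covariance (divide by n) or sample covariance (divide by n - 1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means.
    s = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return s / (n - 1) if sample else s / n


def pearson_corr(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation: covariance divided by the product of standard deviations."""
    cov = covariance(xs, ys, sample=False)
    sd_x = math.sqrt(covariance(xs, xs, sample=False))
    sd_y = math.sqrt(covariance(ys, ys, sample=False))
    return cov / (sd_x * sd_y)


if __name__ == "__main__":
    xs, ys = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]
    print(covariance(xs, ys, sample=False))  # population: 4/3
    print(covariance(xs, ys, sample=True))   # sample: 4/2
    print(pearson_corr(xs, ys))              # perfectly linear data
```

For large groups the two covariances are nearly identical; the n - 1 divisor matters mainly for small samples.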
crc32
crc32(expr) - Returns a cyclic redundancy check value of the expr as a bigint.
Examples:
> SELECT crc32('Spark');
 1557323817

cube

cume_dist
cume_dist() - Computes the position of a value relative to all values in the partition.

current_database
current_database() - Returns the current database.
Examples:
> SELECT current_database();
 default

current_date
current_date() - Returns the current date at the start of query evaluation.

current_timestamp
current_timestamp() - Returns the current timestamp at the start of query evaluation.

date
date(expr) - Casts the value expr to the target data type date.

date_add
date_add(start_date, num_days) - Returns the date that is num_days after start_date.
Examples:
> SELECT date_add('2016-07-30', 1);
 2016-07-31

date_format
date_format(timestamp, fmt) - Converts timestamp to a value of string in the format specified by the date format fmt.
Examples:
> SELECT date_format('2016-04-08', 'y');
 2016

date_sub
date_sub(start_date, num_days) - Returns the date that is num_days before start_date.
Examples:
> SELECT date_sub('2016-07-30', 1);
 2016-07-29

datediff
datediff(endDate, startDate) - Returns the number of days from startDate to endDate.
Examples:
> SELECT datediff('2009-07-31', '2009-07-30');
 1
> SELECT datediff('2009-07-30', '2009-07-31');
 -1

day
day(date) - Returns the day of month of the date/timestamp.
Examples:
> SELECT day('2009-07-30');
 30

dayofmonth
dayofmonth(date) - Returns the day of month of the date/timestamp.
Examples:
> SELECT dayofmonth('2009-07-30');
 30

dayofweek
dayofweek(date) - Returns the day of the week for date/timestamp (1 = Sunday, 2 = Monday, ..., 7 = Saturday).
Examples:
> SELECT dayofweek('2009-07-30');
 5

dayofyear
dayofyear(date) - Returns the day of year of the date/timestamp.
Examples:
> SELECT dayofyear('2016-04-09');
 100

decimal
decimal(expr) - Casts the value expr to the target data type decimal.

decode
decode(bin, charset) - Decodes the first argument using the second argument character set.
Examples:
> SELECT decode(encode('abc', 'utf-8'), 'utf-8');
 abc

degrees
degrees(expr) - Converts radians to degrees.
Examples:
> SELECT degrees(3.141592653589793);
 180.0

dense_rank
dense_rank() - Computes the rank of a value in a group of values. The result is one plus the previously assigned rank value. Unlike the function rank, dense_rank will not produce gaps in the ranking sequence.

double
double(expr) - Casts the value expr to the target data type double.

e
e() - Returns Euler's number, e.
Examples:
> SELECT e();
 2.718281828459045

elt
elt(n, str1, str2, ...) - Returns the n-th string, e.g., returns str2 when n is 2.
Examples:
> SELECT elt(1, 'scala', 'java');
 scala

encode
encode(str, charset) - Encodes the first argument using the second argument character set.
Examples:
> SELECT encode('abc', 'utf-8');
 abc

exp
exp(expr) - Returns e to the power of expr.
Examples:
> SELECT exp(0);
 1.0

explode
explode(expr) - Separates the elements of array expr into multiple rows, or the elements of map expr into multiple rows and columns.
Examples:
> SELECT explode(array(10, 20));
 10
 20

explode_outer
explode_outer(expr) - Separates the elements of array expr into multiple rows, or the elements of map expr into multiple rows and columns.
Examples:
> SELECT explode_outer(array(10, 20));
 10
 20

expm1
expm1(expr) - Returns exp(expr) - 1.
Examples:
> SELECT expm1(0);
 0.0

factorial
factorial(expr) - Returns the factorial of expr. expr must be in the range [0..20]; otherwise, the result is null.
Examples:
> SELECT factorial(5);
 120

find_in_set
find_in_set(str, str_array) - Returns the index (1-based) of the given string (str) in the comma-delimited list (str_array). Returns 0 if the string was not found or if the given string (str) contains a comma.
Examples:
> SELECT find_in_set('ab','abc,b,ab,c,def');
 3

first
first(expr[, isIgnoreNull]) - Returns the first value of expr for a group of rows. If isIgnoreNull is true, returns only non-null values.
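The find_in_set rules described above (1-based index, 0 when the string is missing or itself contains a comma) can be sketched in a few lines of Python. This is an illustration of the documented behaviour only, not Spark's implementation, and it ignores null inputs.

```python
def find_in_set(s: str, str_array: str) -> int:
    """Return the 1-based position of s in a comma-delimited list, or 0."""
    # A search string containing a comma can never match a single list item.
    if "," in s:
        return 0
    for index, item in enumerate(str_array.split(","), start=1):
        if item == s:
            return index
    return 0  # not found


if __name__ == "__main__":
    print(find_in_set("ab", "abc,b,ab,c,def"))
    print(find_in_set("zz", "abc,b,ab,c,def"))
    print(find_in_set("b,ab", "abc,b,ab,c,def"))
```

Note that matching is on whole list items: 'ab' does not match the item 'abc', which is why the documented example returns 3 rather than 1.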
first_value
first_value(expr[, isIgnoreNull]) - Returns the first value of expr for a group of rows. If isIgnoreNull is true, returns only non-null values.

float
float(expr) - Casts the value expr to the target data type float.

floor
floor(expr) - Returns the largest integer not greater than expr.
Examples:
> SELECT floor(-0.1);
 -1
> SELECT floor(5);
 5

format_number
format_number(expr1, expr2) - Formats the number expr1 like '#,###,###.##', rounded to expr2 decimal places. If expr2 is 0, the result has no decimal point or fractional part. This is supposed to function like MySQL's FORMAT.
Examples:
> SELECT format_number(12332.123456, 4);
 12,332.1235

format_string
format_string(strfmt, obj, ...) - Returns a formatted string from printf-style format strings.
Examples:
> SELECT format_string("Hello World %d %s", 100, "days");
 Hello World 100 days

from_json
from_json(jsonStr, schema[, options]) - Returns a struct value with the given jsonStr and schema.
Examples:
> SELECT from_json('{"a":1, "b":0.8}', 'a INT, b DOUBLE');
 {"a":1, "b":0.8}
> SELECT from_json('{"time":"26/08/2015"}', 'time Timestamp', map('timestampFormat', 'dd/MM/yyyy'));
 {"time":"2015-08-26 00:00:00.0"}
Since: 2.2.0

from_unixtime
from_unixtime(unix_time, format) - Returns unix_time in the specified format.
Examples:
> SELECT from_unixtime(0, 'yyyy-MM-dd HH:mm:ss');
 1970-01-01 00:00:00

from_utc_timestamp
from_utc_timestamp(timestamp, timezone) - Given a timestamp, which corresponds to a certain time of day in UTC, returns another timestamp that corresponds to the same time of day in the given timezone.
Examples:
> SELECT from_utc_timestamp('2016-08-31', 'Asia/Seoul');
 2016-08-31 09:00:00

get_json_object
get_json_object(json_txt, path) - Extracts a json object from path.
Examples:
> SELECT get_json_object('{"a":"b"}', '$.a');
 b

greatest
greatest(expr, ...) - Returns the greatest value of all parameters, skipping null values.
Examples:
> SELECT greatest(10, 9, 2, 4, 3);
 10

grouping

grouping_id

hash
hash(expr1, expr2, ...)
- Returns a hash value of the arguments.
Examples:
> SELECT hash('Spark', array(123), 2);
 -1321691492

hex
hex(expr) - Converts expr to hexadecimal.
Examples:
> SELECT hex(17);
 11
> SELECT hex('Spark SQL');
 537061726B2053514C

hour
hour(timestamp) - Returns the hour component of the string/timestamp.
Examples:
> SELECT hour('2009-07-30 12:58:59');
 12

hypot
hypot(expr1, expr2) - Returns sqrt(expr1**2 + expr2**2).
Examples:
> SELECT hypot(3, 4);
 5.0

if
if(expr1, expr2, expr3) - If expr1 evaluates to true, then returns expr2; otherwise returns expr3.
Examples:
> SELECT if(1 < 2, 'a', 'b');
 a

ifnull
ifnull(expr1, expr2) - Returns expr2 if expr1 is null, or expr1 otherwise.
Examples:
> SELECT ifnull(NULL, array('2'));
 ["2"]

in
expr1 in(expr2, expr3, ...) - Returns true if expr1 equals any of expr2, expr3, ....

initcap
initcap(str) - Returns str with the first letter of each word in uppercase. All other letters are in lowercase. Words are delimited by white space.
Examples:
> SELECT initcap('sPark sql');
 Spark Sql

inline
inline(expr) - Explodes an array of structs into a table.
Examples:
> SELECT inline(array(struct(1, 'a'), struct(2, 'b')));
 1 a
 2 b

inline_outer
inline_outer(expr) - Explodes an array of structs into a table.
Examples:
> SELECT inline_outer(array(struct(1, 'a'), struct(2, 'b')));
 1 a
 2 b

input_file_block_length
input_file_block_length() - Returns the length of the block being read, or -1 if not available.

input_file_block_start
input_file_block_start() - Returns the start offset of the block being read, or -1 if not available.

input_file_name
input_file_name() - Returns the name of the file being read, or empty string if not available.

instr
instr(str, substr) - Returns the (1-based) index of the first occurrence of substr in str.
Examples:
> SELECT instr('SparkSQL', 'SQL');
 6

int
int(expr) - Casts the value expr to the target data type int.

isnan
isnan(expr) - Returns true if expr is NaN, or false otherwise.
Examples:
> SELECT isnan(cast('NaN' as double));
 true

isnotnull
isnotnull(expr) - Returns true if expr is not null, or false otherwise.
Examples:
> SELECT isnotnull(1);
 true

isnull
isnull(expr) - Returns true if expr is null, or false otherwise.
Examples:
> SELECT isnull(1);
 false

java_method
java_method(class, method[, arg1[, arg2 ..]]) - Calls a method with reflection.
Examples:
> SELECT java_method('java.util.UUID', 'randomUUID');
 c33fb387-8500-4bfa-81d2-6e0e3e930df2
> SELECT java_method('java.util.UUID', 'fromString', 'a5cf6c42-0c85-418f-af6c-3e4e5b1328f2');
 a5cf6c42-0c85-418f-af6c-3e4e5b1328f2

json_tuple
json_tuple(jsonStr, p1, p2, ..., pn) - Returns a tuple like the function get_json_object, but it takes multiple names. All the input parameters and output column types are string.
Examples:
> SELECT json_tuple('{"a":1, "b":2}', 'a', 'b');
 1 2

kurtosis
kurtosis(expr) - Returns the kurtosis value calculated from values of a group.

lag
lag(input[, offset[, default]]) - Returns the value of input at the offset-th row before the current row in the window. The default value of offset is 1 and the default value of default is null. If the value of input at the offset-th row is null, null is returned. If there is no such offset row (e.g., when the offset is 1, the first row of the window does not have any previous row), default is returned.

last
last(expr[, isIgnoreNull]) - Returns the last value of expr for a group of rows. If isIgnoreNull is true, returns only non-null values.

last_day
last_day(date) - Returns the last day of the month which the date belongs to.
Examples:
> SELECT last_day('2009-01-12');
 2009-01-31

last_value
last_value(expr[, isIgnoreNull]) - Returns the last value of expr for a group of rows. If isIgnoreNull is true, returns only non-null values.

lcase
lcase(str) - Returns str with all characters changed to lowercase.
Examples:
> SELECT lcase('SparkSql');
 sparksql

lead
lead(input[, offset[, default]]) - Returns the value of input at the offset-th row after the current row in the window. The default value of offset is 1 and the default value of default is null. If the value of input at the offset-th row is null, null is returned. If there is no such offset row (e.g., when the offset is 1, the last row of the window does not have any subsequent row), default is returned.

least
least(expr, ...) - Returns the least value of all parameters, skipping null values.
Examples:
> SELECT least(10, 9, 2, 4, 3);
 2

left
left(str, len) - Returns the leftmost len (len can be string type) characters from the string str. If len is less than or equal to 0, the result is an empty string.
Examples:
> SELECT left('Spark SQL', 3);
 Spa

length
length(expr) - Returns the character length of expr or number of bytes in binary data.
Examples:
> SELECT length('Spark SQL');
 9
> SELECT CHAR_LENGTH('Spark SQL');
 9
> SELECT CHARACTER_LENGTH('Spark SQL');
 9

levenshtein
levenshtein(str1, str2) - Returns the Levenshtein distance between the two given strings.
Examples:
> SELECT levenshtein('kitten', 'sitting');
 3

like
str like pattern - Returns true if str matches pattern, null if any arguments are null, false otherwise.
Arguments:
str - a string expression
pattern - a string expression. The pattern is a string which is matched literally, with the exception of the following special symbols:
  _ matches any one character in the input (similar to . in posix regular expressions)
  % matches zero or more characters in the input (similar to .* in posix regular expressions)
The escape character is '\'. If an escape character precedes a special symbol or another escape character, the following character is matched literally. It is invalid to escape any other character.
Since Spark 2.0, string literals are unescaped in our SQL parser. For example, in order to match "\abc", the pattern should be "\\abc".
When SQL config 'spark.sql.parser.escapedStringLiterals' is enabled, it falls back to Spark 1.6 behavior regarding string literal parsing. For example, if the config is enabled, the pattern to match "\abc" should be "\abc".
Examples:
> SELECT '%SystemDrive%\Users\John' like '\%SystemDrive\%\\Users%'
 true
Note: Use RLIKE to match with standard regular expressions.

ln
ln(expr) - Returns the natural logarithm (base e) of expr.
Examples:
> SELECT ln(1);
 0.0

locate
locate(substr, str[, pos]) - Returns the position of the first occurrence of substr in str after position pos. The given pos and return value are 1-based.
Examples:
> SELECT locate('bar', 'foobarbar');
 4
> SELECT locate('bar', 'foobarbar', 5);
 7
> SELECT POSITION('bar' IN 'foobarbar');
 4

log
log(base, expr) - Returns the logarithm of expr with base.
Examples:
> SELECT log(10, 100);
 2.0

log10
log10(expr) - Returns the logarithm of expr with base 10.
Examples:
> SELECT log10(10);
 1.0

log1p
log1p(expr) - Returns log(1 + expr).
Examples:
> SELECT log1p(0);
 0.0

log2
log2(expr) - Returns the logarithm of expr with base 2.
Examples:
> SELECT log2(2);
 1.0

lower
lower(str) - Returns str with all characters changed to lowercase.
Examples:
> SELECT lower('SparkSql');
 sparksql

lpad
lpad(str, len, pad) - Returns str, left-padded with pad to a length of len. If str is longer than len, the return value is shortened to len characters.
Examples:
> SELECT lpad('hi', 5, '??');
 ???hi
> SELECT lpad('hi', 1, '??');
 h

ltrim
ltrim(str) - Removes the leading space characters from str.
Examples:
> SELECT ltrim(' SparkSQL');
 SparkSQL

map
map(key0, value0, key1, value1, ...) - Creates a map with the given key/value pairs.
Examples:
> SELECT map(1.0, '2', 3.0, '4');
 {1.0:"2",3.0:"4"}

map_keys
map_keys(map) - Returns an unordered array containing the keys of the map.
Examples:
> SELECT map_keys(map(1, 'a', 2, 'b'));
 [1,2]

map_values
map_values(map) - Returns an unordered array containing the values of the map.
Examples:
> SELECT map_values(map(1, 'a', 2, 'b'));
 ["a","b"]

max
max(expr) - Returns the maximum value of expr.

md5
md5(expr) - Returns an MD5 128-bit checksum as a hex string of expr.
Examples:
> SELECT md5('Spark');
 8cde774d6f7333752ed72cacddb05126

mean
mean(expr) - Returns the mean calculated from values of a group.

min
min(expr) - Returns the minimum value of expr.

minute
minute(timestamp) - Returns the minute component of the string/timestamp.
Examples:
> SELECT minute('2009-07-30 12:58:59');
 58

mod
expr1 mod expr2 - Returns the remainder after expr1/expr2.
Examples:
> SELECT 2 mod 1.8;
 0.2
> SELECT MOD(2, 1.8);
 0.2

monotonically_increasing_id
monotonically_increasing_id() - Returns monotonically increasing 64-bit integers. The generated ID is guaranteed to be monotonically increasing and unique, but not consecutive. The current implementation puts the partition ID in the upper 31 bits, and the lower 33 bits represent the record number within each partition. The assumption is that the data frame has less than 1 billion partitions, and each partition has less than 8 billion records.

month
month(date) - Returns the month component of the date/timestamp.
Examples:
> SELECT month('2016-07-30');
 7

months_between
months_between(timestamp1, timestamp2) - Returns number of months between timestamp1 and timestamp2.
Examples:
> SELECT months_between('1997-02-28 10:30:00', '1996-10-30');
 3.94959677

named_struct
named_struct(name1, val1, name2, val2, ...) - Creates a struct with the given field names and values.
Examples:
> SELECT named_struct("a", 1, "b", 2, "c", 3);
 {"a":1,"b":2,"c":3}

nanvl
nanvl(expr1, expr2) - Returns expr1 if it's not NaN, or expr2 otherwise.
Examples:
> SELECT nanvl(cast('NaN' as double), 123);
 123.0

negative
negative(expr) - Returns the negated value of expr.
Examples:
> SELECT negative(1);
 -1

next_day
next_day(start_date, day_of_week) - Returns the first date which is later than start_date and named as indicated.
Examples:
> SELECT next_day('2015-01-14', 'TU');
 2015-01-20

not
not expr - Logical not.

now
now() - Returns the current timestamp at the start of query evaluation.

ntile
ntile(n) - Divides the rows for each window partition into n buckets ranging from 1 to at most n.

nullif
nullif(expr1, expr2) - Returns null if expr1 equals expr2, or expr1 otherwise.
Examples:
> SELECT nullif(2, 2);
 NULL

nvl
nvl(expr1, expr2) - Returns expr2 if expr1 is null, or expr1 otherwise.
Examples:
> SELECT nvl(NULL, array('2'));
 ["2"]

nvl2
nvl2(expr1, expr2, expr3) - Returns expr2 if expr1 is not null, or expr3 otherwise.
Examples:
> SELECT nvl2(NULL, 2, 1);
 1

octet_length
octet_length(expr) - Returns the byte length of expr or number of bytes in binary data.
Examples:
> SELECT octet_length('Spark SQL');
 9

or
expr1 or expr2 - Logical OR.

parse_url
parse_url(url, partToExtract[, key]) - Extracts a part from a URL.
Examples:
> SELECT parse_url('http://spark.apache.org/path?query=1', 'HOST')
 spark.apache.org
> SELECT parse_url('http://spark.apache.org/path?query=1', 'QUERY')
 query=1
> SELECT parse_url('http://spark.apache.org/path?query=1', 'QUERY', 'query')
 1

percent_rank
percent_rank() - Computes the percentage ranking of a value in a group of values.

percentile
percentile(col, percentage [, frequency]) - Returns the exact percentile value of numeric column col at the given percentage. The value of percentage must be between 0.0 and 1.0. The value of frequency should be a positive integer.
percentile(col, array(percentage1 [, percentage2]...) [, frequency]) - Returns the exact percentile value array of numeric column col at the given percentage(s). Each value of the percentage array must be between 0.0 and 1.0. The value of frequency should be a positive integer.

percentile_approx
percentile_approx(col, percentage [, accuracy]) - Returns the approximate percentile value of numeric column col at the given percentage. The value of percentage must be between 0.0 and 1.0.
The accuracy parameter (default: 10000) is a positive numeric literal which controls approximation accuracy at the cost of memory. A higher value of accuracy yields better accuracy; 1.0/accuracy is the relative error of the approximation. When percentage is an array, each value of the percentage array must be between 0.0 and 1.0. In this case, returns the approximate percentile array of column col at the given percentage array.
Examples:
> SELECT percentile_approx(10.0, array(0.5, 0.4, 0.1), 100);
 [10.0,10.0,10.0]
> SELECT percentile_approx(10.0, 0.5, 100);
 10.0

pi
pi() - Returns pi.
Examples:
> SELECT pi();
 3.141592653589793

pmod
pmod(expr1, expr2) - Returns the positive value of expr1 mod expr2.
Examples:
> SELECT pmod(10, 3);
 1
> SELECT pmod(-10, 3);
 2

posexplode
posexplode(expr) - Separates the elements of array expr into multiple rows with positions, or the elements of map expr into multiple rows and columns with positions.
Examples:
> SELECT posexplode(array(10,20));
 0 10
 1 20

posexplode_outer
posexplode_outer(expr) - Separates the elements of array expr into multiple rows with positions, or the elements of map expr into multiple rows and columns with positions.
Examples:
> SELECT posexplode_outer(array(10,20));
 0 10
 1 20

position
position(substr, str[, pos]) - Returns the position of the first occurrence of substr in str after position pos. The given pos and return value are 1-based.
Examples:
> SELECT position('bar', 'foobarbar');
 4
> SELECT position('bar', 'foobarbar', 5);
 7
> SELECT POSITION('bar' IN 'foobarbar');
 4

positive
positive(expr) - Returns the value of expr.

pow
pow(expr1, expr2) - Raises expr1 to the power of expr2.
Examples:
> SELECT pow(2, 3);
 8.0

power
power(expr1, expr2) - Raises expr1 to the power of expr2.
Examples:
> SELECT power(2, 3);
 8.0

printf
printf(strfmt, obj, ...) - Returns a formatted string from printf-style format strings.
Examples:
> SELECT printf("Hello World %d %s", 100, "days");
 Hello World 100 days

quarter
quarter(date) - Returns the quarter of the year for date, in the range 1 to 4.
Examples:
> SELECT quarter('2016-08-31');
 3

radians
radians(expr) - Converts degrees to radians.
Examples:
> SELECT radians(180);
 3.141592653589793

rand
rand([seed]) - Returns a random value with independent and identically distributed (i.i.d.) uniformly distributed values in [0, 1).
Examples:
> SELECT rand();
 0.9629742951434543
> SELECT rand(0);
 0.8446490682263027
> SELECT rand(null);
 0.8446490682263027

randn
randn([seed]) - Returns a random value with independent and identically distributed (i.i.d.) values drawn from the standard normal distribution.
Examples:
> SELECT randn();
 -0.3254147983080288
> SELECT randn(0);
 1.1164209726833079
> SELECT randn(null);
 1.1164209726833079

rank
rank() - Computes the rank of a value in a group of values. The result is one plus the number of rows preceding or equal to the current row in the ordering of the partition. The values will produce gaps in the sequence.

reflect
reflect(class, method[, arg1[, arg2 ..]]) - Calls a method with reflection.
Examples:
> SELECT reflect('java.util.UUID', 'randomUUID');
 c33fb387-8500-4bfa-81d2-6e0e3e930df2
> SELECT reflect('java.util.UUID', 'fromString', 'a5cf6c42-0c85-418f-af6c-3e4e5b1328f2');
 a5cf6c42-0c85-418f-af6c-3e4e5b1328f2

regexp_extract
regexp_extract(str, regexp[, idx]) - Extracts a group that matches regexp.
Examples:
> SELECT regexp_extract('100-200', '(\d+)-(\d+)', 1);
 100

regexp_replace
regexp_replace(str, regexp, rep) - Replaces all substrings of str that match regexp with rep.
Examples:
> SELECT regexp_replace('100-200', '(\d+)', 'num');
 num-num

repeat
repeat(str, n) - Returns the string which repeats the given string value n times.
Examples:
> SELECT repeat('123', 2);
 123123

replace
replace(str, search[, replace]) - Replaces all occurrences of search with replace.
Arguments:
str - a string expression
search - a string expression. If search is not found in str, str is returned unchanged.
replace - a string expression. If replace is not specified or is an empty string, nothing replaces the string that is removed from str.
Examples:
> SELECT replace('ABCabc', 'abc', 'DEF');
 ABCDEF

reverse
reverse(str) - Returns the reversed given string.
Examples:
> SELECT reverse('Spark SQL');
 LQS krapS

right
right(str, len) - Returns the rightmost len (len can be string type) characters from the string str. If len is less than or equal to 0, the result is an empty string.
Examples:
> SELECT right('Spark SQL', 3);
 SQL

rint
rint(expr) - Returns the double value that is closest in value to the argument and is equal to a mathematical integer.
Examples:
> SELECT rint(12.3456);
 12.0

rlike
str rlike regexp - Returns true if str matches regexp, or false otherwise.
Arguments:
str - a string expression
regexp - a string expression. The pattern string should be a Java regular expression.
Since Spark 2.0, string literals (including regex patterns) are unescaped in our SQL parser. For example, to match "\abc", a regular expression for regexp can be "^\\abc$".
There is a SQL config 'spark.sql.parser.escapedStringLiterals' that can be used to fall back to the Spark 1.6 behavior regarding string literal parsing. For example, if the config is enabled, the regexp that can match "\abc" is "^\abc$".
Examples:
When spark.sql.parser.escapedStringLiterals is disabled (default).
> SELECT '%SystemDrive%\Users\John' rlike '%SystemDrive%\\Users.*'
 true
When spark.sql.parser.escapedStringLiterals is enabled.
> SELECT '%SystemDrive%\Users\John' rlike '%SystemDrive%\Users.*'
 true
Note: Use LIKE to match with simple string pattern.

rollup

round
round(expr, d) - Returns expr rounded to d decimal places using HALF_UP rounding mode.
Examples:
> SELECT round(2.5, 0);
 3.0

row_number
row_number() - Assigns a unique, sequential number to each row, starting with one, according to the ordering of rows within the window partition.

rpad
rpad(str, len, pad) - Returns str, right-padded with pad to a length of len. If str is longer than len, the return value is shortened to len characters.
Examples:
> SELECT rpad('hi', 5, '??');
 hi???
> SELECT rpad('hi', 1, '??');
 h

rtrim
rtrim(str) - Removes the trailing space characters from str.
Examples:
> SELECT rtrim(' SparkSQL ');
  SparkSQL

second
second(timestamp) - Returns the second component of the string/timestamp.
Examples:
> SELECT second('2009-07-30 12:58:59');
 59

sentences
sentences(str[, lang, country]) - Splits str into an array of array of words.
Examples:
> SELECT sentences('Hi there! Good morning.');
 [["Hi","there"],["Good","morning"]]

sha
sha(expr) - Returns a sha1 hash value as a hex string of the expr.
Examples:
> SELECT sha('Spark');
 85f5955f4b27a9a4c2aab6ffe5d7189fc298b92c

sha1
sha1(expr) - Returns a sha1 hash value as a hex string of the expr.
Examples:
> SELECT sha1('Spark');
 85f5955f4b27a9a4c2aab6ffe5d7189fc298b92c

sha2
sha2(expr, bitLength) - Returns a checksum of SHA-2 family as a hex string of expr. SHA-224, SHA-256, SHA-384, and SHA-512 are supported. Bit length of 0 is equivalent to 256.
Examples:
> SELECT sha2('Spark', 256);
 529bc3b07127ecb7e53a4dcf1991d9152c24537d919178022b2c42657f79a26b

shiftleft
shiftleft(base, expr) - Bitwise left shift.
Examples:
> SELECT shiftleft(2, 1);
 4

shiftright
shiftright(base, expr) - Bitwise (signed) right shift.
Examples:
> SELECT shiftright(4, 1);
 2

shiftrightunsigned
shiftrightunsigned(base, expr) - Bitwise unsigned right shift.
Examples:
> SELECT shiftrightunsigned(4, 1);
 2

sign
sign(expr) - Returns -1.0, 0.0 or 1.0 as expr is negative, 0 or positive.
Examples:
> SELECT sign(40);
 1.0

signum
signum(expr) - Returns -1.0, 0.0 or 1.0 as expr is negative, 0 or positive.
Examples:
> SELECT signum(40);
 1.0

sin
sin(expr) - Returns the sine of expr.
Examples:
> SELECT sin(0);
 0.0

sinh
sinh(expr) - Returns the hyperbolic sine of expr.
Examples:
> SELECT sinh(0);
 0.0

size
size(expr) - Returns the size of an array or a map. Returns -1 if null.
Examples:
> SELECT size(array('b', 'd', 'c', 'a'));
 4

skewness
skewness(expr) - Returns the skewness value calculated from values of a group.

smallint
smallint(expr) - Casts the value expr to the target data type smallint.

sort_array
sort_array(array[, ascendingOrder]) - Sorts the input array in ascending or descending order according to the natural ordering of the array elements.
Examples:
> SELECT sort_array(array('b', 'd', 'c', 'a'), true);
 ["a","b","c","d"]

soundex
soundex(str) - Returns the Soundex code of the string.
Examples:
> SELECT soundex('Miller');
 M460

space
space(n) - Returns a string consisting of n spaces.
Examples:
> SELECT concat(space(2), '1');
   1

spark_partition_id
spark_partition_id() - Returns the current partition id.

split
split(str, regex) - Splits str around occurrences that match regex.
Examples:
> SELECT split('oneAtwoBthreeC', '[ABC]');
 ["one","two","three",""]

sqrt
sqrt(expr) - Returns the square root of expr.
Examples:
> SELECT sqrt(4);
 2.0

stack
stack(n, expr1, ..., exprk) - Separates expr1, ..., exprk into n rows.
Examples:
> SELECT stack(2, 1, 2, 3);
 1 2
 3 NULL

std
std(expr) - Returns the sample standard deviation calculated from values of a group.

stddev
stddev(expr) - Returns the sample standard deviation calculated from values of a group.

stddev_pop
stddev_pop(expr) - Returns the population standard deviation calculated from values of a group.

stddev_samp
stddev_samp(expr) - Returns the sample standard deviation calculated from values of a group.

str_to_map
str_to_map(text[, pairDelim[, keyValueDelim]]) - Creates a map after splitting the text into key/value pairs using delimiters. Default delimiters are ',' for pairDelim and ':' for keyValueDelim.
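The two-level splitting that str_to_map describes can be modeled in a few lines of Python. This is only a sketch under stated assumptions: the delimiters are treated as literal strings (Spark may interpret them differently), a pair without a key/value delimiter maps its key to None, and the function below is a hypothetical stand-in rather than Spark's implementation.

```python
from typing import Dict, Optional


def str_to_map(text: str, pair_delim: str = ",",
               key_value_delim: str = ":") -> Dict[str, Optional[str]]:
    """Split text into pairs, then split each pair once into key and value."""
    result: Dict[str, Optional[str]] = {}
    for pair in text.split(pair_delim):
        # partition splits on the first delimiter only; sep is '' if absent.
        key, sep, value = pair.partition(key_value_delim)
        result[key] = value if sep else None
    return result


print(str_to_map("a:1,b:2,c:3"))  # {'a': '1', 'b': '2', 'c': '3'}
print(str_to_map("a"))            # {'a': None}
```

Splitting each pair only once means a value may itself contain the key/value delimiter, for example "url:http://host" keeps "http://host" as the value in this model.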
Examples:
> SELECT str_to_map('a:1,b:2,c:3', ',', ':');
 map("a":"1","b":"2","c":"3")
> SELECT str_to_map('a');
 map("a":null)

string
string(expr) - Casts the value expr to the target data type string.

struct
struct(col1, col2, col3, ...) - Creates a struct with the given field values.

substr
substr(str, pos[, len]) - Returns the substring of str that starts at pos and is of length len, or the slice of byte array that starts at pos and is of length len.
Examples:
> SELECT substr('Spark SQL', 5);
 k SQL
> SELECT substr('Spark SQL', -3);
 SQL
> SELECT substr('Spark SQL', 5, 1);
 k

substring
substring(str, pos[, len]) - Returns the substring of str that starts at pos and is of length len, or the slice of byte array that starts at pos and is of length len.
Examples:
> SELECT substring('Spark SQL', 5);
 k SQL
> SELECT substring('Spark SQL', -3);
 SQL
> SELECT substring('Spark SQL', 5, 1);
 k

substring_index
substring_index(str, delim, count) - Returns the substring from str before count occurrences of the delimiter delim. If count is positive, everything to the left of the final delimiter (counting from the left) is returned. If count is negative, everything to the right of the final delimiter (counting from the right) is returned. The function substring_index performs a case-sensitive match when searching for delim.
Examples:
> SELECT substring_index('www.apache.org', '.', 2);
 www.apache

sum
sum(expr) - Returns the sum calculated from values of a group.

tan
tan(expr) - Returns the tangent of expr.
Examples:
> SELECT tan(0);
 0.0

tanh
tanh(expr) - Returns the hyperbolic tangent of expr.
Examples:
> SELECT tanh(0);
 0.0

timestamp
timestamp(expr) - Casts the value expr to the target data type timestamp.

tinyint
tinyint(expr) - Casts the value expr to the target data type tinyint.

to_date
to_date(date_str[, fmt]) - Parses the date_str expression with the fmt expression to a date. Returns null with invalid input. By default, it follows casting rules to a date if the fmt is omitted.
Examples:
> SELECT to_date('2009-07-30 04:17:52');
 2009-07-30
> SELECT to_date('2016-12-31', 'yyyy-MM-dd');
 2016-12-31

to_json
to_json(expr[, options]) - Returns a JSON string with a given struct value.
Examples:
> SELECT to_json(named_struct('a', 1, 'b', 2));
 {"a":1,"b":2}
> SELECT to_json(named_struct('time', to_timestamp('2015-08-26', 'yyyy-MM-dd')), map('timestampFormat', 'dd/MM/yyyy'));
 {"time":"26/08/2015"}
> SELECT to_json(array(named_struct('a', 1, 'b', 2)));
 [{"a":1,"b":2}]
Since: 2.2.0

to_timestamp
to_timestamp(timestamp[, fmt]) - Parses the timestamp expression with the fmt expression to a timestamp. Returns null with invalid input. By default, it follows casting rules to a timestamp if the fmt is omitted.
Examples:
> SELECT to_timestamp('2016-12-31 00:12:00');
 2016-12-31 00:12:00
> SELECT to_timestamp('2016-12-31', 'yyyy-MM-dd');
 2016-12-31 00:00:00

to_unix_timestamp
to_unix_timestamp(expr[, pattern]) - Returns the UNIX timestamp of the given time.
Examples:
> SELECT to_unix_timestamp('2016-04-08', 'yyyy-MM-dd');
 1460041200

to_utc_timestamp
to_utc_timestamp(timestamp, timezone) - Given a timestamp, which corresponds to a certain time of day in the given timezone, returns another timestamp that corresponds to the same time of day in UTC.
Examples:
> SELECT to_utc_timestamp('2016-08-31', 'Asia/Seoul');
 2016-08-30 15:00:00

translate
translate(input, from, to) - Translates the input string by replacing the characters present in the from string with the corresponding characters in the to string.
Examples:
> SELECT translate('AaBbCc', 'abc', '123');
 A1B2C3

trim
trim(str) - Removes the leading and trailing space characters from str.
Examples:
> SELECT trim(' SparkSQL ');
 SparkSQL

trunc
trunc(date, fmt) - Returns date with the time portion of the day truncated to the unit specified by the format model fmt.
Examples:
> SELECT trunc('2009-02-12', 'MM');
 2009-02-01
> SELECT trunc('2015-10-27', 'YEAR');
 2015-01-01

ucase
ucase(str) - Returns str with all characters changed to uppercase.
Examples:
> SELECT ucase('SparkSql');
 SPARKSQL

unbase64
unbase64(str) - Converts the argument from a base 64 string str to a binary.
Examples:
> SELECT unbase64('U3BhcmsgU1FM');
 Spark SQL

unhex
unhex(expr) - Converts hexadecimal expr to binary.
Examples:
> SELECT decode(unhex('537061726B2053514C'), 'UTF-8');
 Spark SQL

unix_timestamp
unix_timestamp([expr[, pattern]]) - Returns the UNIX timestamp of the current or specified time.
Examples:
> SELECT unix_timestamp();
 1476884637
> SELECT unix_timestamp('2016-04-08', 'yyyy-MM-dd');
 1460041200

upper
upper(str) - Returns str with all characters changed to uppercase.
Examples:
> SELECT upper('SparkSql');
 SPARKSQL

uuid
uuid() - Returns a universally unique identifier (UUID) string. The value is returned as a canonical UUID 36-character string.
Examples:
> SELECT uuid();
 46707d92-02f4-4817-8116-a4c3b23e6266

var_pop
var_pop(expr) - Returns the population variance calculated from values of a group.

var_samp
var_samp(expr) - Returns the sample variance calculated from values of a group.

variance
variance(expr) - Returns the sample variance calculated from values of a group.

weekofyear
weekofyear(date) - Returns the week of the year of the given date. A week is considered to start on a Monday, and week 1 is the first week with more than 3 days.
Examples:
> SELECT weekofyear('2008-02-20');
 8

when
CASE WHEN expr1 THEN expr2 [WHEN expr3 THEN expr4]* [ELSE expr5] END - When expr1 = true, returns expr2; when expr3 = true, returns expr4; else returns expr5.

window

xpath
xpath(xml, xpath) - Returns a string array of values within the nodes of xml that match the XPath expression.
Examples:
> SELECT xpath('<a><b>b1</b><b>b2</b><b>b3</b><c>c1</c><c>c2</c></a>','a/b/text()');
 ['b1','b2','b3']

xpath_boolean
xpath_boolean(xml, xpath) - Returns true if the XPath expression evaluates to true, or if a matching node is found.
Examples:
> SELECT xpath_boolean('<a><b>1</b></a>','a/b');
 true

xpath_double
xpath_double(xml, xpath) - Returns a double value, the value zero if no match is found, or NaN if a match is found but the value is non-numeric.
Examples:
> SELECT xpath_double('<a><b>1</b><b>2</b></a>', 'sum(a/b)');
 3.0

xpath_float
xpath_float(xml, xpath) - Returns a float value, the value zero if no match is found, or NaN if a match is found but the value is non-numeric.
Examples:
> SELECT xpath_float('<a><b>1</b><b>2</b></a>', 'sum(a/b)');
 3.0

xpath_int
xpath_int(xml, xpath) - Returns an integer value, or the value zero if no match is found, or a match is found but the value is non-numeric.
Examples:
> SELECT xpath_int('<a><b>1</b><b>2</b></a>', 'sum(a/b)');
 3

xpath_long
xpath_long(xml, xpath) - Returns a long integer value, or the value zero if no match is found, or a match is found but the value is non-numeric.
Examples:
> SELECT xpath_long('<a><b>1</b><b>2</b></a>', 'sum(a/b)');
 3

xpath_number
xpath_number(xml, xpath) - Returns a double value, the value zero if no match is found, or NaN if a match is found but the value is non-numeric.
Examples:
> SELECT xpath_number('<a><b>1</b><b>2</b></a>', 'sum(a/b)');
 3.0

xpath_short
xpath_short(xml, xpath) - Returns a short integer value, or the value zero if no match is found, or a match is found but the value is non-numeric.
Examples:
> SELECT xpath_short('<a><b>1</b><b>2</b></a>', 'sum(a/b)');
 3

xpath_string
xpath_string(xml, xpath) - Returns the text contents of the first xml node that matches the XPath expression.
Examples:
> SELECT xpath_string('<a><b>b</b><c>cc</c></a>','a/c');
 cc

year
year(date) - Returns the year component of the date/timestamp.
Examples:
> SELECT year('2016-07-30');
 2016

|
expr1 | expr2 - Returns the result of bitwise OR of expr1 and expr2.
Examples:
> SELECT 3 | 5;
 7

~
~ expr - Returns the result of bitwise NOT of expr.
Examples:
> SELECT ~ 0;
 -1
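The bitwise operators and the shift functions in this reference only show positive operands, where signed and unsigned shifts give the same answer. The Python sketch below models the difference for negative inputs. It assumes 32-bit int operands and Java-style shifts (the shift count is reduced modulo 32); both are assumptions about the engine rather than statements from this page, and the helper names simply mirror the SQL function names for readability.

```python
MASK32 = 0xFFFFFFFF  # keeps the low 32 bits of a Python int


def to_int32(x: int) -> int:
    """Wrap an arbitrary Python int into the signed 32-bit range."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def shiftleft(base: int, n: int) -> int:
    """Left shift; bits pushed past bit 31 are discarded."""
    return to_int32(base << (n & 31))


def shiftright(base: int, n: int) -> int:
    """Signed right shift; the sign bit is copied into vacated positions."""
    return to_int32(base) >> (n & 31)


def shiftrightunsigned(base: int, n: int) -> int:
    """Unsigned right shift; vacated positions are filled with zeros."""
    return to_int32((base & MASK32) >> (n & 31))


print(3 | 5)                       # 7
print(~0)                          # -1
print(shiftleft(2, 1))             # 4
print(shiftright(-8, 1))           # -4
print(shiftrightunsigned(-8, 1))   # 2147483644
```

For non-negative inputs such as shiftright(4, 1) and shiftrightunsigned(4, 1), both helpers return 2, matching the examples listed for those functions.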
Reference: https://www.2cto.com/net/201803/727248.html