User-defined functions¶
Table of contents
CREATE OR REPLACE
¶
CrateDB supports user-defined functions. See CREATE FUNCTION for a full syntax description.
CREATE FUNCTION
defines a new function:
cr> CREATE FUNCTION my_subtract_function(integer, integer)
... RETURNS integer
... LANGUAGE JAVASCRIPT
... AS 'function my_subtract_function(a, b) { return a - b; }';
CREATE OK, 1 row affected (... sec)
cr> SELECT doc.my_subtract_function(3, 1) AS col;
+-----+
| col |
+-----+
| 2 |
+-----+
SELECT 1 row in set (... sec)
CREATE OR REPLACE FUNCTION
will either create a new function or replace
an existing function definition:
cr> CREATE OR REPLACE FUNCTION log10(bigint)
... RETURNS double precision
... LANGUAGE JAVASCRIPT
... AS 'function log10(a) {return Math.log(a)/Math.log(10); }';
CREATE OK, 1 row affected (... sec)
cr> SELECT doc.log10(10) AS col;
+-----+
| col |
+-----+
| 1.0 |
+-----+
SELECT 1 row in set (... sec)
It is possible to use named function arguments in the function signature. For
example, the calculate_distance
function signature has two geo_point
arguments named start
and end
:
cr> CREATE OR REPLACE FUNCTION calculate_distance("start" geo_point, "end" geo_point)
... RETURNS real
... LANGUAGE JAVASCRIPT
... AS 'function calculate_distance(start, end) {
... return Math.sqrt(
... Math.pow(end[0] - start[0], 2),
... Math.pow(end[1] - start[1], 2));
... }';
CREATE OK, 1 row affected (... sec)
Note
Argument names are used for query documentation purposes only. You cannot reference arguments by name in the function body.
Optionally, a schema-qualified function name can be defined. If you omit the schema, the current session schema is used:
cr> CREATE OR REPLACE FUNCTION my_schema.log10(bigint)
... RETURNS double precision
... LANGUAGE JAVASCRIPT
... AS 'function log10(a) { return Math.log(a)/Math.log(10); }';
CREATE OK, 1 row affected (... sec)
Note
In order to improve the PostgreSQL server compatibility CrateDB allows the creation of user defined functions against the pg_catalog schema. However, the creation of user defined functions against the read-only System information and Information schema schemas is prohibited.
Supported types¶
Function arguments and return values can be any of the supported data types. The values passed into a function must strictly correspond to the specified argument data types.
Note
The value returned by the function will be casted to the return type provided in the definition if required. An exception will be thrown if the cast is not successful.
Overloading¶
Within a specific schema, you can overload functions by defining functions with the same name but a different set of arguments:
cr> CREATE FUNCTION my_schema.my_multiply(integer, integer)
... RETURNS integer
... LANGUAGE JAVASCRIPT
... AS 'function my_multiply(a, b) { return a * b; }';
CREATE OK, 1 row affected (... sec)
This would overload the my_multiply
function with different argument
types:
cr> CREATE FUNCTION my_schema.my_multiply(bigint, bigint)
... RETURNS bigint
... LANGUAGE JAVASCRIPT
... AS 'function my_multiply(a, b) { return a * b; }';
CREATE OK, 1 row affected (... sec)
This would overload the my_multiply
function with more arguments:
cr> CREATE FUNCTION my_schema.my_multiply(bigint, bigint, bigint)
... RETURNS bigint
... LANGUAGE JAVASCRIPT
... AS 'function my_multiply(a, b, c) { return a * b * c; }';
CREATE OK, 1 row affected (... sec)
Caution
It is considered bad practice to create functions that have the same name as the CrateDB built-in functions!
Note
If you call a function without a schema name, CrateDB will look it up in the built-in functions first and only then in the user-defined functions available in the search_path.
Therefore a built-in function with the same name as a user-defined function will hide the latter, even if it contains a different set of arguments! However, such functions can still be called if the schema name is explicitly provided.
Determinism¶
Caution
User-defined functions need to be deterministic, meaning that they must always return the same result value when called with the same argument values, because CrateDB might cache the returned values and reuse the value if the function is called multiple times with the same arguments.
DROP FUNCTION
¶
Functions can be dropped like this:
cr> DROP FUNCTION doc.log10(bigint);
DROP OK, 1 row affected (... sec)
Adding IF EXISTS
prevents from raising an error if the function doesn’t
exist:
cr> DROP FUNCTION IF EXISTS doc.log10(integer);
DROP OK, 1 row affected (... sec)
Optionally, argument names can be specified within the drop statement:
cr> DROP FUNCTION IF EXISTS doc.calculate_distance(start_point geo_point, end_point geo_point);
DROP OK, 1 row affected (... sec)
Optionally, you can provide a schema:
cr> DROP FUNCTION my_schema.log10(bigint);
DROP OK, 1 row affected (... sec)
Supported languages¶
Currently, CrateDB only supports JavaScript for user-defined functions.
JavaScript¶
The user defined function JavaScript is compatible with the ECMAScript 2019 specification.
CrateDB uses the GraalVM JavaScript engine as a JavaScript (ECMAScript) language execution runtime. The GraalVM JavaScript engine is a Java application that works on the stock Java Virtual Machines (VMs). The interoperability between Java code (host language) and JavaScript user-defined functions (guest language) is guaranteed by the GraalVM Polyglot API.
Please note: CrateDB does not use the GraalVM JIT compiler as optimizing compiler. However, the stock host Java VM JIT compilers can JIT-compile, optimize, and execute the GraalVM JavaScript codebase to a certain extent.
The execution context for guest JavaScript is created with restricted privileges to allow for the safe execution of less trusted guest language code. The guest language application context for each user-defined function is created with default access modifiers, so any access to managed resources is denied. The only exception is the host language interoperability configuration which explicitly allows access to Java lists and arrays. Please refer to GraalVM Security Guide for more detailed information.
Also, even though user-defined functions implemented with ECMA-compliant
JavaScript, objects that are normally accessible with a web browser
(e.g. window
, console
, and so on) are not available.
Note
GraalVM treats objects provided to JavaScript user-defined functions as close as possible to their respective counterparts and therefore by default only a subset of prototype functions are available in user-defined functions. For CrateDB 4.6 and earlier the object prototype was disabled.
Please refer to the GraalVM JavaScript Compatibility FAQ to learn more about the compatibility.
JavaScript supported types¶
JavaScript functions can handle all CrateDB data types. However, for some return types the function output must correspond to the certain format.
If a function requires geo_point
as a return type, then the JavaScript
function must return a double precision
array of size 2, WKT
string or
GeoJson
object.
Here is an example of a JavaScript function returning a double array
:
cr> CREATE FUNCTION rotate_point(point geo_point, angle real)
... RETURNS geo_point
... LANGUAGE JAVASCRIPT
... AS 'function rotate_point(point, angle) {
... var cos = Math.cos(angle);
... var sin = Math.sin(angle);
... var x = cos * point[0] - sin * point[1];
... var y = sin * point[0] + cos * point[1];
... return [x, y];
... }';
CREATE OK, 1 row affected (... sec)
Below is an example of a JavaScript function returning a WKT
string, which
will be cast to geo_point
:
cr> CREATE FUNCTION symmetric_point(point geo_point)
... RETURNS geo_point
... LANGUAGE JAVASCRIPT
... AS 'function symmetric_point (point, angle) {
... var x = - point[0],
... y = - point[1];
... return "POINT (\" + x + \", \" + y +\")";
... }';
CREATE OK, 1 row affected (... sec)
Similarly, if the function specifies the geo_shape
return data type, then
the JavaScript function should return a GeoJson
object or WKT
string:
cr> CREATE FUNCTION line("start" array(double precision), "end" array(double precision))
... RETURNS object
... LANGUAGE JAVASCRIPT
... AS 'function line(start, end) {
... return { "type": "LineString", "coordinates" : [start_point, end_point] };
... }';
CREATE OK, 1 row affected (... sec)
Note
If the return value of the JavaScript function is undefined
, it is
converted to NULL
.
Working with NUMBERS
¶
The JavaScript engine interprets numbers as java.lang.Double
,
java.lang.Long
, or java.lang.Integer
, depending on the computation
performed. In most cases, this is not an issue, since the return type of the
JavaScript function will be cast to the return type specified in the CREATE
FUNCTION
statement, although cast might result in a loss of precision.
However, when you try to cast DOUBLE PRECISION
to
TIMESTAMP WITH TIME ZONE
, it will be interpreted as UTC seconds and will
result in a wrong value:
cr> CREATE FUNCTION utc(bigint, bigint, bigint)
... RETURNS TIMESTAMP WITH TIME ZONE
... LANGUAGE JAVASCRIPT
... AS 'function utc(year, month, day) {
... return Date.UTC(year, month, day, 0, 0, 0);
... }';
CREATE OK, 1 row affected (... sec)
cr> SELECT date_format(utc(2016,04,6)) as epoque;
+------------------------------+
| epoque |
+------------------------------+
| 48314-07-22T00:00:00.000000Z |
+------------------------------+
SELECT 1 row in set (... sec)
To avoid this behavior, the numeric value should be divided by 1000 before it is returned:
cr> CREATE FUNCTION utc(bigint, bigint, bigint)
... RETURNS TIMESTAMP WITH TIME ZONE
... LANGUAGE JAVASCRIPT
... AS 'function utc(year, month, day) {
... return Date.UTC(year, month, day, 0, 0, 0)/1000;
... }';
CREATE OK, 1 row affected (... sec)
cr> SELECT date_format(utc(2016,04,6)) as epoque;
+-----------------------------+
| epoque |
+-----------------------------+
| 2016-05-06T00:00:00.000000Z |
+-----------------------------+
SELECT 1 row in set (... sec)