Node-specific settings¶
Basics¶
- cluster.name
 - Default: crate
 - Runtime: no
The name of the CrateDB cluster the node should join.
 
- node.name
 - Runtime: no
The name of the node. If no name is configured, a random one will be generated.
Note
Node names must be unique in a CrateDB cluster.
 
- node.store.allow_mmap
 - Default: true
 - Runtime: no
Indicates whether memory-mapping is allowed.
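As an illustration, the basic settings above could be written in crate.yml roughly as follows. This is a minimal sketch: the cluster and node names are hypothetical placeholders, not defaults.

```yaml
# crate.yml (sketch; the names below are made-up examples)
cluster.name: my_cluster      # every node that should join the same cluster uses this name
node.name: node-1             # must be unique within the cluster
node.store.allow_mmap: true   # the documented default, written out explicitly
```

Because none of these settings can be changed at runtime, a change only takes effect after the node is restarted.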
 
Node types¶
CrateDB supports different types of nodes.
The following settings can be used to differentiate nodes upon startup:
- node.master
 - Default: true
 - Runtime: no
Whether this node is eligible to be elected as the master node of the cluster.
 
- node.data
 - Default: true
 - Runtime: no
Whether this node stores data.
 
Using different combinations of these two settings, you can create four different types of node. Each type of node is differentiated by what types of load it will handle.
Tabulating the truth values for node.master and node.data produces a
truth table outlining the four different types of node:
         | Master                       | No master
Data     | Handles all loads.           | Handles client requests and query execution.
No data  | Handles cluster management.  | Handles client requests.
Nodes marked as node.master will only handle cluster management if they are
elected as the cluster master. All other loads are shared equally.
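For instance, the "Master / No data" cell of the table could be configured as sketched here. Only the two settings described above are used; dedicating a node this way is an illustrative choice, not a recommendation from this page.

```yaml
# Sketch: a master-eligible node that stores no data.
# It handles cluster management only if it is elected as the cluster master.
node.master: true
node.data: false
```

Swapping the two values (node.master: false, node.data: true) gives the "No master / Data" type, which handles client requests and query execution.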
General¶
- node.sql.read_only
 - Default: false
 - Runtime: no
If set to true, the node will only allow SQL statements that result in read operations.
- statement_timeout
 - Default: 0
 - Runtime: yes
The maximum duration of any statement before it gets cancelled.
This value is used as the default value for the statement_timeout session setting.
If set to 0, queries are allowed to run indefinitely and are not cancelled automatically.
Note
Updating this setting won’t affect existing sessions; it only takes effect for new sessions.
Networking¶
Hosts¶
- network.host
 - Default: _local_
 - Runtime: no
The IP address CrateDB will bind itself to. This setting sets both the network.bind_host and network.publish_host values.
 
- network.bind_host
 - Default: _local_
 - Runtime: no
This setting determines the address to which CrateDB binds itself.
 
- network.publish_host
 - Default: _local_
 - Runtime: no
This setting is used by a CrateDB node to publish its own address to the rest of the cluster.
 
Tip
Apart from IPv4 and IPv6 addresses, some special values can be used for all of the settings above:
_local_  | Any loopback addresses on the system, for example 127.0.0.1.
_site_  | Any site-local addresses on the system, for example 192.168.0.1.
_global_  | Any globally-scoped addresses on the system, for example 8.8.8.8.
_[networkInterface]_  | Addresses of a network interface, for example _en0_.
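A sketch of how the bind and publish addresses could differ on a node with several interfaces. Both addresses are hypothetical, and using 0.0.0.0 as a wildcard bind address is an assumption based on common socket behaviour, not something this page states.

```yaml
# Sketch: listen on every IPv4 interface, but advertise one specific address.
network.bind_host: 0.0.0.0           # assumed wildcard address; hypothetical choice
network.publish_host: 192.168.1.10   # hypothetical address that other nodes can reach
```

If both values should be the same, setting network.host alone is enough, since it sets both.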
Ports¶
- http.port
 - Runtime: no
This defines the TCP port range to which the CrateDB HTTP service will be bound. It defaults to 4200-4300. The first free port in this range is always used. If this is set to an integer value, it is treated as an explicit single port.
The HTTP protocol is used for the REST endpoint, which is used by all clients except the Java client.
 
- http.publish_port
 - Runtime: no
The port HTTP clients should use to communicate with the node. It is necessary to define this setting if the bound HTTP port (http.port) of the node is not directly reachable from outside, e.g. when running behind a firewall or inside a Docker container.
- transport.tcp.port
 - Runtime: no
This defines the TCP port range to which the CrateDB transport service will be bound. It defaults to 4300-4400. The first free port in this range is always used. If this is set to an integer value, it is treated as an explicit single port.
The transport protocol is used for internal node-to-node communication.
 
- transport.publish_port
 - Runtime: no
The port that the node publishes to the cluster for its own discovery. It is necessary to define this setting when the bound transport port (transport.tcp.port) of the node is not directly reachable from outside, e.g. when running behind a firewall or inside a Docker container.
- psql.port
 - Runtime: no
This defines the TCP port range to which the CrateDB Postgres service will be bound. It defaults to 5432-5532. The first free port in this range is always used. If this is set to an integer value, it is treated as an explicit single port.
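The port settings above might be combined like this for a node running inside a container whose ports are remapped by the host. This is a sketch under that assumption: the published port numbers are hypothetical.

```yaml
# Sketch: explicit single ports inside the container ...
http.port: 4200                 # instead of the default range 4200-4300
transport.tcp.port: 4300        # instead of the default range 4300-4400
psql.port: 5432                 # instead of the default range 5432-5532

# ... and the externally reachable ports (hypothetical host-side mappings).
http.publish_port: 14200        # what HTTP clients outside the container should use
transport.publish_port: 14300   # what the node publishes to the rest of the cluster
```

Pinning each service to a single port removes the "first free port in the range" behaviour, which keeps a fixed port mapping predictable.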
Advanced TCP settings¶
Any interface that uses TCP (Postgres wire, HTTP & Transport protocols) shares the following settings:
- network.tcp.no_delay
 - Default: true
 - Runtime: no
Controls Nagle’s algorithm for buffering TCP packets. When set to true, the algorithm is turned off and packets are not buffered. Buffering is disabled by default.
 
- network.tcp.keep_alive
 - Default: true
 - Runtime: no
Configures the SO_KEEPALIVE option for sockets, which determines whether they send TCP keepalive probes.
- network.tcp.reuse_address
 - Default: true on non-Windows machines and false otherwise
 - Runtime: no
Configures the SO_REUSEADDR option for sockets, which determines whether they should reuse the address.
- network.tcp.send_buffer_size
 - Default: -1
 - Runtime: no
The size of the TCP send buffer (SO_SNDBUF socket option). By default not explicitly set.
 
- network.tcp.receive_buffer_size
 - Default: -1
 - Runtime: no
The size of the TCP receive buffer (SO_RCVBUF socket option). By default not explicitly set.
 
Note
Each setting in this section has a counterpart for HTTP and transport.
To provide a protocol-specific setting, remove the network prefix and use
either http or transport instead. For example, no_delay can be
configured as http.tcp.no_delay and transport.tcp.no_delay. Note
that the PG interface takes its settings from transport.
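The prefix substitution described in the note can be sketched as follows. The chosen values are purely illustrative.

```yaml
# Sketch: one shared setting plus protocol-specific overrides.
network.tcp.keep_alive: true    # shared by all TCP interfaces
http.tcp.no_delay: false        # override for the HTTP interface only
transport.tcp.no_delay: true    # override for transport; the PG interface follows this value
```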
Transport settings¶
- transport.connect_timeout
 - Default: 30s
 - Runtime: no
The connect timeout for initiating a new connection.
 
- transport.compress
 - Default: false
 - Runtime: no
Set to true to enable compression (DEFLATE) between all nodes.
 
- transport.ping_schedule
 - Default: -1
 - Runtime: no
Schedules a regular application-level ping message to ensure that transport connections between nodes are kept alive. Defaults to -1 (disabled). It is preferable to configure TCP keep-alives correctly instead of using this feature, because TCP keep-alives apply to all kinds of long-lived connections and not just to transport connections.
 
Paths¶
Note
Relative paths are relative to CRATE_HOME. Absolute paths override this behavior.
- path.conf
 - Default: config
 - Runtime: no
Filesystem path to the directory containing the configuration files crate.yml and log4j2.properties.
- path.data
 - Default: data
 - Runtime: no
Filesystem path to the directory where this CrateDB node stores its data (table data and cluster metadata).
Multiple paths can be set using a comma-separated list. Each of these paths will hold full shards (instead of striping data across them). For example:
path.data: /path/to/data1,/path/to/data2
When CrateDB finds striped shards at the provided locations (created by CrateDB versions before 0.55.0), these shards are migrated automatically on startup.
 
- path.logs
 - Default: logs
 - Runtime: no
Filesystem path to a directory where log files should be stored.
Can be used as a variable inside log4j2.properties. For example:
appender:
  file:
    file: ${path.logs}/${cluster.name}.log
 
- path.repo
 - Runtime: no
A list of filesystem or UNC paths where repositories of type fs may be stored.
Without this setting, a CrateDB user could write snapshot files to any directory that is writable by the CrateDB process. To safeguard against this security issue, the permitted paths have to be whitelisted here.
See also the location setting of repository type fs.
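Taken together, the path settings might look like this on a node with two data disks. This is a sketch: every directory name is hypothetical, and absolute paths are used so that CRATE_HOME does not come into play.

```yaml
# Sketch: absolute paths override the CRATE_HOME-relative defaults.
path.data: /mnt/disk1/crate,/mnt/disk2/crate   # each path holds full shards
path.logs: /var/log/crate                      # also available as ${path.logs} in log4j2.properties
path.repo: /mnt/backups/crate                  # whitelisted location for repositories of type fs
```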
Plug-ins¶
- plugin.mandatory
 - Runtime: no
A list of plug-ins that are required for a node to start up.
If any plug-in listed here is missing, the CrateDB node will fail to start.
 
CPU¶
- processors
 - Runtime: no
The number of processors is used to size the thread pools that CrateDB uses. If not set explicitly, CrateDB infers the number from the processors available on the system.
In environments where the number of CPUs can be restricted (like Docker) or when multiple CrateDB instances are running on the same hardware, the inferred number might be too high. In such a case, it is recommended to set the value explicitly.
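For example, a node in a container that is limited to four CPUs on a larger host could pin the value explicitly. The number here is hypothetical.

```yaml
# Sketch: size the thread pools for 4 CPUs instead of the inferred host-wide count.
processors: 4
```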
 
Memory¶
- bootstrap.memory_lock
 - Default: false
 - Runtime: no
CrateDB performs poorly when the JVM starts swapping: you should ensure that it never swaps. If set to true, CrateDB will use the mlockall system call on startup to ensure that the memory pages of the CrateDB process are locked into RAM.
Garbage collection¶
CrateDB logs if JVM garbage collection on different memory pools takes too long. The following settings can be used to adjust these timeouts:
- monitor.jvm.gc.collector.young.warn
 - Default: 1000ms
 - Runtime: no
CrateDB will log a warning message if it takes more than the configured timespan to collect the Eden Space (heap).
 
- monitor.jvm.gc.collector.young.info
 - Default: 700ms
 - Runtime: no
CrateDB will log an info message if it takes more than the configured timespan to collect the Eden Space (heap).
 
- monitor.jvm.gc.collector.young.debug
 - Default: 400ms
 - Runtime: no
CrateDB will log a debug message if it takes more than the configured timespan to collect the Eden Space (heap).
 
- monitor.jvm.gc.collector.old.warn
 - Default: 10000ms
 - Runtime: no
CrateDB will log a warning message if it takes more than the configured timespan to collect the Old Gen / Tenured Gen (heap).
 
- monitor.jvm.gc.collector.old.info
 - Default: 5000ms
 - Runtime: no
CrateDB will log an info message if it takes more than the configured timespan to collect the Old Gen / Tenured Gen (heap).
 
- monitor.jvm.gc.collector.old.debug
 - Default: 2000ms
 - Runtime: no
CrateDB will log a debug message if it takes more than the configured timespan to collect the Old Gen / Tenured Gen (heap).
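A sketch of relaxed logging thresholds for young-generation collections, for a node where short pauses are considered acceptable. The values are hypothetical and only mirror the millisecond format of the defaults.

```yaml
# Sketch: log young-generation collections only when they take noticeably longer.
monitor.jvm.gc.collector.young.warn: 2000ms
monitor.jvm.gc.collector.young.info: 1000ms
monitor.jvm.gc.collector.young.debug: 500ms
```

Keeping warn above info above debug preserves the ordering of the defaults, so a slower collection is still logged at a more severe level.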
 
Authentication¶
Trust authentication¶
- auth.trust.http_default_user
 - Default: crate
 - Runtime: no
The default user that should be used for authentication when clients connect to CrateDB via the HTTP protocol and do not specify a user via the Authorization request header.
- auth.trust.http_support_x_real_ip
 - Default: false
 - Runtime: no
If enabled, the HTTP transport will trust the X-Real-IP header sent by the client to determine the client’s IP address. This is useful when CrateDB is running behind a reverse proxy or load-balancer. For improved security, any _local_ IP address (127.0.0.1 and ::1) defined in this header will be ignored.
Warning
Enabling this setting can be a security risk, as it allows clients to
impersonate other clients by sending a fake X-Real-IP header.
Host-based authentication¶
Authentication settings (auth.host_based.*) are node settings, which means
that their values apply only to the node where they are applied and different
nodes may have different authentication settings.
- auth.host_based.enabled
 - Default: false
 - Runtime: no
Setting to enable or disable Host Based Authentication (HBA). It is disabled by default.
 
HBA entries¶
The auth.host_based.config. setting is a group setting that can have zero,
one or multiple groups that are defined by their group key (${order}) and
their fields (user, address, method, protocol, ssl).
- ${order}:
 - An identifier that is used as a natural order key when looking up the host-based configuration entries. For example, an order key of a will be looked up before an order key of b. This key guarantees that the entry lookup order will remain independent from the insertion order of the entries.
The Host-Based Authentication (HBA) setting is a list of predicates that users can specify to restrict or allow access to CrateDB.
The meaning of the fields is as follows:
- auth.host_based.config.${order}.user
 - Runtime: no
Specifies an existing CrateDB username. Only the crate user (superuser) is available. If no user is specified in the entry, all existing users can have access.
- auth.host_based.config.${order}.address
 - Runtime: no
The client machine addresses that the client matches, and which are allowed to authenticate. This field may contain an IPv4 address, an IPv6 address, or an IPv4 CIDR mask, for example 127.0.0.1 or 127.0.0.1/32. It may also contain a hostname or the special _local_ notation, which matches both IPv4 and IPv6 connections from localhost. A hostname specification that starts with a dot (.) matches a suffix of the actual hostname, so .crate.io would match foo.crate.io but not just crate.io. If no address is specified in the entry, access to CrateDB is open for all hosts.
- auth.host_based.config.${order}.method
 - Runtime: no
The authentication method to use when a connection matches this entry. Valid values are trust, cert, and password. If no method is specified, the trust method is used by default. See Trust method, Client certificate authentication method, and Password authentication method for more information about these methods.
- auth.host_based.config.${order}.protocol
 - Runtime: no
Specifies the protocol for which the authentication entry should be used. If no protocol is specified, this entry is valid for all protocols that rely on host-based authentication (see Trust method).
- auth.host_based.config.${order}.ssl
 - Default: optional
 - Runtime: no
Specifies whether the client must use SSL/TLS to connect to the cluster. If set to on, the client must be connected through SSL/TLS, otherwise it is not authenticated. If set to off, the client must not be connected via SSL/TLS, otherwise it is not authenticated. Finally, optional, which is the value when the option is omitted entirely, means that the client can be authenticated regardless of whether SSL/TLS is used.
Example of config groups:
auth.host_based.config:
  entry_a:
    user: crate
    address: 127.16.0.0/16
  entry_b:
    method: trust
  entry_3:
    user: crate
    address: 172.16.0.0/16
    method: trust
    protocol: pg
    ssl: on
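As a further sketch, HBA could be enabled with two entries whose order keys (a and b, chosen for illustration) control the lookup order. It assumes the crate superuser should be trusted locally while every other connection must supply a password.

```yaml
# Sketch: entry "a" is looked up before entry "b".
auth.host_based.enabled: true
auth.host_based.config:
  a:
    user: crate          # the superuser
    address: _local_     # IPv4 and IPv6 connections from localhost
    method: trust
  b:
    method: password     # no user or address given: applies to all users and all hosts
```

Because entry b omits the user, address, and protocol fields, it acts as a catch-all for any connection that entry a does not match.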
Secured communications (SSL/TLS)¶
Secured communications via SSL allows you to encrypt traffic between CrateDB nodes and clients connecting to them. Connections are secured using Transport Layer Security (TLS).
- ssl.http.enabled
 - Default: false
 - Runtime: no
Set this to true to enable secure communication between the CrateDB node and the client through SSL via the HTTPS protocol.
 
- ssl.psql.enabled
 - Default: false
 - Runtime: no
Set this to true to enable secure communication between the CrateDB node and the client through SSL via the PostgreSQL wire protocol.
 
- ssl.transport.mode
 - Default: legacy
 - Runtime: no
For communication between nodes, choose:
off: SSL cannot be used.
legacy: SSL is not used. If HBA is enabled, transport connections won’t be verified. Any reachable host can establish a connection.
on: SSL must be used.
 
- ssl.keystore_filepath
 - Runtime: no
The full path to the node keystore file.
 
- ssl.keystore_password
 - Runtime: no
The password used to decrypt the keystore file defined with ssl.keystore_filepath.
- ssl.keystore_key_password
 - Runtime: no
The password entered at the end of the keytool -genkey command.
Note
Optionally, trusted CA certificates can be stored separately from the node’s keystore in a truststore.
- ssl.truststore_filepath
 - Runtime: no
The full path to the node truststore file. If not defined, then only a keystore will be used.
 
- ssl.truststore_password
 - Runtime: no
The password used to decrypt the truststore file defined with ssl.truststore_filepath.
- ssl.resource_poll_interval
 - Default: 5m
 - Runtime: no
The frequency at which SSL files such as keystore and truststore are polled for changes.
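A sketch of how the SSL settings above fit together for client-facing encryption. The keystore path and both passwords are hypothetical placeholders, and no truststore is configured, so only the keystore is used.

```yaml
# Sketch: encrypt HTTP and PostgreSQL wire protocol traffic with one keystore.
ssl.http.enabled: true
ssl.psql.enabled: true
ssl.keystore_filepath: /path/to/keystore.jks    # hypothetical location
ssl.keystore_password: keystore-secret          # placeholder, decrypts the keystore file
ssl.keystore_key_password: key-secret           # placeholder, as entered for keytool -genkey
ssl.resource_poll_interval: 5m                  # the documented default, written out explicitly
```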
 
Cross-origin resource sharing (CORS)¶
Many browsers support the same-origin policy which requires web applications to explicitly allow requests across origins. The cross-origin resource sharing settings in CrateDB allow for configuring these.
- http.cors.enabled
 - Default: false
 - Runtime: no
Enable or disable cross-origin resource sharing.
 
- http.cors.allow-origin
 - Default: <empty>
 - Runtime: no
Define allowed origins of a request. * allows any origin (which can be a substantial security risk), and by prepending a / the string will be treated as a regular expression. For example, /https?:\/\/crate.io/ will allow requests from http://crate.io and https://crate.io. This setting disallows any origin by default.
- http.cors.max-age
 - Default: 1728000 (20 days)
 - Runtime: no
Max cache age of a preflight request in seconds.
 
- http.cors.allow-methods
 - Default: OPTIONS, HEAD, GET, POST, PUT, DELETE
 - Runtime: no
Allowed HTTP methods.
 
- http.cors.allow-headers
 - Default: X-Requested-With, Content-Type, Content-Length
 - Runtime: no
Allowed HTTP headers.
 
- http.cors.allow-credentials
 - Default: false
 - Runtime: no
Add the Access-Control-Allow-Credentials header to responses.
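A sketch of a restrictive CORS configuration that reuses the regular expression from the allow-origin description. The reduced method list and the shorter cache age are hypothetical choices. The comma-separated format of allow-methods is assumed from the way the default is written.

```yaml
# Sketch: allow browser requests from crate.io over HTTP or HTTPS only.
http.cors.enabled: true
http.cors.allow-origin: '/https?:\/\/crate.io/'   # leading slash marks a regular expression
http.cors.allow-methods: GET, POST                # assumed format, narrower than the default
http.cors.max-age: 3600                           # cache preflight responses for one hour
```

The single quotes keep the backslashes literal in YAML, so the pattern reaches CrateDB unchanged.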
Blobs¶
- blobs.path
 - Runtime: no
Path to a filesystem directory where blob data allocated for this node is stored.
By default, blobs are stored under the same path as normal data. A relative path value is interpreted as relative to CRATE_HOME.
Repositories¶
Repositories are used to back up a CrateDB cluster.
- repositories.url.allowed_urls
 - Runtime: no
This setting only applies to repositories of type url.
With this setting, a list of URLs can be specified that are allowed to be used if a repository of type url is created.
Wildcards are supported in the host, path, query and fragment parts.
This setting is a security measure to prevent access to arbitrary resources.
In addition, the supported protocols can be restricted using the repositories.url.supported_protocols setting.
 
- repositories.url.supported_protocols
 - Default: http, https, ftp, file and jar
 - Runtime: no
A list of protocols that are supported by repositories of type url.
The jar protocol is used to access the contents of jar files. For more info, see the Java JarURLConnection documentation.
See also the path.repo Setting.
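A sketch of a URL whitelist for repositories of type url, with a wildcard in the path part. The host name is hypothetical, and writing the value as a YAML list is an assumption about the accepted format.

```yaml
# Sketch: only snapshot locations below one hypothetical host and path are allowed.
repositories.url.allowed_urls:
  - "https://backups.example.com/snapshots/*"
```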
Queries¶
- indices.query.bool.max_clause_count
 - Default: 8192
 - Runtime: no
This setting defines the maximum number of elements an array can have so that the != ANY(), LIKE ANY(), ILIKE ANY(), NOT LIKE ANY() and NOT ILIKE ANY() operators can be applied to it.
Note
Increasing this value to a large number (e.g. 10M) and applying those ANY operators on arrays of that length can lead to heavy memory consumption, which could cause nodes to crash with OutOfMemory exceptions.
Legacy¶
- legacy.table_function_column_naming
 - Default: false
 - Runtime: no
Since CrateDB 5.0.0, if a table function is not aliased and returns a single column of a base data type, the table function name is used as the column name. This setting can be enabled to use the naming convention from before 5.0.0.
The following table functions are affected by this setting:
When the setting is set and a single column is expected to be returned, the returned column will be named col1, groups, or col1 respectively.
Note
Beware that if the setting is not applied consistently on all nodes in the cluster, the behaviour will depend on the node handling the query.
 
JavaScript language¶
- lang.js.enabled
 - Default: true
 - Runtime: no
Setting to enable or disable JavaScript UDF support.
 
Custom attributes¶
The node.attr namespace is a bag of custom attributes. Custom attributes
can be used to control shard allocation.
You can create any attribute you want under this namespace, like
node.attr.key: value. These attributes use the node.attr namespace to
distinguish them from core node attributes like node.name.
Custom attributes are not validated by CrateDB, unlike core node attributes.
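For illustration, a node could be tagged with its physical location like this. Both attribute keys and their values are hypothetical, since any key under node.attr is accepted.

```yaml
# Sketch: custom attributes that shard allocation rules could refer to.
node.attr.rack: rack_1
node.attr.zone: zone_a
```

Because these attributes are not validated, a misspelled key or value is accepted silently, so it is worth keeping the names consistent across all nodes.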