Introduction to the Kyuubi Configurations System

Kyuubi provides several ways to configure the system and corresponding engines.

Environments

You can configure the environment variables in $KYUUBI_HOME/conf/kyuubi-env.sh, e.g. JAVA_HOME; this Java runtime will then be used by both the Kyuubi server instance and the applications it launches. You can also change the variables in a subprocess's env configuration file, e.g. $SPARK_HOME/conf/spark-env.sh, to use more specific environment variables for SQL engine applications.

    #!/usr/bin/env bash
    #
    # Licensed to the Apache Software Foundation (ASF) under one or more
    # contributor license agreements. See the NOTICE file distributed with
    # this work for additional information regarding copyright ownership.
    # The ASF licenses this file to You under the Apache License, Version 2.0
    # (the "License"); you may not use this file except in compliance with
    # the License. You may obtain a copy of the License at
    #
    #    http://www.apache.org/licenses/LICENSE-2.0
    #
    # Unless required by applicable law or agreed to in writing, software
    # distributed under the License is distributed on an "AS IS" BASIS,
    # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    # See the License for the specific language governing permissions and
    # limitations under the License.
    #
    #
    # - JAVA_HOME               Java runtime to use. By default use "java" from PATH.
    #
    #
    # - KYUUBI_CONF_DIR         Directory containing the Kyuubi configurations to use.
    #                           (Default: $KYUUBI_HOME/conf)
    # - KYUUBI_LOG_DIR          Directory for Kyuubi server-side logs.
    #                           (Default: $KYUUBI_HOME/logs)
    # - KYUUBI_PID_DIR          Directory stores the Kyuubi instance pid file.
    #                           (Default: $KYUUBI_HOME/pid)
    # - KYUUBI_MAX_LOG_FILES    Maximum number of Kyuubi server logs can rotate to.
    #                           (Default: 5)
    # - KYUUBI_JAVA_OPTS        JVM options for the Kyuubi server itself in the form "-Dx=y".
    #                           (Default: none).
    # - KYUUBI_NICENESS         The scheduling priority for Kyuubi server.
    #                           (Default: 0)
    # - KYUUBI_WORK_DIR_ROOT    Root directory for launching sql engine applications.
    #                           (Default: $KYUUBI_HOME/work)
    # - HADOOP_CONF_DIR         Directory containing the Hadoop / YARN configuration to use.
    #
    # - SPARK_HOME              Spark distribution which you would like to use in Kyuubi.
    # - SPARK_CONF_DIR          Optional directory where the Spark configuration lives.
    #                           (Default: $SPARK_HOME/conf)
    #
    ## Examples ##
    # export JAVA_HOME=/usr/jdk64/jdk1.8.0_152
    # export HADOOP_CONF_DIR=/usr/ndp/current/mapreduce_client/conf
    # export KYUUBI_JAVA_OPTS="-Xmx10g -XX:+UnlockDiagnosticVMOptions -XX:ParGCCardsPerStrideChunk=4096 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSConcurrentMTEnabled -XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly -XX:+CMSClassUnloadingEnabled -XX:+CMSParallelRemarkEnabled -XX:+UseCondCardMark -XX:MaxDirectMemorySize=1024m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=./logs -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintTenuringDistribution -Xloggc:./logs/kyuubi-server-gc-%t.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=5M -XX:NewRatio=3 -XX:MetaspaceSize=512m"

For environment variables that only need to be transferred to the engine side, you can set them with a Kyuubi configuration item in the form kyuubi.engineEnv.VAR_NAME. For example, with kyuubi.engineEnv.SPARK_DRIVER_MEMORY=4g, the environment variable SPARK_DRIVER_MEMORY with value 4g is transferred to the engine side. With kyuubi.engineEnv.SPARK_CONF_DIR=/apache/confs/spark/conf, the value of SPARK_CONF_DIR on the engine side is set to /apache/confs/spark/conf.
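
As a minimal sketch (reusing the illustrative values above, not recommendations), these two items could be placed in $KYUUBI_HOME/conf/kyuubi-defaults.conf:

    # transfer SPARK_DRIVER_MEMORY=4g into the engine's environment
    kyuubi.engineEnv.SPARK_DRIVER_MEMORY=4g
    # point the engine side at a specific Spark configuration directory
    kyuubi.engineEnv.SPARK_CONF_DIR=/apache/confs/spark/conf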

Kyuubi Configurations

You can configure the Kyuubi properties in $KYUUBI_HOME/conf/kyuubi-defaults.conf. For example:

    #
    # Licensed to the Apache Software Foundation (ASF) under one or more
    # contributor license agreements. See the NOTICE file distributed with
    # this work for additional information regarding copyright ownership.
    # The ASF licenses this file to You under the Apache License, Version 2.0
    # (the "License"); you may not use this file except in compliance with
    # the License. You may obtain a copy of the License at
    #
    #    http://www.apache.org/licenses/LICENSE-2.0
    #
    # Unless required by applicable law or agreed to in writing, software
    # distributed under the License is distributed on an "AS IS" BASIS,
    # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    # See the License for the specific language governing permissions and
    # limitations under the License.
    #
    ## Kyuubi Configurations
    #
    # kyuubi.authentication           NONE
    # kyuubi.frontend.bind.host       localhost
    # kyuubi.frontend.bind.port       10009
    #
    # Details in https://kyuubi.readthedocs.io/en/latest/deployment/settings.html

Authentication

kyuubi.authentication (default: NONE, type: string, since: 1.0.0)
  Client authentication types.
  • NOSASL: raw transport.
  • NONE: no authentication check.
  • KERBEROS: Kerberos/GSSAPI authentication.
  • CUSTOM: User-defined authentication.
  • LDAP: Lightweight Directory Access Protocol authentication.

kyuubi.authentication.custom.class (default: <undefined>, type: string, since: 1.3.0)
  User-defined authentication implementation of org.apache.kyuubi.service.authentication.PasswdAuthenticationProvider.

kyuubi.authentication.ldap.base.dn (default: <undefined>, type: string, since: 1.0.0)
  LDAP base DN.

kyuubi.authentication.ldap.domain (default: <undefined>, type: string, since: 1.0.0)
  LDAP domain.

kyuubi.authentication.ldap.guidKey (default: uid, type: string, since: 1.2.0)
  LDAP attribute name whose values are unique in this LDAP server, for example uid or cn.

kyuubi.authentication.ldap.url (default: <undefined>, type: string, since: 1.0.0)
  SPACE character separated LDAP connection URL(s).

kyuubi.authentication.sasl.qop (default: auth, type: string, since: 1.0.0)
  SASL QOP (quality of protection) level for Kyuubi communication with clients.
  • auth - authentication only (default)
  • auth-int - authentication plus integrity protection
  • auth-conf - authentication plus integrity and confidentiality protection. This is applicable only if Kyuubi is configured to use Kerberos authentication.

Backend

kyuubi.backend.engine.exec.pool.keepalive.time (default: PT1M, type: duration, since: 1.0.0)
  Time(ms) that an idle async thread of the operation execution thread pool will wait for a new task to arrive before terminating in SQL engine applications.

kyuubi.backend.engine.exec.pool.shutdown.timeout (default: PT10S, type: duration, since: 1.0.0)
  Timeout(ms) for the operation execution thread pool to terminate in SQL engine applications.

kyuubi.backend.engine.exec.pool.size (default: 100, type: int, since: 1.0.0)
  Number of threads in the operation execution thread pool of SQL engine applications.

kyuubi.backend.engine.exec.pool.wait.queue.size (default: 100, type: int, since: 1.0.0)
  Size of the wait queue for the operation execution thread pool in SQL engine applications.

kyuubi.backend.server.exec.pool.keepalive.time (default: PT1M, type: duration, since: 1.0.0)
  Time(ms) that an idle async thread of the operation execution thread pool will wait for a new task to arrive before terminating in Kyuubi server.

kyuubi.backend.server.exec.pool.shutdown.timeout (default: PT10S, type: duration, since: 1.0.0)
  Timeout(ms) for the operation execution thread pool to terminate in Kyuubi server.

kyuubi.backend.server.exec.pool.size (default: 100, type: int, since: 1.0.0)
  Number of threads in the operation execution thread pool of Kyuubi server.

kyuubi.backend.server.exec.pool.wait.queue.size (default: 100, type: int, since: 1.0.0)
  Size of the wait queue for the operation execution thread pool of Kyuubi server.

Delegation

kyuubi.delegation.key.update.interval (default: PT24H, type: duration, since: 1.0.0)
  Unused yet.

kyuubi.delegation.token.gc.interval (default: PT1H, type: duration, since: 1.0.0)
  Unused yet.

kyuubi.delegation.token.max.lifetime (default: PT168H, type: duration, since: 1.0.0)
  Unused yet.

kyuubi.delegation.token.renew.interval (default: PT168H, type: duration, since: 1.0.0)
  Unused yet.

Engine

kyuubi.engine.connection.url.use.hostname (default: true, type: boolean, since: 1.3.0)
  When true, the engine registers with its hostname to ZooKeeper. When Spark runs on K8s in cluster mode, set this to false to ensure that the server can connect to the engine.

kyuubi.engine.deregister.exception.classes (type: seq, since: 1.2.0)
  A comma separated list of exception classes. If there is any exception thrown whose class matches the specified classes, the engine would deregister itself.

kyuubi.engine.deregister.exception.messages (type: seq, since: 1.2.0)
  A comma separated list of exception messages. If there is any exception thrown whose message or stacktrace matches the specified message list, the engine would deregister itself.

kyuubi.engine.deregister.exception.ttl (default: PT30M, type: duration, since: 1.2.0)
  Time to live (TTL) for the exception patterns specified in kyuubi.engine.deregister.exception.classes and kyuubi.engine.deregister.exception.messages to deregister engines. Once the total error count hits kyuubi.engine.deregister.job.max.failures within the TTL, an engine will deregister itself and wait for self-termination. Otherwise, we suppose that the engine has recovered from temporary failures.

kyuubi.engine.deregister.job.max.failures (default: 4, type: int, since: 1.2.0)
  Number of job failures before deregistering the engine.

kyuubi.engine.event.json.log.path (default: file:/tmp/kyuubi/events, type: string, since: 1.3.0)
  The location where all engine events go for the builtin JSON logger.
  • Local Path: start with 'file:'
  • HDFS Path: start with 'hdfs:'

kyuubi.engine.event.loggers (type: seq, since: 1.3.0)
  A comma separated list of engine history loggers, where engine/session/operation etc. events go.
  • SPARK: the events will be written to the Spark history events
  • JSON: the events will be written to the location of kyuubi.engine.event.json.log.path
  • JDBC: to be done
  • CUSTOM: to be done

kyuubi.engine.initialize.sql (default: SHOW DATABASES, type: seq, since: 1.2.0)
  Semicolon-separated list of SQL statements to be run on the newly created engine before queries. This configuration can not be used in the JDBC URL due to the limitation of the Beeline/JDBC driver.

kyuubi.engine.session.initialize.sql (default: SHOW DATABASES, type: seq, since: 1.3.0)
  Semicolon-separated list of SQL statements to be run in the newly created engine session before queries. This configuration can not be used in the JDBC URL due to the limitation of the Beeline/JDBC driver.

kyuubi.engine.share.level (default: USER, type: string, since: 1.2.0)
  Engines will be shared at different levels; available configs are:
  • CONNECTION: engine will not be shared but only used by the current client connection
  • USER: engine will be shared by all sessions created by a unique username, see also kyuubi.engine.share.level.sub.domain
  • SERVER: the App will be shared by Kyuubi servers

kyuubi.engine.share.level.sub.domain (default: <undefined>, type: string, since: 1.2.0)
  Allow end-users to create a sub-domain for the share level of an engine. A sub-domain is a case-insensitive string value in the form ^[a-zA-Z_]{1,10}$. For example, for the USER share level, an end-user can share a certain engine within a sub-domain, not for all of its clients. End-users are free to create multiple engines in the USER share level.

kyuubi.engine.single.spark.session (default: false, type: boolean, since: 1.3.0)
  When set to true, this engine is running in a single session mode. All the JDBC/ODBC connections share the temporary views, function registries, SQL configuration and the current database.

kyuubi.engine.ui.stop.enabled (default: true, type: boolean, since: 1.3.0)
  When true, allows the Kyuubi engine to be killed from the Spark Web UI.

Frontend

kyuubi.frontend.backoff.slot.length (default: PT0.1S, type: duration, since: 1.0.0)
  Time to back off during login to the frontend service.

kyuubi.frontend.bind.host (default: <undefined>, type: string, since: 1.0.0)
  Hostname or IP of the machine on which to run the frontend service.

kyuubi.frontend.bind.port (default: 10009, type: int, since: 1.0.0)
  Port of the machine on which to run the frontend service.

kyuubi.frontend.login.timeout (default: PT20S, type: duration, since: 1.0.0)
  Timeout for Thrift clients during login to the frontend service.

kyuubi.frontend.max.message.size (default: 104857600, type: int, since: 1.0.0)
  Maximum message size in bytes a Kyuubi server will accept.

kyuubi.frontend.max.worker.threads (default: 999, type: int, since: 1.0.0)
  Maximum number of threads in the frontend worker thread pool for the frontend service.

kyuubi.frontend.min.worker.threads (default: 9, type: int, since: 1.0.0)
  Minimum number of threads in the frontend worker thread pool for the frontend service.

kyuubi.frontend.worker.keepalive.time (default: PT1M, type: duration, since: 1.0.0)
  Keep-alive time (in milliseconds) for an idle worker thread.

Ha

kyuubi.ha.zookeeper.acl.enabled (default: false, type: boolean, since: 1.0.0)
  Set to true if the zookeeper ensemble is kerberized.

kyuubi.ha.zookeeper.connection.base.retry.wait (default: 1000, type: int, since: 1.0.0)
  Initial amount of time to wait between retries to the zookeeper ensemble.

kyuubi.ha.zookeeper.connection.max.retries (default: 3, type: int, since: 1.0.0)
  Max retry times for connecting to the zookeeper ensemble.

kyuubi.ha.zookeeper.connection.max.retry.wait (default: 30000, type: int, since: 1.0.0)
  Max amount of time to wait between retries that the BOUNDED_EXPONENTIAL_BACKOFF policy can reach, or max elapsed time for the UNTIL_ELAPSED policy, when connecting to the zookeeper ensemble.

kyuubi.ha.zookeeper.connection.retry.policy (default: EXPONENTIAL_BACKOFF, type: string, since: 1.0.0)
  The retry policy for connecting to the zookeeper ensemble; all candidates are:
  • ONE_TIME
  • N_TIME
  • EXPONENTIAL_BACKOFF
  • BOUNDED_EXPONENTIAL_BACKOFF
  • UNTIL_ELAPSED

kyuubi.ha.zookeeper.connection.timeout (default: 15000, type: int, since: 1.0.0)
  The timeout(ms) of creating the connection to the zookeeper ensemble.

kyuubi.ha.zookeeper.namespace (default: kyuubi, type: string, since: 1.0.0)
  The root directory for the service to deploy its instance uri. Additionally, it will create a -[username] suffixed root directory for each application.

kyuubi.ha.zookeeper.node.creation.timeout (default: PT2M, type: duration, since: 1.2.0)
  Timeout for creating the zookeeper node.

kyuubi.ha.zookeeper.quorum (type: string, since: 1.0.0)
  The connection string for the zookeeper ensemble.

kyuubi.ha.zookeeper.session.timeout (default: 60000, type: int, since: 1.0.0)
  The timeout(ms) for a connected session to be idled.

Kinit

kyuubi.kinit.interval (default: PT1H, type: duration, since: 1.0.0)
  How often the Kyuubi server will run kinit -kt [keytab] [principal] to renew the local Kerberos credentials cache.

kyuubi.kinit.keytab (default: <undefined>, type: string, since: 1.0.0)
  Location of the Kyuubi server's keytab.

kyuubi.kinit.max.attempts (default: 10, type: int, since: 1.0.0)
  How many times the kinit process will retry.

kyuubi.kinit.principal (default: <undefined>, type: string, since: 1.0.0)
  Name of the Kerberos principal.

Metrics

kyuubi.metrics.console.interval (default: PT5S, type: duration, since: 1.2.0)
  How often to report metrics to the console.

kyuubi.metrics.enabled (default: true, type: boolean, since: 1.2.0)
  Set to true to enable the kyuubi metrics system.

kyuubi.metrics.json.interval (default: PT5S, type: duration, since: 1.2.0)
  How often to report metrics to the JSON file.

kyuubi.metrics.json.location (default: metrics, type: string, since: 1.2.0)
  Where the JSON metrics file is located.

kyuubi.metrics.prometheus.path (default: /metrics, type: string, since: 1.2.0)
  URI context path of the prometheus metrics HTTP server.

kyuubi.metrics.prometheus.port (default: 10019, type: int, since: 1.2.0)
  Prometheus metrics HTTP server port.

kyuubi.metrics.reporters (default: JSON, type: seq, since: 1.2.0)
  A comma separated list of all metrics reporters.
  • CONSOLE - ConsoleReporter which outputs measurements to CONSOLE periodically.
  • JMX - JmxReporter which listens for new metrics and exposes them as MBeans.
  • JSON - JsonReporter which outputs measurements to a json file periodically.
  • PROMETHEUS - PrometheusReporter which exposes metrics in prometheus format.
  • SLF4J - Slf4jReporter which outputs measurements to the system log periodically.

kyuubi.metrics.slf4j.interval (default: PT5S, type: duration, since: 1.2.0)
  How often to report metrics to the SLF4J logger.

Operation

kyuubi.operation.idle.timeout (default: PT3H, type: duration, since: 1.0.0)
  An operation will be closed when it is not accessed for this duration of time.

kyuubi.operation.interrupt.on.cancel (default: true, type: boolean, since: 1.2.0)
  When true, all running tasks will be interrupted if one cancels a query. When false, all running tasks will remain until finished.

kyuubi.operation.query.timeout (default: <undefined>, type: duration, since: 1.2.0)
  Timeout for query executions at server-side. It takes effect together with the client-side timeout (java.sql.Statement.setQueryTimeout); a running query will be cancelled automatically when the timeout is reached. It is off by default, which means only the client side fully controls whether the query should time out or not. If set, the client-side timeout is capped at this value. To cancel queries right away without waiting for tasks to finish, consider enabling kyuubi.operation.interrupt.on.cancel together.

kyuubi.operation.scheduler.pool (default: <undefined>, type: string, since: 1.1.1)
  The scheduler pool of the job. Note that this config should be used after setting the Spark config spark.scheduler.mode=FAIR.

kyuubi.operation.status.polling.timeout (default: PT5S, type: duration, since: 1.0.0)
  Timeout(ms) for long polling the status of an asynchronously running SQL query.

Session

kyuubi.session.check.interval (default: PT5M, type: duration, since: 1.0.0)
  The check interval for session timeout.

kyuubi.session.conf.ignore.list (type: seq, since: 1.2.0)
  A comma separated list of ignored keys. If the client connection contains any of them, the key and the corresponding value will be removed silently during engine bootstrap and connection setup. Note that this rule is for server-side protection defined by administrators to prevent some essential configs from being tampered with, but it will not forbid users to set dynamic configurations via SET syntax.

kyuubi.session.conf.restrict.list (type: seq, since: 1.2.0)
  A comma separated list of restricted keys. If the client connection contains any of them, the connection will be rejected explicitly during engine bootstrap and connection setup. Note that this rule is for server-side protection defined by administrators to prevent some essential configs from being tampered with, but it will not forbid users to set dynamic configurations via SET syntax.

kyuubi.session.engine.check.interval (default: PT5M, type: duration, since: 1.0.0)
  The check interval for engine timeout.

kyuubi.session.engine.idle.timeout (default: PT30M, type: duration, since: 1.0.0)
  Engine timeout; the engine will self-terminate when it is not accessed for this duration.

kyuubi.session.engine.initialize.timeout (default: PT1M, type: duration, since: 1.0.0)
  Timeout for starting the background engine, e.g. SparkSQLEngine.

kyuubi.session.engine.log.timeout (default: PT24H, type: duration, since: 1.1.0)
  If we use Spark as the engine, then the session submit log is the console output of spark-submit. We will retain the session submit log until it exceeds this config value.

kyuubi.session.engine.login.timeout (default: PT15S, type: duration, since: 1.0.0)
  The timeout(ms) of creating the connection to the remote SQL query engine.

kyuubi.session.engine.share.level (default: USER, type: string, since: 1.0.0)
  (deprecated) Use kyuubi.engine.share.level instead.

kyuubi.session.engine.spark.main.resource (default: <undefined>, type: string, since: 1.0.0)
  The package used to create the Spark SQL engine remote application. If it is undefined, Kyuubi will use the default.

kyuubi.session.engine.startup.error.max.size (default: 8192, type: int, since: 1.1.0)
  During engine bootstrapping, if an error occurs, use this config to limit the length of the error message (in characters).

kyuubi.session.idle.timeout (default: PT6H, type: duration, since: 1.2.0)
  Session idle timeout; the session will be closed when it is not accessed for this duration.

kyuubi.session.timeout (default: PT6H, type: duration, since: 1.0.0)
  (deprecated) Session timeout; the session will be closed when it is not accessed for this duration.

Zookeeper

kyuubi.zookeeper.embedded.client.port (default: 2181, type: int, since: 1.2.0)
  clientPort for the embedded zookeeper server to listen for client connections; a client here could be a Kyuubi server, an engine, or a JDBC client.

kyuubi.zookeeper.embedded.client.port.address (default: <undefined>, type: string, since: 1.2.0)
  clientPortAddress for the embedded zookeeper server.

kyuubi.zookeeper.embedded.data.dir (default: embedded_zookeeper, type: string, since: 1.2.0)
  dataDir for the embedded zookeeper server, where it stores the in-memory database snapshots and, unless specified otherwise, the transaction log of updates to the database.

kyuubi.zookeeper.embedded.data.log.dir (default: embedded_zookeeper, type: string, since: 1.2.0)
  dataLogDir for the embedded zookeeper server, where it writes the transaction log.

kyuubi.zookeeper.embedded.directory (default: embedded_zookeeper, type: string, since: 1.0.0)
  The temporary directory for the embedded zookeeper server.

kyuubi.zookeeper.embedded.max.client.connections (default: 120, type: int, since: 1.2.0)
  maxClientCnxns for the embedded zookeeper server, limiting the number of concurrent connections of a single client identified by IP address.

kyuubi.zookeeper.embedded.max.session.timeout (default: 60000, type: int, since: 1.2.0)
  maxSessionTimeout in milliseconds that the embedded zookeeper server will allow the client to negotiate. Defaults to 20 times the tickTime.

kyuubi.zookeeper.embedded.min.session.timeout (default: 6000, type: int, since: 1.2.0)
  minSessionTimeout in milliseconds that the embedded zookeeper server will allow the client to negotiate. Defaults to 2 times the tickTime.

kyuubi.zookeeper.embedded.port (default: 2181, type: int, since: 1.0.0)
  The port of the embedded zookeeper server.

kyuubi.zookeeper.embedded.tick.time (default: 3000, type: int, since: 1.2.0)
  tickTime in milliseconds for the embedded zookeeper server.

Spark Configurations

Via spark-defaults.conf

Setting them in $SPARK_HOME/conf/spark-defaults.conf supplies default values for the SQL engine application. Available properties can be found in the Spark official online documentation for Spark Configurations.
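
For illustration only (the property names are standard Spark configurations; the values are placeholders, not recommendations), a few such defaults might look like:

    # defaults applied to every SQL engine application launched from this Spark distribution
    spark.master=yarn
    spark.executor.memory=5g
    spark.sql.shuffle.partitions=200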

Via kyuubi-defaults.conf

Setting them in $KYUUBI_HOME/conf/kyuubi-defaults.conf supplies default values for the SQL engine application too. These properties will override all settings in $SPARK_HOME/conf/spark-defaults.conf.
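
For example, continuing the illustrative values above, a Spark property placed in kyuubi-defaults.conf takes precedence:

    # overrides spark.executor.memory=5g set in $SPARK_HOME/conf/spark-defaults.conf
    spark.executor.memory=8g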

Via JDBC Connection URL

Setting them in the JDBC Connection URL supplies session-specific settings for each SQL engine, for example: jdbc:hive2://localhost:10009/default;#spark.sql.shuffle.partitions=2;spark.executor.memory=5g (a Beeline sketch using this URL follows the list below).

  • Runtime SQL Configuration

    • For Runtime SQL Configurations, they will take effect every time.

  • Static SQL and Spark Core Configuration

    • For Static SQL Configurations and other Spark core configs, e.g. spark.executor.memory, they will take effect if there is no existing SQL engine application. Otherwise, they will just be ignored.
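
As a sketch of how such a URL is used from the command line (the host, port, and user name are placeholders), the settings after the # are passed to the engine session:

    # connection-level Spark settings follow the '#' in the JDBC connection URL
    beeline -u 'jdbc:hive2://localhost:10009/default;#spark.sql.shuffle.partitions=2;spark.executor.memory=5g' -n kent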

Via SET Syntax

Please refer to the Spark official online documentation for the SET Command.
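
For example, from any connected client a runtime SQL configuration can be changed or inspected per session:

    -- change a runtime SQL configuration for the current session
    SET spark.sql.shuffle.partitions=2;
    -- show the current value
    SET spark.sql.shuffle.partitions;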

Logging

Kyuubi uses log4j for logging. You can configure it using $KYUUBI_HOME/conf/log4j.properties.

    #
    # Licensed to the Apache Software Foundation (ASF) under one or more
    # contributor license agreements. See the NOTICE file distributed with
    # this work for additional information regarding copyright ownership.
    # The ASF licenses this file to You under the Apache License, Version 2.0
    # (the "License"); you may not use this file except in compliance with
    # the License. You may obtain a copy of the License at
    #
    #    http://www.apache.org/licenses/LICENSE-2.0
    #
    # Unless required by applicable law or agreed to in writing, software
    # distributed under the License is distributed on an "AS IS" BASIS,
    # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    # See the License for the specific language governing permissions and
    # limitations under the License.
    #
    # Set everything to be logged to the console
    log4j.rootCategory=INFO, console
    log4j.appender.console=org.apache.log4j.ConsoleAppender
    log4j.appender.console.target=System.err
    log4j.appender.console.layout=org.apache.log4j.PatternLayout
    log4j.appender.console.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss.SSS} %p %c{2}: %m%n
    # Set the default kyuubi-ctl log level to WARN. When running the kyuubi-ctl, the
    # log level for this class is used to overwrite the root logger's log level.
    log4j.logger.org.apache.kyuubi.ctl.ServiceControlCli=ERROR

Other Configurations

Hadoop Configurations

Specify HADOOP_CONF_DIR pointing to the directory that contains the Hadoop configuration files, or treat them as Spark properties with a spark.hadoop. prefix. Please refer to the Spark official online documentation for Inheriting Hadoop Cluster Configuration. Also, please refer to the Apache Hadoop online documentation for an overview on how to configure Hadoop.
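
A sketch of the second approach (the values are placeholders): any Hadoop client setting can be prefixed with spark.hadoop. and placed in spark-defaults.conf or kyuubi-defaults.conf.

    # Hadoop client settings passed through with the spark.hadoop. prefix
    spark.hadoop.fs.defaultFS=hdfs://namenode:8020
    spark.hadoop.dfs.replication=2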

Hive Configurations

These configurations are used by the SQL engine application to talk to the Hive Metastore and can be configured in a hive-site.xml. Place it in the $SPARK_HOME/conf directory, or treat them as Spark properties with a spark.hadoop. prefix.
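
A minimal hive-site.xml sketch, assuming a remote Hive Metastore whose thrift address is a placeholder:

    <configuration>
      <!-- thrift endpoint of the Hive Metastore service; host and port are placeholders -->
      <property>
        <name>hive.metastore.uris</name>
        <value>thrift://metastore-host:9083</value>
      </property>
    </configuration>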

User Defaults

In Kyuubi, we can configure user default settings to meet separate needs. These user defaults override the system defaults, but will be overridden by those from the JDBC Connection URL or SET Command if any. They will take effect ONLY when creating the SQL engine application. User default settings are in the form of ___{username}___.{config key}. There are three continuous underscores(_) on both sides of the username and a dot(.) that separates the config key from the prefix. For example:

    # For system defaults
    spark.master=local
    spark.sql.adaptive.enabled=true
    # For a user named kent
    ___kent___.spark.master=yarn
    ___kent___.spark.sql.adaptive.enabled=false
    # For a user named bob
    ___bob___.spark.master=spark://master:7077
    ___bob___.spark.executor.memory=8g

In the above case, if there are no related configurations from the JDBC Connection URL, kent will run his SQL engine application on YARN and prefer the Spark AQE to be off, while bob will activate his SQL engine application on a Spark standalone cluster with 8g heap memory for each executor and obey the Spark AQE behavior of the Kyuubi system default. On the other hand, users who do not have custom configurations will use the system defaults.
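
For instance, kent could still override his user default for a single connection through the JDBC Connection URL (a hedged sketch; the host, port, and setting are illustrative):

    # the URL setting wins over ___kent___.spark.sql.adaptive.enabled=false for this connection
    beeline -u 'jdbc:hive2://localhost:10009/default;#spark.sql.adaptive.enabled=true' -n kent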