Kyuubi uses Apache Log4j2 for logging since version v1.5.0. For versions v1.4.1 and below, it uses Apache Log4j.

In general, there are three main components in the Kyuubi architecture that produce component-oriented logs to help you trace breadcrumbs for SQL workloads against Kyuubi.

  • Logs of Kyuubi Server
  • Logs of Kyuubi Engines
  • Operation logs

In addition, a production Kyuubi deployment usually relies on other external systems. For example, both Kyuubi servers and engines use Apache ZooKeeper for service discovery. Instructions for configuring logging in such external systems are not covered in this article.

Logs of Kyuubi Server

Logs of Kyuubi Server show us the activities of the server instance, including how it starts/stops, how it responds to client requests, etc.

Configuring Server Logging

Basic Configurations

You can configure server logging by adding a log4j2.xml file to the $KYUUBI_HOME/conf directory. One way to start is to make a copy of the existing log4j2.xml.template located there.

For example,

  cd $KYUUBI_HOME
  cp conf/log4j2.xml.template conf/log4j2.xml

With or without the above step, by default the server logging redirects the logs to a file named kyuubi-${env:USER}-org.apache.kyuubi.server.KyuubiServer-${env:HOSTNAME}.out under the $KYUUBI_HOME/logs directory.
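If you want to follow the server log in real time, you can simply tail that file. A minimal sketch, assuming the default naming pattern with your login user and the machine's hostname substituted for the ${env:...} placeholders:

  tail -f "$KYUUBI_HOME/logs/kyuubi-$USER-org.apache.kyuubi.server.KyuubiServer-$(hostname).out"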

For example, you can easily find from the console output where the server log goes when starting a Kyuubi server.

  $ export SPARK_HOME=/Users/kentyao/Downloads/spark/spark-3.2.0-bin-hadoop3.2
  $ cd ~/svn-kyuubi/v1.3.1-incubating-rc0/apache-kyuubi-1.3.1-incubating-bin
  $ bin/kyuubi start

  Starting Kyuubi Server from /Users/kentyao/svn-kyuubi/v1.3.1-incubating-rc0/apache-kyuubi-1.3.1-incubating-bin
  Warn: Not find kyuubi environment file /Users/kentyao/svn-kyuubi/v1.3.1-incubating-rc0/apache-kyuubi-1.3.1-incubating-bin/conf/kyuubi-env.sh, using default ones...
  JAVA_HOME: /Library/Java/JavaVirtualMachines/jdk1.8.0_251.jdk/Contents/Home
  KYUUBI_HOME: /Users/kentyao/svn-kyuubi/v1.3.1-incubating-rc0/apache-kyuubi-1.3.1-incubating-bin
  KYUUBI_CONF_DIR: /Users/kentyao/svn-kyuubi/v1.3.1-incubating-rc0/apache-kyuubi-1.3.1-incubating-bin/conf
  KYUUBI_LOG_DIR: /Users/kentyao/svn-kyuubi/v1.3.1-incubating-rc0/apache-kyuubi-1.3.1-incubating-bin/logs
  KYUUBI_PID_DIR: /Users/kentyao/svn-kyuubi/v1.3.1-incubating-rc0/apache-kyuubi-1.3.1-incubating-bin/pid
  KYUUBI_WORK_DIR_ROOT: /Users/kentyao/svn-kyuubi/v1.3.1-incubating-rc0/apache-kyuubi-1.3.1-incubating-bin/work
  SPARK_HOME: /Users/kentyao/Downloads/spark/spark-3.2.0-bin-hadoop3.2
  SPARK_CONF_DIR: /Users/kentyao/Downloads/spark/spark-3.2.0-bin-hadoop3.2/conf
  HADOOP_CONF_DIR:
  YARN_CONF_DIR:
  Starting org.apache.kyuubi.server.KyuubiServer, logging to /Users/kentyao/svn-kyuubi/v1.3.1-incubating-rc0/apache-kyuubi-1.3.1-incubating-bin/logs/kyuubi-kentyao-org.apache.kyuubi.server.KyuubiServer-hulk.local.out
  Welcome to
    __  __                           __
   /\ \/\ \                         /\ \      __
   \ \ \/'/'  __  __  __  __  __  __\ \ \____/\_\
    \ \ , <  /\ \/\ \/\ \/\ \/\ \/\ \\ \ '__`\/\ \
     \ \ \\`\\ \ \_\ \ \ \_\ \ \ \_\ \\ \ \L\ \ \ \
      \ \_\ \_\/`____ \ \____/\ \____/ \ \_,__/\ \_\
       \/_/\/_/`/___/> \/___/  \/___/   \/___/  \/_/
                  /\___/
                  \/__/

KYUUBI_LOG_DIR

You may also notice that there is an environment variable called KYUUBI_LOG_DIR in the above example.

KYUUBI_LOG_DIR determines the folder where we put our server log files.

For example, the command below places the log files in /Users/kentyao/tmp.

  $ mkdir /Users/kentyao/tmp
  $ KYUUBI_LOG_DIR=/Users/kentyao/tmp bin/kyuubi start

  Starting org.apache.kyuubi.server.KyuubiServer, logging to /Users/kentyao/tmp/kyuubi-kentyao-org.apache.kyuubi.server.KyuubiServer-hulk.local.out

KYUUBI_MAX_LOG_FILES

KYUUBI_MAX_LOG_FILES controls how many log files are retained after a Kyuubi server restarts.
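For example, the following start command keeps at most 10 rotated log files; the value 10 here is just an illustrative choice:

  KYUUBI_MAX_LOG_FILES=10 bin/kyuubi start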

Custom Log4j2 Settings

Taking control of $KYUUBI_HOME/conf/log4j2.xml also gives us the ability to customize server logging as we want.

For example, we can disable the console appender and enable a file appender, like this:

  <Configuration status="INFO">
    <Appenders>
      <File name="fa" fileName="log/dummy.log">
        <PatternLayout pattern="%d{yyyy-MM-dd HH:mm:ss.SSS} %p %c: %m%n%ex"/>
        <Filters>
          <RegexFilter regex=".*Thrift error occurred during processing of message.*" onMatch="DENY" onMismatch="NEUTRAL"/>
        </Filters>
      </File>
    </Appenders>
    <Loggers>
      <Root level="INFO">
        <AppenderRef ref="fa"/>
      </Root>
    </Loggers>
  </Configuration>

Then everything goes to log/dummy.log.
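Beyond appenders, standard Log4j2 logger configuration applies as well. As a sketch, the following Loggers section raises only Kyuubi's own packages to DEBUG while keeping everything else at INFO; additivity is disabled so events are not written twice:

  <Loggers>
    <!-- verbose logging for Kyuubi classes only -->
    <Logger name="org.apache.kyuubi" level="DEBUG" additivity="false">
      <AppenderRef ref="fa"/>
    </Logger>
    <Root level="INFO">
      <AppenderRef ref="fa"/>
    </Root>
  </Loggers>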

Logs of Spark SQL Engine

Spark SQL Engine is one type of Kyuubi Engines and also a typical Spark application. Thus, its logs mainly contain the logs of a Spark Driver. Meanwhile, they also record how all the services of an engine start/stop and how the engine responds to incoming calls from Kyuubi servers, etc.

In general, when an exception occurs, we are able to find more information and clues in the engine’s logs.

Configuring Engine Logging

Please refer to the Apache Spark online documentation - Configuring Logging for instructions.
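For instance, with the Spark 3.2 distribution used elsewhere in this article, which still ships Log4j 1.x, a common starting point is to copy Spark's bundled template (newer Spark releases switch to Log4j2 and ship a log4j2.properties.template instead):

  cd $SPARK_HOME
  cp conf/log4j.properties.template conf/log4j.properties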

Where to Find the Engine Log

The location of the engine logs differs depending on the deploy mode and the cluster manager. When using the local backend or the client deploy mode with other cluster managers, such as YARN, you can find the whole engine log in $KYUUBI_WORK_DIR_ROOT/${session username}/kyuubi-spark-sql-engine.log.${num}. Each session user has its own folder grouping all of its live and historical engine logs, and each engine has one and only one engine log. When using cluster deploy mode, the local engine logs contain very little information; the main parts of the engine logs are on the remote driver side, e.g., for YARN cluster mode, they are in the ApplicationMaster's log.
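As a sketch of both cases (the session user kent, the .0 suffix, and the YARN application ID below are hypothetical placeholders):

  # local backend or client deploy mode: read the log on the Kyuubi server host
  tail -f "$KYUUBI_WORK_DIR_ROOT/kent/kyuubi-spark-sql-engine.log.0"

  # YARN cluster deploy mode: fetch the ApplicationMaster (driver) log
  yarn logs -applicationId application_1635318669122_0001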

Logs of Flink SQL Engine

Flink SQL Engine is one type of Kyuubi Engines and also a typical Flink application. Thus, its logs mainly contain the logs of a Flink JobManager and TaskManager. Meanwhile, they also record how all the services of an engine start/stop and how the engine responds to incoming calls from Kyuubi servers, etc.

In general, when an exception occurs, we are able to find more information and clues in the engine’s logs.

Configuring Engine Logging

Please refer to the Apache Flink online documentation - Configuring Logging for instructions.
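For example, to raise the Kyuubi engine classes to DEBUG, you can add a logger to $FLINK_HOME/conf/log4j.properties (a sketch; despite the file name, Flink 1.11+ parses this file in Log4j2 properties format, and the kyuubi logger key below is an arbitrary label):

  # verbose logging for the Kyuubi Flink engine classes
  logger.kyuubi.name = org.apache.kyuubi
  logger.kyuubi.level = DEBUG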

Where to Find the Engine Log

The location of the engine logs differs depending on the deploy mode and the cluster manager. When using the local backend or the client deploy mode with other cluster managers, such as YARN, you can find the whole engine log in $KYUUBI_WORK_DIR_ROOT/${session username}/kyuubi-flink-sql-engine.log.${num}. Each session user has its own folder grouping all of its live and historical engine logs, and each engine has one and only one engine log. When using cluster deploy mode, the local engine logs contain very little information; the main parts of the engine logs are on the remote driver side, e.g., for YARN cluster mode, they are in the ApplicationMaster's log.
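For instance (again, the session user kent and the .0 suffix are hypothetical placeholders):

  tail -f "$KYUUBI_WORK_DIR_ROOT/kent/kyuubi-flink-sql-engine.log.0"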

Operation Logs

Operation logs show how SQL queries are executed, covering query planning, execution, and statistic reports.

Operation logs can directly reveal to end-users how their queries are being executed on the server/engine side, including some process-oriented information, and why their queries are slow or in error.

For example, suppose you, as an end-user, use beeline to connect to a Kyuubi server and execute a query like the one below.

  bin/beeline -u 'jdbc:hive2://10.242.189.214:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=kyuubi' -n kent -e 'select * from src;'

You will get both the final results and the corresponding operation logs telling you the journey of the query.

  0: jdbc:hive2://10.242.189.214:2181/> select * from src;
  2021-10-27 17:00:19.399 INFO operation.ExecuteStatement: Processing kent's query[fb5f57d2-2b50-4a46-961b-3a5c6a2d2597]: INITIALIZED_STATE -> PENDING_STATE, statement: select * from src
  2021-10-27 17:00:19.401 INFO operation.ExecuteStatement: Processing kent's query[fb5f57d2-2b50-4a46-961b-3a5c6a2d2597]: PENDING_STATE -> RUNNING_STATE, statement: select * from src
  2021-10-27 17:00:19.400 INFO operation.ExecuteStatement: Processing kent's query[26e169a2-6c06-450a-b758-e577ac673d70]: INITIALIZED_STATE -> PENDING_STATE, statement: select * from src
  2021-10-27 17:00:19.401 INFO operation.ExecuteStatement: Processing kent's query[26e169a2-6c06-450a-b758-e577ac673d70]: PENDING_STATE -> RUNNING_STATE, statement: select * from src
  2021-10-27 17:00:19.402 INFO operation.ExecuteStatement:
        Spark application name: kyuubi_USER_kent_6d4b5e53-ddd2-420c-b04f-326fb2b17e18
        application ID: local-1635318669122
        application web UI: http://10.242.189.214:50250
        master: local[*]
        deploy mode: client
        version: 3.2.0
        Start time: 2021-10-27T15:11:08.416
        User: kent
  2021-10-27 17:00:19.408 INFO metastore.HiveMetaStore: 6: get_database: default
  2021-10-27 17:00:19.408 INFO HiveMetaStore.audit: ugi=kent ip=unknown-ip-addr cmd=get_database: default
  2021-10-27 17:00:19.424 WARN conf.HiveConf: HiveConf of name hive.internal.ss.authz.settings.applied.marker does not exist
  2021-10-27 17:00:19.424 WARN conf.HiveConf: HiveConf of name hive.stats.jdbc.timeout does not exist
  2021-10-27 17:00:19.424 WARN conf.HiveConf: HiveConf of name hive.stats.retries.wait does not exist
  2021-10-27 17:00:19.424 INFO metastore.HiveMetaStore: 6: Opening raw store with implementation class:org.apache.hadoop.hive.metastore.ObjectStore
  2021-10-27 17:00:19.425 INFO metastore.ObjectStore: ObjectStore, initialize called
  2021-10-27 17:00:19.430 INFO metastore.MetaStoreDirectSql: Using direct SQL, underlying DB is DERBY
  2021-10-27 17:00:19.431 INFO metastore.ObjectStore: Initialized ObjectStore
  2021-10-27 17:00:19.434 INFO metastore.HiveMetaStore: 6: get_table : db=default tbl=src
  2021-10-27 17:00:19.434 INFO HiveMetaStore.audit: ugi=kent ip=unknown-ip-addr cmd=get_table : db=default tbl=src
  2021-10-27 17:00:19.449 INFO metastore.HiveMetaStore: 6: get_table : db=default tbl=src
  2021-10-27 17:00:19.450 INFO HiveMetaStore.audit: ugi=kent ip=unknown-ip-addr cmd=get_table : db=default tbl=src
  2021-10-27 17:00:19.510 INFO operation.ExecuteStatement: Processing kent's query[26e169a2-6c06-450a-b758-e577ac673d70]: RUNNING_STATE -> RUNNING_STATE, statement: select * from src
  2021-10-27 17:00:19.544 INFO memory.MemoryStore: Block broadcast_5 stored as values in memory (estimated size 343.6 KiB, free 408.6 MiB)
  2021-10-27 17:00:19.558 INFO memory.MemoryStore: Block broadcast_5_piece0 stored as bytes in memory (estimated size 33.5 KiB, free 408.5 MiB)
  2021-10-27 17:00:19.559 INFO spark.SparkContext: Created broadcast 5 from
  2021-10-27 17:00:19.600 INFO mapred.FileInputFormat: Total input files to process : 1
  2021-10-27 17:00:19.627 INFO spark.SparkContext: Starting job: collect at ExecuteStatement.scala:97
  2021-10-27 17:00:19.629 INFO kyuubi.SQLOperationListener: Query [26e169a2-6c06-450a-b758-e577ac673d70]: Job 5 started with 1 stages, 1 active jobs running
  2021-10-27 17:00:19.631 INFO kyuubi.SQLOperationListener: Query [26e169a2-6c06-450a-b758-e577ac673d70]: Stage 5 started with 1 tasks, 1 active stages running
  2021-10-27 17:00:19.713 INFO kyuubi.SQLOperationListener: Finished stage: Stage(5, 0); Name: 'collect at ExecuteStatement.scala:97'; Status: succeeded; numTasks: 1; Took: 83 msec
  2021-10-27 17:00:19.713 INFO scheduler.DAGScheduler: Job 5 finished: collect at ExecuteStatement.scala:97, took 0.085454 s
  2021-10-27 17:00:19.713 INFO scheduler.StatsReportListener: task runtime:(count: 1, mean: 78.000000, stdev: 0.000000, max: 78.000000, min: 78.000000)
  2021-10-27 17:00:19.713 INFO scheduler.StatsReportListener: 0% 5% 10% 25% 50% 75% 90% 95% 100%
  2021-10-27 17:00:19.713 INFO scheduler.StatsReportListener: 78.0 ms 78.0 ms 78.0 ms 78.0 ms 78.0 ms 78.0 ms 78.0 ms 78.0 ms 78.0 ms
  2021-10-27 17:00:19.714 INFO scheduler.StatsReportListener: shuffle bytes written:(count: 1, mean: 0.000000, stdev: 0.000000, max: 0.000000, min: 0.000000)
  2021-10-27 17:00:19.714 INFO scheduler.StatsReportListener: 0% 5% 10% 25% 50% 75% 90% 95% 100%
  2021-10-27 17:00:19.714 INFO scheduler.StatsReportListener: 0.0 B 0.0 B 0.0 B 0.0 B 0.0 B 0.0 B 0.0 B 0.0 B 0.0 B
  2021-10-27 17:00:19.714 INFO scheduler.StatsReportListener: fetch wait time:(count: 1, mean: 0.000000, stdev: 0.000000, max: 0.000000, min: 0.000000)
  2021-10-27 17:00:19.714 INFO scheduler.StatsReportListener: 0% 5% 10% 25% 50% 75% 90% 95% 100%
  2021-10-27 17:00:19.714 INFO scheduler.StatsReportListener: 0.0 ms 0.0 ms 0.0 ms 0.0 ms 0.0 ms 0.0 ms 0.0 ms 0.0 ms 0.0 ms
  2021-10-27 17:00:19.715 INFO scheduler.StatsReportListener: remote bytes read:(count: 1, mean: 0.000000, stdev: 0.000000, max: 0.000000, min: 0.000000)
  2021-10-27 17:00:19.715 INFO scheduler.StatsReportListener: 0% 5% 10% 25% 50% 75% 90% 95% 100%
  2021-10-27 17:00:19.715 INFO scheduler.StatsReportListener: 0.0 B 0.0 B 0.0 B 0.0 B 0.0 B 0.0 B 0.0 B 0.0 B 0.0 B
  2021-10-27 17:00:19.715 INFO scheduler.StatsReportListener: task result size:(count: 1, mean: 1471.000000, stdev: 0.000000, max: 1471.000000, min: 1471.000000)
  2021-10-27 17:00:19.715 INFO scheduler.StatsReportListener: 0% 5% 10% 25% 50% 75% 90% 95% 100%
  2021-10-27 17:00:19.715 INFO scheduler.StatsReportListener: 1471.0 B 1471.0 B 1471.0 B 1471.0 B 1471.0 B 1471.0 B 1471.0 B 1471.0 B 1471.0 B
  2021-10-27 17:00:19.717 INFO scheduler.StatsReportListener: executor (non-fetch) time pct: (count: 1, mean: 61.538462, stdev: 0.000000, max: 61.538462, min: 61.538462)
  2021-10-27 17:00:19.717 INFO scheduler.StatsReportListener: 0% 5% 10% 25% 50% 75% 90% 95% 100%
  2021-10-27 17:00:19.717 INFO scheduler.StatsReportListener: 62 % 62 % 62 % 62 % 62 % 62 % 62 % 62 % 62 %
  2021-10-27 17:00:19.718 INFO scheduler.StatsReportListener: fetch wait time pct: (count: 1, mean: 0.000000, stdev: 0.000000, max: 0.000000, min: 0.000000)
  2021-10-27 17:00:19.718 INFO scheduler.StatsReportListener: 0% 5% 10% 25% 50% 75% 90% 95% 100%
  2021-10-27 17:00:19.718 INFO scheduler.StatsReportListener: 0 % 0 % 0 % 0 % 0 % 0 % 0 % 0 % 0 %
  2021-10-27 17:00:19.718 INFO scheduler.StatsReportListener: other time pct: (count: 1, mean: 38.461538, stdev: 0.000000, max: 38.461538, min: 38.461538)
  2021-10-27 17:00:19.718 INFO scheduler.StatsReportListener: 0% 5% 10% 25% 50% 75% 90% 95% 100%
  2021-10-27 17:00:19.718 INFO scheduler.StatsReportListener: 38 % 38 % 38 % 38 % 38 % 38 % 38 % 38 % 38 %
  2021-10-27 17:00:19.719 INFO kyuubi.SQLOperationListener: Query [26e169a2-6c06-450a-b758-e577ac673d70]: Job 5 succeeded, 0 active jobs running
  2021-10-27 17:00:19.728 INFO codegen.CodeGenerator: Code generated in 12.277091 ms
  2021-10-27 17:00:19.729 INFO operation.ExecuteStatement: Processing kent's query[26e169a2-6c06-450a-b758-e577ac673d70]: RUNNING_STATE -> FINISHED_STATE, statement: select * from src, time taken: 0.328 seconds
  2021-10-27 17:00:19.731 INFO operation.ExecuteStatement: Query[fb5f57d2-2b50-4a46-961b-3a5c6a2d2597] in FINISHED_STATE
  2021-10-27 17:00:19.731 INFO operation.ExecuteStatement: Processing kent's query[fb5f57d2-2b50-4a46-961b-3a5c6a2d2597]: RUNNING_STATE -> FINISHED_STATE, statement: select * from src, time taken: 0.33 seconds
  +-------------------------------------------------+--------------------+
  |                    version()                    |  DATE '2021-10-27' |
  +-------------------------------------------------+--------------------+
  | 3.2.0 5d45a415f3a29898d92380380cfd82bfc7f579ea  | 2021-10-27         |
  +-------------------------------------------------+--------------------+
  1 row selected (0.341 seconds)

Further Readings