Kyuubi uses Apache Log4j2 for logging since version v1.5.0. For versions v1.4.1 and below, it uses Apache Log4j.
In general, there are mainly three components in the Kyuubi architecture that will produce component-oriented logs to help you trace breadcrumbs for SQL workloads against Kyuubi.
- Logs of Kyuubi Server
- Logs of Kyuubi Engines
- Operation logs
In addition, a Kyuubi deployment for production usually relies on some other external systems. For example, both Kyuubi servers and engines will use Apache Zookeeper for service discovery. The instructions for external system loggings will not be included in this article.
Logs of Kyuubi Server
Logs of Kyuubi Server show us the activities of the server instance including how start/stop, how does it response client requests, etc.
Configuring Server Logging
Basic Configurations
You can configure it by adding a log4j2.xml file in the $KYUUBI_HOME/conf directory.
One way to start is to make a copy of the existing log4j2.xml.template located there.
For example,
# cd $KYUUBI_HOMEcp conf/log4j2.xml.template conf/log4j2.xml
With or without the above step, by default the server logging will redirect the logs to a file named kyuubi-${env:USER}-org.apache.kyuubi.server.KyuubiServer-${env:HOSTNAME}.out under the directory of $KYUUBI_HOME/logs.
For example, you can easily find where the server log goes when staring a Kyuubi server from the console output.
$ export SPARK_HOME=/Users/kentyao/Downloads/spark/spark-3.2.0-bin-hadoop3.2$ cd ~/svn-kyuubi/v1.3.1-incubating-rc0/apache-kyuubi-1.3.1-incubating-bin$ bin/kyuubi start
Starting Kyuubi Server from /Users/kentyao/svn-kyuubi/v1.3.1-incubating-rc0/apache-kyuubi-1.3.1-incubating-binWarn: Not find kyuubi environment file /Users/kentyao/svn-kyuubi/v1.3.1-incubating-rc0/apache-kyuubi-1.3.1-incubating-bin/conf/kyuubi-env.sh, using default ones...JAVA_HOME: /Library/Java/JavaVirtualMachines/jdk1.8.0_251.jdk/Contents/HomeKYUUBI_HOME: /Users/kentyao/svn-kyuubi/v1.3.1-incubating-rc0/apache-kyuubi-1.3.1-incubating-binKYUUBI_CONF_DIR: /Users/kentyao/svn-kyuubi/v1.3.1-incubating-rc0/apache-kyuubi-1.3.1-incubating-bin/confKYUUBI_LOG_DIR: /Users/kentyao/svn-kyuubi/v1.3.1-incubating-rc0/apache-kyuubi-1.3.1-incubating-bin/logsKYUUBI_PID_DIR: /Users/kentyao/svn-kyuubi/v1.3.1-incubating-rc0/apache-kyuubi-1.3.1-incubating-bin/pidKYUUBI_WORK_DIR_ROOT: /Users/kentyao/svn-kyuubi/v1.3.1-incubating-rc0/apache-kyuubi-1.3.1-incubating-bin/workSPARK_HOME: /Users/kentyao/Downloads/spark/spark-3.2.0-bin-hadoop3.2SPARK_CONF_DIR: /Users/kentyao/Downloads/spark/spark-3.2.0-bin-hadoop3.2/confHADOOP_CONF_DIR:YARN_CONF_DIR:Starting org.apache.kyuubi.server.KyuubiServer, logging to /Users/kentyao/svn-kyuubi/v1.3.1-incubating-rc0/apache-kyuubi-1.3.1-incubating-bin/logs/kyuubi-kentyao-org.apache.kyuubi.server.KyuubiServer-hulk.local.outWelcome to__ __ __/\ \/\ \ /\ \ __\ \ \/'/' __ __ __ __ __ __\ \ \____/\_\\ \ , < /\ \/\ \/\ \/\ \/\ \/\ \\ \ '__`\/\ \\ \ \\`\\ \ \_\ \ \ \_\ \ \ \_\ \\ \ \L\ \ \ \\ \_\ \_\/`____ \ \____/\ \____/ \ \_,__/\ \_\\/_/\/_/`/___/> \/___/ \/___/ \/___/ \/_//\___/\/__/
KYUUBI_LOG_DIR
You may also notice that there is an environment variable called KYUUBI_LOG_DIR in the above example.
KYUUBI_LOG_DIR determines which folder we want to put our server log files.
For example, the below command will locate the log files to /Users/kentyao/tmp.
$ mkdir /Users/kentyao/tmp$ KYUUBI_LOG_DIR=/Users/kentyao/tmp bin/kyuubi start
Starting org.apache.kyuubi.server.KyuubiServer, logging to /Users/kentyao/tmp/kyuubi-kentyao-org.apache.kyuubi.server.KyuubiServer-hulk.local.out
KYUUBI_MAX_LOG_FILES
KYUUBI_MAX_LOG_FILES controls how many log files will be remained after a Kyuubi server reboots.
Custom Log4j2 Settings
Taking control of $KYUUBI_HOME/conf/log4j2.xml will also give us the ability of customizing server logging as we want.
For example, we can disable the console appender and enable the file appender like,
<Configuration status="INFO"><Appenders><File name="fa" fileName="log/dummy.log"><PatternLayout pattern="%d{yyyy-MM-dd HH:mm:ss.SSS} %p %c: %m%n%ex"/><Filters><RegexFilter regex=".*Thrift error occurred during processing of message.*" onMatch="DENY" onMismatch="NEUTRAL"/></Filters></File></Appenders><Loggers><Root level="INFO"><AppenderRef ref="fa"/></Root></Loggers></Configuration>
Then everything goes to log/dummy.log.
Logs of Spark SQL Engine
Spark SQL Engine is one type of Kyuubi Engines and also a typical Spark application. Thus, its logs mainly contain the logs of a Spark Driver. Meanwhile, it also includes how all the services of an engine start/stop, how does it response the incoming calls from Kyuubi servers, etc.
In general, when an exception occurs, we are able to find more information and clues in the engine’s logs.
Configuring Engine Logging
Please refer to Apache Spark online documentation -Configuring Logging for instructions.
Where to Find the Engine Log
The engine logs locate differently based on the deploy mode and the cluster manager.
When using local backend or client deploy mode for other cluster managers, such as YARN, you can find the whole engine log in $KYUUBI_WORK_DIR_ROOT/${session username}/kyuubi-spark-sql-engine.log.${num}.
Different session users have different folders to group all live and historical engine logs.
Each engine will have one and only engine log.
When using cluster deploy mode, the local engine logs only contain very little information, the main parts of engine logs are on the remote driver side, e.g. for YARN cluster, they are in ApplicationMasters’ log.
Logs of Flink SQL Engine
Flink SQL Engine is one type of Kyuubi Engines and also a typical Flink application. Thus, its logs mainly contain the logs of a Flink JobManager and TaskManager. Meanwhile, it also includes how all the services of an engine start/stop, how does it response the incoming calls from Kyuubi servers, etc.
In general, when an exception occurs, we are able to find more information and clues in the engine’s logs.
Configuring Engine Logging
Please refer to Apache Flink online documentation -Configuring Logging for instructions.
Where to Find the Engine Log
The engine logs locate differently based on the deploy mode and the cluster manager.
When using local backend or client deploy mode for other cluster managers, such as YARN, you can find the whole engine log in $KYUUBI_WORK_DIR_ROOT/${session username}/kyuubi-flink-sql-engine.log.${num}.
Different session users have different folders to group all live and historical engine logs.
Each engine will have one and only engine log.
When using cluster deploy mode, the local engine logs only contain very little information, the main parts of engine logs are on the remote driver side, e.g. for YARN cluster, they are in ApplicationMasters’ log.
Operation Logs
Operation log will show how SQL queries are executed, such as query planning, execution, and statistic reports.
Operation logs can reveal directly to end-users how their queries are being executed on the server/engine-side, including some process-oriented information, and why their queries are slow or in error.
For example, when you, as an end-user, use beeline to connect a Kyuubi server and execute query like below.
bin/beeline -u 'jdbc:hive2://10.242.189.214:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=kyuubi' -n kent -e 'select * from src;'
You will both get the final results and the corresponding operation logs telling you the journey of the query.
0: jdbc:hive2://10.242.189.214:2181/> select * from src;2021-10-27 17:00:19.399 INFO operation.ExecuteStatement: Processing kent's query[fb5f57d2-2b50-4a46-961b-3a5c6a2d2597]: INITIALIZED_STATE -> PENDING_STATE, statement: select * from src2021-10-27 17:00:19.401 INFO operation.ExecuteStatement: Processing kent's query[fb5f57d2-2b50-4a46-961b-3a5c6a2d2597]: PENDING_STATE -> RUNNING_STATE, statement: select * from src2021-10-27 17:00:19.400 INFO operation.ExecuteStatement: Processing kent's query[26e169a2-6c06-450a-b758-e577ac673d70]: INITIALIZED_STATE -> PENDING_STATE, statement: select * from src2021-10-27 17:00:19.401 INFO operation.ExecuteStatement: Processing kent's query[26e169a2-6c06-450a-b758-e577ac673d70]: PENDING_STATE -> RUNNING_STATE, statement: select * from src2021-10-27 17:00:19.402 INFO operation.ExecuteStatement:Spark application name: kyuubi_USER_kent_6d4b5e53-ddd2-420c-b04f-326fb2b17e18application ID: local-1635318669122application web UI: http://10.242.189.214:50250master: local[*]deploy mode: clientversion: 3.2.0Start time: 2021-10-27T15:11:08.416User: kent2021-10-27 17:00:19.408 INFO metastore.HiveMetaStore: 6: get_database: default2021-10-27 17:00:19.408 INFO HiveMetaStore.audit: ugi=kent ip=unknown-ip-addr cmd=get_database: default2021-10-27 17:00:19.424 WARN conf.HiveConf: HiveConf of name hive.internal.ss.authz.settings.applied.marker does not exist2021-10-27 17:00:19.424 WARN conf.HiveConf: HiveConf of name hive.stats.jdbc.timeout does not exist2021-10-27 17:00:19.424 WARN conf.HiveConf: HiveConf of name hive.stats.retries.wait does not exist2021-10-27 17:00:19.424 INFO metastore.HiveMetaStore: 6: Opening raw store with implementation class:org.apache.hadoop.hive.metastore.ObjectStore2021-10-27 17:00:19.425 INFO metastore.ObjectStore: ObjectStore, initialize called2021-10-27 17:00:19.430 INFO metastore.MetaStoreDirectSql: Using direct SQL, underlying DB is DERBY2021-10-27 17:00:19.431 INFO metastore.ObjectStore: Initialized ObjectStore2021-10-27 17:00:19.434 INFO metastore.HiveMetaStore: 6: get_table : db=default tbl=src2021-10-27 17:00:19.434 INFO HiveMetaStore.audit: ugi=kent ip=unknown-ip-addr cmd=get_table : db=default tbl=src2021-10-27 17:00:19.449 INFO metastore.HiveMetaStore: 6: get_table : db=default tbl=src2021-10-27 17:00:19.450 INFO HiveMetaStore.audit: ugi=kent ip=unknown-ip-addr cmd=get_table : db=default tbl=src2021-10-27 17:00:19.510 INFO operation.ExecuteStatement: Processing kent's query[26e169a2-6c06-450a-b758-e577ac673d70]: RUNNING_STATE -> RUNNING_STATE, statement: select * from src2021-10-27 17:00:19.544 INFO memory.MemoryStore: Block broadcast_5 stored as values in memory (estimated size 343.6 KiB, free 408.6 MiB)2021-10-27 17:00:19.558 INFO memory.MemoryStore: Block broadcast_5_piece0 stored as bytes in memory (estimated size 33.5 KiB, free 408.5 MiB)2021-10-27 17:00:19.559 INFO spark.SparkContext: Created broadcast 5 from2021-10-27 17:00:19.600 INFO mapred.FileInputFormat: Total input files to process : 12021-10-27 17:00:19.627 INFO spark.SparkContext: Starting job: collect at ExecuteStatement.scala:972021-10-27 17:00:19.629 INFO kyuubi.SQLOperationListener: Query [26e169a2-6c06-450a-b758-e577ac673d70]: Job 5 started with 1 stages, 1 active jobs running2021-10-27 17:00:19.631 INFO kyuubi.SQLOperationListener: Query [26e169a2-6c06-450a-b758-e577ac673d70]: Stage 5 started with 1 tasks, 1 active stages running2021-10-27 17:00:19.713 INFO kyuubi.SQLOperationListener: Finished stage: Stage(5, 0); Name: 'collect at ExecuteStatement.scala:97'; Status: succeeded; numTasks: 1; Took: 83 msec2021-10-27 17:00:19.713 INFO scheduler.DAGScheduler: Job 5 finished: collect at ExecuteStatement.scala:97, took 0.085454 s2021-10-27 17:00:19.713 INFO scheduler.StatsReportListener: task runtime:(count: 1, mean: 78.000000, stdev: 0.000000, max: 78.000000, min: 78.000000)2021-10-27 17:00:19.713 INFO scheduler.StatsReportListener: 0% 5% 10% 25% 50% 75% 90% 95% 100%2021-10-27 17:00:19.713 INFO scheduler.StatsReportListener: 78.0 ms 78.0 ms 78.0 ms 78.0 ms 78.0 ms 78.0 ms 78.0 ms 78.0 ms 78.0 ms2021-10-27 17:00:19.714 INFO scheduler.StatsReportListener: shuffle bytes written:(count: 1, mean: 0.000000, stdev: 0.000000, max: 0.000000, min: 0.000000)2021-10-27 17:00:19.714 INFO scheduler.StatsReportListener: 0% 5% 10% 25% 50% 75% 90% 95% 100%2021-10-27 17:00:19.714 INFO scheduler.StatsReportListener: 0.0 B 0.0 B 0.0 B 0.0 B 0.0 B 0.0 B 0.0 B 0.0 B 0.0 B2021-10-27 17:00:19.714 INFO scheduler.StatsReportListener: fetch wait time:(count: 1, mean: 0.000000, stdev: 0.000000, max: 0.000000, min: 0.000000)2021-10-27 17:00:19.714 INFO scheduler.StatsReportListener: 0% 5% 10% 25% 50% 75% 90% 95% 100%2021-10-27 17:00:19.714 INFO scheduler.StatsReportListener: 0.0 ms 0.0 ms 0.0 ms 0.0 ms 0.0 ms 0.0 ms 0.0 ms 0.0 ms 0.0 ms2021-10-27 17:00:19.715 INFO scheduler.StatsReportListener: remote bytes read:(count: 1, mean: 0.000000, stdev: 0.000000, max: 0.000000, min: 0.000000)2021-10-27 17:00:19.715 INFO scheduler.StatsReportListener: 0% 5% 10% 25% 50% 75% 90% 95% 100%2021-10-27 17:00:19.715 INFO scheduler.StatsReportListener: 0.0 B 0.0 B 0.0 B 0.0 B 0.0 B 0.0 B 0.0 B 0.0 B 0.0 B2021-10-27 17:00:19.715 INFO scheduler.StatsReportListener: task result size:(count: 1, mean: 1471.000000, stdev: 0.000000, max: 1471.000000, min: 1471.000000)2021-10-27 17:00:19.715 INFO scheduler.StatsReportListener: 0% 5% 10% 25% 50% 75% 90% 95% 100%2021-10-27 17:00:19.715 INFO scheduler.StatsReportListener: 1471.0 B 1471.0 B 1471.0 B 1471.0 B 1471.0 B 1471.0 B 1471.0 B 1471.0 B 1471.0 B2021-10-27 17:00:19.717 INFO scheduler.StatsReportListener: executor (non-fetch) time pct: (count: 1, mean: 61.538462, stdev: 0.000000, max: 61.538462, min: 61.538462)2021-10-27 17:00:19.717 INFO scheduler.StatsReportListener: 0% 5% 10% 25% 50% 75% 90% 95% 100%2021-10-27 17:00:19.717 INFO scheduler.StatsReportListener: 62 % 62 % 62 % 62 % 62 % 62 % 62 % 62 % 62 %2021-10-27 17:00:19.718 INFO scheduler.StatsReportListener: fetch wait time pct: (count: 1, mean: 0.000000, stdev: 0.000000, max: 0.000000, min: 0.000000)2021-10-27 17:00:19.718 INFO scheduler.StatsReportListener: 0% 5% 10% 25% 50% 75% 90% 95% 100%2021-10-27 17:00:19.718 INFO scheduler.StatsReportListener: 0 % 0 % 0 % 0 % 0 % 0 % 0 % 0 % 0 %2021-10-27 17:00:19.718 INFO scheduler.StatsReportListener: other time pct: (count: 1, mean: 38.461538, stdev: 0.000000, max: 38.461538, min: 38.461538)2021-10-27 17:00:19.718 INFO scheduler.StatsReportListener: 0% 5% 10% 25% 50% 75% 90% 95% 100%2021-10-27 17:00:19.718 INFO scheduler.StatsReportListener: 38 % 38 % 38 % 38 % 38 % 38 % 38 % 38 % 38 %2021-10-27 17:00:19.719 INFO kyuubi.SQLOperationListener: Query [26e169a2-6c06-450a-b758-e577ac673d70]: Job 5 succeeded, 0 active jobs running2021-10-27 17:00:19.728 INFO codegen.CodeGenerator: Code generated in 12.277091 ms2021-10-27 17:00:19.729 INFO operation.ExecuteStatement: Processing kent's query[26e169a2-6c06-450a-b758-e577ac673d70]: RUNNING_STATE -> FINISHED_STATE, statement: select * from src, time taken: 0.328 seconds2021-10-27 17:00:19.731 INFO operation.ExecuteStatement: Query[fb5f57d2-2b50-4a46-961b-3a5c6a2d2597] in FINISHED_STATE2021-10-27 17:00:19.731 INFO operation.ExecuteStatement: Processing kent's query[fb5f57d2-2b50-4a46-961b-3a5c6a2d2597]: RUNNING_STATE -> FINISHED_STATE, statement: select * from src, time taken: 0.33 seconds+-------------------------------------------------+--------------------+| version() | DATE '2021-10-27' |+-------------------------------------------------+--------------------+| 3.2.0 5d45a415f3a29898d92380380cfd82bfc7f579ea | 2021-10-27 |+-------------------------------------------------+--------------------+1 row selected (0.341 seconds)
