DolphinScheduler Directory and Configuration File Walkthrough

(This article explains what each configuration file does; the concrete configuration itself is done in the install.sh deployment file.)

  • bin: startup scripts
  • conf: configuration files
  • lib: jar packages DS depends on
  • script: database creation/upgrade scripts and deployment/distribution scripts
  • sql: SQL files for creating and upgrading the DS metadata schema
  • install script: the main place where configuration is modified when deploying DS

bin

The most important file under bin is dolphinscheduler-daemon.sh. In earlier versions it was a frequent source of "JDK not found" errors; in the current version the script already exports the machine's $JAVA_HOME, so you no longer need to worry about the JDK not being found.
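As a minimal sketch (not a verbatim excerpt from dolphinscheduler-daemon.sh), the effect is roughly the following: the daemon picks up the local JAVA_HOME instead of requiring you to hard-code a JDK path.

```sh
# Illustrative only -- the real dolphinscheduler-daemon.sh may word this differently.
# The point is that the machine's JAVA_HOME is exported before the JVM is launched.
export JAVA_HOME=${JAVA_HOME:-/usr/java/jdk1.8.0_131}   # fallback path is a placeholder
"$JAVA_HOME/bin/java" -version                          # quick sanity check that the JDK resolves
```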

conf

This is an extremely important directory of configuration files!!!

  • The .dolphinscheduler_env.sh file under the env directory records all environment variables related to DS tasks. Version 1.2.0 cannot select a Spark version per task, so either comment out SPARK_HOME1 or point both SPARK_HOME1 and SPARK_HOME2 at the cluster's Spark2. A CDH-based configuration is given below; Flink is not deployed in this test environment, so ignore the FLINK_HOME entry. (Note that this is a hidden file, so you need ls -al to see it.)
```sh
export HADOOP_HOME=/opt/cloudera/parcels/CDH/lib/hadoop
export HADOOP_CONF_DIR=/opt/cloudera/parcels/CDH/lib/hadoop/etc/hadoop
# either comment this out or set it to the same value as SPARK_HOME2
#export SPARK_HOME1=/opt/cloudera/parcels/SPARK2/lib/spark2
export SPARK_HOME2=/opt/cloudera/parcels/SPARK2/lib/spark2
export PYTHON_HOME=/usr/local/anaconda3/bin/python
export JAVA_HOME=/usr/java/jdk1.8.0_131
export HIVE_HOME=/opt/cloudera/parcels/CDH/lib/hive
export FLINK_HOME=/opt/soft/flink
export PATH=$HADOOP_HOME/bin:$SPARK_HOME1/bin:$SPARK_HOME2/bin:$PYTHON_HOME:$JAVA_HOME/bin:$HIVE_HOME/bin:$PATH:$FLINK_HOME/bin:$PATH
```

common directory

The common directory contains common.properties and hadoop/hadoop.properties.

  • common.properties (a hedged sample of both files is sketched after this list)
    • the task queue implementation used by DS; the default is ZooKeeper
    • the local paths workers use to execute tasks and stage resources
    • resource center
      • the resource center can use HDFS or S3
    • resource file types (viewable suffixes)
    • kerberos settings
    • development state
      • may be enabled for development and testing; set it to false in production
    • the DS environment variable configuration; when debugging locally, make sure the file pointed to by dolphinscheduler.env.path exists
  • hadoop.properties
    • HDFS namenode configuration
      • with a single namenode you can simply write the namenode IP
      • with HDFS HA, copy the cluster's core-site.xml and hdfs-site.xml into DS's conf directory
    • S3 configuration
    • YARN resourcemanager configuration
      • single resourcemanager: configure yarn.application.status.address
      • HA: configure yarn.resourcemanager.ha.rm.ids
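The snippet below is a hedged sketch of the two files; the key names follow the 1.2.0 defaults as far as I recall them, so verify them against the files shipped with your own installation.

```properties
# common/common.properties (illustrative excerpt; key names may differ slightly between releases)
# task queue implementation
dolphinscheduler.queue.impl=zookeeper
# worker execution path for tasks
process.exec.basepath=/tmp/dolphinscheduler/exec
# resource center storage: HDFS, S3 or NONE
res.upload.startup.type=NONE
# development state; set to false in production
development.state=false
# environment variable file used by tasks
dolphinscheduler.env.path=/opt/dolphinscheduler/conf/env/.dolphinscheduler_env.sh

# common/hadoop/hadoop.properties (illustrative excerpt)
# single namenode address, or the HA nameservice with core-site.xml/hdfs-site.xml copied into conf
fs.defaultFS=hdfs://mycluster:8020
# resourcemanager HA ids; leave empty for a single resourcemanager
yarn.resourcemanager.ha.rm.ids=192.168.xx.xx,192.168.xx.xx
# single resourcemanager status address
yarn.application.status.address=http://ark1:8088/ws/v1/cluster/apps/%s
```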

config directory

The config directory contains install_config.conf and run_config.conf.

  • install_config.conf (a short hedged sample follows this list)
    • the DS installation path
    • the deployment user
    • the IPs/hostnames of the machines DS is deployed on
  • run_config.conf
    • specifies which machines the DS masters, workers, alertServer and apiServer are deployed on
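As a hedged sketch (hostnames reuse the ark0..ark3 examples from the install.sh section below), the two files end up looking roughly like this after install.sh has filled them in:

```sh
# conf/config/install_config.conf
installPath="/opt/dolphinscheduler"
deployUser="dolphinscheduler"
ips="ark0,ark1,ark2,ark3"

# conf/config/run_config.conf
masters="ark0,ark1"
workers="ark2,ark3"
alertServer="ark3"
apiServers="ark1"
```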

alert.properties

  • email alert configuration (a hedged sample follows this list)
  • Excel download directory
  • WeCom (Enterprise WeChat) configuration
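A hedged excerpt of alert.properties, with key names as I recall them from 1.2.0; check the shipped file for the exact spelling.

```properties
# alert.properties (illustrative excerpt)
mail.protocol=SMTP
mail.server.host=smtp.qq.com
mail.server.port=465
mail.sender=xxx@qq.com
mail.user=xxx@qq.com
mail.passwd=your-auth-code
# Excel download directory
xls.file.path=/tmp/xls
# WeCom (Enterprise WeChat) alerting switch
enterprise.wechat.enable=false
```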

application-api.properties

  • apiserver port, context path, logging, etc. (a hedged sample is sketched below)
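A hedged excerpt; these are the standard Spring Boot keys the apiserver uses in 1.2.0, but confirm them in your own file.

```properties
# application-api.properties (illustrative excerpt)
server.port=12345
server.servlet.context-path=/dolphinscheduler/
server.servlet.session.timeout=7200
logging.config=classpath:apiserver_logback.xml
```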

application-dao.properties

Pay attention, this is important!!! This file holds the DS metadata database configuration. In DS 1.2.0 the default database is PostgreSQL; if you want to use MySQL, you must put the MySQL JDBC driver jar into the lib directory.

  • DS metadata database configuration (a hedged MySQL example follows)
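As a hedged sketch of switching the metadata store to MySQL (standard Spring datasource keys; the JDBC URL and credentials are placeholders):

```properties
# application-dao.properties (illustrative excerpt for MySQL)
# remember to put the MySQL JDBC driver jar into the lib directory first
spring.datasource.driver-class-name=com.mysql.jdbc.Driver
spring.datasource.url=jdbc:mysql://192.168.xx.xx:3306/dolphinscheduler?useUnicode=true&characterEncoding=UTF-8
spring.datasource.username=xx
spring.datasource.password=xx
```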

master.properties

  • number of master execution threads
  • upper limit of tasks the master runs in parallel
  • master CPU and memory thresholds; once exceeded, the master stops splitting DAGs (a hedged sample follows)
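A hedged excerpt of master.properties, with key names as I recall them from 1.2.0:

```properties
# master.properties (illustrative excerpt)
# master execution threads
master.exec.threads=100
# maximum parallel tasks per process instance
master.exec.task.number=20
# CPU load threshold; the default is (number of cpu threads) * 2
#master.max.cpuload.avg=100
# reserved memory threshold; the master stops taking new work below it
master.reserved.memory=1
```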

worker.properties

  • number of worker execution threads
  • number of tasks the worker submits at a time
  • worker CPU and memory thresholds; once exceeded, the worker stops pulling tasks from the task queue (a hedged sample follows)
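A hedged excerpt of worker.properties, mirroring the master settings above:

```properties
# worker.properties (illustrative excerpt)
# worker execution threads
worker.exec.threads=100
# number of tasks fetched from the queue at a time
worker.fetch.task.num=3
# CPU load threshold; the default is (number of cpu threads) * 2
#worker.max.cpuload.avg=100
# reserved memory threshold; the worker stops pulling tasks below it
worker.reserved.memory=1
```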

zookeeper.properties

  • ZooKeeper cluster addresses
  • the znodes DS needs in ZooKeeper, including the distributed locks for DAGs and tasks and the fault-tolerance nodes for masters and workers (a hedged sample follows)
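A hedged excerpt of zookeeper.properties; the znode layout below matches the defaults as far as I recall.

```properties
# zookeeper.properties (illustrative excerpt)
zookeeper.quorum=192.168.xx.xx:2181,192.168.xx.xx:2181,192.168.xx.xx:2181
# DS root znode
zookeeper.dolphinscheduler.root=/dolphinscheduler
# distributed lock and failover znodes
zookeeper.dolphinscheduler.lock.masters=/dolphinscheduler/lock/masters
zookeeper.dolphinscheduler.lock.failover.masters=/dolphinscheduler/lock/failover/masters
zookeeper.dolphinscheduler.lock.failover.workers=/dolphinscheduler/lock/failover/workers
```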

quartz.properties

DS scheduling is driven by the Quartz framework. Note in particular that this file contains Quartz's own database configuration!!!

  • basic Quartz properties: thread pool and job store configuration
  • Quartz metadata database configuration (a hedged sample follows)
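A hedged excerpt using standard Quartz property names; the data source name myDs is whatever org.quartz.jobStore.dataSource points to in the shipped file.

```properties
# quartz.properties (illustrative excerpt)
org.quartz.threadPool.threadCount=25
org.quartz.jobStore.class=org.quartz.impl.jdbcjobstore.JobStoreTX
org.quartz.jobStore.dataSource=myDs
# the Quartz tables live in the DS metadata database; keep this in sync with application-dao.properties
org.quartz.dataSource.myDs.driver=org.postgresql.Driver
org.quartz.dataSource.myDs.URL=jdbc:postgresql://192.168.xx.xx:5432/dolphinscheduler
org.quartz.dataSource.myDs.user=xx
org.quartz.dataSource.myDs.password=xx
```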

install script

The install.sh deployment script is the centerpiece of a DS deployment. Its parameters are analyzed below in groups.

Database configuration

```sh
# for example postgresql or mysql ...
dbtype="postgresql"
# db config
# db address and port
dbhost="192.168.xx.xx:5432"
# db name
dbname="dolphinscheduler"
# db username
username="xx"
# db password
# Note: if there are special characters, escape them with \
passowrd="xx"
```
  • The dbtype parameter can be set to postgresql or mysql. This block specifies the JDBC information DS uses to connect to its metadata database.

Deployment user & directory

```sh
# conf/config/install_config.conf config
# Note: the installation path is not the same as the current path (pwd)
installPath="/data1_1T/dolphinscheduler"
# deployment user
# Note: the deployment user needs to have sudo privileges and permissions to operate hdfs. If hdfs is enabled, the root directory needs to be created by itself
deployUser="dolphinscheduler"
```
  • installPath is the installation path; after install.sh runs, DS is installed into this directory, e.g. /opt/ds-agent. installPath must not be the same directory as the one containing the install.sh you are running.
  • deployUser is the user DS is deployed as. It needs passwordless sudo on every machine DS is deployed to and permission to operate on HDFS; adding it to Hadoop's supergroup group is recommended (a hedged example of preparing the HDFS directory follows).
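A hedged example of preparing the HDFS resource root for the deployment user; the path, user and group names are assumptions, so adjust them to your cluster.

```sh
# create the DS resource root on HDFS and hand it to the deployment user (illustrative)
sudo -u hdfs hdfs dfs -mkdir -p /dolphinscheduler
sudo -u hdfs hdfs dfs -chown -R dolphinscheduler:supergroup /dolphinscheduler
```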

ZooKeeper cluster & role assignment

  • When configuring the ZooKeeper cluster, take special care to use the ip:2181 form; the port must be included.
  • DS has four roles in total: master, worker, alert and api. alert and api each only need one machine; master and worker can be deployed on multiple machines. The example below deploys 2 masters, 2 workers, 1 alert and 1 api across 4 machines.
  • ips: the hostnames of all machines to deploy to
  • masters: the hostnames of the machines running master
  • workers: the hostnames of the machines running worker
  • alertServer: the hostname of the machine running alert
  • apiServers: the hostname of the machine running api
  • The zkRoot parameter can be adjusted so that one ZooKeeper cluster hosts several DS clusters, e.g. zkRoot="/dspro" and zkRoot="/dstest".
```sh
# zk cluster
zkQuorum="192.168.xx.xx:2181,192.168.xx.xx:2181,192.168.xx.xx:2181"
# install hosts
# Note: install the scheduled hostname list. If it is pseudo-distributed, just write a pseudo-distributed hostname
ips="ark0,ark1,ark2,ark3"
# conf/config/run_config.conf config
# run master machine
# Note: list of hosts hostname for deploying master
masters="ark0,ark1"
# run worker machine
# note: list of machine hostnames for deploying workers
workers="ark2,ark3"
# run alert machine
# note: list of machine hostnames for deploying alert server
alertServer="ark3"
# run api machine
# note: list of machine hostnames for deploying api server
apiServers="ark1"
# zk config
# zk root directory
zkRoot="/dolphinscheduler"
# used to record the zk directory of the hanging machine
zkDeadServers="$zkRoot/dead-servers"
# masters directory
zkMasters="$zkRoot/masters"
# workers directory
zkWorkers="$zkRoot/workers"
# zk master distributed lock
mastersLock="$zkRoot/lock/masters"
# zk worker distributed lock
workersLock="$zkRoot/lock/workers"
# zk master fault-tolerant distributed lock
mastersFailover="$zkRoot/lock/failover/masters"
# zk worker fault-tolerant distributed lock
workersFailover="$zkRoot/lock/failover/workers"
# zk master start fault tolerant distributed lock
mastersStartupFailover="$zkRoot/lock/failover/startup-masters"
# zk session timeout
zkSessionTimeout="300"
# zk connection timeout
zkConnectionTimeout="300"
# zk retry interval
zkRetrySleep="100"
# zk retry maximum number of times
zkRetryMaxtime="5"
```

Mail configuration & Excel file path

  • The mail configuration is another common trouble spot. It is worth pulling the DS source and running the MailUtilsTest test class in the alert module (a hedged command is sketched after the block below); a QQ mailbox configuration is given here. For an internal mail server, check whether SSL needs to be disabled and whether the mailUser login should drop the mailbox suffix.
  • For the Excel path, make sure the directory is writable.
```sh
# QQ mailbox configuration
# alert config
# mail protocol
mailProtocol="SMTP"
# mail server host
mailServerHost="smtp.qq.com"
# mail server port
mailServerPort="465"
# sender
mailSender="783xx8369@qq.com"
# user
mailUser="783xx8369@qq.com"
# sender password
mailPassword="your QQ mailbox authorization code"
# TLS mail protocol support
starttlsEnable="false"
sslTrust="smtp.qq.com"
# SSL mail protocol support
# note: The SSL protocol is enabled by default.
# only one of TLS and SSL can be in the true state.
sslEnable="true"
# download excel path
xlsFilePath="/tmp/xls"
# alert port
alertPort=7789
```
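As a hedged example of running that test class from the source tree (the module name and test location are assumptions based on the 1.2.0 layout):

```sh
# from the root of the DolphinScheduler 1.2.0 source tree (illustrative)
# runs the mail utility test to verify the SMTP settings before deploying
mvn -pl dolphinscheduler-alert -am test -Dtest=MailUtilsTest
```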

apiServer configuration

  • For the apiServer, the parameters worth noting are apiServerPort and apiServerContextPath, i.e. the apiserver's port and context path.
```sh
# api config
# api server port
apiServerPort="12345"
# api session timeout
apiServerSessionTimeout="7200"
# api server context path
apiServerContextPath="/dolphinscheduler/"
# spring max file size
springMaxFileSize="1024MB"
# spring max request size
springMaxRequestSize="1024MB"
# api max http post size
apiMaxHttpPostSize="5000000"
```

Resource center & YARN

  • The DS resource center supports HDFS and S3.
  • resUploadStartupType="HDFS" enables HDFS as the resource center.
  • defaultFS: if HDFS is not HA, write the single namenode's address here; if HDFS is HA, copy the cluster's core-site.xml and hdfs-site.xml into the conf directory.
  • yarnHaIps: if YARN HA is enabled, configure the IPs of both resourcemanagers; for a single resourcemanager, set it to an empty string.
  • singleYarnIp: if YARN has a single resourcemanager, configure its IP here.
  • hdfsPath: the root path on HDFS where DS stores resources. The default is fine; if you are upgrading from version 1.1.0, pay attention here and change it to /escheduler.
```sh
# resource Center upload and select storage method:HDFS,S3,NONE
resUploadStartupType="NONE"
# if resUploadStartupType is HDFS,defaultFS write namenode address,HA you need to put core-site.xml and hdfs-site.xml in the conf directory.
# if S3,write S3 address,HA,for example :s3a://dolphinscheduler,
# Note,s3 be sure to create the root directory /dolphinscheduler
defaultFS="hdfs://mycluster:8020"
# if S3 is configured, the following configuration is required.
s3Endpoint="http://192.168.xx.xx:9010"
s3AccessKey="xxxxxxxxxx"
s3SecretKey="xxxxxxxxxx"
# resourcemanager HA configuration, if it is a single resourcemanager, here is yarnHaIps=""
yarnHaIps="192.168.xx.xx,192.168.xx.xx"
# if it is a single resourcemanager, you only need to configure one host name. If it is resourcemanager HA, the default configuration is fine.
singleYarnIp="ark1"
# hdfs root path, the owner of the root path must be the deployment user.
# versions prior to 1.1.0 do not automatically create the hdfs root directory, you need to create it yourself.
hdfsPath="/dolphinscheduler"
# have users who create directory permissions under hdfs root path /
# Note: if kerberos is enabled, hdfsRootUser="" can be used directly.
hdfsRootUser="hdfs"
```

Development state

  • devState can be set to true when deploying to a test environment; for production deployments it is recommended to set it to false.
```sh
# development status, if true, for the SHELL script, you can view the encapsulated SHELL script in the execPath directory.
# If it is false, execute the direct delete
devState="true"
```

Role parameters

  • The parameters below mainly adjust the application.properties configuration of the master, worker and apiserver.
  • apiServerPort lets you change the apiserver port; note that it must stay consistent with the frontend.
  • For the master and worker parameters, keep the defaults for a first deployment and only tune them if performance problems show up at runtime; if possible, load-test your own environment to find the best thread counts for master and worker.
  • worker.reserved.memory is the worker memory threshold and masterReservedMemory is the master memory threshold; 0.1 is the recommended value.
  • It is recommended to leave masterMaxCpuLoadAvg commented out; in DS 1.2.0 the master and worker CPU load thresholds default to (number of CPU threads) * 2.
```sh
# master config
# master execution thread maximum number, maximum parallelism of process instance
masterExecThreads="100"
# the maximum number of master task execution threads, the maximum degree of parallelism for each process instance
masterExecTaskNum="20"
# master heartbeat interval
masterHeartbeatInterval="10"
# master task submission retries
masterTaskCommitRetryTimes="5"
# master task submission retry interval
masterTaskCommitInterval="100"
# master maximum cpu average load, used to determine whether the master has execution capability
#masterMaxCpuLoadAvg="10"
# master reserve memory to determine if the master has execution capability
masterReservedMemory="1"
# master port
masterPort=5566
# worker config
# worker execution thread
workerExecThreads="100"
# worker heartbeat interval
workerHeartbeatInterval="10"
# worker number of fetch tasks
workerFetchTaskNum="3"
# worker reserve memory to determine if the worker has execution capability
workerReservedMemory="1"
# worker port
workerPort=7788
```

Special notes

  • DS only allows tenants to be created after the resource center is enabled, so the resource center configuration must be correct.
  • The old-version problem of having to configure the JDK during deployment has been solved.
  • installPath must not be the same directory as the install.sh used for the one-click installation.
  • All DS tasks depend on the environment variable file in the env directory, so it must be configured correctly.
  • With HDFS high availability, core-site.xml and hdfs-site.xml must be copied into the conf directory.
  • Mind the difference between mailUser and mailSender in the mail configuration.