How to use

The installation of the one-stop platform streampark-console was introduced in detail in the previous chapter. In this chapter, let's see how to quickly deploy and run a job with streampark-console. Projects that follow the official structure and specification, as well as projects developed with StreamPark, are well supported. Let's use streampark-quickstart to quickly start the journey of streampark-console.

streampark-quickstart is a sample program for developing Flink jobs with StreamPark. For details, please refer to:

Deploy DataStream tasks

The following example demonstrates how to deploy a DataStream application.

Deploy the FlinkSql task

The following example demonstrates how to deploy a Flink SQL application.

  • The Flink SQL used in the project demonstration is as follows:

```sql
CREATE TABLE user_log (
  user_id VARCHAR,
  item_id VARCHAR,
  category_id VARCHAR,
  behavior VARCHAR,
  ts TIMESTAMP(3)
) WITH (
  'connector' = 'kafka',  -- use the kafka connector
  'properties.group.id' = 'group01',
  'topic' = 'user_behavior',  -- kafka topic
  'properties.bootstrap.servers' = 'kafka-1:9092,kafka-2:9092,kafka-3:9092',
  'scan.startup.mode' = 'earliest-offset',  -- read from the earliest offset
  'format' = 'json'  -- the data source format is json
);

CREATE TABLE pvuv_sink (
  dt VARCHAR,
  pv BIGINT,
  uv BIGINT,
  PRIMARY KEY (dt, pv, uv) NOT ENFORCED
) WITH (
  'connector' = 'jdbc',  -- use the jdbc connector
  'url' = 'jdbc:mysql://test-mysql:3306/test',  -- jdbc url
  'table-name' = 'pvuv_sink',  -- table name
  'username' = 'root',  -- username
  'password' = '123456',  -- password
  'sink.buffer-flush.max-rows' = '1'  -- default 5000, changed to 1 for demonstration
);

INSERT INTO pvuv_sink
SELECT
  DATE_FORMAT(ts, 'yyyy-MM-dd HH:00') dt,
  COUNT(*) AS pv,
  COUNT(DISTINCT user_id) AS uv
FROM user_log
GROUP BY DATE_FORMAT(ts, 'yyyy-MM-dd HH:00');
```
  • The Maven dependencies used are as follows:

```xml
<dependency>
  <groupId>mysql</groupId>
  <artifactId>mysql-connector-java</artifactId>
  <version>5.1.48</version>
</dependency>
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-sql-connector-kafka_2.11</artifactId>
  <version>1.14.6</version>
</dependency>
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-connector-jdbc_2.11</artifactId>
  <version>1.14.6</version>
</dependency>
```
  • The simulated data sent to Kafka is as follows:

```json
{"user_id": "543462", "item_id": "1715", "category_id": "1464116", "behavior": "pv", "ts": "2021-02-01 01:00:00"}
{"user_id": "662867", "item_id": "2244074", "category_id": "1575622", "behavior": "pv", "ts": "2021-02-01 01:00:00"}
{"user_id": "662867", "item_id": "2244074", "category_id": "1575622", "behavior": "pv", "ts": "2021-02-01 01:00:00"}
{"user_id": "662867", "item_id": "2244074", "category_id": "1575622", "behavior": "learning flink", "ts": "2021-02-01 01:00:00"}
```
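To see what the job should write to pvuv_sink, the pv/uv aggregation from the Flink SQL above can be sketched in plain Python against the four sample records. This is a minimal illustration of the grouping logic only, not part of the StreamPark job; only the fields the aggregation actually reads (user_id and ts) are kept.

```python
from collections import defaultdict

# The four sample records sent to Kafka, reduced to the fields the query uses
records = [
    {"user_id": "543462", "ts": "2021-02-01 01:00:00"},
    {"user_id": "662867", "ts": "2021-02-01 01:00:00"},
    {"user_id": "662867", "ts": "2021-02-01 01:00:00"},
    {"user_id": "662867", "ts": "2021-02-01 01:00:00"},
]

# Mirror of: SELECT DATE_FORMAT(ts, 'yyyy-MM-dd HH:00') dt,
#            COUNT(*) AS pv, COUNT(DISTINCT user_id) AS uv ... GROUP BY dt
agg = defaultdict(lambda: {"pv": 0, "uv": set()})
for r in records:
    dt = r["ts"][:13] + ":00"        # truncate ts to the hour, e.g. "2021-02-01 01:00"
    agg[dt]["pv"] += 1               # COUNT(*): every event counts toward pv
    agg[dt]["uv"].add(r["user_id"])  # COUNT(DISTINCT user_id): unique users per hour

result = {dt: (v["pv"], len(v["uv"])) for dt, v in agg.items()}
print(result)  # {'2021-02-01 01:00': (4, 2)}
```

Note that the query does not filter on behavior, so the "learning flink" record is counted toward pv as well: all four events fall in the same hour bucket, giving pv = 4 and uv = 2.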

Task start process

The task startup flow chart is as follows:

Quick Start - Figure 1

streampark-console task submission process

More tutorials and documentation on project concepts, Development Mode, savepoints, NoteBook, custom jar management, task release, task recovery, parameter configuration, parameter comparison, multi-version management, and more will be continuously updated.