Devlive Open Source Community. This search took 0.367 seconds and found 865 related results.
  • State Management and Watermarks

    Managing Watermarks in a Job Basics Task Failures Multi-Dataset Jobs Gobblin State Deep Dive State class hierarchy How States are Used in a Gobblin Job This page has two p...
  • Flink Configuration

    Flink Configuration Catalog Configuration A catalog is created and named by executing the following query (replace <catalog_name> with your catalog name and <config_key> =<confi...
  • FAQs

    Gobblin General Questions What is Gobblin? What programming languages does Gobblin support? Does Gobblin require any external software to be installed? What Hadoop versions can ...
  • Structured Streaming

    Spark Structured Streaming Iceberg uses Apache Spark’s DataSourceV2 API for data source and catalog implementations. Spark DSv2 is an evolving API with different levels of support...
  • Metrics Reporting

    Metrics Reporting As of 1.1.0 Iceberg supports the MetricsReporter and the MetricsReport APIs. These two APIs allow expressing different metrics reports while supporting a plu...
  • Schedulers

    Introduction Quartz Azkaban Oozie Launching Gobblin in Local Mode Example Config Files Uploading Files to HDFS Adding Gobblin jar Dependencies Launching the Job Launching ...
  • Flink Connector

    Flink Connector Apache Flink supports creating an Iceberg table directly in Flink SQL without creating an explicit Flink catalog. That means we can just create an Iceberg table by s...
  • Apache Spark

    Querying from Apache Spark To read an Apache XTable™ (Incubating) synced target table (regardless of the table format) in Apache Spark locally or on services like Amazon EMR, Goog...
  • Partitioning

    Partitioning What is partitioning? Partitioning is a way to make queries faster by grouping similar rows together when writing. For example, queries for log entries from a logs ...