Introduction Docker Docker Repositories Run the docker image with simple wikipedia jobs Use Gobblin Standalone on Docker for Kafka and HDFS Ingestion Run Gobblin as a Service ...
RisingWave RisingWave is a Postgres-compatible SQL database designed for real-time event streaming data processing, analysis, and management. It can ingest millions of events per...
Querying from Amazon Athena To read an Apache XTable™ (Incubating) synced target table (regardless of the table format) in Amazon Athena, you can create the table either by: Usi...
Iceberg JDBC Integration JDBC Catalog Iceberg supports using a table in a relational database to manage Iceberg tables through JDBC. The database that JDBC connects to must suppo...
Overview The job configuration template is implemented fo...
Schemas Iceberg tables support the following types: Type Description Notes boolean True or false int 32-bit signed integers Can promote to long long ...
Introduction The Gobblin Compliance module allows for data purging to meet regulatory compliance requirements. The module in...
Rewrite files action Iceberg provides an API to rewrite small files into large files by submitting Flink batch jobs. The behavior of this Flink action is the same as Spark’s rewriteD...
Introduction Hadoop and S3 The s3a File System The s3 File System Getting Gobblin to Publish to S3 Signing Up For AWS Setting Up EC2 Launching an EC2 Instance EC2 Package I...