Documentation Apache Iceberg is an open table format for huge analytic datasets. Iceberg adds tables to compute engines including Spark, Trino, PrestoDB, Flink, Hive and Impala u...
Introduction Quartz Azkaban Oozie Launching Gobblin in Local Mode Example Config Files Uploading Files to HDFS Adding Gobblin jar Dependencies Launching the Job Launching ...
Introduction Hadoop and S3 The s3a File System The s3 File System Getting Gobblin to Publish to S3 Signing Up For AWS Setting Up EC2 Launching an EC2 Instance EC2 Package I...