Apache Hudi provides the ability to post a callback notification about a write commit. This may be valuable if you need an event notification stream to take actions with other services after a Hudi write commit. You can push a write commit callback notification into HTTP endpoints or to a Kafka server.

HTTP Endpoints

You can push a commit notification to an HTTP URL and can specify custom values by extending a callback class defined below.

Config Description Required Default
TURN_CALLBACK_ON Turn commit callback on/off optional false (callbacks off)
CALLBACK_HTTP_URL Callback host to be sent along with callback messages required N/A
CALLBACK_HTTP_TIMEOUT_IN_SECONDS Callback timeout in seconds optional 3
CALLBACK_CLASS_NAME Full path of callback class and must be a subclass of HoodieWriteCommitCallback class, org.apache.hudi.callback.impl.HoodieWriteCommitHttpCallback by default optional org.apache.hudi.callback.impl.HoodieWriteCommitHttpCallback
CALLBACK_HTTP_API_KEY_VALUE Http callback API key optional hudi_write_commit_http_callback

Kafka Endpoints

You can push a commit notification to a Kafka topic so it can be used by other real time systems.

Config Description Required Default
TOPIC Kafka topic name to publish timeline activity into. required N/A
PARTITION It may be desirable to serialize all changes into a single Kafka partition for providing strict ordering. By default, Kafka messages are keyed by table name, which guarantees ordering at the table level, but not globally (or when new partitions are added) required N/A
RETRIES Times to retry the produce optional 3
ACKS kafka acks level, all by default to ensure strong durability optional all
BOOTSTRAP_SERVERS Bootstrap servers of kafka cluster, to be used for publishing commit metadata required N/A

Pulsar Endpoints

You can push a commit notification to a Pulsar topic so it can be used by other real time systems.

Config Description Required Default
hoodie.write.commit.callback.pulsar.broker.service.url Server’s Url of pulsar cluster to use to publish commit metadata. required N/A
hoodie.write.commit.callback.pulsar.topic Pulsar topic name to publish timeline activity into required N/A
hoodie.write.commit.callback.pulsar.producer.route-mode Message routing logic for producers on partitioned topics. optional RoundRobinPartition
hoodie.write.commit.callback.pulsar.producer.pending-queue-size The maximum size of a queue holding pending messages. optional 1000
hoodie.write.commit.callback.pulsar.producer.pending-total-size The maximum number of pending messages across partitions. required 50000
hoodie.write.commit.callback.pulsar.producer.block-if-queue-full When the queue is full, the method is blocked instead of an exception is thrown. optional true
hoodie.write.commit.callback.pulsar.producer.send-timeout The timeout in each sending to pulsar. optional 30s
hoodie.write.commit.callback.pulsar.operation-timeout Duration of waiting for completing an operation. optional 30s
hoodie.write.commit.callback.pulsar.connection-timeout Duration of waiting for a connection to a broker to be established. optional 10s
hoodie.write.commit.callback.pulsar.request-timeout Duration of waiting for completing a request. optional 60s
hoodie.write.commit.callback.pulsar.keepalive-interval Duration of keeping alive interval for each client broker connection. optional 30s

Bring your own implementation

You can extend the HoodieWriteCommitCallback class to implement your own way to asynchronously handle the callback of a successful write. Use this public API:

https://github.com/apache/hudi/blob/master/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/callback/HoodieWriteCommitCallback.java