Kafka and IIDR CDC


I am trying to build a CDC pipeline: DB2 -> IBM CDC -> Kafka, and I am trying to figure out the right way to set it up. So far I have done the following:

1. Set up a 3-node Kafka cluster on Linux, on premises.

2. Installed the IIDR CDC software on Linux, on premises, using the setup-iidr-11.4.0.1-5085-linux-x86.bin installer. The CDC instance is up and running (a rough sketch of these steps is below).
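For reference, the engine-side setup looks roughly like this. The install path and instance name are placeholders, and `dmconfigurets` (interactive instance configuration) and `dmts64` (instance start) are the CDC engine tools as I understand them; verify the exact names against the bin directory of your install.

```sh
# Run the installer (file name from the question)
./setup-iidr-11.4.0.1-5085-linux-x86.bin

# From the CDC install's bin directory (location is an assumption),
# create and configure an instance interactively
cd /opt/ibm/cdc-kafka/bin
./dmconfigurets

# Start the instance ("CDC_KAFKA" is a placeholder name)
./dmts64 -I CDC_KAFKA
```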

The various online docs suggest installing the IIDR Management Console to configure the source datastore and the CDC server, and to set up the Kafka subscription that completes the pipeline.

I do not currently have the Management Console installed. A few questions on this:

1. Is there any alternative to the IBM CDC Management Console for setting up the CDC-to-Kafka pipeline?

2. How can I get the IIDR Management Console? And if we install it on a local Windows desktop and connect to the CDC/Kafka servers, which are on remote Linux machines, will it work?

3. Is there any other method to set up data ingestion from IIDR CDC into Kafka?

I am fairly new to CDC/IIDR, please help!


Answer by Shawn (accepted):

I own the development of the IIDR Kafka target for our CDC Replication product.

Management Console is the best way to set up the subscription initially. You can install it on a Windows box.

Technically, I believe you can also use our scripting language, CHCCLP, to set up a subscription (a rough sketch follows below), but I recommend using the GUI.
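As a hedged illustration only, a minimal CHCCLP session might look something like the following. The host, port, credentials, datastore names, subscription name, and table names are all placeholders, and the exact command set varies by version, so check the CHCCLP reference in the Knowledge Center before relying on this.

```sh
# Pipe a script into the CHCCLP CLI (chcclp ships with Access Server /
# Management Console). All names and credentials below are placeholders.
chcclp << 'EOF'
chcclp session set to cli;

# Connect to Access Server, then to the source and target datastores
connect server hostname "accessserver.example.com" port "10101" username "admin" password "passw0rd";
connect datastore name "DB2_SOURCE" context source;
connect datastore name "KAFKA_TARGET" context target;

# Create a subscription and map a source table
add subscription name "DB2TOKAFKA";
add table mapping sourceSchema "MYSCHEMA" sourceTable "MYTABLE";

# Begin continuous replication for the subscription
start mirroring;

disconnect server;
exit;
EOF
```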

Here are links to our resources on the IIDR (CDC) Kafka target; search for the "Kafka" section.

"https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/W8d78486eafb9_4a06_a482_7e7962f5ac59/page/IIDR%20Wiki"

This video shows an example of setting up a subscription and replicating:

https://ibm.box.com/s/ur8jokg6tclsx5fcav5g86a3n57mqtd5

Management Console and Access Server can be obtained from IBM Fix Central.

I have installed MC/Access Server on a VM and on my personal Windows box and used them against my Linux VMs. You will need network connectivity between the machines, of course.

You can definitely follow up with our Support team and they'll be able to sort you out. We also have Management Console docs in our Knowledge Center, starting here: https://www.ibm.com/support/knowledgecenter/en/SSTRGZ_11.4.0/com.ibm.cdcdoc.mcadminguide.doc/concepts/overview_of_cdc.html

You'll find our Kafka target is very flexible: it ships with five different formats for writing data into Kafka, and you can choose to capture data in an audit format, or use the Kafka compaction-compatible convention, where a delete is written as the record's key with a null value.
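If you want to see that key/null-value delete convention on the wire, the stock Kafka console consumer can print keys next to values (the broker address and topic name below are placeholders):

```sh
# A delete shows up as the row's key followed by a null value (a tombstone)
bin/kafka-console-consumer.sh \
  --bootstrap-server broker1:9092 \
  --topic MYSCHEMA.MYTABLE \
  --property print.key=true \
  --from-beginning
```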

Additionally, you can use the product to write several records, in several different formats, to several different topics from a single insert operation. This is useful if some of your consumer apps want JSON and others want Avro binary. You can also use this to put all of the data on more tightly secured topics while writing only a subset of it to topics that more people have access to.

We even have customers who encrypt columns in flight when replicating.

Finally, the product's transformations can be parallelized even if you choose to use only one producer to write out the data.

One more thing: we also provide a special consumer, which we call the transactionally consistent consumer, that restores database ACID semantics for data written into Kafka and spread across topics and partitions. It re-orders the operations, provides operation order and bookmarks for restarting applications, and allows parallel, yet ordered, exactly-once, de-duplicated consumption of the data.

From my talk at the Kafka Summit...

https://www.confluent.io/kafka-summit-sf18/a-solution-for-leveraging-kafka-to-provide-end-to-end-acid-transactions