Using JSON objects (with schema maintained in the registry) instead of a string or hash as the message key in Kafka


Folks,

I am wondering whether there are any advantages or disadvantages to using JSON objects, with their schema maintained in the Schema Registry, as message keys.

Assume I have two event streams, each coupled with a table in a source system. Each of those tables has its own business key.

  1. The Product table has product_number and base_product_id as its business key (a composite key) in the source system (key: product_number, base_product_id).

  2. The Invoice table has invoice_id as its business key, plus prod_id and prod_num as a composite foreign key that points to the Product table in the source system.

I’d like to enrich my Invoice stream with records from a GlobalKTable built on top of my Product event stream, in a Kafka Streams application, by applying an inner join between the two.
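For context, the enrichment described above could be sketched roughly like this (topic names, the Product/Invoice types, and the EnrichedInvoice helper are my assumptions, not from the question; this is an API sketch that needs the kafka-streams dependency, not a runnable program):

```java
// Sketch only: "products", "invoices", "invoices-enriched", Product, Invoice,
// and EnrichedInvoice are assumed names for illustration.
StreamsBuilder builder = new StreamsBuilder();

// GlobalKTable keyed by the product business key
GlobalKTable<String, Product> products = builder.globalTable("products");

KStream<String, Invoice> invoices = builder.stream("invoices");

invoices
    // For a KStream-GlobalKTable join, the lookup key is derived from the
    // invoice record itself via a KeyValueMapper, so it must be constructed
    // exactly the same way (same fields, same order, same serializer) as the
    // key used on the products topic.
    .join(products,
          (invoiceKey, invoice) -> invoice.getProdId() + "|" + invoice.getProdNum(),
          (invoice, product) -> new EnrichedInvoice(invoice, product))
    .to("invoices-enriched");
```

One detail worth noting: because the GlobalKTable join takes a KeyValueMapper, the Invoice topic's own message key does not have to match the Product key; only the derived lookup key does.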

I can think of three ways to configure my data producers to assign keys:

1. Concatenate the values of the key fields, e.g. key=valueOf(prod_id).concat(prod_num)

2. Define the key as a JSON object with its schema maintained in the Schema Registry, e.g. key={"prod_id": "AAM64", "prod_num": "334"}

3. Use a hash function to construct the key, e.g. key=hashFunction(valueOf(prod_id).concat(prod_num))
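The three options could be sketched in plain Java as follows (the delimiter character and the choice of SHA-256 are my assumptions; a real producer for option 2 would use a Schema Registry-aware serializer rather than a hand-built string):

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class KeyStrategies {

    // Option 1: plain concatenation. Note that without a delimiter,
    // ("AA", "M64334") and ("AAM64", "334") would collide on "AAM64334",
    // so a separator that cannot occur inside the fields is safer.
    static String concatKey(String prodId, String prodNum) {
        return prodId + "|" + prodNum;
    }

    // Option 2 (illustration only): a JSON-shaped key rendered as a string.
    static String jsonKey(String prodId, String prodNum) {
        return String.format("{\"prod_id\": \"%s\", \"prod_num\": \"%s\"}",
                             prodId, prodNum);
    }

    // Option 3: hash of the concatenated business key (SHA-256 assumed here).
    static String hashedKey(String prodId, String prodNum) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(
                    concatKey(prodId, prodNum).getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) hex.append(String.format("%02x", b));
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(concatKey("AAM64", "334"));  // AAM64|334
        System.out.println(jsonKey("AAM64", "334"));
        System.out.println(hashedKey("AAM64", "334"));  // 64 hex chars, deterministic
    }
}
```

Option 3 trades readability for a fixed-length key: the original business-key values can no longer be recovered from the message key, which also makes ad-hoc debugging harder.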

Option 2 enforces the same structure and field order for the key in both my Product and Invoice event streams, since field names are part of the key; the join condition will fail if the field names (or their order) do not match, because keys are compared on their serialized bytes.
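A minimal sketch of why this matters (the literal JSON strings below are illustrative): two JSON keys that are semantically equal but serialized with fields in a different order produce different byte arrays, so a byte-level key lookup misses.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class KeyBytesDemo {
    public static void main(String[] args) {
        // Semantically the same product, but as serialized bytes
        // (which is how key lookups compare keys) they differ.
        byte[] a = "{\"prod_id\": \"AAM64\", \"prod_num\": \"334\"}"
                .getBytes(StandardCharsets.UTF_8);
        byte[] b = "{\"prod_num\": \"334\", \"prod_id\": \"AAM64\"}"
                .getBytes(StandardCharsets.UTF_8);
        System.out.println(Arrays.equals(a, b)); // false: the join would not match
    }
}
```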

Any recommendation as to which approach makes sense would be highly appreciated.

I found concatenation the easiest, but I am not sure it is the best.
