Count the categories from particular one column in using ADF(azure data factory)

Question

Count the categories from particular one column in using ADF(azure data factory)

43 views Asked by Gaurav Parmar At 20 March 2024 at 12:56

I have data in which it has one column named category. Categories are chair and table.

But Issue is that I want to to perform count for table,chair and two more categories bowl and plat which is not present in my data.
I want to get count for those all 4 categories from data. If not found that category it should write zero for its count. So how to do that in adf?

I have tried to do filtering, derived column but derived column give me additional column but i want it at row level.

Original Q&A

There are 1 answers

**Rakesh Govindula** · Accepted Answer · 2024-03-20T14:50:32+00:00

As the remaining values doesn't exists in the column, Dataflow can't give the count for those.

So, you can follow below approach to achieve your requirement. In this approach, you need an array containing values of all categories like ['chair','table','bowl','plat'].

This is starting data that I am taking in the dataflow source:

category
chair
table
chair
table
table
chair
chair
table

First use an aggregate transformation to get the category and count columns.

Give category column in the Group by section and create a new column count with count(category) expression in the aggregate section like below.

enter image description here

It will give the result with existing values.

enter image description here

Now, take another aggregate transformation after this. In this, leave group by section empty and in the aggregate section, create a column for an array of existing categories using collect(category) expression.

enter image description here

It will produce below results.

enter image description here

Next, take a derived column transformation and create two columns using the above original array column and the array of all categories.

rem_items - toString(unfold(except(['chair','table','bowl','plat'],original)))

count2 - 0

enter image description here

This will first generate an array of non-existing categories and unfolds that array as rows. Those will be converted to string type column like below. Coming to count2 columns, whatever the values that are coming from this transformation, the count for those will be 0 only.

enter image description here

Now, use a select transformation and remove the extra original column.

enter image description here

As a result, select transformation will only contains two columns.

Now, create a New branch after the aggregate1 transformation and add a Union transformation to it. Use the above select1 transformation as another input and do the union by position.

enter image description here

This will give the desired result.

enter image description here

Add your sink to the union transformation and you can run the dataflow.

TechQA.

Count the categories from particular one column in using ADF(azure data factory)

There are 1 answers

Related Questions in AZURE

Related Questions in AZURE-DATA-FACTORY

Related Questions in EXPRESSIONBUILDER

Popular Questions

Trending Questions