How to delete customer information from HDFS

Question

38 views Asked by Manoj Kumar Dhakad At 22 April 2020 at 06:21

Suppose I have several customers and I store their information, such as customer_id, customer_name, customer_emailid, etc. If a customer leaves and wants their personal information removed from my HDFS, how can I delete it?

I have the two approaches below to achieve this.

Approach 1:

1. Create an internal table on top of HDFS

2. Create an external table from the first table using filter logic

3. While creating the second table, apply UDFs on specific columns for further column-level filtering

Approach 2:

Spark: read, filter, write

Is there any other solution?

There is 1 answer

**leftjoin** · Answer 1 · 2020-04-22T07:04:38+00:00

Approach 2 is also possible in Hive: select, filter, write.

Create a table on top of the directory in HDFS (external or managed does not matter in this context; external is better if you are going to drop the table later and keep the data as is). Then insert overwrite the table or partition from a select with a filter.
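The table definition for the first step might look like the following minimal sketch. The table name `mytable` matches the query in this answer and the column names come from the question; the column types, the delimiter, the file format, and the location `/data/customers` are assumptions for illustration only.

```sql
-- Hypothetical layout: column types, delimiter, format, and LOCATION are assumptions.
CREATE EXTERNAL TABLE IF NOT EXISTS mytable (
  customer_id      BIGINT,
  customer_name    STRING,
  customer_emailid STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/data/customers';  -- existing HDFS directory that holds the customer files
```

Because the table is external, dropping it later removes only the metadata and leaves the files in that directory untouched.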

insert overwrite table mytable
select *
  from mytable --the same table
 where customer_id not in (...) --filter rows
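If the data is partitioned, the same pattern can probably be limited to the partitions that contain the customer, so the whole dataset does not have to be rewritten. A minimal sketch, assuming a hypothetical table `customers_part` partitioned by `load_date`; the table name, column types, location, partition value, and customer ids are all made up for illustration:

```sql
-- Hypothetical table: name, types, partition column, and LOCATION are assumptions.
CREATE EXTERNAL TABLE IF NOT EXISTS customers_part (
  customer_id      BIGINT,
  customer_name    STRING,
  customer_emailid STRING
)
PARTITIONED BY (load_date STRING)
STORED AS ORC
LOCATION '/data/customers_part';

-- Rewrite a single partition without the departing customers.
-- With a static partition spec, the partition column is not listed in the SELECT.
INSERT OVERWRITE TABLE customers_part PARTITION (load_date = '2020-04-22')
SELECT customer_id, customer_name, customer_emailid
  FROM customers_part
 WHERE load_date = '2020-04-22'
   AND customer_id NOT IN (101, 102);  -- example ids of customers to remove
```

Two things are worth checking before relying on this. `NOT IN` also drops rows whose `customer_id` is NULL, so add `OR customer_id IS NULL` if those rows should stay. The replaced files may also remain in the HDFS trash or in snapshots for a while, which matters if the data must be physically gone.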
