How to ignore a key-value pair in Map-Reduce if the value is blank?

I have a tab-separated input file from which I read two columns in a Map-Reduce job. One column is the key and the other is the value. My requirement: if the value is blank (i.e., it contains only a space, a tab, or any other such character), the key should not be passed to the reducer either. The whole record should be discarded and the job should move on to the next record that has a value. I have written the following code, but it does not work: it emits every record and does not filter anything.

public static class Map extends Mapper<LongWritable, Text, Text,Text> 
    {
        private Text vis = new Text();
        private Text eValue = new Text();
        public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException 
        {
            String line=value.toString();
            String[] arr=line.split("\t");
            vis.set(arr[8]);
            eValue.set(arr[287]);
            if (!eValue.equals("\t") || eValue.equals(" "))
            {
                context.write(vis,eValue);
            }
        }
    }

Any help is appreciated. Thanks in advance.

PS : I am using Hadoop-2.6.0

There are 3 answers

Ramzy

Your design is right, but the if condition does not do what you expect. First work out what value the mapper actually receives when the field is blank. Once the line has been split on '\t', a tab can no longer be present inside any of the individual fields, so a comparison against "\t" can never match. Rethink the if condition with that in mind.
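To see what a blank field looks like after the split, a small standalone check can help. This is only a sketch in plain Java without Hadoop; the class name SplitDemo, the helper fields, and the sample line are made up for illustration. It also shows that String.split drops trailing empty fields unless a negative limit is passed, which matters when the blank column is near the end of the line.

```java
public class SplitDemo {
    // Hypothetical helper: split a tab-separated line and keep trailing
    // empty fields by passing a negative limit.
    static String[] fields(String line) {
        return line.split("\t", -1);
    }

    public static void main(String[] args) {
        // Sample line with four fields: "k1", "", " ", ""
        String line = "k1\t\t \t";

        // Default split: the trailing empty field is dropped.
        System.out.println(line.split("\t").length);   // prints 3

        // Split with limit -1: all four fields are kept.
        String[] all = fields(line);
        System.out.println(all.length);                // prints 4

        // A blank field arrives as "" or " ", never as "\t".
        System.out.println(all[1].isEmpty());          // prints true
        System.out.println(all[2].equals(" "));        // prints true
    }
}
```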

Veera

You could use the statement below instead of multiple checks.

    if (!(eValue.toString().isEmpty()))
    {
        context.write(vis,eValue);
    }

vishnu viswanath

You have to check for one more condition:

    eValue.toString().equals("")

Also, your negation applies only to the \t check. It needs to wrap all the conditions together (if your requirement is to omit all values that are a space, a tab, or empty).

    // Compare the String form: eValue is a Text, and Text.equals()
    // returns false for any argument that is not a Text.
    String v = eValue.toString();
    if (!(v.equals("\t") || v.equals(" ") || v.equals("")))
    {
        context.write(vis,eValue);
    }
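
Putting the answers together, one possible way to write the filter is sketched below. It is plain Java so it can be run without a cluster. The class name BlankValueFilter and the method extract are hypothetical, and treating a record with too few columns as "skip" is an assumption rather than something stated in the question. The reason the original condition lets everything through is that eValue is a Text, and Text.equals returns false whenever its argument is not a Text, so !eValue.equals("\t") is always true.

```java
public class BlankValueFilter {
    // Column indexes taken from the question's mapper.
    static final int KEY_COL = 8;
    static final int VALUE_COL = 287;

    /**
     * Returns {key, value} for a record worth emitting, or null when the
     * record should be skipped (value blank, or line too short).
     */
    static String[] extract(String line, int keyCol, int valueCol) {
        // Limit -1 keeps trailing empty fields, so a blank last column
        // does not shorten the array.
        String[] arr = line.split("\t", -1);
        if (arr.length <= Math.max(keyCol, valueCol)) {
            return null;  // assumption: malformed records are skipped
        }
        String value = arr[valueCol];
        // trim() removes leading and trailing whitespace, so "", " " and
        // runs of spaces all count as blank.
        if (value.trim().isEmpty()) {
            return null;
        }
        return new String[] { arr[keyCol], value };
    }

    public static void main(String[] args) {
        System.out.println(extract("k\tv", 0, 1) != null);   // prints true
        System.out.println(extract("k\t ", 0, 1) != null);   // prints false
        System.out.println(extract("k\t", 0, 1) != null);    // prints false
    }
}
```

Inside map(), the call would look roughly like this: take kv = BlankValueFilter.extract(value.toString(), KEY_COL, VALUE_COL), return early if kv is null, otherwise set vis and eValue from kv[0] and kv[1] and call context.write(vis, eValue).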