Hadoop Reducer Iterate Twice in JAVA


I have the following input in rvalue:

14
7
39
40

Expected output:

14  
14
7
39
40

7
14
7
39
40
.
.
.

The outer for loop should read the value 14, and the nested for loop should then pair it with every value: 14,14, then 14,7, then 14,39, and so on.

My code below does not work. When I print cache.size(), it prints 1, and when I print the variable i, it also prints 1. What is wrong with it?

public static class Reducerclass  extends Reducer<Text,Text,Text,Text> {
    DateFormat dateFormat = new SimpleDateFormat("yyyy-MM-dd hh:mm:ss a");

    public void reduce(Text rkey, Iterable<Text> rvalue, Context context) 
                                       throws IOException, InterruptedException {            
        ArrayList<String> cache = new ArrayList<String>();

        for(Text value : rvalue) {  
            cache.add(value.toString());
        }   

        int size = cache.size();    
        System.out.println("size-->" + cache.size());

        for (int i = 0; i < size; i++) { 
            System.out.println("  i -->" + i);
            for (int j = 0; j < size; j++) {
                System.out.println("  j -->" + j);
            }
        }
    }
}

There is 1 answer

Sahan Jayasumana On

What is the output of the following?

        for (Text value : rvalue) {
            cache.add(value.toString());
            // Print after adding, so the size counts the current value.
            System.out.println(value.toString() + "," + cache.size());
        }

Let's say it is something like

14,1
7,1
39,1 ...

then those values are not all in the same rvalue. Instead, you are seeing the output of multiple reduce() calls on that dataset, one per distinct key, each receiving a single value. You can verify this by printing the key as well:

System.out.println(rkey.toString() + "," + value.toString() + "," + cache.size());
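The first case can be reproduced without a cluster. The following is a minimal sketch in plain Java, not Hadoop code: `GroupingSketch`, `groupByKey`, and the sample keys are hypothetical names chosen for illustration, and the map only roughly imitates how values are grouped by key before reduce() is called. It shows that when every record has a different key, each group holds one value, so a per-call cache would always report size 1.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration only; none of these names come from Hadoop.
class GroupingSketch {

    // Groups {key, value} records by key, roughly like the values
    // that a single reduce() call would receive for that key.
    static Map<String, List<String>> groupByKey(String[][] records) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (String[] record : records) {
            groups.computeIfAbsent(record[0], k -> new ArrayList<>()).add(record[1]);
        }
        return groups;
    }

    public static void main(String[] args) {
        // Assumed mapper output where every record has its own key.
        String[][] distinctKeys = {{"a", "14"}, {"b", "7"}, {"c", "39"}, {"d", "40"}};
        // Assumed mapper output where all records share one key.
        String[][] sharedKey = {{"k", "14"}, {"k", "7"}, {"k", "39"}, {"k", "40"}};

        // Four groups, each of size 1.
        for (Map.Entry<String, List<String>> e : groupByKey(distinctKeys).entrySet()) {
            System.out.println(e.getKey() + " -> size " + e.getValue().size());
        }
        // One group of size 4.
        for (Map.Entry<String, List<String>> e : groupByKey(sharedKey).entrySet()) {
            System.out.println(e.getKey() + " -> size " + e.getValue().size());
        }
    }
}
```

If the real job behaves like the first dataset, the fix would likely belong in the mapper (emitting a common key for values that should be combined) rather than in the reducer loops.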

However, if it is something like

14,1
7,2
39,3 ...

then the problem is that you are only printing the indices i and j

    System.out.println("  i -->" + i);
    System.out.println("  j -->" + j);

instead of the values in the cache:

System.out.println(cache.get(i) + "," + cache.get(j));
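To see the nested loop over cached values in isolation, here is a small stand-alone sketch. It assumes the cache already holds the four values from the question; `CachePairs` and `pairs` are hypothetical names, and the code deliberately leaves out Hadoop so it can be run directly.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Stand-alone illustration; in the real reducer the cache would be
// filled from rvalue in a first pass, as in the question.
class CachePairs {

    // Builds one "outer,inner" line for every (i, j) combination in the cache.
    static List<String> pairs(List<String> cache) {
        List<String> out = new ArrayList<>();
        int size = cache.size();
        for (int i = 0; i < size; i++) {
            for (int j = 0; j < size; j++) {
                // Read the cached values, not the loop indices.
                out.add(cache.get(i) + "," + cache.get(j));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> cache = Arrays.asList("14", "7", "39", "40");
        // Prints 16 lines, starting with 14,14 then 14,7 then 14,39 then 14,40.
        for (String line : pairs(cache)) {
            System.out.println(line);
        }
    }
}
```

Inside the actual reducer, the same loop body could write each pair with context.write instead of collecting strings, depending on the output format the job needs.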