The count changes for every run for the same set of files. The following code is still not data consistent. How to make thread safe? Simple word count code.
package ConcurrentHashMapDemo;
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
class FileReaderTask implements Runnable {
private String filePath;
private String fileName;
private ConcurrentMap<String, Integer> wordCountMap;
public FileReaderTask(String filePath, String fileName,
ConcurrentMap<String, Integer> wordCountMap) {
this.filePath = filePath;
this.fileName = fileName;
this.wordCountMap = wordCountMap;
}
public void run() {
File jobFile = new File(filePath + fileName);
try {
BufferedReader bReader = new BufferedReader(new FileReader(jobFile));
String line = "";
while ((line = bReader.readLine()) != null) {
String[] strArray = line.split(" ");
for (String str : strArray) {
if (wordCountMap.containsKey(str)) {
wordCountMap.replace (str.trim(),
wordCountMap.get(str.trim()) + 1);
} else {
wordCountMap.putIfAbsent(str.trim(), 1);
}
}
}
//Thread.sleep(10000);
} catch (Exception e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
}
public class Main {
public static void main(String[] args) {
ConcurrentMap<String, Integer> wordCountMap = new ConcurrentHashMap<String, Integer>();
File fileDir = new File("c://job_files");
Thread[] threads = new Thread[fileDir.listFiles().length];
for(int i=0;i<threads.length;i++){
FileReaderTask frt = new FileReaderTask("c:/job_files/", fileDir.listFiles()[i].getName(), wordCountMap);
threads[i]= new Thread(frt);
threads[i].start();
}
//
for(int i=0;i<threads.length;i++){
try {
threads[i].join();
} catch (InterruptedException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
for(Map.Entry<String, Integer> entry: wordCountMap.entrySet()){
String key = entry.getKey();
System.out.println(key +" - - "+wordCountMap.get(key));
}
System.out.println("Main");
}
}
The concurrent containers ensure internal consistency (for example not adding the same key twice), but they do nothing to protect the stored values. Your code as it stands now has a race condition. Another thread can increment the counter between your call to
get
and your call toreplace
. Thereplace
then puts the wrong value in the map, losing the increment performed by the other thread.You need to make your increment atomic. Something like this, which uses the version of
replace
which ensures the value in the map is still the same before peforming the replacement: