I am trying to run a Flink job using a file from HDFS. I have created a DataSet as follows:
DataSource<Tuple2<LongWritable, Text>> visits = env.readHadoopFile(new TextInputFormat(), LongWritable.class, Text.class, Config.pathToVisits());
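For reference, here is roughly the whole job as I have it. This is a minimal sketch: the HDFS paths are placeholders inlined in place of my Config.pathToVisits() helper, and the class name is illustrative.

import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.operators.DataSource;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.TextInputFormat;

public class ReadVisitsJob {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // Read the HDFS file through the Hadoop input format wrapper
        DataSource<Tuple2<LongWritable, Text>> visits =
                env.readHadoopFile(new TextInputFormat(), LongWritable.class, Text.class,
                        "hdfs://namenode:9000/path/to/visits");  // placeholder path

        // Write the lines back out so the job has a sink
        visits.writeAsText("hdfs://namenode:9000/path/to/output");  // placeholder path

        env.execute("Read visits from HDFS");
    }
}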
I am using Flink's latest version, 0.9.0-milestone-1-hadoop1 (I have also tried 0.9.0-milestone-1), whereas my Hadoop version is 2.6.0.
But I get the following exception when I try to execute the job. I have searched for similar problems, and it appears to be related to a version incompatibility between the client and HDFS.
Exception in thread "main" org.apache.hadoop.ipc.RemoteException: Server IPC version 9 cannot communicate with client version 4
at org.apache.hadoop.ipc.Client.call(Client.java:1113)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
at com.sun.proxy.$Proxy5.getProtocolVersion(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
Can you please let me know what changes I should make in my pom so that it points to the correct Hadoop/HDFS version, or whether changes are needed elsewhere? Or do I need to downgrade my Hadoop installation?
The exception "Server IPC version 9 cannot communicate with client version 4" means the job is using the Hadoop 1.x client libraries to talk to a Hadoop 2.x HDFS. Have you tried the Hadoop 2 build of Flink? Have a look at the downloads page. There is a build called
flink-0.9.0-milestone-1-bin-hadoop2.tgz
that should work with Hadoop 2.
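If you are pulling Flink in through Maven rather than (or in addition to) the binary distribution, the same applies to your pom: the Flink artifacts on the job's classpath have to be the ones built against Hadoop 2, not Hadoop 1. A rough sketch of what that section could look like follows; the exact version string for the Hadoop 2 artifacts is an assumption on my part, so double-check it against the downloads page and the Maven setup instructions for this release.

<properties>
    <!-- Assumed to be the Hadoop-2 variant of the milestone release; verify on the downloads page -->
    <flink.version>0.9.0-milestone-1</flink.version>
</properties>

<dependencies>
    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-java</artifactId>
        <version>${flink.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-clients</artifactId>
        <version>${flink.version}</version>
    </dependency>
</dependencies>

Also make sure no Hadoop 1 jars (for example a hadoop-core 1.x dependency) are left on the job's classpath, since that is exactly what produces the "client version 4" side of the error.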