Pig: Illustrate error 2997

The code below works fine and produces results in Grunt (local mode), except that ILLUSTRATE on the last relation fails with error 2997.

/* Open Grunt in local mode: pig -x local */

STOCK_A = LOAD '/media/sf_sand/NYSE_daily_prices_A.csv' USING PigStorage(',') AS (exchange:chararray,symbol:chararray,date:chararray,open:float,high:float,low:float,close:float,volume:int,adj_close:float);
describe STOCK_A;
illustrate STOCK_A;
b = LIMIT STOCK_A 100;
describe b;
illustrate b;
c = FOREACH b GENERATE *;
illustrate c;  -- working
c = FOREACH b GENERATE symbol,date,close;
dump c;  -- working

After that, illustrate c; does not work. The error (ERROR 2997: Encountered IOException) is below:

2015-06-10 11:52:23,621 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///
2015-06-10 11:52:23,647 [main] WARN  org.apache.pig.data.SchemaTupleBackend - SchemaTupleBackend has already been initialized
2015-06-10 11:52:23,647 [main] INFO  org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[ConstantCalculator, LoadTypeCastInserter, PredicatePushdownOptimizer, StreamTypeCastInserter], RULES_DISABLED=[AddForEach, ColumnMapKeyPrune, GroupByConstParallelSetter, LimitOptimizer, MergeFilter, MergeForEach, PartitionFilterOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter]}
2015-06-10 11:52:23,650 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2015-06-10 11:52:23,650 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2015-06-10 11:52:23,650 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2015-06-10 11:52:23,651 [main] INFO  org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig script settings are added to the job
2015-06-10 11:52:23,651 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2015-06-10 11:52:23,658 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Key [pig.schematuple] is false, will not generate code.
2015-06-10 11:52:23,658 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Starting process to move generated code to distributed cacche
2015-06-10 11:52:23,658 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Distributed cache not supported or needed in local mode. Setting key [pig.schematuple.local.dir] with code temp directory: /tmp/1433937143658-0
2015-06-10 11:52:23,667 [main] INFO  org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
2015-06-10 11:52:23,669 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map - Aliases being processed per job phase (AliasName[line,offset]): M: STOCK_A[3,9] C:  R:
2015-06-10 11:52:23,672 [main] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2015-06-10 11:52:23,672 [main] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2015-06-10 11:52:23,705 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2015-06-10 11:52:23,707 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2015-06-10 11:52:23,707 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2015-06-10 11:52:23,708 [main] INFO  org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig script settings are added to the job
2015-06-10 11:52:23,708 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2015-06-10 11:52:23,708 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Reduce phase detected, estimating # of required reducers.
2015-06-10 11:52:23,709 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting Parallelism to 1
2015-06-10 11:52:23,723 [main] WARN  org.apache.pig.data.SchemaTupleBackend - SchemaTupleBackend has already been initialized
2015-06-10 11:52:23,727 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Map - Aliases being processed per job phase (AliasName[line,offset]): M: STOCK_A[3,9],STOCK_A[-1,-1],c[8,3] C:  R: b[4,3]
2015-06-10 11:52:23,727 [main] WARN  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigHadoopLogger - org.apache.pig.builtin.Utf8StorageConverter(FIELD_DISCARDED_TYPE_CONVERSION_FAILED): Unable to interpret value [115, 116, 111, 99, 107, 95, 112, 114, 105, 99, 101, 95, 111, 112, 101, 110] in field being converted to float, caught NumberFormatException <For input string: "stock_price_open"> field discarded
2015-06-10 11:52:23,727 [main] WARN  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigHadoopLogger - org.apache.pig.builtin.Utf8StorageConverter(FIELD_DISCARDED_TYPE_CONVERSION_FAILED): Unable to interpret value [115, 116, 111, 99, 107, 95, 112, 114, 105, 99, 101, 95, 104, 105, 103, 104] in field being converted to float, caught NumberFormatException <For input string: "stock_price_high"> field discarded
2015-06-10 11:52:23,727 [main] WARN  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigHadoopLogger - org.apache.pig.builtin.Utf8StorageConverter(FIELD_DISCARDED_TYPE_CONVERSION_FAILED): Unable to interpret value [115, 116, 111, 99, 107, 95, 112, 114, 105, 99, 101, 95, 108, 111, 119] in field being converted to float, caught NumberFormatException <For input string: "stock_price_low"> field discarded
2015-06-10 11:52:23,727 [main] WARN  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigHadoopLogger - org.apache.pig.builtin.Utf8StorageConverter(FIELD_DISCARDED_TYPE_CONVERSION_FAILED): Unable to interpret value [115, 116, 111, 99, 107, 95, 112, 114, 105, 99, 101, 95, 99, 108, 111, 115, 101] in field being converted to float, caught NumberFormatException <For input string: "stock_price_close"> field discarded
2015-06-10 11:52:23,727 [main] WARN  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigHadoopLogger - org.apache.pig.builtin.Utf8StorageConverter(FIELD_DISCARDED_TYPE_CONVERSION_FAILED): Unable to interpret value [115, 116, 111, 99, 107, 95, 118, 111, 108, 117, 109, 101] in field being converted to int, caught NumberFormatException <For input string: "stock_volume"> field discarded
2015-06-10 11:52:23,727 [main] WARN  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigHadoopLogger - org.apache.pig.builtin.Utf8StorageConverter(FIELD_DISCARDED_TYPE_CONVERSION_FAILED): Unable to interpret value [115, 116, 111, 99, 107, 95, 112, 114, 105, 99, 101, 95, 97, 100, 106, 95, 99, 108, 111, 115, 101] in field being converted to float, caught NumberFormatException <For input string: "stock_price_adj_close"> field discarded
java.lang.ClassCastException
2015-06-10 11:52:23,727 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2997: Encountered IOException. Exception
1 answer

Answer by glefait

The last lines of your log contain the following error:

Unable to interpret value [115, 116, 111, 99, 107, 95, 112, 114, 105, 99, 101, 95, 97, 100, 106, 95, 99, 108, 111, 115, 101] in field being converted to float, caught NumberFormatException <For input string: "stock_price_adj_close"> field discarded
java.lang.ClassCastException

Could you provide a sample of your CSV file? I think even STOCK_A is not okay.
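One way to see why STOCK_A looks suspicious: the bracketed numbers in the warning are the raw bytes Pig failed to convert. The short Python sketch below decodes the bytes from the final warning. Python is used here only for illustration and is not part of the original script. The decoded text matches the "For input string" value in the question's log, which suggests, though only a sample of the file would confirm it, that the file's header row is being parsed as data.

```python
# Byte values copied from the last WARN line of the log in the question.
raw = [115, 116, 111, 99, 107, 95, 112, 114, 105, 99, 101, 95,
       97, 100, 106, 95, 99, 108, 111, 115, 101]

# Interpret the values as ASCII bytes to see the text Pig was trying to cast.
text = bytes(raw).decode("ascii")
print(text)  # stock_price_adj_close

# A column label is not a number, so a float conversion fails on it.
try:
    float(text)
except ValueError as err:
    print("not a float:", err)
```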

You could also LIMIT the input to a few lines and show the results of DESCRIBE and DUMP on those lines.
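The same kind of check can be done outside Pig. The sketch below is a rough Python helper (the function name and the sample rows are hypothetical) that reads the first few lines of a CSV and reports which rows have numeric columns that fail to parse. The column positions are taken from the schema in the question's LOAD statement. The last six header names come from the log, while the first three header names and the data row are made up for illustration.

```python
import csv
import io

# Column positions follow the schema in the question's LOAD statement:
# open, high, low, close and adj_close are floats; volume is an int.
FLOAT_COLS = (3, 4, 5, 6, 8)
INT_COLS = (7,)

def rows_with_bad_numbers(lines, limit=5):
    """Return 1-based numbers of rows, among the first `limit`,
    whose numeric fields fail to parse."""
    bad = []
    for number, row in enumerate(csv.reader(lines), start=1):
        if number > limit:
            break
        try:
            for i in FLOAT_COLS:
                float(row[i])
            for i in INT_COLS:
                int(row[i])
        except (ValueError, IndexError):
            bad.append(number)
    return bad

# Hypothetical two-line sample: a header row followed by one invented data row.
sample = io.StringIO(
    "exchange,stock_symbol,date,stock_price_open,stock_price_high,"
    "stock_price_low,stock_price_close,stock_volume,stock_price_adj_close\n"
    "NYSE,XYZ,2010-01-04,10.0,10.5,9.5,10.2,1000,10.2\n"
)
print(rows_with_bad_numbers(sample))  # [1]
```

To run it against the real file, pass an open file handle instead of the in-memory sample. If only row 1 is reported, the header row is the likely culprit.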