Read Dataset from Impala to SAS

68 views Asked by At

In SAS i'm trying to use a proc sql to read a dataset from impala and write a subset of column in my server SAS. I'm using this code:

proc sql; connect to impala (user="XXX" pw="XXX" dsn=impNewPROD database=db_impala); create table SAS as select * from connection to impala ( select a, b, c from impala_table); quit;

The code works well but i have a problem with the column in database SAS. For example, column a in impala table is a CHAR with 16 length while in dataset SAS i see the column is a CHAR 32.767 byte.

Why?

I aspect to have the same format.

Could you suggest a best practice to adopt in proc sql or data step to have the correct results?

Thanks!

I try with put, input and format functions in proc SQL but it doesn't work.

1

There are 1 answers

1
Stu Sztukowski On

From the documentation, this is an expected result of the SAS/ACCESS engine when querying an Impala database.

STRING: Depending on the length of Impala string data, the SAS character format ($32767.) might be unnecessarily large for short STRING columns. Alternatively, the SAS character format ($32767.) might truncate Impala STRING columns that contain more than 32,767 characters.

Once you do your query, you can shrink the column. e.g.

proc sql; 
    connect to impala (user="XXX" pw="XXX" dsn=impNewPROD database=db_impala); 
        create table SAS as  
        select *  
        from connection to impala 
            (select a, b, c from impala_table)
    ; 
quit;

data sas;
    length a $16.;
    set sas;
run;

Alternatively, you can use the squeeze macro to shrink all of the character columns to their minimum required length:

%squeeze(sas, sas);