Reading a Delta Lake table from an S3 bucket


I'm trying to use the delta-rs library to read some Delta tables from an S3 bucket, but I can't get them out of the bucket. Here is a snippet of my code:

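use std::error::Error;

// Import paths assumed for rust-s3 0.26.
use s3::bucket::Bucket;
use s3::creds::Credentials;
use s3::region::Region;

#[tokio::main]
async fn main() -> Result<(), Box<dyn Error>> {
    // AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, REGION and S3_TEST_BUCKET are constants defined elsewhere.
    let credentials = Credentials::new(Some(AWS_ACCESS_KEY_ID), Some(AWS_SECRET_ACCESS_KEY), None, None, None).unwrap();

    let region: Region = REGION.parse().unwrap();

    let bucket = Bucket::new(&S3_TEST_BUCKET, region, credentials).unwrap();

    // URL string produced by rust-s3 for this bucket.
    let url = bucket.url();

    // Try to parse that URL as a Delta table URI.
    match deltalake::parse_uri(&url) {
        Ok(uri) => println!("{}", uri.path()),
        Err(e) => println!("{}", e),
    }

    Ok(())
}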

My problem is that parse_uri seems to need a string in an S3-specific form (an s3:// scheme followed by a bucket and a key), as the library function shows:

pub fn parse_uri<'a>(path: &'a str) -> Result<Uri<'a>, UriError> {
    let parts: Vec<&'a str> = path.split("://").collect();

    if parts.len() == 1 {
        return Ok(Uri::LocalPath(parts[0]));
    }

    match parts[0] {
        "s3" => {
            cfg_if::cfg_if! {
                if #[cfg(any(feature = "s3", feature = "s3-rustls"))] {
                    let mut path_parts = parts[1].splitn(2, '/');
                    let bucket = match path_parts.next() {
                        Some(x) => x,
                        None => {
                            return Err(UriError::MissingObjectBucket);
                        }
                    };
                    let key = match path_parts.next() {
                        Some(x) => x,
                        None => {
                            return Err(UriError::MissingObjectKey);
                        }
                    };

                    Ok(Uri::S3Object(s3::S3Object { bucket, key }))
                } else {
                    Err(UriError::InvalidScheme(String::from(parts[0])))
                }
            }
        }
[...]

However, I can't find a way to get a string in that form from the s3 crate, and my code always ends up in the else clause, which returns InvalidScheme.
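To make the splitting above easier to reason about, here is a minimal standalone sketch of the same logic using only the standard library. It is not the delta-rs implementation: the `Parsed` enum and `split_like_parse_uri` are hypothetical names, and the catch-all arm that reports an invalid scheme is an assumption, since the remaining match arms are elided in the excerpt. It only shows which inputs would reach the `"s3"` branch.

```rust
/// Hypothetical stand-in for the outcomes of the quoted parse_uri.
#[derive(Debug, PartialEq)]
enum Parsed<'a> {
    LocalPath(&'a str),
    S3Object { bucket: &'a str, key: &'a str },
    MissingObjectKey,
    InvalidScheme(&'a str),
}

/// Mirrors the string handling in the quoted excerpt, without cfg_if or deltalake.
fn split_like_parse_uri(path: &str) -> Parsed<'_> {
    let parts: Vec<&str> = path.split("://").collect();

    // No "://" separator: treated as a local path.
    if parts.len() == 1 {
        return Parsed::LocalPath(parts[0]);
    }

    match parts[0] {
        "s3" => {
            // Everything before the first '/' is the bucket, the rest is the key.
            let mut path_parts = parts[1].splitn(2, '/');
            // splitn always yields at least one item, so the fallback is never used.
            let bucket = path_parts.next().unwrap_or("");
            match path_parts.next() {
                Some(key) => Parsed::S3Object { bucket, key },
                None => Parsed::MissingObjectKey,
            }
        }
        // Assumption: any other scheme is rejected (the real arms are elided above).
        other => Parsed::InvalidScheme(other),
    }
}

fn main() {
    // The bucket name and paths below are made-up examples.
    let inputs = [
        "s3://my-bucket/path/to/table",
        "s3://my-bucket",
        "https://my-bucket.s3.amazonaws.com",
        "./local/table",
    ];
    for input in inputs {
        println!("{:?} -> {:?}", input, split_like_parse_uri(input));
    }
}
```

Under these assumptions, only a string of the form `s3://<bucket>/<key>` produces an S3 object. If `bucket.url()` returns an `https://` endpoint URL (worth checking by printing it), its scheme would not match `"s3"`. Separately, the quoted source compiles the `"s3"` branch only when the `s3` or `s3-rustls` feature is enabled, and falls back to the else clause otherwise.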

To connect to s3 I'm using rust-s3 = "0.26.4" and to read delta tables I'm using deltalake = "0.4.0".
