I am trying to create an external Hive table over a DynamoDB table on AWS EMR, using PySpark. The query works fine when I execute it at the Hive prompt, but fails when I run it as a PySpark job. The code is below:

from pyspark.sql import SparkSession

# Hive support is needed so Spark can reach the Hive metastore
spark = SparkSession.builder.enableHiveSupport().getOrCreate()
spark.sql('use ash_data')

spark.sql(
    """
    CREATE EXTERNAL TABLE dummyTable
        (item MAP<STRING, STRING>)
    STORED BY 'org.apache.hadoop.hive.dynamodb.DynamoDBStorageHandler'
    TBLPROPERTIES ("dynamodb.table.name" = "testdynamodb")
    """
)

It keeps giving me the following error:

pyspark.sql.utils.ParseException:
Operation not allowed: STORED BY(line 4, pos 4)

== SQL ==

    CREATE EXTERNAL TABLE dummyTable
        (item MAP<STRING, STRING>)
    STORED BY 'org.apache.hadoop.hive.dynamodb.DynamoDBStorageHandler'
----^^^
    TBLPROPERTIES ("dynamodb.table.name" = "testdynamodb")
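Since the identical statement succeeds at the Hive prompt, the failure seems to come from Spark's SQL parser itself, before anything reaches DynamoDB. As a workaround I have been experimenting with running the same DDL through the Hive CLI from inside the job. This is a minimal sketch, assuming hive is on the PATH of the EMR master node:

import subprocess

# Same DDL as above, fully qualified so it does not depend on 'use ash_data'
ddl = """
CREATE EXTERNAL TABLE IF NOT EXISTS ash_data.dummyTable
    (item MAP<STRING, STRING>)
STORED BY 'org.apache.hadoop.hive.dynamodb.DynamoDBStorageHandler'
TBLPROPERTIES ("dynamodb.table.name" = "testdynamodb")
"""

# The Hive CLI accepts STORED BY; check_call raises if hive exits non-zero
subprocess.check_call(['hive', '-e', ddl])

This avoids Spark's parser entirely, but I would prefer to keep everything inside the PySpark job if possible.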

Do we need to set up any permissions or roles to make this work, or is STORED BY simply not supported by Spark's SQL parser? Has anybody found a solution for this error?
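For completeness, I also tried bypassing the Hive table entirely and reading DynamoDB through the EMR DynamoDB connector's Hadoop InputFormat. A sketch of that attempt is below; the class names are the ones shipped in the emr-dynamodb-connector on EMR, the table/region values are placeholders for my setup, and I have not yet verified how cleanly the records deserialize into Python:

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
sc = spark.sparkContext

# Connector job configuration; adjust table name and region as needed
conf = {
    'dynamodb.servicename': 'dynamodb',
    'dynamodb.input.tableName': 'testdynamodb',
    'dynamodb.regionid': 'us-east-1',
    'dynamodb.endpoint': 'dynamodb.us-east-1.amazonaws.com',
}

# Each record is a (Text, DynamoDBItemWritable) pair
rows = sc.hadoopRDD(
    inputFormatClass='org.apache.hadoop.dynamodb.read.DynamoDBInputFormat',
    keyClass='org.apache.hadoop.io.Text',
    valueClass='org.apache.hadoop.dynamodb.DynamoDBItemWritable',
    conf=conf,
)
print(rows.count())

This reads the table without any Hive DDL, but it does not give me the external Hive table I actually want.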

Thanks
