Hi all. I am trying to use beeline to connect to the Spark Thrift Server, and the data is stored in Hive. For security, I want to use Sentry to handle authorization for different users when they use the Spark Thrift Server to operate on data in Hive. The Spark Thrift Server itself works, but Sentry has no effect: any user can run "select" to view any table. Below is part of the logs:
18/05/31 17:59:05 WARN conf.HiveConf: HiveConf of name hive.sentry.conf.url does not exist
18/05/31 17:59:05 WARN conf.HiveConf: HiveConf of name hive.server2.enable.impersonation does not exist
18/05/31 17:59:05 WARN conf.HiveConf: HiveConf of name hive.server2.thrift.http.min.worker.threads does not exist
18/05/31 17:59:05 WARN conf.HiveConf: HiveConf of name hive.server2.thrift.http.max.worker.threads does not exist
18/05/31 17:59:05 INFO metastore.ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
18/05/31 17:59:05 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
18/05/31 17:59:05 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
18/05/31 17:59:06 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
18/05/31 17:59:06 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
18/05/31 17:59:06 INFO metastore.MetaStoreDirectSql: Using direct SQL, underlying DB is DERBY
18/05/31 17:59:06 INFO metastore.ObjectStore: Initialized ObjectStore
18/05/31 17:59:06 WARN metastore.ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
18/05/31 17:59:06 WARN metastore.ObjectStore: Failed to get database default, returning NoSuchObjectException
18/05/31 17:59:06 INFO metastore.HiveMetaStore: Added admin role in metastore
18/05/31 17:59:06 INFO metastore.HiveMetaStore: Added public role in metastore
18/05/31 17:59:06 INFO metastore.HiveMetaStore: No user is added in admin role, since config is empty
18/05/31 17:59:06 INFO metastore.HiveMetaStore: 0: get_all_databases
18/05/31 17:59:06 INFO HiveMetaStore.audit: ugi=root ip=unknown-ip-addr cmd=get_all_databases
18/05/31 17:59:06 INFO metastore.HiveMetaStore: 0: get_functions: db=default pat=*
18/05/31 17:59:06 INFO HiveMetaStore.audit: ugi=root ip=unknown-ip-addr cmd=get_functions: db=default pat=*
18/05/31 17:59:06 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MResourceUri" is tagged as "embedded-only" so does not have its own datastore table.
18/05/31 17:59:06 INFO session.SessionState: Created local directory: /tmp/0ebe7928-87f9-46b2-8160-bf7e15c22b56_resources
18/05/31 17:59:06 INFO session.SessionState: Created HDFS directory: /tmp/hive/root/0ebe7928-87f9-46b2-8160-bf7e15c22b56
18/05/31 17:59:06 INFO session.SessionState: Created local directory: /tmp/root/0ebe7928-87f9-46b2-8160-bf7e15c22b56
18/05/31 17:59:06 INFO session.SessionState: Created HDFS directory: /tmp/hive/root/0ebe7928-87f9-46b2-8160-bf7e15c22b56/_tmp_space.db
18/05/31 17:59:06 INFO client.HiveClientImpl: Warehouse location for Hive client (version 1.2.1) is /user/hive/warehouse
18/05/31 17:59:06 INFO service.CompositeService: Operation log root directory is created: /var/log/hive/operation_logs
18/05/31 17:59:06 INFO service.AbstractService: HiveServer2: Async execution pool size 100
18/05/31 17:59:06 INFO service.AbstractService: Service:OperationManager is inited.
18/05/31 17:59:06 INFO service.AbstractService: Service: SessionManager is inited.
18/05/31 17:59:06 INFO service.AbstractService: Service: CLIService is inited.
18/05/31 17:59:06 INFO service.AbstractService: Service:ThriftBinaryCLIService is inited.
18/05/31 17:59:06 INFO service.AbstractService: Service: HiveServer2 is inited.
18/05/31 17:59:06 INFO service.AbstractService: Service:OperationManager is started.
18/05/31 17:59:06 INFO service.AbstractService: Service:SessionManager is started.
18/05/31 17:59:06 INFO service.AbstractService: Service:CLIService is started.
18/05/31 17:59:06 INFO metastore.ObjectStore: ObjectStore, initialize called
18/05/31 17:59:06 INFO DataNucleus.Query: Reading in results for query "org.datanucleus.store.rdbms.query.SQLQuery@0" since the connection used is closing
18/05/31 17:59:06 INFO metastore.MetaStoreDirectSql: Using direct SQL, underlying DB is DERBY
18/05/31 17:59:06 INFO metastore.ObjectStore: Initialized ObjectStore
18/05/31 17:59:06 INFO metastore.HiveMetaStore: 0: get_databases: default
18/05/31 17:59:06 INFO HiveMetaStore.audit: ugi=root ip=unknown-ip-addr cmd=get_databases: default
The configuration is as follows:
hive-site.xml:
<property>
<name>hive.sentry.conf.url</name>
<value>file:///opt/tmp/spark-2.2.0-bin-hadoop2.6/conf/sentry-site.xml</value>
</property>
<property>
<name>hive.stats.collect.scancols</name>
<value>true</value>
</property>
<property>
<name>hive.metastore.pre.event.listeners</name>
<value>org.apache.sentry.binding.metastore.MetastoreAuthzBinding</value>
</property>
<property>
<name>hive.metastore.event.listeners</name>
<value>org.apache.sentry.binding.metastore.SentryMetastorePostEventListener</value>
</property>
<property>
<name>hive.server2.session.hook</name>
<value>org.apache.sentry.binding.hive.HiveAuthzBindingSessionHook</value>
</property>
<property>
<name>hive.security.authorization.task.factory</name>
<value>org.apache.sentry.binding.hive.SentryHiveAuthorizationTaskFactoryImpl</value>
</property>
<property>
<name>hive.server2.enable.impersonation</name>
<value>true</value>
</property>
sentry-site.xml:
<configuration>
<property>
<name>sentry.service.security.mode</name>
<value>none</value>
</property>
<property>
<name>sentry.service.client.server.rpc-address</name>
<value>hadoop008053.ppdgdsl.com</value>
</property>
<property>
<name>sentry.service.client.server.rpc-port</name>
<value>8038</value>
</property>
</configuration>
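One way to double-check that the property is really present in the file being read is a small script like the sketch below. It is only an illustration: `property_map` and `SAMPLE` are hypothetical names, and the sample assumes the hive-site.xml properties above are wrapped in a `<configuration>` root element. It confirms that the file contains the property, not that Spark or Hive recognizes it.

```python
import xml.etree.ElementTree as ET


def property_map(xml_text):
    """Return {name: value} for every <property> element in a Hadoop-style config."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


# Assumption: a trimmed copy of the hive-site.xml shown above, wrapped in a root element.
SAMPLE = """<configuration>
<property>
<name>hive.sentry.conf.url</name>
<value>file:///opt/tmp/spark-2.2.0-bin-hadoop2.6/conf/sentry-site.xml</value>
</property>
<property>
<name>hive.server2.session.hook</name>
<value>org.apache.sentry.binding.hive.HiveAuthzBindingSessionHook</value>
</property>
</configuration>"""

props = property_map(SAMPLE)
for key in ("hive.sentry.conf.url", "hive.server2.session.hook"):
    # Report whether each Sentry-related key is defined in the parsed file.
    print(key, "->", props.get(key, "<missing>"))
```

To run it against the real file, replace `SAMPLE` with the contents of the hive-site.xml in the Spark conf directory.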
So I want to know why Sentry does not work. In the logs I saw "18/05/31 17:59:05 WARN conf.HiveConf: HiveConf of name hive.sentry.conf.url does not exist", but the property is set in hive-site.xml and sentry-site.xml is in the right place. Can anybody give me some suggestions, or point me to the deployment documentation for integrating Spark Thrift Server with Sentry, or HiveServer2 with Sentry? Thanks in advance.