We have a Python 3 application that connects to HBase and fetches data.
Connectivity worked fine with the Kerberos-secured HBase Thrift server over the binary protocol (TSocket) until the Hadoop team moved the cluster to Cloudera and Cloudera Manager, which starts the Kerberos-secured HBase Thrift server in HTTPS mode.
The transport has now changed from TSocket to HTTP/HTTPS, and our Python code can no longer authenticate, because we cannot get an HTTP client to work with SASL Kerberos.
The Python version is 3.6.8, and the package versions are:
thrift=0.13.0
hbase-thrift=0.20.4
pure_sasl=0.5.1
Working code in TSocket mode:
############
from subprocess import call

import jprops
from thrift.transport import TSocket, TTransport
from thrift.protocol import TBinaryProtocol
from hbase import Hbase
from hbase.ttypes import *

# Read cluster.properties
with open('/data/properties/cluster.properties') as fp:
    properties = jprops.load_properties(fp)


# Obtain a Kerberos ticket from the keytab
def kerberos_ticket():
    principal = properties["principal"]
    # keyTab was not defined in the code as posted; reading it from the
    # properties file under the key "keyTab" is an assumption.
    keyTab = properties["keyTab"]
    call(["kinit", "-kt", keyTab, principal])


# HBase connection over TSocket with SASL/GSSAPI
def hbase_connection():
    thriftHost = properties["thriftHost"]
    hbaseService = properties["hbaseService"]
    Tsock = TSocket.TSocket(thriftHost, 9090)
    Tsock.setTimeout(2000000)  # timeout in milliseconds
    transport = TTransport.TSaslClientTransport(
        Tsock,
        host=thriftHost,
        service=hbaseService,
        mechanism='GSSAPI'
    )
    protocol = TBinaryProtocol.TBinaryProtocol(transport)
    client = Hbase.Client(protocol)
    return client, transport


# Get a Kerberos ticket, then connect
kerberos_ticket()
client, transport = hbase_connection()
transport.open()
print(client.getTableNames())
###########
Looking at TTransport.py, the docstring of TTransport.TSaslClientTransport suggests that it is meant to wrap a TSocket:
https://github.com/apache/thrift/blob/master/lib/py/src/transport/TTransport.py
"transport: an underlying transport to use, typically just a TSocket"
We tried to use THttpClient.THttpClient(url) instead:
https://github.com/apache/thrift/blob/master/lib/py/src/transport/THttpClient.py
However, it cannot be used as the underlying transport of TTransport.TSaslClientTransport for SASL Kerberos.
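One possible direction, given here as an untested sketch rather than a confirmed fix: Kerberos over HTTP is normally done with SPNEGO, where the client sends an `Authorization: Negotiate <token>` header instead of wrapping the transport in SASL. The sketch below assumes the third-party `kerberos` (pykerberos) package is installed, that a ticket has already been obtained with kinit, that the server's HTTP principal is `HTTP@<host>`, and that the HTTPS endpoint listens on port 9090 at path `/`. The helper names (`thrift_url`, `negotiate_header`, `spnego_token`, `hbase_http_connection`) are hypothetical.

```python
def thrift_url(host, port=9090):
    """Build the HTTPS endpoint URL (port and path are assumptions)."""
    return "https://{}:{}/".format(host, port)


def negotiate_header(b64_token):
    """Build the SPNEGO header from an already base64-encoded GSSAPI token."""
    return {"Authorization": "Negotiate " + b64_token}


def spnego_token(host):
    """Ask the local Kerberos library for a token for HTTP@<host>.

    Requires the third-party 'kerberos' package and a valid ticket cache.
    """
    import kerberos  # imported lazily so the helpers above work without it

    _, context = kerberos.authGSSClientInit("HTTP@" + host)
    kerberos.authGSSClientStep(context, "")
    return kerberos.authGSSClientResponse(context)  # base64 string


def hbase_http_connection(host, port=9090):
    """Return (client, transport) using THttpClient plus a Negotiate header."""
    from thrift.transport import THttpClient
    from thrift.protocol import TBinaryProtocol
    from hbase import Hbase

    transport = THttpClient.THttpClient(thrift_url(host, port))
    transport.setCustomHeaders(negotiate_header(spnego_token(host)))
    protocol = TBinaryProtocol.TBinaryProtocol(transport)
    return Hbase.Client(protocol), transport
```

If this works at all, it would be called as `client, transport = hbase_http_connection(thriftHost)` followed by `transport.open()`. A SPNEGO token is typically valid for a single request, so the header may need to be regenerated before each call.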
Can Python be used at all to connect to a Cloudera-managed, Kerberos-secured HBase Thrift server running in HTTPS mode? If not, is there an alternative way to connect to Kerberos-secured HBase from Python?
PS: I went through the following question, which describes a similar issue, but it has no concrete solution:
Python program to connect to HBase via thrift server in Http mode
Thanks in advance,
Manjil
question from:
https://stackoverflow.com/questions/65936144/python3-connection-to-kerberos-hbase-thrift-https