hadoop - Python Hive Metastore partition timeout

Question

posted Oct 7, 2021 in Technique[技术] by 深蓝 (71.8m points)

We have ETL jobs written in Python (Luigi). They all connect to the Hive Metastore to get partition information.

Code:

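from hive_metastore import ThriftHiveMetastore

# `protocol` is a Thrift protocol object created elsewhere (setup not shown)
client = ThriftHiveMetastore.Client(protocol)
# Fetch the partition names of table salesdetail in database sales
partitions = client.get_partition_names('sales', 'salesdetail', -1)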

-1 is max_parts (the maximum number of partitions to return).

It randomly times out like this:

  File "/opt/conda/envs/etl/lib/python2.7/site-packages/luigi/contrib/hive.py", line 210, in _existing_partitions
    partition_strings = client.get_partition_names(database, table, -1)
  File "/opt/conda/envs/etl/lib/python2.7/site-packages/hive_metastore/ThriftHiveMetastore.py", line 1703, in get_partition_names
    return self.recv_get_partition_names()
  File "/opt/conda/envs/etl/lib/python2.7/site-packages/hive_metastore/ThriftHiveMetastore.py", line 1716, in recv_get_partition_names
    (fname, mtype, rseqid) = self._iprot.readMessageBegin()
  File "/opt/conda/envs/etl/lib/python2.7/site-packages/thrift/protocol/TBinaryProtocol.py", line 126, in readMessageBegin
    sz = self.readI32()
  File "/opt/conda/envs/etl/lib/python2.7/site-packages/thrift/protocol/TBinaryProtocol.py", line 206, in readI32
    buff = self.trans.readAll(4)
  File "/opt/conda/envs/etl/lib/python2.7/site-packages/thrift/transport/TTransport.py", line 58, in readAll
    chunk = self.read(sz - have)
  File "/opt/conda/envs/etl/lib/python2.7/site-packages/thrift/transport/TTransport.py", line 159, in read
    self.__rbuf = StringIO(self.__trans.read(max(sz, self.__rbuf_size)))
  File "/opt/conda/envs/etl/lib/python2.7/site-packages/thrift/transport/TSocket.py", line 105, in read
    buff = self.handle.recv(sz)
timeout: timed out

This error happens only occasionally.
The Hive Metastore has a 15-minute timeout.
When I run get_partition_names on its own to investigate, it returns data within a few seconds.
Even when I set socket.timeout to 1 or 2 seconds, the query completes.
There is no record of a socket-close connection message in the Hive Metastore logs (cat /var/log/hive/..log.out).
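
One way to narrow this down might be to log, on the client side, when each metastore call started and how long it ran, so that a timeout can be matched against the metastore log by timestamp. The following is a minimal sketch, not part of Luigi or the hive_metastore package; the helper name timed_call and the logger name are made up for illustration.

```python
import logging
import socket
import time

log = logging.getLogger("metastore_timing")  # hypothetical logger name


def timed_call(func, *args, **kwargs):
    """Call func, log how long it took, and re-raise socket.timeout."""
    start = time.time()
    started_at = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(start))
    name = getattr(func, "__name__", repr(func))
    try:
        result = func(*args, **kwargs)
    except socket.timeout:
        # Record the start time so the failure can be matched against
        # the metastore log for the same period.
        log.error("%s timed out after %.1fs; call started at %s",
                  name, time.time() - start, started_at)
        raise
    log.info("%s finished in %.1fs; call started at %s",
             name, time.time() - start, started_at)
    return result
```

Usage would look like timed_call(client.get_partition_names, 'sales', 'salesdetail', -1), with client being the object from the snippet above.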

The tables it usually times out on have a large number of partitions (roughly 10K or more). But, as mentioned above, they time out only randomly, and they return partition metadata quickly when that portion of the code is tested on its own.

Any ideas why it times out randomly, how to catch these timeout errors in the metastore logs, or how to fix them?
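
As a stopgap rather than a root-cause fix, the call could be retried when socket.timeout is raised. The sketch below assumes a hypothetical factory, make_client, that opens a new connection and returns the client together with a function that closes it; neither exists in Luigi or hive_metastore. It reconnects on every attempt on the assumption that a connection which timed out mid-response may no longer be safe to reuse.

```python
import socket
import time


def get_partition_names_with_retry(make_client, database, table,
                                   attempts=3, delay_seconds=5):
    """Fetch partition names, reconnecting and retrying on socket.timeout.

    make_client is a hypothetical zero-argument factory returning
    (client, close): a metastore client and a function that closes
    its connection.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        client, close = make_client()  # fresh connection per attempt
        try:
            return client.get_partition_names(database, table, -1)
        except socket.timeout as exc:
            last_error = exc
            if attempt < attempts:
                time.sleep(delay_seconds)  # brief pause before retrying
        finally:
            close()  # always release the connection, success or failure
    raise last_error
```

If every attempt times out, the last socket.timeout is re-raised, so the Luigi task still fails visibly instead of silently continuing.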

question from:https://stackoverflow.com/questions/65898409/python-hive-metastore-partition-timeout
