I can answer the first question about the Trino CLI.
Before you can run a query in Trino against your data in HDFS, you need to configure a Hive connector catalog. Your Trino installation should have an etc directory, and beneath it an etc/catalog directory.
Create a new file, etc/catalog/hive.properties, and add the following configuration:
connector.name=hive-hadoop2
hive.metastore.uri=thrift://<your-metastore-ip-address>:9083
Let's break down what these properties mean:
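connector.name=hive-hadoop2 indicates that the catalog will use the Trino Hive connector. (hive-hadoop2 is the connector name used by older Trino releases; recent releases use connector.name=hive instead, so check the documentation for your version.)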
hive.metastore.uri=thrift://<your-metastore-ip-address>:9083 tells Trino where to find the metastore that is installed with Hive.
If you're not sure where to find your metastore IP address, the Hive documentation lists the configuration files that contain it, depending on which version of Hadoop/Hive you are running.
Hive and Trino share the metastore, but run the queries on entirely different resources. I wrote this blog to help introduce these concepts when folks are starting with Trino. Maybe it can help as you start.
Assuming there's nothing too complex about your setup, that should be all that is required. In some cases you may also need to set the hive.config.resources property to the paths of your hdfs-site.xml and core-site.xml files.
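As a rough sketch of that last case, the catalog file might end up looking something like this. The /etc/hadoop/conf/ paths are only an assumption for illustration; substitute wherever your Hadoop configuration files actually live, and make sure they are readable at the same paths on every Trino node.

```properties
# etc/catalog/hive.properties
connector.name=hive-hadoop2
hive.metastore.uri=thrift://<your-metastore-ip-address>:9083
# Assumed example paths (comma-separated); point these at your own Hadoop config files.
hive.config.resources=/etc/hadoop/conf/core-site.xml,/etc/hadoop/conf/hdfs-site.xml
```

Trino reads catalog files at startup, so restart the server after changing this file.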