You can use Apache Slider to manually install Presto on a YARN-based cluster.
The installation procedures assume that you have a basic knowledge of Presto and the configuration files and properties it uses.
All example files referred to are from: https://github.com/prestodb/presto-yarn/
- A cluster with HDP 2.2+ or CDH5.4+ installed
- Apache Slider 0.80.0 (download from https://slider.incubator.apache.org/)
- JDK 1.8
- openssl >= 1.0.1e-16
When you use Slider to install Presto on a YARN-based cluster, the Presto installation directory structure differs from the standard structure.
For more information, see: Presto Installation Directory Structure for YARN-Based Clusters.
Before installation, you must configure the .json files required for running Presto.
For more information, see: Presto Configuration Options for YARN-Based Clusters.
- Download the Slider 0.80.0 installation file from http://slider.incubator.apache.org/index.html to one of the nodes in your cluster, then extract it:
tar -xvf slider-0.80.0-incubating-all.tar.gz
- Now configure Slider with JAVA_HOME and HADOOP_CONF_DIR in
export JAVA_HOME=/usr/lib/jvm/java
export HADOOP_CONF_DIR=/etc/hadoop/conf
- Configure ZooKeeper in conf/slider-client.xml. If ZooKeeper is listening on master:2181, add the following section:
<property>
  <name>slider.zookeeper.quorum</name>
  <value>master:2181</value>
</property>
- Configure the path where Slider packages will be installed:
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://master/</value>
</property>
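The two properties above can also be written out by a small script, which helps when the same settings are applied on several client nodes. The following is a minimal sketch, not part of Slider: `write_slider_client_xml` is a hypothetical helper, and it assumes the file uses the usual Hadoop-style `<configuration>` root element. It overwrites the target file, so merge by hand if your slider-client.xml already contains other properties.

```shell
# write_slider_client_xml is a hypothetical helper, not part of Slider.
# Arguments: output file, ZooKeeper quorum, default file system URI.
# Assumption: a Hadoop-style <configuration> root element wraps the properties.
write_slider_client_xml() {
  out_file="$1"
  zk_quorum="$2"
  default_fs="$3"
  cat > "$out_file" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>slider.zookeeper.quorum</name>
    <value>${zk_quorum}</value>
  </property>
  <property>
    <name>fs.defaultFS</name>
    <value>${default_fs}</value>
  </property>
</configuration>
EOF
}

# Example, using the values from the text above:
# write_slider_client_xml conf/slider-client.xml master:2181 hdfs://master/
```

Passing the quorum and file system URI as arguments keeps the host names out of the script itself.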
Make sure the user running Slider, which should be the same user that is configured in appConfig.json, has a home directory in HDFS (see the note on appConfig.json).
For more details about appConfig.json and resources.json, see Presto Configuration Options for YARN-Based Clusters.
su hdfs
$ hdfs dfs -mkdir -p /user/<user>
$ hdfs dfs -chown <user>:<user> -R /user/<user>
- Now run Slider:
su <user>
cd slider-0.80.0-incubating
bin/slider package --install --name PRESTO --package ../presto-yarn-package-*.zip
bin/slider create presto1 --template appConfig.json --resources resources.json
(using modified .json files as per your requirement)
This should start your application, which you can then see in the YARN ResourceManager web UI. If the application runs successfully, it remains listed in the YARN ResourceManager as a “RUNNING” application. If the job fails, check the job history logs along with the logs on the node’s disk. See Debugging and Logging for YARN-Based Clusters.
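The RUNNING check described above can also be done from the command line instead of the web UI. The sketch below is an illustration only: `app_is_running` is a hypothetical helper, and it assumes that `yarn application -list -appStates RUNNING` prints tab-separated columns with the application name in the second column, padded with spaces. Verify that layout against your Hadoop version before relying on it.

```shell
# app_is_running is a hypothetical helper, not part of Slider or YARN.
# It reads the output of `yarn application -list -appStates RUNNING` on stdin
# and succeeds if an application with the given name is listed.
# Assumption: columns are tab-separated and the name is the second column.
app_is_running() {
  awk -F'\t' -v name="$1" '
    { n = $2; gsub(/^ +| +$/, "", n) }   # trim the space padding around the name
    n == name { found = 1 }
    END { exit (found ? 0 : 1) }
  '
}

# Example:
# yarn application -list -appStates RUNNING | app_is_running presto1 && echo "presto1 is RUNNING"
```

Because the listing is already filtered to RUNNING applications, matching on the name alone is enough here.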
You can use the following Slider commands to manage your existing Presto application.
To check the status of the running application, run the following command, which prints the status to a file:
bin/slider status presto1 --out status_file
To re-create the app after a failure, or to reconfigure Presto (e.g., to add a new connector):
bin/slider destroy presto1
bin/slider create presto1 --template appConfig.json --resources resources.json
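If you re-create the application often, the two commands above can be wrapped so that the create step only runs when the destroy step succeeds. This is a minimal sketch: `recreate_app` and the `SLIDER` variable are hypothetical names introduced here, and the wrapper only issues the `destroy` and `create` commands shown above.

```shell
# recreate_app is a hypothetical wrapper, not part of Slider.
# SLIDER (an assumption of this sketch) points at the slider launcher
# and defaults to bin/slider.
SLIDER="${SLIDER:-bin/slider}"

recreate_app() {
  app_name="$1"
  app_config="$2"
  resources="$3"
  # Stop here if destroy fails, so a half-removed app is not re-created.
  "$SLIDER" destroy "$app_name" || return 1
  "$SLIDER" create "$app_name" --template "$app_config" --resources "$resources"
}

# Example:
# recreate_app presto1 appConfig.json resources.json
```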
Delete the app including the app package.
bin/slider package --delete --name PRESTO
Flex the number of Presto workers to the new value. If the new value is greater than before, new copies of the worker are requested. If it is less, component instances are destroyed.
Changes are immediate and depend on the availability of resources in the YARN cluster. When flexing up, make sure that extra nodes are available with YARN NodeManagers running and with the Presto data directory pre-created and owned by the yarn user. Also make sure that these nodes do not already have a Presto component running; otherwise, flexing may try to deploy a worker on such a node and eventually fail.
For example, assume there are 2 nodes (with YARN NodeManagers running) in the cluster and you initially deployed Presto via Slider on only one of them. To deploy and start the Presto WORKER component on the second node (assuming it meets all resource requirements), bringing the total number of WORKERs to 2, run:
bin/slider flex presto1 --component WORKER 2
Please note that if your cluster already had 3 WORKER nodes running, the above command will destroy one of them and retain 2 WORKERs.
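Because the flex value is an absolute target and not an increment, a mistyped count can destroy workers. The sketch below adds a basic sanity check before calling the flex command shown above. `flex_workers` and the `SLIDER` variable are hypothetical names used only for illustration.

```shell
# flex_workers is a hypothetical wrapper, not part of Slider.
# SLIDER (an assumption of this sketch) points at the slider launcher
# and defaults to bin/slider.
flex_workers() {
  app_name="$1"
  count="$2"
  # Reject anything that is not a non-negative integer before calling Slider.
  case "$count" in
    ''|*[!0-9]*)
      echo "worker count must be a non-negative integer: '$count'" >&2
      return 2
      ;;
  esac
  "${SLIDER:-bin/slider}" flex "$app_name" --component WORKER "$count"
}

# Example: set the total number of WORKER instances of presto1 to 2.
# flex_workers presto1 2
```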
The following advanced configuration options are available:
- Configuring memory, CPU, and YARN CGroups
- Failure policy
- YARN label
For more information, see Advanced Configuration Options for YARN-Based Clusters.