If you plan to use the HDP distribution, you can use Ambari and Apache Slider to perform an automated Presto installation and integration on a YARN-based cluster. During installation, both the Apache Slider package and Presto are installed.
Installing and Integrating Presto with YARN
- Automated Installation on a YARN-Based Cluster
- Deploying Presto on a YARN-Based Cluster
- Using Ambari Slider View to Install Presto on a YARN-Based Cluster
- Additional Configuration Options
- Debugging and Logging
The installation procedures assume that you have a basic knowledge of Presto and the configuration files and properties it uses.
All example files referred to are from: https://github.com/prestodb/presto-yarn/
- A cluster with HDP 2.2+ or CDH5.4+ installed
- Apache Slider 0.80.0 (download from https://slider.incubator.apache.org/)
- JDK 1.8
- openssl >= 1.0.1e-16
- Ambari 2.1
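As a quick pre-flight check of the prerequisites listed above, the following minimal sketch reports which required commands are missing from PATH and prints the installed versions for manual comparison. The function names `missing_tools` and `report_versions` are hypothetical helpers, not part of Slider or Presto, and the sketch assumes that `java` and `openssl` on PATH are the binaries the cluster will actually use.

```shell
# Print the name of every listed command that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Print installed versions so they can be compared by hand
# against the prerequisite list (JDK 1.8, openssl).
report_versions() {
    java -version 2>&1 | head -n 1
    openssl version
}

# Example usage:
#   missing=$(missing_tools java openssl)
#   [ -z "$missing" ] && report_versions
```

The check only confirms that the tools exist; comparing the printed versions against the list is still a manual step.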
When you use Ambari Slider View to install Presto on a YARN-based cluster, the Presto installation directory structure differs from the standard structure.
For more information, see Presto Installation Directory Structure for YARN-Based Clusters.
During installation, Ambari Slider View allows you to select configuration options required for running Presto.
For more information, see Presto Configuration Options for YARN-Based Clusters.
Ambari supports deploying Slider application packages using Slider View and provides Slider integration. Slider View for Ambari allows you to deploy and manage Slider apps from Ambari Web.
Use Ambari Slider View and the following steps to deploy Presto on YARN:
- Install the Ambari server. See: http://docs.hortonworks.com/HDPDocuments/Ambari-184.108.40.206/bk_Installing_HDP_AMB/content/ch_Installing_Ambari.html.
- Download the Apache Slider package. See: https://slider.incubator.apache.org/
- Copy the Presto app package to the /var/lib/ambari-server/resources/apps/ directory on your Ambari server node.
- Restart ambari-server.
- Log on to Apache Ambari,
- Name your cluster, provide the configuration of the cluster, and follow the steps on the WebUI.
- Customize and configure the services, then install them. At a minimum, HDFS, YARN, and Zookeeper are required for Slider to work. You must also select Slider to be installed.
- For the installed Slider client, you need to update its configuration if you are not using the default installation paths for Hadoop and Zookeeper. slider-env.sh should point to your JAVA_HOME and HADOOP_CONF_DIR:
export JAVA_HOME=/usr/lib/jvm/java
export HADOOP_CONF_DIR=/etc/hadoop/conf
For zookeeper, if you are using a different installation directory from the default one at
- Add a custom property to the
slider-client section in Slider configuration with the key:
- If using a different port from the default
2181, then add the key
master is the node and
5181 is the port.
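The slider-env.sh values described above can be sanity-checked before starting Slider. The following is a minimal sketch, assuming only that both settings must name existing directories; `check_slider_env` is a hypothetical helper, not a Slider command.

```shell
# Return 0 when both directories named in slider-env.sh exist;
# otherwise print what is missing and return 1.
check_slider_env() {
    java_home=$1
    hadoop_conf_dir=$2
    status=0
    if [ ! -d "$java_home" ]; then
        echo "JAVA_HOME is not a directory: $java_home"
        status=1
    fi
    if [ ! -d "$hadoop_conf_dir" ]; then
        echo "HADOOP_CONF_DIR is not a directory: $hadoop_conf_dir"
        status=1
    fi
    return "$status"
}

# Example with the sample values from slider-env.sh:
#   check_slider_env /usr/lib/jvm/java /etc/hadoop/conf
```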
Once you have all the services up and running on the cluster, you can configure Slider in Ambari to create and manage your application by creating a “View”.
- Go to
admin (top right corner) ->
- From the left pane, select
- Create a Slider View by populating all the necessary fields with a preferred instance name (for example, Slider).
ambari.server.url can be of the format
<clustername> is what you have named your Ambari cluster.
- Select the “Views” control icon in the upper right.
- Select the instance you created in the previous step (for example, “Slider”).
Create App to create a new Presto YARN application.
- Go to
Provide details of the Presto service. By default, the UI will be populated with the values you have in the
*-default.json files in your
The app name should be in lowercase. For example: presto1.
You can set the configuration property fields required for your cluster. For example, if you want to set a connector for Presto, you can update the global.catalog property. See the following for an explanation of each configuration property.
Prepare HDFS for Slider. The user directory you create here should be for the same user you set in the global.app_user field. If the app_user is going to be yarn, then do the following:
su hdfs -c "hdfs dfs -mkdir -p /user/yarn"
su hdfs -c "hdfs dfs -chown yarn:yarn /user/yarn"
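If the application user is not yarn, the same preparation can be generated for any user name. The sketch below only prints the HDFS commands and executes nothing. `hdfs_prepare_cmds` is a hypothetical helper, and it assumes that the group name matches the user name, as in the yarn:yarn example above.

```shell
# Print the HDFS commands that prepare a home directory for the given user.
# Nothing is executed here; review the output, then run it as the hdfs user.
# Assumption: the owning group has the same name as the user.
hdfs_prepare_cmds() {
    app_user=$1
    printf 'hdfs dfs -mkdir -p /user/%s\n' "$app_user"
    printf 'hdfs dfs -chown %s:%s /user/%s\n' "$app_user" "$app_user" "$app_user"
    # Listing the directory itself shows the resulting owner and group.
    printf 'hdfs dfs -ls -d /user/%s\n' "$app_user"
}

# Example:
#   hdfs_prepare_cmds yarn
```

Printing rather than running keeps the step reviewable, which is useful when global.app_user differs between clusters.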
- Change the global.presto_server_port from 8080 to another unused port, for example, 8089, since Ambari by default uses 8080.
- Pre-create the data directory specified in the UI (added in /var/lib/presto/) on all nodes. The directory must be owned by global.app_user; otherwise Slider will fail to start Presto due to permission errors.
mkdir -p /var/lib/presto/data
chown -R yarn:hadoop /var/lib/presto/data
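Because a wrongly owned data directory makes Slider fail with permission errors, it can help to verify ownership on each node before creating the app. The following is a minimal sketch; `dir_owned_by` is a hypothetical helper, and it assumes that the third field of `ls -ld` output is the owner name, which holds for standard POSIX `ls`.

```shell
# Succeed only if the directory exists and is owned by the expected user.
dir_owned_by() {
    dir=$1
    expected_user=$2
    if [ ! -d "$dir" ]; then
        echo "missing directory: $dir"
        return 1
    fi
    # The third field of `ls -ld` output is the owner name.
    owner=$(ls -ld "$dir" | awk '{ print $3 }')
    if [ "$owner" != "$expected_user" ]; then
        echo "$dir is owned by $owner, expected $expected_user"
        return 1
    fi
}

# Example, to be run on every node:
#   dir_owned_by /var/lib/presto/data yarn
```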
If you want to add any additional custom properties, use the Custom property section. Additional properties currently supported are:
For the requirements and format of these properties, see:
Click Finish. This is the equivalent of
create performed with the bin/slider script. If successfully deployed, you will see the YARN application started for Presto. You can do the following:
app launched and monitor the status from Slider view.
- Click Quick Links, which should take you to the YARN WebUI.
If your application is running successfully, it should always be available in the YARN resource manager as a “RUNNING” application.
If the job fails, check the job history’s logs and the logs on the node’s disk. See Debugging and Logging for YARN-Based Clusters.
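The application state can also be checked from the command line instead of the YARN WebUI. The sketch below reads the output of `yarn application -list` on standard input and prints the state of one application. `yarn_app_state` is a hypothetical helper, and it assumes the tab-separated column layout in which the application name is the second column and the state is the sixth; verify this against the output on your cluster.

```shell
# Read `yarn application -list` output on stdin and print the State
# column for the application with the given name.
# Assumption: tab-separated columns, name in column 2, state in column 6.
yarn_app_state() {
    awk -F '\t' -v name="$1" '
        # Strip the space padding that surrounds each column value.
        { for (i = 1; i <= NF; i++) gsub(/^ +| +$/, "", $i) }
        $2 == name { print $6 }
    '
}

# Example (presto1 is the sample app name used earlier):
#   yarn application -list -appStates ALL | yarn_app_state presto1
```

An empty result means no application with that name was listed.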
You can manage the application lifecycle (for example: start, stop, flex, and destroy) from the View UI.
After you install Presto and Slider, you can reconfigure Presto or perform additional configuration.
After you launch Presto you can update its configuration. For example, you can add a new connector.
- On the Slider View instance screen, go to
- Stop the running Presto application.
- Click Destroy to remove the existing Presto instance running in Slider.
- Click the Create App button to re-create the Presto instance in Slider and make configuration updates.
The following advanced configuration options are available:
- Configuring memory, CPU, and YARN CGroups
- Failure policy
- YARN label
For more information, see Advanced Configuration Options for YARN-Based Clusters.