AWS EMR
gProfiler can be added to your EMR cluster's bootstrap actions from either the AWS console or the AWS CLI, so that it is included automatically whenever a cluster is deployed. Note that gProfiler bootstrap actions install and load gProfiler before your cluster's core services, such as Spark or Hadoop. To set up a gProfiler bootstrap action, follow these instructions.
You can specify a bootstrap action when creating a cluster. In the console, choose Advanced Options and then navigate to General Cluster Settings. Under Bootstrap Actions, select Configure and add.
You can pass references to bootstrap action scripts on Amazon EMR by adding the --bootstrap-actions parameter when you create the cluster using the create-cluster command. It will look like this:
--bootstrap-actions Path="s3://mybucket/filename",Args=[arg1,arg2]
To install gProfiler on your EMR cluster you will need the following arguments:
Token: Your user's unique API key, or the API token which is given on gProfiler's “Install Service” page.
EMR Cluster Name: The name of the EMR cluster you would like to install gProfiler on.
Service ID: The service identifier you defined, which is used to correlate all the agents that are installed on the machines in your service.
Download Bucket: The name of the S3 bucket containing your cluster's bootstrap scripts.
When creating a cluster, add the gProfiler installation as a bootstrap action using the following command. Choose the relevant cluster name and fill in the appropriate Token and Service ID, preserving the TOKEN= and SERVICE= prefixes:
aws emr create-cluster --name "MY-Cluster" ...
--bootstrap-actions "Path=s3://download.granulate.io/granulate_generic_env_wrapper.sh,Args=[https://s3.amazonaws.com/download.granulate.io/granulate_run_gprofiler.sh,TOKEN=<TOKEN>,SERVICE=<SERVICE>]"
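The value passed to --bootstrap-actions above can also be assembled from shell variables, which may help avoid typos in the TOKEN= and SERVICE= prefixes. The following is a minimal sketch, not part of gProfiler itself: the function name gprofiler_bootstrap_arg is hypothetical, and only the two script locations are taken from the command above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with gProfiler): prints the value for
# --bootstrap-actions, given a token and a service ID.
gprofiler_bootstrap_arg() {
  local token="$1" service="$2"

  # Refuse to build the argument if either value is missing.
  if [ -z "$token" ] || [ -z "$service" ]; then
    echo "usage: gprofiler_bootstrap_arg <token> <service-id>" >&2
    return 1
  fi

  # Script locations copied from the create-cluster command in this page.
  local wrapper="s3://download.granulate.io/granulate_generic_env_wrapper.sh"
  local runner="https://s3.amazonaws.com/download.granulate.io/granulate_run_gprofiler.sh"

  # The TOKEN= and SERVICE= prefixes are preserved as required.
  printf 'Path=%s,Args=[%s,TOKEN=%s,SERVICE=%s]\n' \
    "$wrapper" "$runner" "$token" "$service"
}
```

The output can then be substituted into the command, for example as --bootstrap-actions "$(gprofiler_bootstrap_arg "$MY_TOKEN" "$MY_SERVICE")", where MY_TOKEN and MY_SERVICE are placeholder variable names that you would set yourself.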
1. Open the Amazon EMR console at https://console.aws.amazon.com/elasticmapreduce/
2. Choose Create cluster
3. Click Go to Advanced Options
4. On the Create Cluster - Advanced Options screen, in steps 1 and 2 choose the options you prefer and proceed to Step 3: General Cluster Settings
5. Under Bootstrap Actions select Configure and Add, choose Custom Actions, and fill in the following:
Name: Granulate gProfiler
Script Location: s3://download.granulate.io/granulate_generic_env_wrapper.sh
Optional Arguments (note the line breaks between the arguments, and remember to fill in the appropriate TOKEN and SERVICE_ID values, preserving the TOKEN= and SERVICE= prefixes):
https://s3.amazonaws.com/download.granulate.io/granulate_run_gprofiler.sh
TOKEN=<TOKEN>
SERVICE=<SERVICE_ID>
The filled form will look like this:
7. Click Add
8. Proceed to create the cluster. Your bootstrap action(s) will be performed after the cluster has been provisioned and initialized.
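For the console route, the Optional Arguments field expects one argument per line. As a rough sketch, the three lines can be generated in the shell and pasted into the form. The function name gprofiler_console_args is hypothetical and not part of gProfiler; the script URL is the one listed under Optional Arguments above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with gProfiler): prints the Optional
# Arguments for the console form, one argument per line.
gprofiler_console_args() {
  local token="$1" service_id="$2"

  # Both values are required; fail rather than print an incomplete form.
  if [ -z "$token" ] || [ -z "$service_id" ]; then
    echo "usage: gprofiler_console_args <token> <service-id>" >&2
    return 1
  fi

  # First line is the gProfiler run script, followed by the prefixed values.
  printf '%s\n' \
    "https://s3.amazonaws.com/download.granulate.io/granulate_run_gprofiler.sh" \
    "TOKEN=${token}" \
    "SERVICE=${service_id}"
}
```

Calling gprofiler_console_args with your own token and service ID prints three lines in the same order as the form fields shown above.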