AWS EMR
gProfiler can be added to your EMR cluster's bootstrap actions from either the AWS console or the AWS CLI, so that it is included automatically whenever you deploy a cluster. Note that gProfiler bootstrap actions install and load gProfiler before your cluster's core services, such as Spark or Hadoop, start. To set up a gProfiler bootstrap action, follow these instructions.
In the AWS EMR Console
You can specify a bootstrap action when creating a cluster. Choose Advanced Options and navigate to General Cluster Settings, then under Bootstrap Actions select Configure and add.
In the AWS CLI
You can pass references to bootstrap action scripts on Amazon EMR by adding the --bootstrap-actions parameter when you create the cluster using the create-cluster command. It will look like this:
--bootstrap-actions Path="s3://mybucket/filename",Args=[arg1,arg2]
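As a rough illustration of the shorthand format above, the following sketch assembles a Path/Args value from a script location and a list of arguments. The helper name bootstrap_action_spec is hypothetical and not part of the AWS CLI; it only builds the string and does not call AWS.

```shell
# Hypothetical helper (not part of the AWS CLI): builds a --bootstrap-actions
# shorthand value from a script path followed by any number of arguments.
bootstrap_action_spec() {
  path="$1"
  shift
  args=""
  for arg in "$@"; do
    # Join arguments with commas, without a leading comma.
    args="${args:+${args},}${arg}"
  done
  printf 'Path=%s,Args=[%s]\n' "$path" "$args"
}

# Prints: Path=s3://mybucket/filename,Args=[arg1,arg2]
bootstrap_action_spec "s3://mybucket/filename" arg1 arg2
```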
What You Need to Install gProfiler on EMR
To install gProfiler on your EMR cluster you will need the following arguments:
Token: Your user's unique API key, or the API token shown on gProfiler's “Install Service” page.
EMR Cluster Name: The name of the EMR cluster you would like to install gProfiler on.
Service ID: The service identifier you defined, which is used to correlate all the agents that are installed on the machines in your service.
Download Bucket: The name of the S3 bucket containing your cluster's bootstrap scripts.
Installing gProfiler on EMR
With the AWS CLI
When creating a cluster, add the gProfiler installation as a bootstrap action using the following command. Choose the relevant cluster name and fill in the appropriate Token and Service ID, preserving the TOKEN= and SERVICE= prefixes:
aws emr create-cluster --name "MY-Cluster" ...
--bootstrap-actions "Path=s3://download.granulate.io/granulate_generic_env_wrapper.sh,Args=[https://s3.amazonaws.com/download.granulate.io/granulate_run_gprofiler.sh,TOKEN=<TOKEN>,SERVICE=<SERVICE>]"
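To reduce the chance of a typo in the long argument string, one option is to assemble it from variables first and review it before adding it to your create-cluster command. This is only a sketch: the variable names and the example-token and example-service values are placeholders introduced for illustration, and nothing here contacts AWS.

```shell
# Placeholder values (assumptions): replace with your real token and service name.
GPROFILER_TOKEN="example-token"
GPROFILER_SERVICE="example-service"

# Script locations as given in the command above.
WRAPPER_SCRIPT="s3://download.granulate.io/granulate_generic_env_wrapper.sh"
RUN_SCRIPT="https://s3.amazonaws.com/download.granulate.io/granulate_run_gprofiler.sh"

# Keep the TOKEN= and SERVICE= prefixes; only the values after them change.
GPROFILER_BOOTSTRAP="Path=${WRAPPER_SCRIPT},Args=[${RUN_SCRIPT},TOKEN=${GPROFILER_TOKEN},SERVICE=${GPROFILER_SERVICE}]"

# Print the assembled option so it can be checked before use.
printf '%s\n' "--bootstrap-actions \"${GPROFILER_BOOTSTRAP}\""
```

The printed line can then be appended to the rest of your aws emr create-cluster options.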
With the AWS Console
1. Open the Amazon EMR console at https://console.aws.amazon.com/elasticmapreduce/
2. Choose Create cluster
3. Click Go to Advanced Options
4. On the Create Cluster - Advanced Options screen, in steps 1 and 2 choose the options you prefer and proceed to Step 3: General Cluster Settings
5. Under Bootstrap Actions select Configure and Add, choose Custom Actions, and fill in the following:
   Name: Granulate gProfiler
   Script Location: s3://download.granulate.io/granulate_generic_env_wrapper.sh
   Optional Arguments (note the line breaks between the arguments, and remember to fill in the appropriate TOKEN and SERVICE_ID values, preserving the TOKEN= and SERVICE= prefixes):
   https://s3.amazonaws.com/download.granulate.io/granulate_run_gprofiler.sh
   TOKEN=<TOKEN>
   SERVICE=<SERVICE_ID>
6. The filled form will look like this:
7. Click Add
8. Proceed to create the cluster. Your bootstrap action(s) will be performed after the cluster has been provisioned and initialized.
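Once the cluster exists, one way to confirm that the bootstrap action was registered is to list the cluster's bootstrap actions with the AWS CLI. The sketch below wraps the aws emr list-bootstrap-actions command in a small function. The function name is hypothetical, and it assumes the AWS CLI is installed and configured with credentials that can read the cluster.

```shell
# Hypothetical helper: lists the bootstrap actions recorded for a cluster.
# Assumes the AWS CLI is installed and credentials are configured.
list_cluster_bootstrap_actions() {
  cluster_id="$1"
  if [ -z "$cluster_id" ]; then
    echo "usage: list_cluster_bootstrap_actions <cluster-id>" >&2
    return 1
  fi
  aws emr list-bootstrap-actions --cluster-id "$cluster_id"
}
```

Call it with your own cluster ID, for example list_cluster_bootstrap_actions followed by the ID shown in the EMR console, and look for the Granulate gProfiler entry in the output.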