Google Dataflow
We currently only support Python pipelines. Java support is coming soon.
You can read the official Apache Beam documentation on adding non-Python dependencies, on which this installation method is based.
Copy the Dataflow setup file over to where you wish to start your Dataflow jobs from, and replace the values of `service_name` and `gprofiler_token`:
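As a rough illustration (the actual gProfiler setup file is not reproduced here), a setup file following Apache Beam's custom-command pattern for non-Python dependencies might look like the sketch below. Only `gprofiler_token` and `service_name` come from this page; the class names, package metadata, and installation step are illustrative stand-ins:

```python
# Hypothetical sketch of the Dataflow setup file, modeled on Apache
# Beam's pattern for installing non-Python dependencies via custom
# setuptools commands. Class names and the install step are stand-ins.
import subprocess

import setuptools
from distutils.command.build import build as _build

gprofiler_token = "<token>"        # from the gProfiler Performance Studio site
service_name = "<service name>"    # the service name you wish to use


class build(_build):
    """Chain the gProfiler step into the regular build."""
    sub_commands = _build.sub_commands + [("GProfilerInstall", None)]


class GProfilerInstall(setuptools.Command):
    """Stand-in for the command that installs and starts the agent."""
    user_options = []

    def initialize_options(self):
        pass

    def finalize_options(self):
        pass

    def run(self):
        # The real setup file would download and launch the gProfiler
        # agent here, using gprofiler_token and service_name.
        subprocess.check_call(["echo", "installing gProfiler"])


setuptools.setup(
    name="gprofiler-dataflow-setup",  # illustrative metadata
    version="0.1.0",
    cmdclass={"build": build, "GProfilerInstall": GProfilerInstall},
)
```

The custom `build` subclass is what makes Dataflow workers run the extra step: Beam invokes `python setup.py build` on each worker, which triggers every entry in `sub_commands`.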
- Replace `<token>` in the command line with the token you got from the gProfiler Performance Studio site.
- Replace `<service name>` in the command line with the service name you wish to use.
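For instance (an illustrative one-liner, not part of the original instructions), both placeholders can be filled in with `sed`. The snippet below creates a tiny stand-in file so it can be run as-is; point it at your real copy of the setup file instead:

```shell
# Create a stand-in file containing the two placeholders (use your
# real copy of the setup file instead of setup_example.py).
printf 'gprofiler_token = "<token>"\nservice_name = "<service name>"\n' > setup_example.py

# Fill in the placeholders (the replacement values here are examples).
sed -i 's/<token>/YOUR_TOKEN/; s/<service name>/your-service/' setup_example.py

cat setup_example.py
```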
Whenever you start a Dataflow job, add the `--setup_file /path/to/setup.py` flag pointing at your copy of `setup.py` (please note: the flag is `--setup_file` and not `--setup-file`). For example, here's a command that starts an example Apache Beam job with gProfiler:
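A hedged sketch of what such an invocation could look like, using Beam's bundled wordcount example; the project, region, and bucket values are placeholders you would substitute with your own:

```shell
# Launch Beam's example wordcount pipeline on Dataflow with the
# gProfiler setup file attached (placeholder values are illustrative).
python -m apache_beam.examples.wordcount \
  --runner DataflowRunner \
  --project <your-gcp-project> \
  --region <your-region> \
  --temp_location gs://<your-bucket>/tmp/ \
  --output gs://<your-bucket>/wordcount/output \
  --setup_file /path/to/setup.py
```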
If you are already using the `--setup_file` flag for your own setup, you can merge your setup file with the gProfiler one: copy over all of the code in the gProfiler setup file except the `setuptools.setup` call, and add the following keyword argument to your own `setuptools.setup` call:
For example:
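A sketch of what the merged file might look like, assuming the gProfiler setup file follows Beam's custom-command pattern. The gProfiler-side classes here are simplified stand-ins (the real copied code would install and start the agent), and the package metadata is your own:

```python
# Hypothetical merged setup.py: your own packaging plus the code copied
# from the gProfiler setup file (shown as simplified stand-ins here).
import setuptools
from distutils.command.build import build as _build


# --- copied from the gProfiler setup file (everything except its
# setuptools.setup call); simplified for illustration.
class build(_build):
    sub_commands = _build.sub_commands + [("GProfilerInstall", None)]


class GProfilerInstall(setuptools.Command):
    user_options = []

    def initialize_options(self):
        pass

    def finalize_options(self):
        pass

    def run(self):
        print("gProfiler installation step would run here")


# --- your own setup call, with the cmdclass keyword argument added:
setuptools.setup(
    name="my-pipeline",
    version="0.1.0",
    install_requires=["apache-beam[gcp]"],
    cmdclass={"build": build, "GProfilerInstall": GProfilerInstall},
)
```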