refactor for dispatch style
msarahan committed Nov 8, 2024
1 parent 69cb461 commit 64cd994
Showing 21 changed files with 9,343 additions and 295 deletions.
54 changes: 54 additions & 0 deletions .github/workflows/test-child-workflow.yaml
@@ -0,0 +1,54 @@
name: imitation

on:
  workflow_call:

permissions:
  actions: read
  checks: none
  contents: read
  deployments: none
  discussions: none
  id-token: write
  issues: none
  packages: read
  pages: none
  pull-requests: read
  repository-projects: none
  security-events: none
  statuses: none

jobs:
  build:
    name: Jobby McJobface
    runs-on: ubuntu-latest
    strategy:
      matrix:
        CERTS: ["with","without"]
    steps:
      # These would be things like the top-level traceparent, global resource attributes, and any custom
      # repo/branch for shared-actions. Note that we have a chicken/egg here. This dispatch action can't
      # utilize the custom shared-actions repo/branch because we don't know it yet. It will take effect
      # for the tasks below this action.
      - name: Load base env vars
        uses: rapidsai/shared-actions/telemetry-dispatch-load-base-env-vars@telemetry-dispatch-actions
      - name: Dummy work
        shell: bash
        run: echo "This is dummy"
      # if statements are parsed before matrix, so we can't use matrix directly.
      # https://github.com/actions/runner/issues/1985
      - name: Get matrix value
        id: matrix_value
        run: echo CERTS=${{matrix.CERTS}} >> ${GITHUB_OUTPUT}
      # The whole comparison must be a single expression. Writing `${{ ... }} == 'with'`
      # produces a non-empty string, which is always truthy, so the step would always run.
      - name: Send telemetry summary (with certs)
        uses: rapidsai/shared-actions/telemetry-dispatch-write-summary@telemetry-dispatch-actions
        id: summary_with_certs
        if: steps.matrix_value.outputs.CERTS == 'with'
        with:
          cert_concat: ${{ secrets.OTEL_EXPORTER_OTLP_CA_CERTIFICATE }};${{ secrets.OTEL_EXPORTER_OTLP_CLIENT_CERTIFICATE }};${{ secrets.OTEL_EXPORTER_OTLP_CLIENT_KEY }}
      # this one should error internally, but the status in the job view should be green. Ideally, we should see something about
      - name: Send telemetry summary (without certs)
        continue-on-error: true
        uses: rapidsai/shared-actions/telemetry-dispatch-write-summary@telemetry-dispatch-actions
        id: summary_without_certs
        if: steps.matrix_value.outputs.CERTS == 'without'
114 changes: 78 additions & 36 deletions .github/workflows/test-telemetry-setup.yaml
@@ -1,51 +1,93 @@
# This workflow is meant to imitate the behavior of RAPIDS project PR workflows, such as


on:
pull_request:
push:
workflow_dispatch:

env:
SHARED_ACTIONS_REF: ${{github.ref}}
SHARED_ACTIONS_REF: ${{ github.ref}}

defaults:
run:
shell: bash

jobs:
compute_traceparent:
base-env-setup:
runs-on: ubuntu-latest
# These will be stashed. The names are not arbitrary. They match special OpenTelemetry names
# or names that are hard-coded in actions/scripts downstream.
env:
SHARED_ACTIONS_REPO: rapidsai/shared-actions
SHARED_ACTIONS_REF: ${{ github.ref }}
# this should stay the same throughout this workflow, but child workflows will each
# have their own OTEL_SERVICE_NAME. It is generally the job name, including any matrix elements.
# This is what distinguishes one job trace from another, so it is important to be distinct
# between jobs.
OTEL_SERVICE_NAME: test-telemetry plus something
# TODO: this should be set as an org-wide variable
OTEL_EXPORTER_OTLP_ENDPOINT: https://tempo.gha-runners.nvidia.com:4318
OTEL_EXPORTER_OTLP_PROTOCOL: "http/protobuf"
OTEL_RESOURCE_ATTRIBUTES: "git.repository=${{ github.repository }},git.ref=${{ github.ref }},git.sha=${{ github.sha }},git.job_url=${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}"
outputs:
service-name: ${{ steps.export.outputs.service_name }}

steps:
- name: Checkout actions
uses: actions/checkout@v4
with:
ref: ${{env.SHARED_ACTIONS_REF}}
path: ./shared-actions
- name: Get job traceparent
uses: ./shared-actions/telemetry-traceparent
id: job-traceparent
- name: Echo value from job
run: echo "${{steps.job-traceparent.outputs.traceparent}}"
example_matrix:
name: Test ${{ matrix.os}}-${{ matrix.version }}
- name: Compute traceparent and stash telemetry-related env vars
uses: rapidsai/shared-actions/telemetry-dispatch-stash-base-env-vars@telemetry-dispatch-actions
- name: Export service name so we can check it below
id: export
run: echo service_name="${OTEL_SERVICE_NAME}" >> ${GITHUB_OUTPUT}
child-workflow:
needs: base-env-setup
secrets: inherit
uses: rapidsai/shared-actions/.github/workflows/test-child-workflow.yaml@telemetry-dispatch-actions
summarize-top-level:
runs-on: ubuntu-latest
strategy:
matrix:
version: [1, 2]
os: [ubuntu-latest] # , windows-latest
continue-on-error: true
needs:
- base-env-setup
- child-workflow
steps:
- name: Checkout actions
uses: actions/checkout@v4
- name: Load base env vars, including OTEL_SERVICE_NAME
uses: rapidsai/shared-actions/telemetry-dispatch-load-base-env-vars@telemetry-dispatch-actions
with:
ref: ${{env.SHARED_ACTIONS_REF}}
path: ./shared-actions
# Run job with traceparent. We'll validate that this matches in the summary data.
- name: Get job traceparent in matrix job
uses: ./shared-actions/telemetry-traceparent
id: job-traceparent
- name: Generate traceparent for a step
uses: ./shared-actions/telemetry-traceparent
id: step-traceparent
load_service_name: "true"
- name: Check if service name took on an unexpected value
run: |
echo "(should be the value set to the OTEL_SERVICE_NAME env var in base-env-setup job)"
[ "${OTEL_SERVICE_NAME}" = "${{needs.base-env-setup.outputs.service-name}}" ] || exit 1
- name: Telemetry summarize
uses: rapidsai/shared-actions/telemetry-dispatch-write-summary@telemetry-dispatch-actions
with:
step_name: "Download gha-tools with git clone"
- name: Echo computed step traceparent
cert_concat: "${{ secrets.OTEL_EXPORTER_OTLP_CA_CERTIFICATE }};${{ secrets.OTEL_EXPORTER_OTLP_CLIENT_CERTIFICATE }};${{ secrets.OTEL_EXPORTER_OTLP_CLIENT_KEY }}"

- name: Check if service name was altered during telemetry summary
run: |
echo "GHA tools clone job traceparent: ${{ steps.step-traceparent.outputs.traceparent }}"
- name: Test OTel export of job JSON
uses: ./shared-actions/telemetry-summarize
echo "(should be the value set to the OTEL_SERVICE_NAME env var in base-env-setup job)"
[ "${OTEL_SERVICE_NAME}" = "${{needs.base-env-setup.outputs.service-name}}" ] || exit 1
- name: Query the Tempo HTTP API and check that our trace is present and has expected properties
run: |
TRACE_ID=$( cut -d '-' -f 2 <<< "$TRACEPARENT" );
echo "Trace ID is: ${TRACE_ID}";
TRACE_URL="${OTEL_EXPORTER_OTLP_ENDPOINT/4318/3200}/api/traces/${TRACE_ID}"
echo "Trace URL is: ${TRACE_URL}"
curl \
--cert /tmp/certs/client.crt.pem --key /tmp/certs/client.key.pem --cacert /tmp/certs/ca.crt.pem \
-Gs "${TRACE_URL}" > trace_record.json;
- name: Upload trace record
uses: actions/upload-artifact@v4
with:
traceparent: ${{ steps.job-traceparent.outputs.traceparent}}
name: trace-record
path: trace_record.json
- name: Validate span metadata
# these are not returned in any particular order. The span kind is the only one
# that we can reliably expect to be the same.
run: |
span_kind="$(jq -r '.batches[0].scopeSpans[0].spans[0].kind' trace_record.json )";
echo "Checking if span kind is as expected"
echo "Span kind is: "${span_kind}""
[ "${span_kind}" = "SPAN_KIND_CLIENT" ] || exit 1
echo "Verify that job names (also called service name) are correct"
job_names="$(jq -c '[.batches[].resource.attributes[] | select(.key == "service.name") | .value.stringValue] | unique' trace_record.json)"
[ "$job_names" = '["child-workflow / Jobby McJobface (with)","child-workflow / Jobby McJobface (without)","test-telemetry plus something"]' ] || exit 1
99 changes: 68 additions & 31 deletions README.md
@@ -3,45 +3,82 @@
Contains all of the shared composite actions used by RAPIDS.

Actions that refer to each other assume that they have been checked out to the
./shared-actions folder. This assumption is what allows code reuse between
actions. Your general usage pattern for using these actions in other repos
should be:
./shared-actions folder. This folder *should* sit at the root of the GitHub Actions workspace.
This assumption is what allows code reuse between actions.

In general, we should avoid calling "implementation actions" directly. Instead,
we should create "dispatch actions" that clone shared-actions from a particular repo
at a particular ref, and then dispatch to an implementation action from that clone.
This adds complexity, but has several advantages:

* Simplifies specifying a custom branch of the actions for development and testing.
* Changes all shared-actions calls in a workflow at once, instead of changing each one individually.
* Allows reuse of shared-actions within the shared-actions repo. Using these actions
without the clone and relative path would not keep the repo and ref
consistent, leading to confusion over why changes aren't being reflected.

## Example dispatch action

```yaml
name: 'Example dispatch action'
description: |
  The purpose of this wrapper is to keep it easy for external consumers to switch branches of
  the shared-actions repo when they are changing something about shared-actions and need to test it
  in their pipelines.
  Inputs here are all assumed to be env vars set outside of this script.
  Set them in your main repo's workflows.
runs:
  using: 'composite'
  steps:
    - name: Clone shared-actions repo
      uses: actions/checkout@v4
      with:
        repository: ${{ env.SHARED_ACTIONS_REPO}}
        ref: ${{ env.SHARED_ACTIONS_REF}}
        path: ./shared-actions
    - name: Stash base env vars
      uses: ./shared-actions/_stash-base-env-vars
```
...
In this action, the "implementation action" is
`./shared-actions/_stash-base-env-vars`. Dispatch actions can have inputs;
pass them through to the implementation action explicitly.
Environment variables carry through from the parent workflow, through the
dispatch action, into the implementation action. In most cases, it is simpler
(though less explicit) to set environment variables instead of plumbing inputs
through each action.
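
As a sketch of passing an input through, the dispatch action below declares one input and forwards it unchanged. The input name `cert_concat` is borrowed from the test workflows in this repo; the action name and the `_write-summary` implementation path are hypothetical placeholders, not actions known to exist here.

```yaml
name: 'Example dispatch action with an input'
description: 'Clones shared-actions, then forwards its input to an implementation action.'
inputs:
  cert_concat:
    description: 'Semicolon-separated certificate material (illustrative).'
    required: false
    default: ''
runs:
  using: 'composite'
  steps:
    - name: Clone shared-actions repo
      uses: actions/checkout@v4
      with:
        # Both values come from env vars set in the calling workflow.
        repository: ${{ env.SHARED_ACTIONS_REPO }}
        ref: ${{ env.SHARED_ACTIONS_REF }}
        path: ./shared-actions
    - name: Call the implementation action
      # Hypothetical implementation path, used only for illustration.
      uses: ./shared-actions/_write-summary
      with:
        # Inputs do not propagate on their own; forward each one explicitly.
        cert_concat: ${{ inputs.cert_concat }}
```

Environment variables such as `OTEL_SERVICE_NAME` need no equivalent plumbing, which is the trade-off between simplicity and explicitness noted above.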

The set of stashed environment variables is hard-coded, not detected. If you want to
pass a different environment variable through, you need to add it to the implementation
stash action, such as `telemetry-impls/stash-base-env-vars/action.yml`. You do not need to
explicitly specify it on the loading side.
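
To illustrate what "hard-coded" means, here is a minimal sketch of a stash action. It is not the contents of the real `action.yml`: the file name `telemetry-env-vars`, the artifact name, and the variable `MY_NEW_VAR` are all assumptions made for this example.

```yaml
name: 'Stash base env vars (illustrative sketch)'
description: 'Writes a fixed list of env vars to a file and uploads it as an artifact.'
runs:
  using: 'composite'
  steps:
    - name: Write selected env vars to a file
      shell: bash
      run: |
        # The list is fixed: a variable that is not named here is not stashed.
        {
          echo "OTEL_SERVICE_NAME=${OTEL_SERVICE_NAME}"
          echo "OTEL_EXPORTER_OTLP_ENDPOINT=${OTEL_EXPORTER_OTLP_ENDPOINT}"
          echo "MY_NEW_VAR=${MY_NEW_VAR}"  # hypothetical addition
        } > telemetry-env-vars
    - name: Upload the file so later jobs can load it
      uses: actions/upload-artifact@v4
      with:
        name: telemetry-env-vars
        path: telemetry-env-vars
```

Under this assumed design, a loading action could download the artifact and append the whole file to `$GITHUB_ENV`, which would explain why adding a variable requires no change on the loading side.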

## Implementation action

These are similar to dispatch actions, except that they should not clone
shared-actions. They can depend on other actions from the shared-actions
repository using the `./shared-actions` relative path.
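
A minimal sketch of an implementation action might look like the following. The action name, the `_some-helper` sibling action, and the `cert_concat` handling are hypothetical and shown only to illustrate the shape.

```yaml
name: 'Example implementation action'
description: 'Does the actual work; assumes ./shared-actions is already checked out.'
inputs:
  cert_concat:
    description: 'Forwarded from the dispatch action (illustrative).'
    required: false
    default: ''
runs:
  using: 'composite'
  steps:
    # No checkout here: the dispatch action has already cloned the repo.
    - name: Reuse another implementation action
      # Hypothetical sibling action, referenced by relative path.
      uses: ./shared-actions/_some-helper
    - name: Do the work
      shell: bash
      env:
        # Pass the input through an env var rather than inlining it in the script.
        CERT_CONCAT: ${{ inputs.cert_concat }}
      run: |
        if [ -n "${CERT_CONCAT}" ]; then
          echo "certs provided"
        else
          echo "no certs provided"
        fi
```

Because the relative path resolves against whatever the dispatch action cloned, the sibling action comes from the same repo and ref as this one.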

## Example calling workflow

The key detail here is that the presence of the SHARED_ACTIONS_REPO and/or
SHARED_ACTIONS_REF environment variables is what changes the shared-actions
dispatch. The `uses` line should not change.

```yaml
env:
SHARED_ACTIONS_REF: 'main'
# Change these in PRs
SHARED_ACTIONS_REPO: some-fork/shared-actions
SHARED_ACTIONS_REF: some-custom-branch
jobs:
actions-user:
runs-on: ubuntu-latest
steps:
- name: Checkout actions
uses: actions/checkout@v4
with:
repository: rapidsai/shared-actions
ref: ${{env.SHARED_ACTIONS_REF}}
path: ./shared-actions
- name: run script
uses: ./shared-actions/some-script-folder-name
with:
blah: yes
```

Instead of something like:

```
- name: Telemetry setup
id: telemetry-setup
uses: rapidsai/shared-actions/telemetry-traceparent@add-telemetry
```

This latter syntax is difficult because the branch info does not cascade
recursively into any checkouts that might be done in an action, and also because
this syntax does not support actions calling other actions with relative paths.

Note that the cloning/checkout order matters! The actions/checkout action wipes
the destination before cloning into it. That means that if you clone the shared-
actions repo in a folder, then clone the main repo without a path, the shared-
actions folder will be removed when you go looking for it. See https://github.com/actions/checkout/issues/1525#issuecomment-2076363261
# DO NOT change this in PRs
uses: rapidsai/shared-actions/dispatch-script@main
```
52 changes: 0 additions & 52 deletions github-actions-job-info/example-gha-job-log.json

This file was deleted.
