Deploying a Github Sensor

This guide walks through the steps to install and configure a Github sensor deployment that collects metrics from Github and publishes them to the client tenant running in the cloud.

Pre-Requisites

Complete the steps in the Common Pre-requisites guide before deploying a sensor.

How it works

The Github sensor collects data from Github using the provided access token and then sends it to the configured stream. The sensor is triggered based on a specified cron-style schedule.

Github Token

To configure the Github sensor, provide the sensor specification as a Kubernetes custom resource of kind FitnessSensor and create a secret that holds the access token generated from Github. The sensor can then be deployed on the Kubernetes cluster.

The secret that stores the token must have the following form:

kind: Secret
apiVersion: v1
metadata:
  name: <secret-name>
  namespace: <sample_tenant>
data:
  token: <redacted base64 string of pat token>

To ensure the token is properly encoded in base64, you can use the following command:

echo -n "your-github-access-token" | base64

Replace "your-github-access-token" with your actual Github access token.
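
The encoding step and the manifest can also be produced together. The following is a minimal Python sketch, not part of the sensor tooling: the helper name `build_token_secret` and the placeholder values are assumptions for illustration. It base64-encodes the token and prints a Secret manifest as JSON, which `kubectl apply -f` accepts as well as YAML.

```python
import base64
import json


def build_token_secret(name: str, namespace: str, token: str) -> dict:
    """Return a Secret manifest whose data.token holds the base64-encoded token."""
    # Kubernetes expects values under `data` to be base64-encoded strings.
    encoded = base64.b64encode(token.encode("utf-8")).decode("ascii")
    return {
        "kind": "Secret",
        "apiVersion": "v1",
        "metadata": {"name": name, "namespace": namespace},
        "data": {"token": encoded},
    }


if __name__ == "__main__":
    # Placeholder values; substitute your own secret name, tenant and token.
    secret = build_token_secret("github", "sample_tenant", "your-github-access-token")
    print(json.dumps(secret, indent=2))
```

Generating the manifest this way avoids the common mistake of encoding a trailing newline along with the token.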

Custom Host

By default, api.github.com is used as the host and uploads.github.com as the upload host. To use your own host and upload host, specify them in the data section:

data:
  host: "personal-host"
  uploadHost: "personal-uploadHost"

Repo Groups

The Github sensor requires an array of repoGroups to be configured in order to collect data. The configuration for each repoGroup looks like this:

data:
  repoGroups:
    - owner: "owner"
      metrics:
        - "metric-name-1"
        - "metric-name-2"
      repos:
        - "*" # to get all repos or...
        - "repo-name-1"
        - "repo-name-2"

When the sensor is triggered, it retrieves the desired data for each repo in the array of repoGroups, generating a new signal for each specified metric. The retrieved data is then packaged as a signal and sent to the configured stream, where it can be analyzed and processed by the Fitness Functions.
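
The fan-out described above can be pictured as follows. This is a minimal Python sketch of the expansion only, assuming one signal per repo and metric. The function `expand_repo_groups` and the `list_repos` callback (standing in for a lookup of all repos of an owner) are hypothetical and are not the sensor's actual implementation.

```python
from typing import Callable, Iterable, Iterator, Tuple


def expand_repo_groups(
    repo_groups: Iterable[dict],
    list_repos: Callable[[str], Iterable[str]],
) -> Iterator[Tuple[str, str, str]]:
    """Yield one (owner, repo, metric) tuple per signal to be generated."""
    for group in repo_groups:
        owner = group["owner"]
        repos = list(group["repos"])
        # "*" means every repo of the owner; resolve it through the callback.
        if "*" in repos:
            repos = list(list_repos(owner))
        for repo in repos:
            for metric in group["metrics"]:
                yield owner, repo, metric


if __name__ == "__main__":
    groups = [
        {
            "owner": "sampleOrg",
            "metrics": ["workflows", "dependabot"],
            "repos": ["repo1", "repo2"],
        }
    ]
    # The callback is unused here because no group contains "*".
    for item in expand_repo_groups(groups, lambda owner: []):
        print(item)
```

With two repos and two metrics, the sketch yields four tuples, one per signal.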

Metrics

There are various metrics that can be collected using Orcasio's Github sensor, and these can be specified in the metrics field.

By default, information about the project is included in all signals. The available fields are documented at: https://docs.github.com/en/rest/repos#get-a-repository

The possible metrics are:

  • Value: workflows
  • Description: For each workflow, it looks up the last 10 runs and includes the jobs for each run.
  • Github docs:
    1. Workflows
    2. Runs
    3. Jobs
  • Sample signal:
    {
      "kind": "FitnessSignal",
      "apiVersion": "fitness.orcasio.com/v1alpha3",
      "metadata": {
        "name": "orcas-signal-github-cron-1682003520899",
        "creationTimestamp": "2023-04-20T15:12:00Z"
      },
      "spec": {
        "sensor": "sensor://fitness.orcasio.net/github",
        "sensorId": "e1d486fd-ce2c-444f-8a8a-a1e1730ff63a",
        "source": "github",
        "tenant": "{tenant-name}",
        "group": "{org}/{repo}{endpoint-name}",
        "time": "2023-04-20T15:12:00Z",
        "tags": { "foo": "bar" },
        "data": {
          "metadata": { "creationTimestamp": null, "name": "github-cron" },
          "response": {
            "repo": { "foo": "bar" }, // Github's response
            "workflows": [
              {
                // Github's Workflow data
                "runs": [
                  {
                    // Github's Run data
                    "jobs": [] // Github's Jobs data
                  }
                ]
              }
            ]
          },
          "result": "success",
          "type": "project"
        }
      },
      "status": { "state": "", "reason": "", "time": null }
    }
    

  • Value: dependabot
  • Description: Retrieves all the alerts generated by Dependabot.
  • Github docs:
    1. Dependabot
  • Sample signal:
    {
      "kind": "FitnessSignal",
      "apiVersion": "fitness.orcasio.com/v1alpha3",
      "metadata": {
        "name": "orcas-signal-github-cron-1682003520899",
        "creationTimestamp": "2023-04-20T15:12:00Z"
      },
      "spec": {
        "sensor": "sensor://fitness.orcasio.net/github",
        "sensorId": "e1d486fd-ce2c-444f-8a8a-a1e1730ff63a",
        "source": "github",
        "tenant": "{tenant-name}",
        "group": "{org}/{repo}{endpoint-name}",
        "time": "2023-04-20T15:12:00Z",
        "tags": { "foo": "bar" },
        "data": {
          "metadata": { "creationTimestamp": null, "name": "github-cron" },
          "response": {
            "repo": { "foo": "bar" }, // Github's response
            "dependabot": {
              "alerts": [] // Github's response
            }
          },
          "result": "success",
          "type": "project"
        }
      },
      "status": { "state": "", "reason": "", "time": null }
    }
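
A consumer of these signals can navigate the nesting shown in the samples. The sketch below is a hypothetical Python helper (`summarize_signal` is not part of the platform) that counts workflows, runs, jobs and Dependabot alerts using only the keys visible in the sample signals. The `//` comments in the samples are annotations; a real JSON payload would not contain them.

```python
def summarize_signal(signal: dict) -> dict:
    """Count the nested items found under spec.data.response of a signal."""
    response = signal["spec"]["data"]["response"]
    workflows = response.get("workflows", [])
    # Flatten runs across workflows, then jobs across runs.
    runs = [run for wf in workflows for run in wf.get("runs", [])]
    jobs = [job for run in runs for job in run.get("jobs", [])]
    alerts = response.get("dependabot", {}).get("alerts", [])
    return {
        "workflows": len(workflows),
        "runs": len(runs),
        "jobs": len(jobs),
        "alerts": len(alerts),
    }


if __name__ == "__main__":
    # Trimmed-down signal with the same shape as the workflows sample.
    sample = {
        "spec": {
            "data": {
                "response": {
                    "repo": {"foo": "bar"},
                    "workflows": [{"runs": [{"jobs": [{}, {}]}, {"jobs": []}]}],
                }
            }
        }
    }
    print(summarize_signal(sample))
```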
    

Configuration

Copy the following YAML file and update every section marked TODO.

kind: FitnessSensor
apiVersion: fitness.orcasio.com/v1alpha3
metadata:
  ## TODO: name of the sensor
  name: <sensor-name>
  namespace: <tenant_ID>
spec:
  sensor: sensor://fitness.orcasio.net/github
  source: github
  enabled: true
  secret: <secrets_name>    
  trigger:
    name: cron
    cron:
      ## TODO - every 5 minutes
      schedule: "0 0/5 * * * *"
  data:
    host: api.github.com
    uploadHost: uploads.github.com
    repoGroups:
      ## TODO: metrics and repos
      - owner: "owner"
        metrics:
          - "metric-name-1"
          - "metric-name-2"
        repos:
          - "repo1"
          - "repo2"

Github Sensor configuration

As shown in the sample above, a Github Fitness Sensor is configured with a Kubernetes custom resource. The Pulse Sensor Deployment picks up this specification and runs the sensor on the configured schedule to fetch the data and publish it to the remote tenant as a stream over HTTP.
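
The schedules in this guide use six fields. Assuming the common seconds-first layout (second, minute, hour, day of month, month, day of week), which is inferred from the examples here rather than stated by the platform, the following Python sketch expands a single field to show when a schedule fires. `expand_field` is a hypothetical helper that handles only `*`, plain numbers and `start/step` forms.

```python
def expand_field(field: str, low: int, high: int) -> list:
    """Expand one cron field ("*", "N" or "start/step") into matching values."""
    if field == "*":
        return list(range(low, high + 1))
    if "/" in field:
        start_text, step_text = field.split("/", 1)
        start = low if start_text == "*" else int(start_text)
        return list(range(start, high + 1, int(step_text)))
    return [int(field)]


if __name__ == "__main__":
    # "0 0/5 * * * *": second 0 of every fifth minute.
    second, minute, hour = "0 0/5 * * * *".split()[:3]
    print(expand_field(minute, 0, 59))
    # "0 0 0/12 * * *": the hour field matches hours 0 and 12.
    print(expand_field("0/12", 0, 23))
```

Under the same assumption, a schedule of `0 * * * * *` would match every minute, so the step in the minute field is what sets the polling interval.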

Sensor Specification

| Name | Description | Value | Notes |
|---|---|---|---|
| kind | Configuration object kind | FitnessSensor | Fixed, do not change |
| apiVersion | Configuration object version | fitness.orcasio.com/v1alpha3 | Fixed, do not change |
| metadata.name | Unique name of sensor | <unique string> | Ensure the name is unique and contains no spaces or special chars except for "-" |
| spec.sensor | Type of Fitness Sensor => github | sensor://fitness.orcasio.net/github | Fixed, do not change when using the Github sensor |
| spec.source | Source name to identify the System/Platform/Tool | <string> | ex: "sample_tenant-inventory" |
| spec.enabled | Boolean value to specify whether the sensor is enabled to collect data | <true or false> | Specify true to enable the sensor |
| spec.tenant | Value to specify the Pulse Platform tenant | sample_tenant | Fixed, do not change |
| spec.trigger.name | Cron trigger type | cron | Fixed, do not change |
| spec.trigger.cron.schedule | Cron style schedule to enable timer | <cron style schedule> | ex: "0 0/5 * * * *" specifies a 5 min timer |
| spec.data.apiPath | Path prefix for API endpoint | "repos/" | Fixed, do not change |
| spec.data.host | Github API host | <string> | default: "api.github.com" |
| spec.data.uploadHost | Github API upload host | <string> | default: "uploads.github.com" |
| spec.data.repoGroups.owner | Owner of the repos. Could be an org or a user | <string> | |
| spec.data.repoGroups.metrics | Array of metrics to query from repos | <string> | |
| spec.data.repoGroups.repos | Array of repos to query. Can be "*" to query all repos | <string> | |
| spec.tags | Object with key/value pairs to specify static tags | Object Array | ex: foo:bar baz:qux |
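
Several of the rules in this specification lend themselves to a quick check before a sensor is applied. The following Python sketch is illustrative only: `validate_sensor` is a hypothetical helper, the name pattern is one reading of "no spaces or special chars except for '-'", and the six-field schedule check is an assumption based on the examples in this guide.

```python
import re

# Values the specification marks as fixed.
FIXED_VALUES = {
    ("kind",): "FitnessSensor",
    ("apiVersion",): "fitness.orcasio.com/v1alpha3",
    ("spec", "sensor"): "sensor://fitness.orcasio.net/github",
    ("spec", "trigger", "name"): "cron",
}
NAME_PATTERN = re.compile(r"^[A-Za-z0-9-]+$")


def lookup(obj, path):
    """Follow a key path through nested dicts, returning None if absent."""
    for key in path:
        if not isinstance(obj, dict) or key not in obj:
            return None
        obj = obj[key]
    return obj


def validate_sensor(sensor: dict) -> list:
    """Return a list of human-readable problems; empty means none were found."""
    problems = []
    for path, expected in FIXED_VALUES.items():
        if lookup(sensor, path) != expected:
            problems.append(f"{'.'.join(path)} must be {expected!r}")
    name = lookup(sensor, ("metadata", "name")) or ""
    if not NAME_PATTERN.match(name):
        problems.append("metadata.name may contain only letters, digits and '-'")
    schedule = lookup(sensor, ("spec", "trigger", "cron", "schedule")) or ""
    if len(schedule.split()) != 6:
        problems.append("spec.trigger.cron.schedule should have six fields")
    for group in lookup(sensor, ("spec", "data", "repoGroups")) or []:
        for key in ("owner", "metrics", "repos"):
            if not group.get(key):
                problems.append(f"repoGroups entry is missing {key!r}")
    return problems
```

An empty list means none of these checks failed; it does not guarantee that the platform accepts the specification.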

Deploy Custom Sensor Specification

Create the FitnessSensor object that defines the configuration for the Github sensor. The FitnessSensor object replaces the last section of the file below:

## TODO - CHANGE THE SAMPLE BELOW AS NEEDED
## CUSTOM SENSOR CONFIG
kind: FitnessSensor
apiVersion: fitness.orcasio.com/v1alpha3
metadata:
  name: turtles-all-the-way
  namespace: <tenant_ID>
spec:
  sensor: sensor://fitness.orcasio.net/github
  source: github
  ## TODO - secret name with token
  secret: <secret_name>
  enabled: true
  ## TODO - cron for every 5 minutes
  trigger:
    name: cron
    cron:
      schedule: "0 0/5 * * * *"
  data:
  ## TODO - change repo groups as needed
    host: api.github.com
    repoGroups:
      - owner: <owner>
        metrics:
          - <metric_name>
        repos:
          - <repo_name>
  tags:
    env: dev
    org: it

Once all the TODOs in the file are completed, the configuration can be applied to a Kubernetes cluster:

kubectl -n <tenant-id> apply -f <filename.yaml>

Examples

  • Description: This sensor generates a new signal with workflow data for each repo owned by "sampleOrg".
  • Schedule: The sensor runs once a day.
  • Sample sensor:
apiVersion: fitness.orcasio.com/v1alpha3
kind: FitnessSensor
spec:
  sensor: sensor://fitness.orcasio.net/github
  source: github
  tenant: sample_tenant
  secret: github
  tags:
    foo: bar
  trigger:
    cron:
      schedule: 0 0 0 * * *
    name: cron
  enabled: true
  data:
    repoGroups:
      - owner: sampleOrg
        metrics:
          - workflows
        repos:
          - "*"
  • Sample secret:
kind: Secret
apiVersion: v1
metadata:
  name: github
  namespace: orcas
data:
  token: encodedTokenInBase64
  • Description: For each of "repo1" and "repo2" owned by "sampleOrg", this sensor generates one signal with workflow data and another signal with dependabot data.
  • Schedule: The sensor runs twice a day.
  • Sample sensor:
apiVersion: fitness.orcasio.com/v1alpha3
kind: FitnessSensor
spec:
  sensor: sensor://fitness.orcasio.net/github
  source: github
  tenant: sample_tenant
  secret: github
  tags:
    foo: bar
  trigger:
    cron:
      schedule: 0 0 0/12 * * *
    name: cron
  enabled: true
  data:
    repoGroups:
      - owner: sampleOrg
        metrics:
          - workflows
          - dependabot
        repos:
          - "repo1"
          - "repo2"
  • Sample secret:

kind: Secret
apiVersion: v1
metadata:
  name: github
  namespace: orcas
data:
  token: encodedTokenInBase64