Deploying a Github Sensor
This guide walks through the steps to install and configure a Github sensor deployment that collects metrics from Github and publishes them to the client tenant running in the cloud.
Pre-Requisites
Complete the steps in the Common Pre-requisites guide before deploying a sensor.
How it works
The Github sensor collects data from Github using the provided access token and then sends it to the configured stream. The sensor is triggered based on a specified cron-style schedule.
Github Token
To configure the Github sensor, the user needs to provide the sensor specification as a Kubernetes custom resource definition (CRD) and create a secret with the access token generated from Github. The user can then deploy the sensor on the Kubernetes cluster.
The secret that stores the token must have the following structure:
kind: Secret
apiVersion: v1
metadata:
  name: <secret-name>
  namespace: <sample_tenant>
data:
  token: <redacted base64 string of pat token>
The token value in the secret must be base64-encoded. Encode your actual Github access token and make sure the encoded value does not include a trailing newline.
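A minimal sketch of the encoding step, assuming a POSIX shell with the `base64` utility available. The function name `encode_token` is illustrative and not part of the sensor tooling. `printf` is used rather than `echo` so that no trailing newline ends up inside the encoded value.

```shell
# encode_token: hypothetical helper that base64-encodes a token string.
# printf '%s' writes the token without a trailing newline;
# tr strips the line wrapping that some base64 implementations add.
encode_token() {
  printf '%s' "$1" | base64 | tr -d '\n'
}

# Placeholder value; replace with your actual Github access token.
encode_token "your-github-access-token"
echo
```

The printed string is the value to place in the `token` field of the secret.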
Custom Host
By default, api.github.com is used as the host and uploads.github.com as the upload URL. To use your own host and upload URL, set the host and uploadHost fields under spec.data in the sensor specification.
Repo Groups
The Github sensor requires an array of repoGroups to be configured in order to collect data. The configuration for each repoGroup looks like this:
data:
  repoGroups:
    - owner: "owner"
      metrics:
        - "metric-name-1"
        - "metric-name-2"
      repos:
        - "*" # to get all repos or...
        - "repo-name-1"
        - "repo-name-2"
When the sensor is triggered, it retrieves the desired data for each repo in the array of repoGroups, generating a new signal for each specified metric. The retrieved data is then packaged as a signal and sent to the configured stream, where it can be analyzed and processed by the Fitness Functions.
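One way to read the paragraph above is that a repo group yields one signal per repo and metric pair. The sketch below only enumerates those pairs for illustrative values. `list_signals` is a hypothetical helper, and its output is not the sensor's actual signal format.

```shell
# list_signals OWNER "REPO..." "METRIC...": hypothetical helper that prints
# one line per (repo, metric) pair, mirroring one signal per pair.
list_signals() {
  owner="$1"
  for repo in $2; do
    for metric in $3; do
      echo "$owner/$repo $metric"
    done
  done
}

# Illustrative values: two repos and two metrics, so four lines are printed.
list_signals "sampleOrg" "repo1 repo2" "workflows dependabot"
```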
Metrics
There are various metrics that can be collected using Orcasio's Github sensor, and these can be specified in the metrics field.
By default, information about the project is included in all signals. The available fields are described at: https://docs.github.com/en/rest/repos#get-a-repository
The possible metrics are:
- Value: workflows
- Description: For each workflow, the sensor looks up the last 10 runs and includes the jobs for each run.
- Github docs:
- Sample signal:
{
  "kind": "FitnessSignal",
  "apiVersion": "fitness.orcasio.com/v1alpha3",
  "metadata": {
    "name": "orcas-signal-github-cron-1682003520899",
    "creationTimestamp": "2023-04-20T15:12:00Z"
  },
  "spec": {
    "sensor": "sensor://fitness.orcasio.net/github",
    "sensorId": "e1d486fd-ce2c-444f-8a8a-a1e1730ff63a",
    "source": "github",
    "tenant": "{tenant-name}",
    "group": "{org}/{repo}{endpoint-name}",
    "time": "2023-04-20T15:12:00Z",
    "tags": { "foo": "bar" },
    "data": {
      "metadata": { "creationTimestamp": null, "name": "github-cron" },
      "response": {
        "repo": { "foo": "bar" }, // Github's response
        "workflows": [
          { // Github's Workflow data
            "runs": [
              { // Github's Run data
                "jobs": [] // Github's Jobs data
              }
            ]
          }
        ]
      },
      "result": "success",
      "type": "project"
    }
  },
  "status": { "state": "", "reason": "", "time": null }
}
- Value: dependabot
- Description: Collects all the alerts generated by dependabot.
- Github docs:
- Sample signal:
{
  "kind": "FitnessSignal",
  "apiVersion": "fitness.orcasio.com/v1alpha3",
  "metadata": {
    "name": "orcas-signal-github-cron-1682003520899",
    "creationTimestamp": "2023-04-20T15:12:00Z"
  },
  "spec": {
    "sensor": "sensor://fitness.orcasio.net/github",
    "sensorId": "e1d486fd-ce2c-444f-8a8a-a1e1730ff63a",
    "source": "github",
    "tenant": "{tenant-name}",
    "group": "{org}/{repo}{endpoint-name}",
    "time": "2023-04-20T15:12:00Z",
    "tags": { "foo": "bar" },
    "data": {
      "metadata": { "creationTimestamp": null, "name": "github-cron" },
      "response": {
        "repo": { "foo": "bar" }, // Github's response
        "dependabot": {
          "alerts": [] // Github's response
        }
      },
      "result": "success",
      "type": "project"
    }
  },
  "status": { "state": "", "reason": "", "time": null }
}
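Before deploying, it can help to confirm that the token can read the data behind each metric. The sketch below assumes the metrics map to the public Github REST endpoints for workflows and Dependabot alerts. The sensor's actual requests are not documented here, and `github_url` and the `GITHUB_TOKEN` variable mentioned afterwards are illustrative names.

```shell
# github_url HOST OWNER REPO METRIC: hypothetical helper that builds the
# REST URL assumed to back each metric.
github_url() {
  case "$4" in
    workflows)  echo "https://$1/repos/$2/$3/actions/workflows" ;;
    dependabot) echo "https://$1/repos/$2/$3/dependabot/alerts" ;;
    *)          echo "unknown metric: $4" >&2; return 1 ;;
  esac
}

# Print the URL for one repo; owner and repo are illustrative.
github_url api.github.com sampleOrg repo1 workflows
```

The printed URL can then be queried with the token, for example `curl -H "Authorization: Bearer $GITHUB_TOKEN" -H "Accept: application/vnd.github+json" "$(github_url api.github.com sampleOrg repo1 dependabot)"`. A successful response suggests the token has the access that metric needs.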
Configuration
Copy the following YAML file and update every section marked TODO.
kind: FitnessSensor
apiVersion: fitness.orcasio.com/v1alpha3
metadata:
  ## TODO: name of the sensor
  name: <sensor-name>
  namespace: <tenant_ID>
spec:
  sensor: sensor://fitness.orcasio.net/github
  source: github
  enabled: true
  secret: <secrets_name>
  trigger:
    name: cron
    cron:
      ## TODO - every 5 minutes
      schedule: "0 0/5 * * * *"
  data:
    host: api.github.com
    uploadHost: uploads.github.com
    repoGroups:
      ## TODO: metrics and repos
      - owner: "owner"
        metrics:
          - "action-name1"
          - "action-name2"
        repos:
          - "repo1"
          - "repo2"
Github Sensor configuration
As the sample shows, a Github Fitness Sensor is configured through a Kubernetes CRD object. The Pulse Sensor Deployment picks up this specification and runs the sensor on the configured schedule, fetching the data and publishing it to the remote tenant as a stream over HTTP.
Sensor Specification
| Name | Description | Value | Notes |
|---|---|---|---|
| kind | Configuration object kind | FitnessSensor | (fixed - do not change) |
| apiVersion | Configuration object version | fitness.orcasio.com/v1alpha3 | (fixed - do not change) |
| metadata.name | Unique name of sensor | <unique string> | Ensure the name is unique and contains no spaces or special chars except for "-" |
| spec.sensor | Type of Fitness Sensor => github | sensor://fitness.orcasio.net/github | (fixed - do not change; selects the Github sensor) |
| spec.source | Source name identifying the System/Platform/Tool | <string> | ex: "sample_tenant-inventory" |
| spec.enabled | Boolean value specifying whether the sensor is enabled to collect data | <true or false> | specify true to enable the sensor |
| spec.tenant | Value to specify the Pulse Platform tenant | sample_tenant | (fixed - do not change) |
| spec.trigger.name | Cron trigger type | cron | (fixed - do not change) |
| spec.trigger.cron.schedule | Cron style schedule to enable timer | <cron style schedule> | ex: "0 0/5 * * * *" specifies a 5 min timer |
| spec.data.apiPath | Path prefix for API endpoint | "repos/" | (fixed - do not change) |
| spec.data.host | Github API host | <string> | default: "api.github.com" |
| spec.data.uploadHost | Github API upload host | <string> | default: "uploads.github.com" |
| spec.data.repoGroups.owner | Owner of the repos. Could be an org or a user | <string> | |
| spec.data.repoGroups.metrics | Array of metrics to query from repos | <array of strings> | ex: workflows, dependabot |
| spec.data.repoGroups.repos | Array of repos to query. Can be "*" to query all repos | <array of strings> | |
| spec.tags | Object with key/value pairs to specify static tags | <object> | ex: foo: bar, baz: qux |
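The schedules in this guide have six fields. Judging by the examples (`0 0/5 * * * *` is described as a 5 minute timer), the first field appears to be seconds, followed by minute, hour, day of month, month, and day of week. This ordering is inferred from the examples rather than stated outright. The hypothetical helper below labels the fields of a schedule under that assumption, which makes a misplaced field easier to spot.

```shell
# describe_cron "SCHEDULE": hypothetical helper that labels each field of a
# six-field schedule, assuming the order second .. day-of-week.
describe_cron() {
  set -f          # disable globbing so "*" stays literal while splitting
  set -- $1       # split the schedule on whitespace into $1..$6
  set +f
  if [ "$#" -ne 6 ]; then
    echo "expected 6 fields, got $#" >&2
    return 1
  fi
  echo "second=$1 minute=$2 hour=$3 day-of-month=$4 month=$5 day-of-week=$6"
}

describe_cron "0 0/5 * * * *"
```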
Deploy Custom Sensor Specification
Create the FitnessSensor object that defines the configuration for the Github sensor. This FitnessSensor object needs to replace the last section of the file below.
## TODO - CHANGE THE SAMPLE BELOW AS NEEDED
## CUSTOM SENSOR CONFIG
kind: FitnessSensor
apiVersion: fitness.orcasio.com/v1alpha3
metadata:
  name: turtles-all-the-way
  namespace: <tenant_ID>
spec:
  sensor: sensor://fitness.orcasio.net/github
  source: github
  ## TODO - secret name with token
  secret: <secret_name>
  enabled: true
  ## TODO - cron for every 5 minutes
  trigger:
    name: cron
    cron:
      schedule: "0 0/5 * * * *"
  data:
    ## TODO - change repo groups as needed
    host: api.github.com
    repoGroups:
      - owner: <owner>
        metrics:
          - <metric_name>
        repos:
          - <repo_name>
  tags:
    env: dev
    org: it
Once all other TODOs in the file are completed, the configuration can be applied to a Kubernetes cluster.
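Applying the file could look like the sketch below, assuming `kubectl` is installed and already pointed at the target cluster. The helper name `apply_manifest`, the `KUBECTL` override, and the file name used afterwards are illustrative.

```shell
# apply_manifest FILE: hypothetical helper that applies a manifest with kubectl.
# KUBECTL can be set to another command (a wrapper script, for instance).
apply_manifest() {
  if [ ! -f "$1" ]; then
    echo "manifest not found: $1" >&2
    return 1
  fi
  "${KUBECTL:-kubectl}" apply -f "$1"
}
```

For example, `apply_manifest github-sensor.yaml` would apply a completed sensor file saved under that name.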
Examples
- Description: This sensor generates a new signal with workflow data for each repo owned by "sampleOrg".
- Schedule: The sensor runs once a day.
- Sample sensor:
apiVersion: fitness.orcasio.com/v1alpha3
kind: FitnessSensor
spec:
  sensor: sensor://fitness.orcasio.net/github
  source: github
  tenant: sample_tenant
  secret: github
  tags:
    foo: bar
  trigger:
    cron:
      schedule: "0 0 0 * * *"
    name: cron
  enabled: true
  data:
    repoGroups:
      - owner: sampleOrg
        metrics:
          - workflows
        repos:
          - "*"
- Sample secret:
- Description: This sensor generates a new signal with workflow data, and another signal with dependabot data, for each of "repo1" and "repo2" owned by "sampleOrg".
- Schedule: The sensor runs twice a day.
- Sample sensor:
apiVersion: fitness.orcasio.com/v1alpha3
kind: FitnessSensor
spec:
  sensor: sensor://fitness.orcasio.net/github
  source: github
  tenant: sample_tenant
  secret: github
  tags:
    foo: bar
  trigger:
    cron:
      schedule: "0 0 0/12 * * *"
    name: cron
  enabled: true
  data:
    repoGroups:
      - owner: sampleOrg
        metrics:
          - workflows
          - dependabot
        repos:
          - "repo1"
          - "repo2"
- Sample secret:
```yaml
kind: Secret
apiVersion: v1
metadata:
  name: github
  namespace: orcas
data:
  token: encodedTokenInBase64
```