GitHub API Integration
Overview
GitHub’s REST API provides comprehensive access to repository data, organization events, issues, pull requests, and more. The Edge Delta HTTP Pull source can efficiently retrieve this data using GitHub’s standard pagination and authentication mechanisms.
Authentication
GitHub requires authentication for most API endpoints. Use a personal access token (PAT) or GitHub App token stored in environment variables.
header_expressions:
  Authorization: Concat(["Bearer ", EDXEnv("GITHUB_TOKEN", "")], "")
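To make the expression above concrete, here is a rough Python equivalent of what it evaluates to. This is only a sketch: it assumes EDXEnv falls back to its second argument when the variable is unset and that Concat joins the list using the given separator. The helper name build_auth_header is hypothetical.

```python
import os


def build_auth_header(env_var: str = "GITHUB_TOKEN") -> dict[str, str]:
    """Approximate Concat(["Bearer ", EDXEnv("GITHUB_TOKEN", "")], "")."""
    # Assumption: an unset variable yields the empty-string default.
    token = os.environ.get(env_var, "")
    return {"Authorization": "Bearer " + token}


if __name__ == "__main__":
    # Prints the header that would be sent with the current environment.
    print(build_auth_header())
```

Under these assumptions, an unset variable produces a header with an empty token, which GitHub would reject, so it is worth confirming the variable is visible to the agent process.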
Basic GitHub Events Pull
Monitor organization events with dynamic time windows:
nodes:
  - name: github_events
    type: http_pull_input
    endpoint: https://api.github.com/orgs/edgedelta/events
    method: GET
    # Headers
    headers:
      - header: Accept
        value: application/vnd.github.v3+json
    # Dynamic authentication
    header_expressions:
      Authorization: Concat(["Bearer ", EDXEnv("GITHUB_TOKEN", "")], "")
    # Time-based filtering
    parameter_expressions:
      since: FormatTime(Now() - Duration("10m"), "%Y-%m-%dT%H:%M:%SZ")
      per_page: "100"
    pull_interval: 5m
This configuration:
- Polls GitHub every 5 minutes
- Retrieves events from the last 10 minutes (overlapping windows)
- Uses secure token authentication from environment variables
- Requests up to 100 events per page
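The overlapping window can be sketched in Python to show what value the since parameter takes on each poll. This is an illustration only: it assumes Now() returns UTC time, and the helper names since_param and overlap are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def since_param(now: datetime, lookback: timedelta) -> str:
    """Format (now - lookback) with the same layout as the config."""
    return (now - lookback).strftime("%Y-%m-%dT%H:%M:%SZ")


def overlap(lookback: timedelta, pull_interval: timedelta) -> timedelta:
    """Time covered by two consecutive polls (zero if windows do not overlap)."""
    return max(lookback - pull_interval, timedelta(0))


now = datetime(2024, 1, 1, 0, 10, 0, tzinfo=timezone.utc)
print(since_param(now, timedelta(minutes=10)))               # 2024-01-01T00:00:00Z
print(overlap(timedelta(minutes=10), timedelta(minutes=5)))  # 0:05:00
```

With a 10 minute lookback and a 5 minute interval, each poll re-reads the previous 5 minutes, so a late-arriving event is less likely to be missed, at the cost of seeing some events twice.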
Repository Data with Pagination
Retrieve all repositories from an organization using Link header pagination:
nodes:
  - name: github_repos
    type: http_pull_input
    endpoint: https://api.github.com/orgs/edgedelta/repos
    method: GET
    header_expressions:
      Authorization: Concat(["Bearer ", EDXEnv("GITHUB_TOKEN", "")], "")
    parameter_expressions:
      per_page: "100"
      sort: "updated"
      direction: "desc"
    # Automatic pagination
    pagination:
      link_relation: "next"
      max_parallel_requests: 3
    pull_interval: 1h
GitHub returns Link headers like:
Link: <https://api.github.com/organizations/123/repos?page=2>; rel="next",
<https://api.github.com/organizations/123/repos?page=10>; rel="last"
The agent automatically follows these links to retrieve all pages.
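For readers who want to see what following a Link header involves, the sketch below extracts the URL for each rel value from the example header above. It illustrates the general technique and is not a description of how the agent parses headers internally. The function name parse_link_header is hypothetical.

```python
import re

# Matches one <url>; rel="name" pair inside a Link header value.
_LINK_RE = re.compile(r'<([^>]+)>\s*;\s*rel="([^"]+)"')


def parse_link_header(value: str) -> dict[str, str]:
    """Map each rel value (for example "next") to its URL."""
    return {rel: url for url, rel in _LINK_RE.findall(value)}


header = (
    '<https://api.github.com/organizations/123/repos?page=2>; rel="next", '
    '<https://api.github.com/organizations/123/repos?page=10>; rel="last"'
)
links = parse_link_header(header)
print(links.get("next"))  # URL of the next page, or None on the last page
```

Pagination ends when the response no longer carries a rel="next" entry, which is why link_relation is set to "next" in the configuration.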
Pull Requests Monitoring
Monitor pull requests across repositories:
nodes:
  - name: github_pull_requests
    type: http_pull_input
    endpoint: https://api.github.com/repos/edgedelta/edgedelta/pulls
    method: GET
    header_expressions:
      Authorization: Concat(["Bearer ", EDXEnv("GITHUB_TOKEN", "")], "")
    parameters:
      - name: state
        value: "all"
    parameter_expressions:
      since: FormatTime(Now() - Duration("24h"), "%Y-%m-%dT%H:%M:%SZ")
      per_page: "50"
    pagination:
      link_relation: "next"
      max_parallel_requests: 2
    pull_interval: 15m
Issues and Comments
Track issues and their comments:
nodes:
  - name: github_issues
    type: http_pull_input
    endpoint: https://api.github.com/repos/edgedelta/edgedelta/issues
    method: GET
    header_expressions:
      Authorization: Concat(["Bearer ", EDXEnv("GITHUB_TOKEN", "")], "")
    parameters:
      - name: state
        value: "all"
      - name: sort
        value: "updated"
    parameter_expressions:
      since: FormatTime(Now() - Duration("1h"), "%Y-%m-%dT%H:%M:%SZ")
      per_page: "100"
    pagination:
      link_relation: "next"
    pull_interval: 10m
Commit Activity
Monitor repository commits:
nodes:
  - name: github_commits
    type: http_pull_input
    endpoint: https://api.github.com/repos/edgedelta/edgedelta/commits
    method: GET
    header_expressions:
      Authorization: Concat(["Bearer ", EDXEnv("GITHUB_TOKEN", "")], "")
    parameter_expressions:
      since: FormatTime(Now() - Duration("6h"), "%Y-%m-%dT%H:%M:%SZ")
      per_page: "100"
    pagination:
      link_relation: "next"
      max_parallel_requests: 2
    pull_interval: 30m
Workflow Runs (GitHub Actions)
Monitor CI/CD workflow executions:
nodes:
  - name: github_workflows
    type: http_pull_input
    endpoint: https://api.github.com/repos/edgedelta/edgedelta/actions/runs
    method: GET
    header_expressions:
      Authorization: Concat(["Bearer ", EDXEnv("GITHUB_TOKEN", "")], "")
    parameter_expressions:
      created: Concat([">", FormatTime(Now() - Duration("1h"), "%Y-%m-%dT%H:%M:%SZ")], "")
      per_page: "100"
    pagination:
      link_relation: "next"
    pull_interval: 5m
Release Monitoring
Track new releases and deployments:
nodes:
  - name: github_releases
    type: http_pull_input
    endpoint: https://api.github.com/repos/edgedelta/edgedelta/releases
    method: GET
    header_expressions:
      Authorization: Concat(["Bearer ", EDXEnv("GITHUB_TOKEN", "")], "")
    parameters:
      - name: per_page
        value: "30"
    pagination:
      link_relation: "next"
    pull_interval: 1h
Rate Limiting Considerations
The GitHub REST API enforces hourly rate limits:
- Authenticated requests: 5,000 per hour
- Unauthenticated: 60 per hour
Best practices:
- Always authenticate requests
- Use appropriate pull_interval values
- Limit max_parallel_requests for pagination
- Add rate limit handling:
retry_http_code:
  - 403 # Forbidden (often rate limit)
  - 429 # Too Many Requests
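To judge whether a pull_interval is appropriate, it can help to estimate how many requests a node issues per hour and compare the total with the 5,000 per hour authenticated limit. The sketch below assumes every fetched page counts as one request. The pages_per_pull figure is a hypothetical estimate you would supply for your own organization.

```python
import math

AUTHENTICATED_LIMIT = 5000  # requests per hour, from the limits listed above


def requests_per_hour(pull_interval_s: int, pages_per_pull: int) -> int:
    """Estimate hourly requests for one node: pulls per hour times pages per pull."""
    pulls = math.ceil(3600 / pull_interval_s)
    return pulls * pages_per_pull


# Hypothetical example: a 5 minute interval fetching about 3 pages per pull.
estimate = requests_per_hour(300, 3)
print(estimate, estimate <= AUTHENTICATED_LIMIT)
```

Note that max_parallel_requests does not change this total in the sketch. It only affects how quickly the pages of a single pull are requested. Summing the estimate across all nodes that share one token gives a rough picture of the remaining headroom.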
Environment Setup
Set up required environment variables:
# GitHub Personal Access Token
export GITHUB_TOKEN="ghp_xxxxxxxxxxxxxxxxxxxx"
# Optional: Custom API host for GitHub Enterprise
export GITHUB_API_HOST="api.github.company.com"
Common Parameters
Parameter | Description | Example |
---|---|---|
since | ISO 8601 timestamp | 2024-01-01T00:00:00Z |
per_page | Results per page (max 100) | 100 |
sort | Sort field | created, updated, pushed |
direction | Sort direction | asc, desc |
state | Filter by state | open, closed, all |
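As a rough illustration of how these parameters end up in a request, the following sketch assembles a query string with Python's standard library. The build_url helper is hypothetical, and the exact encoding the agent applies may differ.

```python
from urllib.parse import urlencode


def build_url(endpoint: str, params: dict[str, str]) -> str:
    """Append URL-encoded query parameters to an endpoint."""
    return endpoint + "?" + urlencode(params)


url = build_url(
    "https://api.github.com/repos/edgedelta/edgedelta/issues",
    {
        "state": "all",
        "sort": "updated",
        "since": "2024-01-01T00:00:00Z",
        "per_page": "100",
    },
)
print(url)
```

The colons in the timestamp are percent-encoded as %3A by urlencode, which is a valid encoding for a query string.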
Troubleshooting
401 Unauthorized:
- Verify the GITHUB_TOKEN environment variable is set
- Check token has required scopes for the endpoint
403 Forbidden:
- Often indicates rate limiting
- Check X-RateLimit-Remaining response header
- Increase pull_interval if hitting limits
422 Unprocessable Entity:
- Check timestamp format in since parameter
- Verify query parameters are valid for the endpoint
Incomplete data:
- Ensure pagination.link_relation is set to "next"
- Check max_parallel_requests isn’t too high for rate limits
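The status checks above can be summarized as a small decision function, which may be useful when inspecting responses by hand, for example from a curl trace. This is a sketch only: the category labels and the diagnose helper are hypothetical, and it assumes a 403 with X-RateLimit-Remaining equal to 0 means the rate limit was exhausted.

```python
def diagnose(status: int, headers: dict[str, str]) -> str:
    """Classify a GitHub API response using the troubleshooting rules above."""
    # Header names are case-insensitive, so normalize before lookup.
    lowered = {k.lower(): v for k, v in headers.items()}
    remaining = lowered.get("x-ratelimit-remaining")

    if 200 <= status < 300:
        return "ok"
    if status == 401:
        return "check_token"          # token missing or lacks scopes
    if status == 429 or (status == 403 and remaining == "0"):
        return "rate_limited"         # consider increasing pull_interval
    if status == 403:
        return "check_permissions"    # forbidden for another reason
    if status == 422:
        return "check_parameters"     # e.g. malformed since timestamp
    return "other"


print(diagnose(403, {"X-RateLimit-Remaining": "0"}))
```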