
GitHub Agentic Workflows: CI/CD Meets AI

What if your GitHub Actions workflows could think instead of just following orders? GitHub Agentic Workflows let you write automations in plain Markdown and have an AI coding agent — Copilot CLI, Claude Code, or OpenAI Codex — execute them inside GitHub Actions. No more wrestling with YAML for tasks that are easier to describe in words.

Availability: Currently in technical preview. Requires a GitHub Copilot plan that includes premium requests (Pro+, Business, or Enterprise). Each workflow run typically uses two premium requests.

Traditional vs. agentic — what's the difference?

A traditional workflow does exactly what you tell it, every time, in the same way. An agentic workflow uses AI to read your repo, understand the situation, and adapt. It interprets natural-language instructions flexibly rather than following a rigid script.

Imagine a workflow that triages new issues. A traditional one checks labels with if/then logic. An agentic one actually reads the issue body, understands what the user is asking, and decides how to respond — maybe requesting clarification, suggesting a label, or drafting a fix.

The two-file model

Every agentic workflow lives as a .md file with two parts: YAML frontmatter for configuration and a Markdown body for the AI's instructions.

---
on:
  issues:
    types: [opened]
permissions: read-all
safe-outputs:
  add-comment:
---

# Issue Clarifier
Analyze the current issue and ask for additional details
if it's unclear.

Running gh aw compile produces a companion .lock.yml — a hardened GitHub Actions workflow with security baked in. Both files get committed: the .md is your human-readable source of truth, and the .lock.yml is what Actions actually executes.
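As a sketch, the authoring loop looks like this. The gh aw compile command is the one named in this post; the extension install path shown here is an assumption based on the technical preview and may change:

```shell
# Install the Agentic Workflows CLI extension
# (assumed extension repo name; check the preview docs)
gh extension install githubnext/gh-aw

# Compile the Markdown workflows under .github/workflows/
# into their hardened .lock.yml counterparts
gh aw compile

# Commit both the source and the lock file together
git add .github/workflows/
git commit -m "Add issue clarifier agentic workflow"
```

Recompiling after every edit to the .md keeps the two files in sync; CI can reject commits where the .lock.yml is stale.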

When to use which

| | Traditional GitHub Actions | Agentic Workflows |
| --- | --- | --- |
| Authored in | YAML | Markdown |
| Logic type | Deterministic, fixed steps | AI-driven, context-aware |
| Write access | Configured per step | Read-only by default; writes via safe outputs only |
| Best for | Build, test, release pipelines | Triage, docs, reporting, code quality |

The key insight: agentic workflows augment CI/CD, they don't replace it. If a task can be expressed in words — "summarize this PR", "triage this issue", "update the README" — it's probably a good fit.

The "Continuous AI" pattern

GitHub calls this broader vision Continuous AI — the systematic, automated application of AI to software collaboration, just like CI/CD applies automation to builds and deploys. Here's what that looks like in practice:

  • Continuous triage — summarize, label, and route new issues automatically
  • Continuous documentation — keep READMEs aligned with code changes
  • Continuous simplification — open PRs for routine refactoring
  • Continuous test improvement — assess coverage gaps and add tests
  • Continuous quality hygiene — investigate CI failures and propose fixes
  • Continuous reporting — daily health, activity, and trend reports
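Continuous triage, the first pattern above, can be sketched as a workflow file in the same two-file shape shown earlier. The add-labels output and its max field are assumptions about the preview's safe-outputs schema, not a verified contract:

```markdown
---
on:
  issues:
    types: [opened]
permissions: read-all
safe-outputs:
  add-comment:
  add-labels:       # assumed output type; verify against the preview docs
    max: 3          # assumed cap on labels per run
---

# Issue Triage
Read the new issue, summarize it in one comment,
and suggest up to three existing labels that fit.
```

Because every write goes through safe outputs, the worst a misbehaving run can do is propose a comment or labels that the guardrail job then rejects.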

Security: three layers deep

Giving an AI agent write access to your repo sounds scary, so agentic workflows don't grant it directly. The security model has three layers:

| Layer | What it enforces |
| --- | --- |
| Substrate | Container isolation, network firewall, MCP server sandboxing |
| Configuration | Schema validation, SHA-pinned actions, security scanners |
| Plan | Integrity filtering, content sanitization, secret redaction, threat detection |

The standout mechanism is Safe Outputs. The agent never gets direct write access. Instead, any requested write — creating an issue, adding a comment, opening a PR — is buffered as an artifact. A separate threat-detection job with its own AI security prompt reviews the output before a minimal-permission job applies it. If anything looks wrong, nothing gets written. Period.
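Concretely, the buffered writes are plain data rather than API calls. As an illustrative sketch only (the real artifact format is an internal detail of the preview), each requested write might be recorded as a JSON line that the minimal-permission job later replays against the GitHub API:

```json
{"type": "add-comment", "body": "Could you share the exact error message and your OS version?"}
```

The threat-detection job can inspect entries like this as inert text, which is why a malicious instruction smuggled into the agent's output never executes anything by itself.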

Integrity filtering also controls what the agent can read. On public repos, content from unapproved authors is automatically filtered out, preventing prompt injection through crafted issues or comments.

MCP under the hood

Agents access GitHub and external services through the Model Context Protocol (MCP). The MCP Gateway spawns isolated containers per server, and individual tools must be explicitly allowlisted — anything not on the list is blocked. This means the agent can only interact with the specific services you've approved.
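In the workflow frontmatter, this allowlisting would be expressed as a tools section. The exact schema below is an assumption based on the technical preview, not a verified contract, but it illustrates the deny-by-default idea:

```yaml
tools:
  github:
    # Only these GitHub MCP tools are callable; everything else is blocked
    allowed: [get_issue, list_issue_comments]
```

Starting from an empty allowlist and adding tools one at a time keeps the agent's reach exactly as narrow as the task requires.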

Getting started (carefully)

A few practical tips if you want to try agentic workflows:

  • PRs are never merged automatically — humans always review and approve
  • Start small — begin with low-risk outputs like comments or draft issues before enabling PR creation
  • Treat the Markdown as code — review changes, keep scope small, evolve intentionally
  • Budget for two premium requests per run — one for the agent, one for the guardrail check

Documentation