[Plan]: Stage 3 update gh-aw to use threat detection container #24

@davidslater

Contribution type

Feature request

What do you want to contribute?

Implement Stage 3 of the threat detection extraction: update github/gh-aw to consume the github/gh-aw-threat-detection container instead of running inline threat detection logic.

Stages 1 and 2 are complete in this repository. This issue tracks the parent-repository integration work only:

  • add default threat detection container registry/version constants in github/gh-aw
  • generate container-based detection jobs/steps
  • wire detection container version metadata into the lock/schema flow
  • add detection image pre-pull support where appropriate
  • remove or retire parent-repo inline detection code that has moved into this component
  • preserve frontmatter configuration and workflow orchestration behavior

Agent analysis and findings

Issue #4 described the original extraction plan and listed Stage 3 as the parent repository update. The current state is:

  • Stage 1 is complete: this repository contains the standalone CLI, Go packages, prompt template, spec, tests, and CI.
  • Stage 2 is complete: this repository contains a Dockerfile, container smoke test, GHCR-oriented release workflow, and promotion workflow.
  • Stage 3 remains: github/gh-aw still needs to switch from inline detection orchestration to a pinned container image reference.

Relevant parent-repository areas from the original issue:

  • pkg/workflow/threat_detection.go
    • currently owns detection config, job step generation, engine selection, and model env vars
  • pkg/workflow/compiler_jobs.go
    • creates the detection job in compiled workflow YAML
  • pkg/workflow/compiler_safe_outputs_job.go
    • conditions the safe output job on the detection result
  • pkg/workflow/safe_jobs.go and pkg/workflow/safe_jobs_needs_validation.go
    • handle needs: dependency wiring
  • pkg/workflow/copilot_engine_execution.go
    • injects model env vars for detection
  • pkg/workflow/artifacts.go and pkg/workflow/publish_assets.go
    • handle detection artifact upload/download paths
  • pkg/workflow/cache.go
    • integrates cache memory with detection
  • pkg/constants/job_constants.go
    • includes detection artifact constants
  • pkg/workflow/lock_schema.go
    • includes detection model metadata today
  • pkg/workflow/agentic_engine.go
    • includes detection model default behavior today
  • pkg/workflow/prompts/threat_detection.md
    • prompt template now owned by this repository
  • actions/setup/js/setup_threat_detection.cjs
  • actions/setup/js/parse_threat_detection_results.cjs
  • actions/setup/js/handle_detection_runs.cjs
    • JavaScript detection support that should be retired if fully replaced by the container contract

The desired parent behavior should mirror the firewall extraction pattern, using strictly pinned image constants such as:

const DefaultThreatDetectionRegistry = "ghcr.io/github/gh-aw-threat-detection"
const DefaultThreatDetectionVersion Version = "v1.0.0"

Use case and expected behavior

As a gh-aw maintainer, I want compiled workflows to use the independently released threat detection container so that detection prompts, parsing, and runtime behavior can be versioned and released independently from the main gh-aw CLI.

Expected behavior after implementation:

  • gh-aw pins the default threat detection image by registry and version.
  • Compiled workflows run threat detection through the container contract from this repository.
  • Existing frontmatter configuration remains supported where it is still orchestrator-owned.
  • Safe output jobs still depend on detection results exactly as before.
  • Detection container version information is represented in lock metadata.
  • Inline detection implementation details that moved into this repository are removed from github/gh-aw or clearly deprecated behind compatibility code.
  • Existing workflows and tests that depend on threat detection continue to pass after updating expected YAML output.

Complete step-by-step agentic plan

  1. In github/gh-aw, read the firewall integration pattern:

    • default firewall registry/version constants
    • compiled workflow container step generation
    • Docker image pre-pull collection
    • lock/schema version metadata
    • version bump or compatibility automation
  2. Read the current parent-repo detection implementation:

    • pkg/workflow/threat_detection.go
    • detection-related compiler jobs
    • detection-related safe output needs: wiring
    • detection artifact upload/download logic
    • lock schema detection model fields
    • detection-related JavaScript setup/parse scripts
  3. Add threat detection container constants in the appropriate constants/version location:

    • default registry: ghcr.io/github/gh-aw-threat-detection
    • default version: first released pinned version, likely v1.0.0 or the first available release tag
  4. Update compiled detection job generation:

    • replace inline AI engine invocation with a container invocation of threat-detect
    • mount or pass the artifacts directory according to the container contract
    • pass supported environment variables only when needed
    • preserve exit code behavior: 0 safe, 1 threat detected, 2 infrastructure/configuration error
  5. Preserve parent-owned orchestration behavior:

    • frontmatter parsing
    • whether a detection job is created
    • safe output job dependencies
    • custom pre/post step handling if still supported outside the container
    • artifact naming and upload/download behavior
  6. Update lock/schema metadata:

    • record the detection container registry/version or image reference
    • retire detection-model metadata if it no longer belongs in the parent lock file, or preserve it only if still required for compatibility
  7. Add detection image pre-pull support:

    • include the pinned threat detection image in Docker image collection/pre-pull code, mirroring firewall behavior
  8. Remove or retire duplicated parent-repo implementation:

    • delete the inline prompt template once no longer referenced
    • delete JS setup/parse scripts once no longer referenced
    • remove direct detection engine invocation code
    • remove GetDefaultDetectionModel() from the engine interface if no longer needed outside detection
  9. Update documentation in github/gh-aw:

    • reference the external detection component
    • document default registry/version behavior
    • document how operators update or pin the detection container version
    • document any remaining compatibility flag if one is introduced
  10. Update tests:

    • adjust golden workflow YAML tests for container-based detection
    • update detection job tests
    • update safe output dependency tests
    • update lock/schema tests
    • remove tests that only covered deleted inline parsing/prompt code
    • add tests for image constants and pre-pull collection
  11. Run parent-repo validation:

    • normal Go test suite
    • relevant workflow compilation tests
    • generated YAML/golden tests
    • any Docker/pre-pull tests

Specific implementation details and examples

Container contract to integrate:

  • Input: artifacts directory containing workflow prompt, agent output JSON, patch/bundle files, and optional comment memory.
  • Output JSON:
{
  "prompt_injection": false,
  "secret_leak": false,
  "malicious_patch": false,
  "reasons": []
}
  • Exit codes:
    • 0: safe
    • 1: threat detected
    • 2: infrastructure/configuration error
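
As a rough sketch of how the parent repository might consume this contract, the following decodes the output JSON and maps the three documented exit codes to a verdict. The type and function names (DetectionResult, Verdict, classifyExitCode, parseDetectionResult) are hypothetical, and treating any undocumented exit code as an error is an assumption, chosen so that unknown states fail closed.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// DetectionResult mirrors the output JSON shown in the container contract.
type DetectionResult struct {
	PromptInjection bool     `json:"prompt_injection"`
	SecretLeak      bool     `json:"secret_leak"`
	MaliciousPatch  bool     `json:"malicious_patch"`
	Reasons         []string `json:"reasons"`
}

// ThreatDetected reports whether any of the three threat flags is set.
func (r DetectionResult) ThreatDetected() bool {
	return r.PromptInjection || r.SecretLeak || r.MaliciousPatch
}

// Verdict is a hypothetical label for the three documented exit codes.
type Verdict string

const (
	VerdictSafe   Verdict = "safe"
	VerdictThreat Verdict = "threat"
	VerdictError  Verdict = "error"
)

// classifyExitCode maps 0, 1, and 2 to verdicts; any other code is reported
// as an error (an assumption, not part of the documented contract).
func classifyExitCode(code int) (Verdict, error) {
	switch code {
	case 0:
		return VerdictSafe, nil
	case 1:
		return VerdictThreat, nil
	case 2:
		return VerdictError, nil
	default:
		return VerdictError, fmt.Errorf("unexpected detection exit code %d", code)
	}
}

// parseDetectionResult decodes the container's output JSON.
func parseDetectionResult(data []byte) (DetectionResult, error) {
	var r DetectionResult
	if err := json.Unmarshal(data, &r); err != nil {
		return DetectionResult{}, fmt.Errorf("parse detection result: %w", err)
	}
	return r, nil
}

func main() {
	raw := []byte(`{"prompt_injection": false, "secret_leak": false, "malicious_patch": false, "reasons": []}`)
	result, err := parseDetectionResult(raw)
	if err != nil {
		panic(err)
	}
	verdict, err := classifyExitCode(0)
	if err != nil {
		panic(err)
	}
	fmt.Println(verdict, result.ThreatDetected())
}
```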

Suggested parent constants:

const DefaultThreatDetectionRegistry = "ghcr.io/github/gh-aw-threat-detection"
const DefaultThreatDetectionVersion Version = "v1.0.0"
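
To show how these constants could become the pinned image reference used in compiled workflows, here is a minimal sketch. It assumes Version is a string-based type, and the helper threatDetectionImage is hypothetical; rejecting an empty or "latest" tag is one possible reading of "strictly pinned", not confirmed behavior in github/gh-aw.

```go
package main

import (
	"fmt"
	"strings"
)

// Version is assumed here to be a plain string type.
type Version string

const DefaultThreatDetectionRegistry = "ghcr.io/github/gh-aw-threat-detection"
const DefaultThreatDetectionVersion Version = "v1.0.0"

// threatDetectionImage is a hypothetical helper that joins registry and
// version into "registry:version" and refuses unpinned or malformed tags.
func threatDetectionImage(registry string, version Version) (string, error) {
	if registry == "" {
		return "", fmt.Errorf("threat detection registry must not be empty")
	}
	tag := string(version)
	if tag == "" || tag == "latest" {
		return "", fmt.Errorf("threat detection version must be pinned, got %q", tag)
	}
	if strings.ContainsAny(tag, " :/@") {
		return "", fmt.Errorf("invalid threat detection version %q", tag)
	}
	return registry + ":" + tag, nil
}

func main() {
	ref, err := threatDetectionImage(DefaultThreatDetectionRegistry, DefaultThreatDetectionVersion)
	if err != nil {
		panic(err)
	}
	fmt.Println(ref) // prints ghcr.io/github/gh-aw-threat-detection:v1.0.0
}
```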

Relevant github/gh-aw files to inspect or update:

  • pkg/workflow/threat_detection.go
  • pkg/workflow/compiler_jobs.go
  • pkg/workflow/compiler_safe_outputs_job.go
  • pkg/workflow/safe_jobs.go
  • pkg/workflow/safe_jobs_needs_validation.go
  • pkg/workflow/copilot_engine_execution.go
  • pkg/workflow/artifacts.go
  • pkg/workflow/publish_assets.go
  • pkg/workflow/cache.go
  • pkg/workflow/lock_schema.go
  • pkg/workflow/agentic_engine.go
  • pkg/constants/job_constants.go
  • Docker image collection/pre-pull code that currently handles firewall images
  • detection documentation and reference docs

Acceptance criteria:

  • github/gh-aw compiles workflows that use the threat detection container image.
  • The compiled workflow pins the configured detection image version.
  • Existing safe output needs: behavior remains correct.
  • Lock metadata records the detection container version or image reference.
  • Docker image pre-pull includes the detection image where applicable.
  • Parent-repo inline prompt handling and result parsing are either removed or retained only behind documented compatibility behavior.
  • Existing threat-detection tests are updated and pass.
  • Documentation describes the new containerized integration.

Validation ideas:

  • Run all relevant github/gh-aw tests.
  • Compile representative workflows with threat detection enabled.
  • Confirm generated YAML invokes ghcr.io/github/gh-aw-threat-detection:<version>.
  • Confirm workflows with threat detection disabled do not include the container.
  • Confirm safe output jobs still depend on detection when detection is enabled.
  • Confirm lock metadata changes are deterministic.
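
Two of these checks, the pinned image appearing when detection is enabled and being absent when it is disabled, could be automated with a small helper like the sketch below. The function checkCompiledWorkflow and the YAML fragment in main are made up for illustration; real tests would run against actual compiler output.

```go
package main

import (
	"fmt"
	"strings"
)

// checkCompiledWorkflow verifies that compiled YAML references the pinned
// detection image exactly when detection is enabled. It uses plain substring
// matching, which is enough for a sketch but not a full YAML-aware check.
func checkCompiledWorkflow(yaml string, detectionEnabled bool, registry, version string) error {
	pinned := registry + ":" + version
	if detectionEnabled {
		if !strings.Contains(yaml, pinned) {
			return fmt.Errorf("expected pinned image %q in compiled workflow", pinned)
		}
		return nil
	}
	if strings.Contains(yaml, registry) {
		return fmt.Errorf("detection is disabled but workflow references %q", registry)
	}
	return nil
}

func main() {
	const registry = "ghcr.io/github/gh-aw-threat-detection"

	// Hypothetical compiled output, not real gh-aw YAML.
	enabledYAML := "jobs:\n  detection:\n    steps:\n      - run: docker run --rm " + registry + ":v1.0.0\n"
	disabledYAML := "jobs:\n  agent:\n    steps:\n      - run: echo hello\n"

	fmt.Println(checkCompiledWorkflow(enabledYAML, true, registry, "v1.0.0"))
	fmt.Println(checkCompiledWorkflow(disabledYAML, false, registry, "v1.0.0"))
}
```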

Suggested labels

  • task

Contributor checklist

  • I read CONTRIBUTING.md and understand that non-core contributors should not open pull requests directly.
  • I included a detailed agentic plan for the core team to review and implement.
  • I included agent analysis and findings, or explained why they are not applicable.
  • I included expected behavior, implementation details, and validation ideas.
