
Analyze Workflow

Deep LLM-powered analysis of backlog issues using agent tools for investigation.

Steps

1. Load Configuration

Read backlog-triage.toml from the project directory for version info, component mappings, and file paths.

2. Load Backlog Data

python3 -c "import json; d=json.load(open('srvkp-backlog-full.json')); print(f'{len(d)} issues')"

3. Load or Initialize Analysis Checkpoint

Check if srvkp-backlog-llm-analysis.json exists. If so, load already-analyzed issue keys to skip them (resume support).
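The resume check could be sketched as below. This is a minimal illustration, assuming the checkpoint uses the file format shown in step 5 (a top-level "issues" list whose entries carry a "key") and that each backlog entry also has a "key" field. The helper names load_analyzed_keys and pending_issues are hypothetical.

```python
import json
from pathlib import Path


def load_analyzed_keys(checkpoint_path):
    """Return the set of issue keys already present in the checkpoint file."""
    path = Path(checkpoint_path)
    if not path.exists():
        return set()  # first run: nothing has been analyzed yet
    with path.open(encoding="utf-8") as fh:
        data = json.load(fh)
    return {item["key"] for item in data.get("issues", [])}


def pending_issues(backlog, analyzed_keys):
    """Keep only the backlog issues whose key is not in the checkpoint."""
    # Assumption: each backlog entry is a dict with a "key" field.
    return [issue for issue in backlog if issue["key"] not in analyzed_keys]
```

Calling load_analyzed_keys("srvkp-backlog-llm-analysis.json") on a fresh project returns an empty set, so the same code path covers both the first run and a resumed run.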

4. Process Issues

Investigate each unanalyzed issue. Work through the backlog in groups of ~20, but investigate and classify every issue individually.

4a. Read the Issue

Read the issue’s summary, description, comments, labels, components, priority, and age.

4b. Quick Classification (no tools needed)

Some issues can be classified immediately without upstream investigation:

  • CVE for EOL version (e.g., [pipelines-1.14] in title) → CLOSE, high confidence
  • Test/QA task for EOL version → CLOSE
  • No description, no comments, 1+ year old → REVIEW_TO_CLOSE
  • Blocker/Critical with recent activity → HIGH_PRIORITY (keep open)
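The rules above might be expressed as a small pre-filter that runs before any upstream search. This is a sketch under several assumptions, none of which come from the project: the issue is a dict with hypothetical fields summary, description, comments, issuetype, priority, created and updated (the last two already parsed into datetime objects), "recent activity" means the last 30 days, and the oldest supported version is 1.19 as in the subagent template example.

```python
import re
from datetime import datetime, timedelta, timezone

OLDEST_SUPPORTED = (1, 19)     # assumption: taken from the template example
RECENT = timedelta(days=30)    # assumption: what counts as "recent activity"
STALE = timedelta(days=365)    # "1+ year old"
VERSION_TAG = re.compile(r"\[pipelines-(\d+)\.(\d+)")


def targets_eol_version(summary):
    """True if the title has a [pipelines-X.Y] tag older than the oldest supported version."""
    match = VERSION_TAG.search(summary)
    if not match:
        return False
    return (int(match.group(1)), int(match.group(2))) < OLDEST_SUPPORTED


def quick_classify(issue, now):
    """Return a recommendation string, or None when upstream investigation is needed."""
    summary = issue.get("summary", "")
    if targets_eol_version(summary):
        if "CVE-" in summary:
            return "CLOSE"  # the rule above rates this one as high confidence
        if issue.get("issuetype") in {"Test", "QA"}:  # hypothetical type names
            return "CLOSE"
    if (not issue.get("description") and not issue.get("comments")
            and now - issue["created"] >= STALE):
        return "REVIEW_TO_CLOSE"
    if (issue.get("priority") in {"Blocker", "Critical"}
            and now - issue["updated"] <= RECENT):
        return "HIGH_PRIORITY"
    return None
```

Returning None for everything else keeps the pre-filter conservative: any issue it cannot place falls through to the investigation in 4c.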

4c. Investigate Upstream (when needed)

For issues that need investigation, use the agent’s tools:

Search for related upstream work:

# Search commits in the relevant repo
git -C ~/src/tektoncd/{repo} log --oneline --all --grep="{keyword}" | head -20

# Search merged PRs on GitHub
gh pr list -R tektoncd/{repo} --state merged --search "{keyword}" --limit 10 --json number,title,mergedAt,url

# Search closed issues on GitHub
gh issue list -R tektoncd/{repo} --state closed --search "{keyword}" --limit 10 --json number,title,closedAt,url

# Check if a specific feature/TEP was implemented
gh pr list -R tektoncd/{repo} --state merged --search "TEP-{number}" --json number,title,mergedAt

# Read a specific PR for details
gh pr view {number} -R tektoncd/{repo} --json title,body,mergedAt,files

Choose keywords intelligently:

  • Extract meaningful terms from summary and description
  • Use component-specific terms (e.g., “cancel-in-progress”, “remote resolution”)
  • Search for TEP numbers, feature names, error messages
  • Check for related config flag names or API fields
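As a rough starting point, keyword candidates can be pulled out mechanically before the agent refines them. The sketch below is a simple heuristic, not the workflow's actual method: it ranks TEP references first, then hyphenated terms such as "cancel-in-progress", then longer words from the summary. The function name, the stopword list and the limit of five terms are all assumptions.

```python
import re

# Assumption: a small illustrative stopword list, not an exhaustive one.
STOPWORDS = {"with", "when", "should", "does", "from", "that", "this", "support"}


def candidate_keywords(summary, description="", limit=5):
    """Return up to `limit` search terms, most specific first."""
    text = f"{summary} {description}"
    # TEP references such as "TEP-0123" are the most specific search terms.
    teps = [t.upper() for t in re.findall(r"\bTEP-\d+\b", text, flags=re.IGNORECASE)]
    # Hyphenated terms often name a feature or a config flag.
    hyphenated = re.findall(r"\b[a-z]+(?:-[a-z]+)+\b", text.lower())
    # Fall back to longer words from the summary only.
    words = [w for w in re.findall(r"[a-z]{4,}", summary.lower()) if w not in STOPWORDS]

    seen, result = set(), []
    for term in teps + hyphenated + words:
        if term not in seen:
            seen.add(term)
            result.append(term)
    return result[:limit]
```

Error messages and API field names still need judgment from the agent; this only narrows down where to look first.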

Determine the right repo to search:

  • Use the Jira component → repo mapping from config
  • If no component, infer from summary keywords
  • For cross-cutting issues, check multiple repos
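The repo lookup above can be sketched as a two-stage fallback. The layout of the mapping in backlog-triage.toml is not shown here, so this example takes the mapping as a plain dict. The function name resolve_repos, the keyword_hints argument and the sample component names in the usage note are hypothetical.

```python
def resolve_repos(issue, component_map, keyword_hints):
    """Return the repos to search, preferring the Jira component mapping."""
    repos = []
    # Stage 1: map every Jira component; cross-cutting issues yield several repos.
    for component in issue.get("components", []):
        repo = component_map.get(component)
        if repo and repo not in repos:
            repos.append(repo)
    if repos:
        return repos
    # Stage 2: no usable component, so infer from keywords in the summary.
    summary = issue.get("summary", "").lower()
    for keyword, repo in keyword_hints.items():
        if keyword in summary and repo not in repos:
            repos.append(repo)
    return repos
```

For example, with component_map = {"Tekton Triggers": "triggers"} and keyword_hints = {"eventlistener": "triggers"}, an issue with no component whose summary mentions an EventListener resolves to ["triggers"]. An empty result signals that the agent has to choose the repo by reading the issue.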

4d. Classify

Based on investigation, assign:

  • recommendation: CLOSE, REVIEW_TO_CLOSE, NEEDS_TRIAGE, KEEP, HIGH_PRIORITY
  • relevance_score: 0-100
  • confidence: high, medium, low
  • reason: 2-4 sentences with specific evidence
  • tags: any that apply from eol-version, addressed-upstream, stale, customer-impact, active-work, security, blocked, no-description, duplicate, superseded, upstream-open
  • upstream_evidence: specific PR/issue/commit reference, or null
  • suggested_comment: what to post on the Jira issue if closing, or null
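Because subagents return free-form JSON, it may help to check each record against the field list above before merging it. The following is a minimal validation sketch. The allowed values are copied from the list, while the function name and the error wording are assumptions.

```python
RECOMMENDATIONS = {"CLOSE", "REVIEW_TO_CLOSE", "NEEDS_TRIAGE", "KEEP", "HIGH_PRIORITY"}
CONFIDENCES = {"high", "medium", "low"}
TAGS = {
    "eol-version", "addressed-upstream", "stale", "customer-impact", "active-work",
    "security", "blocked", "no-description", "duplicate", "superseded", "upstream-open",
}


def validate_analysis(record):
    """Return a list of problems; an empty list means the record is well-formed."""
    problems = []
    if record.get("recommendation") not in RECOMMENDATIONS:
        problems.append("recommendation is not one of the allowed values")
    score = record.get("relevance_score")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(score, bool) or not isinstance(score, int) or not 0 <= score <= 100:
        problems.append("relevance_score must be an integer from 0 to 100")
    if record.get("confidence") not in CONFIDENCES:
        problems.append("confidence must be high, medium, or low")
    reason = record.get("reason")
    if not isinstance(reason, str) or not reason.strip():
        problems.append("reason must be non-empty text")
    unknown = set(record.get("tags") or []) - TAGS
    if unknown:
        problems.append(f"unknown tags: {sorted(unknown)}")
    for field in ("upstream_evidence", "suggested_comment"):
        if field not in record:
            problems.append(f"{field} must be present (null when empty)")
    return problems
```

Records that fail validation are good candidates for the per-issue retry rather than for silent correction.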

5. Save Checkpoint

After each group of ~20 issues, save the analysis JSON:

# The analysis file format:
{
  "generated": "2026-04-16T...",
  "model": "agent",
  "total": 1572,
  "issues": [
    {
      "key": "SRVKP-1234",
      "recommendation": "CLOSE",
      "relevance_score": 10,
      "confidence": "high",
      "reason": "This CVE tracks a vulnerability in pipelines-1.14 which is EOL...",
      "tags": ["eol-version", "security"],
      "upstream_evidence": null,
      "suggested_comment": "Closing: pipelines-1.14 is EOL. If this affects a supported version, please file a new issue."
    }
  ]
}
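Since the checkpoint is what makes resuming possible, it is worth writing it so that an interrupted run cannot leave a half-written file. One way to do that, sketched here, is to write to a temporary file in the same directory and then rename it over the target. The function name save_checkpoint is hypothetical, and treating "total" as the size of the full backlog is an assumption.

```python
import json
import os
import tempfile
from datetime import datetime, timezone


def save_checkpoint(path, analyzed, total, model="agent"):
    """Write the analysis file atomically in the format shown above."""
    payload = {
        "generated": datetime.now(timezone.utc).isoformat(),
        "model": model,
        "total": total,        # assumption: number of issues in the full backlog
        "issues": analyzed,
    }
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must be on the same filesystem for os.replace to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(payload, fh, indent=2)
        os.replace(tmp_path, path)
    except BaseException:
        os.unlink(tmp_path)  # do not leave a stray temp file behind
        raise
```

A crash in the middle of a save then leaves the previous checkpoint untouched, so at most one group of issues has to be redone.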

6. Summary

After processing all issues, print a summary:

  • Total issues analyzed
  • Breakdown by recommendation
  • Breakdown by confidence
  • Number with upstream evidence found
  • Time taken
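The summary can be computed directly from the records in the analysis file. Below is a short sketch. The function name and the keys of the returned dict are assumptions, and the elapsed time is expected to be measured by the caller.

```python
from collections import Counter


def summarize(analyses, elapsed_seconds):
    """Aggregate analysis records into the summary figures listed above."""
    return {
        "total": len(analyses),
        "by_recommendation": dict(Counter(a["recommendation"] for a in analyses)),
        "by_confidence": dict(Counter(a["confidence"] for a in analyses)),
        # A null or empty upstream_evidence field counts as "no evidence found".
        "with_upstream_evidence": sum(1 for a in analyses if a.get("upstream_evidence")),
        "elapsed_seconds": round(elapsed_seconds, 1),
    }
```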

Parallelization with Subagents

For faster processing, the orchestrator can delegate batches to subagents. Each subagent:

  1. Receives a batch of ~20-30 issues (the raw JSON)
  2. Has access to the same tools (git, gh, bash)
  3. Investigates and classifies each issue
  4. Returns the analysis JSON array

The orchestrator:

  1. Splits unanalyzed issues into batches
  2. Dispatches batches in parallel (respecting concurrency and GitHub API rate limits)
  3. Merges results and saves checkpoints
  4. Handles failures gracefully (retry individual issues)
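The orchestrator loop above could look roughly like the sketch below, which uses a thread pool as a stand-in for the actual subagent dispatch mechanism. Here analyze_batch is a hypothetical callable that takes a list of issues and returns a list of analysis records, on_checkpoint is a hypothetical callback that saves the merged results, and max_workers=4 is an arbitrary default.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def split_batches(issues, size=20):
    """Split the issue list into consecutive batches of at most `size`."""
    return [issues[i:i + size] for i in range(0, len(issues), size)]


def run_batches(issues, analyze_batch, on_checkpoint, batch_size=20, max_workers=4):
    """Dispatch batches in parallel, then retry failed batches issue by issue."""
    results, failed = [], []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(analyze_batch, batch): batch
                   for batch in split_batches(issues, batch_size)}
        for future in as_completed(futures):
            try:
                results.extend(future.result())
            except Exception:
                failed.extend(futures[future])  # retried individually below
                continue
            on_checkpoint(results)  # save after every completed batch

    still_failed = []
    for issue in failed:
        try:
            results.extend(analyze_batch([issue]))
        except Exception:
            still_failed.append(issue)
    if failed:
        on_checkpoint(results)
    return results, still_failed
```

Retrying one issue at a time means a single malformed issue cannot sink the rest of its batch. Whatever remains in still_failed can be reported in the final summary.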

Subagent Task Template

You are analyzing Jira backlog issues for the OpenShift Pipelines (SRVKP) project.
Upstream tektoncd repos are in ~/src/tektoncd/{repo}.

Current supported versions: 1.19, 1.20, 1.21. EOL: 1.1-1.18.

For each issue below, investigate using git log, gh issue/pr commands, then classify.
Write results to {output_file} as a JSON array.

Issues to analyze:
{issues_json}

Optimization Tips

  • Batch quick classifications: Issues obviously targeting EOL versions don’t need upstream investigation
  • Cache gh searches: If you’ve already searched for “cancel-in-progress” in pipelines-as-code, reuse that result for similar issues
  • Component focus: When analyzing a batch, group by component so upstream searches are reusable
  • Depth control: The user can request --shallow (quick heuristics only) or --deep (full investigation per issue)
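The search-caching tip can be sketched as a small in-memory cache keyed by repo and keyword. The class name SearchCache and its runner argument are hypothetical. The default runner mirrors the merged-PR search command from step 4c and assumes the gh CLI is installed and authenticated.

```python
import json
import subprocess


class SearchCache:
    """Remember merged-PR search results per (repo, keyword) pair."""

    def __init__(self, runner=None):
        # `runner` is injectable so the cache can be exercised without gh.
        self._runner = runner or self._run_gh
        self._results = {}

    @staticmethod
    def _run_gh(repo, keyword):
        # Same command as the merged-PR search in step 4c.
        completed = subprocess.run(
            ["gh", "pr", "list", "-R", f"tektoncd/{repo}", "--state", "merged",
             "--search", keyword, "--limit", "10",
             "--json", "number,title,mergedAt,url"],
            check=True, capture_output=True, text=True,
        )
        return json.loads(completed.stdout)

    def merged_prs(self, repo, keyword):
        """Return cached results, calling the runner only on the first lookup."""
        key = (repo, keyword.lower())
        if key not in self._results:
            self._results[key] = self._runner(repo, keyword)
        return self._results[key]
```

Sorting a batch by component before analysis makes consecutive issues more likely to hit the same cache entries, which is the point of the component-focus tip.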