Commit 9991453b331f
Changed files (6)
dots/config/claude/skills/BacklogTriage/workflows/Analyze.md
@@ -0,0 +1,151 @@
+# Analyze Workflow
+
+Deep LLM-powered analysis of backlog issues using agent tools for investigation.
+
+## Steps
+
+### 1. Load Configuration
+
+Read `backlog-triage.toml` from the project directory for version info, component mappings, and file paths.
+
+### 2. Load Backlog Data
+
+```bash
+python3 -c "import json; d=json.load(open('srvkp-backlog-full.json')); print(f'{len(d)} issues')"
+```
+
+### 3. Load or Initialize Analysis Checkpoint
+
+Check if `srvkp-backlog-llm-analysis.json` exists. If so, load already-analyzed issue keys to skip them (resume support).
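The resume check in step 3 could be sketched as below. This is a minimal sketch, not the skill's actual code: it assumes the checkpoint follows the format shown in step 5 (an `issues` list whose entries carry a `key`) and that the backlog is a list of dicts with a `key` field, which the source does not confirm. The helper names `load_analyzed_keys` and `pending_issues` are hypothetical.

```python
import json
from pathlib import Path


def load_analyzed_keys(path: str = "srvkp-backlog-llm-analysis.json") -> set:
    """Return the issue keys already in the checkpoint, or an empty set if none exists."""
    checkpoint = Path(path)
    if not checkpoint.exists():
        return set()
    data = json.loads(checkpoint.read_text(encoding="utf-8"))
    return {entry["key"] for entry in data.get("issues", [])}


def pending_issues(backlog: list, analyzed: set) -> list:
    """Keep only the backlog issues that are not in the checkpoint yet.

    Assumes each backlog entry is a dict with a "key" field (an assumption).
    """
    return [issue for issue in backlog if issue.get("key") not in analyzed]
```

With these two helpers, a resumed run would only iterate over `pending_issues(backlog, load_analyzed_keys())`.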
+
+### 4. Process Issues
+
+Investigate each unanalyzed issue individually. Work through the issues in groups of ~20 so that a checkpoint can be saved after each group.
+
+#### 4a. Read the Issue
+
+Read the issue's summary, description, comments, labels, components, priority, and age.
+
+#### 4b. Quick Classification (no tools needed)
+
+Some issues can be classified immediately without upstream investigation:
+
+- **CVE for EOL version** (e.g., `[pipelines-1.14]` in title) → CLOSE, high confidence
+- **Test/QA task for EOL version** → CLOSE
+- **No description, no comments, 1+ year old** → REVIEW_TO_CLOSE
+- **Blocker/Critical with recent activity** → HIGH_PRIORITY, keep
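The quick rules above might be expressed as a small pre-filter. The sketch below is illustrative only. The issue field names (`summary`, `description`, `comments`, `age_days`, `priority`, `days_since_update`) and the 30-day threshold for "recent activity" are assumptions, since the source defines neither. The Test/QA rule is left out because it depends on issue-type names that the source does not give.

```python
import re
from typing import Optional

# EOL versions 1.1 through 1.18, matching the list in the project config.
EOL_VERSIONS = {f"1.{minor}" for minor in range(1, 19)}

# Matches a bracketed version tag such as "[pipelines-1.14]" in a summary.
VERSION_TAG = re.compile(r"\[pipelines-(\d+\.\d+)\]")

RECENT_DAYS = 30  # assumed meaning of "recent activity"


def targets_eol_version(summary: str) -> bool:
    """True if the summary carries a version tag for an EOL release."""
    match = VERSION_TAG.search(summary)
    return bool(match) and match.group(1) in EOL_VERSIONS


def quick_classify(issue: dict) -> Optional[str]:
    """Return a recommendation if a quick rule applies, else None (needs investigation)."""
    summary = issue.get("summary", "")
    if "CVE-" in summary and targets_eol_version(summary):
        return "CLOSE"
    if (not issue.get("description") and not issue.get("comments")
            and issue.get("age_days", 0) >= 365):
        return "REVIEW_TO_CLOSE"
    if (issue.get("priority") in {"Blocker", "Critical"}
            and issue.get("days_since_update", 10**6) <= RECENT_DAYS):
        return "HIGH_PRIORITY"
    return None
```

Issues for which `quick_classify` returns `None` would go on to the upstream investigation in 4c.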
+
+#### 4c. Investigate Upstream (when needed)
+
+For issues that need investigation, use the agent's tools:
+
+**Search for related upstream work:**
+```bash
+# Search commits in the relevant repo
+git -C ~/src/tektoncd/{repo} log --oneline --all --grep="{keyword}" | head -20
+
+# Search merged PRs on GitHub
+gh pr list -R tektoncd/{repo} --state merged --search "{keyword}" --limit 10 --json number,title,mergedAt,url
+
+# Search closed issues on GitHub
+gh issue list -R tektoncd/{repo} --state closed --search "{keyword}" --limit 10 --json number,title,closedAt,url
+
+# Check if a specific feature/TEP was implemented
+gh pr list -R tektoncd/{repo} --state merged --search "TEP-{number}" --json number,title,mergedAt
+
+# Read a specific PR for details
+gh pr view {number} -R tektoncd/{repo} --json title,body,mergedAt,files
+```
+
+**Choose keywords intelligently:**
+- Extract meaningful terms from summary and description
+- Use component-specific terms (e.g., "cancel-in-progress", "remote resolution")
+- Search for TEP numbers, feature names, error messages
+- Check for related config flag names or API fields
+
+**Determine the right repo to search:**
+- Use the Jira component → repo mapping from config
+- If no component, infer from summary keywords
+- For cross-cutting issues, check multiple repos
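If the searches are driven from a script rather than typed by the agent, the merged-PR search above could be wrapped as follows. This is a sketch under assumptions: the function names are hypothetical, and only the `gh pr list` flags already shown in this workflow are used. Passing the keyword as a separate list element avoids shell quoting problems with multi-word keywords.

```python
import json
import subprocess


def merged_pr_search_cmd(repo: str, keyword: str, org: str = "tektoncd", limit: int = 10) -> list:
    """Build the argument list for the merged-PR search shown above."""
    return [
        "gh", "pr", "list", "-R", f"{org}/{repo}",
        "--state", "merged",
        "--search", keyword,
        "--limit", str(limit),
        "--json", "number,title,mergedAt,url",
    ]


def search_merged_prs(repo: str, keyword: str) -> list:
    """Run the search and parse gh's JSON output (requires an authenticated gh CLI)."""
    proc = subprocess.run(
        merged_pr_search_cmd(repo, keyword),
        capture_output=True, text=True, check=True,
    )
    return json.loads(proc.stdout)
```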
+
+#### 4d. Classify
+
+Based on investigation, assign:
+- `recommendation`: CLOSE, REVIEW_TO_CLOSE, NEEDS_TRIAGE, KEEP, HIGH_PRIORITY
+- `relevance_score`: 0-100
+- `confidence`: high, medium, low
+- `reason`: 2-4 sentences with specific evidence
+- `tags`: applicable tags from: eol-version, addressed-upstream, stale, customer-impact, active-work, security, blocked, no-description, duplicate, superseded, upstream-open
+- `upstream_evidence`: specific PR/issue/commit reference, or null
+- `suggested_comment`: what to post on the Jira issue if closing, or null
+
+### 5. Save Checkpoint
+
+After each group of ~20 issues, save the analysis JSON. The file has this format:
+
+```json
+{
+  "generated": "2026-04-16T...",
+  "model": "agent",
+  "total": 1572,
+  "issues": [
+    {
+      "key": "SRVKP-1234",
+      "recommendation": "CLOSE",
+      "relevance_score": 10,
+      "confidence": "high",
+      "reason": "This CVE tracks a vulnerability in pipelines-1.14 which is EOL...",
+      "tags": ["eol-version", "security"],
+      "upstream_evidence": null,
+      "suggested_comment": "Closing: pipelines-1.14 is EOL. If this affects a supported version, please file a new issue."
+    }
+  ]
+}
+```
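One way to write the checkpoint so that an interrupted run cannot leave a half-written file is to write to a temporary file and rename it into place. The sketch below assumes this approach; the workflow does not say how the file is written. `save_checkpoint` is a hypothetical helper, and `total` is passed in by the caller because the source does not say whether it counts analyzed issues or the whole backlog.

```python
import json
import os
import tempfile
from datetime import datetime, timezone


def save_checkpoint(results: list, total: int,
                    path: str = "srvkp-backlog-llm-analysis.json") -> None:
    """Write the analysis file atomically: temp file first, then rename over the target."""
    payload = {
        "generated": datetime.now(timezone.utc).isoformat(),
        "model": "agent",
        "total": total,
        "issues": results,
    }
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the target directory so os.replace stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(payload, handle, indent=2)
        os.replace(tmp_path, path)
    except BaseException:
        os.unlink(tmp_path)
        raise
```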
+
+### 6. Summary
+
+After processing all issues, print a summary:
+- Total issues analyzed
+- Breakdown by recommendation
+- Breakdown by confidence
+- Number with upstream evidence found
+- Time taken
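The breakdowns listed above can be computed directly from the analysis records. A minimal sketch, assuming each record has the `recommendation`, `confidence`, and `upstream_evidence` fields from the checkpoint format; `summarize` is a hypothetical name, and elapsed time is left to the caller (for example via `time.monotonic()`).

```python
from collections import Counter


def summarize(results: list) -> dict:
    """Count analysis records by recommendation and confidence."""
    return {
        "total": len(results),
        "by_recommendation": dict(Counter(r["recommendation"] for r in results)),
        "by_confidence": dict(Counter(r["confidence"] for r in results)),
        # A null or empty upstream_evidence counts as "no evidence found".
        "with_upstream_evidence": sum(1 for r in results if r.get("upstream_evidence")),
    }
```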
+
+## Parallelization with Subagents
+
+For faster processing, the orchestrator can delegate batches to subagents. Each subagent:
+
+1. Receives a batch of ~20-30 issues (the raw JSON)
+2. Has access to the same tools (git, gh, bash)
+3. Investigates and classifies each issue
+4. Returns the analysis JSON array
+
+The orchestrator:
+1. Splits unanalyzed issues into batches
+2. Dispatches batches in parallel (respecting limits)
+3. Merges results and saves checkpoints
+4. Handles failures gracefully (retry individual issues)
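The orchestrator steps above can be sketched with a thread pool. This is illustrative only: `analyze_batch` stands in for whatever dispatches a batch to a subagent and returns its analysis array, which the source does not define, and `max_workers=4` is an assumed limit. Issues from a failed batch are retried one at a time, matching the "retry individual issues" step.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def split_batches(issues: list, size: int = 20) -> list:
    """Split issues into consecutive batches of at most `size` items."""
    return [issues[i:i + size] for i in range(0, len(issues), size)]


def run_batches(issues: list, analyze_batch, size: int = 20, max_workers: int = 4):
    """Run batches in parallel; return (results, issues that still failed after retry)."""
    results, failed = [], []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(analyze_batch, batch): batch
                   for batch in split_batches(issues, size)}
        for future in as_completed(futures):
            try:
                results.extend(future.result())
            except Exception:
                failed.extend(futures[future])
    # Retry the issues from failed batches individually.
    still_failed = []
    for issue in failed:
        try:
            results.extend(analyze_batch([issue]))
        except Exception:
            still_failed.append(issue)
    return results, still_failed
```

Because results arrive in completion order, the caller should not rely on them matching the input order.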
+
+### Subagent Task Template
+
+```
+You are analyzing Jira backlog issues for the OpenShift Pipelines (SRVKP) project.
+Upstream tektoncd repos are in ~/src/tektoncd/{repo}.
+
+Current supported versions: 1.19, 1.20, 1.21. EOL: 1.1-1.18.
+
+For each issue below, investigate using git log, gh issue/pr commands, then classify.
+Write results to {output_file} as a JSON array.
+
+Issues to analyze:
+{issues_json}
+```
+
+## Optimization Tips
+
+- **Batch quick classifications**: Issues obviously targeting EOL versions don't need upstream investigation
+- **Cache gh searches**: If you've already searched for "cancel-in-progress" in pipelines-as-code, reuse that result for similar issues
+- **Component focus**: When analyzing a batch, group by component so upstream searches are reusable
+- **Depth control**: The user can request `--shallow` (quick heuristics only) or `--deep` (full investigation per issue)
dots/config/claude/skills/BacklogTriage/workflows/Fetch.md
@@ -0,0 +1,36 @@
+# Fetch Workflow
+
+Fetch or refresh Jira backlog data and ensure upstream repos are up-to-date.
+
+## Steps
+
+### 1. Check for fetch script
+
+Look for `fetch-backlog.py` in the project directory. If it is missing, tell the user to create it or copy it from a template.
+
+### 2. Run fetch
+
+```bash
+python3 fetch-backlog.py
+```
+
+This uses `jrc` (jayrat CLI) to paginate through all backlog issues, fetching full details including descriptions and comments.
+
+### 3. Update upstream repos
+
+For each repo in the component map, fetch the latest:
+
+```bash
+for repo in pipeline triggers chains results cli operator pipelines-as-code catalog dashboard; do
+  git -C ~/src/tektoncd/$repo fetch upstream --quiet 2>/dev/null || \
+    git -C ~/src/tektoncd/$repo fetch origin --quiet 2>/dev/null
+done
+```
+
+### 4. Verify
+
+Confirm the backlog JSON exists and report issue count:
+
+```bash
+python3 -c "import json; d=json.load(open('srvkp-backlog-full.json')); print(f'{len(d)} issues loaded')"
+```
dots/config/claude/skills/BacklogTriage/workflows/FullPipeline.md
@@ -0,0 +1,45 @@
+# FullPipeline Workflow
+
+End-to-end backlog triage: fetch → analyze → report.
+
+## Steps
+
+### 1. Fetch (if needed)
+
+Check if `srvkp-backlog-full.json` exists and is less than 24 hours old. If stale or missing:
+- Run `workflows/Fetch.md`
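The freshness check in step 1 could be based on the file's modification time. A minimal sketch, assuming mtime is an acceptable proxy for when the backlog was last fetched; `needs_fetch` and its `now` parameter (there to make the check testable) are hypothetical.

```python
import os
import time
from typing import Optional


def needs_fetch(path: str = "srvkp-backlog-full.json",
                max_age_hours: float = 24.0,
                now: Optional[float] = None) -> bool:
    """True if the backlog file is missing or at least `max_age_hours` old."""
    if not os.path.exists(path):
        return True
    current = time.time() if now is None else now
    age_seconds = current - os.path.getmtime(path)
    return age_seconds >= max_age_hours * 3600
```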
+
+### 2. Analyze
+
+Run `workflows/Analyze.md` with resume support:
+- Load existing analysis checkpoint
+- Only process unanalyzed issues
+- Save checkpoints every ~20 issues
+
+### 3. Report
+
+Run `workflows/Report.md`:
+- Generate HTML report
+- Open in browser
+
+## Options
+
+The user may specify:
+
+| Option | Effect |
+|--------|--------|
+| `--component "Pipelines as Code"` | Only analyze issues for that component |
+| `--shallow` | Quick heuristic classification, no upstream investigation |
+| `--deep` | Full investigation for every issue (slow but thorough) |
+| `--close-only` | Only analyze issues likely to be closable (old, EOL, no-description) |
+| `--reset` | Discard previous analysis, re-analyze everything |
+| `--parallel` | Use subagents for parallel processing |
+
+## Typical Run Times
+
+| Scope | Depth | Issues | Estimated Time |
+|-------|-------|--------|----------------|
+| Full backlog | shallow | ~1500 | 5-10 minutes |
+| Full backlog | medium | ~1500 | 30-60 minutes |
+| Close candidates | deep | ~300 | 15-30 minutes |
+| Single component | deep | ~100 | 10-15 minutes |
dots/config/claude/skills/BacklogTriage/workflows/Report.md
@@ -0,0 +1,66 @@
+# Report Workflow
+
+Generate an interactive HTML report from the LLM analysis.
+
+## Steps
+
+### 1. Load Analysis
+
+Read `srvkp-backlog-llm-analysis.json` and the original backlog data for metadata enrichment.
+
+### 2. Generate Report
+
+Run the report generator:
+
+```bash
+python3 gen-report-llm.py
+```
+
+Or, if the script doesn't exist, generate the HTML directly. The report should include:
+
+#### Header
+- Total issues analyzed
+- Model used and confidence distribution
+- Recommendation summary cards (CLOSE, REVIEW_TO_CLOSE, NEEDS_TRIAGE, KEEP, HIGH_PRIORITY)
+
+#### Filters
+- Filter by recommendation
+- Search by text (key, summary, reason)
+- Filter by component, type, priority
+
+#### Component Sections
+Grouped by Jira component, sorted by cleanup potential (most closable first).
+
+Each section is collapsible and shows:
+- Component name with recommendation badge counts
+- Auto-expanded if it contains CLOSE or HIGH_PRIORITY items
+
+#### Per-Issue Cards
+Each issue displays:
+- **Key** (linked to Jira)
+- **Type**, **Priority**, **Score** badges
+- **Confidence** indicator (high/medium/low)
+- **Recommendation** badge
+- **LLM Reason** — the full explanation from the analysis (this is the key differentiator from heuristic reports)
+- **Upstream Evidence** — linked to GitHub PR/issue/commit
+- **Suggested Comment** — what to post before closing
+- **Tags** — semantic tags from analysis
+- **Age** and **dates**
+
+#### Styling
+- Dark theme (GitHub dark style)
+- Color-coded issue borders: red=CLOSE, orange=REVIEW_TO_CLOSE, yellow=NEEDS_TRIAGE, green=KEEP, purple=HIGH_PRIORITY
+- Responsive layout
+
+### 3. Open Report
+
+```bash
+xdg-open srvkp-backlog-llm-report.html
+```
+
+### 4. Summary
+
+Print stats:
+- Report file path and size
+- Recommendation distribution
+- Top components by cleanup potential
dots/config/claude/skills/BacklogTriage/SKILL.md
@@ -0,0 +1,145 @@
+---
+name: BacklogTriage
+description: "Deep LLM-powered Jira backlog triage against upstream repositories. USE WHEN user says 'triage backlog', 'backlog analysis', 'clean backlog', 'analyze jira backlog', or wants to assess relevance of Jira issues against upstream git/GitHub activity."
+---
+
+# BacklogTriage
+
+Deep analysis of Jira backlog issues by investigating upstream repositories to determine if issues are still relevant, have been addressed, or should be closed. Uses the agent's tools (git, gh, jrc) to investigate each issue — no pre-truncated context dumps.
+
+## Architecture
+
+Unlike batch-prompt approaches, this skill leverages the **agent pattern**: the LLM reads each issue, decides what to investigate, and uses tools to pull exactly the evidence it needs. This means:
+
+- Upstream data is fetched on demand instead of being packed into a single prompt, so it is not truncated to fit the context window
+- The LLM decides which repos, commits, PRs, and issues to look at
+- It can go deep: read specific PRs, check commit diffs, trace feature implementations
+- Iterative reasoning: "this mentions TEP-0137, let me check if that landed"
+
+## Project Configuration
+
+The skill expects a `backlog-triage.toml` config file in the project directory:
+
+```toml
+[project]
+name = "OpenShift Pipelines"
+jira_project = "SRVKP"
+jira_base = "https://issues.redhat.com/browse"
+
+[versions]
+current = ["1.19", "1.20", "1.21"]
+development = ["1.22"]
+eol = ["1.1", "1.2", "1.3", "1.4", "1.5", "1.6", "1.7", "1.8", "1.9", "1.10", "1.11", "1.12", "1.13", "1.14", "1.15", "1.16", "1.17", "1.18"]
+eol_ocp = ["4.1", "4.2", "4.3", "4.4", "4.5", "4.6", "4.7", "4.8", "4.9", "4.10", "4.11", "4.12", "4.13"]
+
+[upstream]
+org = "tektoncd"
+repos_dir = "~/src/tektoncd"
+# Map Jira components to upstream repos
+[upstream.component_map]
+"Tekton Pipelines" = "pipeline"
+"Pipelines as Code" = "pipelines-as-code"
+"Tekton Triggers" = "triggers"
+"Tekton Chains" = "chains"
+"Tekton Results" = "results"
+"Tekton CLI" = "cli"
+"Operator" = "operator"
+"Tekton Ecosystem" = "catalog"
+"UI" = "dashboard"
+"Tekton Hub" = "hub"
+
+[files]
+backlog = "srvkp-backlog-full.json" # fetched by fetch-backlog.py
+analysis = "srvkp-backlog-llm-analysis.json"
+report = "srvkp-backlog-llm-report.html"
+```
+
+## Workflow Routing
+
+| Workflow | Trigger | File |
+|----------|---------|------|
+| **Fetch** | "fetch backlog", "update backlog data" | `workflows/Fetch.md` |
+| **Analyze** | "analyze backlog", "triage backlog" | `workflows/Analyze.md` |
+| **Report** | "generate report", "backlog report" | `workflows/Report.md` |
+| **FullPipeline** | "full backlog triage", "end-to-end triage" | `workflows/FullPipeline.md` |
+
+## Tools Available
+
+The agent should use these tools for investigation:
+
+### Jira (data already fetched)
+- Backlog JSON is pre-fetched via `fetch-backlog.py` (uses `jrc` CLI)
+- Read issues from the JSON file directly — no need to query Jira per-issue
+
+### Git (local repos)
+```bash
+# Search for related work in upstream repos
+git -C ~/src/tektoncd/pipeline log --oneline --all --grep="keyword"
+git -C ~/src/tektoncd/pipeline log --oneline --all --since="2024-01-01" -- path/to/file
+```
+
+### GitHub CLI
+```bash
+# Search closed issues
+gh issue list -R tektoncd/pipeline --state closed --search "keyword" --limit 10 --json number,title,closedAt,url
+# Search merged PRs
+gh pr list -R tektoncd/pipeline --state merged --search "keyword" --limit 10 --json number,title,mergedAt,url
+# Read a specific issue/PR
+gh issue view 1234 -R tektoncd/pipeline --json title,body,comments,labels
+gh pr view 5678 -R tektoncd/pipeline --json title,body,mergedAt,files
+```
+
+## Per-Issue Analysis Output
+
+For each issue, produce:
+
+```json
+{
+ "key": "SRVKP-1234",
+ "recommendation": "CLOSE|REVIEW_TO_CLOSE|NEEDS_TRIAGE|KEEP|HIGH_PRIORITY",
+ "relevance_score": 0-100,
+ "confidence": "high|medium|low",
+ "reason": "2-4 sentence explanation with specific evidence. Reference upstream PRs/commits.",
+ "tags": ["eol-version", "addressed-upstream", "stale", "customer-impact", ...],
+ "upstream_evidence": "tektoncd/pipeline#1234 merged 2025-03-15 — implemented this feature",
+ "suggested_comment": "Closing: this was addressed upstream in tektoncd/pipeline#1234 and shipped in OSP 1.20."
+}
+```
+
+### Recommendation Guidelines
+
+| Recommendation | Score | Criteria |
+|---|---|---|
+| **CLOSE** | 0-20 | EOL version, confirmed fixed upstream, duplicate, obsolete |
+| **REVIEW_TO_CLOSE** | 21-35 | Likely irrelevant but needs human confirmation |
+| **NEEDS_TRIAGE** | 36-50 | Ambiguous, can't determine without more context |
+| **KEEP** | 51-75 | Still relevant, valid bug/feature, ongoing work |
+| **HIGH_PRIORITY** | 76-100 | Blocker, customer-facing, security, active work needed |
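The score bands in this table can double as a consistency check on analysis records. The sketch below assumes the bands are meant to be strict, so that a record's `relevance_score` must fall inside the band of its `recommendation`. The source presents them as guidelines, so treat a mismatch as a warning rather than an error. Both function names are hypothetical.

```python
# Upper bound (inclusive) of each score band, in ascending order, from the table above.
BANDS = [
    (20, "CLOSE"),
    (35, "REVIEW_TO_CLOSE"),
    (50, "NEEDS_TRIAGE"),
    (75, "KEEP"),
    (100, "HIGH_PRIORITY"),
]


def recommendation_for_score(score: int) -> str:
    """Map a 0-100 relevance score to the recommendation whose band contains it."""
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    for upper, name in BANDS:
        if score <= upper:
            return name
    raise AssertionError("unreachable: the last band ends at 100")


def is_consistent(record: dict) -> bool:
    """True if the record's score falls in the band of its recommendation."""
    return recommendation_for_score(record["relevance_score"]) == record["recommendation"]
```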
+
+## Examples
+
+**Example 1: Full pipeline**
+```
+User: "triage the SRVKP backlog"
+→ Checks fetch-backlog.py output exists (runs if not)
+→ Loads issues from JSON
+→ For each issue, investigates upstream repos
+→ Writes analysis JSON with per-issue reasoning
+→ Generates interactive HTML report
+```
+
+**Example 2: Analyze specific component**
+```
+User: "triage backlog for Pipelines as Code"
+→ Filters to PaC component issues
+→ Focuses investigation on tektoncd/pipelines-as-code
+→ Produces component-specific report
+```
+
+**Example 3: Just the close candidates**
+```
+User: "find backlog issues we can close"
+→ Pre-filters to old issues, EOL versions, no-description
+→ Investigates each against upstream
+→ Produces close-focused report with suggested comments
+```
.gitignore
@@ -32,3 +32,5 @@ hardware-configuration.nix
node_modules/
/.agent-shell/
+/dots/config/claude/skills/find-skills
+/dots/config/claude/skills/make-interfaces-feel-better