upgrade-guardian

v1.0.0

A cognitive protocol for safely managing and auditing OpenClaw application upgrades. Analyzes configuration-level risks (schema, defaults) and runtime-level behavioral shifts (routing, sessions, streaming) using semantic changelog analysis to prevent silent breaking changes.

Sourced from ClawHub, Authored by Radon Co

Installation

Please help me install the skill `upgrade-guardian` from the SkillHub official store.

npx skills add RadonX/upgrade-guardian

Cognitive Protocol: The Upgrade Guardian

This skill defines a formal, multi-phase cognitive protocol for an agent to execute when tasked with managing an application upgrade. Its purpose is to transcend simple, static checks and provide a dynamic, intelligent analysis that prevents "silent breaking change" incidents.

This is not a script. It is a directive for higher-order reasoning.

Core Principle

An application upgrade is a high-stakes event. The agent must not trust that the upgrade is safe. The agent must assume that any change can have unintended consequences on a stable system. The goal is to make implicit environmental assumptions explicit and resilient before they break.

Protocol Activation

This protocol is activated when a human operator declares their intent to upgrade the application (e.g., "We are planning to upgrade OpenClaw from vA to vB").

Analysis Scope

Upgrade Guardian covers two categories of risks:

  1. Configuration-level risks: Changes that affect openclaw.json or static config files
     • Breaking changes in schema or validation
     • Deprecated config fields
     • New required config options
     • Default value changes

  2. Runtime-level risks: Changes that affect behavior without config modifications
     • Behavioral shifts in session handling, routing, or delivery
     • Logic changes in compaction, memory, or agents
     • Protocol-level changes (streaming, API compatibility)
     • CLI UX changes (e.g., /new behavior)

See references/RISK_CATEGORIES.md for detailed taxonomy.

Phase 1: Information Gathering & Semantic Analysis

  1. Ingest Release Notes: Fetch the CHANGELOG or release notes for the target version range.
  2. Semantic Analysis: Perform semantic analysis using patterns in references/changelog_analysis_patterns.md.
     • Do not just search for "breaking change"
     • Look for behavioral shift indicators (refactor, unify, improve handling, etc.)
     • Identify both config-affecting and runtime-only changes
  3. Cross-Reference with Environment:
     • For config risks: Load openclaw.json and identify dependencies on implicit behaviors
     • For runtime risks: Identify active workflows (cron jobs, TUI usage, session routing patterns) that may be affected
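The keyword side of Phase 1 can be sketched as a first-pass triage filter. This is only an illustration: the indicator and area lists below are assumptions seeded from words this protocol mentions (the real patterns live in references/changelog_analysis_patterns.md), and `flag_changelog_entries` is a hypothetical helper name. Keyword matching narrows down which entries deserve attention; it does not replace reading them.

```python
import re

# Assumed indicator words, taken from the keywords this protocol mentions.
# The authoritative patterns are in references/changelog_analysis_patterns.md.
BEHAVIORAL_INDICATORS = ("refactor", "unify", "improve", "fix", "breaking")

# Assumed areas of interest, taken from the runtime risks named in Analysis Scope.
WATCHED_AREAS = ("session", "routing", "streaming", "compaction", "memory")


def flag_changelog_entries(changelog_text):
    """Return changelog lines that pair an indicator word with a watched area."""
    flagged = []
    for raw_line in changelog_text.splitlines():
        # Drop list markers such as "- " or "* " and surrounding whitespace.
        line = raw_line.strip().lstrip("-* ").strip()
        if not line:
            continue
        lowered = line.lower()
        # Match at a word start so "improved" and "refactored" are caught too.
        indicators = [w for w in BEHAVIORAL_INDICATORS if re.search(rf"\b{w}", lowered)]
        areas = [a for a in WATCHED_AREAS if a in lowered]
        if indicators and areas:
            flagged.append({"entry": line, "indicators": indicators, "areas": areas})
    return flagged
```

Each flagged entry is a candidate for the cross-reference step, where it is compared against openclaw.json and the operator's active workflows.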

Phase 2: Risk Assessment & Scenario Planning

2.1 Formulate "What-If" Scenarios

For each identified change, generate concrete, testable failure scenarios:

Config-level examples:

  • Scenario A: "What if 'improved session handling' means a new, destructive default for unconfigured session types? → Data loss."
  • Scenario B: "What if 'refactored security policy' means the allowlist now requires explicit IP ranges? → Plugin executions fail."

Runtime-level examples:

  • Scenario C: "What if 'duplicate reply suppression' changes session routing logic? → Bot stops responding in some groups."
  • Scenario D: "What if /new now creates independent sessions instead of resetting the shared session? → User workflow disrupted."
  • Scenario E: "What if 'streaming compatibility fix' breaks non-native OpenAI-compatible providers? → Long responses fail mid-stream."

2.2 Quantify Risk

Assign a risk score based on:

  • Impact: data loss > service outage > UX friction > cosmetic
  • Likelihood: direct config/workflow overlap > tangential relation > theoretical
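One way to turn the two orderings above into a number is sketched below. The protocol does not prescribe a formula, so the numeric ranks, the multiplication, and the High/Medium/Low thresholds are all assumptions chosen for illustration.

```python
from enum import IntEnum


class Impact(IntEnum):
    # Ranks follow the ordering: data loss > service outage > UX friction > cosmetic.
    COSMETIC = 1
    UX_FRICTION = 2
    SERVICE_OUTAGE = 3
    DATA_LOSS = 4


class Likelihood(IntEnum):
    # Ranks follow the ordering: direct overlap > tangential relation > theoretical.
    THEORETICAL = 1
    TANGENTIAL = 2
    DIRECT_OVERLAP = 3


def risk_score(impact, likelihood):
    """Combine the two ranks; multiplication is an assumed choice, not part of the protocol."""
    return int(impact) * int(likelihood)


def priority(score):
    """Map a score to a report priority using assumed thresholds."""
    if score >= 8:
        return "High"
    if score >= 4:
        return "Medium"
    return "Low"
```

With these assumed thresholds, a theoretical data-loss scenario still lands at Medium rather than Low, which matches the protocol's conservative stance.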

2.3 Generate Audit Report

Present findings to the operator using the template in references/AUDIT_REPORT_TEMPLATE.md.

Key sections:

  • Configuration risks (with jq paths and explicit mitigations)
  • Runtime risks (with behavioral descriptions and verification tests)
  • Risk prioritization (High/Medium/Low)

Phase 3: Mitigation & Verification

3.1 Proactive Mitigation

For config risks: Propose specific openclaw.json changes to make implicit assumptions explicit. Do not execute without operator approval.

For runtime risks: Document expected behavioral changes and suggest workflow adjustments if needed.
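For the config side of 3.1, a minimal sketch of "making implicit assumptions explicit" is to list the settings the deployment relies on but never sets. The helper name `unpinned_defaults` and the dotted keys in the usage example are hypothetical; they are not taken from the real openclaw.json schema. The function only returns proposals and writes nothing, in keeping with the operator-approval rule.

```python
def unpinned_defaults(config, assumed_defaults):
    """Return {dotted_path: value} for assumed defaults the config does not set explicitly."""
    proposals = {}
    for dotted_path, default in assumed_defaults.items():
        node = config
        found = True
        # Walk the nested dict one key at a time; stop as soon as a key is missing.
        for key in dotted_path.split("."):
            if isinstance(node, dict) and key in node:
                node = node[key]
            else:
                found = False
                break
        if not found:
            proposals[dotted_path] = default
    return proposals


# Usage with hypothetical keys: only the unset assumption is proposed for pinning.
current = {"gateway": {"auth": {"mode": "token"}}}
relied_on = {"gateway.auth.mode": "token", "session.reset": "shared"}
proposed = unpinned_defaults(current, relied_on)
```

The resulting proposals would go into the audit report for the operator to accept or reject.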

3.2 Verification Plan

Define clear, simple tests for each risk:

Config verification examples:

  • "Run openclaw doctor and confirm no validation errors"
  • "Check gateway.err.log for auth mode complaints"

Runtime verification examples:

  • "Send test message in group chat, verify bot responds"
  • "Open TUI, run /new, confirm it creates an independent session"
  • "Trigger long completion from streaming provider, verify no mid-stream failure"

3.3 Post-Upgrade Audit

After the operator confirms the upgrade is complete:

  1. Execute the verification plan
  2. Report results systematically
  3. Recommend rollback if critical failures are detected
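The three audit steps can be sketched as a small runner. The `Check` structure and the rule "any failed critical check triggers a rollback recommendation" are assumptions made for illustration; the protocol leaves the exact criteria to the agent and the operator.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Check:
    name: str                 # Human-readable description of the test
    critical: bool            # Whether a failure should trigger a rollback recommendation
    run: Callable[[], bool]   # Returns True when the check passes


def run_audit(checks: List[Check]) -> dict:
    """Execute every check, collect the results, and decide on a rollback recommendation."""
    results = []
    for check in checks:
        try:
            passed = bool(check.run())
        except Exception:
            # A check that crashes is treated as failed rather than aborting the audit.
            passed = False
        results.append((check.name, check.critical, passed))
    recommend_rollback = any(critical and not passed for _, critical, passed in results)
    return {"results": results, "recommend_rollback": recommend_rollback}
```

In practice each `run` callable would wrap one item from the verification plan, such as confirming that the bot responds in a group chat.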

3.4 Archive Upgrade Artifacts (relative to workspace)

Save the upgrade write-ups and check results inside the agent workspace so they remain discoverable and portable.

Write locations (relative paths):

  • Pre-upgrade analysis report → kb/logs/upgrade-reports/YYYY-MM-DD_<from>-to-<to>_upgrade-analysis.md
  • Post-upgrade verification report → kb/logs/upgrade-verifications/YYYY-MM-DD_post-upgrade-verification.txt

Notes:

  • Prefer workspace-relative paths in reports (avoid hard-coded absolute home paths).
  • If kb/ is a symlink in a particular deployment, still refer to it as kb/... in the protocol/report; the filesystem mapping is an implementation detail.
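The naming scheme above can be generated rather than typed by hand. The sketch below assumes a hypothetical helper named `artifact_paths`; it returns workspace-relative paths only and never resolves them to absolute locations.

```python
from datetime import date
from pathlib import PurePosixPath


def artifact_paths(from_version, to_version, on=None):
    """Build the workspace-relative report paths for one upgrade."""
    day = (on or date.today()).isoformat()  # YYYY-MM-DD
    analysis = (
        PurePosixPath("kb/logs/upgrade-reports")
        / f"{day}_{from_version}-to-{to_version}_upgrade-analysis.md"
    )
    verification = (
        PurePosixPath("kb/logs/upgrade-verifications")
        / f"{day}_post-upgrade-verification.txt"
    )
    return str(analysis), str(verification)
```

Passing an explicit `on` date keeps the pre-upgrade and post-upgrade file names aligned when the upgrade spans midnight.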

References

  • references/changelog_analysis_patterns.md - Semantic analysis patterns
  • references/RISK_CATEGORIES.md - Detailed risk taxonomy
  • references/AUDIT_REPORT_TEMPLATE.md - Report structure
  • references/VERIFICATION_CHECKLIST.md - Common verification tests

Notes

  • This protocol is designed to be conservative. It's better to flag a false positive than miss a silent breaking change.
  • Runtime risks are often harder to detect than config risks. Pay extra attention to behavioral keywords like "improve", "fix", "refactor" in areas you actively use (sessions, routing, streaming).
  • When in doubt, ask the operator about their workflow patterns before deeming a risk "Low" priority.