Add comprehensive demo scenarios documentation

2026-01-12 09:46:57 +00:00
parent 822612b5dd
commit 7cc3803b85
1 changed files with 171 additions and 0 deletions
--- a/DEMO_SCENARIOS.md
+++ b/DEMO_SCENARIOS.md
@@ -0,0 +1,171 @@
 # Claude MCP Demo Scenarios
 This repository contains demo scenarios showcasing Claude's operational capabilities with OpenShift through MCP (Model Context Protocol) integration.
 ## Demo Scenarios
 ### 1. Cluster Health Check & Diagnostics
 **Scenario**: "Claude, can you check if my cluster is healthy?"
 **What Claude Does**:
 - Lists nodes and checks their status
 - Examines critical workloads (control plane, operators)
 - Reviews recent events for errors or warnings
 - Checks resource consumption (CPU, memory) via metrics server
 - Identifies pods in CrashLoopBackOff or other problematic states
 - Provides a structured health summary with actionable insights
 **Key Value**: A comprehensive cluster assessment in seconds, compared with running manual kubectl/oc commands across multiple resources.
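The node and pod checks above can be sketched in a few lines. This is a minimal illustration, not the MCP server's implementation: it assumes the node and pod objects have already been fetched as dictionaries in the standard Kubernetes API shape, and the function name `summarize_health` and the sample data are hypothetical.

```python
from typing import Any


def summarize_health(nodes: list[dict[str, Any]], pods: list[dict[str, Any]]) -> dict[str, list[str]]:
    """Return names of nodes that are not Ready and pods in CrashLoopBackOff."""
    not_ready = []
    for node in nodes:
        conditions = node.get("status", {}).get("conditions", [])
        # A node is healthy only if it reports a Ready condition with status "True".
        ready = any(c.get("type") == "Ready" and c.get("status") == "True" for c in conditions)
        if not ready:
            not_ready.append(node["metadata"]["name"])

    crash_looping = []
    for pod in pods:
        for cs in pod.get("status", {}).get("containerStatuses", []):
            waiting = cs.get("state", {}).get("waiting") or {}
            if waiting.get("reason") == "CrashLoopBackOff":
                meta = pod["metadata"]
                crash_looping.append(f'{meta["namespace"]}/{meta["name"]}')
                break  # one failing container is enough to flag the pod

    return {"nodes_not_ready": not_ready, "pods_crash_looping": crash_looping}


# Hypothetical sample data for illustration only.
sample_nodes = [
    {"metadata": {"name": "node-a"}, "status": {"conditions": [{"type": "Ready", "status": "True"}]}},
    {"metadata": {"name": "node-b"}, "status": {"conditions": [{"type": "Ready", "status": "Unknown"}]}},
]
sample_pods = [
    {
        "metadata": {"namespace": "media", "name": "calibre-0"},
        "status": {"containerStatuses": [{"state": {"waiting": {"reason": "CrashLoopBackOff"}}}]},
    },
    {
        "metadata": {"namespace": "media", "name": "plex-0"},
        "status": {"containerStatuses": [{"state": {"running": {}}}]},
    },
]
health_summary = summarize_health(sample_nodes, sample_pods)
```

A real health check would also cover events, operators, and metrics; this sketch covers only the two checks that reduce to simple field lookups.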
 ---
 ### 2. Security Review & Hardening
 **Scenario**: "Review the security posture of my Calibre deployment and help me lock it down."
 **What Claude Does**:
 - Examines pod security context and SCC assignments
 - Identifies overly permissive configurations (privileged, anyuid, root user)
 - Proposes custom SCCs with minimum viable privileges
 - Guides through incremental security hardening
 - Documents failure modes and appropriate fixes
 - Creates declarative GitOps manifests for security policies
 **Key Value**: Expert security review without needing deep SCC knowledge. Claude learns the boundaries of what the workload needs through experimentation.
 **Real Example**: Successfully hardened Calibre from `anyuid` to `restricted-s6` SCC, discovering s6-overlay compatibility issues and documenting workarounds.
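As a rough illustration of the "overly permissive" checks listed above, the following sketch scans a pod spec dictionary for three common findings. The function name and the choice to flag an unset `allowPrivilegeEscalation` are assumptions of this sketch (an SCC may supply a default for that field), and the sample specs are hypothetical.

```python
from typing import Any


def find_permissive_settings(pod_spec: dict[str, Any]) -> list[str]:
    """List permissive security settings found in a pod spec (simplified)."""
    findings = []
    pod_sc = pod_spec.get("securityContext") or {}
    for container in pod_spec.get("containers", []):
        name = container.get("name", "<unnamed>")
        sc = container.get("securityContext") or {}
        if sc.get("privileged") is True:
            findings.append(f"{name}: privileged container")
        # A container-level runAsUser takes precedence over the pod-level value.
        run_as_user = sc.get("runAsUser", pod_sc.get("runAsUser"))
        if run_as_user == 0:
            findings.append(f"{name}: runs as root (runAsUser: 0)")
        # Assumption of this sketch: anything other than an explicit False is flagged.
        if sc.get("allowPrivilegeEscalation") is not False:
            findings.append(f"{name}: allowPrivilegeEscalation is not false")
    return findings


# Hypothetical specs for illustration only.
loose_spec = {
    "containers": [
        {"name": "calibre", "securityContext": {"privileged": True, "runAsUser": 0}},
    ]
}
hardened_spec = {
    "securityContext": {"runAsUser": 1000},
    "containers": [
        {"name": "calibre", "securityContext": {"allowPrivilegeEscalation": False}},
    ],
}
loose_findings = find_permissive_settings(loose_spec)
hardened_findings = find_permissive_settings(hardened_spec)
```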
 ---
 ### 3. Agentic Problem Solving
 **Scenario**: User mentions NFS performance concerns in passing.
 **What Claude Does** (without being asked):
 - Creates a test pod with appropriate tools
 - Mounts the NFS volume
 - Runs performance benchmarks (dd, fio)
 - Analyzes results and compares to expected performance
 - Cleans up test resources
 - Reports findings with context
 **Key Value**: Proactive investigation and validation. Claude doesn't wait for explicit instructions; it understands the implied need and takes action.
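The "analyzes results" step can be illustrated with a small parser for the summary line that `dd` prints on completion. This sketch assumes the GNU coreutils output format in an English locale (other `dd` builds print a different summary), and the baseline `EXPECTED_MIN_MIB_S` is a hypothetical threshold, not a figure from the original demo.

```python
import re

# Matches the GNU dd summary, e.g.
# "1073741824 bytes (1.1 GB, 1.0 GiB) copied, 4 s, 268 MB/s"
_DD_SUMMARY = re.compile(r"(\d+) bytes.*copied, ([0-9.eE+-]+) s")

# Hypothetical baseline for the NFS volume under test.
EXPECTED_MIN_MIB_S = 100.0


def dd_throughput_mib_s(summary_line: str) -> float:
    """Compute throughput in MiB/s from a GNU dd summary line."""
    match = _DD_SUMMARY.search(summary_line)
    if match is None:
        raise ValueError(f"unrecognized dd summary: {summary_line!r}")
    size_bytes = int(match.group(1))
    seconds = float(match.group(2))
    return size_bytes / seconds / (1024 * 1024)


def meets_expectation(throughput_mib_s: float, minimum: float = EXPECTED_MIN_MIB_S) -> bool:
    """Compare a measured throughput with the expected minimum."""
    return throughput_mib_s >= minimum


sample_line = "1073741824 bytes (1.1 GB, 1.0 GiB) copied, 4 s, 268 MB/s"
measured = dd_throughput_mib_s(sample_line)
```

Recomputing the rate from bytes and seconds avoids depending on the unit that `dd` chooses for its own rate column.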
 ---
 ### 4. Subtle Error Detection
 **Scenario**: "Claude, I removed the SCC from the service account and added the new one, but the pod is still using the old SCC. What did I miss?"
 **What Claude Does**:
 - Retrieves actual pod spec to see what SA it's using
 - Compares to the SA name in the user's command
 - Spots the typo: `peantuflix-sa` vs `peanutflix-sa`
 - Identifies the root cause immediately
 **Key Value**: Catches typos, wrong namespaces, label selector errors, and other "stupid mistakes" that eat 30+ minutes of senior engineer time. Humans mentally autocorrect a typo as they read it; machines compare the strings exactly.
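The comparison behind the typo catch can be sketched with the standard library's `difflib`. This is one possible approach, not a description of how Claude reasons: the function name, the 0.8 similarity cutoff, and the list of existing service accounts are assumptions of the sketch.

```python
import difflib
from typing import Optional


def suggest_service_account(requested: str, existing: list[str]) -> Optional[str]:
    """Return a near-miss name if `requested` does not exist but closely resembles one that does."""
    if requested in existing:
        return None  # exact match, nothing to flag
    matches = difflib.get_close_matches(requested, existing, n=1, cutoff=0.8)
    return matches[0] if matches else None


# Hypothetical service accounts in the namespace.
service_accounts = ["default", "builder", "peanutflix-sa"]
suggestion = suggest_service_account("peantuflix-sa", service_accounts)
```

Here the name from the user's command (`peantuflix-sa`) does not exist, but it is close enough to the service account the pod actually uses that the mismatch is worth reporting.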
 **Other Examples**:
 - Off-by-one errors in array indices
 - Copy-paste artifacts (wrong resource names)
 - Namespace mismatches
 - Label selectors that silently match nothing
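A selector that silently matches nothing is easy to detect once the selector and the pod labels are side by side. The sketch below handles only equality-based `matchLabels` (set-based `matchExpressions` are out of scope), and the function names and sample labels are hypothetical.

```python
def selector_matches(match_labels: dict[str, str], pod_labels: dict[str, str]) -> bool:
    """True when every key/value pair in the selector is present on the pod."""
    return all(pod_labels.get(key) == value for key, value in match_labels.items())


def pods_selected(match_labels: dict[str, str], pods: dict[str, dict[str, str]]) -> list[str]:
    """Return the names of pods whose labels satisfy the selector."""
    return [name for name, labels in pods.items() if selector_matches(match_labels, labels)]


# Hypothetical pods, keyed by name, with their labels.
pods_by_name = {
    "peanutflix-7d9f": {"app": "peanutflix", "tier": "web"},
    "calibre-5c2b": {"app": "calibre"},
}
good = pods_selected({"app": "peanutflix"}, pods_by_name)
typo = pods_selected({"app": "peantuflix"}, pods_by_name)
```

An empty result for a selector that was expected to match something is the signal worth surfacing to the user.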
 ---
 ### 5. Multi-Tool Orchestration
 **Scenario**: "Find all applications using the old Gitea URL and help me migrate them."
 **What Claude Does**:
 - Uses ArgoCD MCP to list all applications
 - Uses OCP MCP to examine each app's manifests
 - Uses Gitea MCP to search repo contents for the old URL
 - Proposes a migration plan with git operations
 - Can execute the migration if approved
 **Key Value**: Coordinates across multiple systems (ArgoCD, Kubernetes, Git) in a single workflow. A human would need to context-switch between tools.
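The "find and migrate" step might look roughly like the following once the ArgoCD Application objects are available as dictionaries. The sketch reads only the single-source `spec.source.repoURL` field, ignores multi-source applications, and uses made-up hostnames; the function names are hypothetical.

```python
from typing import Any
from urllib.parse import urlsplit, urlunsplit


def apps_on_old_host(apps: list[dict[str, Any]], old_host: str) -> dict[str, str]:
    """Map application name to repoURL for apps whose repo lives on `old_host`."""
    hits = {}
    for app in apps:
        url = app.get("spec", {}).get("source", {}).get("repoURL", "")
        if urlsplit(url).hostname == old_host:
            hits[app["metadata"]["name"]] = url
    return hits


def rewrite_host(url: str, new_host: str) -> str:
    """Replace the host of `url`, keeping scheme and path.

    Note: any port or credentials in the original URL are dropped.
    """
    parts = urlsplit(url)
    return urlunsplit(parts._replace(netloc=new_host))


# Hypothetical applications and hostnames for illustration only.
applications = [
    {"metadata": {"name": "calibre"}, "spec": {"source": {"repoURL": "https://gitea.old.example.com/org/calibre.git"}}},
    {"metadata": {"name": "plex"}, "spec": {"source": {"repoURL": "https://gitea.new.example.com/org/plex.git"}}},
]
to_migrate = apps_on_old_host(applications, "gitea.old.example.com")
plan = {name: rewrite_host(url, "gitea.new.example.com") for name, url in to_migrate.items()}
```

Producing the plan as data first, and applying it only after approval, mirrors the propose-then-execute flow described above.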
 ---
 ### 6. GitOps Workflow Automation
 **Scenario**: "Create a custom SCC for GPU workloads and apply it through GitOps."
 **What Claude Does**:
 - Analyzes requirements (hostPath for /dev/dri, but no privilege escalation)
 - Creates SCC manifest with appropriate constraints
 - Generates ClusterRoleBinding for service account
 - Commits both to the okd-platform repo
 - ArgoCD picks up changes and applies them
 - Validates the pod starts with correct SCC
 **Key Value**: Full GitOps workflow from requirements to validation. Everything is declarative and version-controlled.
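For illustration, an SCC manifest along these lines could be generated as follows. The field names are those of the `security.openshift.io/v1` `SecurityContextConstraints` resource, but the specific values, the SCC name `gpu-dri`, and the function name are assumptions of this sketch rather than the manifest from the original session.

```python
import json


def gpu_scc_manifest(name: str) -> dict:
    """Build an SCC that permits hostPath volumes but forbids privilege escalation."""
    return {
        "apiVersion": "security.openshift.io/v1",
        "kind": "SecurityContextConstraints",
        "metadata": {"name": name},
        # hostPath is needed to mount /dev/dri into the pod.
        "allowHostDirVolumePlugin": True,
        # Everything else stays locked down.
        "allowPrivilegedContainer": False,
        "allowPrivilegeEscalation": False,
        "allowHostNetwork": False,
        "allowHostPID": False,
        "allowHostIPC": False,
        "allowHostPorts": False,
        "readOnlyRootFilesystem": False,
        "requiredDropCapabilities": ["ALL"],
        "runAsUser": {"type": "MustRunAsRange"},
        "seLinuxContext": {"type": "MustRunAs"},
        "fsGroup": {"type": "MustRunAs"},
        "supplementalGroups": {"type": "RunAsAny"},
        "volumes": ["configMap", "emptyDir", "hostPath", "persistentVolumeClaim", "secret"],
    }


manifest = gpu_scc_manifest("gpu-dri")  # hypothetical SCC name
# JSON is valid YAML, so this text can be committed to the GitOps repo as-is.
manifest_text = json.dumps(manifest, indent=2)
```

The service account still needs permission to use the SCC, which is the role of the ClusterRoleBinding mentioned above.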
 ---
 ### 7. Root Cause Analysis
 **Scenario**: "My pod won't start. Help me debug it."
 **What Claude Does**:
 - Retrieves pod status and events
 - Identifies SCC admission errors
 - Examines the deployment manifest
 - Traces through which SCCs are available to the service account
 - Finds the specific constraint violation
 - Proposes the minimal fix (not just "use privileged")
 **Key Value**: Systematic debugging following the admission chain. Explains *why* something failed, not just *what* failed.
 **Real Example**: Diagnosed that Plex required `allowPrivilegeEscalation: true` because of s6-overlay's setuid behavior, even though hostPath access was already working.
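The idea of proposing the minimal fix can be sketched as a search over candidate SCCs, ordered from most to least restrictive, for the first one that permits everything the workload needs. This is a heavy simplification of real SCC admission: it checks only three boolean fields, treats a missing field as "not permitted", and all SCC names other than the field names themselves are hypothetical.

```python
from typing import Any, Optional

# Maps a workload requirement to the SCC field that must be true to permit it.
REQUIREMENT_TO_SCC_FIELD = {
    "privileged": "allowPrivilegedContainer",
    "privilege_escalation": "allowPrivilegeEscalation",
    "host_path": "allowHostDirVolumePlugin",
}


def violations(requirements: set[str], scc: dict[str, Any]) -> list[str]:
    """Return the SCC fields that block the given requirements (simplified)."""
    return sorted(
        field
        for requirement, field in REQUIREMENT_TO_SCC_FIELD.items()
        if requirement in requirements and not scc.get(field, False)
    )


def least_permissive_scc(requirements: set[str], sccs_ordered: list[dict[str, Any]]) -> Optional[str]:
    """Return the first SCC, in most-restrictive-first order, with no violations."""
    for scc in sccs_ordered:
        if not violations(requirements, scc):
            return scc["metadata"]["name"]
    return None


def _scc(name: str, privileged: bool, escalation: bool, host_path: bool) -> dict[str, Any]:
    """Build a minimal SCC-shaped dictionary for the example."""
    return {
        "metadata": {"name": name},
        "allowPrivilegedContainer": privileged,
        "allowPrivilegeEscalation": escalation,
        "allowHostDirVolumePlugin": host_path,
    }


# Hypothetical candidates, most restrictive first.
candidates = [
    _scc("locked-down", False, False, False),
    _scc("hostpath-only", False, False, True),
    _scc("hostpath-escalation", False, True, True),
    _scc("everything-allowed", True, True, True),
]
# Requirements modeled on the Plex example: hostPath plus privilege escalation.
plex_needs = {"host_path", "privilege_escalation"}
blocked_by = violations(plex_needs, candidates[1])
chosen = least_permissive_scc(plex_needs, candidates)
```

Reporting `blocked_by` alongside `chosen` explains why the narrower SCC failed, not just which one works.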
 ---
 ### 8. Documentation & Knowledge Capture
 **Scenario**: Throughout any complex task.
 **What Claude Does**:
 - Suggests creating documentation as issues are discovered
 - Proposes README updates with workarounds
 - Generates example manifests with inline comments
 - Creates decision records (why we chose X over Y)
 - Documents failure modes for future reference
 **Key Value**: Operational knowledge is captured in git, not lost in someone's head. Future engineers (or future-you) benefit.
 ---
 ## Demo Structure
 Each scenario should demonstrate:
 1. **Natural language input** - No YAML required from the user
 2. **Autonomous tool use** - Claude picks the right tools
 3. **Iterative problem solving** - When Plan A fails, try Plan B
 4. **GitOps-first approach** - Everything through version control
 5. **Explanation of reasoning** - Not just "do this," but "here's why"
 ## Technical Foundation
 **MCP Servers Used**:
 - `openshift-mcp-server` - Kubernetes/OpenShift API operations
 - `gitea-mcp-server` - Git repository operations
 - `argocd-mcp-server` - ArgoCD application management
 - `minio-mcp-server` - Object storage operations
 **Key Capabilities**:
 - Direct API access (no kubectl wrapper scripts)
 - Multi-step workflows with validation
 - Failure recovery and alternative approaches
 - Context retention across long conversations
 - Integration with existing GitOps workflows
 ## Notes for Demo Day
 - **Start simple**: Cluster health check is impressive but approachable
 - **Build complexity**: Show multi-tool orchestration after basics
 - **Highlight autonomy**: The agentic scenarios (NFS testing) are the most impressive
 - **Show failure handling**: Claude debugging its own mistakes is powerful
 - **Emphasize GitOps**: Everything is declarative and auditable
 ## Future Scenarios to Develop
 - **Disaster recovery**: "My cluster is down, help me restore from backup"
 - **Capacity planning**: "Will my cluster handle 10x traffic?"
 - **Security audit**: "Find all workloads running as root"
 - **Cost optimization**: "Which pods are using the most resources?"
 - **Compliance checking**: "Do all our apps meet PSS restricted standards?"