Enhance LLM sandbox with persistent caching, helper scripts, and improved UX

Added comprehensive improvements to ClawdBox for better LLM agent experience:

- Tools: Added tree, tmux, htop, strace, file, less for enhanced debugging
- Python packages: httpie, pyyaml, requests, black, ipython pre-installed
- Persistent caching: pip/npm caches now survive container restarts
- Git config persistence: .gitconfig auto-links from /data volume
- Shell improvements: colored prompt, aliases (ll, k, dc), 10k line history
- Helper scripts: ConfigMap with disk-usage, health-check, clean-workspace, install-tools
- Environment variables: TERM, TZ, DEBIAN_FRONTEND for better compatibility
- Makefile: Common operations (build, deploy, logs, shell, health-check)
- Documentation: Comprehensive README with troubleshooting and workflows

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-07 23:51:26 +11:00
parent 8fe712cda7
commit e039f77f0e
5 changed files with 562 additions and 8 deletions

253
README.md

@@ -10,11 +10,14 @@ This container provides a stable, tool-rich environment for the AI agent to:
- Use tools that aren't available in the minimal agent environment.
## Tools Included
- **Core:** curl, wget, git, jq, yq, unzip, tar, vim/nano
- **Dev:** python3 (pip/venv), build-essential, nodejs, npm
- **Network:** ping, dnsutils, net-tools
- **Core:** curl, wget, git, jq, yq, unzip, tar, vim/nano, tree, less
- **Dev:** python3 (pip/venv), build-essential, nodejs (v20), npm
- **Python Libraries:** httpie, pyyaml, requests, black, ipython
- **Network:** ping, dnsutils, net-tools, openssh-server/client, sshpass
- **Media:** ffmpeg
- **Access:** openssh-server
- **Monitoring:** htop, tmux, ncdu, strace
- **Kubernetes:** OpenShift CLI (oc)
- **Search:** ripgrep (fast grep alternative)
## Deployment (OpenShift / K8s)
@@ -47,4 +50,244 @@ This container provides a stable, tool-rich environment for the AI agent to:
```
## Access
Connect via SSH using the `claw` user (passwordless sudo enabled).
Connect via SSH using the `claw` user (passwordless sudo enabled):
```bash
ssh -p 2222 claw@clawdbox.apps.lab.apilab.us
# or
make shell
```
## Persistent Storage Structure
The `/data` volume preserves data across container restarts:
```
/data/
├── ssh/ # SSH host keys (auto-generated on first run)
├── scripts/ # Helper scripts (from ConfigMap, read-only)
│ ├── disk-usage.sh
│ ├── health-check.sh
│ ├── clean-workspace.sh
│ └── install-tools.sh
├── .cache/
│ ├── pip/ # Python package cache (persisted)
│ └── npm/ # Node package cache (persisted)
├── .local/ # User-installed Python packages (pip install --user)
├── .gitconfig # Git configuration (create to persist)
├── .bash_history # Command history (persistent)
└── [your workspace] # Your work files
```
**Storage:** 10Gi PersistentVolumeClaim (ReadWriteOnce)
## Common Tasks
### Quick Operations (Using Makefile)
```bash
make help # Show all available commands
make logs # Stream container logs
make shell # SSH into container
make disk-usage # Check storage usage
make clean-cache # Clear package caches
make redeploy # Rebuild, push, and restart
```
### Install Python Packages
Packages are cached in `/data/.cache/pip` and survive restarts:
```bash
pip3 install --user pandas numpy scikit-learn
# Installs to /data/.local/lib/python3.*/site-packages/
```
### Install Node Packages Globally
```bash
npm install -g typescript ts-node
# Cached in /data/.cache/npm
```
### Persist Git Configuration
```bash
# Inside the container:
git config --global user.name "Your Name"
git config --global user.email "you@example.com"
# Save to persistent storage:
cp ~/.gitconfig /data/.gitconfig
# (Will auto-link on next restart)
```
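The auto-link step itself is not shown in this README. A minimal sketch of what a startup hook could do, assuming a hypothetical `link_gitconfig` function (the name and its arguments are illustrative, not taken from the image) and the paths used above:

```bash
# Hypothetical startup helper: link a persisted git config into the home directory.
# Defaults follow the paths documented in this README; both can be overridden.
link_gitconfig() {
    local persisted="${1:-/data/.gitconfig}"
    local target="${2:-$HOME/.gitconfig}"

    # Nothing to do until a config has been copied onto the volume.
    [ -f "$persisted" ] || return 0

    # -s: symbolic link, -f: replace an existing file or link,
    # -n: treat an existing link at the target as a file, not a directory to enter.
    ln -sfn "$persisted" "$target"
}
```

Called with no arguments from an entrypoint, this would point `~/.gitconfig` at `/data/.gitconfig` whenever the latter exists, and do nothing otherwise.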
### Check Disk Usage
```bash
# Quick overview:
df -h /data
# Use helper script for detailed report:
/data/scripts/disk-usage.sh
# Interactive explorer:
ncdu /data
# Top 10 largest directories:
du -h /data | sort -rh | head -10
```
### Helper Scripts
Pre-loaded scripts available in `/data/scripts/`:
```bash
# Comprehensive health check:
/data/scripts/health-check.sh
# Disk usage report:
/data/scripts/disk-usage.sh
# Interactive workspace cleanup:
/data/scripts/clean-workspace.sh
# Install common tools:
/data/scripts/install-tools.sh
```
### Shell Features
The shell includes several quality-of-life improvements:
- **Colored prompt:** `claw@clawdbox:/data$` (green user, blue path)
- **Persistent history:** Command history saved to `/data/.bash_history`
- **Useful aliases:**
  - `ll` - detailed file listing (`ls -lah`)
  - `k` - kubectl shortcut
  - `dc` - docker shortcut
- **10,000-line history:** Retains up to 10,000 commands
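The shell configuration that provides these features is not part of this README. A rough equivalent, assuming a `~/.bashrc` fragment and the values listed above (the exact variables and color codes in the image may differ), might look like:

```bash
# Hypothetical ~/.bashrc fragment mirroring the features listed above.
export HISTFILE=/data/.bash_history   # keep history on the persistent volume
export HISTSIZE=10000                 # commands kept in memory
export HISTFILESIZE=10000             # lines kept in the history file

alias ll='ls -lah'
alias k='kubectl'
alias dc='docker'

# Green user@host, blue working directory, then reset colors.
PS1='\[\e[32m\]\u@\h\[\e[0m\]:\[\e[34m\]\w\[\e[0m\]\$ '
```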
## Troubleshooting
### SSH Connection Refused
**Problem:** Cannot connect via SSH
**Diagnosis:**
```bash
# Check if pod is running:
kubectl get pods -n clawdbox
# Check pod logs:
make logs
# or
kubectl logs -n clawdbox deployment/clawdbox
```
**Common causes:**
- Pod still starting (wait for startup probe to pass)
- SSH keys not mounted correctly (check secret exists)
- Route not configured (check `kubectl get route -n clawdbox`)
### Out of Disk Space
**Problem:** `/data` volume is full
**Diagnosis:**
```bash
make disk-usage
# or
ssh -p 2222 claw@clawdbox.apps.lab.apilab.us "df -h /data"
```
**Solutions:**
```bash
# Clear package caches:
make clean-cache
# Find large directories:
ncdu /data
# Clear specific caches manually:
rm -rf /data/.cache/pip/*
rm -rf /data/.cache/npm/*
```
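To notice a filling volume before it is completely full, the `df` output can be reduced to a single number. This is a sketch only; `usage_percent` is an illustrative name and not one of the bundled helper scripts:

```bash
# usage_percent PATH: print the used percentage (digits only) of the
# filesystem that holds PATH.
usage_percent() {
    # -P selects the POSIX output format (one line per filesystem);
    # column 5 is the capacity, e.g. "42%".
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}
```

For example, `[ "$(usage_percent /data)" -ge 90 ] && echo "warning: /data is above 90%"` prints a warning once usage reaches 90%.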
### Slow Package Installs
**Problem:** `pip install` or `npm install` is slow
**Diagnosis:**
Check if cache directories are properly configured:
```bash
ssh -p 2222 claw@clawdbox.apps.lab.apilab.us
echo "$PIP_CACHE_DIR" # Should show: /data/.cache/pip
echo "$npm_config_cache" # Should show: /data/.cache/npm
ls -la /data/.cache/
```
**Solution:**
If environment variables are missing, rebuild the container:
```bash
make redeploy
```
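The manual check above can also be scripted. A minimal sketch, assuming a hypothetical `check_cache_var` helper (not shipped with the image) and the expected paths stated in this README:

```bash
# check_cache_var NAME EXPECTED: compare an exported environment variable
# with the path this README says it should hold.
check_cache_var() {
    local name="$1" expected="$2" actual
    # printenv exits non-zero when the variable is unset; treat that as empty.
    actual="$(printenv "$name" || true)"
    if [ "$actual" = "$expected" ]; then
        echo "ok: $name=$actual"
    else
        echo "mismatch: $name='$actual' (expected $expected)"
        return 1
    fi
}
```

Inside the container, `check_cache_var PIP_CACHE_DIR /data/.cache/pip && check_cache_var npm_config_cache /data/.cache/npm` returns non-zero if either variable is missing or points elsewhere.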
### Pod Stuck in CrashLoopBackOff
**Problem:** Container won't start
**Diagnosis:**
```bash
kubectl describe pod -n clawdbox -l app=clawdbox
kubectl logs -n clawdbox -l app=clawdbox --previous
```
**Common causes:**
- PVC not bound (check `kubectl get pvc -n clawdbox`)
- SSH host key generation failed (check logs for errors)
- Resource limits too low (increase in deployment.yaml)
### Deployment Status
**Quick health check:**
```bash
make status
# Shows: deployment, pods, services, routes
```
## Development Workflow
### Local Development
```bash
# 1. Make changes to Dockerfile or manifests
vim Dockerfile
# 2. Build and test locally (optional)
docker build -t clawdbox:test .
# 3. Deploy to cluster
make redeploy
# 4. Watch for successful rollout
make logs
```
### Adding New Tools
Edit the Dockerfile and add to the `apt-get install` section:
```dockerfile
RUN apt-get update && apt-get install -y --no-install-recommends \
    # ... existing tools ...
    your-new-tool \
    && rm -rf /var/lib/apt/lists/*
```
Then:
```bash
make redeploy
```
## Security Notes
- **Non-root:** Container runs as UID 1000 (`claw` user)
- **SSH:** Public key authentication only (no passwords)
- **Sudo:** Passwordless sudo available for `claw` user
- **Capabilities:** All capabilities dropped except `NET_BIND_SERVICE`
- **Network:** Ingress restricted to SSH port (2222)
## Resource Limits
**Requests:**
- CPU: 500m
- Memory: 256Mi
**Limits:**
- CPU: 2000m (2 cores)
- Memory: 2Gi
Adjust in `manifests/deployment.yaml` if needed for heavy workloads.