HAND-TAGGED >>> 991 SKILLS LIVE <<<* OPEN SOURCE *NO LOGIN, NO TRACKING FRESH DROPS WEEKLY HAND-TAGGED >>> 991 SKILLS LIVE <<<* OPEN SOURCE *NO LOGIN, NO TRACKING FRESH DROPS WEEKLY HAND-TAGGED >>> 991 SKILLS LIVE <<<* OPEN SOURCE *NO LOGIN, NO TRACKING FRESH DROPS WEEKLY HAND-TAGGED >>> 991 SKILLS LIVE <<<* OPEN SOURCE *NO LOGIN, NO TRACKING FRESH DROPS WEEKLY HAND-TAGGED >>> 991 SKILLS LIVE <<<* OPEN SOURCE *NO LOGIN, NO TRACKING FRESH DROPS WEEKLY HAND-TAGGED >>> 991 SKILLS LIVE <<<* OPEN SOURCE *NO LOGIN, NO TRACKING FRESH DROPS WEEKLY
← back to homepage
Boost service mesh visibility instantlySKILL #LITY
Coding

service-mesh-observability

Boost service mesh visibility instantly

Implement comprehensive observability for service meshes including distributed tracing, metrics, and visualization. Use when setting up mesh monitoring, debugging latency issues, or implementing SLOs for service communication.

↗ github · ★ 37k·src: wshobson/agents

the manual

Service Mesh Observability

Complete guide to observability patterns for Istio, Linkerd, and service mesh deployments.

When to Use This Skill

  • Setting up distributed tracing across services
  • Implementing service mesh metrics and dashboards
  • Debugging latency and error issues
  • Defining SLOs for service communication
  • Visualizing service dependencies
  • Troubleshooting mesh connectivity

Core Concepts

1. Three Pillars of Observability

┌─────────────────────────────────────────────────────┐
│                  Observability                       │
├─────────────────┬─────────────────┬─────────────────┤
│     Metrics     │     Traces      │      Logs       │
│                 │                 │                 │
│ • Request rate  │ • Span context  │ • Access logs   │
│ • Error rate    │ • Latency       │ • Error details │
│ • Latency P50   │ • Dependencies  │ • Debug info    │
│ • Saturation    │ • Bottlenecks   │ • Audit trail   │
└─────────────────┴─────────────────┴─────────────────┘

2. Golden Signals for Mesh

SignalDescriptionAlert Threshold
LatencyRequest duration P50, P99P99 > 500ms
TrafficRequests per secondAnomaly detection
Errors5xx error rate> 1%
SaturationResource utilization> 80%

Templates and detailed worked examples

Full template library and detailed worked examples live in references/details.md. Read that file when you need the concrete templates.

Best Practices

Do's

  • Sample appropriately - 100% in dev, 1-10% in prod
  • Use trace context - Propagate headers consistently
  • Set up alerts - For golden signals
  • Correlate metrics/traces - Use exemplars
  • Retain strategically - Hot/cold storage tiers

Don'ts

  • Don't over-sample - Storage costs add up
  • Don't ignore cardinality - Limit label values
  • Don't skip dashboards - Visualize dependencies
  • Don't forget costs - Monitor observability costs

more coding

Request code reviews to catch issues early
Coding
HOT
Request code reviews to catch issues early
requesting-code-review
2@ 2 240k
Execute plans flawlessly and efficiently
Coding
HOT
Execute plans flawlessly and efficiently
executing-plans
0@ 0 240k
Finish your dev branch like a pro
Coding
HOT
Finish your dev branch like a pro
finishing-a-development-branch
0@ 0 240k
Verify feedback before you implement changes
Coding
HOT
Verify feedback before you implement changes
receiving-code-review
0@ 0 240k
Debug systematically to save time
Coding
HOT
Debug systematically to save time
systematic-debugging
0@ 0 240k
Write tests first, code with confidence
Coding
HOT
Write tests first, code with confidence
test-driven-development
0@ 0 240k
Build powerful MCP servers fast
Coding
HOT
Build powerful MCP servers fast
mcp-builder
0@ 1 156k
Transform messy data into clean spreadsheets
Coding
HOT
Transform messy data into clean spreadsheets
xlsx
0@ 0 156k