I'm Too Lazy to Check Datadog Every Morning, So I Made AI Do It
piotrgrudzien
22 points
13 comments
March 15, 2026
Related Discussions
- Agents that run while I sleep aray07 · 288 pts · March 10, 2026 · 53% similar
- My AI Agents Lie About Their Status, So I Built a Hidden Monitor kaylamathisen · 13 pts · March 04, 2026 · 53% similar
- Hackerbot-Claw: AI Bot Exploiting GitHub Actions – Microsoft, Datadog Hit So Far varunsharma07 · 12 pts · March 01, 2026 · 53% similar
- AI Team OS – Turn Claude Code into a Self-Managing AI Team cronus1141 · 40 pts · March 21, 2026 · 53% similar
- We automated everything except knowing what's going on kennethops · 85 pts · March 03, 2026 · 51% similar
Discussion Highlights (3 comments)
danpalmer
Why would one need to check Datadog every morning? Wouldn't alerts fire if there was something to do?
sgarman
I don't understand the workflow of having multiple new bugs every day that need fixing. Is bad code being shipped? Are there 1000 devs and it's just this person's job to fix everyone's bugs? Is this an extremely old and complicated codebase they are improving? Not trying to be snarky - I just don't understand how every day there are new bugs that are just error messages. If there are new bugs every day that need fixing, is the AI really good enough to know the fix from just an error?
Xeoncross
> Total alerts/errors found: 7

Apps written in exception-based languages (Java, JavaScript, PHP, etc.) are really annoying to monitor, as everything that isn't the happy path triggers an 'error'/'fatal' log/metric. Yes, you can technically work around it with (near) Go-level error verbosity (try/catches everywhere on every call), but I've never seen a team actually do that. Modern languages that don't throw exceptions for every error, like Rust, Go, and Zig, make for much more sane telemetry reports in my experience.

On this note, a login failure is not an error; it's a warning, because there is no action to take. It's an expected outcome. Errors should be actionable. WARN should be for things that in aggregate (like login failures) point to an issue.
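
The WARN-versus-ERROR distinction in the comment above could be sketched roughly as follows. This is only an illustration, not anything from the original post: the `LoginFailureMonitor` class, its `threshold` and `window_seconds` parameters, and the `auth` logger name are all hypothetical. Each failed login is logged at WARNING, and an ERROR is emitted only when failures pile up inside a time window, which is the point at which someone might actually need to act.

```python
import logging
from collections import deque

logger = logging.getLogger("auth")  # hypothetical logger name


class LoginFailureMonitor:
    """Hypothetical helper: single failures are warnings, bursts are errors."""

    def __init__(self, threshold=5, window_seconds=60.0):
        self.threshold = threshold            # failures needed to escalate
        self.window_seconds = window_seconds  # sliding window length
        self._failures = deque()              # timestamps of recent failures

    def record_failure(self, user, now):
        """Record one failed login at time `now` (seconds).

        Returns True if the failure count in the window reached the
        threshold and an ERROR was logged, otherwise False.
        """
        # An individual login failure is an expected outcome: WARNING only.
        logger.warning("login failed for user=%s", user)
        self._failures.append(now)

        # Drop failures that have fallen out of the sliding window.
        while self._failures and now - self._failures[0] > self.window_seconds:
            self._failures.popleft()

        # Escalate only when the aggregate suggests a real problem.
        if len(self._failures) >= self.threshold:
            logger.error(
                "%d login failures in the last %.0fs",
                len(self._failures),
                self.window_seconds,
            )
            self._failures.clear()  # avoid re-alerting on the same burst
            return True
        return False
```

With something like this in place, a morning scan of ERROR-level entries would surface only the bursts, while individual failed logins stay at WARNING for later aggregate analysis.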