WasteLine: AWS waste detection built around your AI coding agent
TL;DR
- WasteLine is an AWS cloud waste detection and remediation tool that runs inside your AI coding agent (Claude Code, Codex, Gemini CLI, Kiro).
- Detection is deterministic: 49 rules execute in the CLI; the agent never decides what counts as waste.
- Remediation is scripted: pre-written CLI commands, Terraform snippets, or OpenOps workflows ship with each finding.
- The Free tier runs all 49 rules on a single AWS account, no licence required, and nothing leaves your environment.
The hard part of cloud waste is not detection. It is the gap between detection and action.
Most FinOps platforms surface waste in a dashboard. Then they hand you a spreadsheet. The engineering team that owns the resource has to figure out the IAM, the Terraform, the CLI command, and the rollback. Weeks pass. The next scan surfaces the same findings, plus new ones. Waste regenerates faster than it gets remediated.
Flexera has measured this gap for years. Their State of the Cloud report has put waste at 27 to 32% of cloud spend for seven consecutive years. The 2026 edition ticked up to 29%, the first increase in five years, driven by AI workloads. Detection has never been better. Remediation is barely moving.
This is not a people problem. It is a structural one. Detection and remediation live in different tools, owned by different teams, on different timelines. There is no closed loop.
WasteLine is now generally available, and it takes a different design choice: it is AI coding agent native. The tool is shaped around the assumption that the practitioner will work through it from inside Claude Code, Codex, Gemini CLI, Kiro, or any MCP-compatible coding assistant.
This post explains why we made that choice, what it produced, and what we learned building it.
WasteLine in one paragraph
WasteLine is a CLI tool plus an MCP (Model Context Protocol, the standard interface for AI assistants to call external tools) server for AWS waste detection and remediation. It scans across 49 rules covering seven categories: orphaned resources, idle resources, overprovisioned workloads, commitment mismatches, schedule blindness, modernization opportunities, and FinOps for AI (Bedrock and SageMaker). Every finding ships with a ready-to-execute remediation artifact: an AWS CLI script, a Terraform snippet, or an OpenOps approval workflow. Findings are prioritised P1 through P4 based on cost impact, confidence level, blast radius, and estimated remediation effort. Free tier runs all 49 rules on a single AWS account with no licence required. Professional and Enterprise tiers add full remediation artifacts, executive reports, multi-account scanning, drift tracking, and scheduled Fargate scans.
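To make the output shape concrete, here is an illustrative sketch of what a prioritised finding might look like; the field names, rule ID, and values are our assumptions for illustration, not WasteLine's documented schema.

```python
# Illustrative only: field names, rule ID, and values are assumptions, not the documented schema.
example_finding = {
    "rule_id": "EBS-001",                 # hypothetical rule identifier
    "category": "orphaned",               # one of the seven categories
    "resource": "resource-3f7a9c1b2d4e",  # resource IDs are anonymised before the agent sees them
    "monthly_waste_usd": 42.10,           # estimated cost impact
    "confidence": "high",
    "blast_radius": "none",               # nothing depends on an unattached volume
    "priority": "P1",                     # derived from cost, confidence, blast radius, and effort
    "remediation": {
        "type": "delete",
        "artifact": "aws ec2 delete-volume --volume-id <volume-id>",  # PROPOSED, never auto-executed
    },
}
```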

Three principles drove the design
1. Detection rules are deterministic
49 rules, written in Python, executed by the CLI. The assistant never decides what counts as waste. It runs the scan, reads the findings, and presents them to you. The same scan run from the CLI directly, from the assistant, or from a scheduled Fargate job produces identical results. There is no LLM in the loop on the question "is this resource wasted?" That question has a written rule, not a model output.
This matters because an LLM running detection without deterministic rules underneath will produce findings that drift between runs, hallucinate counts, and miss edge cases. None of that is acceptable when a CFO is asked to commit to a savings target. It is also not acceptable in a CI/CD pipeline, where the same input must produce the same output.
| Aspect | Deterministic detection (WasteLine) | LLM-generated detection |
|---|---|---|
| Same input, same output | Yes, every time | No, drifts between runs |
| Auditable | Rule code is the audit trail | Model logs and prompts |
| Suitable for CI/CD | Yes | No, non-deterministic by design |
| Edge case handling | Defined explicitly in code | Inferred from training data |
| Update process | Pull request, code review | Prompt engineering |
| CFO defensibility | High, traceable to written rule | Low, depends on model trust |
The determinism guarantee is enforced by the architecture, not by policy. The detection logic is plain Python with explicit thresholds, lookback windows, and confidence scoring rules. The MCP server exposes the rules as 11 callable tools but does not let the agent rewrite them. The output schema is signed with SHA-256 integrity checksums, so any tampering between scan and report is detectable.
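As a rough illustration of that last point (the report format and exact fields are the tool's own; this only shows the standard-library pattern), a SHA-256 integrity checksum over a scan result can be produced and verified in a few lines of Python:

```python
import hashlib
import json

def sign_report(findings: list[dict]) -> dict:
    """Attach a SHA-256 checksum so tampering between scan and report is detectable.
    Sketch only; the real report schema is defined by the tool."""
    payload = json.dumps(findings, sort_keys=True, separators=(",", ":"))
    return {
        "findings": findings,
        "sha256": hashlib.sha256(payload.encode("utf-8")).hexdigest(),
    }

def verify_report(report: dict) -> bool:
    """Recompute the checksum and compare it against the one stored in the report."""
    payload = json.dumps(report["findings"], sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest() == report["sha256"]
```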
2. Remediation artifacts are scripted, not generated
Each finding ships with a ready-to-execute remediation: an AWS CLI command, a Terraform snippet, or an OpenOps approval workflow. Pre-written. Reproducible. Reviewable. Version-controlled if you choose. The assistant does not write the remediation script in real time. It pulls it from the tool's library, fills in the parameters, and waits for your approval before anything runs.
This matters because remediation under time pressure is where mistakes happen. A wrong parameter, a missed dependency, a script that runs in the wrong account. Scripted artifacts let your team review what is going to run before it runs, and let you keep the script in a git repository if you want an audit trail.
The remediation library covers four artifact types: delete (16 rules), modify (7 rules), add_config (3 rules), and tag (1 rule). Fifteen additional rules are advisory: they produce guidance text, not executable scripts, because the right action depends on context the tool cannot see (commitment renewal strategy, business hours definition, model migration timing). We intentionally chose to ship advisory output for those cases rather than fabricate a script that would be wrong half the time.
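A minimal sketch of what "scripted, not generated" means in practice; the template text and helper below are our illustration of the pattern, not the actual remediation library:

```python
# Illustrative sketch: a pre-written remediation template, parameterised at scan time.
# The real library ships its own templates; this only shows the pattern.
EBS_DELETE_TEMPLATE = """\
# PROPOSED remediation -- review before running
# Finding: unattached EBS volume
aws ec2 create-snapshot --volume-id {volume_id} --description "pre-delete backup"
aws ec2 delete-volume --volume-id {volume_id}
"""

def render_remediation(volume_id: str) -> str:
    """Fill parameters into the template. Nothing is executed; the script is a proposal."""
    return EBS_DELETE_TEMPLATE.format(volume_id=volume_id)

print(render_remediation("vol-0abc123def4567890"))
```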
3. The assistant's role is install, scoping, and dialogue
The agent reads a bootstrap prompt, helps you generate a scoped IAM policy (read-only, no write permissions anywhere), runs the scan, opens the dashboard locally, and walks you through the priorities. From paste-prompt to a prioritised action plan: about 15 minutes. Before any data reaches the assistant, resource IDs and ARNs (Amazon Resource Names, the unique identifiers for AWS resources) are anonymised. The agent sees waste categories, rule IDs, and aggregated counts. It does not see your account ID or your bucket names.
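A minimal sketch of the anonymisation idea, assuming a simple one-way hash; the actual scheme WasteLine uses may differ:

```python
import hashlib

def anonymise_arn(arn: str) -> str:
    """Replace a concrete ARN with a stable, non-reversible token before any data
    reaches the assistant. Sketch only; the real anonymisation scheme may differ."""
    digest = hashlib.sha256(arn.encode("utf-8")).hexdigest()[:12]
    return f"resource-{digest}"

# The agent would see a token like 'resource-3f7a9c1b2d4e' plus category and rule ID,
# never the account ID or bucket name embedded in the original ARN.
print(anonymise_arn("arn:aws:s3:::my-production-bucket"))
```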
This is not "let an LLM run FinOps for you". It is "use your coding assistant to remove the friction between you and a deterministic FinOps tool".
That difference is the point. The first framing puts the LLM in the position of the practitioner, with all the risks that come with non-determinism, hallucination, and audit trails that read like model logs. The second framing puts the LLM in the position of an assistant, which is what current models are reliably good at. We have written more on this design pattern in our post on Model Context Protocol as the runtime for agentic FinOps.
What we learned building WasteLine
We started this project with a thesis (deterministic detection, scripted remediation, agent-driven interface) and 18 months of practitioner experience running cloud assessments. Three lessons surfaced during development that we think generalise beyond WasteLine.
Lesson 1: Read-only is the precondition, not the feature
A lot of FinOps tools claim to be safe and bury that claim in marketing copy. We treated it as a non-negotiable engineering constraint and built the safety properties into the architecture.
There are no write API calls in the provider code. This is enforced in two ways.
First, an AST static check runs in CI and blocks any pull request that introduces a write API prefix (Delete, Modify, Create, Tag, Put, etc.). Over 55 dangerous prefixes are checked.
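As a sketch of what such a CI check can look like (this is our illustration, not WasteLine's actual script, and only a handful of the prefixes are shown):

```python
import ast
import sys

# Subset of write prefixes, shown as boto3-style snake_case method names.
WRITE_PREFIXES = ("delete_", "modify_", "create_", "tag_", "put_")

def find_write_calls(source: str) -> list[str]:
    """Return attribute calls whose method name starts with a write prefix. Sketch only."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute):
            if node.func.attr.startswith(WRITE_PREFIXES):
                hits.append(f"line {node.lineno}: {node.func.attr}")
    return hits

if __name__ == "__main__":
    violations = find_write_calls(open(sys.argv[1]).read())
    if violations:
        print("\n".join(violations))
        sys.exit(1)  # non-zero exit blocks the pull request in CI
```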
Second, at runtime, a SafeBoto3Session class wraps the AWS SDK and only permits prefixes from a read-only allowlist (describe_, list_, get_, head_, generate_presigned). Any other call is blocked at runtime, regardless of what the calling code intends.
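A rough sketch of the runtime allowlist idea; the wrapper below is our illustration, and the real SafeBoto3Session will differ in detail:

```python
import boto3

READ_ONLY_PREFIXES = ("describe_", "list_", "get_", "head_", "generate_presigned")

class ReadOnlyClient:
    """Illustrative wrapper: blocks any SDK call whose name is not read-only.
    WasteLine's SafeBoto3Session enforces the same property; details may differ."""

    def __init__(self, service: str, **kwargs):
        self._client = boto3.client(service, **kwargs)

    def __getattr__(self, name: str):
        if not name.startswith(READ_ONLY_PREFIXES):
            raise PermissionError(f"Blocked non-read-only call: {name}")
        return getattr(self._client, name)

# ec2 = ReadOnlyClient("ec2")
# ec2.describe_volumes()        # allowed
# ec2.delete_volume(...)        # raises PermissionError, regardless of IAM permissions
```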
The IAM policy generator outputs only read permissions, and the bundled policy is tested in CI to ensure no write action ever sneaks in. The remediation generator does not execute; it produces scripts as proposals, with PROPOSED headers stamped on every artifact. The user keeps every approval gate.
The lesson generalises: when a property is critical, encode it in the architecture so it cannot be undone by a careless commit. Documentation and policy are not enough. A safety property that depends on a future contributor remembering to do the right thing is not a safety property.
Lesson 2: The Tweet Test, 9 minutes or less
Every UX decision was measured against a single bar: a new user should go from "pip install wasteline" to seeing real findings in their dashboard within 9 minutes. Why 9? Because it fits in a coffee or Twitter break and roughly matches the patience of a senior engineer evaluating a new tool. If a feature added more than 30 seconds to that path, it needed exceptional justification.
This was brutal, but it forced clarity. Several "powerful" features got cut because they made first-time use slower. The interactive setup wizard was added because manual IAM configuration was eating four minutes for first-time users on a fresh AWS account. The wasteline demo mode was added because some evaluators wanted to see findings before committing to the IAM setup time. The dashboard --open flag was added because the manual sequence (run the scan, open a browser, load the file) was costing two minutes of friction.
The lesson generalises: most product roadmaps are organised around features. The roadmap that produces a tool people actually use is organised around the time-to-first-value path. Every feature lengthens or shortens that path. Knowing which one before you build is what separates tools people install from tools people use.
Lesson 3: Validate everything an AI suggests, including itself
We used AI heavily in development: Claude Code primarily, with Codex reviews and exploratory work in Cursor. We learned the hard way that AI suggestions need validation, even when they are confident.
Three failure modes recurred.
First, AI code reviews can be stale. We had Codex reviews claim that 5 out of 6 prior fixes were absent from the codebase; on validation, 4 of those 6 were already present. The review had run against an older snapshot of the branch. Acting on the review without validation would have wasted hours and possibly re-introduced bugs.
Second, AI-generated tests can pass while AI-generated implementation has subtle bugs. We hit a case where an annotation priority_dist: dict[str, int] = Counter() passed every unit test in pure Python (because Counter is a subclass of dict), but broke at runtime in a Cython-compiled wheel because Cython enforces type annotations strictly. The unit tests, AI-written, ran against source. The bug only surfaced when a real user ran the published wheel. The test suite was not wrong, but it was not sufficient.
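A short sketch of the mismatch, and two ways to avoid it; this reproduces the pattern described above, not the exact code that shipped:

```python
from collections import Counter

# In pure Python, annotations are not enforced, so assigning a Counter to a
# name annotated as dict[str, int] works (Counter is a dict subclass anyway):
priority_dist: dict[str, int] = Counter()
priority_dist["P1"] += 1  # fine under CPython, and every unit test passes

# Under a compiled wheel that enforces annotations strictly, safer options are
# to annotate with the concrete type, or to convert explicitly:
priority_dist_strict: Counter[str] = Counter()
priority_dist_plain: dict[str, int] = dict(Counter({"P1": 1}))
```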
Third, AI confidence does not equal correctness. We were told a fix was complete and verified more than once when it was incomplete or had been only partially deployed.
The discipline we developed: no AI suggestion ships without explicit human validation against the actual current code state. Every external code review is classified before it is acted on (confirmed, partially confirmed, stale, unsupported). Integration tests run end-to-end CLI commands, not just unit tests. AI accelerates work; it does not replace verification.
This is the meta-lesson behind WasteLine's design choice. A tool that you use AI agents to install and operate has to be more rigorous than a tool you operate by hand, not less. The agent removes friction, and some of that friction was what used to catch bugs.
What is next
Two integrations are on the roadmap, both extending the action loop further:
Slack, for routing findings to the team that owns the resource, with feedback flowing back to the dashboard. Useful for organisations with mature ownership models and clear team boundaries.
GitHub, for tracking remediation pull requests through to merge, so you know which findings actually got fixed and which got abandoned. Useful for closing the operational loop on remediation, which is where most FinOps tools stop.
Beyond integrations, the next major milestones are Azure and GCP providers. Azure is in design, planned for Q3 2026. GCP follows. The architecture has been built to support multi-cloud from day one (the providers/ directory has placeholder modules for Azure and GCP), but each provider requires its own detection rule library, pricing logic, and IAM configuration.
We are also evaluating community contributions for new detection rules. The rule architecture is documented and intentionally modular: each rule is a single Python file with scan(), generate_remediation(), and assess_blast_radius() methods. If you have a detection rule you would like to contribute, the path is open.
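For a sense of what a contributed rule might look like, here is a sketch following that interface; the class shape, parameter names, and return fields are our assumptions, and the actual base class and signatures live in the repository:

```python
# Sketch of a contributed rule file, following the documented interface:
# one Python file exposing scan(), generate_remediation(), and assess_blast_radius().
# Parameter names and return fields below are illustrative assumptions.

class UnattachedEbsVolumesRule:
    rule_id = "EBS-001"      # hypothetical ID
    category = "orphaned"

    def scan(self, session):
        """Return findings using read-only API calls only."""
        ec2 = session.client("ec2")
        volumes = ec2.describe_volumes(
            Filters=[{"Name": "status", "Values": ["available"]}]
        )["Volumes"]
        return [{"resource_id": v["VolumeId"], "size_gb": v["Size"]} for v in volumes]

    def generate_remediation(self, finding):
        """Return a PROPOSED script; never execute it."""
        return f"aws ec2 delete-volume --volume-id {finding['resource_id']}"

    def assess_blast_radius(self, finding):
        """An unattached volume has nothing depending on it."""
        return "none"
```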
Try it
Free tier on a single AWS account, all 49 rules, no licence required.
```
pip install wasteline
wasteline setup
wasteline scan --open
```

Or, to evaluate without AWS credentials:

```
wasteline demo
wasteline browser -i wasteline-demo-*.json
```

Hosted dashboard with a sample scan: https://wasteline.vercel.app
Landing: https://wasteline.optimnow.io
If you have feedback on the Slack vs GitHub integration choice, or on any other aspect of the tool, the comments below are open. Honest critique is what shapes the next milestones.
Frequently asked questions
What is WasteLine?
WasteLine is an AWS cloud waste detection and remediation tool built by OptimNow. It scans AWS environments for orphaned, idle, overprovisioned, commitment-mismatched, and schedule-blind resources across 49 deterministic detection rules. Every finding ships with a ready-to-execute remediation artifact (AWS CLI command, Terraform snippet, or OpenOps approval workflow). It is designed to run inside AI coding agents (Claude Code, Codex, Gemini CLI, Kiro) but the agent never decides what counts as waste.
How is WasteLine different from SaaS FinOps platforms?
SaaS FinOps platforms typically require connecting your billing data to their backend through cross-account IAM roles, then surface recommendations in their dashboard. WasteLine runs client-side, in your environment, with no SaaS backend, no telemetry, and no cross-account vendor access. Detection rules execute deterministically in the CLI rather than as model outputs. Setup takes about 15 minutes from pip install to a prioritised action plan, compared to weeks for typical SaaS onboarding. WasteLine also ships ready-to-execute remediation scripts with each finding, where most platforms stop at recommendations.
Can WasteLine modify my cloud resources?
No. WasteLine is read-only by design. The provider code contains zero write API calls (Delete, Modify, Create, Tag, Put, and similar prefixes), enforced by AST static checks in CI and a runtime SDK allowlist (SafeBoto3Session) that blocks any non-read API call. The IAM policy generator outputs read-only permissions and is tested in CI. Remediation artifacts are generated as proposals with PROPOSED headers; the user keeps every approval gate. WasteLine never executes changes against your cloud environment.
Which AI coding agents does WasteLine support?
WasteLine ships an MCP (Model Context Protocol) server with 11 callable tools. It works with any MCP-compatible coding assistant, including Claude Code (which auto-discovers via the .mcp.json file in the repo), Claude Desktop, Codex CLI, Gemini CLI, and Kiro. The CLI itself is invokable from any shell or assistant that has shell access. A bootstrap prompt in the quickstart pack lets the assistant drive the full assessment from install to remediation, pausing for user approval at each step.
Is there a free tier, and what does it include?
Yes. The Free tier requires no licence key and runs all 49 detection rules on a single AWS account. Five rules ship with full finding detail (unattached EBS volumes, orphaned Elastic IPs, empty S3 buckets, unused security groups, plus one more). The remaining 44 rules are shown as summaries with rule ID, category, and estimated waste, with full details unlocked at the Professional tier. The Free tier includes the local dashboard and JSON or CSV export. Demo mode (wasteline demo) generates synthetic findings without requiring AWS credentials.



