Audit your developer tools: a CLI that reports overlap, usage, and cost per project
2026-02-19

Build a CLI that scans repos, CI configs, and billing to score tool usage and recommend consolidation—start saving time and money in 2026.

Stop wasting time and budget: audit your dev tools from the CLI

If your teams juggle a dozen code-quality tools, three CI providers, and a pile of SaaS subscriptions that no one can justify, you're not alone. In 2026 the proliferation of AI assistants and micro‑SaaS has accelerated tool sprawl — and the bills, complexity, and security surface area keep growing. This guide shows how to build a practical CLI that scans repos, CI configs, and billing data to produce a per‑project tool usage score and clear consolidation suggestions.

Why audit developer tools now (2026 context)

Two recent trends make a focused tool audit essential:

  • SaaS + AI sprawl: 2024–2026 saw an explosion of niche AI dev tools and SaaS integrations. Many are underused but still billed monthly.
  • FinOps and observability consolidation: Cloud teams are applying FinOps principles to dev tools and observability platforms. Consolidation can save 10–40% on tooling costs when done correctly.

An automated CLI audit helps you answer three business questions fast: Which tools are actually used? Where are overlaps? How much do they cost per project?

What this CLI does — overview

The CLI I describe below scans three inputs and produces a JSON/terminal report:

  • Repository footprint: package manifests, Dockerfiles, workflow files, and code references
  • CI configs: job definitions, matrix entries, and invocation frequency
  • Billing data: cloud invoices, SaaS invoices or normalized charge items

Output: a per‑project tool usage score, normalized cost-per-tool, and prioritized consolidation suggestions with expected annual savings estimates.

Design: collectors, normalizers, scorer, reporter

Build the CLI as a pipeline with four stages:

  1. Collectors — scan local repos or clone them, read CI config files (e.g., .github/workflows, .gitlab-ci.yml), and pull billing exports.
  2. Normalizers — map tokens and vendor names (e.g., "snyk" vs "snyk-ci") to canonical tool identifiers.
  3. Scorer — compute usage metrics and a composite score per tool per project.
  4. Reporter — produce terminal summaries, JSON, and suggested consolidation actions.
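
Before diving into the prototype, here's a minimal sketch of how the four stages compose. The stage functions are illustrative stubs, not a fixed API; each passes plain dicts to the next:

import json

def collect(repo_path):
    # Collector: gather raw tool references from one repo (stubbed)
    return {'repo': repo_path, 'raw_tools': ['Snyk', 'eslint']}

def normalize_stage(raw):
    # Normalizer: map raw vendor strings to canonical identifiers
    return {**raw, 'tools': sorted({t.lower() for t in raw['raw_tools']})}

def score_stage(norm, billing):
    # Scorer: attach a usage score and cost per tool (presence-only stub)
    return {norm['repo']: {t: {'score': 1.0, 'cost': billing.get(t, 0.0)}
                           for t in norm['tools']}}

def report_stage(scored):
    # Reporter: render results (JSON here; terminal tables in practice)
    return json.dumps(scored, indent=2)

def run_audit(repo_paths, billing):
    results = {}
    for p in repo_paths:
        results.update(score_stage(normalize_stage(collect(p)), billing))
    return report_stage(results)

print(run_audit(['payment-service'], {'snyk': 98.0}))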

Scoring model (example)

The score balances presence and activity with cost and risk. A simple formula you can tune:

# score components per tool per project, each normalized to 0..1
def tool_score(found_in_repo, job_count, active_calls, monthly_cost, has_known_vulns):
    presence = 1.0 if found_in_repo else 0.0
    ci_activity = min(1.0, job_count / 10)
    dev_activity = min(1.0, active_calls / 50)   # e.g., API calls or integrations
    cost_factor = min(1.0, monthly_cost / 1000)  # higher cost increases scrutiny
    security_risk = 1.0 if has_known_vulns else 0.0
    # composite (weights configurable)
    return (0.35 * presence + 0.30 * ci_activity + 0.20 * dev_activity
            - 0.10 * cost_factor - 0.05 * security_risk)
# higher score => more justified to keep; low/negative => consolidation candidate

Use thresholds for recommendations: score < 0.2 -> deprecation candidate; 0.2–0.5 -> review; > 0.5 -> keep (or consolidate ownership under fewer teams).

Quick prototype: Python CLI (runnable)

Below is a compact Python prototype using Click and PyYAML to scan a directory of repos for package manifests, GitHub Actions workflows, and a CSV of billing lines. It produces a JSON report with scores and suggestions.

#!/usr/bin/env python3
# file: tool_audit.py
import os
import re
import json
import csv
import yaml
import click
from collections import defaultdict

# Simple normalizer map
CANONICAL = {
    'snyk': 'snyk',
    'dependabot': 'dependabot',
    'eslint': 'eslint',
    'prettier': 'prettier',
    'codecov': 'codecov',
    'coveralls': 'coveralls',
}

def normalize(name):
    key = name.lower()
    return CANONICAL.get(key, key)

def scan_repo(path):
    tools = defaultdict(lambda: {'presence':0,'ci_jobs':0,'dev_refs':0})  # dev_refs stays 0 in this prototype; see production notes below
    # look for package.json
    pj = os.path.join(path, 'package.json')
    if os.path.isfile(pj):
        try:
            with open(pj) as f:
                data = json.load(f)
            for dep in list(data.get('dependencies',{}))+list(data.get('devDependencies',{})):
                t = normalize(dep)
                tools[t]['presence'] = 1
        except Exception:
            pass
    # requirements.txt
    rq = os.path.join(path, 'requirements.txt')
    if os.path.isfile(rq):
        with open(rq) as f:
            for line in f:
                if line.strip() and not line.startswith('#'):
                    # strip version specifiers (==, >=, extras, etc.) before normalizing
                    t = normalize(re.split(r'[=<>!~\[;]', line.strip())[0])
                    tools[t]['presence'] = 1
    # GitHub workflows
    ghw = os.path.join(path, '.github', 'workflows')
    if os.path.isdir(ghw):
        for fn in os.listdir(ghw):
            if fn.endswith(('.yml', '.yaml')):
                try:
                    with open(os.path.join(ghw, fn)) as f:
                        doc = yaml.safe_load(f) or {}
                    jobs = doc.get('jobs', {})
                    for job in jobs.values():
                        steps = job.get('steps', [])
                        for s in steps:
                            if 'uses' in s:
                                # strip the version pin: "owner/tool@v4" -> "owner/tool"
                                uses = s['uses'].split('@')[0]
                                if 'actions/checkout' in uses:
                                    continue
                                tool = normalize(uses.split('/')[1] if '/' in uses else uses)
                                tools[tool]['ci_jobs'] += 1
                except Exception:
                    pass
    return tools

def ingest_billing(csv_path):
    # CSV with columns: project, tool, amount
    billing = defaultdict(lambda: defaultdict(float))
    if not os.path.isfile(csv_path):
        return billing
    with open(csv_path, newline='') as f:
        for r in csv.DictReader(f):
            proj = r.get('project') or 'unattributed'
            tool = normalize(r.get('tool', ''))
            amount = float(r.get('amount') or 0)
            billing[proj][tool] += amount
    return billing

@click.command()
@click.argument('repos_dir', type=click.Path(exists=True))
@click.option('--billing', type=click.Path(), help='billing.csv')
@click.option('--out', type=click.Path(), default='audit_report.json')
def main(repos_dir, billing, out):
    report = {}
    billing_data = ingest_billing(billing) if billing else {}
    for name in os.listdir(repos_dir):
        p = os.path.join(repos_dir, name)
        if not os.path.isdir(p):
            continue
        tools = scan_repo(p)
        report[name] = {}
        for tool, meta in tools.items():
            monthly_cost = billing_data.get(name, {}).get(tool, 0.0)
            ci_activity = min(1.0, meta['ci_jobs'] / 5)
            dev_activity = min(1.0, meta['dev_refs'] / 20)
            presence = meta['presence']
            cost_factor = min(1.0, monthly_cost / 1000)
            score = 0.35*presence + 0.30*ci_activity + 0.20*dev_activity - 0.10*cost_factor
            suggestion = 'keep' if score>0.5 else ('review' if score>0.2 else 'consider remove')
            report[name][tool] = {
                'presence': presence,
                'ci_jobs': meta['ci_jobs'],
                'monthly_cost': monthly_cost,
                'score': round(score,3),
                'suggestion': suggestion
            }
    with open(out, 'w') as f:
        f.write(json.dumps(report, indent=2))
    print(f'Wrote {out}')

if __name__ == '__main__':
    main()
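
Run it against a folder of checked-out repos, optionally passing a billing CSV:

python tool_audit.py ./repos --billing billing.csv --out audit_report.json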

This prototype is intentionally small. Productionize by adding concurrency, caching, richer dev activity signals (git blame, API calls), and secure billing integration.

Quick Node.js scanner (one-file)

For teams using Node toolchains, here's a nimble scanner to detect package.json dependencies and GitHub Actions 'uses' entries.

// file: quick-scan.js
const fs = require('fs');
const path = require('path');
const yaml = require('js-yaml');

function scanRepo(dir){
  const result = {};
  const pj = path.join(dir, 'package.json');
  if (fs.existsSync(pj)){
    const pkg = JSON.parse(fs.readFileSync(pj,'utf8'));
    // devDependencies count as presence too
    const deps = Object.keys({...(pkg.dependencies||{}), ...(pkg.devDependencies||{})});
    Object.assign(result, Object.fromEntries(deps.map(k=>[k, {presence:1}])));
  }
  const wfdir = path.join(dir, '.github','workflows');
  if (fs.existsSync(wfdir)){
    for (const f of fs.readdirSync(wfdir)){
      if (f.endsWith('.yml')||f.endsWith('.yaml')){
        const doc = yaml.load(fs.readFileSync(path.join(wfdir,f),'utf8'))||{};
        const jobs = doc.jobs||{};
        Object.values(jobs).forEach(job=>{
          (job.steps||[]).forEach(s=>{
            if (!s.uses) return;
            const ref = s.uses.split('@')[0];    // "owner/tool@v4" -> "owner/tool"
            const t = ref.split('/')[1] || ref;
            result[t] = result[t] || {ci_jobs:0};
            result[t].ci_jobs = (result[t].ci_jobs||0)+1;
          })
        })
      }
    }
  }
  return result;
}

console.log(JSON.stringify(scanRepo(process.argv[2]||'.'), null, 2));
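
Run it against a single repo (js-yaml must be installed, e.g. npm install js-yaml):

node quick-scan.js path/to/repo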

Parsing CI configs: patterns to detect

Each CI system stores tooling references differently. Detect these patterns:

  • GitHub Actions: uses: owner/tool@version — extract the owner/tool pair and action name
  • GitLab CI: image: registry/tool:tag or script: install cli tools
  • CircleCI: docker images and orbs references
  • Jenkins pipelines: libraryResource or shared libs; groovy steps invoking tools

Tip: search for common executable names in scripts (snyk, npm, pip, curl to vendor endpoints) and normalize results.
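
A minimal sketch of that tip, scanning script bodies with a regex; the executable list here is a starter set, not exhaustive:

import re

# known tool executables to look for in CI scripts (extend with your vendors)
EXECUTABLES = re.compile(r'\b(snyk|trivy|semgrep|npm|pip|terraform|curl)\b')

def find_tools_in_script(script_text):
    # Return the set of known tool names invoked in a script block
    return set(EXECUTABLES.findall(script_text))

print(find_tools_in_script('npm install -g snyk && snyk test'))  # {'npm', 'snyk'}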

Integrating billing data

Billing is the hardest part but the most valuable. Sources in 2026:

  • Cloud billing exports (AWS Cost Explorer/Reports, GCP Billing exports to BigQuery, Azure Cost Management)
  • SaaS invoice exports (Stripe, Chargebee, vendor billing portals)
  • Internal chargebacks or cost-allocation data (FinOps systems)

Strategy:

  1. Export line items with a tool label (vendor or SKU). If vendors provide tags, use them to map to projects.
  2. Normalize recurring SaaS invoices to monthly and attribute share to projects using a simple heuristic: code repo mentions, CI job invocations, and active user lists.
  3. Use a conservative approach: if attribution is ambiguous, surface it as "unattributed" to avoid overclaiming savings.

Example: an AWS Cost Explorer query that filters by the tag Project=project-name and groups by service yields monthly spend per project. For SaaS, ingest a CSV export from Stripe with invoice descriptions and map the descriptions to canonical tools.
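
A sketch of that query with boto3; the tag key Project, the project name, and the date range are placeholders, and read-only Cost Explorer credentials are assumed:

import boto3  # assumes read-only Cost Explorer credentials are configured

ce = boto3.client('ce')
# Monthly spend for one project tag, grouped by service; tag key and
# values below are placeholders for your own cost-allocation tags.
resp = ce.get_cost_and_usage(
    TimePeriod={'Start': '2026-01-01', 'End': '2026-02-01'},
    Granularity='MONTHLY',
    Metrics=['UnblendedCost'],
    GroupBy=[{'Type': 'DIMENSION', 'Key': 'SERVICE'}],
    Filter={'Tags': {'Key': 'Project', 'Values': ['payment-service']}},
)
for period in resp['ResultsByTime']:
    for group in period['Groups']:
        print(group['Keys'][0], group['Metrics']['UnblendedCost']['Amount'])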

Consolidation suggestions — rules and heuristics

Recommendations should be prioritized with estimated savings and migration effort:

  • Duplicate tools: If two tools cover the same capability (e.g., code scanning) and one has low usage + high cost, suggest migration after capability mapping.
  • Underused paid tools: Low score and non-critical functionality -> cancel or trial-free alternatives.
  • CI provider consolidation: Multiple CI providers increase matrix runs. Consolidate where test time and artifacts align; estimate savings from fewer parallel runners.
  • Open source vs SaaS: For basic needs, an OSS alternative + managed infra might be cheaper, but factor maintenance costs.

Each suggestion should include: expected annual savings, migration complexity (low/medium/high), and key blockers (data export, vendor lock-in, required team training).
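
One way to make that structure concrete is a small record type; the field names here are illustrative, not a fixed schema:

from dataclasses import dataclass, field

@dataclass
class Suggestion:
    tool: str
    project: str
    action: str                   # 'remove' | 'migrate' | 'review'
    annual_savings_usd: float
    migration_complexity: str     # 'low' | 'medium' | 'high'
    blockers: list = field(default_factory=list)

s = Suggestion('codecov', 'payment-service', 'remove', 1980.0, 'low',
               ['confirm coverage gating is not required'])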

Running at scale: multi-repo and performance

For organizations with hundreds of repos, design for scale:

  • Parallelize repo scans with worker pools and rate-limit CI/billing API calls.
  • Store normalized results in SQLite or a small Postgres instance for aggregation across time.
  • Cache fingerprints (a checksum of workflow files, as sketched below); rescan only on changes to reduce cost.
  • Provide an incremental scan mode for CI-based scheduled runs (e.g., nightly GitHub Actions).
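
A minimal sketch of the fingerprint idea, assuming GitHub-style workflow layout:

import hashlib
from pathlib import Path

def workflow_fingerprint(repo):
    # Checksum of all workflow files; rescan only when this changes
    h = hashlib.sha256()
    for f in sorted(Path(repo, '.github', 'workflows').glob('*.y*ml')):
        h.update(f.read_bytes())
    return h.hexdigest()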

Security and privacy (must-haves)

The CLI will touch billing and repo metadata. Follow these rules:

  • Least privilege — use read-only tokens scoped to billing exports and repo metadata; never store secrets in plain text.
  • Audit logs — record who ran the audit and when; ideal for governance reviews.
  • Encryption — encrypt billing data at rest and in transit. Rotate keys regularly.
  • Data minimization — do not collect code contents beyond filenames and identified tool references unless necessary; keep PII out.

Sample output (terminal + JSON)

A succinct terminal summary should lead, with optional detailed JSON for analysts. Example terminal summary:

Project: payment-service
  - eslint: score 0.72 (keep) monthly $0
  - Codecov: score 0.18 (consider remove) monthly $180 -> annual saving $1,980
  - Sentry: score 0.42 (review) monthly $400 -> investigate duplicate with observability platform

  Top consolidation recommendation: Remove Codecov from payment-service (low usage, high cost). Estimated annual savings: $1,980.

The corresponding JSON should include attribution metadata and confidence scores for automation or dashboards.
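
A hypothetical shape for one JSON entry (field names are illustrative):

{
  "project": "payment-service",
  "tool": "codecov",
  "score": 0.18,
  "monthly_cost": 180.0,
  "suggestion": "consider remove",
  "attribution": {"source": "stripe_invoice", "confidence": 0.8}
}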

Advanced strategies and 2026 predictions

Looking ahead in 2026, adopt these advanced tactics:

  • LLM-assisted mapping — use a local LLM or hosted vector search to map tool feature overlap automatically (e.g., static analysis vs SCA vs secret scanning) by ingesting vendor docs and changelogs.
  • Predictive FinOps — use historical billing + usage signals to predict future spend and estimate consolidation ROI with a probabilistic model.
  • Policy enforcement — integrate with policy-as-code (e.g., OPA) to block adding new paid tools without review if similar tools exist.
  • OpenTelemetry & observability consolidation — expect more teams to migrate to OpenTelemetry-based stacks; audit should flag duplicated instrumentation costs.

Actionable checklist to start this week

  1. Run a quick scan of 5 high-impact repositories using the Node quick-scan.js and capture package and workflow footprints.
  2. Export the last 12 months of SaaS invoices (Stripe/Chargebee) and cloud cost tags for two projects to a CSV for ingestion.
  3. Run the Python prototype against the repos folder with the CSV to generate an initial report.
  4. Prioritize one low-effort consolidation (e.g., cancel an underused paid test-coverage tool) and realize the savings within 30 days.
  5. Automate the scan as a scheduled workflow (see the example below) and send summaries to a FinOps Slack channel for review.
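
A minimal sketch of step 5 as a scheduled GitHub Actions workflow; the paths, schedule, and install line are placeholders to adapt, and a Slack notification step would follow the upload:

# .github/workflows/tool-audit.yml (illustrative)
name: nightly-tool-audit
on:
  schedule:
    - cron: '0 3 * * *'  # nightly at 03:00 UTC
jobs:
  audit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.12'
      - run: pip install click pyyaml
      - run: python tool_audit.py ./repos --billing billing.csv --out audit_report.json
      - uses: actions/upload-artifact@v4
        with:
          name: audit-report
          path: audit_report.json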

Common pitfalls and caveats

  • Do not equate presence with importance—consult owners before canceling any tool.
  • Attribution is probabilistic. Surface confidence instead of absolute claims.
  • Migration costs can outweigh license savings — factor engineering time and risk into recommendations.

Wrap-up: what you’ll get and next steps

A small CLI that correlates repo footprints, CI activity, and billing will transform vague feelings about tool bloat into actionable, prioritized consolidation work. In 2026, pairing this audit with FinOps and LLM-powered analysis gives teams faster, safer decisions and measurable savings.

Ready to move from noise to action? Start with the prototype above, run it on a handful of repos, and iterate the scoring weights with your finance and engineering leads. Over time you can integrate the CLI into governance flows to make tool sprawl a thing of the past.

Call to action

Try the sample Python CLI on three repos this week and export your SaaS invoices to CSV. If you want a packaged starter kit (Python + Node scanners, CI parsers, and a FinOps-friendly report schema), download the codenscripts starter archive and join our 2026 Tool Audit workshop for a guided consolidation sprint.
