
Tool sprawl auditor: Python scripts to analyze SaaS usage and recommend consolidation
Automated Python scripts to pull billing and usage, score overlap, and recommend SaaS consolidation for engineering and product teams.
You're paying for 27 SaaS products but actively use 7 — here's an automated way to know which ones to keep
Tool sprawl drains engineering time, fragments data, and inflates recurring costs. If your product and engineering teams spend more time deciding which app to use than shipping features, you have a problem common in 2026: fragmented SaaS stacks, opaque usage telemetry, and rising usage-based pricing. This article gives a practical, reusable set of Python scripts and connector templates that pull billing and usage data from SaaS APIs, score overlap, and generate prioritized consolidation recommendations for engineering and product leaders.
Quick summary — what you'll get
- Audit pattern: inventory → billing → usage → identity → scoring → recommendations.
- Reusable Python library (async) to fetch and normalize billing/usage from diverse SaaS APIs.
- Scoring model (Jaccard + cost-per-active-user + feature overlap) with tunable weights.
- Connector templates (Python + examples for Bash/JS) and deployment notes for scale, security, and FinOps integration.
- Actionable checklist and a sample consolidation report you can run in hours.
Why this matters in 2026
From late 2023 through 2025, the market exploded with vertical AI SaaS, adding dozens of point solutions to teams' workflows. By 2026, enterprises are extending FinOps practices beyond cloud infrastructure into SaaS procurement and usage governance. In practice this means:
- Hybrid pricing models—seat-based + usage-based billing is common; unclear usage drivers increase waste.
- Decentralized purchasing—business units buy tools with corporate cards; central visibility is limited.
- Identity and provisioning are the most reliable signal for active users (SSO, SCIM, IdP logs).
- APIs are improving, but no standard billing schema exists; normalization is required.
Audit approach — the inverted pyramid
- Inventory: Gather vendor names, billing emails, and invoice IDs (procurement, corporate cards, employee-submitted receipts).
- Billing: Pull invoices/subscription lines, cost per period, and billing model (seat vs usage).
- Usage: Gather active users, feature usage, API calls, storage consumed.
- Identity: Map users via SSO/SCIM/IdP logs to measure adoption and duplicates across tools.
- Score & recommend: Compute overlap and consolidation potential. Prioritize by savings and operational risk.
Reusable Python scripts — architecture
The library is intentionally modular so teams can add connectors. Key modules:
- connectors: one file per SaaS vendor implementing a standard interface (fetch_billing, fetch_users, fetch_features).
- normalizer: maps vendor-specific fields to canonical schema (cost, period, seats, active_users, feature_tags).
- scorer: computes consolidation_score and breakdowns.
- reporter: writes CSV/JSON, generates human-readable recommendations.
- runner: orchestrates concurrent runs, rate-limit handling, and caching.
Core connector interface (contract)
class BaseConnector:
    async def fetch_billing(self) -> dict:
        # returns {'vendor': '', 'lines': [{'sku': '', 'cost': 123.45, 'period_start': '', 'period_end': ''}]}
        raise NotImplementedError

    async def fetch_users(self) -> list:
        # returns [{'user_id': '', 'email': '', 'status': 'active', 'last_active': '2026-01-12'}]
        raise NotImplementedError

    async def fetch_features(self) -> dict:
        # returns {'feature_tags': ['chat', 'analytics'], 'usage_metrics': {...}}
        raise NotImplementedError
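The runner in the next section imports connector_factory from the connectors package, which isn't listed here. A minimal sketch of that factory, assuming each connector module self-registers with a decorator (the register helper and registry are assumptions, not part of the original library):

# connectors/__init__.py (sketch): map vendor names to connector classes.
_REGISTRY = {}

def register(name):
    """Class decorator so each connector module can self-register, e.g. @register('stripe')."""
    def wrap(cls):
        _REGISTRY[name] = cls
        return cls
    return wrap

def connector_factory(name, cfg):
    try:
        return _REGISTRY[name](cfg)
    except KeyError:
        raise ValueError(f'no connector registered for vendor: {name!r}')

With this in place, decorating StripeConnector with @register('stripe') makes it reachable from config.yaml by vendor name.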
Example async runner (abridged)
import asyncio

import yaml

from connectors import connector_factory

async def run_connector(name, cfg):
    conn = connector_factory(name, cfg)
    billing = await conn.fetch_billing()
    users = await conn.fetch_users()
    features = await conn.fetch_features()
    return {'vendor': name, 'billing': billing, 'users': users, 'features': features}

async def main(config):
    tasks = [run_connector(n, cfg) for n, cfg in config.items()]
    return await asyncio.gather(*tasks)

if __name__ == '__main__':
    # load config.yaml with endpoints and creds
    with open('config.yaml') as f:
        config = yaml.safe_load(f)
    results = asyncio.run(main(config))
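For reference, the config main() expects maps vendor names to connector settings. A placeholder example shown as a Python dict; the per-vendor keys depend on your connectors, and real secrets should come from a secrets manager rather than a file:

config = {
    'stripe': {'api_key': '...'},   # placeholder; load from Vault/Secrets Manager
    'okta': {'base_url': 'https://yourorg.okta.com', 'token': '...'},
}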
Scoring model — how to measure overlap and consolidation potential
We combine signals into a consolidation_score (0..100). Components and rationale:
- User overlap (40%): Jaccard similarity between active user sets. If two tools share most active users, consolidation is feasible.
- Feature overlap (30%): Intersection of feature_tags — e.g., both provide 'in-app chat', 'analytics', 'forms'.
- Cost inefficiency (20%): Cost per active user vs category median.
- Integration overhead (10%): Number of custom integrations (webhooks/ETL) — more integrations increase consolidation risk/cost.
Scoring functions (Python)
def jaccard(a: set, b: set) -> float:
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def consolidation_score(a, b, weights=None):
    weights = weights or {'user': 0.4, 'feature': 0.3, 'cost': 0.2, 'integration': 0.1}
    user_score = jaccard(set(a['active_users']), set(b['active_users']))
    feature_score = jaccard(set(a['features']), set(b['features']))
    # cost: relative gap in cost per active user between the two tools
    # (0 = equally efficient, 1 = one tool far more expensive per user);
    # swap in a comparison against the category median once you track one
    cost_per_user_a = a['cost'] / max(1, len(a['active_users']))
    cost_per_user_b = b['cost'] / max(1, len(b['active_users']))
    cost_score = abs(cost_per_user_a - cost_per_user_b) / max(cost_per_user_a, cost_per_user_b, 1)
    # integration overhead: more custom integrations lower the score (harder to consolidate)
    integration_score = 1 - min(1, (a['integrations'] + b['integrations']) / 10)
    raw = (weights['user'] * user_score
           + weights['feature'] * feature_score
           + weights['cost'] * cost_score
           + weights['integration'] * integration_score)
    return round(raw * 100, 1)
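To make the expected input shape concrete, here is a minimal call with two hand-built, already-normalized records; every value is illustrative:

acme = {
    'active_users': {'ana@example.com', 'bo@example.com', 'cy@example.com'},
    'features': {'chat', 'analytics'},
    'cost': 1200.0,        # cost for the billing period, USD
    'integrations': 3,
}
teamchat = {
    'active_users': {'ana@example.com', 'bo@example.com'},
    'features': {'chat', 'forms'},
    'cost': 900.0,
    'integrations': 1,
}
print(consolidation_score(acme, teamchat))  # ~44.9 with the default weights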
Connector templates — examples you can copy
Below are compact connector templates to adapt. Replace secrets with your secrets manager calls.
Stripe (billing) — Python
import asyncio

import stripe

class StripeConnector(BaseConnector):
    def __init__(self, cfg):
        stripe.api_key = cfg['api_key']

    async def fetch_billing(self):
        # the stripe python client is sync; run it in a thread so the event loop stays responsive
        def get_invoices():
            invoices = stripe.Invoice.list(limit=100)
            lines = []
            for inv in invoices.auto_paging_iter():
                for li in inv.lines.data:
                    lines.append({
                        'sku': li.description,
                        'cost': li.amount / 100.0,  # invoice line items expose 'amount' in cents
                        'period_start': li.period.start,
                        'period_end': li.period.end,
                    })
            return {'vendor': 'stripe', 'lines': lines}
        return await asyncio.to_thread(get_invoices)

    async def fetch_users(self):
        return []  # Stripe doesn't provide app users; combine with product connectors

    async def fetch_features(self):
        return {}
Okta (users via SCIM / Admin Logs) — Python
import aiohttp

class OktaConnector(BaseConnector):
    def __init__(self, cfg):
        self.base = cfg['base_url']
        self.token = cfg['token']

    async def fetch_users(self):
        hdr = {'Authorization': f'SSWS {self.token}', 'Accept': 'application/json'}
        users = []
        async with aiohttp.ClientSession() as s:
            # note: Okta paginates via Link headers; follow them when the org exceeds one page of users
            url = f'{self.base}/api/v1/users'
            async with s.get(url, headers=hdr) as r:
                data = await r.json()
                for u in data:
                    users.append({'user_id': u['id'], 'email': u['profile']['email'], 'status': u['status']})
        return users
Slack (admin API) — JS snippet to list workspace members
// Node.js example using @slack/web-api
const { WebClient } = require('@slack/web-api');
const client = new WebClient(process.env.SLACK_TOKEN);

async function listMembers() {
  const res = await client.users.list();
  return res.members.map(m => ({ id: m.id, email: m.profile.email, is_bot: m.is_bot }));
}

module.exports = { listMembers };
Quick Bash example — kick off the Python audit
#!/usr/bin/env bash
set -euo pipefail
CONFIG_FILE=${1:-config.yaml}
python3 -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
python run_audit.py --config "$CONFIG_FILE" --output report.json
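The wrapper invokes run_audit.py, which isn't listed above. A minimal sketch, assuming the async runner lives in runner.py and that the raw per-vendor results are dumped as JSON (scoring and reporting would slot in before the dump):

# run_audit.py (sketch): CLI entry point the Bash wrapper calls.
import argparse
import asyncio
import json

import yaml

from runner import main  # assumed module name for the async runner shown earlier

if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='SaaS tool-sprawl audit')
    parser.add_argument('--config', default='config.yaml')
    parser.add_argument('--output', default='report.json')
    args = parser.parse_args()

    with open(args.config) as f:
        config = yaml.safe_load(f)

    results = asyncio.run(main(config))

    with open(args.output, 'w') as f:
        json.dump(results, f, indent=2, default=str)
    print(f'wrote raw audit data for {len(results)} vendors to {args.output}')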
Handling real-world friction points
Pagination and rate limits
Many vendor SDKs are synchronous; wrap them in thread pools for concurrency. Respect rate limits — implement exponential backoff and token buckets. Cache invoices and user snapshots to avoid repeated expensive calls.
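As one concrete pattern, a small async GET helper with exponential backoff; the status codes and retry policy here are generic assumptions, so tune them to each vendor's published limits:

import asyncio
import random

import aiohttp

async def get_json_with_backoff(session, url, headers=None, max_retries=5):
    for attempt in range(max_retries):
        async with session.get(url, headers=headers) as resp:
            if resp.status == 429 or resp.status >= 500:
                # honor Retry-After when present, otherwise back off exponentially with jitter
                delay = float(resp.headers.get('Retry-After', 2 ** attempt + random.random()))
                await asyncio.sleep(delay)
                continue
            resp.raise_for_status()
            return await resp.json()
    raise RuntimeError(f'gave up on {url} after {max_retries} attempts')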
Data normalization
Map vendor fields to canonical terms: cost_usd, period, active_users_count, feature_tags, integrations_count. Keep a transformation pipeline so new connectors can reuse normalization rules.
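A minimal canonical record and normalizer sketch built around the connector outputs shown earlier; the field mapping is an assumption and will differ per vendor:

from dataclasses import dataclass, field

@dataclass
class VendorRecord:
    vendor: str
    cost_usd: float
    period: str                      # e.g. '2026-01'
    active_users_count: int
    feature_tags: set = field(default_factory=set)
    integrations_count: int = 0

def normalize(vendor, billing, users, features):
    return VendorRecord(
        vendor=vendor,
        cost_usd=sum(line.get('cost', 0.0) for line in billing.get('lines', [])),
        period=str(billing.get('period', '')),
        active_users_count=sum(1 for u in users if u.get('status', '').lower() == 'active'),
        feature_tags=set(features.get('feature_tags', [])),
        integrations_count=features.get('usage_metrics', {}).get('integrations', 0),
    )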
Identity mapping
Use emails as primary keys but account for aliases and SSO-linked addresses. Implement heuristics to merge duplicates (lowercase, strip domains if needed), and flag ambiguous entries for manual review. Treat identity incidents with the same urgency as any large-scale compromise — see enterprise playbooks for response expectations.
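A sketch of those merge heuristics, assuming plus-addressing is the main alias pattern in your IdP; anything ambiguous is returned for manual review rather than silently merged:

def canonical_email(email):
    local, _, domain = email.strip().lower().partition('@')
    local = local.split('+', 1)[0]          # drop plus-aliases like jane+tool@
    return f'{local}@{domain}'

def merge_user_sets(*user_lists):
    """Union of users across tools keyed by canonical email; collisions are flagged, not merged blindly."""
    merged, ambiguous = {}, set()
    for users in user_lists:
        for u in users:
            key = canonical_email(u['email'])
            if key in merged and merged[key].get('user_id') != u.get('user_id'):
                ambiguous.add(key)          # same email, different IdP ids: flag for manual review
            merged[key] = {**merged.get(key, {}), **u}
    return merged, ambiguous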
Security and compliance
- Store secrets in Vault/Secrets Manager, not in configs.
- Use least-privilege API tokens (read-only billing and SCIM scopes when available).
- Log access and use immutable audit trails for recommendations.
- Redact PII in any reports shared outside procurement or legal.
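For the last point, a tiny redaction helper is often enough for reports leaving procurement or legal; the email regex is a simplification, and real reports may also need names, tokens, and invoice IDs masked:

import hashlib
import re

EMAIL_RE = re.compile(r'[\w.+-]+@[\w-]+\.[\w.-]+')

def redact_emails(text):
    """Replace anything that looks like an email with a stable, non-reversible token."""
    def _mask(m):
        digest = hashlib.sha256(m.group(0).lower().encode()).hexdigest()[:8]
        return f'user-{digest}'
    return EMAIL_RE.sub(_mask, text)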
Example run — sample output and interpretation
Below is a condensed sample of what the reporter produces. This is an illustrative run based on synthesized data to demonstrate interpretation.
[
  {
    'pair': ['AcmeChat', 'TeamChat+'],
    'user_overlap': 0.82,
    'feature_overlap': 0.75,
    'cost_savings_est': 24000.0,
    'consolidation_score': 86.4,
    'recommendation': 'Consolidate TeamChat+ into AcmeChat; migrate integrations; estimated 6-8 weeks dev effort.'
  },
  {
    'pair': ['DocStoreX', 'BlobDocs'],
    'user_overlap': 0.12,
    'feature_overlap': 0.2,
    'consolidation_score': 18.0,
    'recommendation': 'Low priority — different audiences and low overlap.'
  }
]
Interpretation guidance:
- Scores > 75: high consolidation potential — add to roadmap after stakeholder review.
- Scores 40–75: moderate — consider pilot and check integrations and data migration complexity.
- Scores < 40: low priority — focus on governance (access review, offboarding).
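These thresholds translate directly into a small helper the reporter can attach to each pair (a sketch mirroring the guidance above):

def priority(score):
    if score > 75:
        return 'high: add to consolidation roadmap after stakeholder review'
    if score >= 40:
        return 'moderate: pilot first; check integrations and data migration complexity'
    return 'low: focus on governance (access review, offboarding)'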
Operationalizing recommendations
- Stakeholder review: product, engineering, security, procurement.
- Data migration plan and test exports (ensure no data loss for core workflows).
- Integration plan: move webhooks, reconfigure OAuth, map APIs.
- Offboarding: revoke keys, cancel subscriptions after validation and final invoice reconciliation.
- Post-migration metrics: adoption, defect regressions, cost realization.
Advanced strategies and 2026 predictions
Trends and strategies to watch as SaaS landscapes continue to evolve:
- AI-assisted feature mapping: In 2026, expect vendor-agnostic ML models to map feature semantics from docs and usage logs, reducing manual taxonomy work; pair this with explainability and model provenance tooling.
- Procurement + FinOps integrations: Automated actions (pause seats, downgrade plans) triggered by consolidation score thresholds will become best practice, paired with approval workflows and procurement playbooks.
- Unified usage telemetry: Emerging standards for billing/usage exports (inspired by cloud cost APIs) will reduce normalization work; adopt these as vendors support them.
- Dynamic purchasing: Usage-based pricing will push teams to autoscale features and centralize procurement to avoid bill surprises.
Checklist — run a SaaS consolidation audit in one week
- Collect invoice exports from procurement and corporate cards (Day 0–1).
- Inventory vendors and prioritize top 80% of spend (Day 1).
- Wire connectors for the top vendors (Day 2–3).
- Run the auditor and review the consolidation report (Day 4).
- Run stakeholder reviews and create 90-day consolidation backlog (Day 5).
Security, licensing, and trust
When reusing third-party scripts, validate license compatibility (MIT/Apache vs closed-source). Vet any community connectors for secure secret handling and minimal scopes. For production, prefer running connectors in a centrally managed environment (CI, serverless job with restricted runtime).
Practical takeaways
- Don’t guess — measure: Use identity (SSO/SCIM) plus billing to compute active usage and cost-per-active-user.
- Score by multiple signals: User overlap + feature overlap + cost capture consolidation potential more accurately than cost alone.
- Automate, but validate: Generate recommendations automatically; require human signoff for any cancellation or migration.
- Integrate with FinOps: Feed consolidation outputs into procurement workflows and budget planners.
Getting started — code & next steps
Clone a starter repo that contains the connector contracts, an async runner, and sample connectors for Stripe and Okta. Use the runner to prioritize the top 10 vendors by spend. Tune the scoring weights to reflect business priorities (e.g., compliance risk may lower consolidation appetite for certain tools).
Example: a product org used this approach in a pilot and identified three consolidations representing 22% of annual SaaS spend with a 10-week migration plan. Your results will vary; start small and iterate.
Call to action
If you want the starter script bundle and a one-page checklist to run your first audit this week, download the repo and adapt the connector templates. Run the scripts against your top 10 vendors and share the consolidated report with procurement and engineering for a 30-minute review — you’ll be surprised how quickly waste becomes actionable savings.
Ready to start? Download the starter audit scripts, or reach out to our team for a hands-on workshop to run a pilot and integrate the output into your FinOps pipeline.