The hype cycle around AI in software development has produced two camps: those who believe LLMs will replace engineers within three years, and those who dismiss the whole space as a glorified autocomplete. Neither view is useful when you are trying to ship a product.
Where AI genuinely accelerates delivery
The highest-leverage uses we have found are not code generation — they are classification, extraction, and summarisation at scale. A fintech client processes 40,000 support tickets per month. Routing them automatically at 94% accuracy eliminated two full-time support roles and cut response time from 8 hours to 40 minutes. That is a real business outcome.
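The routing pattern above can be sketched as follows. This is a hypothetical illustration, not the client's actual system: `classify_ticket` stands in for a real LLM call, and the labels and queue names are invented. The important part is constraining the model's output to a known label set.

```python
# Hypothetical sketch of LLM-based ticket routing. classify_ticket is a
# stub for a real model call; labels and queue names are invented.
ROUTES = {
    "billing": "billing-queue",
    "technical": "engineering-queue",
    "account": "support-queue",
}
VALID_LABELS = set(ROUTES)

def classify_ticket(text: str) -> str:
    """Stub for an LLM classification call. A real implementation would
    prompt the model to return exactly one label from VALID_LABELS."""
    lowered = text.lower()
    if "invoice" in lowered or "charge" in lowered:
        return "billing"
    if "error" in lowered or "crash" in lowered:
        return "technical"
    return "account"

def route_ticket(text: str) -> str:
    label = classify_ticket(text)
    # Constrain the model: anything outside the known label set goes to
    # human triage instead of being mis-routed.
    if label not in VALID_LABELS:
        return "human-triage"
    return ROUTES[label]
```

The fallback to human triage is what makes the 94%-accurate classifier safe to deploy: the 6% it gets wrong lands in a queue designed for it.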
Similarly, document processing — extracting structured data from PDFs, contracts, and forms — is a category where LLMs outperform every alternative we have tried, at a fraction of the development cost of a traditional ML pipeline.
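Most of the engineering in document extraction is not the model call but validating its output before it enters a downstream system. A minimal sketch, assuming the model is prompted to emit JSON; the field names and schema here are invented for illustration:

```python
import json
from dataclasses import dataclass

# Hypothetical schema for data extracted from a contract by an LLM.
# Field names are invented; the point is strict validation of model output.
@dataclass
class ContractFields:
    party: str
    effective_date: str
    value_usd: float

REQUIRED = {"party", "effective_date", "value_usd"}

def parse_extraction(raw: str) -> ContractFields:
    """Parse and validate the model's JSON output. Reject anything that
    does not match the expected schema rather than ingesting it silently."""
    data = json.loads(raw)
    missing = REQUIRED - data.keys()
    if missing:
        raise ValueError(f"extraction missing fields: {sorted(missing)}")
    return ContractFields(
        party=str(data["party"]),
        effective_date=str(data["effective_date"]),
        value_usd=float(data["value_usd"]),
    )
```

A rejected extraction can be retried or flagged for review; an unvalidated one corrupts the dataset quietly.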
Where it creates expensive debt
The failure pattern we see most often: teams reach for an LLM to solve a problem that a deterministic algorithm handles better, faster, and more reliably. If you can write a rule for it, write the rule. Reserve the LLM for the cases where rules break down — which is genuinely a large category, but not everything.
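The rule-first pattern looks like this in practice. A sketch with invented names: `extract_order_id` is the deterministic rule, and `llm_extract_order_id` is a stub standing in for the expensive model fallback.

```python
import re
from typing import Optional

# Sketch of the "write the rule first" pattern. The order-ID format and
# both function names are hypothetical.
ORDER_ID_RE = re.compile(r"\bORD-\d{6}\b")

def extract_order_id(text: str) -> Optional[str]:
    """Deterministic rule: cheap, fast, and exact when it matches."""
    match = ORDER_ID_RE.search(text)
    return match.group(0) if match else None

def llm_extract_order_id(text: str) -> Optional[str]:
    """Stub for the LLM fallback, reserved for messy inputs the rule
    cannot handle (typos, prose descriptions, OCR noise)."""
    return None  # a real model call would go here

def find_order_id(text: str) -> Optional[str]:
    # Rule first; escalate to the model only when the rule fails.
    return extract_order_id(text) or llm_extract_order_id(text)
```

The ordering matters for cost and latency as much as reliability: the regex handles the common case in microseconds, and the model is only paid for on the long tail.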
Hallucination management, evaluation harnesses, prompt versioning, and latency budgets all add real engineering overhead. The ROI calculation is only positive when the problem is genuinely a good fit for the technology.
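An evaluation harness does not need to be elaborate to pay for itself. A minimal sketch, with invented names, of the core loop: score a classifier against labelled cases and gate prompt changes on the result.

```python
from dataclasses import dataclass

# Minimal evaluation-harness sketch. EvalCase, evaluate, and
# regression_gate are hypothetical names for illustration.
@dataclass
class EvalCase:
    text: str
    expected: str

def evaluate(classify, cases) -> float:
    """Return the accuracy of `classify` over a labelled evaluation set."""
    correct = sum(1 for c in cases if classify(c.text) == c.expected)
    return correct / len(cases)

def regression_gate(old_acc: float, new_acc: float,
                    tolerance: float = 0.01) -> bool:
    """Block a prompt change that degrades accuracy beyond tolerance.
    Run against each prompt version before it ships."""
    return new_acc >= old_acc - tolerance
```

Paired with versioned prompts, this turns "the new prompt feels better" into a number that can block a deploy.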
Our current default stack
For most client integrations we reach for Claude for reasoning-heavy tasks, GPT-4o for multimodal pipelines, and open-source models (Llama, Mistral) when data privacy requirements prevent sending content to external APIs. The model layer is increasingly commodity — the engineering effort lives in retrieval, evaluation, and integration.
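That default stack reduces to a small routing function. A sketch under stated assumptions: the model names come from the text above, but the `Task` fields and the specific routing rules are invented for illustration.

```python
from dataclasses import dataclass

# Hypothetical model-selection sketch mirroring the stack described above.
# Task fields and routing rules are invented; model names are from the text.
@dataclass
class Task:
    kind: str           # e.g. "reasoning", "multimodal", "extraction"
    private_data: bool  # must content stay on our own infrastructure?

def pick_model(task: Task) -> str:
    if task.private_data:
        return "llama"      # self-hosted open-source model (or Mistral)
    if task.kind == "multimodal":
        return "gpt-4o"
    return "claude"         # default for reasoning-heavy work
```

Keeping the choice behind one function is the practical upshot of the model layer being a commodity: swapping providers becomes a one-line change, and the retrieval and evaluation code around it stays put.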