The argument for open source AI at the start of 2024 was mostly about control and cost. You ran your own model, you controlled where the data went, and you avoided per-token pricing at scale. The argument against was capability. The best open source models were good but not as good as the frontier proprietary models for tasks that required genuine reasoning.
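The cost side of that trade can be made concrete with a back-of-envelope break-even calculation. The figures below are purely illustrative assumptions (a hypothetical $5 per million API tokens against a hypothetical $4,000/month self-hosting bill), not quotes from any provider:

```python
# Sketch: monthly token volume at which self-hosting an open source
# model costs the same as paying per-token API pricing.
# All numbers are illustrative assumptions, not real prices.

def breakeven_tokens(api_price_per_mtok: float, monthly_hosting_cost: float) -> float:
    """Return the monthly token volume where self-hosting matches API spend."""
    return monthly_hosting_cost / api_price_per_mtok * 1_000_000

# Assumed figures: $5 per million tokens via API; $4,000/month for a
# dedicated GPU server (hardware amortisation plus ops time).
tokens = breakeven_tokens(api_price_per_mtok=5.0, monthly_hosting_cost=4_000.0)
print(f"Break-even at {tokens / 1e6:.0f}M tokens per month")
```

Below the break-even volume the API is cheaper; above it, self-hosting wins on cost alone, before control or data-governance considerations enter the picture.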
That capability argument weakened considerably through 2024 and into 2025. Several open releases hit benchmarks that would have represented the state of the art from any provider two years earlier. More importantly, fine-tuned versions of these models started matching or exceeding the proprietary frontier on tasks within specific domains. A fine-tuned open source model for medical coding, trained on high-quality domain examples, often outperformed a general-purpose frontier model on medical coding tasks. Generalist frontier models remain better across the board; specialists can be better at the specifics that matter to a given deployment.
The infrastructure around open source models also matured rapidly. Running a capable open source model in 2023 required meaningful engineering effort. By late 2025, the tooling for deployment, quantisation, serving, and monitoring had caught up to the point where a small team could operate a production-grade open source model without it becoming the primary focus of their engineering capacity.
What I found interesting was how this changed the negotiating position of companies using proprietary API models. A credible option to switch to an open source model changes the conversation about pricing and terms. Even teams with no intention of switching found that having a viable alternative strengthened their hand in ways beyond the immediate technical decision.
The fine-tuning question also became more practical during this period. Fine-tuning had always been theoretically available but practically difficult because of data requirements, infrastructure needs, and evaluation complexity. Improved tooling and the availability of capable base models tilted the calculation in its favour for specific use cases.
My view is that the market is settling into a structure where proprietary frontier models remain the default for general-purpose reasoning at the highest capability level, while open source models become the default for domain-specific applications where fine-tuning is feasible and where data governance requirements make proprietary API usage complicated. Both categories are large. Neither is going away.