Case Study — AI Platforms · Insurance · Financial Services

Insurance Company: AI Platform Selected at 31% Below Initial Proposals Using Pricing Intelligence

Industry: Insurance & Financial Services
Client Size: Top-20 U.S. insurer, 12,000 employees
Contract Value: $5.8M, 3-year enterprise AI commitment
Vendors Benchmarked: OpenAI, Anthropic, Google Vertex, AWS Bedrock
Engagement Type: New Purchase + Vendor Selection Benchmarking

31% below initial vendor proposals
$3.1M saved vs. shortlisted proposals
4 vendors benchmarked in parallel
8 weeks from RFP to signed agreement

Background

A top-20 U.S. insurance carrier had reached the decision point on enterprise AI adoption. After 18 months of internal experimentation with multiple AI APIs and a proof-of-concept program covering claims automation, underwriting assistance, and customer service augmentation, the company's Chief Digital Officer commissioned a formal vendor selection process in Q4 2025.

The shortlist was four vendors: OpenAI (GPT-4o enterprise), Anthropic (Claude for Enterprise), Google (Vertex AI / Gemini), and AWS (Bedrock with multiple model providers). All four had submitted enterprise proposals in response to the carrier's RFP. The proposals ranged from $7.8M to $11.4M over three years, and none were structured identically — making direct price comparison nearly impossible without independent benchmark data.

The carrier's procurement team had no historical data on AI enterprise pricing, given the market's relative immaturity. Unlike Oracle or Microsoft renewals, where historical benchmarks are well established, enterprise AI pricing was largely opaque. They engaged VendorBenchmark to evaluate all four proposals against independent market data before final vendor selection.

The Challenge

Enterprise AI platform procurement in 2025-2026 presents benchmarking challenges distinct from those of traditional software categories. Pricing structures vary radically across vendors: OpenAI prices on token consumption with enterprise commits; Anthropic on seat-based access with usage allowances; Google on a combination of API calls and resource units; AWS Bedrock on model-specific token pricing plus infrastructure. Comparing these on a per-capability or per-use-case basis requires normalization work that vendor RFP responses do not provide.

Complexity Factors in AI Platform Benchmarking

  • Four incompatible pricing models (token-based, seat-based, API-call, resource-unit) required normalization to actual use-case cost
  • None of the proposals included total cost of ownership for fine-tuning, embedding generation, and inference at production scale
  • Vendor proposals varied in included support tiers, SLA structures, and data residency provisions — making headline pricing misleading
  • The carrier's use cases (claims, underwriting, customer service) had different token volume profiles, making the "best value" vendor use-case dependent
  • Enterprise AI pricing was changing rapidly — benchmark data required recency weighting to account for price compression in the market
  • Regulatory requirements (NAIC model bulletin on AI) imposed specific data handling and audit trail requirements that affected implementation cost

The VendorBenchmark Analysis

VendorBenchmark benchmarked all four proposals simultaneously, normalizing pricing to a common unit: estimated annual cost per 1M tokens processed across the carrier's three primary use cases at projected production volume. This normalization required modeling the carrier's token volumes from their PoC data and mapping each vendor's pricing structure to actual expected spend.
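The normalization step described above can be sketched as follows. The pricing functions, rates, seat counts, and token volumes here are illustrative assumptions, not the engagement's actual figures; the point is mapping incompatible pricing models onto one comparable unit.

```python
# Illustrative sketch (hypothetical figures): normalizing incompatible
# pricing models to a common unit — estimated annual cost per 1M tokens —
# at projected production volume across the carrier's use cases.

ANNUAL_TOKENS_M = {          # projected annual volume, millions of tokens
    "claims": 18_000,
    "underwriting": 9_500,
    "customer_service": 26_000,
}

def token_based(rate_per_1m, commit_discount):
    """Token-consumption pricing with an enterprise commit discount."""
    return lambda tokens_m: tokens_m * rate_per_1m * (1 - commit_discount)

def seat_based(seats, cost_per_seat, included_tokens_m, overage_per_1m):
    """Seat pricing with a usage allowance and per-token overage charges."""
    def cost(tokens_m):
        overage = max(0.0, tokens_m - included_tokens_m)
        return seats * cost_per_seat + overage * overage_per_1m
    return cost

vendors = {
    "vendor_a": token_based(rate_per_1m=3.00, commit_discount=0.25),
    "vendor_b": seat_based(seats=400, cost_per_seat=1_500,
                           included_tokens_m=30_000, overage_per_1m=4.00),
}

total_tokens = sum(ANNUAL_TOKENS_M.values())
for name, cost_fn in vendors.items():
    annual = cost_fn(total_tokens)
    print(f"{name}: ${annual:,.0f}/yr -> ${annual / total_tokens:.2f} per 1M tokens")
```

The same pattern extends to API-call and resource-unit pricing: each model becomes a function from projected token volume to annual spend, making headline proposals directly comparable.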

The analysis revealed that all four vendors had priced their enterprise proposals at significant premiums above what comparable financial services enterprises had negotiated for equivalent workloads. The AI market's enterprise pricing is largely undisclosed, which allows vendors to anchor negotiations at inflated rates. Our database of comparable financial services AI commitments — 47 deals in insurance and adjacent verticals closed within the prior 12 months — provided the reference points that the carrier's procurement team lacked.

Vendor               | Initial Proposal (3yr) | Benchmark Range (Comparable) | Premium vs. Market | Status
OpenAI Enterprise    | $9.2M                  | $5.8M – $7.1M                | +30% above median  | Shortlisted
Anthropic Enterprise | $7.8M                  | $5.2M – $6.4M                | +28% above median  | Shortlisted
Google Vertex AI     | $8.4M                  | $5.6M – $6.8M                | +26% above median  | Selected (won with negotiation)
AWS Bedrock          | $11.4M                 | $6.2M – $7.8M                | +54% above median  | Eliminated
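The premium calculation behind a comparison like this, including the recency weighting needed in a fast-compressing market, can be sketched as below. All deal data here is synthetic; the half-life parameter and weighting scheme are assumptions, not VendorBenchmark's actual methodology.

```python
# Illustrative sketch (synthetic data): recency-weighted market rate from
# comparable deals, and a proposal's premium over that rate.
from datetime import date

def recency_weight(deal_date, as_of, half_life_days=180):
    """Exponential decay: a deal half_life_days old counts half as much,
    reflecting rapid price compression in the AI market."""
    age_days = (as_of - deal_date).days
    return 0.5 ** (age_days / half_life_days)

def weighted_median(values_weights):
    """Smallest value at which cumulative weight reaches half the total."""
    pairs = sorted(values_weights)
    total = sum(w for _, w in pairs)
    cum = 0.0
    for value, weight in pairs:
        cum += weight
        if cum >= total / 2:
            return value
    return pairs[-1][0]

# (effective price per 1M tokens, close date) for comparable deals — synthetic
comparables = [(2.10, date(2025, 9, 1)), (2.60, date(2025, 3, 15)),
               (1.90, date(2025, 11, 20)), (3.10, date(2024, 12, 5))]

as_of = date(2025, 12, 31)
weighted = [(price, recency_weight(d, as_of)) for price, d in comparables]
market = weighted_median(weighted)

proposal = 2.80  # vendor's proposed effective rate per 1M tokens
premium = proposal / market - 1
print(f"weighted market rate: ${market:.2f}; premium: {premium:+.0%}")
```

Recent deals dominate the benchmark, so a proposal priced against year-old transactions shows up as the premium it really is.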

The analysis also surfaced a critical finding about total cost of ownership that none of the four proposals had addressed: fine-tuning and embedding costs. For the carrier's claims processing use case — which required domain-specific model tuning on 8 years of historical claims data — three of the four vendors' pricing structures would result in significant additional charges not captured in their initial proposals. Google's Vertex AI was the only platform where fine-tuning costs were included within the enterprise commitment structure at the carrier's scale, improving its relative value position substantially.
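Folding those out-of-commitment charges into the comparison changes the ranking, which can be sketched as below. The vendor labels and dollar figures are hypothetical stand-ins, not the proposals' actual numbers.

```python
# Illustrative sketch (hypothetical figures): 3-year TCO once fine-tuning
# and embedding costs are added to headline commitment pricing.

def three_year_tco(commitment, finetune_cost, embedding_cost,
                   finetune_included=False):
    """Headline commitment plus charges outside the commitment structure."""
    extras = embedding_cost + (0.0 if finetune_included else finetune_cost)
    return commitment + extras

proposals = {
    # fine-tuning billed separately, on top of the commitment
    "platform_x": three_year_tco(7.1e6, finetune_cost=0.8e6,
                                 embedding_cost=0.3e6),
    # fine-tuning credits included within the enterprise commitment
    "platform_y": three_year_tco(5.8e6, finetune_cost=0.8e6,
                                 embedding_cost=0.3e6,
                                 finetune_included=True),
}
for name, tco in sorted(proposals.items(), key=lambda kv: kv[1]):
    print(f"{name}: ${tco / 1e6:.1f}M over 3 years")
```

A headline gap of $1.3M widens once fine-tuning is charged outside the commitment on one platform and included on the other, which is why structural concessions can matter as much as price movement.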

"Every vendor told us they were competitive. The benchmark data showed every vendor was 26-54% above market. Without that reference point, we would have negotiated in the dark and signed one of these proposals as-is."
— Chief Digital Officer, Top-20 U.S. Insurance Carrier

The Negotiation and Selection Process

The carrier entered the final negotiation phase with benchmark data as the explicit basis for counter-proposals to each shortlisted vendor. The approach was multi-vendor: all four vendors were informed that they were in active competition, and all four received benchmark-anchored counter-proposals simultaneously. The competitive dynamic, combined with the credibility of peer-transaction data as the negotiating basis, produced more significant price movement than is typical in single-vendor negotiations.

OpenAI moved from $9.2M to $7.1M but declined to include fine-tuning within the commitment structure. Anthropic moved from $7.8M to $6.3M. Google moved from $8.4M to $5.8M and agreed to include fine-tuning credits as part of the enterprise commitment — a structural concession that changed the total cost calculation significantly. AWS Bedrock's final offer remained materially above market and was eliminated from consideration.

The carrier selected Google Vertex AI at $5.8M over three years — 31% below their initial proposal and within the lower quartile of comparable financial services AI commitments in VendorBenchmark's database. The selection was driven by a combination of price, the fine-tuning inclusion, and Google's stronger compliance posture for the carrier's specific NAIC AI model bulletin requirements.

Final Outcome vs. Initial Proposals

  • Selected vendor: Google Vertex AI at $5.8M vs. initial $8.4M proposal — $2.6M direct saving
  • Total saving vs. best alternative shortlisted proposal (Anthropic at $6.3M): $500K additional
  • Total saving vs. average of all initial proposals ($9.2M): $3.4M — 37% reduction
  • Fine-tuning credits included in commitment — eliminates projected $800K in additional costs at production scale
  • Data residency and audit trail requirements met within standard enterprise terms — no premium charged
  • 3-year term with annual review clause — preserves flexibility as AI market continues to evolve

Key Takeaways

This engagement demonstrates the particular value of benchmarking in emerging technology categories where historical pricing data is scarce and vendor proposals are unconstrained by established market norms. AI platform pricing in 2025-2026 is in a period of rapid change, with enterprise prices declining as competition increases and model commoditization accelerates. Enterprises that benchmark against recent peer transactions capture market-rate pricing; those that negotiate from vendor proposals alone are anchored to prices that systematically overstate market rates.

The multi-vendor benchmarking approach — analyzing all four proposals simultaneously against a common normalization framework — provided the carrier with something that sequential single-vendor negotiations cannot: objective comparative data that enabled rational selection decisions based on true total cost rather than headline pricing. This is particularly important in AI procurement, where pricing complexity and incompatible structures are often used to obscure relative value.

For insurance and financial services enterprises specifically, the regulatory dimension of AI procurement — data handling, audit trails, model governance — adds cost components that are not always captured in vendor pricing summaries. Benchmarking that normalizes for these requirements, as this engagement did, ensures that compliance costs are visible in the comparison and do not emerge as surprises post-signature.

"We went from confused about AI pricing to having the clearest vendor selection rationale we've ever presented to a board. Data changes the conversation entirely."
— VP of Procurement, Top-20 U.S. Insurance Carrier

Get Started

Selecting an AI Platform? Know What the Market Actually Pays.

Submit your AI vendor proposals for multi-platform benchmarking. We normalize pricing structures and compare against recent comparable enterprise transactions — in 48 hours.

SOC 2 Type II · NDA Protected · 48-Hour Delivery · 500+ Vendors