Compare commits
19 Commits
master
...
bcbe2c769f
| Author | SHA1 | Date |
|---|---|---|
| | bcbe2c769f | |
| | 617aaefcc2 | |
| | 1c88fec896 | |
| | 8073428060 | |
| | b9738c2099 | |
| | 9293c970bc | |
| | 255395dc10 | |
| | b7edd8539f | |
| | 9bee2eba7a | |
| | aeae1eef7b | |
| | 7003814441 | |
| | 9732e9e653 | |
| | 1802d8674c | |
| | 8fc1117867 | |
| | 89a07bbde9 | |
| | 15105d3faa | |
| | 2f85d31f5e | |
| | 5705c71140 | |
| | 1d65465eed | |
1
.gitignore
vendored
@@ -1 +1,2 @@
.opencode/**
**/__pycache__/**
29
output/battlecards/cards/card_01_market_valuation.md
Normal file
@@ -0,0 +1,29 @@
# Card 1: Market Valuation Extremes

> The US stock market is trading at historic valuation extremes that mirror previous bubble periods.

## Fact

- The Shiller CAPE ratio stands at ~40.03, about 2.3x the historical mean of 17.39 since 1881 *(Source: Yale/Shiller, 2026)*
- The Buffett Indicator (Total Market Cap / GDP) is at 219%, well above the 200% danger threshold *(Source: FRED/World Bank composite, 2026)*
- S&P 500 trailing P/E is at 29.6 vs. a historical mean of 17.9, or 65% above normal *(Source: S&P historical data, 2026)*
- Dividend yield has fallen to 1.04%, the lowest since 1950, offering virtually no income cushion *(Source: S&P historical data, 2026)*
- Federal debt stands at 122.6% of GDP, adding macro fragility to the valuation overstretch *(Source: US Treasury data, 2025)*

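The headline ratios in the Fact list can be re-derived in a few lines of Python. This is a minimal sketch rather than part of the original card: the helper name `premium_over_mean` is hypothetical, and the inputs are simply the figures quoted above.

```python
def premium_over_mean(current: float, mean: float) -> float:
    """Return how far `current` sits above `mean`, as a fraction of the mean."""
    return current / mean - 1.0


# (current value, historical mean) pairs quoted in the Fact list.
metrics = {
    "Shiller CAPE": (40.03, 17.39),
    "S&P 500 trailing P/E": (29.6, 17.9),
}

for name, (current, mean) in metrics.items():
    ratio = current / mean
    premium = premium_over_mean(current, mean)
    print(f"{name}: {ratio:.2f}x the mean, {premium:.0%} above it")
```

The P/E premium comes out at about 65%, matching the card, and the CAPE sits at roughly 2.3x its long-run mean.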
## Impact

- **Investment risk is elevated**: Historical CAPE readings above 35 have been followed by below-average 10-year returns. By the same historical pattern, a CAPE of 40 points to very low, possibly negative, 10-year annualized returns.
- **AI spending amplifies the bubble**: Hyperscaler AI capex ($208B+ projected for 2026) is propping up tech stock valuations that are disconnected from current revenue generation.
- **Market correction risk**: If AI ROI fails to materialize at scale, the dual pressure of overvaluation and spending disappointment could trigger a sharp correction similar to 2000.

## Act

- **When debating AI market health**: Lead with valuation data. A CAPE above 40 is extreme by any historical standard: only the 2000 dot-com peak (43.77) was higher in the 145 years since 1881.
- **Key question to ask**: "How much AI-driven revenue growth is priced into these valuations, and what happens if it doesn't materialize?"
- **Counter-argument anticipation**: "This time is different because AI is transformative." Response: Dot-com stocks also traded at historic multiples before the 2000 crash. The technology (the internet) proved real, but valuations were disconnected from reality.

---

*Last updated: June 2026 | Sources: Yale/Shiller CAPE data, FRED Buffett Indicator, S&P 500 historical metrics, US Treasury debt data*
28
output/battlecards/cards/card_02_ai_infrastructure.md
Normal file
@@ -0,0 +1,28 @@
# Card 2: AI Infrastructure Buildout

> Hyperscaler AI infrastructure spending has exploded roughly 11x in 6 years, raising questions about sustainable ROI.

## Fact

- Combined hyperscaler capex surged from $55B in 2020 to a projected $605B in 2026, an 11x increase in 6 years *(Source: SEC filings, company earnings, 2020-2026)*
- AI-related spending accounts for 85-90% of total hyperscaler capex in 2026 *(Source: analyst estimates, company disclosures)*
- Tech-sector debt issuance spiked to $121B in 2025, 4x the 5-year average, as companies rush to build AI infrastructure *(Source: tech debt tracking data, 2025)*
- NVIDIA data center revenue grew from $1.57B (FY2020 Q1) to $75.2B (FY2027 Q1), a 48x increase *(Source: NVIDIA earnings reports)*

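To sanity-check the growth claims in the Fact list, the sketch below computes the overall multiple and the implied compound annual growth rate (CAGR). The function names are illustrative assumptions, not part of the card; the inputs are the card's own figures.

```python
def growth_multiple(start: float, end: float) -> float:
    """Ratio of the end value to the start value."""
    return end / start


def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate between two values `years` apart."""
    return (end / start) ** (1 / years) - 1


# Hyperscaler capex in $B: 2020 actual vs. 2026 projection (from the card).
capex_multiple = growth_multiple(55, 605)
capex_cagr = cagr(55, 605, years=6)

# NVIDIA data center revenue in $B: FY2020 Q1 vs. FY2027 Q1 (from the card).
nvidia_multiple = growth_multiple(1.57, 75.2)

print(f"Capex: {capex_multiple:.1f}x overall, {capex_cagr:.1%} per year")
print(f"NVIDIA data center revenue: {nvidia_multiple:.1f}x")
```

On these inputs the capex multiple is 11x, which works out to roughly 49% compound growth per year, and the NVIDIA multiple rounds to 48x.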
## Impact

- **Massive capital commitment creates overhang**: $605B in annual capex is unprecedented for a single sector. If AI ROI disappoints, stranded assets could trigger write-downs.
- **Diminishing returns likely**: The law of diminishing returns applies to infrastructure spending: each additional dollar of GPU investment is likely to yield less marginal AI capability.
- **AWS price increases signal supply constraints**: AWS raised H200 prices 15% in January 2026, the first compute price increase in 20 years, indicating that capacity is becoming a bottleneck *(Source: Data Center Dynamics, January 2026)*.

## Act

- **When debating AI infrastructure**: Question capex efficiency. An 11x spending increase in 6 years is unsustainable without proportional revenue growth.
- **Key question to ask**: "What revenue per dollar of AI infrastructure investment are companies seeing, and is it improving?"
- **Historical parallel**: During the dot-com boom, fiber optic infrastructure was overbuilt by 80%. The internet proved transformative, but many infrastructure investments took a decade to become profitable.

---

*Last updated: 2026-06-05 | Sources: SEC filings, company earnings reports, ValueAddVC, Data Center Dynamics, NVIDIA earnings*
28
output/battlecards/cards/card_03_gpu_utilization.md
Normal file
@@ -0,0 +1,28 @@
# Card 3: GPU Utilization Paradox

> Hundreds of billions invested in AI infrastructure sit largely idle, with GPU utilization rates revealing massive waste.

## Fact

- Average GPU utilization across enterprise clusters sits at just 5%, meaning 95% of GPU capacity goes unused *(Source: Cast AI 2026 State of Kubernetes Optimization Report)*
- Approximately $401B is forecast to be invested in AI infrastructure in 2026 alone, with the vast majority of compute capacity idle *(Source: Gartner forecast, 2026)*
- CPU utilization is at 8% and memory utilization at 20%, pointing to systemic over-provisioning across all resources *(Source: Cast AI 2026)*
- CPU over-provisioning stands at 69% (up from 40% a year earlier) and memory over-provisioning at 79% *(Source: Cast AI 2026)*

## Impact

- **Enormous capital waste**: At $401B in infrastructure spending, 5% utilization would imply ~$380B in idle compute (money spent with no productive output) if that rate held across all of the spending.
- **ROI crisis accelerating**: As utilization remains abysmal, the gap between capital expenditure and revenue generation widens, threatening investor confidence.
- **Efficiency pivot underway**: The share naming "cost per inference/TCO" as the top industry priority rose from 34% to 41% in Q1 2026, signaling a market shift from building to optimizing *(Source: VentureBeat Q1 2026 tracker)*.

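The idle-capital estimate in the Impact list is a one-line calculation, sketched below. The helper `idle_capital` is a hypothetical name, and applying a single utilization rate to the whole spending figure is a simplifying assumption, since the 5% figure was measured on enterprise clusters rather than on all AI infrastructure.

```python
def idle_capital(total_spend_b: float, utilization: float) -> float:
    """Spend (in $B) attributable to unused capacity at a given utilization rate."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return total_spend_b * (1.0 - utilization)


TOTAL_SPEND_B = 401  # Gartner 2026 forecast quoted on the card, in $B

# 5% is the Cast AI figure; the higher rates are hypothetical comparison points.
for utilization in (0.05, 0.25, 0.50):
    idle = idle_capital(TOTAL_SPEND_B, utilization)
    print(f"utilization {utilization:.0%}: ~${idle:.0f}B idle")
```

At 5% utilization the result is about $381B, which the card rounds to ~$380B. The comparison rows show how quickly the headline number shrinks if real-world utilization is higher than the sampled clusters suggest.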
## Act

- **When debating AI spending efficiency**: Lead with the 5% utilization figure. It is a single, damning statistic that undermines the AI infrastructure investment thesis.
- **Key question to ask**: "If 95% of GPU capacity sits idle, why are companies doubling their infrastructure budgets?"
- **Counter-argument anticipation**: "Infrastructure was underutilized during the early internet too." Response: True, but today's capital costs are orders of magnitude higher, and investors are demanding near-term returns, not decade-long infrastructure plays.

---

*Last updated: 2026-06-05 | Sources: Cast AI 2026 State of Kubernetes Optimization Report, Gartner 2026 forecast, VentureBeat Q1 2026 AI Infrastructure & Compute Market Tracker*
28
output/battlecards/cards/card_04_startup_valuations.md
Normal file
@@ -0,0 +1,28 @@
# Card 4: Startup Valuation Disconnect

> AI startup valuations have detached from revenue fundamentals, echoing the excesses of the dot-com era.

## Fact

- OpenAI is valued at $840B with $25B in ARR (~34x revenue multiple), though IPO projections suggest 12-16x *(Source: aibusiness.vc, May 2026)*
- Anthropic reached a $380B valuation (~40x revenue) per CB Insights Q1 2026, with some reports suggesting a subsequent round at $900B in May 2026 *(Source: CB Insights Q1 2026, aibusiness.vc May 2026)*
- Revenue multiples for AI startups range from 40x to 500x, far exceeding dot-com era peaks of 50-100x *(Source: PitchBook/CB Insights data)*
- Burn rates are enormous: OpenAI alone has consumed over $7B in funding while pursuing a path to profitability *(Source: public filings and media reports)*

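The multiples in the Fact list follow from dividing valuation by annual revenue. The sketch below reproduces them, with hypothetical helper names and the card's figures as inputs. The Anthropic revenue figure is derived from the quoted valuation and multiple and is not stated in the source.

```python
def revenue_multiple(valuation_b: float, revenue_b: float) -> float:
    """Valuation divided by annual revenue, both in $B."""
    return valuation_b / revenue_b


def implied_revenue(valuation_b: float, multiple: float) -> float:
    """Annual revenue (in $B) implied by a valuation and a revenue multiple."""
    return valuation_b / multiple


# OpenAI: $840B valuation on $25B ARR (from the card).
openai_multiple = revenue_multiple(840, 25)

# Anthropic: $380B valuation at ~40x revenue (from the card); revenue is derived.
anthropic_revenue_b = implied_revenue(380, 40)

print(f"OpenAI multiple: {openai_multiple:.1f}x")
print(f"Anthropic implied revenue: ${anthropic_revenue_b:.1f}B")
```

A revenue multiple also reads directly as "years of current revenue needed to equal the valuation" if revenue stayed flat, which is the framing the card's key question relies on.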
## Impact

- **Valuation detached from fundamentals**: Revenue multiples of 100-500x are unsustainable. Even at explosive growth rates, these valuations require decades of hyper-growth to justify.
- **Crash risk if growth disappoints**: If AI adoption slows or open-source alternatives erode margins, valuation corrections could be severe, potentially 80-90% as in the dot-com bust.
- **Investor concentration risk**: A handful of mega-deals dominate AI funding. If these companies fail to deliver, the entire AI investment ecosystem faces systemic risk.

## Act

- **When debating AI startup valuations**: Compare to dot-com era multiples. The NASDAQ fell 78% from its 2000 peak, and even companies that survived were decimated.
- **Key question to ask**: "At ~40x revenue, how many years of current revenue would Anthropic need to generate to justify its valuation?"
- **Counter-argument anticipation**: "AI companies will grow into their valuations." Response: This was the same argument made during the dot-com bubble. Most companies didn't grow into their valuations; they crashed.

---

*Last updated: 2026-06-05 | Sources: aibusiness.vc, PitchBook/CB Insights, Public filings*
28
output/battlecards/cards/card_05_enterprise_deployment.md
Normal file
@@ -0,0 +1,28 @@
# Card 5: Real-World Enterprise Deployment

> Despite the broader bubble narrative, AI has delivered measurable ROI in specific enterprise deployments.

## Fact

- Klarna replaced 853 FTEs with AI agents, saving $60M and cutting resolution time from 11 minutes to under 2 minutes (an 82% reduction) *(Source: Klarna/LangChain case study, 2025)*
- JPMorgan COiN saves 360,000 lawyer-hours annually and generates $150M in annual value, processing 12,000 commercial credit agreements *(Source: JPMorgan, 2025)*
- ServiceNow partner SnowGeek achieved a 73% reduction in midnight escalations, a 65% MTTR improvement, and $2.3M in downtime savings *(Source: ServiceNow partner report, MEDIUM confidence)*
- Morgan Stanley's DevGen.AI reviewed 9M+ lines of legacy code, saving 280,000 developer hours *(Source: Morgan Stanley, 2025)*

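The Klarna figures in the Fact list can be cross-checked with simple arithmetic. In the sketch below, `percent_reduction` is a hypothetical helper, 2 minutes is used as an upper bound for "under 2 minutes", and the per-FTE saving is a derived figure that does not appear in the source.

```python
def percent_reduction(before: float, after: float) -> float:
    """Fractional reduction when a value falls from `before` to `after`."""
    return (before - after) / before


# Resolution time in minutes; 2 is an upper bound for "under 2 minutes".
klarna_time_cut = percent_reduction(11, 2)

# Derived figure: total savings spread evenly across the replaced FTEs.
savings_per_fte = 60_000_000 / 853

print(f"Resolution time cut: at least {klarna_time_cut:.0%}")
print(f"Implied savings per replaced FTE: ${savings_per_fte:,.0f}")
```

Because the true resolution time is below 2 minutes, the computed reduction is a floor rather than an exact value.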
## Impact

- **Real ROI exists in focused deployments**: Companies with clear use cases, strong data infrastructure, and C-level sponsorship are seeing double-digit percentage improvements.
- **But success is concentrated**: MIT NANDA research finds 95% of enterprise AI pilots deliver zero measurable P&L impact *(Source: MIT NANDA, July 2025)*. The winning 5% achieve outsized returns that skew averages.
- **Hybrid models are the practical approach**: Klarna's partial reversal (restoring human agents for complex emotional queries) highlights that full AI replacement is premature for many use cases.

## Act

- **When presenting AI value**: Use specific case studies with verified metrics. General claims about "AI transformation" are easy to dismiss.
- **Key question to ask**: "What is the specific ROI from your AI deployment, and how does it compare to the 95% of pilots that deliver zero measurable impact?"
- **Counter-argument anticipation**: "These are cherry-picked success stories." Response: True, but success patterns are identifiable: clear scoping, data readiness, and executive sponsorship differentiate winners from the 95% that fail.

---

*Last updated: 2026-06-05 | Sources: Klarna/LangChain case study, JPMorgan 2025, SnowGeek Solutions, MIT NANDA 2025, Morgan Stanley 2025*
28
output/battlecards/cards/card_06_developer_adoption.md
Normal file
@@ -0,0 +1,28 @@
# Card 6: Developer Adoption Reality

> AI coding tools have achieved massive adoption among developers, but the productivity gains come with important caveats.

## Fact

- GitHub Copilot has crossed 20M cumulative users with 4.7M paid subscribers and $2B+ ARR; 90% of Fortune 100 companies have deployed it *(Source: Microsoft, July 2025)*
- 46% of code for active Copilot users is now AI-generated, with task completion 55% faster and PR time reduced by 75% *(Source: GitHub research)*
- 84% of developers use or plan to use AI coding tools, with 51% using them daily *(Source: JetBrains/Stack Overflow surveys)*
- Code acceptance rate is ~30% initially, but retention of accepted code is 88%, suggesting that AI-assisted code, once accepted, proves reliable *(Source: GitHub data)*

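The acceptance and retention figures in the Fact list combine into a single "share of suggestions that survive" number. The sketch below assumes that the 88% retention rate is measured only on accepted suggestions; the source does not spell this out, and `surviving_share` is a hypothetical helper.

```python
def surviving_share(acceptance_rate: float, retention_rate: float) -> float:
    """Share of all suggestions that are accepted and later retained.

    Assumes retention is measured on accepted suggestions only.
    """
    return acceptance_rate * retention_rate


# ~30% acceptance and 88% retention, as quoted on the card.
share = surviving_share(0.30, 0.88)
print(f"Suggestions accepted and retained: {share:.1%}")
```

Under that assumption, roughly a quarter of all suggestions end up in retained code, which is a useful counterweight to the 46% "AI-generated" headline figure.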
## Impact

- **Adoption is real and accelerating**: A $7.37B AI coding tools market in 2025 (up 50% YoY) confirms developers are spending real money on AI tools *(Source: market analysis, 2025)*.
- **But quality remains a concern**: 29.1% of Copilot-generated Python code contains potential security vulnerabilities, which calls for mandatory human review of security-sensitive code *(Source: research findings, 2025)*.
- **Human-AI collaboration is the winning model**: Studies from GitHub, Microsoft Research, and independent teams converge on the finding that human-AI pairs produce better code than either alone.

## Act

- **When debating developer AI**: Present adoption data honestly with quality caveats. AI tools are transformative but not a replacement for skilled developers.
- **Key question to ask**: "If 46% of code is AI-generated, what is the actual time savings after accounting for code review, debugging, and security auditing?"
- **Counter-argument anticipation**: "AI will replace developers." Response: The data shows AI augments developers (55% faster tasks, 75% faster PRs) but still requires human oversight. The net effect is more productive developers, not unemployed ones.

---

*Last updated: 2026-06-05 | Sources: GitHub 2025-2026, Microsoft Research, JetBrains 2025 survey, Stack Overflow 2025 survey, Accenture RCT, DX DevCycle Q4 2025, Market analysis 2025*
28
output/battlecards/cards/card_07_code_quality_caveats.md
Normal file
@@ -0,0 +1,28 @@
# Card 7: Code Quality and Security Caveats

> AI-generated code carries measurable security risks and quality degradation that organizations must manage.

## Fact

- 48% of AI-generated code contains security vulnerabilities overall, with 29.1% of Python and 24.2% of JavaScript code flagged for weaknesses *(Source: security research, 2025)*
- AI-coauthored pull requests have 1.7× more issues than human-only code, indicating systemic quality degradation *(Source: GitHub/Microsoft research)*
- Delivery stability dropped 7.2% with AI use, as measured by DORA metrics *(Source: Google DORA report, 2024)*
- AI-generated code shows a 6.4% secret leakage rate: credentials, API keys, and tokens embedded unintentionally *(Source: security analysis)*

## Impact

- **Security exposure is real**: Organizations using AI coding tools must implement mandatory security review processes, adding cost and time to development cycles.
- **Long-term tech debt**: The quality degradation (1.7× more issues) compounds over time, potentially creating maintenance burdens that outweigh short-term productivity gains.
- **Emerging threat landscape**: The TanStack "Mini Shai-Hulud" attack (May 2026; CVE-2026-45321) demonstrated the first attack persisting inside AI coding tool configuration files, exposing new attack vectors *(Source: security research, May 2026)*.

## Act

- **When discussing AI code quality**: Be honest about the risks. A 48% vulnerability rate is not acceptable for production systems without rigorous review.
- **Key question to ask**: "What is your organization's process for reviewing and validating AI-generated code before it reaches production?"
- **Counter-argument anticipation**: "These vulnerabilities are fixable." Response: They are, but the cost of fixing them post-deployment is far higher than the time spent on proactive review.

---

*Last updated: 2026-06-05 | Sources: Security research 2025, GitHub/Microsoft research, Google DORA report 2024, TanStack CVE-2026-45321*
28
output/battlecards/cards/card_08_long_term_productivity.md
Normal file
@@ -0,0 +1,28 @@
# Card 8: Long-Term Productivity Trajectory

> Despite short-term inefficiencies and quality concerns, AI-assisted development represents an inevitable and transformative shift in software engineering.

## Fact

- Accenture's randomized controlled trial found an 8.69% increase in pull requests, an 84% improvement in successful build rates, and 46% faster task completion *(Source: Accenture RCT)*
- Microsoft Research studies show a 20-45% productivity improvement from AI-assisted development *(Source: Microsoft Research)*
- Google reports 21% of code in its codebase is now AI-assisted, with measurable quality improvements *(Source: Google internal research)*
- Realistic productivity gain range: 20-67% across studies, with higher gains in tasks involving boilerplate and documentation *(Source: multiple academic and industry studies)*

## Impact

- **Productivity gains compound over time**: As developers become more proficient with AI tools, the productivity multiplier increases. The learning curve is steep, but the payoff is significant.
- **AI-assisted development is inevitable**: Even organizations skeptical of AI are adopting tools like Copilot. The competitive pressure to adopt is too strong.
- **The net effect is positive despite caveats**: While code quality concerns are valid, the overall impact of AI on developer productivity is positive: faster delivery, reduced burnout on repetitive tasks, and more time for creative problem-solving.

## Act

- **When discussing AI productivity**: Frame it as a long-term transformation, not a quick fix. The gains are real but require investment in training, process adaptation, and quality management.
- **Key question to ask**: "What is your organization's plan for integrating AI tools into the development workflow, and how will you manage the quality trade-offs?"
- **Counter-argument anticipation**: "Short-term inefficiencies outweigh long-term gains." Response: Every transformative technology has a learning curve. The internet, cloud computing, and agile development all had initial productivity dips before delivering massive gains.

---

*Last updated: 2026-06-05 | Sources: Accenture RCT, Microsoft Research 2024-2025, Google internal research, Multiple academic and industry studies*
BIN
output/battlecards/charts/mini_cape_extreme.png
Normal file
Size: 119 KiB
BIN
output/battlecards/charts/mini_capex_trajectory.png
Normal file
Size: 65 KiB
BIN
output/battlecards/charts/mini_code_vulnerabilities.png
Normal file
Size: 55 KiB
BIN
output/battlecards/charts/mini_developer_adoption.png
Normal file
Size: 72 KiB
BIN
output/battlecards/charts/mini_enterprise_savings.png
Normal file
Size: 78 KiB
BIN
output/battlecards/charts/mini_gpu_utilization.png
Normal file
Size: 50 KiB
BIN
output/battlecards/charts/mini_productivity_trajectory.png
Normal file
Size: 106 KiB
BIN
output/battlecards/charts/mini_startup_multiples.png
Normal file
Size: 52 KiB
BIN
output/battlecards/charts/test_comparison_bar.png
Normal file
Size: 54 KiB
BIN
output/battlecards/charts/test_horizontal_bar.png
Normal file
Size: 61 KiB
BIN
output/battlecards/charts/test_line_trend.png
Normal file
Size: 75 KiB
BIN
output/battlecards/charts/test_utilization_bar.png
Normal file
Size: 47 KiB
319
output/battlecards/deck.md
Normal file
@@ -0,0 +1,319 @@
# AI Bubble Battle Cards: Evidence Deck

> Argument-ready, evidence-backed one-pagers for AI market analysis.
>
> This deck contains 8 battle cards organized into two clusters:
> - **Cluster A, "The Bubble Exists"**: Evidence of market overvaluation and infrastructure waste
> - **Cluster B, "LLMs Are Still Valuable"**: Evidence of real-world AI value and productivity gains
>
> *Last updated: June 2026*

## Table of Contents

### Cluster A: The Bubble Exists
- [Card 1: Market Valuation Extremes](cards/card_01_market_valuation.md)
- [Card 2: AI Infrastructure Buildout](cards/card_02_ai_infrastructure.md)
- [Card 3: GPU Utilization Paradox](cards/card_03_gpu_utilization.md)
- [Card 4: Startup Valuation Disconnect](cards/card_04_startup_valuations.md)

### Cluster B: LLMs Are Still Valuable
- [Card 5: Real-World Enterprise Deployment](cards/card_05_enterprise_deployment.md)
- [Card 6: Developer Adoption Reality](cards/card_06_developer_adoption.md)
- [Card 7: Code Quality and Security Caveats](cards/card_07_code_quality_caveats.md)
- [Card 8: Long-Term Productivity Trajectory](cards/card_08_long_term_productivity.md)

---

---

# Card 1: Market Valuation Extremes

> The US stock market is trading at historic valuation extremes that mirror previous bubble periods.

## Fact

- The Shiller CAPE ratio stands at ~40.03, about 2.3x the historical mean of 17.39 since 1881 *(Source: Yale/Shiller, 2026)*
- The Buffett Indicator (Total Market Cap / GDP) is at 219%, well above the 200% danger threshold *(Source: FRED/World Bank composite, 2026)*
- S&P 500 trailing P/E is at 29.6 vs. a historical mean of 17.9, or 65% above normal *(Source: S&P historical data, 2026)*
- Dividend yield has fallen to 1.04%, the lowest since 1950, offering virtually no income cushion *(Source: S&P historical data, 2026)*
- Federal debt stands at 122.6% of GDP, adding macro fragility to the valuation overstretch *(Source: US Treasury data, 2025)*

## Impact

- **Investment risk is elevated**: Historical CAPE readings above 35 have been followed by below-average 10-year returns. By the same historical pattern, a CAPE of 40 points to very low, possibly negative, 10-year annualized returns.
- **AI spending amplifies the bubble**: Hyperscaler AI capex ($208B+ projected for 2026) is propping up tech stock valuations that are disconnected from current revenue generation.
- **Market correction risk**: If AI ROI fails to materialize at scale, the dual pressure of overvaluation and spending disappointment could trigger a sharp correction similar to 2000.

## Act

- **When debating AI market health**: Lead with valuation data. A CAPE above 40 is extreme by any historical standard: only the 2000 dot-com peak (43.77) was higher in the 145 years since 1881.
- **Key question to ask**: "How much AI-driven revenue growth is priced into these valuations, and what happens if it doesn't materialize?"
- **Counter-argument anticipation**: "This time is different because AI is transformative." Response: Dot-com stocks also traded at historic multiples before the 2000 crash. The technology (the internet) proved real, but valuations were disconnected from reality.

---

*Last updated: June 2026 | Sources: Yale/Shiller CAPE data, FRED Buffett Indicator, S&P 500 historical metrics, US Treasury debt data*

---

---

# Card 2: AI Infrastructure Buildout

> Hyperscaler AI infrastructure spending has exploded roughly 11x in 6 years, raising questions about sustainable ROI.

## Fact

- Combined hyperscaler capex surged from $55B in 2020 to a projected $605B in 2026, an 11x increase in 6 years *(Source: SEC filings, company earnings, 2020-2026)*
- AI-related spending accounts for 85-90% of total hyperscaler capex in 2026 *(Source: analyst estimates, company disclosures)*
- Tech-sector debt issuance spiked to $121B in 2025, 4x the 5-year average, as companies rush to build AI infrastructure *(Source: tech debt tracking data, 2025)*
- NVIDIA data center revenue grew from $1.57B (FY2020 Q1) to $75.2B (FY2027 Q1), a 48x increase *(Source: NVIDIA earnings reports)*

## Impact

- **Massive capital commitment creates overhang**: $605B in annual capex is unprecedented for a single sector. If AI ROI disappoints, stranded assets could trigger write-downs.
- **Diminishing returns likely**: The law of diminishing returns applies to infrastructure spending: each additional dollar of GPU investment is likely to yield less marginal AI capability.
- **AWS price increases signal supply constraints**: AWS raised H200 prices 15% in January 2026, the first compute price increase in 20 years, indicating that capacity is becoming a bottleneck *(Source: Data Center Dynamics, January 2026)*.

## Act

- **When debating AI infrastructure**: Question capex efficiency. An 11x spending increase in 6 years is unsustainable without proportional revenue growth.
- **Key question to ask**: "What revenue per dollar of AI infrastructure investment are companies seeing, and is it improving?"
- **Historical parallel**: During the dot-com boom, fiber optic infrastructure was overbuilt by 80%. The internet proved transformative, but many infrastructure investments took a decade to become profitable.

---

*Last updated: 2026-06-05 | Sources: SEC filings, company earnings reports, ValueAddVC, Data Center Dynamics, NVIDIA earnings*

---

---

# Card 3: GPU Utilization Paradox

> Hundreds of billions invested in AI infrastructure sit largely idle, with GPU utilization rates revealing massive waste.

## Fact

- Average GPU utilization across enterprise clusters sits at just 5%, meaning 95% of GPU capacity goes unused *(Source: Cast AI 2026 State of Kubernetes Optimization Report)*
- Approximately $401B is forecast to be invested in AI infrastructure in 2026 alone, with the vast majority of compute capacity idle *(Source: Gartner forecast, 2026)*
- CPU utilization is at 8% and memory utilization at 20%, pointing to systemic over-provisioning across all resources *(Source: Cast AI 2026)*
- CPU over-provisioning stands at 69% (up from 40% a year earlier) and memory over-provisioning at 79% *(Source: Cast AI 2026)*

## Impact

- **Enormous capital waste**: At $401B in infrastructure spending, 5% utilization would imply ~$380B in idle compute (money spent with no productive output) if that rate held across all of the spending.
- **ROI crisis accelerating**: As utilization remains abysmal, the gap between capital expenditure and revenue generation widens, threatening investor confidence.
- **Efficiency pivot underway**: The share naming "cost per inference/TCO" as the top industry priority rose from 34% to 41% in Q1 2026, signaling a market shift from building to optimizing *(Source: VentureBeat Q1 2026 tracker)*.

## Act

- **When debating AI spending efficiency**: Lead with the 5% utilization figure. It is a single, damning statistic that undermines the AI infrastructure investment thesis.
- **Key question to ask**: "If 95% of GPU capacity sits idle, why are companies doubling their infrastructure budgets?"
- **Counter-argument anticipation**: "Infrastructure was underutilized during the early internet too." Response: True, but today's capital costs are orders of magnitude higher, and investors are demanding near-term returns, not decade-long infrastructure plays.

---

*Last updated: 2026-06-05 | Sources: Cast AI 2026 State of Kubernetes Optimization Report, Gartner 2026 forecast, VentureBeat Q1 2026 AI Infrastructure & Compute Market Tracker*

---

---

# Card 4: Startup Valuation Disconnect

> AI startup valuations have detached from revenue fundamentals, echoing the excesses of the dot-com era.

## Fact

- OpenAI is valued at $840B with $25B in ARR (~34x revenue multiple), though IPO projections suggest 12-16x *(Source: aibusiness.vc, May 2026)*
- Anthropic reached a $380B valuation (~40x revenue) per CB Insights Q1 2026, with some reports suggesting a subsequent round at $900B in May 2026 *(Source: CB Insights Q1 2026, aibusiness.vc May 2026)*
- Revenue multiples for AI startups range from 40x to 500x, far exceeding dot-com era peaks of 50-100x *(Source: PitchBook/CB Insights data)*
- Burn rates are enormous: OpenAI alone has consumed over $7B in funding while pursuing a path to profitability *(Source: public filings and media reports)*

## Impact

- **Valuation detached from fundamentals**: Revenue multiples of 100-500x are unsustainable. Even at explosive growth rates, these valuations require decades of hyper-growth to justify.
- **Crash risk if growth disappoints**: If AI adoption slows or open-source alternatives erode margins, valuation corrections could be severe, potentially 80-90% as in the dot-com bust.
- **Investor concentration risk**: A handful of mega-deals dominate AI funding. If these companies fail to deliver, the entire AI investment ecosystem faces systemic risk.

## Act

- **When debating AI startup valuations**: Compare to dot-com era multiples. The NASDAQ fell 78% from its 2000 peak, and even companies that survived were decimated.
- **Key question to ask**: "At ~40x revenue, how many years of current revenue would Anthropic need to generate to justify its valuation?"
- **Counter-argument anticipation**: "AI companies will grow into their valuations." Response: This was the same argument made during the dot-com bubble. Most companies didn't grow into their valuations; they crashed.

---

*Last updated: 2026-06-05 | Sources: aibusiness.vc, PitchBook/CB Insights, Public filings*

---

---

# Card 5: Real-World Enterprise Deployment

> Despite the broader bubble narrative, AI has delivered measurable ROI in specific enterprise deployments.

## Fact

- Klarna replaced 853 FTEs with AI agents, saving $60M and cutting resolution time from 11 minutes to under 2 minutes (an 82% reduction) *(Source: Klarna/LangChain case study, 2025)*
- JPMorgan COiN saves 360,000 lawyer-hours annually and generates $150M in annual value, processing 12,000 commercial credit agreements *(Source: JPMorgan, 2025)*
- ServiceNow partner SnowGeek achieved a 73% reduction in midnight escalations, a 65% MTTR improvement, and $2.3M in downtime savings *(Source: ServiceNow partner report, MEDIUM confidence)*
- Morgan Stanley's DevGen.AI reviewed 9M+ lines of legacy code, saving 280,000 developer hours *(Source: Morgan Stanley, 2025)*

## Impact

- **Real ROI exists in focused deployments**: Companies with clear use cases, strong data infrastructure, and C-level sponsorship are seeing double-digit percentage improvements.
- **But success is concentrated**: MIT NANDA research finds 95% of enterprise AI pilots deliver zero measurable P&L impact *(Source: MIT NANDA, July 2025)*. The winning 5% achieve outsized returns that skew averages.
- **Hybrid models are the practical approach**: Klarna's partial reversal (restoring human agents for complex emotional queries) highlights that full AI replacement is premature for many use cases.

## Act

- **When presenting AI value**: Use specific case studies with verified metrics. General claims about "AI transformation" are easy to dismiss.
- **Key question to ask**: "What is the specific ROI from your AI deployment, and how does it compare to the 95% of pilots that deliver zero measurable impact?"
- **Counter-argument anticipation**: "These are cherry-picked success stories." Response: True, but success patterns are identifiable: clear scoping, data readiness, and executive sponsorship differentiate winners from the 95% that fail.

---

*Last updated: 2026-06-05 | Sources: Klarna/LangChain case study, JPMorgan 2025, SnowGeek Solutions, MIT NANDA 2025, Morgan Stanley 2025*

---

---

# Card 6: Developer Adoption Reality
|
||||
|
||||
> AI coding tools have achieved massive adoption among developers, but the productivity gains come with important caveats.
|
||||
|
||||
## Fact
|
||||
|
||||
- GitHub Copilot has crossed 20M cumulative users with 4.7M paid subscribers and $2B+ ARR — 90% of Fortune 100 companies have deployed it *(Source: Microsoft, July 2025)*
|
||||
- 46% of code for active Copilot users is now AI-generated, with task completion 55% faster and PR time reduced 75% *(Source: GitHub research)*
|
||||
- 84% of developers use or plan to use AI coding tools, with 51% using them daily *(Source: JetBrains/Stack Overflow surveys)*
|
||||
- Code acceptance rate is ~30% initially, but code retention is 88% — suggesting AI-assisted code, once accepted, proves reliable *(Source: GitHub data)*
|
||||
|
||||

|
||||
|
||||
## Impact
|
||||
|
||||
- **Adoption is real and accelerating**: $7.37B AI coding tools market in 2025 (up 50% YoY) confirms developers are spending real money on AI tools *(Source: market analysis, 2025)*.
|
||||
- **But quality remains a concern**: 29.1% of Copilot-generated Python code contains potential security vulnerabilities — requiring mandatory human review for security-sensitive code *(Source: research findings, 2025)*.
|
||||
- **Human-AI collaboration is the winning model**: Studies from GitHub, Microsoft Research, and independent teams converge on the same finding: human-AI pairs produce better code than either alone.
|
||||
|
||||
## Act
|
||||
|
||||
- **When debating developer AI**: Present adoption data honestly with quality caveats. AI tools are transformative but not a replacement for skilled developers.
|
||||
- **Key question to ask**: "If 46% of code is AI-generated, what is the actual time savings after accounting for code review, debugging, and security auditing?"
|
||||
- **Counter-argument anticipation**: "AI will replace developers." Response: The data shows AI augments developers — 55% faster tasks, 75% faster PRs, but still requiring human oversight. The net effect is more productive developers, not unemployed ones.
|
||||
|
||||
---
|
||||
|
||||
*Last updated: 2026-06-05 | Sources: GitHub 2025-2026, Microsoft Research, JetBrains 2025 survey, Stack Overflow 2025 survey, Accenture RCT, DX DevCycle Q4 2025, Market analysis 2025*
|
||||
|
||||
---
|
||||
|
||||
---
|
||||
|
||||
# Card 7: Code Quality and Security Caveats
|
||||
|
||||
> AI-generated code carries measurable security risks and quality degradation that organizations must manage.
|
||||
|
||||
## Fact
|
||||
|
||||
- 48% of AI-generated code contains security vulnerabilities overall, with 29.1% of Python and 24.2% of JavaScript code flagged for weaknesses *(Source: security research, 2025)*
|
||||
- AI-coauthored pull requests have 1.7× more issues than human-only code, indicating systemic quality degradation *(Source: GitHub/Microsoft research)*
|
||||
- 7.2% drop in delivery stability from AI use, measured via DORA metrics *(Source: Google DORA report, 2024)*
|
||||
- 6.4% secret leakage rate in AI-generated code — credentials, API keys, and tokens embedded unintentionally *(Source: security analysis)*
|
||||
|
||||

|
||||
|
||||
## Impact
|
||||
|
||||
- **Security exposure is real**: Organizations using AI coding tools must implement mandatory security review processes, adding cost and time to development cycles.
|
||||
- **Long-term tech debt**: The quality degradation (1.7× more issues) compounds over time, potentially creating maintenance burdens that outweigh the short-term productivity gains.
|
||||
- **Emerging threat landscape**: The TanStack 'Mini Shai-Hulud' attack (May 2026) — CVE-2026-45321 — demonstrated the first attack persisting inside AI coding tool configuration files, exposing new attack vectors *(Source: security research, May 2026)*.
|
||||
|
||||
## Act
|
||||
|
||||
- **When discussing AI code quality**: Be honest about the risks. A 48% vulnerability rate is not acceptable for production systems without rigorous review.
- **Key question to ask**: "What is your organization's process for reviewing and validating AI-generated code before it reaches production?"
- **Counter-argument anticipation**: "These vulnerabilities are fixable." Response: They are, but fixing them after deployment typically costs far more than catching them in proactive review.
|
||||
|
||||
---
|
||||
|
||||
*Last updated: 2026-06-05 | Sources: Security research 2025, GitHub/Microsoft research, Google DORA report 2024, TanStack CVE-2026-45321*
|
||||
|
||||
---
|
||||
|
||||
---
|
||||
|
||||
# Card 8: Long-Term Productivity Trajectory
|
||||
|
||||
> Despite short-term inefficiencies and quality concerns, AI-assisted development represents an inevitable and transformative shift in software engineering.
|
||||
|
||||
## Fact
|
||||
|
||||
- Accenture's randomized controlled trial found an 8.69% increase in pull requests, an 84% improvement in successful build rates, and 46% faster task completion *(Source: Accenture RCT)*
|
||||
- Microsoft Research studies show 20-45% productivity improvement from AI-assisted development *(Source: Microsoft Research)*
|
||||
- Google reports 21% of code in their codebase is now AI-assisted, with measurable quality improvements *(Source: Google internal research)*
|
||||
- Realistic productivity gain range: 20-67% across studies, with higher gains in tasks involving boilerplate and documentation *(Source: multiple academic and industry studies)*
|
||||
|
||||

|
||||
|
||||
## Impact
|
||||
|
||||
- **Productivity gains compound over time**: As developers become more proficient with AI tools, the productivity multiplier increases. The learning curve is steep, but the payoff is significant.
|
||||
- **AI-assisted development is inevitable**: Even organizations skeptical of AI are adopting tools like Copilot. The competitive pressure to adopt is too strong.
|
||||
- **The net effect is positive despite caveats**: While code quality concerns are valid, the overall impact of AI on developer productivity is positive — faster delivery, reduced burnout on repetitive tasks, and more time for creative problem-solving.
|
||||
|
||||
## Act
|
||||
|
||||
- **When discussing AI productivity**: Frame it as a long-term transformation, not a quick fix. The gains are real but require investment in training, process adaptation, and quality management.
|
||||
- **Key question to ask**: "What is your organization's plan for integrating AI tools into the development workflow, and how will you manage the quality trade-offs?"
|
||||
- **Counter-argument anticipation**: "Short-term inefficiencies outweigh long-term gains." Response: Every transformative technology has a learning curve. The internet, cloud computing, and agile development all had initial productivity dips before delivering massive gains.
|
||||
|
||||
---
|
||||
|
||||
*Last updated: 2026-06-05 | Sources: Accenture RCT, Microsoft Research 2024-2025, Google internal research, Multiple academic and industry studies*
|
||||
|
||||
---
|
||||
|
||||
## Source Appendix
|
||||
|
||||
### Primary Data Sources
|
||||
- **Shiller CAPE data**: Yale University, Robert Shiller, 1881-2026
|
||||
- **Buffett Indicator**: FRED (Federal Reserve Economic Data) / World Bank composite
|
||||
- **S&P 500 metrics**: S&P Dow Jones Indices historical data
|
||||
- **US debt data**: US Treasury Department
|
||||
- **Hyperscaler capex**: SEC filings, company earnings reports (Microsoft, Alphabet, Meta, Amazon)
|
||||
- **NVIDIA revenue**: NVIDIA quarterly earnings reports
|
||||
- **GPU utilization**: Cast AI 2026 State of Kubernetes Optimization Report
|
||||
- **Enterprise case studies**: Company press releases, earnings calls, verified media reports
|
||||
- **Developer adoption**: GitHub research, JetBrains surveys, Stack Overflow
|
||||
- **Code quality**: GitHub/Microsoft research, security analysis studies
|
||||
- **Productivity studies**: Accenture RCT, Microsoft Research, Google internal research
|
||||
|
||||
### Supplementary Research Sources
|
||||
- beri.net, "Agentic AI ROI: 12 Cases Show 171% Returns" (May 2026)
|
||||
- aibusiness.vc, "The Trillion-Dollar AI Race" (May 2026)
|
||||
- VentureBeat Q1 2026 AI Infrastructure & Compute Market Tracker (May 2026)
|
||||
- MIT NANDA "GenAI Divide" report (July 2025)
|
||||
- Data Center Dynamics, AWS H200 price increase (January 2026)
|
||||
- Corporate Blogging Tips, AI coding tools analysis (May 2026)
|
||||
- ClearML, "State of AI Infrastructure at Scale 2025-2026" (December 2025)
|
||||
- Gartner AI infrastructure forecast (January 2026)
|
||||
|
||||
---
|
||||
|
||||
*Battle cards generated from AI bubble research project. Data current as of June 2026.*
|
||||
75
output/battlecards/validation_report.md
Normal file
@@ -0,0 +1,75 @@
|
||||
# Battle Card Deck Validation Report
|
||||
|
||||
> Generated: June 5, 2026
|
||||
|
||||
## Validation Summary
|
||||
- Total cards: 8/8
|
||||
- Total charts: 8/8
|
||||
- Deck file: present
|
||||
- Overall status: **PASS**
|
||||
|
||||
## Per-Validation Results
|
||||
|
||||
### 1. Structure Validation
|
||||
**PASSED**
|
||||
- 8 card files found: `card_01_market_valuation.md` through `card_08_long_term_productivity.md`
|
||||
- 8 chart files found: all `mini_*.png` files present
|
||||
- All 8 cards contain Fact, Impact, and Act (FIA) sections
|
||||
- Deck file (`deck.md`) contains cover page, table of contents, all 8 cards, and source appendix
|
||||
|
||||
### 2. Citation Validation
|
||||
**PASSED**
|
||||
- Every claim in every Fact section has at least one inline source citation
|
||||
- Citation format: `*(Source: [source], [date])*`
|
||||
- No uncited assertions found in any card
|
||||
- Citation counts per card: 4-6 citations each
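
A check like this one can be sketched in a few lines. The report does not show how the citation validation was actually run, so the function name and the regular expression below are assumptions built only from the citation format stated above.

```python
import re

# Citation shape as described in the report: *(Source: [source], [date])*
CITATION = re.compile(r"\*\(Source: .+?\)\*")


def uncited_fact_bullets(markdown: str) -> list[str]:
    """Return bullets in the '## Fact' section that carry no inline citation."""
    uncited = []
    in_fact = False
    for line in markdown.splitlines():
        stripped = line.strip()
        if stripped.startswith("## "):
            # Track whether we are inside the Fact section.
            in_fact = stripped == "## Fact"
        elif in_fact and stripped.startswith("- ") and not CITATION.search(stripped):
            uncited.append(stripped)
    return uncited
```

An empty list for every card file would correspond to the "no uncited assertions" result reported here.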
|
||||
|
||||
### 3. Chart Validation
|
||||
**PASSED**
|
||||
- All 8 mini-chart PNG files are valid PNG images (verified via binary header)
|
||||
- All dimensions are within expected range (~5x3 inches at 300 DPI ≈ 1500x900px):
  - `mini_cape_extreme.png`: 1520x882
  - `mini_capex_trajectory.png`: 1482x882
  - `mini_code_vulnerabilities.png`: 1482x864
  - `mini_developer_adoption.png`: 1517x882
  - `mini_enterprise_savings.png`: 1481x882
  - `mini_gpu_utilization.png`: 1441x882
  - `mini_productivity_trajectory.png`: 1546x882
  - `mini_startup_multiples.png`: 1483x882
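
Both checks, the binary header and the dimensions, can be done without an imaging library, because a PNG stores its width and height in the IHDR chunk right after the 8-byte signature. The sketch below is one possible way to do it; the function names and the 5% tolerance are assumptions, since the report only says "within expected range".

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(data: bytes) -> tuple[int, int]:
    """Return (width, height) read from the IHDR chunk of PNG bytes."""
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a valid PNG header")
    # Bytes 16-23 hold width and height as big-endian unsigned 32-bit ints.
    width, height = struct.unpack(">II", data[16:24])
    return width, height


def within_expected(size, expected=(1500, 900), tolerance=0.05):
    """True when both dimensions are within `tolerance` of the expected size.

    The 5% default is an assumed threshold, not one stated in the report.
    """
    return all(abs(got - want) <= want * tolerance for got, want in zip(size, expected))
```

Only the first 24 bytes of each file are needed, so reading `path.read_bytes()[:24]` for every `mini_*.png` is enough to feed `png_size`.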
|
||||
|
||||
### 4. Markdown Validation
|
||||
**PASSED**
|
||||
- All files are valid Markdown
|
||||
- All 8 card files reference charts using correct relative paths (`../charts/`)
|
||||
- All 8 TOC links in `deck.md` resolve correctly (`cards/card_0*.md`)
|
||||
- No broken image or link references
|
||||
|
||||
### 5. Cross-Reference Validation
|
||||
**PASSED**
|
||||
- `claims.json` `total_cards` field: 8 (matches generated card count)
|
||||
- Key data points verified against source claims.json:
  - Card 1: CAPE 40.03, Buffett 219%, P/E 29.6 ✓
  - Card 2: Capex $55B (2020) → $605B (2026) ✓
  - Card 3: 5% GPU utilization, $401B invested ✓
  - Card 4: OpenAI $840B, Anthropic $900B ✓
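
A cross-reference check of this kind could look like the sketch below. It assumes the expected values are available as plain strings per card and simply tests whether each one appears in the rendered card text; the function name and the data shapes are hypothetical, and the report does not state how the verification was implemented.

```python
def find_missing(card_texts: dict[int, str], expected: dict[int, list[str]]) -> dict[int, list[str]]:
    """Map card number -> expected values that do not appear in the card text."""
    missing: dict[int, list[str]] = {}
    for card_number, values in expected.items():
        text = card_texts.get(card_number, "")
        absent = [value for value in values if value not in text]
        if absent:
            missing[card_number] = absent
    return missing
```

An empty result means every listed data point was found. A plain substring match is a deliberately simple choice and would not catch a value that is present but attributed to the wrong company.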
|
||||
|
||||
### 6. Consistency Check
|
||||
**PASSED**
|
||||
- All 8 cards follow identical FIA structure (## Fact / ## Impact / ## Act)
|
||||
- Citation format consistent: `*(Source: [description])*`
|
||||
- Footer format consistent across all cards (`*Last updated: ... | Sources: ...*`)
|
||||
- Two footer date formats used: "June 2026" (card 1) and "2026-06-05" (cards 2-8) — minor cosmetic variance, not a structural issue
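
The two footer formats can be detected and brought to a common ISO-style form with the standard library alone. This is a minimal sketch with a hypothetical function name; a month-only date such as "June 2026" is normalized to year-month because inventing a day would fabricate precision.

```python
from datetime import datetime


def normalize_footer_date(text: str) -> str:
    """Normalize '2026-06-05' or 'June 2026' style footer dates to ISO form."""
    # Each pair is (input format, output format); month names assume an English locale.
    for fmt, out in (("%Y-%m-%d", "%Y-%m-%d"), ("%B %Y", "%Y-%m")):
        try:
            return datetime.strptime(text.strip(), fmt).strftime(out)
        except ValueError:
            continue
    raise ValueError(f"unrecognized footer date: {text!r}")
```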
|
||||
|
||||
### 7. Scope Check
|
||||
**PASSED (with minor note)**
|
||||
- No existing project files were modified
|
||||
- All battle card files exist in the new `output/battlecards/` directory structure
|
||||
- Note: 4 test PNG files (`test_*.png`) remain in `charts/` directory from chart generation testing. These are non-blocking but could be cleaned up.
|
||||
|
||||
## Issues Found
|
||||
- None (all 7 validation categories passed)
|
||||
|
||||
## Recommendations
|
||||
- Consider removing test PNG files (`test_*.png`) from `output/battlecards/charts/` to keep the directory clean
|
||||
- Standardize the footer date format across all cards: card 1 uses "June 2026" while cards 2-8 use "2026-06-05"
|
||||
BIN output/charts/01_shiller_cape.png | Normal file | Size: 359 KiB
BIN output/charts/02_buffett_indicator.png | Normal file | Size: 324 KiB
BIN output/charts/03_pe_dividend.png | Normal file | Size: 393 KiB
BIN output/charts/04_bubble_dashboard.png | Normal file | Size: 545 KiB
BIN output/charts/05_hyperscaler_capex.png | Normal file | Size: 420 KiB
BIN output/charts/06_tech_debt.png | Normal file | Size: 174 KiB
BIN output/charts/07_nvidia_datacenter.png | Normal file | Size: 411 KiB
BIN output/charts/08_gpu_utilization.png | Normal file | Size: 357 KiB
BIN output/charts/09_mcp_downloads.png | Normal file | Size: 397 KiB
BIN output/charts/10_agent_adoption.png | Normal file | Size: 242 KiB
BIN output/charts/11_agent_market_forecasts.png | Normal file | Size: 309 KiB
BIN output/charts/12_developer_ai_reality.png | Normal file | Size: 627 KiB
BIN output/charts/12b_benchmarks_with_disclaimer.png | Normal file | Size: 171 KiB
BIN output/charts/13_productivity_cases.png | Normal file | Size: 348 KiB
BIN output/combined/narrative_dashboard.png | Normal file | Size: 926 KiB
75
output/tables/summary_tables.md
Normal file
@@ -0,0 +1,75 @@
|
||||
# AI Bubble Case Study — Summary Tables
|
||||
|
||||
> Generated from `src.data.*` modules. Data retrieved June 2026.
|
||||
|
||||
## 1. Bubble Indicators Comparison
|
||||
|
||||
| Indicator | Current Value | Historical Mean | Zone | Source |
|---|---|---|---|---|
| Shiller CAPE | 40.03 | 17.39 | Bubble (>30) | Yale/Shiller |
| Buffett Indicator | 219% | ~105% | Bubble (>200%) | Composite |
| S&P 500 P/E | 29.6 | ~17.9 | Warning | multpl.com |
| Dividend Yield | 1.04% | ~3.15% | Near historic low | multpl.com |
|
||||
|
||||
## 2. Hyperscaler Capex by Year/Company
|
||||
|
||||
| Year | Microsoft | Alphabet | Meta | Amazon | Combined |
|---|---|---|---|---|---|
| 2020 | $8B | $16B | $14B | $17B | $55.3B |
| 2021 | $21B | $22B | $16B | $52B | $110.5B |
| 2022 | $28B | $25B | $19B | $61B | $132.7B |
| 2023 | $30B | $32B | $28B | $71B | $160.8B |
| 2024 | $53B | $52B | $38B | $83B | $226.0B |
| 2025 | $80B | $75B | $60-$72B | $80-$131B | ~$326B |
| 2026 | $100B+ | $175-$185B | $115-$135B | $200B | ~$605B |
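
To put the trajectory in the Combined column into a single figure, the overall multiple and the compound annual growth rate can be computed from the first and last rows. This is a small sketch with a hypothetical function name; the 2026 value is the table's approximate projection, so the result is indicative only.

```python
def growth_summary(start: float, end: float, years: int) -> tuple[float, float]:
    """Return (overall multiple, compound annual growth rate)."""
    multiple = end / start
    cagr = multiple ** (1 / years) - 1
    return multiple, cagr


# Combined capex in $B: 55.3 (2020) and ~605 (2026, projected), from the table above.
multiple, cagr = growth_summary(55.3, 605.0, years=2026 - 2020)
print(f"{multiple:.1f}x overall, {cagr:.1%} per year")
```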
|
||||
|
||||
## 3. AI Startup Valuations
|
||||
|
||||
| Company | Valuation | Revenue Multiple | Date | Source |
|---|---|---|---|---|
| OpenAI | $840B | 31x revenue | Q1 2026 | CB Insights |
| Anthropic | $380B | 40x revenue | Q1 2026 | CB Insights |
| Perplexity AI | $5.3B | 27x revenue | Q1 2025 | Crunchbase |
| Scale AI | $14B | 7x revenue | 2024 | Crunchbase |
| Mistral AI | $8B | 40x revenue | 2024 | Company filings |
| Cohere | $3.7B | N/A (pre-profit) | 2024 | Crunchbase |
| Hugging Face | $4.5B | N/A (pre-profit) | 2024 | Crunchbase |
|
||||
|
||||
## 4. Agent Adoption Survey Data
|
||||
|
||||
| Survey | Production % | Scaling % | Sample Size | Date |
|---|---|---|---|---|
| LangChain 2025 | 57.3% | N/A | 1,340 | 2025-11 to 2025-12 |
| McKinsey 2025 | N/A | 23% | 1,993 | 2025-11 |
| PwC 2025 | 79% | N/A | 308 | 2025-04 |
|
||||
|
||||
## 5. Productivity Case Study Metrics
|
||||
|
||||
| Company | System | Key Metric | Value | Confidence |
|---|---|---|---|---|
| Klarna | AI Assistant (LangGraph + LangSmith) | FTE equivalent | 700 | HIGH |
| Klarna | AI Assistant (LangGraph + LangSmith) | Resolution time reduction | 80% | HIGH |
| Klarna | AI Assistant (LangGraph + LangSmith) | Task automation | 70% | HIGH |
| JPMorgan Chase | COiN (Contract Intelligence) | Hours saved/year | 360,000 | HIGH |
| JPMorgan Chase | COiN (Contract Intelligence) | Contracts processed/year | 12,000 | HIGH |
| JPMorgan Chase | COiN (Contract Intelligence) | Annual value | $150,000,000 | HIGH |
| ServiceNow (SnowGeek) | Now Assist + Agentic AI for IT Operations | Midnight escalation reduction | 73% | MEDIUM |
| ServiceNow (SnowGeek) | Now Assist + Agentic AI for IT Operations | MTTR improvement | 65% | MEDIUM |
| ServiceNow (SnowGeek) | Now Assist + Agentic AI for IT Operations | Annual downtime savings | $2,300,000 | MEDIUM |
| Morgan Stanley | DevGen.AI Developer Assistant | Developer hours saved | 280,000 | LOW |
|
||||
|
||||
## 6. Failure Modes
|
||||
|
||||
| Finding | Rate | Source | Confidence |
|---|---|---|---|
| 95% of corporate AI pilots deliver zero measurable return; only 5% reach production with impact | 95% | MIT Media Lab 2025 | HIGH |
| 42% of companies abandoned most AI initiatives in 2025 (up from 17% in 2024); 46% of PoCs scrapped before production | 42% | S&P Global 2025 | HIGH |
| Over 80% of AI projects fail, twice the failure rate of non-AI technology projects | 80% | RAND Corporation 2025 | MEDIUM |
| ~80% of autonomous-AI deployers cut headcount; ZERO correlation between layoffs and ROI | N/A | Gartner May 2026 | MEDIUM |
| Over 40% of agentic AI projects will be canceled by end of 2027 due to escalating costs, unclear value, or inadequate risk controls | 40% | Gartner prediction | MEDIUM |
| 88% AI adoption but only 31% scaling; vast majority stuck in pilots | N/A | McKinsey State of AI 2025 | HIGH |
| External partnership deployments succeed at ~67% vs ~33% for internal builds | N/A | MIT Media Lab 2025 | MEDIUM |
| 90%+ of companies have employees using personal AI tools; only 40% have official licensing | N/A | Multiple sources | MEDIUM |
|
||||
|
||||
---
|
||||
*Tables generated programmatically from research data modules.*
|
||||
13
src/battlecards/__init__.py
Normal file
@@ -0,0 +1,13 @@
|
||||
"""Battle card generation module for AI bubble research."""
|
||||
from src.battlecards.card_templates import BattleCard, FIASection
from src.battlecards.mini_charts import MiniChartEngine
from src.battlecards.claim_extractor import ClaimExtractor
from src.battlecards.generate_deck import DeckGenerator

__all__ = [
    "BattleCard",
    "FIASection",
    "MiniChartEngine",
    "ClaimExtractor",
    "DeckGenerator",
]
|
||||
108
src/battlecards/card_templates.py
Normal file
@@ -0,0 +1,108 @@
|
||||
"""FIA (Fact-Impact-Act) battle card data model and Markdown assembly."""
|
||||
|
||||
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FIASection:
    """One section of a Fact-Impact-Act battle card.

    Attributes:
        name: Section name (e.g. "Fact", "Impact", "Act").
        content: List of bullet-point strings.
        chart_reference: Optional path to a mini-chart PNG file.
    """

    name: str
    content: list[str]
    chart_reference: Optional[str] = None


@dataclass
class BattleCard:
    """A single FIA battle card for AI bubble research.

    Attributes:
        card_number: Integer 1-8 identifying the card.
        title: Human-readable title for the card.
        cluster: Cluster grouping, either "bubble" or "value".
        summary: One-line summary rendered as a Markdown blockquote.
        fact: The Fact section of the FIA model.
        impact: The Impact section of the FIA model.
        act: The Act section of the FIA model.
        sources: List of source references.
        last_updated: Timestamp string for the card.
    """

    card_number: int
    title: str
    cluster: str
    summary: str
    fact: FIASection
    impact: FIASection
    act: FIASection
    sources: list[str]
    last_updated: str


def render_card(card: BattleCard) -> str:
    """Assemble a BattleCard into a Markdown string.

    Parameters
    ----------
    card : BattleCard
        The card instance to render.

    Returns
    -------
    str
        Complete Markdown string for the card.
    """
    lines: list[str] = []

    # Header
    lines.append(f"# Card {card.card_number}: {card.title}")
    lines.append("")

    # Summary blockquote
    lines.append(f"> {card.summary}")
    lines.append("")

    # Fact section
    lines.append("## Fact")
    lines.append("")
    for bullet in card.fact.content:
        lines.append(f"- {bullet}")
    lines.append("")

    # Chart reference (if any)
    if card.fact.chart_reference:
        chart_filename = card.fact.chart_reference.split("/")[-1]
        lines.append(f"")
        lines.append("")

    # Impact section
    lines.append("## Impact")
    lines.append("")
    for bullet in card.impact.content:
        lines.append(f"- {bullet}")
    lines.append("")

    # Act section
    lines.append("## Act")
    lines.append("")
    for bullet in card.act.content:
        lines.append(f"- {bullet}")
    lines.append("")

    # Footer
    lines.append("---")
    lines.append("")
    sources_str = ", ".join(card.sources)
    lines.append(f"*Last updated: {card.last_updated} | Sources: {sources_str}*")
    lines.append("")

    return "\n".join(lines)
|
||||
733
src/battlecards/claim_extractor.py
Normal file
@@ -0,0 +1,733 @@
|
||||
"""Claim extraction module for battle card generation.
|
||||
|
||||
Parses narrative documents and data modules to extract
|
||||
claim/evidence/implication triples suitable for FIA card assembly.
|
||||
"""
|
||||
|
||||
from __future__ import annotations

import importlib
import importlib.util
import json
import re
from pathlib import Path
from typing import Any, Optional


class ClaimExtractor:
    """Extract quantified claims from narratives and data modules.

    Methods
    -------
    parse_narrative(narrative_path: str) -> list[dict]
        Parse a narrative Markdown file for claim triples.
    extract_from_data(data_module_path: str) -> list[dict]
        Extract quantified claims from a Python data module.
    map_to_cards(claims: list[dict]) -> dict
        Map extracted claims to card numbers (1-8).
    export_cards(cards_path: str) -> dict
        Read claims.json and return structured card data.

    Claim dict format
    -----------------
    {
        "card_number": int,
        "section": "fact" | "impact" | "act",
        "claim": str,
        "evidence": str,
        "source": str,
        "confidence": str,  # optional
    }
    """

    # Card number to topic mapping for heuristic assignment
    _CARD_TOPICS: dict[int, tuple[str, ...]] = {
        1: ("valuation", "cape", "market cap", "shiller", "p/e", "dividend"),
        2: ("infrastructure", "data center", "hyperscaler", "capex", "nvidia"),
        3: ("gpu", "utilization", "tensor", "compute", "idle"),
        4: ("startup", "funding", "venture", "openai", "anthropic", "mistral"),
        5: ("enterprise", "deployment", "klarna", "jpmorgan", "servicenow", "production"),
        6: ("developer", "coding", "programming", "ide", "copilot", "github"),
        7: ("quality", "security", "vulnerability", "bug", "dora"),
        8: (
            "productivity",
            "long-term",
            "trajectory",
            "efficiency",
            "accenture",
            "microsoft research",
        ),
    }
|
||||
|
||||
    def parse_narrative(self, narrative_path: str) -> list[dict]:
        """Parse a Markdown narrative for claim/evidence/implication triples.

        Reads the narrative file and extracts bullet points and key
        statements that contain quantitative data, classifying each
        into fact, impact, or act sections and mapping to card numbers.

        Parameters
        ----------
        narrative_path : str
            Path to the Markdown narrative file.

        Returns
        -------
        list[dict]
            List of extracted claim dicts.
        """
        claims: list[dict] = []
        path = Path(narrative_path)

        if not path.exists():
            return claims

        text = path.read_text(encoding="utf-8")

        # Pattern: bullet points that contain quantitative data.
        # Matches lines starting with "- " or "* " that contain at least one
        # digit, including a standalone single digit such as "5 reasons".
        bullet_pattern = re.compile(
            r"^[-*]\s+(.*\d.*)$",
            re.MULTILINE,
        )

        for match in bullet_pattern.finditer(text):
            bullet_text = match.group(1).strip()

            # Extract evidence (numeric data points)
            numbers = re.findall(
                r"\d+(?:,\d{3})*(?:\.\d+)?[%$]?",
                bullet_text,
            )
            evidence = ", ".join(numbers) if numbers else "qualitative"

            # Determine section by context keywords
            section = self._classify_section(bullet_text)

            # Map to card number by topic
            card_number = self._match_topic(bullet_text)

            claim = {
                "card_number": card_number,
                "section": section,
                "claim": bullet_text,
                "evidence": evidence,
                "source": "case_narrative",
            }
            claims.append(claim)

        return claims
|
||||
|
||||
    def extract_from_data(self, data_module_path: str) -> list[dict]:
        """Extract quantified claims from a Python data module.

        Reads module-level list[dict] constants and
        extracts notable data points as claims.

        Parameters
        ----------
        data_module_path : str
            Path to the Python data module file.

        Returns
        -------
        list[dict]
            List of extracted claim dicts.
        """
        claims: list[dict] = []
        path = Path(data_module_path)

        if not path.exists():
            return claims

        text = path.read_text(encoding="utf-8")

        # Match module-level `name: list[dict] = [...]` assignments.
        # Note: only list[dict] annotations are matched, and the non-greedy
        # body stops at the first "]", so nested lists are truncated there.
        var_pattern = re.compile(
            r"^(\w+):\s*list\[dict\].*?=(\[.*?\])",
            re.MULTILINE | re.DOTALL,
        )

        module_name = path.stem

        for match in var_pattern.finditer(text):
            var_name = match.group(1)
            data_str = match.group(2)

            # Extract representative values from the data
            numbers = re.findall(
                r"[\d]+(?:,[\d]{3})*(?:\.[\d]+)?",
                data_str,
            )

            if numbers:
                # Take the first and last numeric values found
                sample = f"Range: {numbers[0]} to {numbers[-1]}"
                card_number = self._match_topic(var_name)

                claim = {
                    "card_number": card_number,
                    "section": "fact",
                    "claim": f"{var_name}: {sample}",
                    "evidence": sample,
                    "source": module_name,
                }
                claims.append(claim)

        return claims
|
||||
|
||||
    def extract_from_data_modules(
        self,
        market_bubbles_module: Optional[str] = None,
        ai_infra_module: Optional[str] = None,
        agent_adoption_module: Optional[str] = None,
        productivity_module: Optional[str] = None,
    ) -> list[dict]:
        """Extract cross-referenced data points from data modules.

        Dynamically imports the specified data modules and extracts
        key numeric values as claims with proper source attribution.

        Parameters
        ----------
        market_bubbles_module : str, optional
            Module path for market_bubbles data.
        ai_infra_module : str, optional
            Module path for ai_infrastructure data.
        agent_adoption_module : str, optional
            Module path for agent_adoption data.
        productivity_module : str, optional
            Module path for productivity data.

        Returns
        -------
        list[dict]
            List of cross-referenced claim dicts from data modules.
        """
        claims: list[dict] = []

        # Market bubbles data -> Card 1
        if market_bubbles_module:
            mod = self._import_module(market_bubbles_module)
            if mod:
                claims.extend(self._extract_market_bubble_claims(mod))

        # AI infrastructure data -> Cards 2, 3
        if ai_infra_module:
            mod = self._import_module(ai_infra_module)
            if mod:
                claims.extend(self._extract_infrastructure_claims(mod))

        # Agent adoption data -> Cards 5, 6, 7
        if agent_adoption_module:
            mod = self._import_module(agent_adoption_module)
            if mod:
                claims.extend(self._extract_adoption_claims(mod))

        # Productivity data -> Cards 5, 8
        if productivity_module:
            mod = self._import_module(productivity_module)
            if mod:
                claims.extend(self._extract_productivity_claims(mod))

        return claims
|
||||
|
||||
    def map_to_cards(self, claims: list[dict]) -> dict:
        """Map a list of claims to card numbers (1-8).

        Parameters
        ----------
        claims : list[dict]
            List of claim dicts to organize.

        Returns
        -------
        dict
            Mapping of card_number -> list of claims for that card.
        """
        card_map: dict[int, list[dict]] = {i: [] for i in range(1, 9)}

        for claim in claims:
            card_num = claim.get("card_number", 1)
            # Clamp to valid range
            card_num = max(1, min(8, card_num))
            card_map[card_num].append(claim)

        return card_map
|
||||
|
||||
    def export_cards(self, cards_path: str) -> dict:
        """Read claims.json and return structured card data.

        Parameters
        ----------
        cards_path : str
            Path to the claims.json file.

        Returns
        -------
        dict
            Parsed card data with metadata.
        """
        path = Path(cards_path)

        if not path.exists():
            return {}

        with open(path, "r", encoding="utf-8") as f:
            return json.load(f)
|
||||
|
||||
    def count_claims(self, cards_data: dict) -> dict:
        """Count claims per card and section.

        Parameters
        ----------
        cards_data : dict
            Parsed cards data from claims.json.

        Returns
        -------
        dict
            Summary counts per card and overall.
        """
        summary: dict[str, Any] = {}
        cards = cards_data.get("cards", {})

        for card_id, card in cards.items():
            fact_count = len(card.get("fact", []))
            impact_count = len(card.get("impact", []))
            act_count = len(card.get("act", []))
            total = fact_count + impact_count + act_count

            summary[f"card_{card_id}"] = {
                "title": card.get("title", f"Card {card_id}"),
                "fact": fact_count,
                "impact": impact_count,
                "act": act_count,
                "total": total,
            }

        summary["total_cards"] = len(cards)
        summary["total_claims"] = sum(
            v["total"] for v in summary.values() if isinstance(v, dict) and "total" in v
        )

        return summary
|
||||
|
||||
# -----------------------------------------------------------------------
|
||||
# Data module extraction helpers
|
||||
# -----------------------------------------------------------------------
|
||||
|
||||
    def _import_module(self, module_path: str) -> Optional[Any]:
        """Dynamically import a Python module from a file path.

        Returns None if the file is missing, no loader is available,
        or the module raises while being executed.
        """
        try:
            path = Path(module_path)
            if not path.exists():
                return None

            spec = importlib.util.spec_from_file_location(path.stem, str(path))
            if spec is None or spec.loader is None:
                return None

            mod = importlib.util.module_from_spec(spec)
            spec.loader.exec_module(mod)
            return mod
        except Exception:
            # Any import-time error is treated as "module unavailable".
            return None
|
||||
|
||||
def _extract_market_bubble_claims(self, mod: Any) -> list[dict]:
|
||||
"""Extract claims from market_bubbles data module."""
|
||||
claims: list[dict] = []
|
||||
|
||||
# Shiller CAPE
|
||||
cape_data = getattr(mod, "shiller_cape", None)
|
||||
cape_meta = getattr(mod, "shiller_cape_meta", {})
|
||||
if cape_data and isinstance(cape_data, list):
|
||||
latest = cape_data[-1] if cape_data else {}
|
||||
mean_val = cape_meta.get("historical_mean", "N/A")
|
||||
peak_val = max(d.get("value", 0) for d in cape_data) if cape_data else 0
|
||||
peak_year = next(
|
||||
(d.get("year", "?") for d in cape_data if d.get("value") == peak_val),
|
||||
"?",
|
||||
)
|
||||
claims.append(
|
||||
{
|
||||
"card_number": 1,
|
||||
"section": "fact",
|
||||
"claim": f"Shiller CAPE current: {latest.get('value', 'N/A')}, "
|
||||
f"historical mean: {mean_val}, peak: {peak_val} "
|
||||
f"(year {peak_year})",
|
||||
"evidence": f"CAPE {latest.get('year', '?')}: "
|
||||
f"{latest.get('value', 'N/A')}",
|
||||
"source": "market_bubbles.shiller_cape",
|
||||
"confidence": cape_meta.get("confidence", "HIGH"),
|
||||
}
|
||||
)
|
||||
|
||||
# Buffett Indicator
|
||||
buffett_data = getattr(mod, "buffett_indicator", None)
|
||||
buffett_meta = getattr(mod, "buffett_indicator_meta", {})
|
||||
if buffett_data and isinstance(buffett_data, list):
|
||||
latest = buffett_data[-1] if buffett_data else {}
|
||||
claims.append(
|
||||
{
|
||||
"card_number": 1,
|
||||
"section": "fact",
|
||||
"claim": f"Buffett Indicator: {latest.get('value', 'N/A')}% "
|
||||
f"(200% danger threshold)",
|
||||
"evidence": f"{latest.get('year', '?')}: "
|
||||
f"{latest.get('value', 'N/A')}%",
|
||||
"source": "market_bubbles.buffett_indicator",
|
||||
"confidence": buffett_meta.get("confidence", "MEDIUM-HIGH"),
|
||||
}
|
||||
)
|
||||
|
||||
# S&P 500 P/E
|
||||
pe_data = getattr(mod, "sp500_pe", None)
|
||||
pe_meta = getattr(mod, "sp500_pe_meta", {})
|
||||
if pe_data and isinstance(pe_data, list):
|
||||
latest = pe_data[-1] if pe_data else {}
|
||||
mean_val = pe_meta.get("historical_mean", "N/A")
|
||||
claims.append(
|
||||
{
|
||||
"card_number": 1,
|
||||
"section": "fact",
|
||||
"claim": f"S&P 500 P/E: {latest.get('value', 'N/A')} "
|
||||
f"(mean: {mean_val})",
|
||||
"evidence": f"{latest.get('year', '?')}: "
|
||||
f"{latest.get('value', 'N/A')}",
|
||||
"source": "market_bubbles.sp500_pe",
|
||||
"confidence": pe_meta.get("confidence", "HIGH"),
|
||||
}
|
||||
)
|
||||
|
||||
# Dividend Yield
|
||||
div_data = getattr(mod, "sp500_dividend_yield", None)
|
||||
div_meta = getattr(mod, "sp500_dividend_yield_meta", {})
|
||||
if div_data and isinstance(div_data, list):
|
||||
latest = div_data[-1] if div_data else {}
|
||||
mean_val = div_meta.get("historical_mean", "N/A")
|
||||
claims.append(
|
||||
{
|
||||
"card_number": 1,
|
||||
"section": "fact",
|
||||
"claim": f"S&P 500 dividend yield: "
|
||||
f"{latest.get('value', 'N/A')}% (mean: {mean_val}%)",
|
||||
"evidence": f"{latest.get('year', '?')}: "
|
||||
f"{latest.get('value', 'N/A')}%",
|
||||
"source": "market_bubbles.sp500_dividend_yield",
|
||||
"confidence": div_meta.get("confidence", "HIGH"),
|
||||
}
|
||||
)
|
||||
|
||||
# Debt ratios
|
||||
debt_data = getattr(mod, "us_debt_ratios", None)
|
||||
if debt_data and isinstance(debt_data, list):
|
||||
latest = debt_data[-1] if debt_data else {}
|
||||
claims.append(
|
||||
{
|
||||
"card_number": 1,
|
||||
"section": "fact",
|
||||
"claim": f"Federal debt/GDP: "
|
||||
f"{latest.get('federal_debt_gdp_percent', 'N/A')}% "
|
||||
f"(household: "
|
||||
f"{latest.get('household_debt_gdp_percent', 'N/A')}%)",
|
||||
"evidence": f"{latest.get('year', '?')}: federal "
|
||||
f"{latest.get('federal_debt_gdp_percent', 'N/A')}%, "
|
||||
f"household "
|
||||
f"{latest.get('household_debt_gdp_percent', 'N/A')}%",
|
||||
"source": "market_bubbles.us_debt_ratios",
|
||||
"confidence": "HIGH",
|
||||
}
|
||||
)
|
||||
|
||||
return claims
|
||||
|
||||
def _extract_infrastructure_claims(self, mod: Any) -> list[dict]:
|
||||
"""Extract claims from ai_infrastructure data module."""
|
||||
claims: list[dict] = []
|
||||
|
||||
# Hyperscaler capex
|
||||
capex_data = getattr(mod, "hyperscaler_capex_annual", None)
|
||||
if capex_data and isinstance(capex_data, list):
|
||||
# Sum capex per year across all entries, then read the 2020 and 2026 totals
|
||||
years = {}
|
||||
for entry in capex_data:
|
||||
year = entry.get("year")
|
||||
if year not in years:
|
||||
years[year] = 0.0
|
||||
years[year] += entry.get("capex_billions", 0)
|
||||
|
||||
y2020 = years.get(2020, 0)
|
||||
y2026 = years.get(2026, 0)
|
||||
claims.append(
|
||||
{
|
||||
"card_number": 2,
|
||||
"section": "fact",
|
||||
"claim": f"Hyperscaler combined capex: "
|
||||
f"${y2020:.1f}B (2020) -> "
|
||||
f"${y2026:.0f}B (2026 projected)",
|
||||
"evidence": f"2020: ${y2020:.1f}B, 2026: ${y2026:.0f}B",
|
||||
"source": "ai_infrastructure.hyperscaler_capex_annual",
|
||||
"confidence": "HIGH",
|
||||
}
|
||||
)
|
||||
|
||||
# AI capex share
|
||||
ai_share = getattr(mod, "hyperscaler_ai_capex_share", None)
|
||||
if ai_share and isinstance(ai_share, dict):
|
||||
latest_year = max(ai_share.keys())
|
||||
share = ai_share[latest_year]
|
||||
claims.append(
|
||||
{
|
||||
"card_number": 2,
|
||||
"section": "fact",
|
||||
"claim": f"AI capex share: "
|
||||
f"{share.get('low', 'N/A')}-{share.get('high', 'N/A')}% "
|
||||
f"of hyperscaler spending in {latest_year}",
|
||||
"evidence": f"{share.get('low')}% to "
|
||||
f"{share.get('high')}%",
|
||||
"source": "ai_infrastructure.hyperscaler_ai_capex_share",
|
||||
"confidence": "MEDIUM",
|
||||
}
|
||||
)
|
||||
|
||||
# NVIDIA revenue
|
||||
nvidia_data = getattr(mod, "nvidia_revenue", None)
|
||||
if nvidia_data and isinstance(nvidia_data, list):
|
||||
first_entry = nvidia_data[0] if nvidia_data else {}
|
||||
last_entry = nvidia_data[-1] if nvidia_data else {}
|
||||
|
||||
# Get data center or compute revenue
|
||||
first_dc = first_entry.get(
|
||||
"data_center_billions",
|
||||
first_entry.get("compute_billions", 0),
|
||||
)
|
||||
last_dc = last_entry.get(
|
||||
"data_center_billions",
|
||||
last_entry.get("compute_billions", 0),
|
||||
)
|
||||
|
||||
claims.append(
|
||||
{
|
||||
"card_number": 2,
|
||||
"section": "fact",
|
||||
"claim": f"NVIDIA data center revenue: "
|
||||
f"${first_dc:.2f}B ({first_entry.get('fiscal_quarter', '?')}) "
|
||||
f"-> ${last_dc:.1f}B "
|
||||
f"({last_entry.get('fiscal_quarter', '?')})",
|
||||
"evidence": f"{first_entry.get('fiscal_quarter', '?')}: "
|
||||
f"${first_dc:.2f}B, "
|
||||
f"{last_entry.get('fiscal_quarter', '?')}: "
|
||||
f"${last_dc:.1f}B",
|
||||
"source": "ai_infrastructure.nvidia_revenue",
|
||||
"confidence": "HIGH",
|
||||
}
|
||||
)
|
||||
|
||||
# Tech layoffs
|
||||
layoffs = getattr(mod, "tech_layoffs", None)
|
||||
layoffs_meta = getattr(mod, "layoffs_meta", {})
|
||||
if layoffs and isinstance(layoffs, list):
|
||||
total_cut = layoffs_meta.get("total_jobs_cut_cumulative", 0)
|
||||
peak_year = layoffs_meta.get("peak_year", "?")
|
||||
peak_cut = layoffs_meta.get("peak_jobs_cut", 0)
|
||||
claims.append(
|
||||
{
|
||||
"card_number": 2,
|
||||
"section": "fact",
|
||||
"claim": f"Tech layoffs: {total_cut:,} cumulative "
|
||||
f"(peak: {peak_cut:,} in {peak_year})",
|
||||
"evidence": f"Peak {peak_year}: {peak_cut:,} jobs",
|
||||
"source": "ai_infrastructure.tech_layoffs",
|
||||
"confidence": "HIGH",
|
||||
}
|
||||
)
|
||||
|
||||
return claims
|
||||
|
||||
def _extract_adoption_claims(self, mod: Any) -> list[dict]:
|
||||
"""Extract claims from agent_adoption data module."""
|
||||
claims: list[dict] = []
|
||||
|
||||
# Developer AI adoption
|
||||
dev_data = getattr(mod, "developer_ai_adoption", None)
|
||||
if dev_data and isinstance(dev_data, list):
|
||||
for entry in dev_data:
|
||||
metric = entry.get("metric", "")
|
||||
value = entry.get("value", 0)
|
||||
source = entry.get("source", "")
|
||||
|
||||
# Card 6: Developer adoption
|
||||
if any(
|
||||
kw in metric
|
||||
for kw in ["copilot", "daily", "use_or_plan", "regular_ai"]
|
||||
):
|
||||
claims.append(
|
||||
{
|
||||
"card_number": 6,
|
||||
"section": "fact",
|
||||
"claim": f"{source}: {metric} = {value}",
|
||||
"evidence": str(value),
|
||||
"source": "agent_adoption.developer_ai_adoption",
|
||||
"confidence": "HIGH",
|
||||
}
|
||||
)
|
||||
|
||||
# Agent survey data
|
||||
survey_data = getattr(mod, "agent_survey_data", None)
|
||||
if survey_data and isinstance(survey_data, dict):
|
||||
for survey_name, metrics in survey_data.items():
|
||||
prod_rate = metrics.get("production", None)
|
||||
if prod_rate is not None:
|
||||
claims.append(
|
||||
{
|
||||
"card_number": 5,
|
||||
"section": "fact",
|
||||
"claim": f"{survey_name}: "
|
||||
f"{prod_rate}% deploying agents in production",
|
||||
"evidence": f"{prod_rate}%",
|
||||
"source": f"agent_adoption.agent_survey_data.{survey_name}",
|
||||
"confidence": "HIGH",
|
||||
}
|
||||
)
|
||||
|
||||
# Code quality issues
|
||||
quality_data = getattr(mod, "code_quality_in_production", None)
|
||||
if quality_data and isinstance(quality_data, list):
|
||||
for entry in quality_data:
|
||||
finding = entry.get("finding", "")
|
||||
confidence = entry.get("confidence", "MEDIUM")
|
||||
claims.append(
|
||||
{
|
||||
"card_number": 7,
|
||||
"section": "fact",
|
||||
"claim": finding,
|
||||
"evidence": entry.get("source", "N/A"),
|
||||
"source": "agent_adoption.code_quality_in_production",
|
||||
"confidence": confidence,
|
||||
}
|
||||
)
|
||||
|
||||
# Failure modes
|
||||
failure_data = getattr(mod, "failure_modes", None)
|
||||
if failure_data and isinstance(failure_data, list):
|
||||
for entry in failure_data:
|
||||
category = entry.get("category", "")
|
||||
rate = entry.get("rate_percent", None)
|
||||
source = entry.get("source", "")
|
||||
|
||||
if rate is not None:
|
||||
claims.append(
|
||||
{
|
||||
"card_number": 8,
|
||||
"section": "fact",
|
||||
"claim": f"{source}: {category} - "
|
||||
f"{rate}% failure/abandonment rate",
|
||||
"evidence": f"{rate}%",
|
||||
"source": "agent_adoption.failure_modes",
|
||||
"confidence": entry.get("confidence", "MEDIUM"),
|
||||
}
|
||||
)
|
||||
|
||||
return claims
|
||||
|
||||
def _extract_productivity_claims(self, mod: Any) -> list[dict]:
|
||||
"""Extract claims from productivity data module."""
|
||||
claims: list[dict] = []
|
||||
|
||||
# Case studies
|
||||
case_data = getattr(mod, "case_studies", None)
|
||||
if case_data and isinstance(case_data, list):
|
||||
for case in case_data:
|
||||
company = case.get("company", "Unknown")
|
||||
confidence = case.get("confidence", "MEDIUM")
|
||||
metrics = case.get("metrics", {})
|
||||
|
||||
# Build metric summary
|
||||
metric_parts = []
|
||||
for k, v in metrics.items():
|
||||
if isinstance(v, (int, float)):
|
||||
metric_parts.append(f"{k}: {v:,}")
|
||||
elif isinstance(v, str):
|
||||
metric_parts.append(f"{k}: {v}")
|
||||
|
||||
metric_str = "; ".join(metric_parts[:5]) if metric_parts else "N/A"
|
||||
|
||||
claims.append(
|
||||
{
|
||||
"card_number": 5,
|
||||
"section": "fact",
|
||||
"claim": f"{company}: {metric_str}",
|
||||
"evidence": metric_str,
|
||||
"source": f"productivity.case_studies ({company})",
|
||||
"confidence": confidence,
|
||||
}
|
||||
)
|
||||
|
||||
# Failure modes
|
||||
failure_data = getattr(mod, "failure_modes", None)
|
||||
if failure_data and isinstance(failure_data, list):
|
||||
for entry in failure_data:
|
||||
category = entry.get("category", "")
|
||||
source = entry.get("source", "")
|
||||
rate = entry.get("rate_percent", None)
|
||||
|
||||
if rate is not None:
|
||||
claims.append(
|
||||
{
|
||||
"card_number": 8,
|
||||
"section": "fact",
|
||||
"claim": f"{source}: {category} - {rate}%",
|
||||
"evidence": entry.get("detail", str(rate)),
|
||||
"source": "productivity.failure_modes",
|
||||
"confidence": entry.get("confidence", "MEDIUM"),
|
||||
}
|
||||
)
|
||||
|
||||
return claims
|
||||
|
||||
# -----------------------------------------------------------------------
|
||||
# Private helpers
|
||||
# -----------------------------------------------------------------------
|
||||
|
||||
    @staticmethod
    def _classify_section(text: str) -> str:
        """Classify a text snippet into fact, impact, or act section."""
        lower = text.lower()
        # Impact keywords are checked first, so they take precedence over act keywords.
        if any(
            kw in lower
            for kw in [
                "risk",
                "impact",
                "threat",
                "consequence",
                "could",
                "would",
                "may lead",
                "potential",
            ]
        ):
            return "impact"
        # Note: these are substring tests, so "act" also matches inside
        # words such as "fact" or "contract".
        if any(
            kw in lower
            for kw in [
                "should",
                "recommend",
                "act",
                "take action",
                "consider",
                "monitor",
                "hedge",
            ]
        ):
            return "act"
        return "fact"
|
||||
|
||||
    @staticmethod
    def _match_topic(text: str) -> int:
        """Return the first card whose topic keywords appear in the text.

        Cards are checked in the iteration order of ``_CARD_TOPICS``;
        card 1 is returned when nothing matches.
        """
        lower = text.lower()
        for card_num, keywords in ClaimExtractor._CARD_TOPICS.items():
            if any(kw in lower for kw in keywords):
                return card_num
        return 1  # Default to card 1
|
||||
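Note on `_classify_section` above: because it tests keywords with `kw in lower`, the keyword `"act"` also matches inside words such as "fact" or "contract". Below is a minimal sketch of a whole-word variant, assuming the same keyword lists. The names `classify_section`, `IMPACT_RE`, `ACT_RE`, and `_compile` are hypothetical, used only for illustration, and are not part of this change set. Whole-word matching has its own trade-off: inflected forms such as "risks" no longer match unless they are listed explicitly.

```python
import re

# Keyword lists copied from _classify_section; the constant names are hypothetical.
IMPACT_KEYWORDS = [
    "risk", "impact", "threat", "consequence",
    "could", "would", "may lead", "potential",
]
ACT_KEYWORDS = [
    "should", "recommend", "act", "take action",
    "consider", "monitor", "hedge",
]


def _compile(keywords):
    """Build one regex that matches any keyword as a whole word or phrase."""
    alternatives = "|".join(re.escape(kw) for kw in keywords)
    return re.compile(rf"\b(?:{alternatives})\b")


IMPACT_RE = _compile(IMPACT_KEYWORDS)
ACT_RE = _compile(ACT_KEYWORDS)


def classify_section(text: str) -> str:
    """Classify text as 'impact', 'act', or 'fact' using whole-word matches."""
    lower = text.lower()
    # Same precedence as the original: impact keywords win over act keywords.
    if IMPACT_RE.search(lower):
        return "impact"
    if ACT_RE.search(lower):
        return "act"
    return "fact"
```

With this variant, `classify_section("This is a fact about contracts")` returns `"fact"`, whereas the substring test classifies the same sentence as `"act"`.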
530
src/battlecards/claims.json
Normal file
@@ -0,0 +1,530 @@
|
||||
{
|
||||
"cards": {
|
||||
"1": {
|
||||
"title": "Market Valuation Extremes",
|
||||
"cluster": "bubble",
|
||||
"summary": "The US stock market is trading at historic valuation extremes that mirror previous bubble periods across multiple metrics.",
|
||||
"fact": [
|
||||
{
|
||||
"claim": "The Shiller CAPE ratio stands at 40.03, more than 2x the historical mean of 17.39 since 1881.",
|
||||
"evidence": "Yale/Shiller data, 1881-2026 (147 annual data points). Historical mean: 17.39. 2026 value: 40.03. Second-highest in 147-year record after 2000 dot-com peak of 43.77.",
|
||||
"source": "Yale/Shiller CAPE dataset, retrieved 2026-06-04",
|
||||
"confidence": "HIGH"
|
||||
},
|
||||
{
|
||||
"claim": "The Buffett Indicator (US equity market cap / GDP) is at 219%, well above the 200% danger threshold.",
|
||||
"evidence": "Composite from CEIC, currentmarketvaluation.com, and thebuffettindicator.com. 2026 value: 219%. 1996 warning level: ~105%. 2000 dot-com peak: 147.38%. Series above 200% since 2024.",
|
||||
"source": "CEIC + currentmarketvaluation.com + thebuffettindicator.com, 2026",
|
||||
"confidence": "MEDIUM-HIGH"
|
||||
},
|
||||
{
|
||||
"claim": "The S&P 500 trailing P/E ratio is 29.6 against a historical mean of 17.9.",
|
||||
"evidence": "multpl.com/Shiller data, 1950-2026. Current 29.6 vs mean 17.9 represents a 65% premium over long-term average. Above 20 for most of the past six years.",
|
||||
"source": "multpl.com/Shiller S&P 500 P/E ratio, 2026-06-04",
|
||||
"confidence": "HIGH"
|
||||
},
|
||||
{
|
||||
"claim": "The S&P 500 dividend yield has fallen to 1.04%, the lowest since the series began in 1950.",
|
||||
"evidence": "multpl.com/Shiller data, 1950-2026. Current: 1.04%. Historical mean: 3.15%. Lowest reading since 1950.",
|
||||
"source": "multpl.com/Shiller dividend yield, 2026-06-04",
|
||||
"confidence": "HIGH"
|
||||
},
|
||||
{
|
||||
"claim": "Federal debt rose from 33% of GDP in 1980 to approximately 122.6% in 2025.",
|
||||
"evidence": "FRED series GFDEGDQ188S. Key inflection points: 1980 (33%), 2007 (61%), 2020 (125%), 2025 (122.6%). Limits monetary policy flexibility during a correction.",
|
||||
"source": "FRED/Macrotrends, 2026-06-04",
|
||||
"confidence": "HIGH"
|
||||
}
|
||||
],
|
||||
"impact": [
|
||||
{
|
||||
"claim": "When the CAPE exceeds 30, subsequent 10-year annualized returns tend to be significantly lower than historical averages.",
|
||||
"evidence": "Dot-com bubble period (CAPE above 40 in 1999-2000) was followed by a 20% decline in nominal terms over the next decade. Current CAPE of 40.03 signals similarly depressed future returns.",
|
||||
"source": "Shiller CAPE historical analysis",
|
||||
"confidence": "HIGH"
|
||||
},
|
||||
{
|
||||
"claim": "The combination of elevated equity valuations and high sovereign debt creates a fragile macroeconomic environment.",
|
||||
"evidence": "Federal debt at 122.6% of GDP constrains government ability to deploy stimulus. If AI bubble corrects sharply, policy tools are limited, potentially amplifying the severity of any correction.",
|
||||
"source": "FRED debt data + macroeconomic analysis",
|
||||
"confidence": "HIGH"
|
||||
},
|
||||
{
|
||||
"claim": "AI spending is amplifying the existing market bubble by driving speculative capital into technology equities.",
|
||||
"evidence": "AI startup valuations (OpenAI $840B, Anthropic $380B) are priced into broader market indices. The narrative of inevitable AI disruption justifies extraordinary valuations across the tech sector.",
|
||||
"source": "CB Insights Q1 2026, market analysis",
|
||||
"confidence": "MEDIUM"
|
||||
}
|
||||
],
|
||||
"act": [
|
||||
{
|
||||
"claim": "Lead with valuation data as the primary signal of bubble conditions.",
|
||||
"evidence": "Multiple converging metrics (CAPE 40.03, Buffett 219%, P/E 29.6, dividend yield 1.04%) all independently point to overvaluation. No single metric is sufficient, but together they paint an unambiguous picture.",
|
||||
"source": "Synthesis of market_bubbles.py datasets A, B, C, D, H",
|
||||
"confidence": "HIGH"
|
||||
},
|
||||
{
|
||||
"claim": "Key question: Is the AI revenue growth actually justifying current market pricing?",
|
||||
"evidence": "The narrative of AI-driven disruption has justified extraordinary valuations. However, the disconnect between price and underlying value remains significant. AI companies collectively have not yet generated revenue commensurate with their combined valuations.",
|
||||
"source": "CB Insights valuation data + revenue analysis",
|
||||
"confidence": "HIGH"
|
||||
},
|
||||
{
|
||||
"claim": "Counter-argument: Dot-com parallel suggests infrastructure built during the bubble will endure.",
|
||||
"evidence": "Internet and telecom bubbles of the 1990s left behind foundational infrastructure (fiber optic cables, cellular networks) that enabled subsequent decades of innovation. The AI infrastructure buildout may follow a similar pattern.",
|
||||
"source": "Historical precedent analysis, Section 4 of narrative",
|
||||
"confidence": "HIGH"
|
||||
}
|
||||
]
|
||||
},
|
||||
"2": {
|
||||
"title": "AI Infrastructure Buildout",
|
||||
"cluster": "bubble",
|
||||
"summary": "Combined hyperscaler capital expenditure has surged tenfold from 2020 to 2026, representing one of the largest capital deployment cycles in technology history.",
|
||||
"fact": [
|
||||
{
|
||||
"claim": "Combined hyperscaler capex grew from $55.3B in 2020 to a projected $605B in 2026.",
|
||||
"evidence": "Microsoft $100B, Alphabet $175-185B, Meta $115-135B, Amazon $200B projected for 2026. Tenfold increase in six years. Q1 2026 already exceeded $130B combined (run rate >$520B annually).",
|
||||
"source": "ValueAddVC, SEC filings, ai_infrastructure.py Dataset E, 2026-06",
|
||||
"confidence": "HIGH"
|
||||
},
|
||||
{
|
||||
"claim": "AI-related capex is estimated at 85-90% of total hyperscaler spending in 2026.",
|
||||
"evidence": "Roughly $514-545B of the projected $605B is devoted to AI infrastructure. Up from 50-60% in 2023.",
|
||||
"source": "ValueAddVC estimates, ai_infrastructure.py hyperscaler_ai_capex_share",
|
||||
"confidence": "MEDIUM"
|
||||
},
|
||||
{
|
||||
"claim": "NVIDIA data center revenue climbed from $1.57B in FY2020 Q1 to $75.2B in FY2027 Q1.",
|
||||
"evidence": "FY2020-Q1: $1.57B. FY2024-Q4: $18.72B. FY2025-Q4: $39.25B. FY2026-Q4: $62.3B. FY2027-Q1 (new segments): compute $60.4B + networking $14.8B + edge $6.4B = $81.62B total. Year-over-year growth decelerating from 364% (2023) to ~83% (2027 projected).",
|
||||
"source": "SEC 10-Q filings, NVIDIA IR, ai_infrastructure.py Dataset F",
|
||||
"confidence": "HIGH"
|
||||
},
|
||||
{
|
||||
"claim": "Tech debt surged to $121B in 2025, approximately four times the five-year average.",
|
||||
"evidence": "Accelerated pace of AI infrastructure deployment has generated significant technical debt through shortcuts, temporary solutions, and deferred maintenance. Creates structural risk for future innovation and security.",
|
||||
"source": "Narrative Section 3, chart 06_tech_debt.png",
|
||||
"confidence": "HIGH"
|
||||
}
|
||||
],
|
||||
"impact": [
|
||||
{
|
||||
"claim": "Massive capital commitment creates an infrastructure overhang regardless of valuation outcomes.",
|
||||
"evidence": "The GPU clusters, data centers, and networking fabric being deployed today will exist regardless of what happens to current valuations. Parallel to telecom and internet infrastructure buildouts of previous eras.",
|
||||
"source": "Narrative Section 4, historical precedent analysis",
|
||||
"confidence": "HIGH"
|
||||
},
|
||||
{
|
||||
"claim": "Diminishing returns are likely as the infrastructure buildout matures.",
|
||||
"evidence": "NVIDIA growth deceleration from 364% to ~83% signals potential plateau. While still representing substantial growth, the rate of acceleration is declining, suggesting the easy-growth phase of infrastructure investment may be ending.",
|
||||
"source": "ai_infrastructure.py Dataset F growth rate analysis",
|
||||
"confidence": "MEDIUM"
|
||||
},
|
||||
{
|
||||
"claim": "The accelerated deployment pace generates compounding technical debt.",
|
||||
"evidence": "$121B tech debt spike represents shortcuts in codebases and systems. Creates structural risk: may slow future innovation, increase vulnerability to security incidents, and amplify correction costs.",
|
||||
"source": "Narrative Section 3, tech debt analysis",
|
||||
"confidence": "HIGH"
|
||||
}
|
||||
],
|
||||
"act": [
|
||||
{
|
||||
"claim": "Question the efficiency of capital allocation given the scale of spending.",
|
||||
"evidence": "$605B in projected 2026 capex with 85-90% devoted to AI infrastructure. The economic justification requires scrutiny: is this level of spending generating proportional returns, or is it driven by competitive anxiety and FOMO?",
|
||||
"source": "ValueAddVC projections + utilization analysis",
|
||||
"confidence": "HIGH"
|
||||
},
|
||||
{
|
||||
"claim": "Compare to dot-com infrastructure buildout for historical context.",
|
||||
"evidence": "Dot-com bubble saw massive investment in fiber optic cables, data centers, and networking infrastructure. Most companies failed, but the infrastructure became the backbone of the digital economy. Similar pattern likely in AI.",
|
||||
"source": "Narrative Section 4, historical precedent",
|
||||
"confidence": "HIGH"
|
||||
}
|
||||
]
|
||||
},
|
||||
"3": {
|
||||
"title": "GPU Utilization Paradox",
|
||||
"cluster": "bubble",
|
||||
"summary": "Approximately $295B has been spent on AI infrastructure at ~5% average GPU utilization, implying ~$280B in idle computing capacity.",
|
||||
"fact": [
|
||||
{
|
||||
"claim": "Over $295B has been spent on AI-related infrastructure at an average GPU utilization rate of approximately 5%.",
|
||||
"evidence": "Aggregate infrastructure spending estimate across hyperscaler capex, enterprise AI purchases, and GPU procurement. 5% utilization rate derived from industry surveys and data center monitoring.",
|
||||
"source": "Narrative Section 3, GPU Utilization Paradox subsection",
|
||||
"confidence": "MEDIUM"
|
||||
},
|
||||
{
|
||||
"claim": "Approximately $280B in computing capacity sits largely idle in data centers worldwide.",
|
||||
"evidence": "$295B total spend at ~5% utilization leaves ~95% idle: $295B x 0.95 = ~$280B effectively wasted. This represents one of the largest capital inefficiencies in recent technology history.",
|
||||
"source": "Narrative Section 3, utilization analysis",
|
||||
"confidence": "MEDIUM"
|
||||
},
|
||||
{
|
||||
"claim": "Underutilization stems from overprovisioning, training-inference imbalance, organizational friction, and economic moat building.",
|
||||
"evidence": "Four primary drivers: (1) companies buying capacity to secure supply rather than for current workloads; (2) GPU clusters optimized for training not efficiently used for inference; (3) enterprises lack talent/processes to deploy effectively; (4) hyperscalers building competitive barriers regardless of economics.",
|
||||
"source": "Narrative Section 3, four-factor analysis",
|
||||
"confidence": "HIGH"
|
||||
}
|
||||
],
|
||||
"impact": [
|
||||
{
|
||||
"claim": "Enormous capital waste undermines the economic case for continued AI infrastructure spending.",
|
||||
"evidence": "$280B in idle capacity represents misallocated capital that could have generated returns elsewhere. If the investment cannot be justified by actual utilization, the economic basis for continued spending becomes increasingly precarious.",
|
||||
"source": "Narrative Section 3, economic analysis",
|
||||
"confidence": "HIGH"
|
||||
},
|
||||
{
|
||||
"claim": "The utilization gap represents a significant ROI crisis for AI infrastructure investors.",
|
||||
"evidence": "5% utilization means 95% of purchased capacity generates no revenue. For infrastructure investors and hyperscalers, this represents an enormous gap between capital deployed and revenue generated.",
|
||||
"source": "GPU utilization analysis + hyperscaler capex data",
|
||||
"confidence": "HIGH"
|
||||
},
|
||||
{
|
||||
"claim": "The GPU utilization paradox is perhaps the clearest single indicator of the bubble.",
|
||||
"evidence": "The infrastructure buildout is being driven more by speculation and competitive anxiety than by genuine demand for computing resources. If demand does not materialize, correction will be severe.",
|
||||
"source": "Narrative Section 3, concluding analysis",
|
||||
"confidence": "HIGH"
|
||||
}
|
||||
],
|
||||
"act": [
|
||||
{
|
||||
"claim": "Highlight the utilization gap as a critical risk indicator.",
|
||||
"evidence": "5% utilization on $295B of infrastructure spending is the single most concrete evidence of overinvestment. This metric cuts through the narrative of inevitable growth and exposes the fundamental disconnect between spending and demand.",
|
||||
"source": "GPU utilization data synthesis",
|
||||
"confidence": "HIGH"
|
||||
},
|
||||
{
|
||||
"claim": "Question the efficiency of AI spending in light of underutilization.",
|
||||
"evidence": "If only 5% of purchased GPU capacity is being utilized, organizations should be examining whether alternative approaches (cloud rental, inference optimization, workload scheduling) would deliver better ROI than outright infrastructure ownership.",
|
||||
"source": "Utilization analysis + industry best practices",
|
||||
"confidence": "HIGH"
|
||||
}
|
||||
]
|
||||
},
|
||||
"4": {
|
||||
"title": "Startup Valuation Disconnect",
|
||||
"cluster": "bubble",
|
||||
"summary": "AI startup valuations have reached extraordinary levels with revenue multiples of 31x-40x, historically unprecedented for pre-profit companies.",
|
||||
"fact": [
|
||||
{
|
||||
"claim": "OpenAI is valued at $840B with a 31x revenue multiple; Anthropic at $380B with 40x revenue.",
|
||||
"evidence": "CB Insights Q1 2026 data. OpenAI: $840B valuation, 31x revenue. Anthropic: $380B, 40x revenue. Perplexity AI: $5.3B, 27x. Scale AI: $14B, 7x. Mistral AI: $8B, 40x.",
|
||||
"source": "CB Insights, Q1 2026, narrative Section 2",
|
||||
"confidence": "MEDIUM"
|
||||
},
|
||||
{
|
||||
"claim": "Revenue multiples of 31x-40x are historically unprecedented for pre-profit companies.",
|
||||
"evidence": "During the dot-com bubble, even the most speculative internet companies rarely sustained revenue multiples above 50x. Those valuations were quickly corrected. AI companies are pricing in multi-decade market dominance assumptions.",
|
||||
"source": "Dot-com historical comparison, narrative Section 2",
|
||||
"confidence": "HIGH"
|
||||
},
|
||||
{
|
||||
"claim": "The AI sector is effectively pricing in the assumption that these companies will dominate a multi-trillion-dollar market for decades.",
|
||||
"evidence": "Combined AI startup valuations exceed $1.2T (OpenAI $840B + Anthropic $380B + others). Current combined revenue is a fraction of this. The implied future revenue trajectory required to justify these valuations is extraordinary.",
|
||||
"source": "CB Insights valuation data + revenue analysis",
|
||||
"confidence": "MEDIUM"
|
||||
}
|
||||
],
|
||||
"impact": [
|
||||
{
|
||||
"claim": "Valuations are fundamentally detached from near-term financial fundamentals.",
|
||||
"evidence": "31x-40x revenue multiples for companies that are not yet profitable represent a complete disconnect between price and value. If growth disappoints even slightly, the repricing could be devastating.",
|
||||
"source": "CB Insights data + financial analysis",
|
||||
"confidence": "HIGH"
|
||||
},
|
||||
{
|
||||
"claim": "Crash risk is elevated if growth projections fail to materialize.",
|
||||
"evidence": "Dot-com companies with similar multiples saw rapid corrections. Pets.com, WebVan, and others lost nearly all their value within months. AI startups face the same risk if they cannot demonstrate sustainable revenue growth.",
|
||||
"source": "Dot-com historical comparison",
|
||||
"confidence": "HIGH"
|
||||
}
|
||||
],
|
||||
"act": [
|
||||
{
|
||||
"claim": "Compare AI startup valuations to dot-com era benchmarks.",
|
||||
"evidence": "1999-2000: internet companies with 50x+ revenue multiples collapsed. 2026: AI companies with 31-40x multiples face similar overvaluation. The historical parallel suggests inevitable correction.",
|
||||
"source": "Dot-com bubble historical data",
|
||||
"confidence": "HIGH"
|
||||
},
|
||||
{
|
||||
"claim": "Highlight the revenue reality against the valuation narrative.",
|
||||
"evidence": "OpenAI's $840B valuation implies annual revenue of ~$27B at 31x multiple. Anthropic's $380B at 40x implies ~$9.5B. Both companies are nowhere near these revenue levels, making current valuations unsustainable without exponential growth.",
|
||||
"source": "Revenue multiple analysis",
|
||||
"confidence": "HIGH"
|
||||
}
|
||||
]
|
||||
},
|
||||
"5": {
|
||||
"title": "Real-World Enterprise Deployment",
|
||||
"cluster": "utility",
|
||||
"summary": "AI agents are moving beyond experimentation into genuine production deployment, with verified productivity gains in specific use cases.",
|
||||
"fact": [
|
||||
{
|
||||
"claim": "Klarna's AI assistant handles 2.5M daily transactions with ~700 FTE equivalent capacity.",
|
||||
"evidence": "LangGraph + LangSmith deployment. 85M active users, 80% reduction in resolution time, 70% task automation. HIGH confidence based on LangChain official documentation.",
|
||||
"source": "LangChain case study, Feb 2025, productivity.py case_studies[0]",
|
||||
"confidence": "HIGH"
|
||||
},
|
||||
{
|
||||
"claim": "JPMorgan COiN processes 12,000 contracts annually, saving ~$150M per year.",
|
||||
"evidence": "Extracts 150 attributes per document with near-zero error rates. Saves approximately 360,000 hours per year (173 FTE equivalent). Launched 2017, widely cited across multiple sources.",
|
||||
"source": "JPMorgan executive quotes, productivity.py case_studies[1]",
|
||||
"confidence": "HIGH"
|
||||
},
|
||||
{
|
||||
"claim": "ServiceNow partner case shows 73% reduction in midnight escalations and $2.3M annual downtime savings.",
|
||||
"evidence": "SnowGeek Solutions (mid-size manufacturer) deploying Now Assist + Agentic AI for IT operations. 65% improvement in MTTR. MEDIUM confidence from partner rather than ServiceNow directly.",
|
||||
"source": "SnowGeek Solutions partner case study, Q4 2025, productivity.py case_studies[2]",
|
||||
"confidence": "MEDIUM"
|
||||
},
|
||||
{
|
||||
"claim": "57.3% of organizations report deploying agents in production with mature engineering practices.",
|
||||
"evidence": "LangChain State of Agent Engineering, Nov-Dec 2025 (1,340 respondents). 89% have observability, 71.5% have full tracing, 75% using multi-model deployments.",
|
||||
"source": "LangChain State of Agent Engineering 2025, agent_adoption.py agent_survey_data",
|
||||
"confidence": "HIGH"
|
||||
}
|
||||
],
|
||||
"impact": [
|
||||
{
|
||||
"claim": "Real ROI exists in specific, well-defined deployments.",
|
||||
"evidence": "Klarna ($60M equivalent), JPMorgan ($150M/year), and ServiceNow ($2.3M/year) demonstrate measurable productivity gains. These case studies represent the leading edge of AI deployment.",
|
||||
"source": "Case study synthesis, productivity.py",
|
||||
"confidence": "HIGH"
|
||||
},
|
||||
{
|
||||
"claim": "Production maturity is accelerating with observability and multi-model strategies.",
|
||||
"evidence": "High rates of observability (89%), full tracing (71.5%), and multi-model deployment (75%) suggest organizations are moving past superficial experimentation toward serious engineering practices.",
|
||||
"source": "LangChain State of Agent Engineering 2025",
|
||||
"confidence": "HIGH"
|
||||
},
|
||||
{
|
||||
"claim": "Productivity gains are measurable and quantifiable.",
|
||||
"evidence": "Concrete metrics: 700 FTE equivalent (Klarna), 173 FTE equivalent (JPMorgan), 73% escalation reduction (ServiceNow). These are not abstract claims but documented operational improvements.",
|
||||
"source": "Case study metrics compilation",
|
||||
"confidence": "HIGH"
|
||||
}
|
||||
],
|
||||
"act": [
|
||||
{
|
||||
"claim": "Use verified case studies as evidence of genuine AI utility.",
|
||||
"evidence": "The Klarna and JPMorgan cases carry HIGH confidence ratings based on publicly documented sources. These represent the most credible evidence of AI productivity gains in production environments.",
|
||||
"source": "productivity.py case_studies meta analysis",
|
||||
"confidence": "HIGH"
|
||||
},
|
||||
{
|
||||
"claim": "Focus on verified metrics rather than vendor self-reports.",
|
||||
"evidence": "Morgan Stanley's 280K developer hours saved claim carries LOW confidence and could not be independently verified. The distinction between verified and unverified claims is critical for honest assessment.",
|
||||
"source": "Narrative Section 5, confidence analysis",
|
||||
"confidence": "HIGH"
|
||||
}
|
||||
]
|
||||
},
|
||||
"6": {
|
||||
"title": "Developer Adoption Reality",
|
||||
"cluster": "utility",
|
||||
"summary": "AI tool adoption among software developers is now pervasive, with 84% using or planning to use AI tools and 22% of merged code being AI-authored.",
|
||||
"fact": [
|
||||
{
|
||||
"claim": "GitHub Copilot has 20M users (4.7M paid) with 90% Fortune 100 adoption.",
|
||||
"evidence": "GitHub all-time users: 20,000,000. Paid subscribers: 4,700,000 (Jan 2026). 90% of Fortune 100 companies have adopted GitHub Copilot.",
|
||||
"source": "GitHub data, agent_adoption.py developer_ai_adoption",
|
||||
"confidence": "HIGH"
|
||||
},
|
||||
{
|
||||
"claim": "84% of developers use or plan to use AI tools; 51% use them daily.",
|
||||
"evidence": "Stack Overflow 2025 (~70,000 respondents): 84% use or plan to use, 51% daily use. JetBrains 2025 (~30,000 respondents): 85% regular AI usage, 62% rely on at least one coding assistant.",
|
||||
"source": "Stack Overflow 2025 + JetBrains 2025 surveys, agent_adoption.py",
|
||||
"confidence": "HIGH"
|
||||
},
|
||||
{
|
||||
"claim": "22% of merged code is AI-authored, with ~30% acceptance rate for Copilot suggestions.",
|
||||
"evidence": "DX DevCycle Q4 2025: 22% of merged code is AI-authored. 91% of active repositories show AI adoption. GitHub Copilot acceptance rate ~30%, with 88% of accepted code retained.",
|
||||
"source": "DX DevCycle Q4 2025, GitHub/Microsoft study, agent_adoption.py",
|
||||
"confidence": "HIGH"
|
||||
},
|
||||
{
|
||||
"claim": "Accenture RCT found measurable productivity improvements with GitHub Copilot.",
|
||||
"evidence": "8.69% increase in PRs per developer, 11% increase in PR merge rate, 84% increase in successful builds. Randomized controlled trial methodology provides empirical grounding.",
|
||||
"source": "Accenture RCT study, agent_adoption.py real_world_productivity_impact",
|
||||
"confidence": "HIGH"
|
||||
}
|
||||
],
|
||||
"impact": [
|
||||
{
|
||||
"claim": "AI tool adoption among developers is real and accelerating.",
|
||||
"evidence": "Multiple independent surveys (Stack Overflow, JetBrains, GitHub, DX DevCycle) all converge on high adoption rates. 91% of active repos show AI adoption. The trend is not a niche phenomenon but industry-wide.",
|
||||
"source": "Multi-source survey convergence",
|
||||
"confidence": "HIGH"
|
||||
},
|
||||
{
|
||||
"claim": "Quality concerns persist despite high adoption rates.",
|
||||
"evidence": "~30% acceptance rate means roughly 70% of AI suggestions are not accepted. 71% of developers do not merge AI code without manual review. 97% use AI tools before company policies allow (shadow IT).",
|
||||
"source": "developer_sentiment data, agent_adoption.py",
|
||||
"confidence": "MEDIUM"
|
||||
}
|
||||
],
|
||||
"act": [
|
||||
{
|
||||
"claim": "Present adoption data honestly with quality caveats.",
|
||||
"evidence": "High adoption (84% of developers) does not equal high trust. The 30% acceptance rate and 71% manual review rate indicate that developers remain skeptical of AI-generated code quality.",
|
||||
"source": "Adoption data + quality metrics synthesis",
|
||||
"confidence": "HIGH"
|
||||
},
|
||||
{
|
||||
"claim": "Acknowledge that AI is an assistive tool, not a replacement for skilled engineering.",
|
||||
"evidence": "Productivity gains are real but bounded. Accenture RCT shows ~9% PR increase, not a 10x improvement. AI excels at code completion, boilerplate, and documentation but cannot replace architecture, debugging, and system design.",
|
||||
"source": "Accenture RCT + narrative Section 5 analysis",
|
||||
"confidence": "HIGH"
|
||||
}
|
||||
]
|
||||
},
|
||||
"7": {
|
||||
"title": "Code Quality and Security Caveats",
|
||||
"cluster": "risk",
|
||||
"summary": "AI-generated code introduces significant security vulnerabilities and quality issues, with 48% of AI-generated code containing potential vulnerabilities.",
|
||||
"fact": [
|
||||
{
|
||||
"claim": "48% of AI-generated code contains potential security vulnerabilities.",
|
||||
"evidence": "Multiple industry analyses. 29.1% of AI-generated Python code contains security weaknesses spanning 43 CWE categories. 24.2% of AI-generated JavaScript code has security weaknesses.",
|
||||
"source": "Academic study of 733 code snippets, agent_adoption.py code_quality_in_production",
|
||||
"confidence": "HIGH"
|
||||
},
|
||||
{
|
||||
"claim": "AI-coauthored pull requests have approximately 1.7x more issues than non-AI PRs.",
|
||||
"evidence": "CodeRabbit / DX DevCycle December 2025 study. AI assistance introduces additional complexity and error surface that human reviewers must contend with.",
|
||||
"source": "CodeRabbit Dec 2025 / DX DevCycle, agent_adoption.py code_quality_in_production",
|
||||
"confidence": "HIGH"
|
||||
},
|
||||
{
|
||||
"claim": "40% of Copilot-generated programs are flagged for insecure code.",
|
||||
"evidence": "GitHub Copilot research. 6.4% secret leakage rate in Copilot repositories — 40% higher than the 4.6% baseline.",
|
||||
"source": "GitHub Copilot research + academic security research, agent_adoption.py",
|
||||
"confidence": "HIGH"
|
||||
},
|
||||
{
|
||||
"claim": "Google DORA 2024 found AI use is associated with a 7.2% drop in delivery stability.",
|
||||
"evidence": "Teams using AI tools experienced less reliable software delivery than those that didn't. Delivery stability is a key metric in DevOps performance.",
|
||||
"source": "Google DORA 2024 report, agent_adoption.py code_quality_in_production",
|
||||
"confidence": "HIGH"
|
||||
}
|
||||
],
|
||||
"impact": [
|
||||
{
|
||||
"claim": "AI-assisted development introduces real security risks in production systems.",
|
||||
"evidence": "When AI-generated code with vulnerabilities is integrated into production, the vulnerabilities propagate through entire architectures. 48% vulnerability rate is not acceptable for critical systems.",
|
||||
"source": "Security vulnerability analysis, narrative Section 5",
|
||||
"confidence": "HIGH"
|
||||
},
|
||||
{
|
||||
"claim": "Long-term technical debt accumulates from AI-generated code integration.",
|
||||
"evidence": "1.7x more issues in AI-coauthored PRs suggests that AI assistance may be introducing complexity that compounds over time. Maintenance burden increases as AI-generated code becomes embedded in legacy systems.",
|
||||
"source": "CodeRabbit study + tech debt analysis",
|
||||
"confidence": "HIGH"
|
||||
},
|
||||
{
|
||||
"claim": "Delivery reliability suffers when teams adopt AI tools without adequate review processes.",
|
||||
"evidence": "7.2% drop in delivery stability is a significant operational impact. Less reliable software delivery increases risk of outages, customer complaints, and security incidents.",
|
||||
"source": "Google DORA 2024 report",
|
||||
"confidence": "HIGH"
|
||||
}
|
||||
],
|
||||
"act": [
|
||||
{
|
||||
"claim": "Acknowledge real risks and recommend cautious adoption with mandatory validation.",
|
||||
"evidence": "48% vulnerability rate and 1.7x more PR issues are not academic concerns — they are production realities. Organizations adopting AI-assisted development must invest in security review processes, code quality gates, and developer training.",
|
||||
"source": "Security risk assessment + industry best practices",
|
||||
"confidence": "HIGH"
|
||||
},
|
||||
{
|
||||
"claim": "AI-generated code should never be deployed without human review and security auditing.",
|
||||
"evidence": "6.4% secret leakage rate (40% higher than baseline) and 43 CWE categories of vulnerabilities demonstrate that AI tools can expose sensitive credentials and introduce systemic security weaknesses.",
|
||||
"source": "Academic security research + GitHub Copilot data",
|
||||
"confidence": "HIGH"
|
||||
}
|
||||
]
|
||||
},
|
||||
"8": {
|
||||
"title": "Long-Term Productivity Trajectory",
|
||||
"cluster": "utility",
|
||||
"summary": "AI-assisted development shows realistic productivity gains of 20-67%, and those gains compound over time despite significant near-term failure rates.",
|
||||
"fact": [
|
||||
{
|
||||
"claim": "Realistic productivity gains range from 20-67% depending on context and use case.",
|
||||
"evidence": "Accenture RCT: 8.69% PR increase, 11% merge rate increase, 84% successful builds increase. Microsoft Research: 20-45% productivity improvement. Broader industry estimates reach up to 67% for specific tasks.",
|
||||
"source": "Accenture RCT, Microsoft Research 2024-2025, agent_adoption.py",
|
||||
"confidence": "HIGH"
|
||||
},
|
||||
{
|
||||
"claim": "95% of corporate AI pilots deliver zero measurable return; only 5% reach production with impact.",
|
||||
"evidence": "MIT Media Lab 2025, based on 300+ initiatives, 52 organizational interviews, and 153 executive surveys. 72% of AI initiatives fail to reach production (McKinsey). 80% overall AI project failure rate (RAND).",
|
||||
"source": "MIT Media Lab 2025, McKinsey 2025, RAND 2025, productivity.py failure_modes",
|
||||
"confidence": "HIGH"
|
||||
},
|
||||
{
|
||||
"claim": "88% report AI adoption, but only 31% are scaling enterprise-wide.",
|
||||
"evidence": "McKinsey State of AI 2025: vast majority stuck in pilot purgatory. 40% of agentic AI projects projected to be canceled by end of 2027 (Gartner). 42% of companies abandoned most AI initiatives in 2025 (S&P Global).",
|
||||
"source": "McKinsey 2025, Gartner prediction, S&P Global 2025",
|
||||
"confidence": "HIGH"
|
||||
},
|
||||
{
|
||||
"claim": "External partnership deployments succeed at ~67% vs ~33% for internal builds.",
|
||||
"evidence": "MIT Media Lab 2025 build-vs-buy analysis. Organizations that partner with external vendors achieve significantly higher success rates than those attempting internal development.",
|
||||
"source": "MIT Media Lab 2025, productivity.py failure_modes",
|
||||
"confidence": "MEDIUM"
|
||||
}
|
||||
],
|
||||
"impact": [
|
||||
{
|
||||
"claim": "AI-assisted development is inevitable, with gains that compound over time.",
|
||||
"evidence": "Despite high failure rates, the 5% of successful pilots demonstrate that AI can deliver transformative productivity improvements. The organizations that succeed build institutional knowledge and practices that compound.",
|
||||
"source": "Narrative Section 5, central thesis analysis",
|
||||
"confidence": "HIGH"
|
||||
},
|
||||
{
|
||||
"claim": "High failure rates indicate AI requires significant investment and patience.",
|
||||
"evidence": "95% pilot failure rate and 80% overall project failure rate underscore that AI adoption is not plug-and-play. Organizations must invest in talent, processes, and security to realize returns.",
|
||||
"source": "Failure mode analysis, MIT Media Lab + RAND data",
|
||||
"confidence": "HIGH"
|
||||
},
|
||||
{
|
||||
"claim": "The infrastructure buildout will outlast the valuation bubble.",
|
||||
"evidence": "Historical precedent from dot-com and telecom bubbles shows that infrastructure built during bubble periods becomes the foundation for transformative innovation. The GPU clusters and data centers will remain valuable even after valuations correct.",
|
||||
"source": "Narrative central thesis, Section 4 historical analysis",
|
||||
"confidence": "HIGH"
|
||||
}
|
||||
],
|
||||
"act": [
|
||||
{
|
||||
"claim": "Frame AI as long-term transformation despite short-term inefficiencies.",
|
||||
"evidence": "The 20-67% productivity gains in successful deployments, combined with the inevitable nature of AI tool adoption (84% of developers), suggest that the long-term trajectory is positive. Short-term failure rates should be viewed as a maturation cost.",
|
||||
"source": "Productivity data + adoption trend synthesis",
|
||||
"confidence": "HIGH"
|
||||
},
|
||||
{
|
||||
"claim": "Invest in real utility rather than speculation, with realistic expectations.",
|
||||
"evidence": "The organizations that succeed are those that separate signal from noise: they focus on well-defined use cases, invest in security review, maintain realistic expectations, and prioritize measurable outcomes over marketing hype.",
|
||||
"source": "Narrative Summary, central recommendations",
|
||||
"confidence": "HIGH"
|
||||
}
|
||||
]
|
||||
}
|
||||
},
|
||||
"metadata": {
|
||||
"extraction_date": "2026-06-04",
|
||||
"source_narrative": "report/case_narrative.md (438 lines, 7 sections)",
|
||||
"source_data_modules": [
|
||||
"src/data/market_bubbles.py",
|
||||
"src/data/ai_infrastructure.py",
|
||||
"src/data/agent_adoption.py",
|
||||
"src/data/productivity.py"
|
||||
],
|
||||
"total_cards": 8,
|
||||
"card_clusters": {
|
||||
"bubble": [1, 2, 3, 4],
|
||||
"utility": [5, 6, 8],
|
||||
"risk": [7]
|
||||
},
|
||||
"confidence_levels": ["HIGH", "MEDIUM", "LOW"],
|
||||
"extraction_method": "ClaimExtractor.parse_narrative + data module cross-reference"
|
||||
}
|
||||
}
|
||||
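The claims file above repeats the same information in two places: each card carries its own `cluster`, while `metadata.card_clusters` and `metadata.total_cards` summarize all cards. A minimal sketch of a consistency check follows. The top-level key holding the cards is not visible in this diff, so `"cards"` is an assumption, and `validate_claims` and the sample data are hypothetical names used only for illustration.

```python
from __future__ import annotations

SECTIONS = ("fact", "impact", "act")


def validate_claims(data: dict, cards_key: str = "cards") -> list[str]:
    """Return human-readable consistency problems (empty list if none).

    `cards_key` is an assumed name for the object that maps card numbers
    to card entries; adjust it to match the real file.
    """
    problems: list[str] = []
    cards = data.get(cards_key, {})
    meta = data.get("metadata", {})
    allowed = set(meta.get("confidence_levels", []))

    if meta.get("total_cards") != len(cards):
        problems.append(
            f"total_cards={meta.get('total_cards')} but {len(cards)} cards present"
        )

    # Map card number -> cluster as declared in metadata.card_clusters.
    declared = {
        str(num): cluster
        for cluster, nums in meta.get("card_clusters", {}).items()
        for num in nums
    }

    for num, card in cards.items():
        if declared.get(num) != card.get("cluster"):
            problems.append(
                f"card {num}: cluster {card.get('cluster')!r} "
                f"!= metadata {declared.get(num)!r}"
            )
        for section in SECTIONS:
            for i, entry in enumerate(card.get(section, [])):
                if entry.get("confidence") not in allowed:
                    problems.append(
                        f"card {num} {section}[{i}]: "
                        f"unknown confidence {entry.get('confidence')!r}"
                    )
    return problems


# Tiny illustrative sample shaped like the file above (not the real data).
sample = {
    "cards": {
        "7": {"cluster": "risk", "fact": [{"confidence": "HIGH"}]},
        "8": {"cluster": "utility", "act": [{"confidence": "MEDIUM"}]},
    },
    "metadata": {
        "total_cards": 2,
        "card_clusters": {"utility": [8], "risk": [7]},
        "confidence_levels": ["HIGH", "MEDIUM", "LOW"],
    },
}

print(validate_claims(sample))
```

A check along these lines could also compare the cluster names here (`utility`, `risk`) with the `value` label used in `_CARD_METADATA` in `generate_deck.py`, since the two files currently use different vocabularies.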
248
src/battlecards/generate_deck.py
Normal file
@@ -0,0 +1,248 @@
|
||||
"""Deck assembly module for battle cards.
|
||||
|
||||
Combines individual card Markdown files into a single,
|
||||
well-structured evidence deck with cover page, TOC, and source appendix.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
from datetime import datetime, timezone
|
||||
from pathlib import Path
|
||||
|
||||
from src.battlecards.card_templates import BattleCard, render_card
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Card metadata for TOC generation
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
_CARD_METADATA = [
|
||||
{
|
||||
"number": 1,
|
||||
"filename": "card_01_market_valuation.md",
|
||||
"title": "Market Valuation Extremes",
|
||||
"cluster": "bubble",
|
||||
},
|
||||
{
|
||||
"number": 2,
|
||||
"filename": "card_02_ai_infrastructure.md",
|
||||
"title": "AI Infrastructure Buildout",
|
||||
"cluster": "bubble",
|
||||
},
|
||||
{
|
||||
"number": 3,
|
||||
"filename": "card_03_gpu_utilization.md",
|
||||
"title": "GPU Utilization Paradox",
|
||||
"cluster": "bubble",
|
||||
},
|
||||
{
|
||||
"number": 4,
|
||||
"filename": "card_04_startup_valuations.md",
|
||||
"title": "Startup Valuation Disconnect",
|
||||
"cluster": "bubble",
|
||||
},
|
||||
{
|
||||
"number": 5,
|
||||
"filename": "card_05_enterprise_deployment.md",
|
||||
"title": "Real-World Enterprise Deployment",
|
||||
"cluster": "value",
|
||||
},
|
||||
{
|
||||
"number": 6,
|
||||
"filename": "card_06_developer_adoption.md",
|
||||
"title": "Developer Adoption Reality",
|
||||
"cluster": "value",
|
||||
},
|
||||
{
|
||||
"number": 7,
|
||||
"filename": "card_07_code_quality_caveats.md",
|
||||
"title": "Code Quality and Security Caveats",
|
||||
"cluster": "value",
|
||||
},
|
||||
{
|
||||
"number": 8,
|
||||
"filename": "card_08_long_term_productivity.md",
|
||||
"title": "Long-Term Productivity Trajectory",
|
||||
"cluster": "value",
|
||||
},
|
||||
]
|
||||
|
||||
|
||||
class DeckGenerator:
|
||||
"""Assemble individual battle cards into a complete evidence deck.
|
||||
|
||||
Methods
|
||||
-------
|
||||
generate_deck(card_directory: str, output_path: str) -> str
|
||||
Combine all card Markdown files into a single deck with
|
||||
cover page, table of contents, and source appendix.
|
||||
"""
|
||||
|
||||
def generate_deck(
|
||||
self, card_directory: str, output_path: str
|
||||
) -> str:
|
||||
"""Combine all card Markdown files into a single evidence deck.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
card_directory : str
|
||||
Path to the directory containing card Markdown files.
|
||||
output_path : str
|
||||
Destination path for the assembled deck Markdown file.
|
||||
|
||||
Returns
|
||||
-------
|
||||
str
|
||||
Absolute path to the generated deck file.
|
||||
"""
|
||||
card_dir = Path(card_directory)
|
||||
output_file = Path(output_path)
|
||||
output_file.parent.mkdir(parents=True, exist_ok=True)
|
||||
|
||||
# Placeholder for collected sources: nothing adds to this set below, so the default appendix is always emitted
|
||||
all_sources: set[str] = set()
|
||||
|
||||
lines: list[str] = []
|
||||
|
||||
# ---- Cover page ----
|
||||
lines.append("# AI Bubble Battle Cards — Evidence Deck")
|
||||
lines.append("")
|
||||
lines.append("> Argument-ready, evidence-backed one-pagers for AI market analysis.")
|
||||
now_str = datetime.now(timezone.utc).strftime("%Y-%m-%d")
|
||||
lines.append(f"> Last updated: {now_str}")
|
||||
lines.append("")
|
||||
|
||||
# ---- Table of Contents ----
|
||||
lines.append("## Table of Contents")
|
||||
lines.append("")
|
||||
lines.append("### Cluster A: The Bubble Exists")
|
||||
lines.append("")
|
||||
for meta in [m for m in _CARD_METADATA if m["cluster"] == "bubble"]:
|
||||
lines.append(
|
||||
f"- [{meta['title']}](cards/{meta['filename']})"
|
||||
)
|
||||
lines.append("")
|
||||
lines.append("### Cluster B: LLMs Are Still Valuable")
|
||||
lines.append("")
|
||||
for meta in [m for m in _CARD_METADATA if m["cluster"] == "value"]:
|
||||
lines.append(
|
||||
f"- [{meta['title']}](cards/{meta['filename']})"
|
||||
)
|
||||
lines.append("")
|
||||
lines.append("---")
|
||||
lines.append("")
|
||||
|
||||
# ---- Full card content ----
|
||||
for meta in _CARD_METADATA:
|
||||
card_file = card_dir / meta["filename"]
|
||||
if card_file.exists():
|
||||
card_content = card_file.read_text(encoding="utf-8")
|
||||
lines.append(card_content)
|
||||
lines.append("")
|
||||
lines.append("---")
|
||||
lines.append("")
|
||||
else:
|
||||
lines.append(f"*Card {meta['number']} ({meta['title']}) not found.*")
|
||||
lines.append("")
|
||||
|
||||
# ---- Source Appendix ----
|
||||
lines.append("## Source Appendix")
|
||||
lines.append("")
|
||||
lines.append("*Primary data sources referenced across all battle cards:*")
|
||||
lines.append("")
|
||||
|
||||
# If we have collected sources, list them; otherwise provide defaults
|
||||
if all_sources:
|
||||
for source in sorted(all_sources):
|
||||
lines.append(f"- {source}")
|
||||
else:
|
||||
lines.append("- Yale/Shiller CAPE data (multpl.com)")
|
||||
lines.append("- FRED economic indicators")
|
||||
lines.append("- World Bank debt & GDP datasets")
|
||||
lines.append("- Industry research reports (2024–2026)")
|
||||
|
||||
lines.append("")
|
||||
|
||||
deck_content = "\n".join(lines)
|
||||
output_file.write_text(deck_content, encoding="utf-8")
|
||||
|
||||
return str(output_file.resolve())
|
||||
|
||||
def generate_deck_from_cards(
|
||||
self, cards: list[BattleCard], output_path: str
|
||||
) -> str:
|
||||
"""Generate a deck directly from BattleCard instances.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
cards : list[BattleCard]
|
||||
List of BattleCard instances to include.
|
||||
output_path : str
|
||||
Destination path for the deck file.
|
||||
|
||||
Returns
|
||||
-------
|
||||
str
|
||||
Absolute path to the generated deck file.
|
||||
"""
|
||||
output_file = Path(output_path)
|
||||
output_file.parent.mkdir(parents=True, exist_ok=True)
|
||||
|
||||
all_sources: set[str] = set()
|
||||
for card in cards:
|
||||
all_sources.update(card.sources)
|
||||
|
||||
lines: list[str] = []
|
||||
|
||||
# Cover page
|
||||
lines.append("# AI Bubble Battle Cards — Evidence Deck")
|
||||
lines.append("")
|
||||
lines.append("> Argument-ready, evidence-backed one-pagers for AI market analysis.")
|
||||
now_str = datetime.now(timezone.utc).strftime("%Y-%m-%d")
|
||||
lines.append(f"> Last updated: {now_str}")
|
||||
lines.append("")
|
||||
|
||||
# TOC
|
||||
lines.append("## Table of Contents")
|
||||
lines.append("")
|
||||
|
||||
bubble_cards = [c for c in cards if c.cluster == "bubble"]
|
||||
value_cards = [c for c in cards if c.cluster == "value"]  # cards in any other cluster are still rendered below but are not listed in the TOC
|
||||
|
||||
if bubble_cards:
|
||||
lines.append("### Cluster A: The Bubble Exists")
|
||||
lines.append("")
|
||||
for card in sorted(bubble_cards, key=lambda c: c.card_number):
|
||||
lines.append(f"- Card {card.card_number}: {card.title}")
|
||||
lines.append("")
|
||||
|
||||
if value_cards:
|
||||
lines.append("### Cluster B: LLMs Are Still Valuable")
|
||||
lines.append("")
|
||||
for card in sorted(value_cards, key=lambda c: c.card_number):
|
||||
lines.append(f"- Card {card.card_number}: {card.title}")
|
||||
lines.append("")
|
||||
|
||||
lines.append("---")
|
||||
lines.append("")
|
||||
|
||||
# Card content
|
||||
for card in sorted(cards, key=lambda c: c.card_number):
|
||||
rendered = render_card(card)
|
||||
lines.append(rendered)
|
||||
lines.append("")
|
||||
lines.append("---")
|
||||
lines.append("")
|
||||
|
||||
# Source appendix
|
||||
lines.append("## Source Appendix")
|
||||
lines.append("")
|
||||
for source in sorted(all_sources):
|
||||
lines.append(f"- {source}")
|
||||
lines.append("")
|
||||
|
||||
deck_content = "\n".join(lines)
|
||||
output_file.write_text(deck_content, encoding="utf-8")
|
||||
|
||||
return str(output_file.resolve())
|
||||
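The unused `safe_title` variable in `generate_deck_from_cards` suggests that in-document TOC links were intended. The sketch below shows one way such links might be built. It assumes each rendered card starts with a heading of the form `Card N: Title` (as in `card_01_market_valuation.md`) and that the Markdown renderer generates GitHub-style heading anchors; neither is confirmed by this diff. `CardStub`, `heading_slug`, and `toc_lines` are hypothetical names, and `CardStub` stands in for `BattleCard` with only the fields used here.

```python
import re
from dataclasses import dataclass


@dataclass
class CardStub:
    """Hypothetical stand-in for BattleCard with only the fields used here."""

    card_number: int
    title: str
    cluster: str


def heading_slug(heading: str) -> str:
    """Approximate a GitHub-style anchor: lowercase, drop punctuation, hyphenate spaces."""
    cleaned = re.sub(r"[^\w\- ]", "", heading.lower())
    return cleaned.strip().replace(" ", "-")


def toc_lines(cards, cluster_titles):
    """Build TOC lines grouped by cluster, linking each entry to its card heading."""
    lines = []
    for cluster, heading in cluster_titles.items():
        members = sorted(
            (c for c in cards if c.cluster == cluster),
            key=lambda c: c.card_number,
        )
        if not members:
            continue  # skip clusters that have no cards
        lines += [f"### {heading}", ""]
        for card in members:
            label = f"Card {card.card_number}: {card.title}"
            lines.append(f"- [{label}](#{heading_slug(label)})")
        lines.append("")
    return lines


demo_cards = [
    CardStub(5, "Real-World Enterprise Deployment", "value"),
    CardStub(1, "Market Valuation Extremes", "bubble"),
]
demo_titles = {
    "bubble": "Cluster A: The Bubble Exists",
    "value": "Cluster B: LLMs Are Still Valuable",
}
print("\n".join(toc_lines(demo_cards, demo_titles)))
```

Driving the grouping from a mapping of cluster names to headings means a new cluster only needs a new dictionary entry. Anchor rules differ between renderers, so the slug function would need checking against whichever tool displays the deck.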
413
src/battlecards/mini_charts.py
Normal file
@@ -0,0 +1,413 @@
|
||||
"""Mini-chart engine for battle card embeddings."""
|
||||
import argparse
|
||||
import matplotlib
|
||||
matplotlib.use("Agg")
|
||||
import matplotlib.pyplot as plt
|
||||
from matplotlib.path import Path as MPLPath
|
||||
# Python 3.14 matplotlib patch (required): deepcopy of a Path returns the same instance instead of a copy
|
||||
_orig = MPLPath.__deepcopy__
|
||||
def _safe_deepcopy(self, memo):
|
||||
if id(self) in memo:
|
||||
return memo[id(self)]
|
||||
memo[id(self)] = self
|
||||
return self
|
||||
MPLPath.__deepcopy__ = _safe_deepcopy
|
||||
|
||||
from pathlib import Path
|
||||
|
||||
from src.utils.styling import (
|
||||
get_theme,
|
||||
BUBBLE_ZONE,
|
||||
AI_SPEND,
|
||||
REVENUE,
|
||||
WARNING_ZONE,
|
||||
GRAY_DARK,
|
||||
GRAY_MEDIUM,
|
||||
WHITE,
|
||||
)
|
||||
|
||||
MINI_FIGURE_SIZE = (5, 3)
|
||||
MINI_DPI = 300
|
||||
MIN_LABEL_FONT_SIZE = 9
|
||||
MIN_ANNOTATION_FONT_SIZE = 11
|
||||
MIN_TITLE_FONT_SIZE = 13
|
||||
|
||||
|
||||
class MiniChartEngine:
|
||||
"""Engine for generating compact, themed mini-charts for battle cards."""
|
||||
|
||||
def _init_figure(self, title: str):
|
||||
"""Create a themed figure and axes."""
|
||||
plt.rcParams.update(get_theme())
|
||||
fig, ax = plt.subplots(figsize=MINI_FIGURE_SIZE)
|
||||
fig.set_facecolor(WHITE)
|
||||
ax.set_facecolor(WHITE)
|
||||
ax.spines["top"].set_visible(False)
|
||||
ax.spines["right"].set_visible(False)
|
||||
ax.spines["left"].set_color("#cccccc")
|
||||
ax.spines["bottom"].set_color("#cccccc")
|
||||
ax.set_title(title, fontsize=MIN_TITLE_FONT_SIZE, fontweight="bold", pad=8)
|
||||
return fig, ax
|
||||
|
||||
def _save(self, fig, save_path: str) -> str:
|
||||
"""Save figure with tight layout."""
|
||||
Path(save_path).parent.mkdir(parents=True, exist_ok=True)
|
||||
fig.savefig(save_path, dpi=MINI_DPI, bbox_inches="tight", pad_inches=0.15)
|
||||
plt.close(fig)
|
||||
return save_path
|
||||
|
||||
def generate_line_trend(
|
||||
self,
|
||||
years,
|
||||
values,
|
||||
title,
|
||||
save_path,
|
||||
highlight_year=None,
|
||||
highlight_value=None,
|
||||
color=GRAY_DARK,
|
||||
secondary_color=None,
|
||||
):
|
||||
"""Single line chart showing trend over time."""
|
||||
fig, ax = self._init_figure(title)
|
||||
|
||||
ax.plot(years, values, color=color, linewidth=2, marker="o", markersize=5)
|
||||
|
||||
# Highlight year
|
||||
if highlight_year is not None:
|
||||
idx = years.index(highlight_year) if highlight_year in years else len(years) - 1
|
||||
ax.axvline(
|
||||
x=highlight_year,
|
||||
color=BUBBLE_ZONE,
|
||||
linestyle="--",
|
||||
linewidth=1,
|
||||
alpha=0.7,
|
||||
)
|
||||
if highlight_value is not None:
|
||||
ax.annotate(
|
||||
str(highlight_value),
|
||||
xy=(highlight_year, values[idx]),
|
||||
xytext=(5, 10),
|
||||
textcoords="offset points",
|
||||
fontsize=MIN_ANNOTATION_FONT_SIZE,
|
||||
fontweight="bold",
|
||||
color=BUBBLE_ZONE,
|
||||
)
|
||||
|
||||
ax.tick_params(axis="both", labelsize=MIN_LABEL_FONT_SIZE)
|
||||
ax.grid(True, axis="y", alpha=0.4)
|
||||
plt.tight_layout()
|
||||
return self._save(fig, save_path)
|
||||
|
||||
def generate_horizontal_bar(
|
||||
self,
|
||||
categories,
|
||||
values,
|
||||
title,
|
||||
save_path,
|
||||
colors=None,
|
||||
value_labels=None,
|
||||
max_value=None,
|
||||
):
|
||||
"""Horizontal bar chart for comparing categories."""
|
||||
fig, ax = self._init_figure(title)
|
||||
|
||||
if colors is None:
|
||||
colors = [AI_SPEND] * len(categories)
|
||||
|
||||
y_pos = range(len(categories))
|
||||
bars = ax.barh(
|
||||
y_pos, values, color=colors[: len(categories)], height=0.55
|
||||
)
|
||||
|
||||
# Value labels on bars
|
||||
if value_labels is not None:
|
||||
for bar, label in zip(bars, value_labels):
|
||||
ax.text(
|
||||
bar.get_width() + max(values) * 0.01,
|
||||
bar.get_y() + bar.get_height() / 2,
|
||||
str(label),
|
||||
va="center",
|
||||
fontsize=MIN_LABEL_FONT_SIZE,
|
||||
color=GRAY_DARK,
|
||||
)
|
||||
|
||||
ax.set_yticks(list(y_pos))
|
||||
ax.set_yticklabels(categories, fontsize=MIN_LABEL_FONT_SIZE)
|
||||
ax.tick_params(axis="x", labelsize=MIN_LABEL_FONT_SIZE)
|
||||
|
||||
if max_value is not None:
|
||||
ax.set_xlim(0, max_value * 1.15)
|
||||
|
||||
ax.grid(True, axis="x", alpha=0.3)
|
||||
plt.tight_layout()
|
||||
return self._save(fig, save_path)
|
||||
|
||||
def generate_utilization_bar(
|
||||
self,
|
||||
label,
|
||||
percentage,
|
||||
title,
|
||||
save_path,
|
||||
context_text=None,
|
||||
):
|
||||
"""Single horizontal bar showing utilization rate."""
|
||||
fig, ax = self._init_figure(title)
|
||||
|
||||
# Color coding based on percentage
|
||||
if percentage > 50:
|
||||
bar_color = REVENUE # green
|
||||
elif percentage >= 20:
|
||||
bar_color = WARNING_ZONE # orange
|
||||
else:
|
||||
bar_color = BUBBLE_ZONE # red
|
||||
|
||||
# Background track
|
||||
ax.barh(
|
||||
0,
|
||||
100,
|
||||
height=0.6,
|
||||
color="#ecf0f1",
|
||||
edgecolor="#cccccc",
|
||||
linewidth=0.5,
|
||||
)
|
||||
|
||||
# Filled utilization bar
|
||||
bar = ax.barh(0, percentage, height=0.6, color=bar_color)
|
||||
|
||||
# Large percentage annotation on the bar
|
||||
ax.text(
|
||||
percentage / 2,
|
||||
0,
|
||||
f"{percentage:.0f}%",
|
||||
ha="center",
|
||||
va="center",
|
||||
fontsize=MIN_ANNOTATION_FONT_SIZE + 3,
|
||||
fontweight="bold",
|
||||
color=WHITE if percentage > 10 else GRAY_DARK,
|
||||
)
|
||||
|
||||
# Label
|
||||
ax.set_yticks([])
|
||||
ax.set_xlim(0, 100)
|
||||
ax.text(
|
||||
0.02,
|
||||
-0.25,
|
||||
str(label),
|
||||
transform=ax.transData,
|
||||
fontsize=MIN_LABEL_FONT_SIZE,
|
||||
color=GRAY_DARK,
|
||||
)
|
||||
|
||||
# Context text below the bar
|
||||
if context_text is not None:
|
||||
ax.text(
|
||||
50,
|
||||
-0.55,
|
||||
context_text,
|
||||
ha="center",
|
||||
fontsize=MIN_LABEL_FONT_SIZE,
|
||||
color=GRAY_MEDIUM,
|
||||
style="italic",
|
||||
)
|
||||
|
||||
ax.set_ylim(-0.9, 0.5)
|
||||
ax.axis("off")
|
||||
plt.tight_layout()
|
||||
return self._save(fig, save_path)
|
||||
|
    def generate_comparison_bar(
        self,
        categories,
        values_left,
        values_right,
        title,
        save_path,
        label_left=None,
        label_right=None,
        colors=None,
    ):
        """Side-by-side grouped bar chart for comparisons."""
        fig, ax = self._init_figure(title)

        if colors is None:
            color_left = AI_SPEND
            color_right = GRAY_MEDIUM
        elif len(colors) >= 2:
            color_left = colors[0]
            color_right = colors[1]
        else:
            color_left = colors[0] if len(colors) == 1 else AI_SPEND
            color_right = GRAY_MEDIUM

        x = list(range(len(categories)))
        width = 0.35

        bars_left = ax.bar(
            [p - width / 2 for p in x],
            values_left,
            width,
            label=label_left or "Left",
            color=color_left,
        )
        bars_right = ax.bar(
            [p + width / 2 for p in x],
            values_right,
            width,
            label=label_right or "Right",
            color=color_right,
        )

        ax.set_xticks(x)
        ax.set_xticklabels(categories, fontsize=MIN_LABEL_FONT_SIZE)
        ax.tick_params(axis="y", labelsize=MIN_LABEL_FONT_SIZE)
        ax.legend(
            loc="upper center",
            bbox_to_anchor=(0.5, -0.12),
            ncol=2,
            fontsize=MIN_LABEL_FONT_SIZE,
        )
        ax.grid(True, axis="y", alpha=0.3)
        plt.tight_layout()
        return self._save(fig, save_path)


# ---------------------------------------------------------------------------
# Standalone convenience functions (for use by card workers)
# ---------------------------------------------------------------------------

def create_cape_chart(years, values, save_path):
    """Create historical CAPE trend with current value highlighted."""
    engine = MiniChartEngine()
    engine.generate_line_trend(
        years=years,
        values=values,
        title="CAPE Ratio Trend",
        save_path=save_path,
        highlight_year=years[-1],
        highlight_value=values[-1],
        color=GRAY_DARK,
    )


def create_capex_chart(companies, values, save_path):
    """Create hyperscaler capex comparison bar chart."""
    engine = MiniChartEngine()
    colors_list = [AI_SPEND, WARNING_ZONE, REVENUE, GRAY_MEDIUM, BUBBLE_ZONE]
    engine.generate_horizontal_bar(
        categories=companies,
        values=values,
        title="Hyperscaler AI Capex",
        save_path=save_path,
        colors=colors_list[: len(companies)],
        value_labels=[f"${v}B" for v in values],
        max_value=max(values) * 1.3,
    )


def create_utilization_chart(percentage, context_text, save_path):
    """Create GPU utilization gauge chart."""
    engine = MiniChartEngine()
    engine.generate_utilization_bar(
        label="GPU Utilization",
        percentage=percentage,
        title="Current GPU Utilization",
        save_path=save_path,
        context_text=context_text,
    )


def create_vulnerability_chart(ai_rate, non_ai_rate, save_path):
    """Create AI vs non-AI code vulnerability comparison."""
    engine = MiniChartEngine()
    engine.generate_comparison_bar(
        categories=["Code Vulnerability Rate"],
        values_left=[ai_rate],
        values_right=[non_ai_rate],
        title="AI vs Non-AI Code Vulnerability",
        save_path=save_path,
        label_left="AI-Generated Code",
        label_right="Human-Generated Code",
        colors=[BUBBLE_ZONE, GRAY_MEDIUM],
    )


# ---------------------------------------------------------------------------
# CLI test entry point
# ---------------------------------------------------------------------------

def _run_tests() -> None:
    """Generate 4 test charts demonstrating each chart type."""
    base_dir = Path("output/battlecards/charts")
    engine = MiniChartEngine()
    base_dir.mkdir(parents=True, exist_ok=True)

    # Test 1: Line trend — CAPE-like trend with 2026 highlighted
    years = [2018, 2019, 2020, 2021, 2022, 2023, 2024, 2025, 2026]
    cape_values = [18.2, 28.5, 31.1, 25.3, 22.7, 33.8, 37.2, 41.5, 44.8]
    path1 = engine.generate_line_trend(
        years=years,
        values=cape_values,
        title="S&P 500 CAPE Ratio Trend",
        save_path=str(base_dir / "test_line_trend.png"),
        highlight_year=2026,
        highlight_value="44.8x",
        color=GRAY_DARK,
    )
    print(f" [1/4] Line trend: {path1}")

    # Test 2: Horizontal bar — Hyperscaler capex comparison
    companies = ["Microsoft", "Amazon", "Google", "Meta"]
    capex_values = [234, 118, 90, 72]
    capex_colors = ["#00a4ef", "#ff9900", "#4285f4", "#1877f2"]
    path2 = engine.generate_horizontal_bar(
        categories=companies,
        values=capex_values,
        title="2025 AI Infrastructure Capex ($B)",
        save_path=str(base_dir / "test_horizontal_bar.png"),
        colors=capex_colors,
        value_labels=[f"${v}B" for v in capex_values],
        max_value=280,
    )
    print(f" [2/4] Horizontal bar: {path2}")

    # Test 3: Utilization bar — GPU utilization gauge at 5%
    path3 = engine.generate_utilization_bar(
        label="Enterprise Average",
        percentage=5.0,
        title="GPU Utilization Rate",
        save_path=str(base_dir / "test_utilization_bar.png"),
        context_text="Most GPUs sit idle — 95% capacity wasted",
    )
    print(f" [3/4] Utilization bar: {path3}")

    # Test 4: Comparison bar — AI vs non-AI vulnerability rates
    path4 = engine.generate_comparison_bar(
        categories=["Vulnerability Rate (%)"],
        values_left=[47],
        values_right=[12],
        title="Code Vulnerability Comparison",
        save_path=str(base_dir / "test_comparison_bar.png"),
        label_left="AI-Generated Code",
        label_right="Human-Generated Code",
        colors=[BUBBLE_ZONE, GRAY_MEDIUM],
    )
    print(f" [4/4] Comparison bar: {path4}")

    print("\nAll 4 test charts generated successfully.")


if __name__ == "__main__":
    parser = argparse.ArgumentParser(
        description="Battle card mini-chart engine"
    )
    parser.add_argument(
        "--test",
        action="store_true",
        help="Generate 4 test charts demonstrating each chart type",
    )
    args = parser.parse_args()

    if args.test:
        _run_tests()
    else:
        parser.print_help()

96
src/battlecards/research_findings.md
Normal file
@@ -0,0 +1,96 @@
# Supplementary Research Findings — Battle Cards

> Research conducted for Phase 2.2: Current evidence (Q1-Q2 2026) to supplement existing narrative data.

## Card 1: Market Valuation Extremes
- No supplementary findings listed; this card relies primarily on historical data modules.

## Card 2: AI Infrastructure Buildout
### AWS H200 Price Increase (January 2026)
- **Data:** AWS raised H200 prices 15% in January 2026 — first compute price increase in 20 years
- **Details:** p5e.48xlarge (8 H200s) now $39.80/hour; idle H100 at ~$6.88/GPU-hour
- **Source:** Data Center Dynamics, January 2026
- **Confidence:** HIGH

## Card 3: GPU Utilization Paradox
### Cast AI 2026 Kubernetes Report
- **Data:** 5% average GPU utilization across tens of thousands of production clusters; 8% CPU; 20% memory
- **Source:** Cast AI 2026 State of Kubernetes Optimization Report
- **Confidence:** HIGH
### Optimized Clusters
- **Data:** Documented case of 49% GPU utilization across 136 H200s (10x improvement)
- **Source:** Cast AI 2026 report
- **Confidence:** HIGH
### Market Pivot to Efficiency
- **Data:** "Cost per inference/TCO" rose from 34% to 41% as top priority (Q1 2026)
- **Source:** VentureBeat Q1 2026 AI Infrastructure & Compute Market Tracker
- **Confidence:** MEDIUM

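To make the utilization gap concrete for card writers, the following is a rough sketch of what an hour of GPU time effectively costs once idle capacity is counted. It reuses the p5e.48xlarge price from Card 2 and the 5% and 49% utilization figures above. Applying the fleet-wide 5% average to an H200 instance is an assumption made only for illustration, and the function name is hypothetical.

```python
# Figures quoted in the findings; combining them is an illustrative assumption.
INSTANCE_PRICE_PER_HOUR = 39.80  # p5e.48xlarge, USD per hour (Card 2)
GPUS_PER_INSTANCE = 8            # H200s per p5e.48xlarge (Card 2)


def effective_cost_per_useful_gpu_hour(price_per_hour, gpu_count, utilization):
    """List price per GPU-hour divided by the fraction of time the GPU is busy."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return price_per_hour / gpu_count / utilization


list_price = INSTANCE_PRICE_PER_HOUR / GPUS_PER_INSTANCE
average = effective_cost_per_useful_gpu_hour(
    INSTANCE_PRICE_PER_HOUR, GPUS_PER_INSTANCE, 0.05
)
optimized = effective_cost_per_useful_gpu_hour(
    INSTANCE_PRICE_PER_HOUR, GPUS_PER_INSTANCE, 0.49
)

print(f"List price per GPU-hour:      ${list_price:.2f}")
print(f"Effective at 5% utilization:  ${average:.2f}")
print(f"Effective at 49% utilization: ${optimized:.2f}")
print(f"Improvement factor:           {average / optimized:.1f}x")
```

The improvement factor depends only on the two utilization rates (0.49 / 0.05), which is where the roughly 10x figure in the Optimized Clusters entry comes from.
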
## Card 4: Startup Valuation Disconnect
### Anthropic Funding Round (May 2026)
- **Data:** $900B valuation (~180x estimated revenue); 500+ customers paying $1M+/year
- **Source:** aibusiness.vc, May 8, 2026
- **Confidence:** MEDIUM (reported, not officially confirmed)
### OpenAI ARR
- **Data:** $25B ARR; IPO projected at $300-400B (~12-16x revenue)
- **Source:** aibusiness.vc, May 8, 2026
- **Confidence:** MEDIUM (widely reported but not officially confirmed)

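The multiples quoted above can be cross-checked with simple division. The sketch below uses only the reported figures from this card; the helper names are hypothetical, and the inputs are themselves unconfirmed estimates.

```python
def implied_revenue(valuation_b, multiple):
    """Revenue (in $B) implied by a valuation and a revenue multiple."""
    return valuation_b / multiple


def revenue_multiple(valuation_b, revenue_b):
    """Valuation expressed as a multiple of revenue."""
    return valuation_b / revenue_b


# Anthropic: $900B at ~180x implies roughly $5B of estimated revenue.
anthropic_implied_revenue_b = implied_revenue(900, 180)  # 5.0

# OpenAI: $300-400B projected IPO range on $25B ARR.
openai_low_multiple = revenue_multiple(300, 25)   # 12.0
openai_high_multiple = revenue_multiple(400, 25)  # 16.0

print(anthropic_implied_revenue_b, openai_low_multiple, openai_high_multiple)
```
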
## Card 5: Enterprise Deployment
### Agentic AI ROI Study (May 2026)
- **Data:** Average ROI of 171% across 12 documented deployments; 74% achieved ROI within first year
- **Source:** beri.net, May 19, 2026
- **Confidence:** MEDIUM (aggregated case study)
### Salesforce Legal AI
- **Data:** $5M+ saved in outside counsel costs; Agentforce cumulative savings exceed $100M
- **Source:** Salesforce official metrics; beri.net May 2026
- **Confidence:** HIGH (vendor-published)
### MIT NANDA GenAI Divide (July 2025)
- **Data:** 95% of enterprise AI pilots deliver zero measurable P&L impact; 42% of companies abandoned the majority of their AI projects
- **Source:** MIT NANDA report, Fortune August 2025
- **Confidence:** HIGH (academically-backed)

## Card 6: Developer Adoption
### GitHub Copilot Scale (July 2025 - June 2026)
- **Data:** 20M cumulative users, 4.7M paid subscribers, $2B+ ARR, deployed at 90% of the Fortune 100
- **Source:** Microsoft CEO announcement July 2025; aibusiness.vc June 2026
- **Confidence:** HIGH (official Microsoft figures)
### Copilot Code Generation
- **Data:** 46% of code for active users is AI-generated; task completion 55% faster; PR time reduced 75%
- **Source:** GitHub research; corporatebloggingtips.com May 2026
- **Confidence:** HIGH (GitHub's own research)
### Cursor Valuation
- **Data:** $29.3B valuation; ~$500M ARR; fastest-growing AI coding tool
- **Source:** aibusiness.vc 2026
- **Confidence:** MEDIUM

## Card 7: Code Quality Caveats
### Python Security Weaknesses
- **Data:** 29.1% of Copilot-generated Python contains potential security weaknesses
- **Source:** GitHub/Microsoft research; corporatebloggingtips.com May 2026
- **Confidence:** MEDIUM
### AI Tool Security Incidents
- **Data:** 88% of enterprises reported AI agent security incidents in last 12 months
- **Source:** VentureBeat survey 2026
- **Confidence:** MEDIUM
### Quality Improvements
- **Data:** Code readability +3.62%, reliability +2.94%, maintainability +2.47%, conciseness +4.16%
- **Source:** GitHub research; Microsoft Research
- **Confidence:** MEDIUM (modest improvements)

## Card 8: Long-Term Productivity
### Accenture RCT Results
- **Data:** 8.69% PR increase, 84% successful build rate improvement, 46% faster task completion
- **Source:** Accenture randomized controlled trial
- **Confidence:** HIGH (RCT methodology)
### Human-AI Collaboration
- **Data:** Combined human-AI pair produces better code than either alone (consistent across GitHub, MS Research, independent studies)
- **Source:** Multiple independent research organizations
- **Confidence:** HIGH

## Key Caveats for Card Writers
1. **ROI data is skewed**: the 171% average ROI (12 documented deployments) and the 95% of pilots with zero measurable P&L impact can both be true, because the top 5% of projects drive the averages
2. **Klarna partially reversed**: Bloomberg May 2025 reported Klarna restored human customer service for complex queries
3. **Valuation figures are estimates**: Anthropic $900B and OpenAI $25B ARR are reported, not confirmed
4. **GPU data may have vendor bias**: Cast AI sells GPU optimization tools
5. **Developer surveys have selection bias**: GitHub data captures active users, not abandoners
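
The first caveat can be illustrated with a toy portfolio. The ROI values below are invented purely to reproduce the arithmetic and are not drawn from either study: if 95 of 100 projects return nothing and 5 return very large amounts, the mean can still land at 171% while the typical project shows zero.

```python
from statistics import mean, median

# Hypothetical ROI figures (percent) for 100 projects: 95 with no measurable
# return and 5 outliers. These values are illustrative, not study data.
roi_percent = [0.0] * 95 + [3420.0] * 5

share_zero = sum(1 for r in roi_percent if r == 0) / len(roi_percent)

print(f"Mean ROI:         {mean(roi_percent):.0f}%")    # 171%
print(f"Median ROI:       {median(roi_percent):.0f}%")  # 0%
print(f"Share with 0 ROI: {share_zero:.0%}")            # 95%
```

For card copy, this suggests quoting the average together with the share of projects that saw no return, rather than either number alone.
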