💡 Idea Generation · Identification Strategy

The Claude Code Shock

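Five Nobel-track research questions, and how to test them with the global shock of 2025-11-24

📌 Read this first

📱 App Market page: that page explains why this is a rare empirical setting. This page covers which specific economic questions the setting can answer, and how to test them.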

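⚡ What is the shock?

Around 2025-11-24, Anthropic and OpenAI released, within days of each other, the first agentic coding tools that can genuinely take a project from idea to shipped product:

2025-11-19: OpenAI releases GPT-5.1-Codex-Max
2025-11-24: Anthropic releases Claude Opus 4.5 and a Claude Code platform upgrade
2025-12-03: Anthropic's Claude Code passes a $1B commercial run-rate (evidence of adoption)

This is not a single-company event but a synchronized release of the frontier agentic coding stack. Features of the shock:

✅ Global and simultaneous: there is no "untreated region" to serve as a control, so identification has to come from cross-category variation rather than geography
✅ Precisely dated: the event window does not have to be estimated
✅ Independently validated: GitHub shows AI-coding repos up 56% in the same window (an independent data source)
✅ Separable from Apple's policy windows: placebo tests on the Apple policy dates (10/29, 11/13) show no effect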
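
The placebo logic in the last bullet can be sketched as follows. This is a minimal illustration on made-up weekly launch counts, not the paper's actual specification: the "treated"/"control" genre groups, the numbers, and the helper `did_estimate` are all hypothetical. It computes a simple 2×2 difference-in-differences at the shock date and again at a placebo date, where the placebo run uses only pre-shock weeks.

```python
from datetime import date, timedelta
from statistics import mean

SHOCK = date(2025, 11, 24)
PLACEBO = date(2025, 10, 29)  # one of the Apple policy dates named above


def did_estimate(rows, cutoff, end=None):
    """2x2 DiD: (treated post - treated pre) - (control post - control pre).

    rows: iterable of (day, group, launches), group in {"treated", "control"}.
    end:  optional exclusive upper bound on day, used to keep a placebo
          window free of post-shock observations.
    """
    cells = {(g, p): [] for g in ("treated", "control") for p in (False, True)}
    for day, group, launches in rows:
        if end is not None and day >= end:
            continue
        cells[(group, day >= cutoff)].append(launches)
    avg = {key: mean(values) for key, values in cells.items()}
    return (avg[("treated", True)] - avg[("treated", False)]) - (
        avg[("control", True)] - avg[("control", False)]
    )


# Synthetic weekly panel (hypothetical numbers): "treated" genres jump at the
# shock date, "control" genres stay flat.
rows = []
day = date(2025, 9, 1)
while day < date(2026, 2, 1):
    rows.append((day, "treated", 150 if day >= SHOCK else 100))
    rows.append((day, "control", 80))
    day += timedelta(days=7)

true_effect = did_estimate(rows, SHOCK)
placebo_effect = did_estimate(rows, PLACEBO, end=SHOCK)
print(f"DiD at shock date:   {true_effect:+.1f}")
print(f"DiD at placebo date: {placebo_effect:+.1f}")
```

On this synthetic panel the placebo estimate is zero by construction. On real data the point of the exercise is to check that it is statistically indistinguishable from zero.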

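🤔 My reasoning framework

I evaluate a testable prediction against four criteria:

Theoretical importance: how deeply the question engages with the Econ Nobel tradition of the past 25 years
Empirical answerability with THIS data: whether your App Store data can test it directly (versus requiring new data)
Magnitude of policy implications: whether the empirical result could change policy
Field-shaping potential: whether the question will become an anchor for follow-up research

The 5 questions below are ranked from highest to lowest "Nobel-track potential". Each comes with: theoretical background, who has discussed it, testable hypotheses, validation strategy, and risks.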

🥇 #1 - Surplus Migration (where the surplus flows)

★ The highest priority in my assessment

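Where does AI productivity surplus accrue?

"Who ultimately captures the AI productivity surplus: upstream AI companies, downstream platforms, new entrants, incumbents, or consumers?"

Theoretical background

This is the core question of AI economics as a whole. The Acemoglu framework predicts that where the surplus flows depends on which inputs remain scarce. When code becomes cheap, the scarce complements (user networks, attention, trust, regulatory access, platform distribution rights) become the new bottleneck, and rents flow to whoever controls them.

Who has discussed it

Tirole "Digital Dystopia" AER 2021: a theoretical model of platform surveillance extracting attention rents
Athey & Scott Morton "AI Competition and Welfare" (NBER 2025): "Double Harm", i.e. an upstream AI monopoly plus downstream productivity loss
Korinek & Lockwood "Public Finance in AI" (NBER 2025): when the labor-income tax base collapses, rents flow to capital owners
Korinek & Stiglitz "Steering Technological Progress" (NBER 2026): how policy should steer the surplus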

Testable Predictions

P1: Producer surplus flows to entrants when code is the binding input → 50.4% new-developer entry vs 14.2% incumbent ✅
P2: Platform commission rises with Δ entrant revenue → Apple's 30% × Δrevenue ≈ $0.6-3M in incremental rent
P3: Consumer surplus is constrained by matching capacity → review-weighted attention failing to scale is direct evidence
P4: Across categories, the scarce complement determines where rents flow → self-contained-code and commerce/social genres show opposite patterns ✅

Validation strategy (using data you already have)

Compute total entrant revenue: 4,400 launches × E[revenue | self-contained genre] ≈ $2-10M (using public Sensor Tower benchmarks)
Compute the platform commission: × 30% ≈ $0.6-3M
Compute the incumbent surplus loss: the -14.7 pp decline in incumbent share in self-contained categories
Compute the consumer DWL: zero-review apps × E[download if discovered] × E[CS per download]
The key cross-category test: these 4 numbers should show opposite patterns in self-contained-code vs commerce/social genres
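
The back-of-envelope arithmetic in the first two steps can be written out as below. This is a sketch only: the per-launch revenue figures are not from any data source. They are simply backed out from the stated $2-10M total and 4,400 launches, and the flat 30% commission is the rate quoted above.

```python
LAUNCHES = 4_400        # entrant launches, from the text
COMMISSION_RATE = 0.30  # commission rate used in the text

# Total entrant revenue range quoted in the text (USD).
TOTAL_REVENUE_RANGE = (2_000_000, 10_000_000)


def implied_revenue_per_launch(total_revenue, launches=LAUNCHES):
    """Average revenue per launch implied by a total revenue figure."""
    return total_revenue / launches


def platform_commission(total_revenue, rate=COMMISSION_RATE):
    """Platform's cut of entrant revenue under a flat commission rate."""
    return total_revenue * rate


for label, total in zip(("low", "high"), TOTAL_REVENUE_RANGE):
    per_launch = implied_revenue_per_launch(total)
    commission = platform_commission(total)
    print(f"{label}: ${per_launch:,.0f} per launch, "
          f"commission ${commission / 1e6:.1f}M")
```

The two endpoints reproduce the $0.6-3M commission range, and they make explicit that the $2-10M total corresponds to an assumed average of roughly $455 to $2,273 per launch.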

Risks

You cannot observe the rents of the upstream AI companies (Anthropic/OpenAI), so you cannot close the surplus accounting. But you can lower-bound the other 4 channels, which is already enough to power a top Econ paper.

🥈 #2 — Empirical Bottleneck Test

Nobel-adjacent · a direct test of the Acemoglu framework

Which tasks are the binding constraint?

"Acemoglu's theory says that automating one task is not enough to raise aggregate output if the complements are still binding. But which tasks are binding is an empirical question that theory cannot settle."

Theoretical background

The Acemoglu-Restrepo task-based automation framework is the work of a 2024 Nobel laureate (Acemoglu), yet so far there is no market-scale empirical test of its core mechanism. Existing tests are all indirect (surveys, RCTs). Your setting reads off the binding constraint directly.

Who has discussed it

Acemoglu & Restrepo "Tasks, Automation, Wage Inequality" Econometrica 2022
Acemoglu "Simple Macro of AI" Economic Policy 2024: the 0.66% TFP gain prediction
Acemoglu, Kong, Restrepo "Tasks at Work" NBER 2024
Benjamin Jones "AI in R&D" NBER 2025: a CES model with task complementarity θ < 0
Restrepo "We Won't Be Missed" NBER 2025: the bottleneck vs supplementary task partition

Testable Predictions

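P1: Self-contained-code genres respond the most: ✅ already in your data, 62% of the DID effect
P2: Genres that need a user network do not respond: ✅ Social Networking +6.1%
P3: Genres that need merchant supply do not respond: ✅ Commerce 3.5% of the DID effect
P4: Reverse prediction: the Social genre will respond only if the next wave of AI can onboard users (AI agents simulating users)
P5: Within a genre, app-level non-code complexity should predict the size of the response

Validation strategy

The main result is already in your paper
Bonus: turn the 3-LLM classification of non-code bottlenecks into a continuous variable (rather than a binary one) and run OLS: Δlaunch_g = α + β·(1 − bottleneck_intensity_g) + ε_g. The fitted response should be ≈ 1 (fully elastic) in pure-code genres and ~0 in pure-network genres, which implies α ≈ 0 and β ≈ 1
Out-of-sample prediction: pre-register a statement of the form "when the next wave of AI adds capability X, genre Y will respond". Verifying it at the next AI release is a genuinely prospective test → this is the core practice of Nobel-track work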
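
The continuous-variable regression above can be sketched in a few lines. Everything numeric here is hypothetical: the genre labels, the intensity values (taken, as one possible construction, to be the share of the 3 LLM raters that flag a non-code bottleneck), and the launch responses are illustrative placeholders, and `ols_fit` is a hand-written closed-form fit rather than any particular library's API.

```python
from statistics import mean


def ols_fit(x, y):
    """Closed-form simple OLS of y on x; returns (alpha, beta)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = y_bar - beta * x_bar
    return alpha, beta


# Hypothetical genre-level data: (bottleneck intensity in [0, 1],
# proportional change in launches). Not estimates from any dataset.
genres = {
    "pure_code_genre":      (0.00, 0.95),
    "mostly_code_genre":    (0.33, 0.70),
    "mostly_network_genre": (0.67, 0.30),
    "pure_network_genre":   (1.00, 0.05),
}

# Regressor is (1 - bottleneck intensity), as in the specification above.
x = [1.0 - intensity for intensity, _ in genres.values()]
y = [delta for _, delta in genres.values()]

alpha, beta = ols_fit(x, y)
print(f"alpha = {alpha:.3f}, beta = {beta:.3f}")
```

With only a handful of genres the coefficient is fragile, so in practice the same fit would more plausibly be run at the app or genre-by-period level with appropriate standard errors.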

Risks

Measurement error in the 3-LLM classification. Mitigation: report inter-rater reliability, validate on a human-coded subsample, and test robustness to majority-vote vs unanimous-vote labels.
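
One way to implement this mitigation is sketched below, assuming each app gets a binary label from each of the 3 LLM raters. Fleiss' kappa is used here as the reliability statistic, which is a choice made for this sketch (the source does not name one), and the ratings are made up.

```python
from collections import Counter


def fleiss_kappa(ratings, categories):
    """Fleiss' kappa for per-item label lists with equal rater counts."""
    n_items = len(ratings)
    n_raters = len(ratings[0])
    counts = [Counter(item) for item in ratings]
    # Observed agreement for each item.
    p_items = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    # Overall share of ratings falling in each category.
    p_cats = [
        sum(c[cat] for c in counts) / (n_items * n_raters)
        for cat in categories
    ]
    p_bar = sum(p_items) / n_items
    p_exp = sum(p ** 2 for p in p_cats)
    return (p_bar - p_exp) / (1 - p_exp)


def majority_label(item):
    """Most common label (no ties with 3 raters and binary labels)."""
    return Counter(item).most_common(1)[0][0]


def unanimous_label(item):
    """Label if all raters agree, else None (item dropped)."""
    return item[0] if len(set(item)) == 1 else None


# Hypothetical labels: 1 = non-code bottleneck present, 0 = absent.
ratings = [
    [1, 1, 1],
    [1, 1, 0],
    [0, 0, 0],
    [0, 0, 0],
    [0, 1, 0],
    [1, 1, 1],
]

kappa = fleiss_kappa(ratings, categories=(0, 1))
majority = [majority_label(r) for r in ratings]
unanimous = [unanimous_label(r) for r in ratings]
print(f"Fleiss' kappa: {kappa:.3f}")
print("majority: ", majority)
print("unanimous:", unanimous)
```

The robustness check then amounts to re-running the main specification once with the majority labels and once on the subsample where the unanimous label is not `None`.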

🥉 #3 — Schumpeter vs Routine-Displacement

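Speaks directly to an internal disagreement in the NBER 2025 volume

Does AI favor entrants or incumbents?

"One of the oldest debates in economics: does new technology destroy incumbents (Schumpeter 1942) or entrench them (the Brynjolfsson-Hitzig centralization prediction)?"

Theoretical background

This is the core dispute in the NBER 2025 volume between Brynjolfsson-Hitzig (centralization) and Goldfarb (the decentralization rebuttal). There is no empirical verdict yet. Your data can act as a clean adjudicator of this dispute.

Who has discussed it

Brynjolfsson & Hitzig "AI's Use of Knowledge in Society" (NBER 2025): AI inverts Hayek, the centralization thesis
Goldfarb's comment: tacit knowledge may sit at HQ, so AI may instead democratize
Klette & Kortum 2004, Akcigit & Kerr 2018: innovation by entry vs entrenchment
Sutton 1991: sunk costs and entry barriers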

Testable Predictions

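P1: New entrant share rises after the shock: ✅ +50.4% directly
P2: Incumbent app launch share falls: ✅ -14.7 pp in self-contained
P3: The effect is concentrated in categories where IP/data does not lock in incumbents: ✅ self-contained vs commerce
P4: Long-run survival rate of entrants vs incumbents: ⚠️ needs 12-24 months of tracking
P5: Within incumbents, a quality-quantity tradeoff (established firms stop building new apps and shift to quality improvements)

Validation strategy

Already done: the entrant vs incumbent split (50.4% / 14.2%)
Bonus: track entrant survival (the share still in the store after 6-12 months)
Bonus: changes in incumbent behavior: higher prices? better app quality? changes in review counts?
Key contrast: compare whether incumbents are protected in self-contained vs network-required genres (in network-required genres, incumbents are shielded by their user network and should not lose share)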
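
The survival-tracking step can be sketched as a simple rate over apps that are old enough to be assessed. The record fields (`developer_type`, `launch_date`, `removed_date`), the 180-day threshold, and the sample records are all assumptions made for this illustration, not the structure of the actual dataset.

```python
from datetime import date


def survival_rate(apps, as_of, min_age_days=180):
    """Share of sufficiently old apps still listed at `as_of`.

    Returns None when no app has reached `min_age_days`.
    """
    eligible = [
        a for a in apps if (as_of - a["launch_date"]).days >= min_age_days
    ]
    if not eligible:
        return None
    alive = [
        a for a in eligible
        if a["removed_date"] is None or a["removed_date"] > as_of
    ]
    return len(alive) / len(eligible)


# Hypothetical app records (removed_date is None while still listed).
apps = [
    {"developer_type": "entrant", "launch_date": date(2025, 12, 1),
     "removed_date": None},
    {"developer_type": "entrant", "launch_date": date(2025, 12, 10),
     "removed_date": date(2026, 3, 1)},
    {"developer_type": "entrant", "launch_date": date(2026, 5, 1),
     "removed_date": None},  # too young to assess at as_of
    {"developer_type": "incumbent", "launch_date": date(2025, 12, 5),
     "removed_date": None},
    {"developer_type": "incumbent", "launch_date": date(2025, 12, 20),
     "removed_date": None},
]

as_of = date(2026, 6, 30)
rates = {
    kind: survival_rate(
        [a for a in apps if a["developer_type"] == kind], as_of
    )
    for kind in ("entrant", "incumbent")
}
print(rates)
```

Excluding apps younger than the threshold avoids mechanically inflating survival for recent launches. Running the same comparison separately for self-contained and network-required genres would give the key contrast described above.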

Risks

A 5-month post-treatment window cannot reveal long-run dynamics. Mitigation: limit the paper's claims to short-run market reallocation and leave the long run to future work.

🏅 #4 — Micro-Macro Bridge

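Resolves a puzzle in the literature

Does task productivity equal market productivity?

"The biggest puzzle in the literature: RCTs show AI makes individuals 30-50% faster (Noy/Zhang, Brynjolfsson-Li-Raymond), yet macro data show no jump in TFP. Which one is right?"

Theoretical background

This is Solow Paradox 2.0. Your data fill exactly this gap: market-scale observation that allows a direct comparison of task-level vs end-to-end production.

Who has discussed it

Brynjolfsson, Rock & Syverson "Productivity J-Curve" 2021
Noy & Zhang 2023, Peng et al. 2023: RCTs showing large task-level gains
Acemoglu "Simple Macro" 2024: Hulten's theorem predicts 0.66% TFP
Andrews & Farboodi "Do Markets Believe in TAI?" 2025: bond yields suggest markets are revising growth expectations down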

Testable Predictions

P1: Task-level gains (40-50% per Brynjolfsson RCT) > end-to-end production gains (48%)
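P2: In genres with high non-code complements, market gain << task gain
P3: The gain in aggregate market value (revenue × launches) is far smaller than the task productivity gain → the productivity gain "leaks" to complements

Validation strategy

This is already in the logic of your paper. The main work is strengthening the framing: set up this puzzle in the first paragraph of the intro (we already did this in the paper_v3 revision).

Risks

"Task" and "market" productivity are not defined on perfectly comparable scales, so the argument needs care.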

🎖 #5 — Abundance Without Matching

Strong welfare implications, but a well-trodden path

Does cheap production create variety welfare or noise flood?

"The Brynjolfsson-Hu-Smith Long Tail theory says cheap production increases variety → consumer welfare ↑. But if matching capacity does not scale, the marginal product is waste. Which one is right?"

Who has discussed it

Brynjolfsson, Hu & Smith 2003: the Long Tail dividend
Athey & Ellison 2011: the search-cost literature
Stiglitz & Ventura-Bolet "Information Ecosystem" (NBER 2025): an extension to information collapse

Testable

Compute the rise in the share of zero-review apps × estimated download-if-matched × CS per download. Compare self-contained genres (high output but low attention) with commerce/local genres (low output but balanced attention).
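
The product above can be made concrete with a small helper. All inputs below are placeholders chosen only to show the mechanics. None of them comes from the data, and converting the rise in the zero-review share into a count by multiplying by launches is an assumption of this sketch.

```python
def foregone_consumer_surplus(launches, delta_zero_review_share,
                              downloads_if_matched, cs_per_download):
    """Extra unmatched apps x expected downloads if matched x CS per download."""
    unmatched_apps = launches * delta_zero_review_share
    return unmatched_apps * downloads_if_matched * cs_per_download


# Placeholder inputs (hypothetical, for illustration only).
scenarios = {
    "self-contained": dict(launches=4_000, delta_zero_review_share=0.10,
                           downloads_if_matched=500, cs_per_download=0.5),
    "commerce/local": dict(launches=1_000, delta_zero_review_share=0.01,
                           downloads_if_matched=500, cs_per_download=0.5),
}

results = {
    name: foregone_consumer_surplus(**params)
    for name, params in scenarios.items()
}
for name, value in results.items():
    print(f"{name}: ${value:,.0f}")
```

Because the result is a plain product, uncertainty in any one factor passes through proportionally, so reporting it as a range over plausible values of `downloads_if_matched` and `cs_per_download` is likely more defensible than a point estimate.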

Why it ranks 5th

The search-cost literature has already covered this path fairly thoroughly. This would be an extension, not the first paper on the topic.

📦 My recommendation: Bundled Single Paper

⚡ Optimal strategy: three in one

**One paper answers Q1 + Q2 + Q3 at once**, and the three reinforce each other:

Frame: "AI cheapens one input; surplus migrates to whichever non-code complement now binds" (= Q1 Surplus Migration)
Mechanism: Cross-category heterogeneity identifies which complements bind (= Q2 Bottleneck Test)
Distribution: Entrant capture vs incumbent retention varies by category (= Q3 Schumpeter Test)

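Strung together, these three make an AER/QJE main-issue-level paper. Q4 and Q5 serve as robustness/extension.

🚀 Concrete next steps you can take

Compute a ballpark platform-rent number: use public Sensor Tower / data.ai data to attach a magnitude to the surplus migration narrative ($0.6-3M Apple commission gain). This is the empirical counterpart of the Athey-Scott Morton "double harm" at the downstream platform, and it is the paper's punchline.
Upgrade the cross-category mechanism to a continuous variable: you currently use a binary classification. Switch to a continuous bottleneck intensity and report the OLS coefficient → a direct test of the Acemoglu framework.
Pre-register an out-of-sample prediction: state that "when the next wave of AI adds capability X, genre Y will respond" → verify at the next AI release. This is the truly prospective Nobel-track move.
Track entrant survival: check back after 6-12 months to see which entrants are still around; this completes the Schumpeter test.
(Optional) Write a short companion theory paper: tie the Q1/Q2/Q3 predictions together in one formal model (extending the Garicano 2000 knowledge hierarchy + the Acemoglu-Restrepo task model).

🔭 Extending to other settings (Future Papers)

The framework is not limited to the App Store. The same cross-domain mechanism can be reused in the following settings:

Scientific publishing (Gartenberg has already done one case): ChatGPT makes "writing" cheap, but evaluation / peer review is binding
Film/media: Sora/Midjourney make "creation" cheap, but attention/distribution is binding (this may be the "Art Market"?)
Financial advice: AI makes personalization cheap, but trust / regulation is binding
Educational content: AI makes lessons cheap, but credentialing / labor market signaling is binding

Each of these settings can reuse the same cross-domain test framework. Your paper is the "first market-scale test"; over the next 5 years, others will be applying this framework to other settings. That is the position of a founding paper.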
