Paper A Pre-Registration
Locked Industry Exposure Measure + Analysis Plan for OSF Submission
Draft pre-registration · May 19, 2026 · OSF-style format · Lock before any 2024 data access
📋 Purpose & OSF Submission
The single biggest reviewer concern for Paper A will be: "the industry exposure measure was constructed to maximize the cross-sectional fit on 2024 data." Researcher degrees of freedom in measure construction are a classic credibility-eroding problem in cross-sectional asset pricing.
The solution: pre-register the industry exposure measure construction at OSF BEFORE accessing 2024 PolyMarket data. This pre-registration documents:
- The primary composite exposure measure (5-channel, equal-weighted)
- 4 pre-committed alternative constructions
- The 40 event timestamps to be analyzed (verified independently before lock)
- The 2016 election placebo validation procedure
- Specific decision rules for what counts as positive vs negative evidence
Recommended platform: OSF Registries (osf.io/registries/aspredicted/new). Submission can be completed in one sitting once this document is finalized.
1. Locked Hypothesis
H1 — Cross-asset response to PolyMarket shocks. Within 30-minute windows around major 2024 election news events, US asset prices respond systematically to PolyMarket Trump probability changes. Specifically, for asset class k:
H1.SPX: β_SPX > 0 (Trump favorable for equity aggregate)
H1.MXN: β_USDMXN > 0 (Mexican peso depreciates on Trump shocks)
H1.10Y: β_10Y > 0 (10-year Treasury yield rises on Trump shocks)
H1.VIX: β_VIX > 0 (vol rises on Trump shocks)
H1.BTC: β_BTC > 0 (crypto rises on Trump deregulation expectation)
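One regression form consistent with H1 is sketched below. The notation is an assumption introduced here for illustration: r_{k,e} denotes the return of asset k in the window around event e, ΔP_e the change in the PolyMarket Trump probability over the same window, and X_e the event-level controls named in §6.1. It is a sketch of how the β coefficients could be read, not necessarily the locked specification.

```latex
r_{k,e} = \alpha_k + \beta_k \, \Delta P_e + \theta_k^{\top} X_e + \varepsilon_{k,e},
\qquad k \in \{\mathrm{SPX},\ \mathrm{USDMXN},\ \mathrm{10Y},\ \mathrm{VIX},\ \mathrm{BTC}\}
```

Under this reading, each H1 sub-hypothesis is a one-sided sign restriction on the corresponding slope coefficient.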
H2 — Industry cross-section. Within S&P 500 firms, the firm-specific coefficient β̂_i is increasing in pre-determined TrumpExposure_i (defined in §3):
β̂_i = δ0 + δ1 · TrumpExposure_i + γ · Z_i + u_i
Primary test: δ1 > 0 (p < 0.05) under standard heteroskedasticity-robust SEs clustered by industry.
2. Locked Sample Selection
Firms: S&P 500 constituents as of Jan 1, 2024, excluding financials with high government exposure (defined as >30% of revenue from the US government); their Trump-positive exposure is pre-determined but mechanical. Expected total: N ≈ 470 firms.
Events: The 40 events listed in §5, with timestamps verified against ≥ 2 independent sources (e.g., Wikipedia, WSJ, Bloomberg; the full source set is given in §5).
Sample period: Jan 1, 2024 - Nov 30, 2024 (PolyMarket Trump prob > 5% throughout).
Event window: 30 minutes (primary). Robustness windows of 5, 15, 60 min, and 1 day pre-committed.
Asset cross-section: all S&P 500 firms + 11 SPDR sector ETFs + 8 FX pairs + 4 Treasury tenors + IG/HY spreads + VIX + 6 commodities + 5 crypto ≈ 520-unit panel × 40 events ≈ 21K observations.
3. Locked Industry Exposure Measure
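The composite measure for each firm i is locked as the equal-weighted average of the five channels defined in §3.1-3.5, with TradeExposure and ImmigrationExposure entering with a negative sign. Each component is standardized to mean 0 and SD 1 over the S&P 500 cross-section using 2017-2019 data only.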
3.1 TaxExposure_i
Definition: Firm i's 2017-2019 average effective tax rate (ETR = total tax / pretax income from Compustat).
Construction: Standardize ETR across S&P 500 firms. Higher TaxExposure = more room to gain from corporate tax cuts.
Compustat fields: txt (income taxes) / pi (pretax income), averaged over fiscal years 2017, 2018, 2019.
Robustness alt: 5-year average (2015-2019) or 3-year median.
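The TaxExposure construction can be sketched as below. This is a minimal illustration under stated assumptions: the firm names and numbers are made up, the average is taken over annual ratios (rather than as a ratio of three-year sums), the population standard deviation is used, and firms with non-positive pretax income are not handled. The helper names `average_etr` and `standardize` are hypothetical.

```python
from statistics import mean, pstdev

def average_etr(txt_by_year, pi_by_year, years=(2017, 2018, 2019)):
    """Mean of the annual ratio txt / pi over the given fiscal years.

    Assumption: the average is taken over annual ratios; the text would
    also be consistent with a ratio of summed txt to summed pi.
    """
    return mean(txt_by_year[y] / pi_by_year[y] for y in years)

def standardize(raw):
    """Z-score a {firm: value} mapping to mean 0, SD 1 across firms."""
    values = list(raw.values())
    mu, sd = mean(values), pstdev(values)  # population SD is an assumption
    return {firm: (v - mu) / sd for firm, v in raw.items()}

# Made-up inputs for three hypothetical firms (not real Compustat data).
flat_pi = {2017: 100, 2018: 100, 2019: 100}
etr = {
    "FIRM_A": average_etr({2017: 20, 2018: 30, 2019: 25}, flat_pi),
    "FIRM_B": average_etr({2017: 10, 2018: 10, 2019: 10}, flat_pi),
    "FIRM_C": average_etr({2017: 40, 2018: 40, 2019: 40}, flat_pi),
}
tax_exposure = standardize(etr)
```

The same `standardize` step would apply to each of the other channels before they are combined.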
3.2 RegExposure_i
Definition: Firm i's industry-level regulatory intensity, measured as the sum of:
(a) 2017-2019 industry lobbying spend on federal agencies (from OpenSecrets, summed over Senate + House + Executive + relevant agencies), and
(b) Industry-level Federal Register mentions of regulatory actions affecting the firm's NAICS-4 industry (2017-2019).
Standardize both components, equal-weighted average.
Higher RegExposure = more to gain from deregulation.
3.3 TradeExposure_i
Definition: Firm i's exposure to import competition from China + Mexico, measured as 2017-2019 industry import share from those countries weighted by tariff differential (column 2 minus column 1 from US HTS).
Source: Atkin & Khandelwal (NBER WP) trade-exposure dataset, or replicate from Census trade data + USITC HTS.
Higher TradeExposure = more to lose from tariffs. NEGATIVE coefficient in composite.
3.4 ImmigrationExposure_i
Definition: Average of two components, both standardized:
(a) H-1B dependency: industry's 2017-2019 share of H-1B visa applications relative to industry employment (USCIS data).
(b) Unauthorized-worker share: industry's 2017-2019 share of unauthorized workers (Borjas-Tienda estimates by NAICS-4, via BLS).
Higher ImmigrationExposure = more to lose from restriction. NEGATIVE coefficient in composite.
3.5 GeopoliticalExposure_i
Definition: Average of three components, standardized:
(a) Defense contract revenue share (Bloomberg Government 2017-2019 contract awards, scaled by firm revenue)
(b) International risk score from 10-K Item 1A "geographic risk" text (NLP-based scoring of Russia/Ukraine/Iran/China mentions in 2018-2019 10-Ks)
(c) US-government revenue share (Compustat segment data)
Higher GeopoliticalExposure = more sensitive to policy regime change.
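A minimal sketch of how the five standardized channels might be combined into the equal-weighted composite. The negative signs on trade and immigration follow §3.3 and §3.4. The positive signs on tax and regulation follow from "more to gain". The positive sign on the geopolitical channel is an assumption made here, since §3.5 does not state one. The names `CHANNEL_SIGNS` and `trump_exposure` are hypothetical.

```python
# Sign with which each standardized channel enters the composite.
# The geopolitical sign (+1) is an ASSUMPTION; the text does not state it.
CHANNEL_SIGNS = {
    "tax": +1,
    "reg": +1,
    "trade": -1,
    "immigration": -1,
    "geopolitical": +1,
}

def trump_exposure(z_scores):
    """Equal-weighted signed average of a firm's five standardized channels."""
    signed = [sign * z_scores[name] for name, sign in CHANNEL_SIGNS.items()]
    return sum(signed) / len(CHANNEL_SIGNS)

# Hypothetical firm: one SD above the mean on the channels that gain,
# one SD below the mean on the channels that lose.
example = {"tax": 1.0, "reg": 1.0, "trade": -1.0,
           "immigration": -1.0, "geopolitical": 1.0}
composite = trump_exposure(example)
```

Keeping the signs in a single mapping makes the one unlocked choice (the geopolitical sign) easy to see and to fix before submission.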
4. Locked Alternative Constructions
Four pre-committed alternative constructions to be reported in the paper alongside the primary measure:
Alt 1 — PCA composite. First principal component of the 5 channels (after standardization).
Alt 2 — Lobbying-weighted composite. Same 5 channels but weighted by industry political activity (lobbying spend share).
Alt 3 — 10-K text-mining composite. Pure NLP-based: score each firm's 2018-2019 10-Ks for "political risk" language using a pre-specified lexicon.
Alt 4 — Individual channels. Report β̂_i = δ1 · Channel_i for each of the 5 channels SEPARATELY (no composite).
Decision rule: Robustness is established if (i) the primary composite has δ1 > 0 with p < 0.05, AND (ii) at least 3 of 4 alternatives produce qualitatively similar conclusions.
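The robustness rule above can be written as a small check. This is a sketch only: "qualitatively similar" is not defined numerically in the text, so the function takes one boolean per alternative that the researcher would have to set by a pre-stated criterion. The function name is hypothetical.

```python
def robustness_established(primary_delta1, primary_p, alternatives_agree,
                           alpha=0.05, min_agree=3):
    """Apply the two-part rule: primary composite significant and positive,
    and at least `min_agree` of the alternatives agree.

    alternatives_agree: one boolean per alternative construction (Alt 1-4),
    True if that alternative gives a qualitatively similar conclusion.
    """
    primary_ok = primary_delta1 > 0 and primary_p < alpha
    return primary_ok and sum(alternatives_agree) >= min_agree

# Illustrative inputs (made up): primary passes, 3 of 4 alternatives agree.
passed = robustness_established(0.8, 0.01, [True, True, False, True])
```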
5. Locked Event List (40 Events)
Timestamp standard: all UTC, second resolution. Verified independently against ≥2 of {Wikipedia, WSJ, NYT, Bloomberg, Reuters, AP}. Locked before 2024 PolyMarket data access.
5.1 Nine major events (locked)
| # | Event | Date / UTC | Direction prior |
|---|---|---|---|
| 1 | Iowa caucus — Trump wins | 2024-01-15 23:30 | Trump+ |
| 2 | Trump conviction (NY hush money) | 2024-05-30 21:00 | Harris+ |
| 3 | Biden-Trump debate (Atlanta) | 2024-06-28 01:00 | Trump+++ |
| 4 | Trump assassination attempt #1 (Butler PA) | 2024-07-13 22:11 | Trump+++ |
| 5 | Biden drops out (letter posted on X) | 2024-07-21 17:46 | Harris++ (Biden-) |
| 6 | Harris picks Walz as VP | 2024-08-06 13:00 | Harris+ |
| 7 | Harris-Trump debate (Philadelphia) | 2024-09-11 01:00 | Harris++ |
| 8 | Trump assassination attempt #2 (West Palm Beach) | 2024-09-15 18:00 | Trump+ |
| 9 | Election Night (decisive call ~2 AM ET) | 2024-11-06 07:00 | Trump+++ |
5.2 Minor events
31 additional events (Super Tuesday, court rulings, major polling shifts > 3pp daily, VP debate, sub-presidential primaries, etc.); the full list is in the OSF appendix. Locked at pre-registration time.
5.3 Decision rule for adding events
Once locked, no events may be added or removed except for the following pre-committed reasons:
- Timestamp correction (must reconcile against existing locked sources)
- Discovery of an event independently classified as "major" by ≥3 of {WSJ, NYT, Bloomberg, FT} in 2024 retrospective summaries — but must be disclosed as a deviation
6. Locked Primary Specifications
6.1 Spec 1 (asset-class)
X_e controls: lagged 1-min VIX, lagged 1-min S&P, lagged 1-min DXY. SE clustering: by event. Window: [t_e − 5 min, t_e + 30 min].
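A small helper showing how the Spec 1 window could be derived from a locked UTC timestamp. It assumes timestamps are stored as "YYYY-MM-DD HH:MM" strings in UTC, as in the §5.1 table; the function name `event_window` is hypothetical.

```python
from datetime import datetime, timedelta, timezone

def event_window(event_utc, pre_min=5, post_min=30):
    """Return (start, end) of the window [t_e - pre_min, t_e + post_min] in UTC."""
    t_e = datetime.strptime(event_utc, "%Y-%m-%d %H:%M").replace(tzinfo=timezone.utc)
    return t_e - timedelta(minutes=pre_min), t_e + timedelta(minutes=post_min)

# Event 4 from the locked table (Butler, PA).
start, end = event_window("2024-07-13 22:11")
```

Carrying the timezone on every timestamp avoids silent local-time conversions when the window is joined to exchange data.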
6.2 Spec 2 (industry cross-section)
β̂_i = δ0 + δ1 · TrumpExposure_i + γ · Z_i + u_i (the H2 regression from §1).
Z_i controls: log market cap, book-to-market ratio (B/M), profitability (OP), asset growth (Inv), pre-2024 idiosyncratic vol. SE clustering: by GICS-2 industry. Sample: S&P 500 firms with ≥ 30 events of non-missing returns.
6.3 First-stage / IV
WhaleTrade_t: net Trump-direction trade flow (in $) from the 11 Chainalysis-identified whale wallets (Fredi9999 cluster) in minute t. First-stage F-stat > 10 required for IV results to be reported.
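The F > 10 reporting rule can be illustrated with a single-instrument first stage. The sketch below regresses a hypothetical probability-change series on the whale-flow series and uses the fact that, with one instrument, the first-stage F equals the squared t-statistic of the slope. It assumes homoskedastic errors for simplicity, whereas the plan's own inference may use robust or clustered errors. All names and numbers are illustrative.

```python
from math import sqrt

def first_stage_f(instrument, endogenous):
    """F-statistic of a one-instrument first stage (homoskedastic OLS).

    Fits endogenous = a + b * instrument + error and returns (b / se(b))**2.
    """
    n = len(instrument)
    z_bar = sum(instrument) / n
    x_bar = sum(endogenous) / n
    s_zz = sum((z - z_bar) ** 2 for z in instrument)
    s_zx = sum((z - z_bar) * (x - x_bar) for z, x in zip(instrument, endogenous))
    slope = s_zx / s_zz
    intercept = x_bar - slope * z_bar
    ssr = sum((x - intercept - slope * z) ** 2 for z, x in zip(instrument, endogenous))
    se_slope = sqrt(ssr / (n - 2) / s_zz)
    return (slope / se_slope) ** 2

def iv_reportable(f_stat, threshold=10.0):
    """Pre-committed rule: report IV results only if first-stage F exceeds 10."""
    return f_stat > threshold

# Made-up minute-level data: whale flow and probability change.
whale_flow = [-2, -1, 0, 1, 2, 3]
prob_change = [-0.9, -0.6, 0.1, 0.4, 1.1, 1.4]
f_strong = first_stage_f(whale_flow, prob_change)
```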
7. Sample Size & Power
Effect size benchmarked against Snowberg-Wolfers-Zitzewitz (QJE 2007):
- SWZ 2007 effect: 1-pp shift in the Republican candidate's win probability → 0.02-0.03% equity return (intraday).
- 2024 expected effect: larger (more concentrated industries; more polarized policy platforms). Conservative target: 0.04% per 1-pp PM shock for SPX.
- Sample size: 40 events × 30-min windows × ~520 assets ≈ 21,000 observations.
- Power at 5% α: > 0.99 for effect sizes of 0.01% or larger (assuming intraday return SD ~ 0.05%).
- Pre-committed null: If the estimated 95% CI for β̂_SPX lies within [-0.01%, 0.01%], we report a "well-identified null" finding rather than a positive result.
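The power figure above can be checked with a simplified two-sided z-test calculation. This sketch treats the estimate as a sample mean over n independent observations, which is an assumption; with errors clustered by event, the effective n may be closer to the number of events than to 21,000. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_test_power(effect, sd, n, alpha=0.05):
    """Power of a two-sided z-test for a mean, assuming n independent draws."""
    std_normal = NormalDist()
    se = sd / sqrt(n)
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = effect / se
    # Probability of rejecting in either tail when the true effect is `effect`.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)

# Inputs from the text: effect 0.01%, return SD 0.05%, about 21,000 observations.
power_pooled = z_test_power(0.01, 0.05, 21_000)

# Sensitivity check (assumption): only 40 effectively independent observations.
power_by_event = z_test_power(0.01, 0.05, 40)
```

Reporting both numbers in the registration would show how much the power claim depends on the independence assumption.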
8. Decision Rules
Primary finding criterion:
(a) Spec 1 yields β̂_k with sign matching H1 prediction at p < 0.05 for at least 3 of 5 named asset classes (SPX, MXN, 10Y, VIX, BTC). AND
(b) Spec 2 yields δ̂1 > 0 at p < 0.05 for the primary composite measure. AND
(c) Result holds under at least 3 of 4 alternative measures.
If (a), (b), and (c) all satisfied: claim positive evidence for SWZ-redux. If only (a) and (b): claim weaker evidence with caveats. If only (a): report null on cross-section. If none: report null overall.
Multiple testing correction: Bonferroni-corrected α = 0.05/5 = 0.01 for the named-asset-class headlines. Romano-Wolf step-down for industry-level cross-section.
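The decision rules above can be encoded so that the claim follows mechanically from the three conditions. This is a sketch with hypothetical function names. Combinations the text does not cover (for example, (b) holding without (a)) are returned as unspecified rather than guessed.

```python
def condition_a(results, alpha=0.05, min_classes=3):
    """results: {asset: (beta_hat, p_value)}. H1 predicts beta > 0 for every
    named asset class, so a hit is a positive estimate with p < alpha."""
    hits = sum(1 for beta, p in results.values() if beta > 0 and p < alpha)
    return hits >= min_classes

def classify_finding(a, b, c):
    """Map conditions (a), (b), (c) to the pre-committed claim."""
    if a and b and c:
        return "positive evidence"
    if a and b and not c:
        return "weaker evidence with caveats"
    if a and not b and not c:
        return "null on cross-section"
    if not (a or b or c):
        return "null overall"
    return "unspecified by the stated rules"

def bonferroni_alpha(alpha=0.05, n_tests=5):
    """Per-test threshold for the five named asset-class headlines."""
    return alpha / n_tests

# Made-up estimates: SPX, MXN and BTC pass; 10Y is insignificant; VIX has the wrong sign.
spec1 = {"SPX": (0.04, 0.01), "MXN": (0.10, 0.03), "10Y": (0.02, 0.20),
         "VIX": (-0.10, 0.01), "BTC": (0.30, 0.001)}
claim = classify_finding(condition_a(spec1), True, True)
```

Note that `condition_a` defaults to the p < 0.05 threshold stated in (a); passing `alpha=bonferroni_alpha()` applies the corrected threshold instead.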
P-hacking guardrails: we commit to NOT reporting any specifications, asset classes, or industries beyond those listed in §6 in the main paper. Additional exploratory analyses go in the online appendix and are clearly labeled as exploratory.
9. 2016 Election Placebo Validation
Before any 2024 data analysis, validate the industry exposure measure on the 2016 election:
Step 1. Run Spec 1 + Spec 2 on 2016 election event windows using IEM + TradeSports + PredictIt data.
Step 2. The 2016 industry exposure measure should be constructed from 2010-2015 firm data (5-year lookback, no 2016 overlap).
Step 3. If 2016 cross-sectional δ1 is significantly > 0, the measure has external validity. Proceed to 2024.
Step 4. If 2016 δ1 is null or negative, we have evidence the measure is poorly constructed. Investigate why and revise BEFORE locking 2024 analysis. Document the revision publicly.
10. Pre-Committed Allowed Deviations
Pre-registrations should anticipate the legitimate reasons for deviation. We commit to disclosing deviations on these grounds:
- Data unavailability: if a specific Compustat/Bloomberg/Atkin-Khandelwal field is not available as expected, we may substitute the closest available alternative and disclose the substitution
- Methodological discovery during pipeline build: if we discover a measurement error or improvement, we apply it AND disclose the original and revised version
- Reviewer requests: standard for any pre-registered analysis
- Identification refinement: if the whale-trade-IV exclusion restriction is violated (per a placebo test we will conduct), we transparently report and use alternative IDs
NOT allowed without disclosure: exposure measure tweaking, event list extension, sample period changes, decision rule changes after observing 2024 data.
🔒 Action: Submit to OSF
Once this document is finalized, the action is to submit to OSF Registries as an "AsPredicted" or "OSF Standard" pre-registration. Typical OSF lock-and-publish: ~30 minutes once the document is ready. Receive OSF DOI; cite the DOI in Paper A and Paper B drafts.
Estimated time to finalize: 1-2 days of careful review + minor edits. Submit ~2 weeks before any PolyMarket 2024 data access begins.