<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Error Margin]]></title><description><![CDATA[Unstructured data driven analysis in finance, science and rationality. ]]></description><link>https://www.errormargin.com</link><image><url>https://www.errormargin.com/img/substack.png</url><title>Error Margin</title><link>https://www.errormargin.com</link></image><generator>Substack</generator><lastBuildDate>Wed, 08 Apr 2026 19:49:50 GMT</lastBuildDate><atom:link href="https://www.errormargin.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Error Margin]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[errormargin@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[errormargin@substack.com]]></itunes:email><itunes:name><![CDATA[Error Margin]]></itunes:name></itunes:owner><itunes:author><![CDATA[Error Margin]]></itunes:author><googleplay:owner><![CDATA[errormargin@substack.com]]></googleplay:owner><googleplay:email><![CDATA[errormargin@substack.com]]></googleplay:email><googleplay:author><![CDATA[Error Margin]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[ACX Prediction Contest 2024 Retrospective]]></title><description><![CDATA[Since 100% of readers will have found this blog from the ACX announcement I'm going to assume you're familiar with the yearly prediction contest.]]></description><link>https://www.errormargin.com/p/acx-prediction-contest-2024-retrospective</link><guid isPermaLink="false">https://www.errormargin.com/p/acx-prediction-contest-2024-retrospective</guid><dc:creator><![CDATA[Error Margin]]></dc:creator><pubDate>Sun, 27 Apr 2025 18:15:50 GMT</pubDate><enclosure 
url="https://substackcdn.com/image/fetch/$s_!ucxt!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3cc8b2f4-27d3-4aad-bf17-31eed3610c65_1034x389.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Since 100% of readers will have found this blog from the ACX announcement I'm going to assume you're familiar with the yearly prediction contest. I did the 2024 entry with a friend, who writes over at The Dissonance and has written his own retrospective <a href="https://thedissonance.net/2025/02/09/acx-2024-prediction-contest-retro.html">here</a> which I really recommend for lots of juicy statistical analysis of our predictions. I&#8217;m going to avoid duplicating all of his analysis and instead focus on what my key takeaways were, as well as digging into particular questions of interest.</p><p>It was our first time trying to build somewhat explicit models and be as quantitative as possible (I&#8217;d previously entered one ACX contest with a method most accurately described as squinting at a random number generator). Suffice to say we did way better than we were expecting.</p><div class="image-gallery-embed" data-attrs="{&quot;gallery&quot;:{&quot;images&quot;:[{&quot;type&quot;:&quot;image/png&quot;,&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3cc8b2f4-27d3-4aad-bf17-31eed3610c65_1034x389.png&quot;}],&quot;caption&quot;:&quot;I'll explain why we didn't get prizes later. 
Also note that I have changed my alias since starting this blog.&quot;,&quot;alt&quot;:&quot;&quot;,&quot;staticGalleryImage&quot;:{&quot;type&quot;:&quot;image/png&quot;,&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3cc8b2f4-27d3-4aad-bf17-31eed3610c65_1034x389.png&quot;}},&quot;isEditorNode&quot;:true}"></div><h2>The Method</h2><p>We wanted to minimise correlated failure modes, so we spent the initial research and modelling time working independently, sharing no significant information except for clarifications about resolution criteria and the like. We wanted to get through this in a weekend, so we allocated 7 minutes per question for individual research and 3 minutes for discussion. In practice this almost always ran over slightly, occasionally by a factor of 3. </p><p>We recorded our individual predictions in a spreadsheet, then updated them after our discussions. These updated predictions are what we both submitted to Metaculus. We predicted on every question.</p><p>My research methodology almost always consisted of attempting to calculate a meaningful base rate, then looking for idiosyncratic effects that could update me away from it. Many questions in the 2024 contest lent themselves well to this kind of quantitative analysis. For example:</p><ul><li><p>Question: Will the WHO declare a global health emergency (PHEIC) in 2024?</p><p>Strategy: Pick a representative timeline and sample the years in it. In this case the rate of declarations was clearly going up over time, so I weighted recent years more heavily.</p></li></ul><ul><li><p>Question: In 2024 will there be any change in the composition of the US Supreme Court? </p><p>Strategy: Look up everyone&#8217;s age, and US death rates by age. Subtract a bit from each death rate to account for wealth, then multiply out. Add epsilon for the chance of drastic constitutional changes.</p></li><li><p>Question: Will {Bitcoin | SSE | S&amp;P500} go up over 2024?
</p><p>Strategy: Pick a representative long timeline and sample years to get a base probability. Update upwards a bit because this whole AI thing might be a big deal.</p></li></ul><p>These tended to be the sorts of questions where I think we were best calibrated, although it can be hard to gain significant leaderboard points on them, as our predictions would frequently land pretty much on top of the community prediction.</p><div class="image-gallery-embed" data-attrs="{&quot;gallery&quot;:{&quot;images&quot;:[{&quot;type&quot;:&quot;image/png&quot;,&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c4fcffa9-5d66-4af8-8b37-6c0194bde78c_1140x588.png&quot;}],&quot;caption&quot;:&quot;This curve looks to me like everyone did approximately the same base rate calculation, then decayed it over time as 2024 progressed. &quot;,&quot;alt&quot;:&quot;&quot;,&quot;staticGalleryImage&quot;:{&quot;type&quot;:&quot;image/png&quot;,&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c4fcffa9-5d66-4af8-8b37-6c0194bde78c_1140x588.png&quot;}},&quot;isEditorNode&quot;:true}"></div><p>Most of our predictions ended up close to the community prediction, and I estimate that around 50% of my score came from questions where I was a few percentage points off consensus in the direction of resolution. 
These are probably quite boring to discuss, so I&#8217;ll focus on ones where I disagreed with Metaculus.</p><h2>What went well?</h2><p>My second highest scoring question was </p><div class="image-gallery-embed" data-attrs="{&quot;gallery&quot;:{&quot;images&quot;:[{&quot;type&quot;:&quot;image/png&quot;,&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8bf4d163-31e2-4b69-933d-f09bb9d3c09d_1136x596.png&quot;}],&quot;caption&quot;:&quot;This graph single-handedly caused a panic attack when we first saw it in February.&quot;,&quot;alt&quot;:&quot;&quot;,&quot;staticGalleryImage&quot;:{&quot;type&quot;:&quot;image/png&quot;,&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8bf4d163-31e2-4b69-933d-f09bb9d3c09d_1136x596.png&quot;}},&quot;isEditorNode&quot;:true}"></div><p>When we first saw this we both thought we must have accidentally input 1-P instead of P. On reflection, I think the gap is largely attributable to resolution criteria lawyering. The question sounds like it&#8217;s asking whether the November datapoint will be higher than 11%, when it actually requires the average across all of Jan-Nov to be higher than 11%. Given LEV sales were at 8.5% at the end of 2023, an 11% average would mean hitting around 13.5% by November assuming linear growth (the midpoint of 8.5% and 13.5% is 11%), for a nearly 60% YoY increase. The rate ended up being 9.7%, and no individual month breached 11%, although some did get quite close. Interestingly enough, this is also the question on which I most affected my co-predictor J&#8217;s prediction (see the <em>Question Breakdown</em> graph in his <a href="https://thedissonance.net/2025/02/09/acx-2024-prediction-contest-retro.html">post</a>), and the second largest by effect on Brier score. 
</p><p>Another question I feel quite good about is:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!3F77!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65a71ea3-afcd-43e7-aeb5-51df171b2b10_1138x628.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!3F77!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65a71ea3-afcd-43e7-aeb5-51df171b2b10_1138x628.png 424w, https://substackcdn.com/image/fetch/$s_!3F77!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65a71ea3-afcd-43e7-aeb5-51df171b2b10_1138x628.png 848w, https://substackcdn.com/image/fetch/$s_!3F77!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65a71ea3-afcd-43e7-aeb5-51df171b2b10_1138x628.png 1272w, https://substackcdn.com/image/fetch/$s_!3F77!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65a71ea3-afcd-43e7-aeb5-51df171b2b10_1138x628.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!3F77!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65a71ea3-afcd-43e7-aeb5-51df171b2b10_1138x628.png" width="1138" height="628" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/65a71ea3-afcd-43e7-aeb5-51df171b2b10_1138x628.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:628,&quot;width&quot;:1138,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:78747,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.errormargin.com/i/158787644?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65a71ea3-afcd-43e7-aeb5-51df171b2b10_1138x628.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!3F77!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65a71ea3-afcd-43e7-aeb5-51df171b2b10_1138x628.png 424w, https://substackcdn.com/image/fetch/$s_!3F77!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65a71ea3-afcd-43e7-aeb5-51df171b2b10_1138x628.png 848w, https://substackcdn.com/image/fetch/$s_!3F77!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65a71ea3-afcd-43e7-aeb5-51df171b2b10_1138x628.png 1272w, https://substackcdn.com/image/fetch/$s_!3F77!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65a71ea3-afcd-43e7-aeb5-51df171b2b10_1138x628.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" 
viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>This was actively being talked about by the Biden administration at the time, which had a track record of being somewhat on the ball regarding AI. A large component of my prediction came from this <a href="https://bidenwhitehouse.archives.gov/briefing-room/presidential-actions/2025/01/14/executive-order-on-advancing-united-states-leadership-in-artificial-intelligence-infrastructure/">executive order</a> Biden issued, which seemed uncharacteristically detailed and binding. 
For example, it frequently referred to specific timelines, many of which were 90 or 180 days long.</p><blockquote><p>Within 1 year of the date of this order and consistent with applicable law, the Secretary of Defense, in consultation with the Secretary of Commerce, the Secretary of Energy, the Secretary of Homeland Security, the Director of National Intelligence, and the Assistant to the President for National Security Affairs, shall issue regulations that prescribe heightened safeguards to protect computing hardware acquired, developed, stored, or used on any sites on which frontier AI infrastructure is located and that are managed by the Department of Defense, as needed to implement or build upon the objectives of, or the requirements established pursuant to, subsection 4(g)(iv).</p></blockquote><p>None of it explicitly referenced limitations on AI, but given how prominent conversations about copyright were at the time, I assigned high individual probabilities both to some kind of restriction on training data and, separately, to export restrictions on China. I thought the frequent references to the Department of Defense made restrictions on foreign countries more likely. 
Given how soon this resolved, I feel good about my probability.</p><p>One more example (my third highest scoring question) was </p><div class="image-gallery-embed" data-attrs="{&quot;gallery&quot;:{&quot;images&quot;:[{&quot;type&quot;:&quot;image/png&quot;,&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4ee05300-28a9-40aa-b026-7f01c509f463_1141x569.png&quot;}],&quot;caption&quot;:&quot;&quot;,&quot;alt&quot;:&quot;&quot;,&quot;staticGalleryImage&quot;:{&quot;type&quot;:&quot;image/png&quot;,&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4ee05300-28a9-40aa-b026-7f01c509f463_1141x569.png&quot;}},&quot;isEditorNode&quot;:true}"></div><p>Since 1950, Israel has had 7 premature prime ministerial resignations (I discount deaths and the one guy who got depressed), for a base rate of roughly 10% per year (7 resignations in about 74 years). I decided to very slightly increase Netanyahu&#8217;s chances of staying in office, since as a rule of thumb support for country leaders increases during times of conflict. 
My main takeaway is that one should be wary of straying too far from the base rate: Israel has had many conflicts, so the base rate ought to still be somewhat representative.</p><h2>What went not so well?</h2><p>Over 3/4 of my negative points came from two questions, neither of which was amenable to base-rate analysis.</p><p>My worst question was</p><div class="image-gallery-embed" data-attrs="{&quot;gallery&quot;:{&quot;images&quot;:[{&quot;type&quot;:&quot;image/png&quot;,&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f2eff1de-e249-42eb-930a-f5a26042a3c0_759x362.png&quot;}],&quot;caption&quot;:&quot;All 2 Starship launches prior to 2024 had failed, so clearly I should have stuck to base rate analysis and put 0% for 2024 &quot;,&quot;alt&quot;:&quot;&quot;,&quot;staticGalleryImage&quot;:{&quot;type&quot;:&quot;image/png&quot;,&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f2eff1de-e249-42eb-930a-f5a26042a3c0_759x362.png&quot;}},&quot;isEditorNode&quot;:true}"></div><p>I remember finding it hard to be quantitative about this question, although again I tried to take the number of predicted launches and assign each an individual probability. I think the downside to this approach is that it can magnify errors in your individual probabilities. For example, with 4 launches, if each has a 60% chance of failure, we get a 13% chance that all of them fail (0.6<sup>4</sup>), whereas if each has 70% we instead get 24% (0.7<sup>4</sup>). This model also completely sucks, because launch failures are likely to be very correlated, so assigning each launch an independent percentage is pretty meaningless. 
I think my key takeaway is that putting 90% on a question whose modelling I don&#8217;t especially trust is dumb, i.e. I should probably account for Knightian uncertainty.</p><p>My second worst question was </p><div class="image-gallery-embed" data-attrs="{&quot;gallery&quot;:{&quot;images&quot;:[{&quot;type&quot;:&quot;image/png&quot;,&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/68f7b32c-bc28-4f29-8689-695d5334a395_755x389.png&quot;}],&quot;caption&quot;:&quot;No information had been published describing Q* before 2024. Base rate analysis wins again!&quot;,&quot;alt&quot;:&quot;&quot;,&quot;staticGalleryImage&quot;:{&quot;type&quot;:&quot;image/png&quot;,&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/68f7b32c-bc28-4f29-8689-695d5334a395_755x389.png&quot;}},&quot;isEditorNode&quot;:true}"></div><p>With this one, I remember thinking that it would be a massive publicity nightmare if OpenAI didn&#8217;t release more information about it. This was patently false; I&#8217;ve not heard anyone mention Q* at all recently. I also assumed that, because it had leaked, this must be the next big thing, and that the next model release would therefore describe Q* at least in passing. </p><p>My main takeaway here is that it&#8217;s really hard to model company politics and publicity effects, and that publicity effects from <em><strong>not</strong></em> doing something are pretty much non-existent. How often do you see articles titled &#8220;X still hasn&#8217;t explained Y&#8221;? How long would you wait before publishing this? Who&#8217;s the target audience? What else do you say in your article that isn&#8217;t in the title?</p><h2>The Debacle</h2><p>Given it was our first time participating, we were hoping to come in the top 50%, but not much better than that, so we didn&#8217;t think too much of entering a third submission containing our average predictions. 
This way we could get an interesting comparison of how we did as a team vs how we did individually, and could get some nice warm fuzzies when our joint account came in the top 30%. We thought about telling Metaculus at the time, but that seemed entirely ridiculous (&#8220;Just in case we as complete prediction market first timers do come in the top 5, please be aware&#8230;&#8221;) so we decided to just leave it. </p><p>We checked the leaderboards a few times as the questions started resolving, and were pleasantly surprised to see ourselves slowly notching our way up into the top 50. Then the leaderboard switched to spot scoring, and suddenly our accounts were 3rd, 4th and 6th. </p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!1FC7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fceb13f21-81c8-4b79-8e8c-be07a3b69c9a_572x436.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!1FC7!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fceb13f21-81c8-4b79-8e8c-be07a3b69c9a_572x436.png 424w, https://substackcdn.com/image/fetch/$s_!1FC7!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fceb13f21-81c8-4b79-8e8c-be07a3b69c9a_572x436.png 848w, https://substackcdn.com/image/fetch/$s_!1FC7!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fceb13f21-81c8-4b79-8e8c-be07a3b69c9a_572x436.png 1272w, 
https://substackcdn.com/image/fetch/$s_!1FC7!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fceb13f21-81c8-4b79-8e8c-be07a3b69c9a_572x436.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!1FC7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fceb13f21-81c8-4b79-8e8c-be07a3b69c9a_572x436.png" width="572" height="436" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ceb13f21-81c8-4b79-8e8c-be07a3b69c9a_572x436.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:436,&quot;width&quot;:572,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!1FC7!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fceb13f21-81c8-4b79-8e8c-be07a3b69c9a_572x436.png 424w, https://substackcdn.com/image/fetch/$s_!1FC7!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fceb13f21-81c8-4b79-8e8c-be07a3b69c9a_572x436.png 848w, https://substackcdn.com/image/fetch/$s_!1FC7!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fceb13f21-81c8-4b79-8e8c-be07a3b69c9a_572x436.png 1272w, 
https://substackcdn.com/image/fetch/$s_!1FC7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fceb13f21-81c8-4b79-8e8c-be07a3b69c9a_572x436.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>At this point we reached out to Metaculus, and we agreed that our shared account l1z1x should be deactivated, and that we should forgo our cash prizes. We are very grateful to Metaculus for being so pleasant to deal with about this, for listening to our explanations with an open mind and goodwill, and for allowing us to keep our spots on the leaderboard. 
For their account of this please see the <a href="https://www.metaculus.com/notebooks/31706/announcing-the-acx-2024-prediction-contest-winners/">winners announcement</a>. </p><p>On the brighter side, the joint account came 3rd, so it seems the wisdom of crowds works despite the saying &#8220;two's company, three's a crowd&#8221;!</p><h2>Takeaways</h2><ul><li><p>The style of collaboration was great; I am fully committed to independent research time + discussion later.</p><ul><li><p>A lot of the time we came up with similar models, and our updates were relatively small. In these cases it might have been worth collaborating during research, as some effort was duplicated between us.</p></li><li><p>But sometimes we came up with very different models. When these converged on similar probabilities, this was a strong sign we were onto something. When they disagreed, each of us was usually able to provide strong arguments that the other had missed, and we converged on a more accurate number. There were even times when we both updated in the same direction post-discussion: we&#8217;d found different data that supported the same prediction!</p></li><li><p>I think it&#8217;s really important that the research is independent. Trying to work collaboratively all the way through is likely to lead to correlated models and errors. </p></li></ul></li><li><p>Trust the base rate!</p><ul><li><p>I got a lot of points by trusting the base rate when popular opinion differed from it by quite a lot. I think there&#8217;s a human tendency to over-update on recent events. When you see lots of recent news articles complaining about a given prime minister, it&#8217;s easy to assume this means a high likelihood they step down. But all PMs have articles complaining about them all the time! This feels somewhat related to Scott&#8217;s post <a href="https://www.astralcodexten.com/p/against-learning-from-dramatic-events?utm_source=publication-search">Against Learning From Dramatic Events</a>. 
</p></li><li><p>I didn&#8217;t think of doing this for the 2025 contest, but in 2026 I&#8217;m planning to write down my calculated base rate explicitly, and compare that against my final prediction. Subscribe now for more posts in only 644 days&#8217; time!</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.errormargin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.errormargin.com/subscribe?"><span>Subscribe now</span></a></p></li></ul></li><li><p>Don&#8217;t submit joint account predictions on Metaculus</p></li><li><p>Mike Johnson 4 eva</p><ul><li><p>Our highest probability for any question ended up being &#8220;Will Mike Johnson remain Speaker for all of 2024?&#8221; at 95%. We thought about this carefully, and I maintain this was a well-calibrated estimate, but it was also a source of endless hilarity whenever Mike Johnson was in the news for anything, and made me way more invested in his speakership than I had any right to be. I also got a very healthy 58 points for this, so I can feel good in hindsight. </p></li><li><p>More generally, predicting these things made me way more invested in the news than I otherwise would have been. 
I think this alone is a good reason to participate.</p></li></ul></li></ul><p></p><p></p>]]></content:encoded></item><item><title><![CDATA[Time Travelling Life Insurance Arbitrage / Welcome to Error Margin]]></title><description><![CDATA[Hi, welcome to Error Margin!]]></description><link>https://www.errormargin.com/p/time-travelling-life-insurance-arbitrage</link><guid isPermaLink="false">https://www.errormargin.com/p/time-travelling-life-insurance-arbitrage</guid><dc:creator><![CDATA[Error Margin]]></dc:creator><pubDate>Mon, 10 Feb 2025 02:08:49 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!q8NP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F03017cd7-b9c1-44ff-af7c-2476adcf6dc3_727x563.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hi, welcome to Error Margin! I would like my first blog post to be something impactful that'll be immediately applicable to my readers&#8217; lives, so I'm writing about how to arbitrage the hell out of life insurance companies if you're able to time travel cheaply.</p><p>We all know life insurance premiums get more expensive as you get older, because your chance of death in any given year increases with age (citation needed). Here&#8217;s a table where we can see this happening.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a></p><div class="image-gallery-embed" data-attrs="{&quot;gallery&quot;:{&quot;images&quot;:[{&quot;type&quot;:&quot;image/webp&quot;,&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/03017cd7-b9c1-44ff-af7c-2476adcf6dc3_727x563.webp&quot;}],&quot;caption&quot;:&quot;Rs. 100k is a little over a thousand dollars. 
A lakh is 100k Rupees.&quot;,&quot;alt&quot;:&quot;&quot;,&quot;staticGalleryImage&quot;:{&quot;type&quot;:&quot;image/webp&quot;,&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/03017cd7-b9c1-44ff-af7c-2476adcf6dc3_727x563.webp&quot;}},&quot;isEditorNode&quot;:true}"></div><p>At first glance these numbers don&#8217;t seem too crazy, but let&#8217;s figure out the total cost of the policy assuming you want coverage until you&#8217;re 80.</p><div class="image-gallery-embed" data-attrs="{&quot;gallery&quot;:{&quot;images&quot;:[{&quot;type&quot;:&quot;image/png&quot;,&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c55abb52-01c5-40c5-9b43-002141aedb42_455x207.png&quot;}],&quot;caption&quot;:&quot;I'm using the Flat Sum Assured numbers for Protect-Lump Sum. The cost is per lakh of payout so you can just pretend the cost is in the currency of your choice and it'll be correct for some payout value.&quot;,&quot;alt&quot;:&quot;&quot;,&quot;staticGalleryImage&quot;:{&quot;type&quot;:&quot;image/png&quot;,&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c55abb52-01c5-40c5-9b43-002141aedb42_455x207.png&quot;}},&quot;isEditorNode&quot;:true}"></div><p>And now the arb becomes clear. Let&#8217;s suppose you&#8217;re a healthy non-smoking Indian male aged 55. All your friends are getting life insurance until they&#8217;re 80, but you&#8217;ve read Error Margin so you laugh in their faces, approach your friendly local neighbourhood reinsurance company and sell them a life insurance policy on yourself for a juicy 28k, travel back to the year 1995, head to the cinema to watch Toy Story, then find your 25yo self and force them at gunpoint to take out life insurance until they&#8217;re 80. Congrats, my healthy non-smoking Indian male friend: you just locked in a nearly 6k risk-free profit for an instantaneous 25% return. 
Do this 31 times and you&#8217;ve earned yourself 1000x your capital and probably some questions from the insurance company.</p><h2>OK but I don&#8217;t live in India?</h2><p>No problem. Are you perhaps a 50yo healthy male British non-smoker? I'll spare you the maths but you can do the arb too. 30 years of &#163;100k insurance at age 30 costs you &#163;2084. 10 years at age 50 costs you &#163;2282 for a 9.5% return if you time travel.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-2" href="#footnote-2" target="_self">2</a></p><p>It's hard to find good publicly available data on this, but I initially stumbled upon it when looking for life insurance for myself, which makes me think it's not just cherry-picked.</p><p>Also, Canadian insurer Equitable published this graph that makes a similar point visually. It&#8217;s not exactly the same thing because their premiums aren&#8217;t fixed for the whole term when you renew, but if I&#8217;m reading this correctly, then the total cost of a 10-year term when you&#8217;re 60 is the area under the navy line from year 20 to 30, which is higher than the area under the whole cyan line of your (initially higher) premiums if you take the 30-year term at age 40.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-3" href="#footnote-3" target="_self">3</a></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Vpgf!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48c1786f-444e-4fec-a4f1-46e3c5399f9a_905x532.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!Vpgf!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48c1786f-444e-4fec-a4f1-46e3c5399f9a_905x532.png 424w, https://substackcdn.com/image/fetch/$s_!Vpgf!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48c1786f-444e-4fec-a4f1-46e3c5399f9a_905x532.png 848w, https://substackcdn.com/image/fetch/$s_!Vpgf!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48c1786f-444e-4fec-a4f1-46e3c5399f9a_905x532.png 1272w, https://substackcdn.com/image/fetch/$s_!Vpgf!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48c1786f-444e-4fec-a4f1-46e3c5399f9a_905x532.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Vpgf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48c1786f-444e-4fec-a4f1-46e3c5399f9a_905x532.png" width="905" height="532" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/48c1786f-444e-4fec-a4f1-46e3c5399f9a_905x532.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:532,&quot;width&quot;:905,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!Vpgf!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48c1786f-444e-4fec-a4f1-46e3c5399f9a_905x532.png 424w, https://substackcdn.com/image/fetch/$s_!Vpgf!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48c1786f-444e-4fec-a4f1-46e3c5399f9a_905x532.png 848w, https://substackcdn.com/image/fetch/$s_!Vpgf!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48c1786f-444e-4fec-a4f1-46e3c5399f9a_905x532.png 1272w, https://substackcdn.com/image/fetch/$s_!Vpgf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48c1786f-444e-4fec-a4f1-46e3c5399f9a_905x532.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Weirdly enough, I couldn&#8217;t recreate this with the US data from <a href="https://www.nerdwallet.com/article/insurance/average-life-insurance-rates">Nerd Wallet</a>. Is there a reason for this? Almost certainly. Do I know what it is? Nope. Do I think it&#8217;s because actual time travellers arbed the US market? No comment.</p><p>This might be to do with regulatory differences between countries in how much risk the insurance companies are able to take with their investments, but this would require the US to be unusually strict in its regulations, which seems pretty unlikely. I&#8217;d be interested to hear any theories to explain this.</p><h2>But what about the efficient markets hypothesis?</h2><p>OK well for starters this isn't really a market. Even if it was, it's very difficult to short &#8212; you need a time machine and the ability to sell life insurance on yourself, so there&#8217;s no mechanism to correct this.</p><p>But this may actually make more sense than it seems to at first. I&#8217;ve read<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-4" href="#footnote-4" target="_self">4</a> that insurance companies make only small profits on premiums minus payouts after accounting for operating expenses, and the way they top this up is by investing the premiums until they need the cash. So the insurance company genuinely does have more opportunity to make money if you start young and take out a long contract. 
The returns of 25% and 9.5% I quote above only annualise out to 0.75% and 0.45% respectively, given the 30y and 20y timescales involved.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-5" href="#footnote-5" target="_self">5</a></p><h2>But&#8230; I don&#8217;t have a time machine?</h2><p>I never said arbitrage was easy!</p><p>I guess the takeaway for those of us stuck in the 2020s is that, as a consumer, holding off on life insurance when you&#8217;re young may actually cost you more in total over the long run. This is obviously not financial advice, and I&#8217;m sure rules and regulations will make this differ a lot by country, but when getting life insurance it may be worth considering how old you&#8217;re going to be when it renews, and whether you can optimise the length of your term to minimise total cost to whatever age you want cover for.</p><h2>B.b.b..but&#8230; where&#8217;s the rest of your blog?</h2><p>Work in progress! I wanted to get at least one post out ahead of the ACX announcement of the prediction contest winners. I expect to post maybe 5 to 10 more things from my backlog over the next few weeks, then you&#8217;ll be at the mercy of whenever I have blog-worthy thoughts. Please subscribe to my mailing list for forecasting, scientific self-experimentation, video games, and lots of analysis of random interesting bits of data I come across in the wild. 
Also a post mortem of the 2024 ACX Prediction Contest, though if you want to skip the wait please check out <a href="https://thedissonance.net/2025/02/09/acx-2024-prediction-contest-retro.html">this awesome post</a> from my co-conspirator.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.errormargin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.errormargin.com/subscribe?"><span>Subscribe now</span></a></p><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p>Courtesy of <a href="https://www.avivaindia.com/sites/default/files/Aviva-Life-Shield-Premium-KF-Web_0.pdf">Aviva India</a>, because it had the most extreme example I could find.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-2" href="#footnote-anchor-2" class="footnote-number" contenteditable="false" target="_self">2</a><div class="footnote-content"><p><a href="https://www.reassured.co.uk/life-insurance/100000-life-insurance/">Data source</a>, you can get these numbers from the two different tables in the article.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-3" href="#footnote-anchor-3" class="footnote-number" contenteditable="false" target="_self">3</a><div class="footnote-content"><p><a href="https://advisor.equitable.ca/advisor/getattachment/3f467954-ed12-4839-8289-addc0cd2287b/2114.pdf">Source</a></p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-4" href="#footnote-anchor-4" class="footnote-number" contenteditable="false" target="_self">4</a><div class="footnote-content"><p>I can&#8217;t find the original source but I feel like I&#8217;ve heard this in a few 
different places. Here&#8217;s a replacement <a href="https://begininsurance.ca/en/blog/behind-the-policy-understanding-how-insurance-companies-generate-revenue">source</a> that gestures in the same direction.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-5" href="#footnote-anchor-5" class="footnote-number" contenteditable="false" target="_self">5</a><div class="footnote-content"><p>These return figures don&#8217;t really correspond to any numbers of consequence for either the insurer or the time traveller. The former is getting paid in small instalments over time and can decide what to do with the money, and the latter doesn&#8217;t even need capital if they sell insurance before buying it. But it&#8217;s a useful back-of-the-envelope sanity check all the same.</p></div></div>]]></content:encoded></item></channel></rss>