Pundits are already speculating about who will run for president on the Democratic ticket in 2020, which raises the questions: when does the data suggest the first person will declare? How many candidates can we expect? And at what rate will candidates enter the race?
Shut up and tell me the answers
Assuming the Democratic race will resemble the Republican races of 2012 and 2016, candidate declarations appear to follow a Poisson process, since the intervals between announcements follow an Exponential Distribution. With 95% probability, the mean interval between announcements lies between 7.62 days on the low end and 17.54 days on the high end, with an expected interval of about 11 days between announcements.
Consequently, since the first serious contender announced on Sunday, 11 November 2018, we can expect, with 95% probability, between 21 and 48.55 candidates to announce, with the likelihood maximized at 33.6 candidates seeking the Democratic nomination in the 2020 cycle.
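For the curious, here is a rough sketch of the arithmetic behind a candidate count like this: divide the length of the declaration window by the mean interval between announcements. The window length is not stated in this post; the 370 days used below is purely an assumption, picked because it roughly reproduces the figures quoted above. The function name is likewise just illustrative.

```python
def expected_candidates(window_days, mean_interval_days):
    """Expected number of declarations in a window, given the mean gap between them."""
    return window_days / mean_interval_days

# ASSUMPTION: the window length is not given in the text; 370 days is a guess
# that approximately matches the quoted candidate counts.
WINDOW_DAYS = 370

# Mean intervals (in days) quoted above: slow end, central estimate, fast end.
for label, interval in [("fewest", 17.54), ("central", 11.0), ("most", 7.62)]:
    count = expected_candidates(WINDOW_DAYS, interval)
    print(f"{label}: {count:.2f} candidates")
```

A longer mean interval means fewer candidates, which is why the 17.54-day figure gives the low end of the count.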
Rate at which Candidates Declare
According to Wikipedia, the 2012 Republican candidates declared in the following order:
Candidate | Date Declared |
---|---|
Newt Gingrich | |
Ron Paul | |
Herman Cain | |
Mitt Romney | |
Rick Santorum | |
Jon Huntsman, Jr. | |
Michele Bachmann | |
Rick Perry | |
Similarly, for the 2016 Republican primary race, again from Wikipedia, we have the following:
Candidate | Date Declared |
---|---|
Ted Cruz | |
Rand Paul | |
Marco Rubio | |
Ben Carson | |
Carly Fiorina | |
Mike Huckabee | |
Rick Santorum | |
George Pataki | |
Lindsey Graham | |
Rick Perry | |
Jeb Bush | |
Bobby Jindal | |
Chris Christie | |
The intervals between successive declarations for Republicans in 2012 are (in days): 8, 12, 4, 15, 6, 47. These have a mean of 15.33333 days and a variance of 256.66666 days². (Observe that the variance is approximately the square of the mean.)
The intervals between successive declarations for Republicans in 2016 are (in days): 6, 21, 1, 22, 1, 4, 3, 11, 9, 6. These have a mean of 8.4 days and a variance of 57.822222 days² (the fractional part is 37/45). (Again, observe that the mean squared is approximately the variance.)
When we combine the two, the concatenated dataset has a mean of 11 days and a variance of 132.26666 days² (again, the square of the mean is approximately the variance).
This property (the variance is approximately the square of the mean) supports the hypothesis that the dataset is described by a Random Variable following an Exponential Distribution with rate parameter λ ≈ 1/11 per day, that is, with mean 1/λ ≈ 11 days.
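The means and variances quoted above are easy to check. The sketch below uses only the Python standard library and assumes the sample variance (n − 1 denominator), which is what matches the figures in this post; the `summarize` helper is just an illustrative name. For an exponential distribution the ratio variance / mean² should be near 1.

```python
from statistics import mean, variance

# Intervals (in days) between successive declarations, as listed above.
intervals_2012 = [8, 12, 4, 15, 6, 47]
intervals_2016 = [6, 21, 1, 22, 1, 4, 3, 11, 9, 6]
combined = intervals_2012 + intervals_2016

def summarize(intervals):
    """Return (mean, sample variance, variance / mean^2) for a list of intervals."""
    m = mean(intervals)
    v = variance(intervals)  # sample variance, i.e. divides by n - 1
    return m, v, v / m**2

for label, data in [("2012", intervals_2012),
                    ("2016", intervals_2016),
                    ("combined", combined)]:
    m, v, ratio = summarize(data)
    print(f"{label}: mean={m:.5f} variance={v:.5f} variance/mean^2={ratio:.3f}")
```

With so few data points the ratio is only a sanity check, not a formal goodness-of-fit test.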
Well, really, 95% of the posterior distribution for λ lies between 0.05702248 and 0.1312337 (that is, 1/λ between about 7.62 and 17.54 days), and it peaks at 1/λ ≈ 11 days; this highest-density interval is shaded in blue in the accompanying plot.
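To give a feel for how a posterior like this can be computed, here is a minimal grid-based sketch using only the standard library. It assumes a flat prior on λ, which is my assumption and not something stated in this post, so the bounds it prints need not match the figures quoted above. The grid limits and function names are likewise illustrative choices.

```python
import math

# Combined intervals (in days) between declarations, 2012 followed by 2016.
intervals = [8, 12, 4, 15, 6, 47, 6, 21, 1, 22, 1, 4, 3, 11, 9, 6]
n, total = len(intervals), sum(intervals)

def log_posterior(lam):
    """Unnormalised log posterior of the rate lam.

    ASSUMPTION: flat prior on lam, so the posterior is proportional to the
    exponential likelihood lam**n * exp(-lam * total).
    """
    return n * math.log(lam) - total * lam

def hdi(mass=0.95, lo=1e-4, hi=0.5, steps=20000):
    """Return (lower, mode, upper) of the highest-density interval on a grid."""
    step = (hi - lo) / steps
    grid = [lo + (i + 0.5) * step for i in range(steps)]
    logp = [log_posterior(x) for x in grid]
    peak = max(logp)
    dens = [math.exp(v - peak) for v in logp]  # rescaled to avoid underflow
    norm = sum(dens)
    # Add grid cells from most to least dense until `mass` is covered.
    order = sorted(range(steps), key=lambda i: dens[i], reverse=True)
    covered, kept = 0.0, []
    for i in order:
        kept.append(i)
        covered += dens[i] / norm
        if covered >= mass:
            break
    return grid[min(kept)], grid[order[0]], grid[max(kept)]

lower, mode, upper = hdi()
print(f"{lower:.4f} < lambda < {upper:.4f}, peak at 1/lambda = {1 / mode:.2f} days")
```

Because the posterior has a single peak, the densest cells form one contiguous interval, so reporting the smallest and largest kept grid points is enough.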
When the first candidate will declare
We should probably ask the question: how many days before election day (November 3, 2020) will the first candidate announce his or her bid for the presidency?
For the Republican candidates in 2016, the dataset (of days before the election on which each candidate announced) is: 497, 503, 512, 523, 526, 530, 531, 553, 554, 554, 575, 581, 596. Observe that these announcements occurred within a span of 99 days.
Similarly, the data for the 2012 election: 451, 498, 504, 519, 523, 535, 543, 545. These announcements occurred within a span of 94 days (roughly the same length as in 2016).
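The span of each declaration season, and the gaps between consecutive declarations, can be recovered directly from these lists. A small sketch follows (the helper name is illustrative); note that it takes every consecutive pair, so its gaps may include entries, such as same-day declarations, that the interval lists earlier in the post leave out.

```python
# Days before election day on which each candidate declared, as listed above.
days_2016 = [497, 503, 512, 523, 526, 530, 531, 553, 554, 554, 575, 581, 596]
days_2012 = [451, 498, 504, 519, 523, 535, 543, 545]

def span_and_gaps(days_before):
    """Return (span in days, gaps between consecutive declarations)."""
    ordered = sorted(days_before, reverse=True)  # earliest declaration first
    gaps = [earlier - later for earlier, later in zip(ordered, ordered[1:])]
    return ordered[0] - ordered[-1], gaps

for label, days in [("2016", days_2016), ("2012", days_2012)]:
    span, gaps = span_and_gaps(days)
    print(f"{label}: span={span} days, gaps={gaps}")
```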
Observe that the 2016 election had 6 candidates declare earlier (relative to election day) than the earliest candidate in 2012, so it stands to reason that the greater the number of candidates, the earlier the first bid, and vice versa:
Herd-Size Conjecture: the number of candidates is directly proportional to the earliest candidate's declaration date (as measured by days before the election).
(This is not an unreasonable conjecture, since candidates declare at intervals which are described by an exponential distribution, and there is a hard deadline to declare your candidacy.)
Assuming every candidate wants to run in every primary, South Carolina requires primary candidates to file by September 30, 2019 (assuming it is like the 2016 primary; the deadlines for the 2016 primaries may be found here). This is 400 days before the election, so everyone must file before then.
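The "days before the election" figures are plain date arithmetic, which is easy to double-check. A short sketch with the standard library (the function name is just illustrative):

```python
from datetime import date

ELECTION_DAY = date(2020, 11, 3)

def days_before_election(day, election=ELECTION_DAY):
    """Number of days from `day` until election day."""
    return (election - day).days

# South Carolina filing deadline assumed in the text.
print(days_before_election(date(2019, 9, 30)))   # 400
# First serious contender's announcement, per the summary above.
print(days_before_election(date(2018, 11, 11)))  # 723
```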
There is actually a fairly decent correlation between the total funds raised, total spent, and total left on-hand (just add up the quarterly books) and the length of a primary campaign. For the nominee, I consider the "end date" to be election day.
I had to work with the 2012 data because it's far neater than the 2016 data. The fit has R² = 0.8546118 and adjusted R² = 0.6607609; the model is:
(number of days) = 115.53788314621659 - 11.278698441397473×(total raised in $Mn) + 22.78101502109625×(total spent in $Mn) - 10.49938850575515×(total left on-hand in $Mn)
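Written as a function, the fitted model above looks like the sketch below. The coefficients are copied from the formula; the function name and the example inputs are hypothetical and only show how the three terms combine, not a real candidate's books.

```python
def campaign_length_days(raised_mn, spent_mn, on_hand_mn):
    """Predicted campaign length in days from the fitted 2012 model.

    All three inputs are in millions of dollars.
    """
    return (115.53788314621659
            - 11.278698441397473 * raised_mn
            + 22.78101502109625 * spent_mn
            - 10.49938850575515 * on_hand_mn)

# HYPOTHETICAL inputs: $10Mn raised, $9Mn spent, $1Mn left on hand.
print(round(campaign_length_days(10, 9, 1), 1))
```

Since the model was fit on a handful of 2012 candidates, predictions far outside that range of funds should be treated with caution.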
Supposing this model also holds for the Democrats, all we have to do is estimate how much money is "out there" for candidates. For the Republicans in 2012, all candidates together raised a total of $337,615,860.
Corollary: If the Herd-Size Conjecture holds, then the number of candidates is directly proportional to the total funds raised by the candidates.
Proof sketch: There are several steps in the proof.
1. Since the length of a campaign is directly proportional to the funds raised, and the Herd-Size Conjecture says the earliest candidate's declaration date (i.e., the start of the campaign) is directly proportional to the number of candidates, it follows that the earliest candidate's declaration date is proportional to the funds raised.
2. Since candidates declare at a fairly steady rate, with intervals following an exponential distribution, it follows that the earlier the first candidate declares, the more candidates will declare.
3. The greater the funds raised, the earlier the first candidate's start date tends to be (by step 1), and hence the greater the number of candidates (by step 2). ■
Of course, this just punts the problem (of determining who will declare first) to the much more difficult problem of how much money will be in the Democratic 2020 primaries, and how will it be partitioned.
Something to consider is that the amount may be approximately the total amount of money raised by the party in the House in the previous midterm election (in 2010, Republicans raised a total of $353Mn in the House, according to OpenSecrets). If this is a good approximation, then roughly $649Mn will be raised by the Democratic candidates in the primary, and the total amount spent will vary between $421Mn and $649Mn (topic for a future post!). But if $443Mn is spent, we could expect to see primary announcements as early as December 4th, 2018.