Monday, June 20, 2022

Introduction to Ecological Inference

Consider an election for some office in the United States. The voters are organized into precincts. For the 2020 presidential election, there were a total of 176,933 precincts or equivalents (according to the United States Election Assistance Commission report). The voting-age population in 2020 was approximately 256,662,010 people (as reported in the Federal Register). Dividing, we obtain an average of about 1,450.6 voting-age individuals per precinct.

Generic Puzzle: How did [candidate] perform with [demographic of voters] in [given election]?

Since voting data is private, it's impossible to figure this out with certainty. But there are a number of statistical tools we can use to answer the question. We will look at ecological inference today.

Puzzle 1: How many women voted for Biden (or Trump)? How many men voted for Biden (or Trump)?

There were approximately 132,408,924 adult women in 2020, and approximately 124,253,086 adult men. The FEC reports 158,383,403 votes were cast in the 2020 presidential election. That makes voter turnout (relative to this voting-age population) approximately 61.71%, but we do not know its composition in terms of men and women.

If we consider precinct $i$, we could record the fraction of women who voted $\beta^{f}_{i}$ (i.e., female turnout), the fraction of men who voted $\beta^{m}_{i}$, the proportion of the population of the precinct which are women $X_{i}$, and the voter turnout $T_{i}$ (as a fraction) in the following handy table:

Demographic | Voted (fraction) | Did not vote (fraction) | Total
Women       | $\beta^{f}_{i}$  | $1-\beta^{f}_{i}$       | $X_{i}$
Men         | $\beta^{m}_{i}$  | $1-\beta^{m}_{i}$       | $1-X_{i}$
Total       | $T_{i}$          | $1-T_{i}$               |

We have the following useful identity \begin{equation}\tag{1} T_{i} = X_{i}\beta^{f}_{i} + (1 - X_{i})\beta^{m}_{i}. \end{equation} To convince ourselves of this, suppose we had $N_{i}$ adults in precinct $i$, $F_{i}$ of which are females, and that $V_{i}$ votes were cast in the precinct, $V_{i}^{f}$ of them cast by female voters. Then $\beta^{f}_{i}=V^{f}_{i}/F_{i}$ and $X_{i} = F_{i}/N_{i}$; multiplying through gives $\beta^{f}_{i}X_{i}=V^{f}_{i}/N_{i}$, the number of votes cast by women relative to the precinct's population. When we combine it with $\beta^{m}_{i}(1 - X_{i}) = (V_{i} - V^{f}_{i})/N_{i}$, we recover $T_{i} = V_{i}/N_{i}$, the voter turnout as a fraction of the precinct's population. So far, so good? Good!
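To make the bookkeeping concrete, here is a tiny numerical check of identity (1) using made-up precinct counts (all numbers hypothetical):

```python
# Hypothetical precinct counts (invented for illustration)
N, F = 2000, 1040      # adults in the precinct; F of them are women
V, V_f = 1200, 650     # votes cast; V_f of them cast by women

beta_f = V_f / F              # female turnout: votes by women / adult women
beta_m = (V - V_f) / (N - F)  # male turnout: votes by men / adult men
X = F / N                     # female share of the adult population
T = V / N                     # overall turnout

# Identity (1): T = X*beta_f + (1 - X)*beta_m
assert abs(T - (X * beta_f + (1 - X) * beta_m)) < 1e-12
print(T)  # 0.6
```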

Now the name of the game is to determine $\beta^{f}_{i}$ and $\beta^{m}_{i}$.

But this is an underdetermined system: we have 2 unknowns per precinct, and 1 equation per precinct. This is a fundamental weakness of ecological inference known as the indeterminacy problem. It's a serious problem, because we can "cook the books" to infer whatever we want. But let us try to soldier on, and see what we can determine.

Method of Bounds

For simplicity, I will be treating the entire United States as a single precinct. If you find this dodgy, well, the more sophisticated techniques of ecological inference stipulate worse, so buckle up.

We can place some bounds on $\beta^{f}$ (I'm dropping the index tracking precincts, since there's only one). If every man who could vote did vote, then we would have \begin{align} \frac{T - (1 - X)}{X} &= \frac{(V/N) - (N-F)/N}{F/N} \tag{2a} \\ &= \frac{V - (N-F)}{F}\leq\beta^{f}. \tag{2b} \end{align} The number of votes not cast by a man, $V - (N-F)$, relative to the population of adult women $F$ gives the lower bound for female voter turnout. This works provided $V-(N-F)\gt0$ (there are more votes than men). To handle the other case, we need \begin{equation}\tag{3} \max\left(\frac{T - (1 - X)}{X}, 0\right)\leq\beta^{f}. \end{equation} Working this out, we find the lower bound to be empirically \begin{equation}\tag{4} 0.32888\leq\beta^{f}, \end{equation} i.e., at least 32.9% of voting-age females cast a ballot in 2020.

Likewise, we can derive an upper bound where only women cast ballots (which makes sense when $F\gt V$, i.e., there are more women than votes): \begin{equation}\tag{5} \min(T/X, 1) \geq\beta^{f}. \end{equation} Empirically, there were more votes than women, so \begin{equation}\tag{6} 0.32888\leq\beta^{f}\leq 1. \end{equation} This isn't terribly enlightening: somewhere between 32.9% and 100% of voting-age females cast a ballot in 2020. (The bounds on male voter turnout are similarly between 31.9% and 100%.)
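The method-of-bounds interval can be computed mechanically. Here is a small sketch; the numbers in the usage line are hypothetical, not the 2020 figures:

```python
def turnout_bounds(T, X):
    """Deterministic bounds on female turnout beta_f, given overall
    turnout T and the female share X of the adult population
    (both as fractions). Returns (lower, upper)."""
    lower = max((T - (1 - X)) / X, 0.0)  # case: every man voted
    upper = min(T / X, 1.0)              # case: only women voted
    return lower, upper

# Hypothetical single-"precinct" numbers for illustration:
lo, hi = turnout_bounds(T=0.6, X=0.52)
print(lo, hi)  # about 0.2308 and 1.0
```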

If we want something more, we need to supply additional constraints by hand. For example, suppose the difference between the sexes' voter turnout rates is at most 10 percentage points; this would constrain the space of possibilities further. But we are now bringing in our own prior beliefs: the data doesn't tell us the difference in turnout is bounded by 10%, I literally just made that up because it sounds plausible.

Had we precinct-level data, we could take a weighted average of these bounds to infer district-level turnout by sex. Perhaps we will work through state-level considerations using county-level data in a future post.

The other thing I should point out is that the interval \begin{equation}\tag{7} \beta^{f}\in\left[\max\left(\frac{T - (1 - X)}{X}, 0\right),\; \min(T/X, 1)\right] \end{equation} is a 100% confidence interval. This fact isn't really taken advantage of sufficiently, in my opinion, but taking advantage of it requires imposing our beliefs on the statistics, which allows us to conclude anything we want.

Goodman's Regression Approach

Another approach is to start with our first identity \begin{equation}\tag{1} T_{i} = X_{i}\beta^{f}_{i} + (1 - X_{i})\beta^{m}_{i}. \end{equation} We then regress $T_{i}$ against $X_{i}$. Assuming the coefficients are constant across precincts, the identity reads $T_{i} = B^{m} + (B^{f} - B^{m})X_{i}$, so an ordinary least-squares fit with an intercept produces district-level estimates $B^{f}$ (and $B^{m}$) of the turnout by sex: $B^{m}$ is the intercept, and $B^{f}$ is the intercept plus the slope.

This is a linear regression, so a couple of caveats are worth noting:

  1. The estimates may produce unreasonable results, violating the deterministic bounds (e.g., turnout estimates below 0 or above 1). In fact, you should expect the estimates to be biased (in the technical, statistical sense).
  2. There is an implicit assumption that the coefficients are constant across precincts, which means the composition of a precinct does not affect voter turnout or voting behaviour. This may seem reasonable superficially, but urban precincts tend to behave differently than rural precincts, and this can produce unreasonable results.
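As a sketch of Goodman's regression on synthetic data (the "true" turnout rates below are invented, purely to check that OLS recovers them):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic precincts with constant (hypothetical) turnout rates by sex
true_bf, true_bm = 0.65, 0.60
X = rng.uniform(0.40, 0.60, size=500)    # female share per precinct
T = X * true_bf + (1 - X) * true_bm      # identity (1)...
T = T + rng.normal(0.0, 0.01, size=500)  # ...plus a little noise

# T_i = B^m + (B^f - B^m) X_i, so fit an ordinary least-squares line:
A = np.column_stack([np.ones_like(X), X])
(intercept, slope), *_ = np.linalg.lstsq(A, T, rcond=None)
B_m, B_f = intercept, intercept + slope
print(B_f, B_m)  # close to 0.65 and 0.60
```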

King's Approach

We can take some combination of the deterministic approach and Goodman's regression, to restrict values to $0\leq B^{f}\leq 1$ and likewise for men. That is to say, the coefficients (when plotted against each other) live in the unit square. (Or, for demographics with $n$-categories, like age, the unit $n$-cube.)

We transform the identity at the heart of Goodman's approach into the form: \begin{equation}\tag{8} \beta^{m}_{i} = \left(\frac{T_{i}}{1-X_{i}}\right) - \left(\frac{X_{i}}{1-X_{i}}\right)\beta^{f}_{i}. \end{equation} Since we know the $X_{i}$ and $T_{i}$, each precinct traces out a line in the unit square, and the collection of lines constrains the possible values of the $\beta^{m}_{i}$ and $\beta^{f}_{i}$ parameters. In order for us to extract information, we need to make three statistical assumptions.
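The segment of line (8) lying inside the unit square (King's "tomography line") can be computed directly; note its endpoints reproduce the method-of-bounds interval for $\beta^{f}$. A sketch, with hypothetical inputs:

```python
def tomography_segment(T, X):
    """Endpoints of the line beta_m = T/(1-X) - (X/(1-X)) * beta_f,
    clipped to the unit square of feasible (beta_f, beta_m) pairs."""
    beta_m = lambda bf: (T - X * bf) / (1 - X)
    bf_lo = max(0.0, (T - (1 - X)) / X)  # where beta_m reaches 1 (or bf = 0)
    bf_hi = min(1.0, T / X)              # where beta_m reaches 0 (or bf = 1)
    return (bf_lo, beta_m(bf_lo)), (bf_hi, beta_m(bf_hi))

# Hypothetical precinct: 60% turnout, 52% women
p0, p1 = tomography_segment(T=0.6, X=0.52)
print(p0, p1)  # roughly (0.2308, 1.0) and (1.0, 0.1667)
```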

Probably the most severe assumption in this approach is the demand of spatial homogeneity: the conditional random variables $T_{i}|X_{i}$ are independent and identically distributed across precincts. This sounds fine, but it stipulates rural voters behave indistinguishably from urban voters (and other demographics not considered behave indistinguishably from each other).

We tend to believe this is not the case (according to polling, election results, etc.). So we need to be careful about the domain of validity for King's regression.

The next assumption we need to make is that the precinct parameters have no cluster points, i.e., they are drawn from a single unimodal distribution. Concretely, we will be using a bivariate truncated normal distribution (or, if there are $n$ demographic categories, an $n$-variate truncated normal distribution).

The criticism of this assumption is that there's no reason to believe a truncated multivariate normal distribution is more appropriate than any other probability distribution on the unit cube. Arguably, it's not; Jaynes would argue we should use an entropy-maximizing distribution, for example. There's merit to Jaynes's argument: we are imposing a subjective prior belief onto our statistical analysis by choosing some probability distribution. We need to appeal to some statistical principle, or prove the results are independent of prior, or...

The last assumption we need to make is that there is no a priori aggregation bias; i.e., $X_{i}$ is [mean] independent of $\beta^{f}_{i}$ and $\beta^{m}_{i}$.

I don't have much to say about this, because King developed another model which weakens this condition (and lets us measure violations of it).

When the deterministic bounds on the demographic category of interest are relatively tight, all methods tend to coincide. But when the bounds are wide (like our example, with voter turnout by sex anywhere between roughly 33% and 100%), the conclusions drawn are largely model-dependent. This subtlety can lead to contradictory results.

Concluding Remarks

These three approaches constitute the "trunk" of ecological inference, from which we can form "branches" useful for whatever problem we're interested in. King really deserves credit for rejuvenating the tool, and a lot of modern work generalizes his approach.

But we should also realize there are many incredibly subtle (and easy to miss) opportunities for us to derive results which we were looking for ab initio. For this reason alone, ecological inference should be tested against other methods like Bayesian multilevel regression (with or without post-stratification), or even simpler methods when possible. It's easy to shoot yourself in the foot with ecological inference, and it's eager to do it, too.

Again, it is worth stressing: inferences drawn from ecological inference are either obvious or model-dependent. It's easy to impose your pre-existing beliefs onto the model without realizing it. You really shouldn't be using this for election analysis.

Sunday, May 16, 2021

Review: Kalman Filtered Senate Polls

Previously, we discussed Kalman filtering the polls. We will examine how well such a filter performed in the 2020 US Senate races.

Implementation Details

The basic details of the Kalman filter may be found in the previous blog post; here I just review some of the decisions I've made when implementing it in practice:

Polling Date. We treat the date for a poll as the midpoint between its start and end dates. To be clear, we are truncating timestamps to dates.

Polls are 4-vectors. We treat a poll as giving 4 data points: the percentage support for the Democratic candidate, the Republican candidate, all third-party candidates, and the undecided voters.

Third Parties. I treated all third parties as a single candidate. For polls which do not ask about third-party voters, we treat them as the margin of error.

Pooling the Polls. Following Jackman's "Pooling the Polls", we take a precision-weighted average of polls concluding on the same day. For a single poll released on a given end date, this amounts to renaming the variables and computing its covariance and precision matrices. For multiple polls, we invert each poll's covariance matrix to obtain its precision matrix, multiply each precision matrix by the associated polling result (as a 4-vector), sum the results (weighted by sample size), then multiply that sum by the matrix inverse of the sum of the precision matrices. This gives us the "effective responses" for each candidate.

Undecided voters. If a poll ignores undecided voters, we similarly treat them as the margin of error.
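The precision-weighted pooling step above can be sketched as follows; here each poll's sample size is assumed to be folded into its covariance matrix, and all the numbers are invented:

```python
import numpy as np

def pool_polls(means, covs):
    """Precision-weighted average of same-day polls.
    means: list of 4-vectors (Dem, Rep, third-party, undecided) as fractions.
    covs:  matching list of 4x4 covariance matrices (smaller for larger samples).
    Returns the pooled "effective response" vector and its covariance."""
    precisions = [np.linalg.inv(C) for C in covs]
    pooled_cov = np.linalg.inv(sum(precisions))
    pooled_mean = pooled_cov @ sum(P @ m for P, m in zip(precisions, means))
    return pooled_mean, pooled_cov

# Two invented same-day polls; the second has twice the precision.
m1 = np.array([0.48, 0.44, 0.03, 0.05])
m2 = np.array([0.45, 0.47, 0.03, 0.05])
C1 = np.eye(4) * 4e-4
C2 = np.eye(4) * 2e-4
mean, cov = pool_polls([m1, m2], [C1, C2])
print(mean)  # pulled 2/3 of the way toward the more precise poll
```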

Criteria for Assessing Estimates

We will look at the polls at two points in time: at the beginning of October, and a week before the election (i.e., October 27, 2020). The reason for this decision is that we're interested in whether the Democratic candidate could have acted to "course correct" before the election.

The assessment will compare the Democratic candidate's polling average with the votes cast on election day. This is because a lot of Republican supporters are nervous about publicly backing the Republican candidate, and tend to be swept into the "undecided" bucket.

We note that the estimate the Kalman filter produces is a multivariate normal distribution. Since we're interested in the Democratic candidate's performance, we can marginalize it to a univariate normal distribution. The 95% margin is given in parentheses next to the estimates.
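Marginalizing the filter's multivariate normal onto the Democratic component amounts to picking off the corresponding mean and variance; a sketch, with invented numbers:

```python
import math

def margin_95(mean, cov, index=0, z=1.96):
    """Univariate 95% interval for one component (default: the Democratic
    candidate at index 0) of a multivariate normal estimate."""
    mu = mean[index]
    sigma = math.sqrt(cov[index][index])
    return mu - z * sigma, mu + z * sigma

# Invented filter state: Dem at 48% with variance 4.84e-4 (sigma = 2.2%)
lo, hi = margin_95([0.48, 0.44, 0.03, 0.05],
                   [[4.84e-4, 0, 0, 0],
                    [0, 4e-4, 0, 0],
                    [0, 0, 1e-4, 0],
                    [0, 0, 0, 1e-4]])
print(lo, hi)  # about 0.4369 to 0.5231
```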

Further, we restrict focus to competitive senate seats. Inside Elections considered the following races as "competitive" [i.e., tilt or toss-up]: Arizona, Georgia, Iowa, Kansas, Maine, Montana, North Carolina, South Carolina.

Conclusion: The only state where Kalman filtering the polls produces significantly different results than the observed vote percentage is Maine, where Susan Collins overperformed (or Sarah Gideon drastically underperformed). All other senate race results coincide, within a prescribed margin of error, with the Kalman filtered polling.

Arizona

We plot the polling average as of October 1st:

The Polling averages as of October 27th:

Candidate          | Sept. 28       | Oct. 27         | Vote Percent
Mark Kelly (D)     | 46.8% (±4.32%) | 48.52% (±4.22%) | 51.16%
Martha McSally (R) | 38.4%          | 43.37%          | 48.81%
Third Party        | 4.71%          | 3.42%           | 0.03%
Undecided          | 10.0%          | 4.6%            | --

Georgia Regular

Jon Ossoff faced an uphill battle, but managed to pull off an unexpected victory.

Candidate        | Oct. 1         | Oct. 27         | Vote Percent
Jon Ossoff (D)   | 41.8% (±3.49%) | 44.87% (±3.40%) | 47.9%
David Perdue (R) | 46.8%          | 42.57%          | 49.7%
Third Party      | 2.95%          | 4.08%           | 2.4%
Undecided        | 8.45%          | 8.47%           | --

Iowa

The polls were fairly accurate for Ms Greenfield, at least with Kalman filtering. We should note the large number of undecideds in the polls reflects the surprisingly large noise.

Candidate              | Sept. 26       | Oct. 25         | Vote Percent
Theresa Greenfield (D) | 47.1% (±4.51%) | 45.63% (±3.82%) | 45.15%
Joni Ernst (R)         | 44.0%          | 46.10%          | 51.74%
Third Party            | 1.39%          | 2.89%           | 3.11%
Undecided              | 7.44%          | 5.39%           | --

Maine

Senator Susan Collins remained stable in the polls around 41% until the very end of October, whereas her Democratic challenger fluctuated between 40% and 50%.

Candidate         | Sept. 29        | Oct. 25         | Vote Percent
Sara Gideon (D)   | 45.69% (±6.05%) | 50.89% (±5.27%) | 42.39%
Susan Collins (R) | 42.17%          | 49.0%           | 50.98%
Third Party       | 0.72%           | 0.05%           | 6.63%
Undecided         | 11.42%          | 0.06%           | --

Montana

The polls in Montana were far more stable for Steve Bullock.

Candidate         | Oct. 2         | Oct. 26         | Vote Percent
Steve Bullock (D) | 46.01% (±4.3%) | 46.38% (±4.21%) | 45.0%
Steve Daines (R)  | 45.40%         | 45.72%          | 55.0%
Third Party       | 3.66%          | 3.31%           | 0%
Undecided         | 4.93%          | 4.60%           | --

North Carolina

We plot the polling average as of October 1st:

The polling average fluctuated wildly in October, so wildly that it's hard to see in the plots I could produce. Remember, Cal Cunningham landed in hot water over an extramarital affair. The plot produced October 30, 2020, shows fluctuations between 45% and 48% for Cunningham:

Candidate           | Sept. 26     | Oct. 27      | Vote Percent
Cal Cunningham (D)  | 50.36% (±4%) | 47.0% (±2.8%) | 46.9%
Thom Tillis (R)     | 39.30%       | 46.5%         | 48.7%
Third Party         | 4.19%        | 1.42%         | 4.4%
Undecided           | 6.15%        | 5.12%         | --

South Carolina

It's not clear what caused Sen. Graham's support to increase over time, but I suspect undecideds "came back home" to Sen. Graham (for whatever reason). Support for Jaime Harrison fluctuated around 44% (with a standard deviation of about ±2%) throughout October, judging from the Kalman filtered results.

Candidate          | Oct. 2          | Oct. 26         | Vote Percent
Jaime Harrison (D) | 44.85% (±3.9%)  | 41.62% (±3.6%)  | 44.17%
Lindsey Graham (R) | 44.82%          | 49.89%          | 54.44%
Third Party        | 3.08%           | 3.38%           | 1.39%
Undecided          | 7.25%           | 5.10%           | --

Saturday, February 13, 2021

Zettelkasten

I am going to try to describe how Niklas Luhmann worked his Zettelkasten, then examine his motivations. Luhmann's zettelkasten is basically a system for "growing" a corpus of notes based on "synoptical reading" (as Mortimer Adler would call it).

Mechanics

Take a slip of paper (zettel). We will refer to the slip of paper as a "card", but it could be half a piece of copy paper. An economical choice is to take letter paper, and cut it into 4.25-by-5.5 inch quarters.

Write an "atomic" idea on it (in the sense that, it fits on one side of the slip of paper and is self-contained). What was just written is the "content" of the card. A good heuristic for writing content is to write the card to someone (a pen-pal) who doesn't know the topic. Think of the cards as analogous to "tweets". On the upper right-hand part of the card, write the topic of the card.

We also write in the upper left-hand corner a unique "ID" number on the card. There is no role for an ID other than to make reference to cards possible. We sequentially write numbers on these cards starting at 1, 2, 3, etc.

We may have a "thread" (i.e., a sequence of cards all pertaining to the same general thread of thought), and we indicate this with a slash ("/") followed by the position in the thread. There is at most 1 slash in an ID. For example, if the first card is really a thread, then the cards in the thread are given the IDs "1/1", "1/2", "1/3", etc.

Remark. We should pick the topics for "top-level" threads which are sufficiently general, but they need not form a "Dewey decimal" classification of topics. For example, in my Zettelkasten, the first thread is "1/1 Zettelkasten" which discusses all the conventions I've adopted for my Zettelkasten. The next thread is "2/1 System and Method" discussing the notion of a "System" in philosophy, general systems theory, etc., and "Method" which constitutes the operations and heuristics which "generate" a "System". This leads to discussion of "2/2 Language", "2/3 Explication", "2/4 Evidence", and so on. Then I have "3/1 (American) Political Science" which discusses "3/2 Public Opinion", "3/3 Elections and Campaigns", "3/4 Voting Theory and Behaviour", etc. Note the top-level threads begin with sufficiently general fields of discussion, and subfields constitute "thread items". (Later we see that branching can be used to discuss the main topics in the subfield, which can be explored in a subthread.)

But other times we want to elucidate an aspect of a card, which may be divergent from our thread. For example, I want to elucidate an aspect of my card with ID "1/2", then I add a card with ID "1/2a" after "1/2" but before "1/3". This new card with ID "1/2a" is called a "branch". On the original card "1/2", Luhmann wrote in red "(a)" near the text which is elucidated in "1/2a". (If I want to elucidate an aspect of a card, but the card is not in a thread, I transform it into a thread by appending a "1" to the original card. Example: I am looking at the card with ID "2/12a" and want to have a branch off this card. First I rewrite the ID to be "2/12a1", then I write the ID for the branch "2/12a1a".) There is no significance to lowercase letters in IDs, it could be uppercase if that's easier to distinguish from numerals.

A branch could also be a thread, we just append the position in the thread sequence. Continuing the example, I would have "1/2a1", "1/2a2", "1/2a3", etc. We still write just "(a)" in red, even for the branch "1/2a1" which is a thread. Think of this as analogous to "relative paths" in file systems, or "relative URLs" for links in HTML.

This scheme is "compositional", in the sense that I could iterate this thread-then-branch pattern however deep I want. The IDs alternate between numerals and letters, always starting at "1" for numerals and "a" for letters. For example, a branch on "1/2a3" would be "1/2a3a".
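The alternating numeral/letter scheme gives IDs a natural filing order. As a sketch, here is a sort key for such IDs (the regex-based parsing is my own assumption about what a well-formed ID looks like):

```python
import re

def zettel_key(card_id):
    """Sort key for Luhmann-style IDs like '1/2a3', so that a branch
    ('1/2a') files directly behind its parent ('1/2') and before '1/3'."""
    thread, _, rest = card_id.partition("/")
    key = [(0, int(thread))]
    for tok in re.findall(r"\d+|[a-z]+", rest):
        if tok.isdigit():
            key.append((0, int(tok)))       # position within a thread
        else:
            key.append((1, len(tok), tok))  # branch letter(s)
    return key

cards = ["1/3", "2/1", "1/2a1", "1/2", "1/2a"]
print(sorted(cards, key=zettel_key))
# ['1/2', '1/2a', '1/2a1', '1/3', '2/1']
```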

Now, lacking a classification of content or some grouping of concepts, we need a way to refer to concepts crystalized in the cards already in our Zettelkasten: we want to link to cards. This is done by writing, in parentheses, the ID number we want to reference. And this can be done referring to any card in the system, even ones that come after the referring card. For example, card "1/3" could refer to card "7/12" without a problem.

Lacking any ordering, it quickly becomes hard to remember where concepts are recorded once we have a dozen threads or so. We manage our knowledge through special cards called register cards. These are "entry points", portals to clusters of knowledge. A register card consists of an optional brief summary, followed by a list of tags or keywords with links. Not every card will appear on a register, otherwise we'll double the size of our zettelkasten. It is unclear to me if registers have a special placement in the zettelkasten, or if they belong to an external box.

Luhmann maintained a bibliography external to the Zettelkasten, as he described in his essay Communicating with Slip Boxes:

Another complementary aid can be the bibliographical apparatus. Bibliographical notes which we extract from the literature, should be captured inside the card index. Books, articles, etc., which we have actually read, should be put on a separate slip with bibliographical information in a separate box. You will then not only be able to determine after some time what you actually read and what you only noted to prepare reading, but you can also add numbered links to the notes, which are based on this work or were suggested by it.

Personally, I have gone back-and-forth between having an external bibliography versus an internal one. It depends on whether we view a book (or article) as a thing worth thinking about in its own right. For example, if we want to note Dirac's book Principles of Quantum Mechanics was completely rewritten in the second edition, that the first edition had material remarkably similar to the many-worlds interpretation, etc., then such notes belong in the Zettelkasten and quite ostensibly serves as the "bibliography entry" for the book. The argument is stronger in the social sciences and philosophy, since texts form a dialogue among each other, constantly responding to the context shaped by previous work.

I do write on the back of a slip the "works cited", including edition of book, pages relevant, etc. This should be sufficiently documented so if, say, the library I used burned down with no books surviving, I should be able to consult another library and find the original source. But the content of the slip should also contain enough information that this should not be necessary.

Bibliographic Notes

Ahrens's How to Take Smart Notes, the only English book on Zettelkastens, observes that when Luhmann read a book, he wouldn't underline anything, nor make marginal notes. Instead, Luhmann wrote down on a slip all these thoughts. This slip contains a citation to the book being read, and would be stored in a separate slip box which Luhmann referred to as his bibliography system.

The idea behind writing notes like this is not to take "close notes", extended quotes, etc. I am uncertain what exactly constitutes the notes written down here, perhaps something as simple as the ideas presented. We write in such a way so as to best facilitate the next step: writing slips for our Zettelkasten.

Motivation

Niklas Luhmann, in his essay Learning How to Read, argues:

The problem of reading theoretical texts seems to consist in the fact that they do not require just short-term memory but also long-term memory in order to be able to distinguish between what is essential and what is not essential and what is new from what is merely repeated. But one cannot remember everything. This would simply be learning by heart. In other words, one must read very selectively and must be able to extract extensively networked references. One must be able to understand recursions. But how can one learn these skills, if no instructions can be given; or perhaps only about things that are unusual like “recursion” in the previous sentences as opposed to “must”?

Perhaps the best method would be to take notes—not excerpts, but condensed reformulations of what has been read. The re-description of what has already been described leads almost automatically to a training of paying attention to “frames,” or schemata of observation, or even to noticing conditions which lead the text to offer some descriptions but not others. What is not meant, what is excluded when something is asserted? If the text speaks of “human rights,” what is excluded by the author? Non-human rights? Human duties? Or is it comparing cultures or historical times that did not know human rights and could live very well without them?

This leads to another question: what are we to do with what we have written down? Certainly, at first we will produce mostly garbage. But we have been educated to expect something useful from our activities and soon lose confidence if nothing useful seems to result. We should therefore reflect on whether and how we arrange our notes so that they are available for later access. At least this should be a consoling illusion. This requires a computer or a card file with numbered index cards and an index. The constant accommodation of notes is then a further step in our working process. It costs time, but it is also an activity that goes beyond the mere monotony of reading and incidentally trains our memory.

I read this to mean, when reading a book we should take notes. We can try to understand the author's terms, or "normalize" the terminology for comparison with other authors. Using Mortimer Adler's terminology, we can do an analytical reading or a synoptical reading (respectively).

Either way, we should produce something from reading. Armed with these notes, we should reflect on whether and how we arrange our notes so that they are available for later access. [...] The constant accommodation of notes is then a further step in our working process.

Luhmann's Zettelkasten is then designed specifically for this constant accommodation of notes. The numbering scheme for IDs enables this "organic growth"...for most subjects.

An Historian's "Zettelkasten"

The historian Douglas Southall Freeman apparently used a zettelkasten-type system when researching material for his books. David E. Johnson's biography of the man, Douglas Southall Freeman, describes Freeman's method of assembling materials, notes, and outlines before writing his text. The basic algorithm is as follows:

  1. Read the source material
  2. “Once it was determined that a letter, book, or manuscript contained information that might be useful, the next decision was what type of note to make of it. There were three categories of notes: ‘Now or Never Notes’ which contained ‘absolutely necessary’ information; ‘Maybe Notes’ with a brief summary of the information; and ‘Companion Notes’ that gave the pages and citations to a source close at hand.” (329)
  3. “Whatever the category, note taking was done in a consistent form. The cards—called ‘quarter sheets’—were 5.5-by-4.25 inches [i.e., literally a quarter of a piece of American writing paper]. In the upper left corner of the card the date was noted—year, month, day. In the upper right, the source and page citation. The subject was written in the center-top of the card. A brief abstract of the contents of the item was typed across the card.” (329) An example from Freeman’s research on George Washington looked like:

    1781, Sept. 13 ADVANCE J.Trumball’s Diary, 333
    Leave Mount Vernon, between Colchester and Dumfries, meet letters that report action between two fleets. French have left Bay in pursuit—event not known. “Much agitated.”
  4. “Supplementing the cards were ‘long sheets’ held in three-ring binders. The long sheets contained more details from the source; often including entire letters or lengthy extracts. The cards were filed chronologically, the long sheets by topic.” (330)
  5. Once a particular source has been thoroughly examined, Freeman later in his life (while working on his biography of George Washington) numbered the cards with something called a “numbering machine”.
  6. “With the cards and long sheets complete, Freeman recorded the information in another notebook, a sort of working outline. In these entries, key words were capitalized so he could tell at a glance what his subject was dealing with at a particular moment. His notebook page for George Washington on August 17, 1775, has two words capitalized: POWDER and QUARTERMASTER.” (330)
  7. “The cards, long sheets, and notebooks were cross-referenced and carried identical numbers if they touched on the same topic. For a fact to be lost or slip through the cracks would require failure at four different places.” (330)

This seems to describe a system remarkably similar to Luhmann's zettelkasten: unique numbering of cards, linked by card-number, an external bibliography system (with source material).

What is worth emphasizing is the numbering is written on the cards when the research is finished and not before. This contrasts with Luhmann's approach, where IDs may be written as one develops notes (as opposed to waiting to finish notes on a topic).

How to get started?

I'd like to now describe what I do with my zettelkasten. I've intentionally modeled/imitated Luhmann's approach, but I have some differences. Before discussing the method, I will discuss the mechanics underlying the physical notes themselves.

First, the mechanics: I use plain paper for my zettelkasten, specifically letter paper folded into quarters, which I then slice into slips with scissors. When taking notes, I therefore carry around with me a ream of paper and a pair of scissors.

Now, for the method: on a blank slip, I leave space in the upper left-hand corner for the ID, and write the subject or topic in the upper right-hand corner. In the body of the slip I write notes. But this is not merely text: I also include "mini-spreadsheets" if I need tabular data for later use (e.g., vote results for a state across multiple elections), or maps (for geographical concerns), or caricatures (because I don't use a printer and I need to remember what someone looks like).

The back of the slip stores metadata. I write on the back after rotating it 90-degrees (so the slip now resembles a "miniature letter sheet"). The metadata stored helps me remember where I obtained information, and where I use the slip.

For bibliographic citations, I write them at the top of the back of the slip (specifically, what references I actually used, the page number, etc.). One quirk: the page number is of the form p.x, where x is a digit indicating how far down the page (in tenths) the reference sits (so "10.5" is half-way down page 10, "27.3" is about 30% from the top of page 27, "61.9" is near the bottom of page 61).

Also on the back of the slip, starting at the bottom, I am trying to write "back links". For example, if I am writing text on slip 65/2 and I link/cite slip 42/72, then I write on the bottom of the back of slip 42/72 in black ink "(65/2)". Thus I get a list of "back-references" telling me where this card is used elsewhere in my zettelkasten.

How many "Threads" to have?

I have since learned that Luhmann had two zettelkastens. Johannes Schmidt reports the first Zettelkasten consists of approximately 23,000 cards, which are divided into 108 sections ['threads'] by subjects and numbered consecutively. Schmidt informs us the second had eleven top-level sections with a total of about 100 subsections.

How many threads should we have in our Zettelkasten? The correct answer is: as many as we need.

For physical Zettelkastens, we are constrained to fit the ID in a limited amount of space. This requires some careful thought about how to organize the threads. I would like to quote Schmidt on this matter:

According to Luhmann the collection is a “combination of disorder and order, of clustering and unpredictable combinations emerging from ad hoc selection.” Of course the file collection is not simply a chaotic compilation of notes but an aggregation of a vast number of cards on specific concepts and topics. This order per subject area on a top level is reflected in the first number assigned to the card followed by a comma (first collection) or slash (second collection) that separates it from the rest of the number given each card (see below). The first collection features 108 sections differentiated by subject areas, exploring and reflecting on largely predetermined, fairly detailed fields of knowledge in law, administrative sciences, philosophy and sociology, such as state, equality, planning, power, constitution, revolution, hierarchy, science, role, concept of world, information, and so on. The second collection, by design, is quite more problem-oriented, reflecting the emerging sociological interests of Luhmann: It consists of only eleven top-level subject areas: organizational theory, functionalism, decision theory, office, formal/informal order, sovereignty/state, individual concepts/individual problems, economy, ad hoc notes, archaic societies, advanced civilizations. What this compilation immediately illustrates is that it is not a system of order in the sense of an established taxonomy but a historical product of Luhmann’s reading and research interests especially in the 1960s. Following the subject areas defined at the top level are other subsections that revolve around a variety of topics. The relationship between the top-level subject area and the lower-level subjects cannot be described in terms of a strictly hierarchical order, it is rather a form of loose coupling insofar as one can find lower-level subjects which do not fit systematically to the top-level issue but show only marginally connections.

This is a result of the specific system of organization of the notes applied within these sections on a particular subject matter which ensures that the initial decision for a specific topic did not lead to a sequence of cards confined to that one topic: Whenever Luhmann came across an interesting idea about a secondary aspect on one of his cards, he pursued this idea by adding additional notes and inserted the respective card at that place in the existing sequence of cards. This method could be applied again to the card that had been inserted and so forth, the result being a sequence of cards leading thematically and conceptually farther and farther away from the initial subject and constitute their on subsection. Furthermore this technique enabled the collection not only to grow in absolute numbers, but to grow “inwardly” without the limitations of a systematically order.

But the positioning of larger subject areas as well as individual cards in the collection was not only the historical product of Luhmann’s reading interests and note-taking activities. It also owed to the difficulty of assigning an issue to one and only one single (top-level) subject, which is a matter of ambiguity or so to say conceptual indecisiveness. Luhmann solved this problem by seizing it as an opportunity: instead of subscribing to the idea of a systematic classification system, he opted for organizing entries based on the principle that they must have only some relation to the previous entry without also having to keep some overarching system in mind. One could say: there must be a local solution (i.e. connection or internal fit) only. This indicates, accordingly, that the positioning of a special subject within this system of organization reveals nothing about its theoretical importance — for there are no privileged positions in this web of notes: there is no top and no bottom.

The decision inherent in this filing technique without a fixed system of order is an essential prerequisite of the creativity of the filing system. In explaining his approach, Luhmann emphasized, with the first steps of computer technology in mind, the benefits of the principle of “multiple storage”: in the card index it serves to provide different avenues of accessing a topic or concept since the respective notes may be filed in different places and different contexts. Conversely, embedding a topic in various contexts gives rise to different lines of information by means of opening up different realms of comparison in each case due to the fact that a note is an information only in a web of other notes. Furthermore it was Luhmann’s intention to “avoid premature systematization and closure and maintain openness toward the future.” His way of organizing the collection allows for it to continuously adapt to the evolution of his thinking and his overall theory which as well is not conceptualized in a hierarchical manner but rather in a cybernetical way in which every term or theoretical concept is dependent on the other.

Choice of "Topics"

So Luhmann's attempts at using Zettelkastens began with some kind of Dewey decimal system of subjects, then in his second kasten merely a dozen or so subjects. Should we pick some set of "topics" for the high level threads?

The answer is, it depends on what you're doing. For a zettelkasten dedicated to a particular project (like, writing a book), this may not be wise. But for one's life-long project or education, this might prove useful.

Curiously, the Propædia provides one schematization for organizing all knowledge. Whether one agrees or not with such a schematization, it deserves serious consideration.

My personal opinion is the threads should be sufficiently general subjects. Their ordering doesn't matter (unlike the case of, e.g., the Propædia). Here's a small example of my Zettelkasten's organization.

  1. Zettelkasten
    1. Numbering Scheme
      1. Link
      2. Thread
      3. Branching
      4. Ontological Status of ID Numbers
    2. Register, Tag
    3. Principle of Atomicity
    4. Bibliography
    5. Slips of Paper, Index Cards
  2. Language
    1. Formal Language
    2. Language Game
    3. Pattern Language
      1. Pattern
      2. Context
      3. Forces
      4. Problem
  3. System and Method
        1. System
        2. Component
        3. Boundary
        1. Method
          1. Bare and Stylized Facts
        2. Scientific Methods
        3. Method as Component of System
    1. Language
      1. Formal Language
      2. Language Game
      3. Pattern Language
    2. Computation
  4. Programming
    1. Software Design
    2. Toy Model of a Computer
      1. Basics
      2. Memory Model
      3. CPU
      4. Disk Storage
      5. Secondary Devices
    3. Lisp
      1. Basics
      2. Memory Model
      3. Macros
      4. Domain-Specific Languages
      5. Correctness
    4. C
      1. Basics
      2. Rules to the Language
      3. Macro Tricks
      4. Memory Model
      5. ACSL
      6. Correctness
    5. Operating Systems
      1. xv6
      2. Linux
      3. FreeBSD
      4. DOS
    6. C++
    7. Make A Lisp (Literate C++)
    8. R
    9. Make An R (Literate C)
  5. Mathematics, Statistics
    1. Foundations of Mathematics
      1. Naive Set Theory
      2. Stuff, Structures, Properties
      3. First Order Logic
      4. Set Theory
      5. Type Theory
    2. Category Theory
    3. Linear Algebra
    4. Abstract Algebra
      1. Foundations (working with Stuff, Structure, and Properties, morphisms)
      2. Symmetries (Group Theory)
      3. Number Systems (Ring Theory)
      4. Modules over Rings
      5. Galois Theory
    5. Topology
    6. Measure Theory
    7. Combinatorics
      1. Foundations
      2. Enumerative Combinatorics
        1. Twelvefold Path
        2. Generating Functions
      3. Partition Theory
      4. Design Theory
    8. Probability
      1. Foundations
        1. Random Processes
        2. Events
        3. Sigma Algebras
        4. Probability
      2. Coin Flipping
      3. Classical Puzzles
        1. Birthday Paradox
        2. Coupon Collector Problem
        3. Monty Hall Problem
        4. Dice Problems
        5. Population Puzzles
      4. Random Variables
      5. Probability Distributions
    9. Statistics
      1. Data Collection
      2. Exploratory Data Analysis
      3. Statistical Inference
      4. Regression Analysis
      5. Bayesian Data Analysis

One thing that's surprising is, I recently became interested in bookbinding. This required discussing paper quite a bit, which turns out to be elucidate aspects of slip "1/5 Slips of Paper, Index Cards".

Best Practices

Or, "Things I wish I had-done/am-glad-to-have-done". This is a grocery list of observations, errors, lessons learned...in no particular order.

1. First Thread. The first thread should be about the topic of Zettelkastens. This also stores the reasoning for the conventions I have chosen to follow, the alternatives I had considered, and the arguments for and against the choices made.

2. Numbering Scheme. Dedicate some time thinking about the numbering scheme you'd like to use. If you're imitating Luhmann, what about when you have more than 26 aspects you'd like to drill down on? For example, if we have "1/1a", "1/1b", ..., "1/1z", what will the next card be? "1/1A", "1/1B", ..., "1/1Z" and then what? Or "1/1aa" and then "1/1ab"?

Or will you use a numbering scheme similar to Wittgenstein's Tractatus? (See, e.g., Warren M Tang's summary of the numbering scheme.) So "1.1" instead of "1/1"? Or "1/1.1" instead of "1/1a"? And then "1/1.11" to clarify an aspect of "1/1.1"? As Ogden's translation explains, The decimal figures as numbers of the separate propositions indicate the logical importance of the propositions, the emphasis laid upon them in my exposition. The propositions n.1, n.2, n.3, etc., are comments on proposition No. n; the propositions n.m1, n.m2, etc., are comments on the proposition No. n.m; and so on.

Or will you use a "modern" scheme using multiple periods? So "2.1" is a comment on "2", and "2.1.1" is a comment on "2.1", with each "component" of the number separated by a period, ordered sequentially? This jettisons alternating letters and numbers, which may not be good for one's memory. Also this seems to weaken the difference between (a) a slip in the middle of a thread, and (b) elaborating an aspect of a given slip. Is this advantageous for your needs or is it a disadvantage?

Remember, using either Wittgenstein's scheme or "modern" schemes does not easily facilitate "branching" (in Luhmann's sense). This jettisons one of the key aspects of a Zettelkasten. Is this worth it? Think very hard about this before committing yourself. Branches, I have found, are very useful when you want to elaborate on aspects of a slip...and such elaborations are not necessarily ordered in any way. (You should probably sit down and think if you want branches in your Zettelkasten; amendations, elucidations, forks, asides, all these and more are captured by "branches", but cannot adequately be captured by threads alone.)

3. Index Card vs Paper. For cards which may need replacement/updating/rewriting, use a quarter slip of ordinary paper. This is not as durable as index cards, because index cards consist of thicker paper. The "lifespan" of ordinary paper is far shorter than index cards, or so I have been told.

When will I know that a card needs replacement, updating, or rewriting? One instance is when a card "stores data". For example, my Zettelkasten includes recent presidential election results from, say, Pennsylvania. This will require updating after 2020.

Let me stress again, using paper will result in a zettel that will (with heavy usage) "wear out" in a limited time. I am writing my Zettelkasten with knowledge that using ordinary paper will require replacement.

4. Highlighter, Ink. Consider using highlighters on the edge of the cards. Different colors for different types of cards. (Red for register, yellow for people or organizations, etc.) This lets me look down at the zettelkasten and identify cards visually. But this works best on index cards, not plain paper.

Similarly, ink color could be used for particular significance. I reserve red ink for links, blue ink for defined terms, and so on. Whatever you choose to do, be consistent.

5. Write content first, IDs last. First, I usually write the notes on the slip (either the body or the title, or both), and when I'm done with a batch of slips, I ponder arrangement and clustering. While writing content, I sometimes realize I want a branch to elaborate on particular topics, so I'll insert links (or leave space for them later).

There are some times when I know I need to add notes on, e.g., "American Political Science". In this case, I find an introductory textbook, find the topics discussed (the Constitution, each of the three branches of government, the bureaucracy, public opinion, elections, political culture and geography, etc.) which give the entries in the thread, and branches then elaborate aspects of each entry. This forms the basis of knowledge which is then extended in advanced courses/topics (e.g., committees in congress, game theoretic models of committees, etc.).

Other times, I am relatively aimless. I don't know "where I'm going" with my notes on "System and Method". I explore a train of thought ("System", "Environment and Boundary", "Component", "Method", "Explication", "Analysis", "Bare and Stylized Facts", "Language", etc.), then I organize them and finally give them IDs.

6. Threads should be "sufficiently general subjects". In the example zettelkasten I have produced, we should note how general the top-level thread topics are: language, mathematics, computer programming, etc. (Statistically, Luhmann had on average about 213 cards per "top-level thread" in his first Zettelkasten, and roughly 670 cards per "top level thread" in his second; this is the general aim.)

Really, I'm just writing notes on slips as I am solving a particularly interesting problem, or reading a book. These zettels lack only an ID number. When I get home, I look at my zettelkasten to figure out which thread these slips belong to. I do not start with a schematization ab initio. But I invent new subjects/threads as it becomes relevant.

7. Literate Programming. It is possible to do something like Knuth's literate programming in your Zettelkasten. For example, Knuth's "section" or "chunk" corresponds to a "zettel", the code section may be done in markdown fences (i.e., demarcated by ```lang ...```). This really shines when correctness is a concern, because we can insert slips establishing the preconditions, postconditions, and invariants for the code in a section: correctness becomes a branched thread. (For thicker paper, or pens which don't bleed through the paper, one could use one side for the code section and the other for human readable comments.)

(I'm writing a book on automated theorem proving, so my needs for proving the correctness of a program are probably idiosyncratic here. But, as odd as it sounds, the Zettelkasten structure captures Knuth's approach amazingly well!)

References

This is by no means complete, and I realize I cite sources in this post but neglected to add them here. Apologies!