Saturday, February 13, 2021

Zettelkasten

I am going to try to describe how Niklas Luhmann worked his Zettelkasten, then examine motivations. Luhmann's zettelkasten is basically a system for "growing" a corpus of notes based off of "synoptical reading" (as Mortimer Adler would call it).

Mechanics

Take a slip of paper (zettel). We will refer to the slip of paper as a "card", but it could be half a piece of copy paper. An economical choice is to take letter paper, and cut it into 4.25-by-5.5 inch quarters.

Write an "atomic" idea on it (in the sense that it fits on one side of the slip of paper and is self-contained). What was just written is the "content" of the card. A good heuristic for writing content is to write the card to someone (a pen-pal) who doesn't know the topic. Think of the cards as analogous to "tweets". In the upper right-hand corner of the card, write the topic of the card.

We also write in the upper left-hand corner a unique "ID" number on the card. There is no role for an ID other than to make reference to cards possible. We sequentially write numbers on these cards starting at 1, 2, 3, etc.

We may have a "thread" (i.e., a sequence of cards all pertaining to the same general thread of thought), and we indicate this with a slash ("/") followed by the position in the thread. There is at most 1 slash in an ID. For example, if the first card is really a thread, then the cards in the thread are given the IDs "1/1", "1/2", "1/3", etc.

Remark. We should pick the topics for "top-level" threads which are sufficiently general, but they need not form a "Dewey decimal" classification of topics. For example, in my Zettelkasten, the first thread is "1/1 Zettelkasten" which discusses all the conventions I've adopted for my Zettelkasten. The next thread is "2/1 System and Method" discussing the notion of a "System" in philosophy, general systems theory, etc., and "Method" which constitutes the operations and heuristics which "generate" a "System". This leads to discussion of "2/2 Language", "2/3 Explication", "2/4 Evidence", and so on. Then I have "3/1 (American) Political Science" which discusses "3/2 Public Opinion", "3/3 Elections and Campaigns", "3/4 Voting Theory and Behaviour", etc. Note the top-level threads begin with sufficiently general fields of discussion, and subfields constitute "thread items". (Later we see that branching can be used to discuss the main topics in the subfield, which can be explored in a subthread.)

But other times we want to elucidate an aspect of a card, which may be divergent from our thread. For example, if I want to elucidate an aspect of my card with ID "1/2", then I add a card with ID "1/2a" after "1/2" but before "1/3". This new card with ID "1/2a" is called a "branch". On the original card "1/2", Luhmann wrote in red "(a)" near the text which is elucidated in "1/2a". (If I want to elucidate an aspect of a card, but the card is not in a thread, I transform it into a thread by appending a "1" to the original card's ID. Example: I am looking at the card with ID "2/12a" and want to have a branch off this card. First I rewrite the ID to be "2/12a1", then I write the ID for the branch "2/12a1a".) There is no significance to lowercase letters in IDs; they could be uppercase if that's easier to distinguish from numerals.

A branch could also be a thread; we just append the position in the thread sequence. Continuing the example, I would have "1/2a1", "1/2a2", "1/2a3", etc. We still write just "(a)" in red, even for the branch "1/2a1" which is a thread. Think of this as analogous to "relative paths" in file systems, or "relative URLs" for links in HTML.

This scheme is "compositional", in the sense that I could iterate this thread-then-branch pattern however deep I want. The IDs alternate between numerals and letters, always starting at "1" for numerals and "a" for letters. For example, a branch on "1/2a3" would be "1/2a3a".
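For anyone keeping a digital mirror of the box, here is a minimal Python sketch of the ID conventions above. The function names (parse_id, sort_key, first_branch, next_in_thread) are my own illustrative choices, and the sketch assumes every ID contains a slash and that letters and numerals strictly alternate after it.

```python
import re


def parse_id(card_id):
    """Split an ID such as "1/2a3" into [1, 2, "a", 3].

    The first item is the top-level thread number; the rest alternate
    between numerals (as ints) and letter groups (as strings).
    """
    top, _, rest = card_id.partition("/")
    parts = [int(top)]
    for token in re.findall(r"\d+|[A-Za-z]+", rest):
        parts.append(int(token) if token.isdigit() else token)
    return parts


def sort_key(card_id):
    """Filing order: "1/2a" goes after "1/2" but before "1/3"."""
    key = []
    for part in parse_id(card_id):
        if isinstance(part, int):
            key.append((0, part, ""))
        else:
            # Assumption: a longer letter group ("aa") files after "z".
            key.append((1, len(part), part))
    return key


def first_branch(card_id):
    """ID of the first branch off a card."""
    last = parse_id(card_id)[-1]
    if isinstance(last, str):
        # The card is not in a thread yet: it becomes "...1" first,
        # and the branch hangs off that renamed card.
        return card_id + "1a"
    return card_id + "a"


def next_in_thread(card_id):
    """ID of the card following this one in the same thread."""
    parts = parse_id(card_id)
    last = parts[-1]
    if len(parts) < 2 or not isinstance(last, int):
        raise ValueError(f"{card_id!r} does not end in a thread position")
    prefix = card_id[: len(card_id) - len(str(last))]
    return prefix + str(last + 1)
```

Sorting a shuffled pile with sorted(ids, key=sort_key) then reproduces the physical order of the box, which is the whole point of the alternating scheme.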

Now, lacking a classification of content or some grouping of concepts, we need to refer to concepts crystallized in the cards already in our Zettelkasten: we want to link to cards. This is done by writing, in parentheses, the ID number we want to reference. And this can be done referring to any card in the system, even ones that come after the referring card. For example, card "1/3" could refer to card "7/12" without a problem.

Lacking any ordering, it quickly becomes hard to remember where concepts are recorded once we have a dozen threads or so. We manage our knowledge through special cards called register cards. These are "entry points", portals to clusters of knowledge. A register card consists of an optional brief summary, followed by a list of tags or keywords with links. Not every card will appear on a register, otherwise we'll double the size of our zettelkasten. It is unclear to me if registers have a special placement in the zettelkasten, or if they belong to an external box.
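As a rough illustration, a register card can be modelled as an optional summary plus a mapping from keywords to card IDs. This is only a sketch in Python: the class name RegisterCard and its methods are hypothetical, and the keywords below are illustrative entries pointing at IDs mentioned earlier in this post.

```python
from dataclasses import dataclass, field


@dataclass
class RegisterCard:
    """An entry point: optional summary plus keyword -> card IDs."""

    title: str
    summary: str = ""  # the brief summary is optional
    entries: dict = field(default_factory=dict)

    def add(self, keyword, card_id):
        """Record that `keyword` can be followed to `card_id`."""
        self.entries.setdefault(keyword, []).append(card_id)

    def lookup(self, keyword):
        """Return the linked IDs, or an empty list for unknown keywords."""
        return self.entries.get(keyword, [])


register = RegisterCard("American Political Science")
register.add("public opinion", "3/2")
register.add("elections", "3/3")
```

Only a handful of cards get listed this way; the register points at the start of a cluster, and the links on the cards themselves do the rest.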

Luhmann maintained a bibliography external to the Zettelkasten, as he described in his essay Communicating with Slip Boxes:

Another complementary aid can be the bibliographical apparatus. Bibliographical notes which we extract from the literature, should be captured inside the card index. Books, articles, etc., which we have actually read, should be put on a separate slip with bibliographical information in a separate box. You will then not only be able to determine after some time what you actually read and what you only noted to prepare reading, but you can also add numbered links to the notes, which are based on this work or were suggested by it.

Personally, I have gone back-and-forth between having an external bibliography versus an internal one. It depends on whether we view a book (or article) as a thing worth thinking about in its own right. For example, if we want to note Dirac's book Principles of Quantum Mechanics was completely rewritten in the second edition, that the first edition had material remarkably similar to the many-worlds interpretation, etc., then such notes belong in the Zettelkasten and quite ostensibly serve as the "bibliography entry" for the book. The argument is stronger in the social sciences and philosophy, since texts form a dialogue among each other, constantly responding to the context shaped by previous work.

I do write on the back of a slip the "works cited", including edition of book, pages relevant, etc. This should be sufficiently documented so if, say, the library I used burned down with no books surviving, I should be able to consult another library and find the original source. But the content of the slip should also contain enough information that this should not be necessary.

Bibliographic Notes

Ahrens's How to Take Smart Notes, the only English book on Zettelkastens, observes that when Luhmann read a book, he wouldn't underline anything, nor make marginal notes. Instead, Luhmann wrote down on a slip all these thoughts. This slip contains a citation to the book being read, and would be stored in a separate slip box which Luhmann referred to as his bibliography system.

The idea behind writing notes like this is not to take "close notes", extended quotes, etc. I am uncertain what exactly constitutes the notes written down here, perhaps something as simple as the ideas presented. We write in such a way so as to best facilitate the next step: writing slips for our Zettelkasten.

Motivation

Niklas Luhmann, in his essay Learning How to Read, argues:

The problem of reading theoretical texts seems to consist in the fact that they do not require just short-term memory but also long-term memory in order to be able to distinguish between what is essential and what is not essential and what is new from what is merely repeated. But one cannot remember everything. This would simply be learning by heart. In other words, one must read very selectively and must be able to extract extensively networked references. One must be able to understand recursions. But how can one learn these skills, if no instructions can be given; or perhaps only about things that are unusual like “recursion” in the previous sentences as opposed to “must”?

Perhaps the best method would be to take notes—not excerpts, but condensed reformulations of what has been read. The re-description of what has already been described leads almost automatically to a training of paying attention to “frames,” or schemata of observation, or even to noticing conditions which lead the text to offer some descriptions but not others. What is not meant, what is excluded when something is asserted? If the text speaks of “human rights,” what is excluded by the author? Non-human rights? Human duties? Or is it comparing cultures or historical times that did not know human rights and could live very well without them?

This leads to another question: what are we to do with what we have written down? Certainly, at first we will produce mostly garbage. But we have been educated to expect something useful from our activities and soon lose confidence if nothing useful seems to result. We should therefore reflect on whether and how we arrange our notes so that they are available for later access. At least this should be a consoling illusion. This requires a computer or a card file with numbered index cards and an index. The constant accommodation of notes is then a further step in our working process. It costs time, but it is also an activity that goes beyond the mere monotony of reading and incidentally trains our memory.

I read this to mean, when reading a book we should take notes. We can try to understand the author's terms, or "normalize" the terminology for comparison with other authors. Using Mortimer Adler's terminology, we can do an analytical reading or a synoptical reading (respectively).

Either way, we should produce something from reading. Armed with these notes, in Luhmann's words, "we should reflect on whether and how we arrange our notes so that they are available for later access. [...] The constant accommodation of notes is then a further step in our working process."

Luhmann's Zettelkasten is then designed specifically for this constant accommodation of notes. The numbering scheme for IDs enables this "organic growth"...for most subjects.

An Historian's "Zettelkasten"

The historian Douglas Southall Freeman apparently used a zettelkasten-type system when researching material for his books. David E. Johnson's biography of the man, Douglas Southall Freeman, describes Freeman's method of assembling materials, notes, and outlines before writing his text. The basic algorithm is as follows:

  1. Read the source material
  2. “Once it was determined that a letter, book, or manuscript contained information that might be useful, the next decision was what type of note to make of it. There were three categories of notes: ‘Now or Never Notes’ which contained ‘absolutely necessary’ information; ‘Maybe Notes’ with a brief summary of the information; and ‘Companion Notes’ that gave the pages and citations to a source close at hand.” (329)
  3. “Whatever the category, note taking was done in a consistent form. The cards—called ‘quarter sheets’—were 5.5-by-4.25 inches [i.e., literally a quarter of a piece of American writing paper]. In the upper left corner of the card the date was noted—year, month, day. In the upper right, the source and page citation. The subject was written in the center-top of the card. A brief abstract of the contents of the item was typed across the card.” (329) An example from Freeman’s research on George Washington looked like:

    1781, Sept. 13 ADVANCE J.Trumball’s Diary, 333
    Leave Mount Vernon, between Colchester and Dumfries, meet letters that report action between two fleets. French have left Bay in pursuit—event not known. “Much agitated.”
  4. “Supplementing the cards were ‘long sheets’ held in three-ring binders. The long sheets contained more details from the source; often including entire letters or lengthy extracts. The cards were filed chronologically, the long sheets by topic.” (330)
  5. Once a particular source has been thoroughly examined, Freeman later in his life (while working on his biography of George Washington) numbered the cards with something called a “numbering machine”.
  6. “With the cards and long sheets complete, Freeman recorded the information in another notebook, a sort of working outline. In these entries, key words were capitalized so he could tell at a glance what his subject was dealing with at a particular moment. His notebook page for George Washington on August 17, 1775, has two words capitalized: POWDER and QUARTERMASTER.” (330)
  7. “The cards, long sheets, and notebooks were cross-referenced and carried identical numbers if they touched on the same topic. For a fact to be lost or slip through the cracks would require failure at four different places.” (330)

This seems to describe a system remarkably similar to Luhmann's zettelkasten: unique numbering of cards, linked by card-number, an external bibliography system (with source material).

What is worth emphasizing is the numbering is written on the cards when the research is finished and not before. This contrasts with Luhmann's approach, where IDs may be written as one develops notes (as opposed to waiting to finish notes on a topic).

How to get started?

I'd like to now describe what I do with my zettelkasten. I've intentionally modeled/imitated Luhmann's approach, but I have some differences. Before discussing the method, I will discuss the mechanics underlying the physical notes themselves.

First, the mechanics: I use plain paper for my zettelkasten. Specifically, I fold letter paper into quarters, then use scissors to cut it into slips. When taking notes, I therefore carry around with me a ream of paper and a pair of scissors.

Now, for the method: with a blank slip, I leave space in the upper left-hand corner for the ID, and write the subject or topic in the upper right-hand corner. In the body of the slip I write notes. But this is not merely text; I also include "mini-spreadsheets" if I need tabular data for later use (e.g., vote results for a state across multiple elections), or maps (for geographical concerns), or caricatures (because I don't use a printer and I need to remember what someone looks like).

The back of the slip stores metadata. I write on the back after rotating it 90-degrees (so the slip now resembles a "miniature letter sheet"). The metadata stored helps me remember where I obtained information, and where I use the slip.

For bibliographic citations, I write them at the top of the back of the slip (specifically, which references I actually used, the page number, etc.). One quirk: the page number is of the form pp.x, where pp is the page and x is a single digit indicating how many tenths of the way down the page the reference sits (so "10.5" is half-way down page 10, "27.3" is about 30% from the top of page 27, "61.9" is near the bottom of page 61).
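If these page references ever get typed up, they are best kept as text rather than numbers, since "10.5" is a page plus a position and not a decimal quantity. A small Python sketch of decoding one follows; the function name parse_locator is my own, and the one-digit restriction is how I read my own convention.

```python
def parse_locator(locator):
    """Decode "27.3" into (27, 0.3): page 27, three tenths down the page."""
    page, _, tenth = locator.partition(".")
    if not (page.isdigit() and len(tenth) == 1 and tenth.isdigit()):
        raise ValueError(f"expected page.digit, got {locator!r}")
    return int(page), int(tenth) / 10
```

So parse_locator("61.9") gives page 61 and a fraction of 0.9, i.e. near the bottom of the page.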

Also on the back of the slip, starting at the bottom, I am trying to write "back links". For example, if I am writing text on slip 65/2 and I link/cite slip 42/72, then I write on the bottom of the back of slip 42/72 in black ink "(65/2)". Thus I get a list of "back-references" telling me where this card is used elsewhere in my zettelkasten.
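The same bookkeeping is easy to sketch for a typed-up copy of the slips. The Python below assumes links are written as a parenthesised ID containing a slash, such as "(42/72)", so the red branch marks like "(a)" are ignored. The function name back_links and the sample slip texts are placeholders of my own.

```python
import re
from collections import defaultdict

# A link is a parenthesised ID such as "(42/72)" or "(1/2a3)".
LINK = re.compile(r"\((\d+/[0-9A-Za-z]+)\)")


def back_links(cards):
    """Map each cited ID to the sorted IDs of the slips that cite it.

    `cards` maps a slip's ID to the text written on its front.
    """
    index = defaultdict(set)
    for card_id, text in cards.items():
        for target in LINK.findall(text):
            if target != card_id:  # a slip citing itself is not a back link
                index[target].add(card_id)
    return {target: sorted(sources) for target, sources in index.items()}


slips = {
    "65/2": "placeholder text citing (42/72)",
    "42/72": "placeholder text with no links",
}
```

Here back_links(slips) reports that "42/72" is cited from "65/2", which is exactly what gets written in black ink at the bottom of the back of slip 42/72.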

How many "Threads" to have?

I have since learned that Luhmann had two zettelkastens. Johannes Schmidt reports the first Zettelkasten consists of approximately 23,000 cards, which are divided into 108 sections ['threads'] by subjects and numbered consecutively. Schmidt informs us the second had eleven top-level sections with a total of about 100 subsections.

How many threads should we have in our Zettelkasten? The correct answer is, As many as we need.

For physical Zettelkastens, we are constrained to fit the ID in a limited amount of space. This requires some careful thought about how to organize the threads. I would like to quote Schmidt on this matter:

According to Luhmann the collection is a “combination of disorder and order, of clustering and unpredictable combinations emerging from ad hoc selection.” Of course the file collection is not simply a chaotic compilation of notes but an aggregation of a vast number of cards on specific concepts and topics. This order per subject area on a top level is reflected in the first number assigned to the card followed by a comma (first collection) or slash (second collection) that separates it from the rest of the number given each card (see below). The first collection features 108 sections differentiated by subject areas, exploring and reflecting on largely predetermined, fairly detailed fields of knowledge in law, administrative sciences, philosophy and sociology, such as state, equality, planning, power, constitution, revolution, hierarchy, science, role, concept of world, information, and so on. The second collection, by design, is quite more problem-oriented, reflecting the emerging sociological interests of Luhmann: It consists of only eleven top-level subject areas: organizational theory, functionalism, decision theory, office, formal/informal order, sovereignty/state, individual concepts/individual problems, economy, ad hoc notes, archaic societies, advanced civilizations. What this compilation immediately illustrates is that it is not a system of order in the sense of an established taxonomy but a historical product of Luhmann’s reading and research interests especially in the 1960s. Following the subject areas defined at the top level are other subsections that revolve around a variety of topics. The relationship between the top-level subject area and the lower-level subjects cannot be described in terms of a strictly hierarchical order, it is rather a form of loose coupling insofar as one can find lower-level subjects which do not fit systematically to the top-level issue but show only marginally connections.

This is a result of the specific system of organization of the notes applied within these sections on a particular subject matter which ensures that the initial decision for a specific topic did not lead to a sequence of cards confined to that one topic: Whenever Luhmann came across an interesting idea about a secondary aspect on one of his cards, he pursued this idea by adding additional notes and inserted the respective card at that place in the existing sequence of cards. This method could be applied again to the card that had been inserted and so forth, the result being a sequence of cards leading thematically and conceptually farther and farther away from the initial subject and constitute their on subsection. Furthermore this technique enabled the collection not only to grow in absolute numbers, but to grow “inwardly” without the limitations of a systematically order.

But the positioning of larger subject areas as well as individual cards in the collection was not only the historical product of Luhmann’s reading interests and note-taking activities. It also owed to the difficulty of assigning an issue to one and only one single (top-level) subject, which is a matter of ambiguity or so to say conceptual indecisiveness. Luhmann solved this problem by seizing it as an opportunity: instead of subscribing to the idea of a systematic classification system, he opted for organizing entries based on the principle that they must have only some relation to the previous entry without also having to keep some overarching system in mind. One could say: there must be a local solution (i.e. connection or internal fit) only. This indicates, accordingly, that the positioning of a special subject within this system of organization reveals nothing about its theoretical importance — for there are no privileged positions in this web of notes: there is no top and no bottom.

The decision inherent in this filing technique without a fixed system of order is an essential prerequisite of the creativity of the filing system. In explaining his approach, Luhmann emphasized, with the first steps of computer technology in mind, the benefits of the principle of “multiple storage”: in the card index it serves to provide different avenues of accessing a topic or concept since the respective notes may be filed in different places and different contexts. Conversely, embedding a topic in various contexts gives rise to different lines of information by means of opening up different realms of comparison in each case due to the fact that a note is an information only in a web of other notes. Furthermore it was Luhmann’s intention to “avoid premature systematization and closure and maintain openness toward the future.” His way of organizing the collection allows for it to continuously adapt to the evolution of his thinking and his overall theory which as well is not conceptualized in a hierarchical manner but rather in a cybernetical way in which every term or theoretical concept is dependent on the other.

Choice of "Topics"

So Luhmann's attempts at using Zettelkastens began with some kind of Dewey decimal system of subjects, then in his second kasten merely a dozen or so subjects. Should we pick some set of "topics" for the high level threads?

The answer is, it depends on what you're doing. For a zettelkasten dedicated to a particular project (like, writing a book), this may not be wise. But for one's life-long project or education, this might prove useful.

Curiously, the Propædia provides one schematization for organizing all knowledge. Whether one agrees or not with such a schematization, it deserves serious consideration.

My personal opinion is the threads should be sufficiently general subjects. Their ordering doesn't matter (unlike the case of, e.g., the Propædia). Here's a small example of my Zettelkasten's organization.

  1. Zettelkasten
    1. Numbering Scheme
      1. Link
      2. Thread
      3. Branching
      4. Ontological Status of ID Numbers
    2. Register, Tag
    3. Principle of Atomicity
    4. Bibliography
    5. Slips of Paper, Index Cards
  2. Language
    1. Formal Language
    2. Language Game
    3. Pattern Language
      1. Pattern
      2. Context
      3. Forces
      4. Problem
  3. System and Method
        1. System
        2. Component
        3. Boundary
        1. Method
          1. Bare and Stylized Facts
        2. Scientific Methods
        3. Method as Component of System
    1. Language
      1. Formal Language
      2. Language Game
      3. Pattern Language
    2. Computation
  4. Programming
    1. Software Design
    2. Toy Model of a Computer
      1. Basics
      2. Memory Model
      3. CPU
      4. Disk Storage
      5. Secondary Devices
    3. Lisp
      1. Basics
      2. Memory Model
      3. Macros
      4. Domain-Specific Languages
      5. Correctness
    4. C
      1. Basics
      2. Rules to the Language
      3. Macro Tricks
      4. Memory Model
      5. ACSL
      6. Correctness
    5. Operating Systems
      1. xv6
      2. Linux
      3. FreeBSD
      4. DOS
    6. C++
    7. Make A Lisp (Literate C++)
    8. R
    9. Make An R (Literate C)
  5. Mathematics, Statistics
    1. Foundations of Mathematics
      1. Naive Set Theory
      2. Stuff, Structures, Properties
      3. First Order Logic
      4. Set Theory
      5. Type Theory
    2. Category Theory
    3. Linear Algebra
    4. Abstract Algebra
      1. Foundations (working with Stuff, Structure, and Properties, morphisms)
      2. Symmetries (Group Theory)
      3. Number Systems (Ring Theory)
      4. Modules over Rings
      5. Galois Theory
    5. Topology
    6. Measure Theory
    7. Combinatorics
      1. Foundations
      2. Enumerative Combinatorics
        1. Twelvefold Path
        2. Generating Functions
      3. Partition Theory
      4. Design Theory
    8. Probability
      1. Foundations
        1. Random Processes
        2. Events
        3. Sigma Algebras
        4. Probability
      2. Coin Flipping
      3. Classical Puzzles
        1. Birthday Paradox
        2. Coupon Collector Problem
        3. Monty Hall Problem
        4. Dice Problems
        5. Population Puzzles
      4. Random Variables
      5. Probability Distributions
    9. Statistics
      1. Data Collection
      2. Exploratory Data Analysis
      3. Statistical Inference
      4. Regression Analysis
      5. Bayesian Data Analysis

One thing that's surprising is, I recently became interested in bookbinding. This required discussing paper quite a bit, which turns out to elucidate aspects of slip "1/5 Slips of Paper, Index Cards".

Best Practices

Or, "Things I wish I had-done/am-glad-to-have-done". This is a grocery list of observations, errors, lessons learned...in no particular order.

1. First Thread. The first thread should be about the topic of Zettelkastens. This also stores the reasoning for the conventions I have chosen to follow, the alternatives I had considered, and the arguments for and against the choices made.

2. Numbering Scheme. Dedicate some time thinking about the numbering scheme you'd like to use. If you're imitating Luhmann, what about when you have more than 26 aspects you'd like to drill down on? For example, if we have "1/1a", "1/1b", ..., "1/1z", what will the next card be? "1/1A", "1/1B", ..., "1/1Z" and then what? Or "1/1aa" and then "1/1ab"?
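To make the "1/1aa" option concrete, here is a short Python sketch of one possible convention, in which the letters continue the way spreadsheet columns do. This is only one of the alternatives raised above, not something Luhmann prescribed, and the function name branch_label is my own.

```python
def branch_label(n):
    """Letters for the n-th branch: 1 -> "a", 26 -> "z", 27 -> "aa", 28 -> "ab"."""
    if n < 1:
        raise ValueError("branches are counted from 1")
    label = ""
    while n > 0:
        # Shift by one so that 26 maps to "z" rather than rolling over.
        n, remainder = divmod(n - 1, 26)
        label = chr(ord("a") + remainder) + label
    return label
```

One consequence worth deciding in advance: with this convention "1/1aa" must be filed after "1/1z", so the cards are ordered by the length of the letter group first and alphabetically second.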

Or will you use a numbering scheme similar to Wittgenstein's Tractatus? (See, e.g., Warren M Tang's summary of the numbering scheme.) So "1.1" instead of "1/1"? Or "1/1.1" instead of "1/1a"? And then "1/1.11" to clarify an aspect of "1/1.1"? As Ogden's translation explains, The decimal figures as numbers of the separate propositions indicate the logical importance of the propositions, the emphasis laid upon them in my exposition. The propositions n.1, n.2, n.3, etc., are comments on proposition No. n; the propositions n.m1, n.m2, etc., are comments on the proposition No. n.m; and so on.

Or will you use a "modern" scheme using multiple periods? So "2.1" is a comment on "2", and "2.1.1" is a comment on "2.1", with each "component" of the number separated by a period, ordered sequentially? This jettisons alternating letters and numbers, which may not be good for one's memory. Also this seems to weaken the difference between (a) a slip in the middle of a thread, and (b) elaborating an aspect of a given slip. Is this advantageous for your needs or is it a disadvantage?
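To compare the two decimal schemes side by side, here is a small Python sketch that computes which slip a given number comments on. The function names are my own, and the Tractatus version is simplified: it ignores the zeros Wittgenstein sometimes used (as in "2.01").

```python
def parent_modern(number):
    """Multi-period scheme: "2.1.1" comments on "2.1"; "2" has no parent."""
    head, separator, _ = number.rpartition(".")
    return head if separator else None


def parent_tractatus(number):
    """Tractatus-style scheme: "n.m1" comments on "n.m"; "n.1" comments on "n"."""
    whole, separator, decimals = number.partition(".")
    if not separator:
        return None
    if len(decimals) == 1:
        return whole
    return f"{whole}.{decimals[:-1]}"
```

In both schemes every slip has exactly one parent and its position among its siblings is just the final digit or component, which is why "next in the thread" and "aside on this slip" become hard to tell apart.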

Remember, using either Wittgenstein's scheme or "modern" schemes does not easily facilitate "branching" (in Luhmann's sense). This jettisons one of the key aspects of a Zettelkasten. Is this worth it? Think very hard about this before committing yourself. Branches, I have found, are very useful when you want to elaborate on aspects of a slip...and such elaborations are not necessarily ordered in any way. (You should probably sit down and think if you want branches in your Zettelkasten; emendations, elucidations, forks, asides, all these and more are captured by "branches", but cannot adequately be captured by threads alone.)

3. Index Card vs Paper. For cards which may need replacement/updating/rewriting, use a quarter slip of ordinary paper. Ordinary paper is not as durable as index cards, which consist of thicker paper. The "lifespan" of ordinary paper is far shorter than that of index cards, or so I have been told.

When will I know that a card needs replacement, updating, or rewriting? One instance is when a card "stores data". For example, my Zettelkasten includes recent presidential election results from, say, Pennsylvania. This will require updating after 2020.

Let me stress again, using paper will result in a zettel that will (with heavy usage) "wear out" in a limited time. I am writing my Zettelkasten with knowledge that using ordinary paper will require replacement.

4. Highlighter, Ink. Consider using highlighters on the edge of the cards. Different colors for different types of cards. (Red for register, yellow for people or organizations, etc.) This lets me look down at the zettelkasten and identify cards visually. But this works best on index cards, not plain paper.

Similarly, ink color could be used for particular significance. I reserve red ink for links, blue ink for defined terms, and so on. Whatever you choose to do, be consistent.

5. Write content first, IDs last. First, I usually write the notes on the slip (either the body or the title, or both), and when I'm done with a batch of slips, I ponder arrangement and clustering. While writing content, I sometimes realize I want a branch to elaborate on particular topics, so I'll insert links (or leave space for them later).

There are some times when I know I need to add notes on, e.g., "American Political Science". In this case, I find an introductory textbook and find the topics discussed (the Constitution, each of the three branches of government, the bureaucracy, public opinion, elections, political culture and geography, etc.), which give the entries in the thread; branches then elaborate aspects of each entry. This forms the basis of knowledge which is then extended in advanced courses/topics (e.g., committees in Congress, game theoretic models of committees, etc.).

Other times, I am relatively aimless. I don't know "where I'm going" with my notes on "System and Method". I explore a train of thought ("System", "Environment and Boundary", "Component", "Method", "Explication", "Analysis", "Bare and Stylized Facts", "Language", etc.), then I organize them and finally give them IDs.

6. Threads should be "sufficiently general subjects". In the example zettelkasten I have produced, we should note how general the top-level thread topics are: language, mathematics, computer programming, etc. (Statistically, Luhmann had on average about 213 cards per "top-level thread" in his first Zettelkasten, i.e., roughly 23,000 cards over 108 sections. His second, at roughly 67,000 cards, works out to about 670 cards per subsection, or about 6,100 per "top-level thread"; this is the general aim.)

Really, I'm just writing notes on slips as I am solving a particularly interesting problem, or reading a book. These zettels lack only an ID number. When I get home, I look at my zettelkasten to figure out which thread these slips belong to. I do not start with a schematization ab initio. But I invent new subjects/threads as it becomes relevant.

7. Literate Programming. It is possible to do something like Knuth's literate programming in your Zettelkasten. For example, Knuth's "section" or "chunk" corresponds to a "zettel", the code section may be done in markdown fences (i.e., demarcated by ```lang ...```). This really shines when correctness is a concern, because we can insert slips establishing the preconditions, postconditions, and invariants for the code in a section: correctness becomes a branched thread. (For thicker paper, or pens which don't bleed through the paper, one could use one side for the code section and the other for human readable comments.)

(I'm writing a book on automated theorem proving, so my needs for proving the correctness of a program are probably idiosyncratic here. But, as odd as it sounds, the Zettelkasten structure captures Knuth's approach amazingly well!)

References

This is by no means complete, and I realize I cite sources in this post but neglected to add them here. Apologies!
