If you want to learn category theory in a way that is more orthodox, a lot of people recommend Tom Leinster’s Basic Category Theory, which is free[1]. I’m going to be working through it soon, but the bit I’ve skimmed through looks really good if more “mathsy” than things like TFA. It also does a better job (imo) of justifying the existence of category theory as a field of study.
Disclaimer for the book, and for category theory in general: most books are optimized for people who already master mathematics at an undergraduate level. If you're not familiar with algebraic structures, linear algebra, or topology, be prepared to learn them along the way from different resources.
Category theory is also not that impressive unless you already understand some of the semantics it is trying to unify. In this regards, the book itself presents, for example, the initial property as trivial at first hand, unless you notice that it does not simply hold for arbitrary structures.
If someone does not want to check the mathematics line by line and prefers to give the article the benefit of the doubt, note that it also presents this JavaScript:
[1, 3, 2].sort((a, b) => {
if (a > b) {
return true
} else {
return false
}
})
This is not a valid comparator. It returns bools where the API expects a negative, zero or positive result, on my Chrome instance it returns `[1, 3, 2]`. That is roughly the level of correctness of the mathematics in the article as well, which I'm trying to present in sibling comment: https://news.ycombinator.com/item?id=47814213
Ok, let's say that it is not JS, but an untyped, closure-based programming language with a strikingly similar array and sort API to JS. Sadly, this comparator is still wrong for any sorting API that expects a general three-way comparison, because it does not handle equality as a separate case.
And to tie it down to the mathematics: if a sorting algorithm asks for a full comparison between a and b, and your function returns only a bool, you are conflating the "no" (a is before b) with the "no" (a is the same as b). This fails to represent equality as a separate case, which is exactly the kind of imprecision the author should be trying to teach against.
> Sadly, this comparator is still wrong for any sorting API that expects a general three-way comparison, because it does not handle equality as a separate case.
Let's scroll up a little bit and read from the section you're finding fault with:
the most straightforward type of order that you think of is linear order i.e. one in which every object has its place depending on every other object
Rather than the usual "harrumph! This writer knows NOTHING of mathematics and has no business writing about it," maybe a simple counter-example would do, i.e. present an ordering "in which every object has its place depending on every other object" and "leaves no room for ambiguity in terms of which element comes before which" but also satisfies your requirement of allowing 'equal' ordering.
Your reply only works if the article were consistently talking about a strict order. However, it is not. It explicitly introduces linear order using reflexivity and antisymmetry, in other words, a non-strict `<=`-style relation, in which equality IS a real case.
If the author wanted to describe a 'no ties' scenario where every object has its own unique place, they should have defined a strict total order.
They may know everything about mathematics for all I care. I am critiquing what I am reading, not the author's knowledge.
Edit: for anyone wanting a basic example, ["aa", "aa", "ab"] under the usual lexicographic <=. All elements are comparable, so "every object has its place depending on every other object." It also "leaves no room for ambiguity in terms of which element comes before which": aa = aa < ab. Linear order means everything is comparable, not that there are no ties. By claiming "no ties are permitted" while defining the order as a reflexive, antisymmetric relation, the author is mixing a strict-order intuition into a non-strict-order definition.
To address your actual pedantry, clearly you have some implicit normative belief about how a book about category theory should be written. That's cool, but this book has clearly chosen another approach, and appears to be clear and well explained enough to give a light introduction to category theory.
The syntax in the article is not scheme, you can clearly see it in my comment you're responding to.
As for your 'light introduction' comment: even ignoring the code, these are not pedantic complaints but basic mathematical and factual errors.
For example, the statement of Birkhoff’s Representation Theorem is wrong. The article says:
> Each distributive lattice is isomorphic to an inclusion order of its join-irreducible elements.
That is simply not the theorem. The theorem says "Theorem. Any finite distributive lattice L is isomorphic to the lattice of lower sets of the partial order of the join-irreducible elements of L.". You can read the definition on Wikipedia [0]
The article is plain wrong. The join-irreducibles themselves form a poset. The theorem is about the lattice of down-sets of that poset, ordered by inclusion. So the article is NOT simplifying, but misstating one of the central results it tries to explain. Call it a 'light introduction' as long as you want. This does not excuse the article from reversing the meaning of the theorem.
It's basically like saying 'E=m*c' is a simplification of 'E=m*c^2'.
I think it is pretty obvious that at the challenge with all abstract mathematics in general and the category theory in particular isnt the fact that people dont understand what a "linear order" is, but the fact it is so distant from daily routine that it seems completely pointless. It's like pouring water over pefectly smooth glass
Is there a "mind-blowing fact" about category theory? Like the first time I've heard that one can prove there is no analytical solution for a polynomial equation with a degree > 5 with group theory, it was mind-blowing. What's the counterpart of category theory?
A thing is its relationships. (Yoneda lemma.) Keep track of how an object connects to everything else, and you’ve recovered the object itself, up to isomorphism. It’s why mathematicians study things by probing them: a group by its actions, a space by the maps into it, a scheme in algebraic geometry defined as the rule for what maps into it look like. (You do need the full pattern of connections, not just a list — two different rings can have the same modules, for instance.) [0]
Writing a program and proving a theorem are the same act. (Curry–Howard–Lambek.) For well-behaved programs, every program is a proof of something and every proof is a program. The match is exact for simple typed languages and leaks a bit once you add general recursion (an infinite loop “proves” anything in Haskell), but the underlying identity is real. Lambek added the third leg: these are also morphisms in a category. [1]
Algebra and geometry are one thing wearing different costumes. (Stone duality and cousins.) A system of equations and the shape it cuts out aren’t related, they’re the same object seen from opposite sides. Grothendieck rebuilt algebraic geometry on this idea, with schemes (so you can do geometry on the integers themselves) and étale cohomology (topological invariants for shapes with no actual topology). His student Deligne used that machinery to settle the Weil conjectures in 1974. Wiles’s Fermat proof lives in the same world, though it leans on much more than the categorical foundations. [2]
In my study, it's basically never that the person names the thing after themselves. My theory goes: Often a discovery is presented in a paper by someone(s), who gives it a usually only barely passable name. For a time, only a handful of experts in the field know about it and none of them care to write general explainers for the layman. So they call it what's easy. "[Name] [concept]" because they're used to talking in names all the time. Academic experts have a large library of people's names tied to the concepts in their papers, i know my PI certainly did, every query was met with a name that had solved it to go look up.
Anyways, the discussion begins with these people. Who all use the name to reference the paper which contains the result. As the discussion expand, it remains centered on this group and you have to talk _with_ them and not at them so you use the name they do. This usage slowly expands, until eventually it gets written in a textbook, taught to grad students, then to undergrads, and it becomes hopeless to change the name.
I share the frustration with naming, we can come up with such better names for things now. But until we give stipend bonuses for good naming, the experts will never care to do so. But i wholeheartedly disagree that the problem as a whole can be reduced to "people like their ribbons". Naming something after yourself is so gauche and would not be tolerated in my field at least. The other professors would create a better name simply out of spite for your greed.
Sometimes the proof in category theory is trivial but we have no lower dimension or concrete intuition as to why that is true. This whole state of affairs is called abstract nonsense.
I think that CT is more akin to just a different language for mathematics than a solid set of axioms from which you can prove things. The most fact-y proof I've personally seen was that you can't extend the usual definition of functions in set theory to work with parametric polymorphism (not that just some constructions won't work, but that there isn't one at all).
Sure, category theory can't prove the unsolvability of the quintic. But did you know that a monad is really just a monoid object in the monoidal category of endofunctors on the category of types of your favorite language?
You say pretty obvious, but it took me 2 years during my PhD to be consciously aware of this. And once I did, I immediately knew I wanted to leave my field as soon as I would finish.
You're more right than you'd think. The whole point of mathematics is precise thinking, yet the article is very inaccurate.
Nobody seems to care or notice. I'm watching in disbelief how nobody is pointing out the article is full of inaccuracies. See my sibling thread for a (very) incomplete list, which should disqualified this as a serious reading: https://news.ycombinator.com/item?id=47814213
My conclusion cannot be other than this ought to be useless for the general practitioner, since even wrong mathematics is appreciated the same as correct mathematics.
There is a way to frame category theory such that it's all just arrows -- by associating the identity arrow (which all objects have by definition) with the object itself. In a sense, the object is syntactic sugar.
binary relations defining order are more nuanced than they seem; a linear order isn't just about ranking, it's about the structure of the relationships themselves.
this reminds me of Haskell’s type classes; they elegantly define order concepts through their own set of rules, capturing relationships in a clean way.
I think the last 10% is exactly where the useful part is, at least for programmers.
In a preorder seen as a category, there is at most one arrow between any two objects. So every diagram commutes and uniqueness is basically free. Then products and coproducts stop looking like magic diagrams and become something very familiar: greatest lower bounds and least upper bounds.
Small nit: preorders are thin categories, but posets are the skeletal thin categories. In a preorder you can have distinct a and b with both a <= b and b <= a, which means they are isomorphic, not literally the same. Quotienting by that equivalence gives you the poset.
The software angle is the part I find most useful. This kind of bugs shows up when we force a total order onto something that is only partially ordered, or only preordered. Dependency graphs, versions, permissions, type hierarchies, CRDT states, rule specificity, build steps. A lot of these don’t really want a comparator and a sort. Sometimes they want a quotient, a topological sort, a join, or just the honest answer that two things are not comparable.
That feels like the practical lesson here: category theory is not always adding abstraction. Sometimes it is just a good way to stop pretending two different structures are the same thing.
I love how math is like a new language, in a new country, of culture you are not exactly familiar with.
This article is like living there for few months. You see things, some of them you recognize as something similar to what you have at home, then you learn how the locals look at them and call them. And suddenly you can understand what somebody means when they say:
"Each distributive lattice is isomorphic to an inclusion order of its join-irreducible elements."
Having a charitable local (or expat with years there under their belt) that helps you grasp it because they know where you came from, just like the person who wrote this article, is such a treasure.
Unless there's some idiosyncratic meaning for the `=>`, the Antisymmetry one basically says `Orange -> Yellow => Yellow -/> Orange`. The diagram is not acurate. The prose is very imprecise. "It also means that no ties are permitted - either I am better than my grandmother at soccer or she is better at it than me." NO. Antisymmetry doesn't exclude `x = y`. Ties are permitted in the equality case. Antisymmetry for a non-strict order says that if both directions hold, the two elements must in fact be the same element. The author is describing strict comparison or total comparability intuition, not antisymmetry.
I don't think they are completely wrong - "=>" is just implication. A hidden assumption in their diagrams is that circles of different colours are assumed to be different elements.
A morphism from orange to yellow means "O <= Y". From this, antisymmetry (and the hidden assumption) implies that "Y not <= O".
Totality is just the other way around (all two distinct elements are comparable in one direction).
If this is meant to be an explainer, that can't be simply implicit. The text actually seems full of imprecise claims, such as:
"All diagrams that look something different than the said chain diagram represent partial orders"
"The different linear orders that make up the partial order are called chains"
The Birkhoff theorem statement, which is materially wrong. A finite distributive lattice is not isomorphic to "the inclusion order of its join-irreducible elements".
It really isn't a long enough section to get lost in.
The 'not accurate' diagram says that orange-less-than-yellow implies yellow-not-less-than-orange. Hard to find fault with.
> NO. Antisymmetry doesn't exclude `x = y`. Ties are permitted in the equality case. Antisymmetry for a non-strict order says that if both directions hold, the two elements must in fact be the same element. The author is describing strict comparison or total comparability intuition, not antisymmetry.
My comment is not long enough either to get lost in.
The prose "It also means that no ties are permitted - either I am better than my grandmother at soccer or she is better at it than me" is inaccurate for describing antisymmetry. In the same short section, you first state the correct condition:
You have x ≤ y and y ≤ x only if x = y
from which it doesn't follow that "It also means that no ties are permitted". The "no ties" idea belongs to a stronger notion such as a strict total order, not to antisymmetry.
[1] https://arxiv.org/pdf/1612.09375
Category theory is also not that impressive unless you already understand some of the semantics it is trying to unify. In this regards, the book itself presents, for example, the initial property as trivial at first hand, unless you notice that it does not simply hold for arbitrary structures.
[1, 3, 2].sort((a, b) => { if (a > b) { return true
})This is not a valid comparator. It returns bools where the API expects a negative, zero or positive result, on my Chrome instance it returns `[1, 3, 2]`. That is roughly the level of correctness of the mathematics in the article as well, which I'm trying to present in sibling comment: https://news.ycombinator.com/item?id=47814213
And to tie it down to the mathematics: if a sorting algorithm asks for a full comparison between a and b, and your function returns only a bool, you are conflating the "no" (a is before b) with the "no" (a is the same as b). This fails to represent equality as a separate case, which is exactly the kind of imprecision the author should be trying to teach against.
Let's scroll up a little bit and read from the section you're finding fault with:
Rather than the usual "harrumph! This writer knows NOTHING of mathematics and has no business writing about it," maybe a simple counter-example would do, i.e. present an ordering "in which every object has its place depending on every other object" and "leaves no room for ambiguity in terms of which element comes before which" but also satisfies your requirement of allowing 'equal' ordering.If the author wanted to describe a 'no ties' scenario where every object has its own unique place, they should have defined a strict total order.
They may know everything about mathematics for all I care. I am critiquing what I am reading, not the author's knowledge.
Edit: for anyone wanting a basic example, ["aa", "aa", "ab"] under the usual lexicographic <=. All elements are comparable, so "every object has its place depending on every other object." It also "leaves no room for ambiguity in terms of which element comes before which": aa = aa < ab. Linear order means everything is comparable, not that there are no ties. By claiming "no ties are permitted" while defining the order as a reflexive, antisymmetric relation, the author is mixing a strict-order intuition into a non-strict-order definition.
Ah! You're talking about Racket or Scheme!
```
> (sort '(3 1 2) (lambda (a b) (< a b)))
'(1,2,3)
```
I suppose you ought to go and tell the r6rs standardisation team that a HN user vehemently disagrees with their api: https://www.r6rs.org/document/lib-html-5.96/r6rs-lib-Z-H-5.h...
To address your actual pedantry, clearly you have some implicit normative belief about how a book about category theory should be written. That's cool, but this book has clearly chosen another approach, and appears to be clear and well explained enough to give a light introduction to category theory.
As for your 'light introduction' comment: even ignoring the code, these are not pedantic complaints but basic mathematical and factual errors.
For example, the statement of Birkhoff’s Representation Theorem is wrong. The article says:
> Each distributive lattice is isomorphic to an inclusion order of its join-irreducible elements.
That is simply not the theorem. The theorem says "Theorem. Any finite distributive lattice L is isomorphic to the lattice of lower sets of the partial order of the join-irreducible elements of L.". You can read the definition on Wikipedia [0]
The article is plain wrong. The join-irreducibles themselves form a poset. The theorem is about the lattice of down-sets of that poset, ordered by inclusion. So the article is NOT simplifying, but misstating one of the central results it tries to explain. Call it a 'light introduction' as long as you want. This does not excuse the article from reversing the meaning of the theorem.
It's basically like saying 'E=m*c' is a simplification of 'E=m*c^2'.
[0] https://en.wikipedia.org/wiki/Birkhoff%27s_representation_th...
Writing a program and proving a theorem are the same act. (Curry–Howard–Lambek.) For well-behaved programs, every program is a proof of something and every proof is a program. The match is exact for simple typed languages and leaks a bit once you add general recursion (an infinite loop “proves” anything in Haskell), but the underlying identity is real. Lambek added the third leg: these are also morphisms in a category. [1]
Algebra and geometry are one thing wearing different costumes. (Stone duality and cousins.) A system of equations and the shape it cuts out aren’t related, they’re the same object seen from opposite sides. Grothendieck rebuilt algebraic geometry on this idea, with schemes (so you can do geometry on the integers themselves) and étale cohomology (topological invariants for shapes with no actual topology). His student Deligne used that machinery to settle the Weil conjectures in 1974. Wiles’s Fermat proof lives in the same world, though it leans on much more than the categorical foundations. [2]
[0] https://en.wikipedia.org/wiki/Yoneda_lemma
[1] https://en.wikipedia.org/wiki/Curry%E2%80%93Howard_correspon...
[2] https://en.wikipedia.org/wiki/Stone_duality
We should strive to name all things by their function not by their inventor or discoverer IMO. But people like their ribbons.
Anyways, the discussion begins with these people. Who all use the name to reference the paper which contains the result. As the discussion expand, it remains centered on this group and you have to talk _with_ them and not at them so you use the name they do. This usage slowly expands, until eventually it gets written in a textbook, taught to grad students, then to undergrads, and it becomes hopeless to change the name.
I share the frustration with naming, we can come up with such better names for things now. But until we give stipend bonuses for good naming, the experts will never care to do so. But i wholeheartedly disagree that the problem as a whole can be reduced to "people like their ribbons". Naming something after yourself is so gauche and would not be tolerated in my field at least. The other professors would create a better name simply out of spite for your greed.
https://math.stackexchange.com/questions/823289/abstract-non...
Sometimes the proof in category theory is trivial but we have no lower dimension or concrete intuition as to why that is true. This whole state of affairs is called abstract nonsense.
Nobody seems to care or notice. I'm watching in disbelief how nobody is pointing out the article is full of inaccuracies. See my sibling thread for a (very) incomplete list, which should disqualified this as a serious reading: https://news.ycombinator.com/item?id=47814213
My conclusion cannot be other than this ought to be useless for the general practitioner, since even wrong mathematics is appreciated the same as correct mathematics.
I'm unclear what the last 10% of 'category theory' gives us.
In a preorder seen as a category, there is at most one arrow between any two objects. So every diagram commutes and uniqueness is basically free. Then products and coproducts stop looking like magic diagrams and become something very familiar: greatest lower bounds and least upper bounds.
Small nit: preorders are thin categories, but posets are the skeletal thin categories. In a preorder you can have distinct a and b with both a <= b and b <= a, which means they are isomorphic, not literally the same. Quotienting by that equivalence gives you the poset.
The software angle is the part I find most useful. This kind of bugs shows up when we force a total order onto something that is only partially ordered, or only preordered. Dependency graphs, versions, permissions, type hierarchies, CRDT states, rule specificity, build steps. A lot of these don’t really want a comparator and a sort. Sometimes they want a quotient, a topological sort, a join, or just the honest answer that two things are not comparable.
That feels like the practical lesson here: category theory is not always adding abstraction. Sometimes it is just a good way to stop pretending two different structures are the same thing.
This article is like living there for few months. You see things, some of them you recognize as something similar to what you have at home, then you learn how the locals look at them and call them. And suddenly you can understand what somebody means when they say:
"Each distributive lattice is isomorphic to an inclusion order of its join-irreducible elements."
Having a charitable local (or expat with years there under their belt) that helps you grasp it because they know where you came from, just like the person who wrote this article, is such a treasure.
A morphism from orange to yellow means "O <= Y". From this, antisymmetry (and the hidden assumption) implies that "Y not <= O".
Totality is just the other way around (all two distinct elements are comparable in one direction).
"All diagrams that look something different than the said chain diagram represent partial orders"
"The different linear orders that make up the partial order are called chains"
The Birkhoff theorem statement, which is materially wrong. A finite distributive lattice is not isomorphic to "the inclusion order of its join-irreducible elements".
The 'not accurate' diagram says that orange-less-than-yellow implies yellow-not-less-than-orange. Hard to find fault with.
> NO. Antisymmetry doesn't exclude `x = y`. Ties are permitted in the equality case. Antisymmetry for a non-strict order says that if both directions hold, the two elements must in fact be the same element. The author is describing strict comparison or total comparability intuition, not antisymmetry.
I like the article's "imprecise prose" better:
The prose "It also means that no ties are permitted - either I am better than my grandmother at soccer or she is better at it than me" is inaccurate for describing antisymmetry. In the same short section, you first state the correct condition:
You have x ≤ y and y ≤ x only if x = y
from which it doesn't follow that "It also means that no ties are permitted". The "no ties" idea belongs to a stronger notion such as a strict total order, not to antisymmetry.