The 56th Venice Biennale

At the entrance to the central pavilion of the 56th Venice Biennale is a restorer’s ladder—three stories tall, made of two long staves of veined wood and girded like a construction crane with two lattices of peeled iron. The ladder (Fabio Mauri, “Macchina per fissare acquerelli”) reaches up and back to Galileo Chini’s painted dome, first erected in 1909 for the 8th Biennale. The ladder is a gesture; it feels and looks temporary, yet it has all the tart flavor of a Lichtenstein one-liner. It telescopes upward and backward to some imagined beginning.

This year’s theme is “All The World’s Futures”. What a hopeful title. The future is associated with kids (America), robots (Belgium), and chrome skinsuits (South Korea), so all the futures can only mean all the kids, all the robots, and all the metal eyeshadow. The theme, whatever it’s supposed to mean, functions as a trick. The more you stare at some icon of the future hanging or projected on the beige prop wall, the more you feel like you are being dragged relentlessly into some regressive French movie about the American ’80s, like you have opened the door to a dour, fat-faced salesman trying to sell you on the next new gospel. The future has never seemed so off-kilter, so imbecilic.

Art, like bread, is meant to be consumed. Art fills you, it soaks up excess alcohol, and it makes you sick if you consume too much. The Biennale is a feast. You walk around, and there are Adrian Pipers to enjoy, Young British Artists to mock, an epically boring live reading of Das Kapital, tourists going on benders with selfie-sticks, Venetians glowering in the backlight. Some of the pavilions were atrocious; some of them were sublime. The German pavilion was a hot mess of hipster lawn art and commercials for video games I would never play. At the Japanese pavilion, Chiharu Shiota somehow both submerged and elevated the entire exhibit under a skein of red thread, keys, and sunken boats, creating another horizon where heaven meets the sea. The Norwegian pavilion was anomalous, architectural, modern, and striking. The French pavilion, like the Belgian, had robots. You can walk through the entire exhibition hall, from the Giardini through the Arsenale, and find good art, bad art, blameable art, art which is forgivable because it is, after all, only art. What you will not find is art that gives you hope for the future. Art—fine art, collected art, curated art—does not belong to the future, at least not as robots belong to the future.

Art is a sideshow to progress.

(to be continued…)

Formal concepts and natural languages

Back in January, Yiannis (Vlassopoulos) and I were talking about “quadratic relations” and higher concepts in language, for example the analogy between “(king, queen)” and “(man, woman)”. Deep neural networks can learn such relations from a set of natural language texts, called a corpus. But there are other ways of learning such relations:

Representation of corpus     How to learn / compute
------------------------     -------------------------------------
n-grams                      string matching
word embeddings              standard algorithms, e.g. neural nets
presyntactic category        computational category theory
bar construction             Koszul dual
formal concept lattice       TBD

There are some very nice connections between all five representations. In a way, they’re all struggling to get away from the raw, syntactic, “1-dimensional” data of word collocation to something higher-order, something semantic. (For example, “royal + woman = queen” is semantic; “royal + woman = royal woman” is not.) I’d like to tell a bit of that story here.
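As a concrete (if toy) illustration of the word-embedding row: the analogy above becomes vector arithmetic. The four vectors below are made up by hand; real embeddings would be learned from a corpus.

```python
# Toy sketch of the "quadratic relation" king - man + woman ≈ queen,
# using made-up 4-dimensional embeddings (real ones come from a corpus).
import numpy as np

emb = {
    "king":  np.array([0.9, 0.8, 0.1, 0.2]),   # "royal + male" (toy values)
    "queen": np.array([0.9, 0.1, 0.8, 0.2]),   # "royal + female" (toy values)
    "man":   np.array([0.1, 0.8, 0.1, 0.3]),
    "woman": np.array([0.1, 0.1, 0.8, 0.3]),
}

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

# The analogy holds when the offset king - man lands near queen - woman,
# i.e. when king - man + woman is closest to queen among all words.
target = emb["king"] - emb["man"] + emb["woman"]
best = max(emb, key=lambda w: cosine(emb[w], target))
print(best)  # "queen", for these toy vectors
```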

Continue reading

Mathematical experiments

[This is an ancient draft from 2015(!) that I plan to edit in place here while I’m at IPAM – Josh, Oct. 22, 2024.]

Two years ago, I said that a mathematical experiment is an example done fancy, but a more accurate description is that a mathematical experiment is an example minus theory.

  • We do experiments all the time; for example, we calculate invariants for specific examples. We check axioms for specific examples of a structure, as part of thinking about the general structure.
  • Explainable AI is connected to mathematical experiments, since mathematical models, especially very abstract ones, are developed mainly for human consumption / usage.

Conjectures are propositions “minus proof”. Analogously, a mathematical experiment is an example minus theory, i.e. minus the stuff that makes the example an example of something. But just as “propositions minus proof” doesn’t quite do justice to the pragmatic content of a conjecture, “examples minus theory” doesn’t quite do justice to the pragmatic role of an experiment and what it is used for. An experiment should also reflect some portion of the context, mechanisms, and knowledge used to produce those examples.

So mathematical experiments are both (1) the examples minus theory and (2) the examples plus pragmatics.

% The design of examples and propositions often occurs by working inside a theory, while the design of experiments and conjectures often occurs within the process of theory-building.

Mathematics should interact with the real world. We want a way of talking about “all the other practical stuff in the real world”—e.g. applied mathematics—within pure mathematics itself. (A way to talk about “why does this outside stuff matter, and what should I do with it?” will come later.) Ideally, this would also lead to a better accounting system for how mathematics does interact with the real world.

% Something that captures the stuff in the pragmatics that’s missing from the (current) syntax and semantics…

What’s in a name? That which we call a rose
By any other name would smell as sweet.

Names have meaning. The von Neumann {0} and the Fregean {A : |A| = 1} are different “names” for the same natural number, but the names are used differently, often to illustrate different aspects of axiomatic number systems. Or: in an algebraic database system, it’s very important to distinguish and separate the values “Bob” and “Clara”, even though they may be isomorphic—that is, satisfy all the same database constraints—as far as the database system can tell.
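For concreteness, here are the two “names” side by side (these are the standard definitions; the subscripts are just labels for this post):

```latex
% von Neumann: 1 is the set whose only element is 0 (i.e. the empty set)
1_{\mathrm{vN}} = \{0\} = \{\varnothing\}
% Frege: 1 is the collection of all one-element sets
1_{\mathrm{F}} = \{\,A : |A| = 1\,\}
```

The first names 1 by its predecessor; the second names it by comprehension over a property. They are not even the same collection (in ZF the Fregean one is a proper class), which is part of the point: the two names support different uses.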

This post continues Carette and O’Connor’s discussion of names. What is the utility of a name, and can we quantify it?

As we discussed in part 1, we want to understand the contextual, “intended” interpretation of a sentence, and one distillation of the context of a sentence is the choice of a variable’s name or label. For example, consider ‘F’ in the sentence F = ma. We know that encoded in the choice of ‘F’ (as opposed to ‘camel’, for example) is information about Newton’s laws of motion, the pedagogy of physics, the social mores of science, and even the nature of causality. We distinguish the “camel-invariant” properties of ‘F’—those properties which would remain even if we renamed all instances of ‘F’ to ‘camel’—from the contextual and connotative properties of ‘F’.

[What does this have to do with internal structure, or structure in data? Intuitively this is relations between data, probability distributions, clusterings, and connected components: properties which can be specified, in principle, extensionally. The idea is that this internal structure can also be characterized as the “camel-invariant” properties, since they are internal to the theory and independent of an “observer”. It is the structure internal to a domain. Internal structure may be compared to external structure or “the structure of the problem”, the things that tell us how to use and invoke the object ‘F’, e.g. in proofs or in the development of technology based on ‘F’. Sometimes this information will be formal, like the type of ‘x’, which tells us how to use and invoke ‘x’; this is structure assigned via a representation, which could be distinguished via a functor.]

Outline of the rest of this post:

  1. Connect the namespace discussion from the previous post to “the standard problems in KR” (these don’t exist yet, you’re going to make them up in this post!). The namespace discussion comes up in mathematical pragmatics because when we’re building new theories—as opposed to working and making proofs inside an existing theory, e.g. solving problems—we do not and cannot hold a “flat” view of all the necessary objects and variables. We need higher-order connections between modules, and thus we need to address namespace issues. Namespace issues are the most basic syntactic evidence of the usual tension between theory-building and problem-solving.
  2. What is mathematical pragmatics? Well, turn it around: from a mathematical semantics of natural language to a “natural semantics” of mathematical language. The goal of this note is to define this “natural semantics” in some way that reflects the idea that something is useful, even for questions like “what’s the meaning of a name?” Usefulness is really hard to define, even in mathematics (which is why we will later define mathematical experiments), so in this part I will try to establish the background.

     One part of a linguistic expression’s meaning is its usage in various sentences. Similarly, one part of a formal expression’s meaning is its usage in various proofs. This “usage” is what we mean by a proof-theoretic semantics, so long as proofs are understood as formal proofs: sequences of formal expressions related by implication.

     We want to get at the essential meaning of the object or expression, a sort of “why it matters” or the “subatomic, non-compositional component” of the normal meanings and uses. Part of why it matters is how it gets used. But another part of the meaning comes down to where it came from, how it was defined and “bound”. We want a proof that neither the proof- nor the model-theoretic semantics offers a complete picture of pragmatics.

     How do we isolate pragmatics from syntax and semantics? This is not always possible. One way of thinking about pragmatics is through relevance: a mathematician uses an expression / proposition if and only if it is relevant to the theorem to be proved. To me, the appealing fact about mathematics, and especially formal mathematics, is that it allows us to isolate the pragmatics from the syntax and the semantics. (In fact, it’s sort of why we have syntax and semantics at all, which usually isn’t possible in other fields.) What does this mean? [Can we describe it in terms of the syntax -| semantics adjunction?] Syntax is a game of names, semantics is a science of names, and pragmatics is an art of names.
  3. To capture the pragmatics of an object, instead of integrating over the “uses” of some object, which are ill-defined, we want to integrate over the mathematical experiments, understood as some sort of process that has inputs and may then be measured. But we define the process not from the initial object or to the terminal object, but from a suitably generalized version of “initial” and “terminal”. We want to say something essential about the pragmatic character of the mathematical object, but in a way which is non-pragmatic, i.e. non-contextual.
  4. Two-paragraph soft intro to the philosophical issues of structural foundations of mathematics. Trace how the relationship between description (“intension”) and objects (“extension”) plays out across a variety of foundations, from set theory to category theory, each the embodiment of a certain principle of abstraction. Review David Corfield’s “Towards a Philosophy of Real Mathematics”.
  5. (What is a process of abstraction, part 1.) What is a process of abstraction? See the beginnings of that discussion below. Soft-pedal the diagrammatic part, i.e. just start using diagrams without talking about a “diagrammatic calculus”. Emphasize the need for a provisional, “throw-away” part. (It’s throw-away because it’s not 1-d syntactic.)
  6. Identify a few well-formed “pragmatic” problems. Example: mathematical theories are non-contextual, by supposition. E.g. the “lines and points” into “camels and asparagus” claim: you can draw the signature without dealing with any other part of your language. Can we show that mathematical theories are actually CONTEXTUAL, based on the practical anecdote that the meaning of the words you use does matter, for “thinking” purposes?

     Try reading Griffiths and Paseau on logical consequence and the project of reflecting “natural consequence” inside a formal system. Perhaps, with a very refined notion of logical consequence, we could test for “natural semantics” in mathematical propositions?
  7. Sanity check: Can we build a practical tool for mathematicians based on this idea, one that is responsive to conjectures? It could help one capture the intuition of “this has something to do with that”, “this ought to be involved in the solution”, “this is a surprising result.”
  8. What is a process of abstraction, part 2. Remember, don’t get bogged down! Get to a motivation for thinking about homotopy type theory, and some criteria that we can check off against HoTT.
  9. Discuss homotopy type theory, and the notion of a proof of a proposition X being a point in a space X. Then one can define equivalences and relationships between proofs. How does this carry us closer to a correct definition of utility/use/value?
  10. Bring up an example: mathematical software and UX. Can we have a small “lemma” application, in the style of IPython Notebook, for taking mathematical expressions in a paper and applying simple queries on them (like SOLVE, GET, etc.) in a way that is consistent with the additional pragmatics or “uncertainty” on top of them?

We apply mathematical concepts sometimes, though not always, via some sort of modeling assumption over “lower” data. We want to understand that modeling process or “the process of abstraction”. We want to use this calculus as a way of reasoning about the relationship between the concepts we construct (and, perhaps more importantly, the language in which we construct them) and the way we apply those concepts in the real world.

Here are some general phenomena related to abstraction:

  • Token/type distinction
  • Data/model/theory distinction
  • More abstract mathematical concepts apply to more cases, but in general, it seems the more abstract a theory, the harder it is to do something useful with it.
  • Is the process of abstraction composable? I think so. But perhaps not always.
  • Phenomena such as similarity between cases (i.e. x ~ y := “x,y satisfy P”),
  • Intensional definition via comprehension (D is the set such that D = {x : x satisfies P}) versus extensional definition,
  • Generalizing a concept by getting rid of properties, e.g. from group to monoid.

What is a process of abstraction? Is every constructor and combinator a process of abstraction? One may check: how is “5” an abstraction of 2+3? Or is “2+3” an abstraction of 5? How is T(v) an abstraction of v? How is S^1 an abstraction of a line D^1? But “2+3”, on its own, doesn’t tell us much about how or why we constructed 5. Perhaps “2+3” is just language for verifying that something we had already constructed (by +1’s, or by some other method) satisfies the + structure.

Suppose that a process of abstraction is an arrow going “up”—taking particular cases and “using” them to generate a more general concept. What is the use of a principle of abstraction, in the sense of the thick arrow below?

[figure: abstraction arrow]

Our motivation: we want to understand how the concepts can control (our perception of, construction of) the cases. “How logic determines geometry”. By hypothesis, the way we go down is correlated in some way with the way we go up, and knowing one can tell us something about the other.

% Tools from TDA may be helpful. We’re using topological tools on natural language, and mathematical statements have way more structure, so you’d think so. I’m not sure yet.

% Étienne Ghys once posed an elaboration of Hilbert’s 24th problem in his lecture at Simplicity 2013. Imagine a movie. How do we convert movies into mathematical proofs? If not movies, then pictures? If not pictures, then general “geometrical signs”? One expects that the movie contains more structure and information than is needed for a proof, and one might choose to hold that the proof is somehow “embedded” inside the movie.

% What is a computational process from a thermodynamical point of view? What is a thermodynamical process from a computational point of view? Apply the sheaf-theoretic perspective on continuous vs. discrete dynamical systems to think about measures of information/entropy like Shannon entropy and Kolmogorov complexity in the context of internal/external structure of data, which are both inputs to something called the “meaning” of the data (which we derive, by hypothesis, through some theory of value). The key tool is replacing the strict, numeric notion of time in our various models of computation with a more structural definition about the role time plays (i.e. with Int-sheaves). Adriaans: “The issue is central in some of the more philosophical discussions on the nature of computation and information (Putnam 1988; Searle 1990).” Also reference levels of abstraction (Floridi 2002).

Operads and subsumption

After seeing David Spivak’s talk on operads for design at FMCS 2015, I immediately thought of Brooks’ subsumption architecture. The subsumption architecture was one of the first formalisms for programming mobile robots—simple, insect-like robots capable of feeling their way around without needing to plan or learn. Operads, on the other hand, are certain objects in category theory used to model “modularity”, e.g. situations where multiple things of a sort can be combined to form a single thing of the same sort.

I’d like to formalize subsumption using operads.

But why would anyone want to formalize a derelict robotics architecture with high-falutin’ mathematics?

The answer is simple. It’s not that subsumption on its own is important (though it is) or that it requires formalizing (though it does). What I’d really like to understand is how operads give rise to domain-specific languages (and probably much more) and whether categories are the right way to pose problems that involve combining and stacking many such DSLs—think of a robot that can move, plan, and learn all at the same time—which, for lack of a better term, I will call hard integration problems.
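To fix ideas, here is a minimal Python sketch of the subsumption side. This is my own toy encoding, not Spivak’s formalism; the layer names and the sensor format are invented for illustration.

```python
# Minimal toy sketch of a subsumption architecture: each layer maps sensor
# readings to a proposed action (or None), and higher layers override
# (subsume) lower ones.

def wander(sensors):
    # Layer 0: default behavior, always proposes something.
    return "move-forward"

def avoid(sensors):
    # Layer 1: subsumes wandering when an obstacle is near.
    return "turn-left" if sensors["obstacle_cm"] < 30 else None

def seek_light(sensors):
    # Layer 2: subsumes avoidance when a strong light source is detected.
    return "move-to-light" if sensors["light"] > 0.8 else None

LAYERS = [wander, avoid, seek_light]  # lowest to highest priority

def act(sensors):
    # The highest layer with an opinion wins; lower proposals are suppressed.
    action = None
    for layer in LAYERS:
        proposal = layer(sensors)
        if proposal is not None:
            action = proposal
    return action

print(act({"obstacle_cm": 20, "light": 0.1}))   # turn-left
print(act({"obstacle_cm": 100, "light": 0.9}))  # move-to-light
```

The operadic flavor lives in act: it combines several things of one sort (sensor-to-action behaviors) into a single thing of the same sort.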

(The rest of this post is currently in progress! I will come back throughout the fall and update it.)

Continue reading

UX, experiments, and real mathematics, part 1

Back when I was a rube just starting to learn algebraic topology, I started thinking about a unifying platform for mathematics research, a sort of dynamical Wikipedia for math that would converge, given good data, to some global “truth”. (My model: unification programs in theoretical and experimental physics.) The reason was simple—what I really wanted was a unifying platform for AI research, but AI was way, way too hard. I didn’t have a formal language, I didn’t have a type system or a consistent ontology between experiments, I didn’t have good correspondences or representation theorems between different branches of AI, and I certainly didn’t have category theory. Instinctively, I felt it would be easier to start with math; in my gut, I felt that any kind of “unifying” platform had to start there.

Recently I met some people who have also been thinking about different variants of a “Wikipedia for math” and, more generally, about tools for mathematicians like visualizations, databases, and proof assistants. People are coming together; a context is emerging; it feels like the time is ripe for something good! So I thought I’d dust off my old notes and see if I can build some momentum around these ideas.

  • In this post, examples, examples, examples. I will discuss type systems for wikis, Florian Rabe’s “module system for mathematical theories”, Carette and O’Connor’s work on theory presentation combinators, and the pros and cons of a scalable “library” of mathematics. I’ll briefly introduce the idea of mathematical experiments.
  • In part 2, I will talk about experiments in physics (especially in quantum mechanics), and consider a higher-order model of ensembles of experiments in this setting.
  • In part 3, I will talk about mathematical experiments (everything we love about examples, done fancier!), their relationship with “data”, and what they can do for the working mathematician.
  • In part 4, I’d like to understand what kind of theoretical foundation would be needed for an attack on mathematical pragmatics (a.k.a. “real mathematics” in the sense of Corfield) and check whether homotopy type theory could be a good candidate.

Continue reading

Time-series and persistence

Recently, I’ve been working on a project to apply persistent homology to neural spike train data, with the particular goal of seeing whether this technique can reveal “low frequency” relationships and co-firing patterns that standard dimensionality reduction methods like PCA have been throwing away. For example, some neurons fire at a very high rate, around ~75-100 Hz, while in fact most neurons fire at ~10-30 Hz in response to some stimulus. The loud, shouty neurons are drowning out the quiet, contemplative ones! What do the quiet neurons know? More to the point, how do I get it out of them?
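For concreteness, here is a sketch of the pipeline on synthetic data, using the ripser package for the persistence computation. The firing rates, bin width, and the correlation-to-distance conversion are placeholder choices for illustration, not the project’s actual parameters.

```python
# Sketch of the spike-train pipeline on synthetic data: bin spikes into
# counts, turn pairwise correlations into a distance matrix, then compute
# persistent homology with ripser. All numbers here are made up.
import numpy as np
from ripser import ripser  # pip install ripser

rng = np.random.default_rng(0)
n_neurons, n_bins = 40, 2000
rates = rng.uniform(10, 100, size=n_neurons)  # Hz: loud and quiet neurons
# Poisson spike counts in 10 ms bins (rate in Hz times 0.01 s per bin).
counts = rng.poisson(rates[:, None] * 0.01, size=(n_neurons, n_bins))

# Correlation-based "distance" between neurons: co-firing pairs are close.
corr = np.corrcoef(counts)
dist = 1.0 - corr

diagrams = ripser(dist, distance_matrix=True, maxdim=1)["dgms"]
print("H0 features:", len(diagrams[0]))  # clusters of co-firing neurons
print("H1 features:", len(diagrams[1]))  # "loops" in the co-firing pattern
```

Continue reading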

This is me.

Hello! I’m a mathematician / computer scientist doing research on the intersections between geometry, artificial intelligence, and governance. I am currently doing a PhD at Oxford. I also co-founded and lead research at Metagov. For more details, jump to my research page.

To contact me, send me an email at joshua dot z dot tan at gmail dot com (remember the “z” in the middle, otherwise you’ll get someone different!).

A summary of persistent homology

Say that we are given a single room in Borges’ library, and that we would like to say something about what those books are about—perhaps we can cluster them by subject, determine a common theme, or state that the selection is rigorously random. One way to start would be to scan every book in the room, represent each one as a long string (an average book has about 500,000 characters, though each book in Borges’ library has 410 pages x 40 lines x 80 characters per line = 1,312,000 characters), then perform some sort of data analysis.

In this case the number of books in each room (32 per bookshelf x 5 shelves per wall x 4 walls = 640 books) is far less than the length in characters of each book, i.e. the dimension of the book if we represent it in vector form. We would like to pare down this representation so that we can analyze and compare just the most relevant features of these books, i.e. those that suggest their subjects, their settings, the language they are written in, and so on.

Typically, we simplify our representations by “throwing out the unimportant dimensions”, and the most typical way to do this is to keep just the two to three dimensions that have the largest singular values. On a well-structured data set, singular value decomposition, aka PCA, might leave only 2-3 dimensions which together account for over 90% of the variance… but clearly this approach will never work on books so long as we represent them as huge strings of characters. No single character can say anything about the subject or classification of an entire book.
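As a quick sketch of that move on synthetic data (not on books), assuming scikit-learn is available: when the data really does live near a low-dimensional subspace, the top few components capture almost all of the variance.

```python
# Sketch: on well-structured synthetic data, a few principal components
# capture most of the variance; on character-level book vectors they won't.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# 640 "books" that secretly live near a 3-dimensional subspace of R^1000.
latent = rng.normal(size=(640, 3))
basis = rng.normal(size=(3, 1000))
X = latent @ basis + 0.1 * rng.normal(size=(640, 1000))  # small noise

pca = PCA(n_components=10)
pca.fit(X)
print(pca.explained_variance_ratio_[:3].sum())  # close to 1.0 here
```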

Another way of simplifying data is to look at the data in a qualitative way: by studying its shape or, more precisely, its connectedness. This is the idea behind persistent homology.
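Here is a toy illustration of “connectedness at a scale”, which is the degree-0 shadow of persistent homology: threshold the pairwise distances at increasing radii and count connected components. The point cloud below is synthetic.

```python
# Toy illustration of persistence at degree 0: as the connection radius
# grows, nearby points merge into components; components that survive a
# long range of radii are the "real" clusters.
import numpy as np
from scipy.sparse.csgraph import connected_components
from scipy.spatial.distance import cdist

rng = np.random.default_rng(0)
cluster_a = rng.normal(loc=0.0, scale=0.1, size=(20, 2))
cluster_b = rng.normal(loc=5.0, scale=0.1, size=(20, 2))
points = np.vstack([cluster_a, cluster_b])

dist = cdist(points, points)
for radius in [0.05, 0.5, 6.0]:
    graph = (dist <= radius).astype(int)  # edge iff points are within radius
    n, _ = connected_components(graph, directed=False)
    print(f"radius {radius}: {n} connected components")
# Expect roughly: many components, then 2 (one per cluster), then 1.
```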

To be continued…