Linguistics Using Category Theory

Azimuth 2018-03-12

 

Now students in the Applied Category Theory 2018 school are reading about categories applied to linguistics. Read the blog article here for more:

• Jade Master and Cory Griffith, Linguistics using category theory, The n-Category Café, 6 February 2018.

This was written by my grad student Jade Master along with Cory Griffith, an undergrad at Stanford. Jade is currently working with me on the semantics of open Petri nets.

What’s the basic idea of this linguistics and category theory stuff? I don’t know much about this, but I can say a bit.

Since category theory is great for understanding the semantics of programming languages, it makes sense to try it for human languages, even though they’re much harder. The first serious attempt I know was by Jim Lambek, who introduced pregroup grammars in 1958:

• Joachim Lambek, The mathematics of sentence structure, Amer. Math. Monthly 65 (1958), 154–170.

In this article he hid the connection to category theory. But when you start diagramming sentences or phrases using his grammar, as below, you get planar string diagrams as shown above. So it’s not surprising—if you’re in the know—that he’s secretly using monoidal categories where every object has a right dual and, separately, a left dual.

This fact is just barely mentioned in the Wikipedia article:

Pregroup grammar.

but it’s explained in more detail here:

• A. Preller and J. Lambek, Free compact 2-categories, Mathematical Structures in Computer Science 17 (2005), 309-340.

This stuff is hugely fun, so I’m wondering why I never looked into it before! When I talked to Lambek, who is sadly no longer with us, it was mainly about his theories relating particle physics to quaternions.

Recently Mehrnoosh Sadrzadeh and Bob Coecke have taken up Lambek’s ideas, relating them to the category of finite-dimensional vector spaces. Choosing a monoidal functor from a pregroup grammar to this category allows one to study linguistics using linear algebra! This simplifies things, perhaps a bit too much—but it makes it easy to do massive computations, which is very popular in this age of “big data” and machine learning.

It also sets up a weird analogy between linguistics and quantum mechanics, which I’m a bit suspicious of. While the category of finite-dimensional vector spaces with its usual tensor product is monoidal, and has duals, it’s symmetric, so the difference between writing a word to the left of another and writing it to the right of another gets washed out! I think instead of using vector spaces one should use modules of some noncommutative Hopf algebra, or something like that. Hmm… I should talk to those folks.

To discuss this, please visit The n-Category Café, since there’s a nice conversation going on there and I don’t want to split it. There has also been a conversation on Google+, and I’ll quote some of it here, so you don’t have to run all over the internet.

Noam Zeilberger wrote:

You might have been simplifying things for the post, but a small comment anyways: what Lambek introduced in his original paper are these days usually called “Lambek grammars”, and not exactly the same thing as what Lambek later introduced as “pregroup grammars”. Lambek grammars actually correspond to monoidal biclosed categories in disguise (i.e., based on left/right division rather than left/right duals), and may also be considered without a unit (as in his original paper). (I only have a passing familiarity with this stuff, though, and am not very clear on the difference in linguistic expressivity between grammars based on division vs grammars based on duals.)

Noam Zeilberger wrote:

If you haven’t seen it before, you might also like Lambek’s followup paper “On the calculus of syntactic types”, which generalized his original calculus by dropping associativity (so that sentences are viewed as trees rather than strings). Here are the first few paragraphs from the introduction:

…and here is a bit near the end of the 1961 paper, where he made explicit how derivations in the (original) associative calculus can be interpreted as morphisms of a monoidal biclosed category:

John Baez wrote:

Noam Zeilberger wrote: “what Lambek introduced in his original paper are these days usually called “Lambek grammars”, and not exactly the same thing as what Lambek later introduced as “pregroup grammars”.”

Can you say what the difference is? I wasn’t simplifying things on purpose; I just don’t know this stuff. I think monoidal biclosed categories are great, and if someone wants to demand that the left or right duals be inverses, or that the category be a poset, I can live with that too…. though if I ever learned more linguistics, I might ask why those additional assumptions are reasonable. (Right now I have no idea how reasonable the whole approach is to begin with!)

Thanks for the links! I will read them in my enormous amounts of spare time. :-)

Noam Zeilberger wrote:

As I said it’s not clear to me what the linguistic motivations are, but the way I understand the difference between the original “Lambek” grammars and (later introduced by Lambek) pregroup grammars is that it is precisely analogous to the difference between a monoidal category with left/right residuals and a monoidal category with left/right duals. Lambek’s 1958 paper was building off the idea of “categorial grammar” introduced earlier by Ajdukiewicz and Bar-Hillel, where the basic way of combining types was left division A\B and right division B/A (with no product).

Noam Zeilberger wrote:

At least one seeming advantage of the original approach (without duals) is that it permits interpretations of the “semantics” of sentences/derivations in cartesian closed categories. So it’s in harmony with the approach of “Montague semantics” (mentioned by Richard Williamson over at the n-Cafe) where the meanings of natural language expressions are interpreted using lambda calculus. What I understand is that this is one of the reasons Lambek grammar started to become more popular in the 80s, following a paper by Van Benthem where he observed that such such lambda terms denoting the meanings of expressions could be computed via “homomorphism” from syntactic derivations in Lambek grammar.

Jason Nichols wrote:

John Baez, as someone with a minimal understanding of set theory, lambda calculus, and information theory, what would you recommend as background reading to try to understand this stuff?

It’s really interesting, and looks relevant to work I do with NLP and even abstract syntax trees, but I reading the papers and wiki pages, I feel like there’s a pretty big gap to cross between where I am, and where I’d need to be to begin to understand this stuff.

John Baez wrote:

Jason Nichols: I suggest trying to read some of Lambek’s early papers, like this one:

• Joachim Lambek, The mathematics of sentence structure, Amer. Math. Monthly 65 (1958), 154–170.

(If you have access to the version at the American Mathematical Monthly, it’s better typeset than this free version.) I don’t think you need to understand category theory to follow them, at least not this first one. At least for starters, knowing category theory mainly makes it clear that the structures he’s trying to use are not arbitrary, but “mathematically natural”. I guess that as the subject develops further, people take more advantage of the category theory and it becomes more important to know it. But anyway, I recommend Lambek’s papers!

Borislav Iordanov wrote:

Lambek was an amazing teacher, I was lucky to have him in my ungrad. There is a small and very approachable book on his pregroups treatment that he wrote shortly before he passed away: “From Word to Sentence: a computational algebraic approach to grammar”. It’s plain algebra and very fun. Sadly looks like out of print on Amazon, but if you can find it, well worth it.

Andreas Geisler wrote:

One immediate concern for me here is that this seems (don’t have the expertise to be sure) to repeat a very old mistake of linguistics, long abandoned :

Words do not have atomic meanings. They are not a part of some 1:1 lookup table.

The most likely scenario right now is that our brains store meaning as a continuously accumulating set of connections that ultimately are impacted by every instance of a form we’ve ever heard/seen.

So, you shall know a word by all the company you’ve ever seen it in.

Andreas Geisler wrote:

John Baez I am a linguist by training, you’re welcome to borrow my brain if you want. You just have to figure out the words to use to get my brain to index what you need, as I don’t know the category theory stuff at all.

It’s a question of interpretation. I am also a translator, so i might be of some small assistance there as well, but it’s not going to be easy either way I am afraid.

John Baez wrote:

Andreas Geisler wrote: “I might be of some small assistance there as well, but it’s not going to be easy either way I am afraid.”

No, it wouldn’t. Alas, I don’t really have time to tackle linguistics myself. Mehrnoosh Sadrzadeh is seriously working on category theory and linguistics. She’s one of the people leading a team of students at this Applied Category Theory 2018 school. She’s the one who assigned this paper by Lambek, which 2 students blogged about. So she would be the one to talk to.

So, you shall know a word by all the company you’ve ever seen it in.

Yes, that quote appears in the blog article by the students, which my post here was merely an advertisement for.