Blog

In my last blog post, we looked at the very basics of proof refinement systems. I noted at the end that there was a big drawback of the systems as shown, namely that information flows only in one direction: from the root of a proof tree to the leaves. In this post, we'll see how to get information to flow in the opposite direction as well.

In this blog post, I'm going to discuss the overall structure of a proof refinement system. Such systems can be used for implementing automatic theorem provers, proof assistants, and type checkers for programming languages. The proof refinement literature is either old or hard to understand, so this post, and subsequent ones on the same topic, will present it in a more casual and current way.

Work on proof refinement as a methodology for building type checkers and interactive proof systems has a long history, going back to LCF, HOL, Nuprl, and Coq. These techniques never really penetrated into the mainstream of programming language implementation, but perhaps this may change.

This post aims to provide a unifying approach to symbolic and non-symbolic techniques of artificial intelligence. Historically, symbolic techniques have been very desirable for representing richly structured domains of knowledge, but have had little impact on machine learning because of a lack of generally applicable techniques. Conversely, machine learning techniques tend to be highly numeric in nature, thus restricting their application domain, but within those domains they have excelled. This post outlines a general approach to representing symbolic problems that is completely amenable to non-symbolic techniques such as deep learning, bridging the gap once and for all. The application domains are quite extensive: from traditional symbolic AI problems such as theorem proving, robotic planning, program synthesis, inductive programming, natural language understanding, natural language generation, and game playing, to less traditional tasks such as chemical synthesis design, program refactoring, and fuzzy matching of symbolic data. This post provides overviews of how these tasks can be done in this framework. For the vast majority of these tasks, the solutions also constitute glass boxes: they are understandable rather than being black boxes.

This post also only gives an overview of the technique. Test cases showing how it can be applied to some simple domains, as a validation of the technique, will be posted in subsequent blogposts, and then linked here.

This post is an attempt to communicate what I've come to understand about Plotkin's paper Call-By-Name, Call-By-Value and the λ-Calculus. In this paper, Plotkin describes the a form of the λ Calculus, together with an abstract machine called the SECD machine, and then describes, partially, how to get from an SECD machine to a continuation passing style implementation in a host λ Calculus. Plotkin addresses the issue that your host language might be lazy (call-by-name) while the language you want to implement is eager (call-by-value), or vice versa, and a naive implementation won't suffice, hence the CPS translation.

It took me a little bit to fully understand what was going on with the paper, and it took even longer to understand how to arrive at the translations that Plotkin has. Or to be more accurate, there is an algorithm that can be built relatively quickly from Plotkin's examples, but formulating the reasoning behind why that's the right algorithm isn't clear. So what I'll try to do in this post is show the process that I arrived at. I'll also use a simplification: rather than the SECD machine, I'll use a CEK machine. In fact, just a CK machine, to simplify even further.

The title of this post is a bit inflammatory, I grant you, but hopefully in context it'll make sense. In this blog post, I'm going to discuss one of the main characteristics of current NLP parsers, namely their heavy reliance on statistical models, and the reasons behind that choice. I'll also discuss what limitations arise because of this (in particular, for the purposes of natural language understanding and AI), and what we might want to explore as an alternative to the standard NLP approach. This post shouldn't be seen as an attempt to be authoritative on the topic, but just to provide a perspective originating from my Language Engine project.

The concept of edit distance is a useful one, allowing one to calculate how much difference there is between two strings. The idea has been generalized to trees, where instead of comparing two strings, you compare two n-ary branching trees. In this post, I'll show another generalization, which permits the comparison of a string with a regular expression to calculate how well the string matches the regular expression.

This post doesn't contain any actual code, just sketches of an algorithm, because how it gets implemented ends up being somewhat boring. I do, however, have a github repo here with a Haskell implementation.

A while back, I was playing with regular expressions and derivatives of datatypes (in the sense of McBride, which you can read about here). At some point it dawned on me that given any particular regular expression, the one-hole contexts into that expression have a tight connection to the states of a non-deterministic finite automata that recognizes the expression.

Note that this notion of differentiation is at the type level, not at the value level as described in Matt Might's blog post.

This post is somewhat tangential to what I've been doing with Language Engine, but since it's about programming language design, and since I'm designing a language for LE, I guess it's kind of related.

The topic of todays post is an answer to a question I posed on Twitter a few days ago about whether or not it would be possible to define evaluation in a way that was "style generic". That is to say, whether you use the standard lazy or eager evaluation functions that just recurse using the host language's mechanism, or a stack machine that builds up evaluation contexts, or whatever, the evaluated term you produce at the end is the same — the only difference is the style of the evaluator, not the meaning of evaluation. So is there a way to define the lazy evaluator, or the eager evaluator, or the lazy stack machine, and so on, and then choose one or the other as you want, for use in your language, independent of the language itself? The answer, I think, is yes, and so I'll present here my solution.

Pattern matching is absolutely central to modern functional programming languages, and in the type theoretically informed ones, it forms the basis of all usage of user-defined data. To inspect some piece of data, to extract part of it, etc. is all done by pattern matching. Since it's so central, and so all-permeating, it's a good idea to make it as efficient as possible.

About a year ago I posted a tutorial on Github about the chart parser I'm using for Language Engine's linguistic parsing. In the intervening year, I've rethought parts of it, and come up with a new, more general, abstract approach to chart parsing that I'm going to call Abstract Chart Parsing, because it abstracts over the details of the chart and the rules in very useful ways. An alternative name, owing to its design, might be State Machine Chart Parsing.

I want to summarize an approach to precedence (for pretty printing and parsing) that I've recently come up with in part of a series of blog posts I'm writing. It's useful enough, I think, to separate out into it's own post, because precedence is such a constant thorn in people's sides.

For the dispatcher in the Language Engine JS library, it was necessary to be able to efficiently test a number of pre-defined sets for subset status of an arbitrary given set. In particular, the pre-defined subsets are the keyword argument name patterns, and the arbitrary given set is the set of actual keyword argument names. To implement this efficiently, I designed a nice little algorithm using tries, which I'm going to describe below (only using Haskell, not JavaScript).

The code associated with this blogpost, which includes relevant dependencies, can be found here.

There's an interesting connection between language and computer science in the form of a correspondence between grammars and inductively defined data types. I hope the correspondence isn't too unforeseen, but it's still worth making as explicit as possible, to see some variations.

Below you can see a Sierpinski Triangle drawn in a canvas. What makes this particular Sierpinski Triangle interesting is that it's generated by setting the color of each pixel to a thresholded bitwise & on the coordinates of the pixel. Check out the source code to see the precise details of how this is drawn, or read below for a deeper discussion.

Subtypes are notoriously easy to mess up when designing a programming language, but they really shouldn't be. I think part of the reason is that most people try to set out their subtyping rules by gut instinct, because they don't really know how to do it properly. I don't know of any good tutorial on this, so fair enough. Here's that tutorial.

One of the main benefits of LTSG parsing was that it could narrow the parsing problem by guiding the parse, establishing subgoals, making the problem nice and recursive. The main drawback, however, was that no work would be done if the head of a phrase (including the whole sentence) was missing. On the other hand, Unger parsing could parse in a recursive goal-directed way regardless of missing words, but was unguided by head information, and has poor parse times due to non-determinism.

If we could somehow have the benefits of these approaches, without the drawbacks, we might have a shot at a good error correcting parser. One option that could be useful is to parse bottom up with a chart parser, to produce as much partial parse information as possible, and then work in a top-down fashion to find and correct the errors. This can be done somewhat efficiently, using because most of the parse is already there.

In this third post in my series on parsing inputs that might have errors (previous post here), I'm going to talk about Unger parsing. Unger parsing is a simple parsing technique that can handle ambiguous grammars quite nicely, and turns out to also be rather well suited to handling errors as well. I'll introduce Unger parsing, since I suspect most people are unfamiliar with it, then discuss it's application to error handling, and then discuss its drawbacks.

In this blog post, I want to follow up on my previous post about robust error handling when parsing by discussing one interesting, though imperfect, solution to the problem that we get by using a Lexicalized Tree Substitution Grammar (LTSG), which is a non-adjunction fragment of a Lezicalized Tree Adjoining Grammar. If you're not familiar with TAGs or LTAGs, this tutorial provides an excellent overview. The section on constraint-based TAGs isn't necessary, just the ones up to LTAGs.

In this post I want to talk about parsing inputs that might be malformed in some way. That is to say, this is about handling parse errors. Why might we want to do this? Well, if you're parsing a programming language, it's helpful to be able to inform the programmer why their code is malformed. If you're parsing a natural language, then there's a good chance that the speaker or writer won't provide perfectly grammatical sentences as input.

Much has been written about the awesomeness that is the Yoneda Lemma. For instance, Edward Kmett said "When you need something to go faster, hit it with the Yoneda lemma. Still isn't fast enough? Use it again."

But what exactly is the Yoneda Lemma? What does it mean, and why does it mean that? That's what I'm going to address in this blog post, at least for the Haskell version of the lemma. If you're not familiar with Haskell, tho.. sorry.

I'd like to follow up to my previous blog post on implementing a simply typed lambda calculus in JavaScript by demonstrating a more complete, more natural type system. If you haven't read it already, you should do so. The previous type system was a little unnatural because it had no type inference mechanism, and so all of the relevant type information had to be provided in the programs themselves. This time, however, we'll build a type checker that has inference, so we can write more natural programs.

This is my personal take on things, not necessarily an accurate history of the field. Take it with a huge grain of salt!

For the last 25 or so years, NLP has consisted almost entirely of fairly simple statistical models of language have prevailed. Rather than using carefully build grammars and fancy parsers, rather than trying to build systems that rely on linguistics, instead modern NLP tries to learn as much as possible from scratch. NLP research is now, essentially, just a branch of machine learning rather than a branch of linguistics. What happened?

Some of them are, in my opinion, relatively easy to solve, at least in terms of building the necessary programs/tools (data and learning is a separate issue), but others, especially the last one, are substantially harder. But hopefully I can convince you that at least the first four are something we can seriously consider working on.

Every once in a while I get asked about what sort of machine learning I used to build Language Engine, and when I say I didn't use any, the response is something like "But deep learning is so big these days!" While I'm tempted to just pull a Peter Thiel and say this buzzword-y nonsense is just fashion, and suggest you avoid fashions like the plague, I want to explain a bit instead. So in this post, I'm going to elaborate on how deep learning relates to Language Engine, and why we don't use it, or any other kind of machine learning, and why future versions will only use minimal amounts.

In this post, I'm going to talk about quantifiers. Not quantifiers as in the type theoretical quantifiers Pi and Sigma, but rather natural language quantifiers like "some", "all", "no", etc. One thing we need to do in implementing Language Engine is have a way of representing the meanings of these, and more specifically, the scope-taking properties that they have.

For simple sentences like "John saw Susan", it's easy to pretend like "John" and "Susan" just pick out some entities, perhaps like so:

So you want to write a type checker, but you don't know how. Well in this post, I'm going to show you how to implement one using JavaScript, for the simply typed lambda calculus with pairs and functions. JS is not the most ideal language for this, unfortunately, because it lacks a good type system of its own, and makes a bunch of stuff really ugly and noisy, but in a way, that's ideal. It'll give you a sense of what you'd have to do in other languages with similarly impoverished type systems. The way I'm going to implement it should be relatively easy to translate into idiomatic code in various.

You can see the full code for this type checker on GitHub. There is also a Haskell version of the same type checker, for comparison. Question info at the bottom of this post as usual.

I'm assuming you know a little bit about type theory, such as having looked at the Pfenning videos that I posted in this little guide. If not, I highly recommend you brush up, otherwise some of this might be confusing.

In this post, I want to tackle an important issue when designing a lexicon for an app that uses Language Engine, namely: how do you cope with the variety of seemingly different ways that people can express the same or nearly-same idea? For example, consider the following interaction you could imagine taking place in an Instragram-like app:

User: Take a selfie!App: Ok! *shutter* How's that?User: Oh, it's awful!App: What about with one of these filters?

This conversation, especially line 3 where the user says "it's awful", can be paraphrased many different ways. For instance, the user could have used a different adjective and said "it's horrible", or they could've combined a positive adjective with a negative by saying "it's not good". They could've alternatively used a verbal construction such as "I dislike it" or "I hate it", or a negated verbal construction, as in "I don't like it".

And yet these all basically express the same attitude towards the picture. A simple approach to meaning would end up giving us one meaning for each sentence, which would be pretty bad. Can we do better? Yes!

As promised last time, we can now tackle the problem of representing pluralities. Once we have events and adverbial modifiers separated off as in

drink(e,stephen,coffee) & time(e,past) & time(e,today)

it becomes relatively trivial to propose that we do the very same thing to the arguments of verb predicates, as well. There are a number of alternative ways to do this, but one option is to have special relations associated with the predicate.

I ended the previous post by suggesting that a set-based approach to pluralities, as in

[[John and Susan lifted the grate]] = lifted({john,susan}, {crate})

would not work well for at least two reasons: [...]

The task is to present an alternative representation that doesn't have these problems. But before I can do that, we need to detour a little bit to discuss adverbial modification, which I will do in this post.

In this post I'm going to discuss some of the design issues encountered while implementing Language Engine, specifically the issue of plural noun phrases.

First, I need to briefly introduce some basic ideas about the meanings of sentences. Suppose that we want to interpret the sentence "Vir stabbed the Emperor" into some logical form as the meaning. A very traditional approach to this would be to assign it the meaning