A Proof Theoretic Approach to Knowledge Acquisition

Introduction

When we try to solve problems, very often we try to apply techniques that we have learned from previous problem solving ventures. In academic life this is especially visible: when I go to take a test, I expect that some of the techniques I have learned in lecture and while completing home work will in some way apply to the problems that I face.

In machine learning we see many attempts at formalizing this strategy of learning by analogy. In explanation-based learning, a solution to a particular problem is analyzed and a reason (hopefully a good one) is generated for why the solution worked. This reason can then be used to solve similar problems. A closely related technique, called analogical reasoning, takes the solution to a problem as a guideline to solving similar problems by attempting to follow a similar line of inference. Finally, with the abstraction technique, one uses abstract problems and their solutions as a starting point for generating outlines of solutions to similar, specific problems which can then be refined to solve the specific problem.

One can clearly see that in our everyday lives we use all these techniques to some degree. Indeed, we very often use them successfully because we are able to both see the similarities between problems and abstract the techniques needed to solve these common problems.

How humans go about identifying similarities between problems is still a mystery: that we intuitively see how two abstract objects are similar in a useful way is beyond the scope of this work. However, in this work we attempt to build a method for abstracting techniques that can be used to solve other problems. Specifically, we focus on extracting useful information from mathematical proofs in automated reasoning systems.

Background

Many automated reasoning systems rely on the Curry-Howard Isomorphism. I will assume here that you are familiar with basic propositional logic and the typed lambda calculus. The Curry-Howard Isomorphism informally states that there is direct correspondence between logical formulas and types, proofs of logical formulas and lambda terms, and proof normalization (simplification of proofs) and reduction in the lambda calculus.

The above explanation is very rough and ready: if you want to know more you should see

This might not sound like much but it is an astounding result. Here are some examples:

What this essentially says is that mathematical proofs can be expressed as programs!

Motivation

In a presentation given at a Cornell University NuPRL seminar, Professor Dexter Kozen outlined a basic proof system for Kleene Algebra with Tests that had two very interesting operators: pub and cite.

Before going into the details of these operators, let me outline the basic structure of a simple mathematical proof. Proofs typically consist of a number of lemmas which, when pieced together and reasoned about, yield the ultimate conclusion. Lemmas themselves have proofs associated with them. In addition, proofs can make reference to theorems, often instantiating these theorems with specific objects of interest.

Here’s a quick example:

With the Curry-Howard isomorphism under our belts, we can view the proofs of the lemmas and theorems supporting the final proof as programs themselves. We might view a theorem as a program that has some bound variables needing instantiation to make that theorem report on a specific object. When we insert this theorem into our larger proof, we could substitute our pertinent values for these bound variables and get a proof specific to our needs. However, this is not what we really do. For instance, when we refer to the pigeonhole principle in a mathematical proof, we do not give a proof of the principle as it relates to our proof. Instead, we cite the principle and briefly explain how our situation lends itself to this proof. This is the essential idea behind the pub and cite operators. Pub publishes our proof: i.e., it makes it ready for citation. Cite takes a published proof and feeds in the required variables that would specialize the proof.

The key idea is that the specialization (or Beta-reduction in lambda-calculus terms) never happens, because the pub token wraps the proof and inhibits the reduction. This formally models our intuitive proof writing behavior.

However, most theorem proving systems specialize the proofs (i.e. programs) by instantiating the bound variables with problem-specific data. It would be nice if we could extract lemmas and theorems from these proofs and get at the heart of proof and its supports.

Work

The key idea is that what we want to extract proofs of theorems and lemmas embedded within larger proofs. Notice that since these things are themselves proofs, we are looking for subprograms within the larger program.

Of course there are many subprograms within the scope of a larger program. The idea is to perform a kind of common subexpression elimination on multiple proofs to extract common lemmas and theorems.

Here’s an example:

References

http://citeseer.nj.nec.com/cache/papers/cs/3413/http:zSzzSzkirmes.inferenzsysteme.informatik.th-darmstadt.dezSz~kolbezSzpatProRep.pdf/kolbe95patching.pdf