Monday, April 23, 2012

An efficient linear unification algorithm

Below is a useful article I found on reocities (a mirror of the now defunct geocities) written by John Henckel and licensed as public domain. I am reposting the article on my blog in order to preserve it.

I used the described algorithm in JSUnify, a declarative CoffeeScript DSL(domain specific language) for logic programming. I will be releasing version 1.0 of JSUnify soon which will be accompanied by another blog post. In the mean time you can track JSUnify's progress on its GitHub page, https://github.com/freethenation/JSUnify.git.

by John Henckel, 6 Feb 1997 (public domain)


Table of Contents

Problem
Solution
References
Source Code

Problem

Unification is a basic operation in many symbolic artificial intelligence systems. Unification is, given two first-order terms, find a substitution of the variables in the terms so that they become identical. Reference (1) gives a good introduction to unification and gives textbook algorithm for finding the most general unifier. The problem is that this algorithm has exponential worse case complexity. Other algorithms are known that have linear worse case complexity (2) but these are very complicated to implement.


Solution

I am implementing a resolution theorem proving program for a class project and I wrote a unification algorithm which has linear complexity, I think. It is easy to implement, and it has a number of extra benefits. I will use the following example. Assume w,x,y,z are variables and we want to unify f(x,t(x),z) and f(a,y,h(w,y)). First convert the terms to tree structure as follows

In the trees each variable is replaced by a number unique in the term. These trees are READ ONLY, they are not changed by the unification process and they can be refered to by many term instances in the database. This structure sharing saves a lot of storage in the theorem proving program.

The unification program works on term instances (tins). A tin has a pointer to a tree, and a pointer to a variable list. The size of the variable list depends on the number of variables in the tree. (in the example both trees have two variables). The variable list is actually an array of tins. The tins in the var list have two pointers also, if both of these are null, we say the tin is free, or unbound.

The two tins for the example would be:

that is they both have a head tin that points to the structure tree and to a list of free tins size 2.

The unification procedure is recursive, it walks the structure trees pointed to by the two terms. It returns 1=unify was successful or 0=unify failed.

After running this procedure on the example, the tins look like this

In this figure: Tin1 points to the f in the first tree and tin1 has two variables x and z in its varlist. x has two pointers, a pointer to a structure and to a varlist, they point to the 'a' in tree2 and the varlist of tin2, respectively. z points to the 'h' and to the same varlist in tin2. Tin2 has two variables also, y and w. y is bound to 't' and points to varlist of tin1. w is free.

The unification succeeded. If we were to print the value of the variable z we would print z = h(w,t(a)). To print the value of z we print the tree it points to using the varlist it points to lookup any variables that may occur in the tree and recursively print them.

This algorithm has worst case linear complexity. Each recursive invocation of the unify function either dereferences a variable or moves to a new node in the structure tree. If n is the number of nodes in the structure trees and m is the number of variables, unify will be called recursively at most n+m times.

There is one case that was not spelled out yet. When two vars are free, as in (*1*), then one is shared to the other. This is done using a special kind of "chain" tin in which the structure pointer is null and the varlist pointer points to the tin of the other variable. When a term has many variables these chains can grow long. For example when f(w,w,x,x,y,y,z) is unified with f(s,t,t,u,u,v,v), (assume s..z are vars) then first w chains to s, then t chains to w, then x chains to t, etc. When both vars are free, it doesn't matter which is chained to which. By carefully choosing which to chain, we can ensure that the length of the longest chain does not grow faster that O(log(n)) where n is the number of variables processed by unify. These are the rules for chaining two free vars.

  1. if one of the vars has a chain longer than the other, link the shorter one to the longer.
  2. else always link the t1 var to t2 var, this ensures that the chain does not zig-zag between the terms.
In my theorem prover program each time a variable is bound or shared, a pointer to it is added to a "changelist". When unify is finished, I can quickly undo the effect of the unify by going though the changelist and setting all the tins it points to to "free" (two null pointers).

The benefits of using this algorithm are

  1. it is time efficient: linear worst case complexity
  2. it is space efficient: structures are shared by many term instances, no new memory needs to be allocated during unification, extra overhead is just two pointers per variable.
  3. it is easy to add an un-unify feature to it by keeping a list of changes during unify.

References

  1. Genesereth, M., Nils Nilsson, Logical Foundations of Artificial Intelligence, Morgan Kaufmann Publishers, Inc., Los Altos, CA, 1986.
  2. Paterson, M.S., and Wegman, M.N. Linear Unification, J. Comput. Syst. Sci. 16 (1978) pp 158-167.
  3. Escalade-Imaz, G., Ghallab, M., A Practically Efficient and Almost Linear Unification Algorithm. Artificial Intelligence, v36, 1988, 249-263.
  4. The diagrams were made using Jan's excellent (free) Diagram Editor.

Source Code


my favorite things * contact me

1 comment:

  1. Is there a proof of correctness for this algorithm? What is the source for this algorithm? ie. is it the Martelli-Montanari algorithm, or based on it?

    ReplyDelete