Prolog in Haskell - Refining unification

5th May 2026
2 min read
Tags:
haskell,
code kata,
coding dojo,
functional programming

In the previous log, we have started a naive Prolog implementation, we have introduced unification, but it has the followings shortcomings:

Variables are mixed-up, when there are rules-recursions, same-named variable will go in the same set
Substitution are accumulated via (<>), which indistinctly
Produce Substitution contains intermediary Var

To highlight this, let's draft a test, as follows:

it "rules facts and atom-based goal should be valid" $
  query mempty [fruit, salad, fruitSalad] ("fruitSalad" @ var "F") `shouldBe` [Map.singleton (Var "F") (atom "melon")]

Sadly, the result is quite different, see the following:

expected: [fromList [("F",TAtom "melon")]]
 but got: [fromList [("F",TAtom "tomato"),("X",TVar "F")],fromList [("F",TAtom "melon"),("X",TVar "F")]]

Propagate not used variables break the function.

The first step is to extract the variables from arguments as follows:

class (Eq a, Show a, Typeable a) => Args a where
  -- ...
  vars :: a -> Set Var

instance Args () where
  -- ...
  vars _ = mempty

instance Args Term where
  -- ...
  vars =
    \case
      TAtom _ -> mempty
      TVar v -> Set.singleton v
      TGoal (Goal _ args) -> vars args

instance Args (Term, Term) where
  -- ...
  vars (x, y) = vars x <> vars y

instance Args (Term, Term, Term) where
  -- ...
  vars (x, y, z) = vars x <> vars y <> vars z

The next step is to filter substitutions as follows:

runQuery :: [Predicate] -> Goal -> [Substitution]
runQuery ps g@(Goal _ gArgs) = flip Map.restrictKeys targetVars <$> query mempty ps g
  where targetVars = vars gArgs

It slightly improve the following results:

expected: [fromList [("F",TAtom "melon")]]
 but got: [fromList [("F",TAtom "tomato")],fromList [("F",TAtom "melon")]]

The problem is that, in the current substitution application phase, variables are somehow turned global, creating false positives.

To do so, the process of evaluating a predicate should make each variable globally unique, it's often referred as freshness, starting as follows at Args-level:

class (Eq a, Show a, Typeable a) => Args a where
  -- ...
  rename :: Int -> a -> a

instance Args () where
  -- ...
  rename _ () = ()

instance Args Term where
  -- ...
  rename i = \case
    TVar (Var v) -> TVar (Var (v <> "_" <> show i))
    TAtom a      -> TAtom a
    TGoal g      -> TGoal (renameGoal i g)

renameGoal :: Int -> Goal -> Goal
renameGoal i (Goal f a) = Goal f (rename i a)

instance Args (Term, Term) where
  -- ...
  rename i (x, y) = (rename i x, rename i y)

instance Args (Term, Term, Term) where
  -- ...
  rename i (x, y, z) = (rename i x, rename i y, rename i z)

We suffix all variables with a unique number.

We should also replace recursively variables as follows:

unifyTerm :: Substitution -> Term -> Term -> Maybe Substitution
unifyTerm ss x y =
  -- ...
    sub =
      \case
        TVar v | Just t <- Map.lookup v ss -> sub t
        o -> o

To have a unique number, we have to introduce a dedicate state monad holding a counter as follows:

type Logic = StateT Int []

query :: Substitution -> [Predicate] -> Goal -> Logic Substitution
query ss ps (Goal gFunctor gArgs) = do
  Predicate pFunctor pClauses <- lift ps
  guard (gFunctor == pFunctor)

  freshnessIndex <- get
  put (freshnessIndex + 1)

  (rawArgs, rawBody) <- lift $ toList pClauses
  let cArgs = rename freshnessIndex rawArgs
  let body  = map (renameGoal freshnessIndex) rawBody

  case eqTypeRep (typeOf gArgs) (typeOf cArgs) of
    Just HRefl -> do
      ssa <- lift $ maybeToList $ unify ss cArgs gArgs
      solveGoals (ss <> ssa) body
    Nothing -> lift []
  where
    solveGoals :: Substitution -> [Goal] -> Logic Substitution
    solveGoals ss' =
      \case
        [] -> lift [ss']
        (g : gs) -> do
          ss'' <- query  ss' ps g
          solveGoals ss'' gs

Each time we get the freshness index, we increment it to keep it unique, which makes the following test to work:

it "rules facts and atom-based goal should be valid" $
  runQuery [fruit, salad, fruitSalad] ("fruitSalad" @ var "F") `shouldBe` [Map.singleton (Var "F") (atom "melon")]

Beyond this, one of the things I mean the most, but I know the price is too high, is the ability for a variable to unify with itself, which make the following equality beautiful:

eq(X, X).

And our code is supporting it already, as shown in the following test:

it "rule fact with referencing twice the arg should be valid" $
  hasGoal [eq] ("eq" @ (atom "tomato", atom "tomato")) `shouldBe` True
it "rule fact with referencing twice the arg should be invalid" $
  hasGoal [eq] ("eq" @ (atom "tomato", atom "carrots")) `shouldBe` False

Sadly, querying is broken as highlighted is the following test:

it "rule fact with referencing twice the arg should be valid" $
  runQuery [eq] ("eq" @ (atom "tomato", var "X")) `shouldBe` [Map.singleton (Var "X") (atom "tomato")]

Which gives the following result:

  test/Spec.hs:46:57: 
  1) Prolog.runQuery rule fact with referencing twice the arg should be valid
       expected: [fromList [("X",TAtom "tomato")]]
        but got: [fromList [("X",TVar "X_0")]]

We only lack a substitution in the query processing, let's extract our helper and apply it as follows:

runQuery :: [Predicate] -> Goal -> [Substitution]
runQuery ps g@(Goal _ gArgs) = ground <$> evalStateT (query mempty ps g) 0
  where targetVars = vars gArgs
        ground ss = Map.fromSet (substitute ss . TVar) targetVars

substitute :: Substitution -> Term -> Term
substitute ss =
  \case
    TVar v | Just t <- Map.lookup v ss -> substitute ss t
    o -> o

In the next log we will support built-in predicates, which change the unification process, and why is it useful.