GHC 9.0 Bug: FunEqCan To NonCanonical Issue

Alex Johnson
-
GHC 9.0 Bug: FunEqCan To NonCanonical Issue

Hey guys, let's dive into a tricky issue that some of us have encountered while working with GHC 9.0 and the ghc-tcplugins-api-0.18. It revolves around how the unflattening process transforms FunEqCan into NonCanonical, which unfortunately makes them unusable by mkTyConSubst. This can be a real headache, especially if you've got code that was working smoothly with ghc-tcplugins-api-0.17. So, let's break it down and see what's going on.

The Core Problem: FunEqCan Turning NonCanonical

At the heart of this issue is a change in how GHC handles type equalities during the unflattening process. In ghc-tcplugins-api-0.18, unflattening can inadvertently convert FunEqCan (Functional Equality Canonical) constraints into NonCanonical constraints. Now, you might be wondering, "What's the big deal?" Well, the problem arises because mkTyConSubst, a crucial function for creating type constructor substitutions, can't handle NonCanonical constraints.

To really understand why this is happening, let's look at a simplified example. Imagine you have a type family like this:

type family F (a :: Type) :: Type where
  F Int = Bool

When GHC processes this, it might generate a FunEqCan constraint that essentially says "F Int is the same as Bool". This is a nice, clean, canonical equality. However, during unflattening, this constraint can get transformed into a NonCanonical form, which is a less direct and potentially more complex representation of the same equality. This transformation gums up the works for mkTyConSubst because it's expecting that pristine FunEqCan format.

Diving Deeper: A Code Example

To illustrate this, let's look at the code snippet provided in the original issue. This code uses type-level natural numbers and some advanced GHC extensions. Don't worry if you don't grasp every detail; the key takeaway is the transformation of a CFunEqCan constraint into a CNonCanonical one.

Consider this Haskell code:

{-# OPTIONS_GHC -fplugin GHC.TypeLits.KnownNat.Solver #-}

{-# LANGUAGE CPP, TypeFamilies, ScopedTypeVariables, TypeApplications, TypeOperators, DataKinds, GADTs #-}

{-# OPTIONS_GHC -ddump-tc-trace -ddump-to-file #-}

module Test where

import GHC.TypeNats
#if __GLASGOW_HASKELL__ >= 906
  hiding (type SNat)
#endif

import TestFunctions
import Data.Kind

data SNat :: Nat -> Type where
  SNat :: KnownNat n => SNat n

test30 :: forall a b . (b ~ (2^a)) => SNat a -> SNat (Log (2^a))
test30 SNat = SNat @(Log (2^a))

And this accompanying module:

{-# LANGUAGE CPP, DataKinds, FlexibleInstances, GADTs, KindSignatures,
             MultiParamTypeClasses, RankNTypes, ScopedTypeVariables, TemplateHaskell,
             TypeApplications, TypeFamilies, TypeOperators,
             UndecidableInstances #-}

module TestFunctions where

import Data.Proxy            (Proxy (..))
import Data.Type.Bool        (If)
import GHC.TypeLits.KnownNat
#if __GLASGOW_HASKELL__ >= 802
import GHC.TypeNats
import Numeric.Natural
#else
import GHC.TypeLits
#endif

type family Max (a :: Nat) (b :: Nat) :: Nat where
  Max 0 b = b -- See [Note: single equation TFs are treated like synonyms]
  Max a b = If (a <=? b) b a

instance (KnownNat a, KnownNat b) => KnownNat2 $(nameToSymbol ''Max) a b where
  natSing2 = let x = natVal (Proxy @a)
                 y = natVal (Proxy @b)
                 z = max x y
             in  SNatKn z
  {-# INLINE natSing2 #-}

{- [Note: single equation TFs are treated like synonyms]
Single equation (closed) type families (TF) are treated like type synonyms, this
means that type-applications of such a TF only shows up in its expanded form.

Consequently, the KnownNat solver plugin does not have a TyCon name to look
up the corresponding instance of the KnownNat2 class.
-}

type family Min (a :: Nat) (b :: Nat) :: Nat where
  Min 0 b = 0 -- See [Note: single equation TFs are treated like synonyms]
  Min a b = If (a <=? b) a b

-- Unary functions.
#if __GLASGOW_HASKELL__ >= 802
withNat :: Natural -> (forall n. (KnownNat n) => Proxy n -> r) -> r
withNat n f = case someNatVal n of
  SomeNat proxy -> f proxy
#else
withNat :: Integer -> (forall n. (KnownNat n) => Proxy n -> r) -> r
withNat n f = case someNatVal n of
               Just (SomeNat proxy) -> f proxy
               Nothing              -> error ("withNat: negative value (" ++ show n ++ ")")
#endif

type family Log (n :: Nat) :: Nat where

#if __GLASGOW_HASKELL__ >= 802
logInt :: Natural -> Natural
#else
logInt :: Integer -> Integer
#endif
logInt 0 = error "log 0"
logInt n = go 0
  where
    go k = case compare (2^k) n of
             LT -> go (k + 1)
             EQ -> k
             GT -> k - 1

instance (KnownNat a) => KnownNat1 $(nameToSymbol ''Log) a where
  natSing1 = let x = natVal (Proxy @a)
             in SNatKn (logInt x)

By enabling -ddump-tc-trace, we can observe the type-checking process. The crucial part is this:

Initially, the trace shows a CFunEqCan constraint:

[G] co_a3iD {1}:: (2 ^ a_a3ir[sk:1])
                  GHC.Prim.~# fsk_a3iC[fsk:1] (CFunEqCan)

But after unflattening, it becomes CNonCanonical:

[G] co_a3iE {1}:: (2 ^ a_a3ir[sk:1])
                  GHC.Prim.~# b_a3is[sk:1] (CNonCanonical)

This transformation is the root cause of the issue. The ghc-typelits-knownnat plugin, like other tools relying on mkTyConSubst, expects CFunEqCan and gets tripped up by the CNonCanonical version.

The Impact: Broken Code

So, what's the real-world impact? Well, code that worked perfectly fine with ghc-tcplugins-api-0.17 might suddenly break when you upgrade to ghc-tcplugins-api-0.18 on GHC 9.0 and older. This is exactly what happened in the ghc-typelits-knownnat project, as highlighted in this pull request.

Imagine you have a library that heavily relies on type-level arithmetic and custom type-checking plugins. Suddenly, after a seemingly minor upgrade, your users start reporting cryptic type errors. Debugging this kind of issue can be incredibly time-consuming, as the root cause is buried deep within GHC's internals.

Why Does This Happen? Understanding the Unflattening Process

To truly grasp this issue, we need to delve a bit into what "unflattening" actually means in the context of GHC's type-checking. Unflattening is a transformation that GHC applies to constraints to simplify them and make them easier to work with. It's part of GHC's constraint solver, which is the engine that figures out whether your types are consistent.

Think of it like simplifying an algebraic equation. You might start with a complex expression, but through various transformations, you can reduce it to a simpler, equivalent form. Unflattening does something similar with type constraints.

However, in this case, the simplification process, while generally beneficial, has an unintended side effect. The transformation from FunEqCan to NonCanonical, while technically preserving the meaning of the constraint, makes it incompatible with certain tools like mkTyConSubst.

The Role of mkTyConSubst

mkTyConSubst is a function that creates a substitution that maps type constructors to other types. It's a fundamental building block for many type-checking plugins and other advanced type-level programming techniques. It expects type equalities in a specific, canonical form (FunEqCan). When it encounters a NonCanonical equality, it doesn't know how to handle it, and things break down.

Potential Solutions and Workarounds

Now, let's talk about how to tackle this problem. If you're facing this issue, you have a few options:

  1. Downgrade ghc-tcplugins-api: If possible, you could revert to using ghc-tcplugins-api-0.17. This is a straightforward solution if you absolutely need your code to work as it did before, but it means missing out on any improvements or bug fixes in the newer version.

  2. Adapt Your Plugin: If you're the author of a type-checking plugin, you might be able to modify your code to handle CNonCanonical constraints. This could involve writing additional logic to normalize the constraints or using alternative methods to achieve the same goal as mkTyConSubst. However, this can be a complex undertaking, as it requires a deep understanding of GHC's internals.

  3. Report the Issue: Make sure the issue is reported and tracked in the GHC issue tracker. This helps the GHC developers understand the impact of the bug and prioritize a proper fix in a future release.

  4. Use the Latest GHC: The issue is specific to GHC 9.0 and older. Newer versions of GHC might have addressed this behavior, so upgrading your compiler could be a solution.

A Deeper Dive into Adapting Your Plugin

Let's explore the second option, adapting your plugin, in a bit more detail. If you're determined to make your plugin work with ghc-tcplugins-api-0.18 and GHC 9.0, you'll need to understand how to deal with CNonCanonical constraints. This typically involves a few steps:

  1. Detect CNonCanonical Constraints: You'll need to modify your plugin to recognize when it encounters a CNonCanonical constraint that represents a type equality you care about.

  2. Normalize the Constraint: The key challenge is to somehow transform the CNonCanonical constraint back into a CFunEqCan or an equivalent form that mkTyConSubst can handle. This might involve using GHC's constraint solver API to manipulate the constraint or writing your own normalization logic.

  3. Handle Edge Cases: Be prepared to encounter edge cases and situations where normalization is not straightforward. You might need to add heuristics or fallback mechanisms to handle these cases gracefully.

This is definitely not a task for the faint of heart. It requires a solid grasp of GHC's type-checking machinery and a willingness to dive into the details of constraint solving.

Conclusion

The FunEqCan to NonCanonical issue in ghc-tcplugins-api-0.18 is a subtle but impactful bug that can break code relying on mkTyConSubst. It highlights the complexities of GHC's type-checking process and the challenges of writing robust type-checking plugins. While there are workarounds, such as downgrading or adapting your plugin, the ideal solution is a proper fix in GHC itself.

If you've run into this issue, I hope this explanation has shed some light on the root cause and potential solutions. Type-level programming in Haskell can be incredibly powerful, but it also comes with its share of quirks and challenges. Keep experimenting, keep learning, and don't be afraid to dive deep into the rabbit hole of GHC's internals!

For more in-depth information on GHC's type system and constraint solving, you can check out the GHC User's Guide.

You may also like