Introduction to GHC's Generics
In my previous log, I have focused on an annoying
aeson
, I have mentioned that it
was involving some GHC's Generic
s
code which were too complex for this log.
I had some feedbacks suggesting that it might be interesting to dive deeper in this subject, so here I am!
In brief, GHC's Generic
s
is a mechanism allowing to write code depending on data-types type-level structure.
It is defined as this:
-- | Generic representation type
type Rep a :: Type -> Type
-- | Convert from the datatype to its representation
Rep a
gives a representation based on very specialized type (e.g. M1
, K1
, :+:
, etc.),
I have never made sense of it (even after 15-20 derivation mechanisms implementation),
so, instead I rely on examples.
Usually you let the GHC deriving it:
data T
= A0
| A1 ()
| A2 { a2f0 :: () }
| A3 { a3f0 :: (), a3f1 :: () }
deriving stock (Generic)
Let's have a look at Rep T
in ghci
(GHC's repl)
> :kind! Rep T
Rep T :: Type -> Type
= M1
D
(MetaData "T" "Ghci3" "interactive" False)
((M1 C (MetaCons "A0" PrefixI False) U1
:+: M1
C
(MetaCons "A1" PrefixI False)
(M1
S
(MetaSel
Nothing NoSourceUnpackedness NoSourceStrictness DecidedLazy)
(K1 R ())))
:+: (M1
C
(MetaCons "A2" PrefixI True)
(M1
S
(MetaSel
(Just "a2f0") NoSourceUnpackedness NoSourceStrictness DecidedLazy)
(K1 R ()))
:+: M1
C
(MetaCons "A3" PrefixI True)
(M1
S
(MetaSel
(Just "a3f0") NoSourceUnpackedness NoSourceStrictness DecidedLazy)
(K1 R ())
:*: M1
S
(MetaSel
(Just "a3f1") NoSourceUnpackedness NoSourceStrictness DecidedLazy)
(K1 R ()))))
Overwhelming, I know, let's break this down.
I have the following heuristics: datatype are described by layers of
SomeType metadataPhantomTypes actualTypes proxy
.
Let's peel the onion:
M1 D (MetaData "T" "Ghci3" "interactive" False)
M1
is a metadata holder (here the context) of some structure (D
is for datatype):+:
is the sum type representation (the|
like inA0 | A1
)M1 C (MetaCons "A0" PrefixI False) U1
is the metadata of a constructor (C
is for constructor) without value (U1
), alsoPrefixI
indicate the fixity (i.e. prefix / infix)(M1 S (MetaSel Nothing NoSourceUnpackedness NoSourceStrictness DecidedLazy) (K1 R ()))
is a metadata of a record selector (S
) without "label" (Nothing
) with a field (K1 R
) of type unit (()
)M1 C (MetaCons "A2" PrefixI True) (M1 S (MetaSel (Just "a2f0") NoSourceUnpackedness NoSourceStrictness DecidedLazy) (K1 R ()))
similar to the previous one, except the named "label" (Just a2f0
):*:
is the product type representation (it happens when you have multiple values per constructorsA0 Int Char
)
It's quite verbose, but each type is simple by it-self.
Then we have to process them, let's illustrate that with an extract of the library I have described in my previous log:
It will be our type class
to work on Generic
types.
It will be used to define generic instances (based on from
output):
(Generic a, GToSumText (Rep a)) =>
ToSumText (PureSum a)
where
toSumText = gToSumText . from . unPureSum
We can start with the top-level case which is not interesting, so we only extract its value:
-- base type
gToSumText (M1 x) = gToSumText x
Then we arrive at constructor-level, and since we only need constructor name,
we can stop here and extract it through symbolVal
which convert a type-level
literal (string) to a value-level String
:
-- constructor
gToSumText (M1 _) = T.pack $ symbolVal $ Proxy @cntr
Then, the last interesting case is the sum type:
-- sum type
gToSumText (R1 x) = gToSumText x
gToSumText (L1 x) = gToSumText x
Now, what will happen ate runtime, let's take A0
, which will be converted via
from
to:
M1 -- Datatype
(L1 -- Left part of the sum type (:+:)
(M1 -- "A0" constructor
U1 -- No value
)
)
The heavy-lifting being done at instance selection.
It works well when you have the value, but when you only have the type, it's a bit more complex.
For instance, when we had to define FromJSON (PureSum a)
, we need type's
name for withText :: String (Value -> Parser a) -> Parser a
parseJSON = withText (getConst $ (to @a) <$> gConstructorName) pureSumWithParser
To do so, we can rely on Const
, which as a non-operant Functor
instance
.
Not, it is not operant because it is a functor over a phantom type
newtype Const a b = Const { getConst :: a }
fmap _ (Const v) = Const v
Finally we can simply fetch the literal:
-- base type
gConstructorName = Const $ symbolVal (Proxy @typeName)