I’ve been studying THIH (Typing Haskell in Haskell) by Mark P. Jones recently; it’s a good, executable introduction to the mechanics of type inference in our favorite language.
There is a copy of the original repository on Hackage (thih: Typing Haskell In Haskell, uploaded by Gwern in 2008), but until today it did not have a backup maintainer.
I think it would be valuable to save this project from bitrot and perhaps feature it in a more central place in our ecosystem. It certainly deserves better Haddock documentation and more readable tests.
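For readers who have not opened THIH yet, the heart of the inference machinery is unification over a small type representation. Below is a minimal sketch in that spirit, not the paper's actual code: it drops kinds entirely, uses `Data.Map` for substitutions, and reports errors through `Either String`, all of which are simplifying assumptions of mine. The names `compose`, `occurs`, `bind`, and `fn` are illustrative.

```haskell
import qualified Data.Map as Map

-- A simplified type representation (no kinds, unlike THIH itself).
data Type
  = TVar String      -- type variable
  | TCon String      -- type constructor such as Int or (->)
  | TAp Type Type    -- type application
  deriving (Eq, Show)

type Subst = Map.Map String Type

-- Apply a substitution to a type.
apply :: Subst -> Type -> Type
apply s t@(TVar v) = Map.findWithDefault t v s
apply _ t@(TCon _) = t
apply s (TAp l r)  = TAp (apply s l) (apply s r)

-- compose s1 s2 behaves like applying s2 first and then s1.
compose :: Subst -> Subst -> Subst
compose s1 s2 = Map.map (apply s1) s2 `Map.union` s1

-- Does the variable occur anywhere in the type?
occurs :: String -> Type -> Bool
occurs v (TVar u)  = v == u
occurs _ (TCon _)  = False
occurs v (TAp l r) = occurs v l || occurs v r

-- Bind a variable to a type, rejecting infinite types.
bind :: String -> Type -> Either String Subst
bind v t
  | t == TVar v = Right Map.empty
  | occurs v t  = Left ("occurs check fails for " ++ v)
  | otherwise   = Right (Map.singleton v t)

-- Most general unifier of two types, if one exists.
mgu :: Type -> Type -> Either String Subst
mgu (TAp l r) (TAp l' r') = do
  s1 <- mgu l l'
  s2 <- mgu (apply s1 r) (apply s1 r')
  return (compose s2 s1)
mgu (TVar v) t = bind v t
mgu t (TVar v) = bind v t
mgu (TCon a) (TCon b) | a == b = Right Map.empty
mgu t1 t2 = Left ("cannot unify " ++ show t1 ++ " with " ++ show t2)

-- Helper for building function types.
fn :: Type -> Type -> Type
fn a b = TAp (TAp (TCon "->") a) b
```

In GHCi, `mgu (fn (TVar "a") (TCon "Int")) (fn (TCon "Bool") (TVar "b"))` yields a substitution sending `a` to `Bool` and `b` to `Int`, while `mgu (TVar "a") (fn (TVar "a") (TCon "Int"))` fails the occurs check.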
I think it would be great to have a simple implementation of the type system to do research on. I wanted to see what it would take to implement MLF and also found THIH, but I bounced off.
I also copied the thih source and iterated on it to produce Duet (GitHub: chrisdone/duet), a tiny language and subset of Haskell aimed at helping teachers teach Haskell. The thing missing from THIH, which I had to figure out, was converting type class contexts into actual dictionaries. That’s a whole extra stage. Plus kind checking for data declarations, type checking of instances, and stuff like that.
(I’ve since done a complete ground-up implementation of a compiler (https://inflex.io/) with type classes, based on this experience. The architecture is quite different (Inflex - Google Slides), and it’s not open source yet.)
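To make the "contexts into dictionaries" stage concrete, here is a small hand-written sketch of what such an elaboration might produce. It is not taken from Duet or Inflex; the class `Pretty` and the names `PrettyDict`, `prettyBool`, `prettyList`, and `describeD` are hypothetical and only illustrate the general dictionary-passing idea.

```haskell
-- Source-level program, as the user writes it.
class Pretty a where
  pretty :: a -> String

instance Pretty Bool where
  pretty True  = "yes"
  pretty False = "no"

instance Pretty a => Pretty [a] where
  pretty xs = "[" ++ concatMap pretty xs ++ "]"

describe :: Pretty a => a -> String
describe x = "<" ++ pretty x ++ ">"

-- One possible elaboration: the class becomes a record of methods.
newtype PrettyDict a = PrettyDict { prettyD :: a -> String }

-- An instance without a context becomes a plain dictionary value.
prettyBool :: PrettyDict Bool
prettyBool = PrettyDict (\b -> if b then "yes" else "no")

-- An instance with a context becomes a function between dictionaries.
prettyList :: PrettyDict a -> PrettyDict [a]
prettyList d = PrettyDict (\xs -> "[" ++ concatMap (prettyD d) xs ++ "]")

-- A constrained function takes its dictionary as an explicit argument.
describeD :: PrettyDict a -> a -> String
describeD d x = "<" ++ prettyD d x ++ ">"
```

A call like `describe [True, False]` would then be rewritten to `describeD (prettyList prettyBool) [True, False]`, where the dictionary argument is assembled from the instances that entailment selected. THIH checks that the predicates are satisfiable but does not build these arguments, which is why this needs a stage of its own.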
Fortunately, we are doing exactly that! We have a paper that formalizes a very large part of GHC Haskell’s type system, and we are also developing a typechecker that implements this formalism. It is kind of like Practical Type Inference for Arbitrary Rank Types, but much bigger.
The math is nearly finished, and most of my time is currently dedicated to the implementation.
Here are the extensions that the formalism currently specifies:
One of the main goals of the implementation is to be in a one-to-one – and when possible, even line-to-line – correspondence with the rules.
The paper doesn’t give a specification for kindchecking datatypes, classes, synonyms, pattern synonyms, and type family declarations. Instead, it assumes that these are already in the environment. Also, if I’m not mistaken, the solver is planned to be able to solve everything GHC solves other than FunctionalDependencies and ImplicitParams. Additionally, we might not support representational equality.