╔══════════════════════════════════════════════════════════════════════════════╗
║ ║
║ A Nominal Intensional Dependent Type System ║
║ ║
╚══════════════════════════════════════════════════════════════════════════════╝
┌───────────────────┬──────────┬───────────────────┐
│ │ Abstract │ │
│ └──────────┘ │
│ │
│ This article describes a method for verifying    │
│ the type safety of a dependently typed language  │
│ without the need for proofs of extensionality,   │
│ with relatively minimal abstract interpretation. │
└──────────────────────────────────────────────────┘
Background and Context
────────────────────────────
Statically typed programming languages rely on a type system to express the
notion of correctness. A type-checking algorithm inspects a program in that
language, models its properties using the type system, and, if the model obeys
the rules of the type system, reports that the program is "type safe".
The type system is normally a separate language from the actual programming
language. The programming language is called the "term" language and the type
system language is called the "type" language. These two languages are separate
because the type language is normally a much simpler language that isn't as
expressive as the term language. This means the type language can't express as
many things as the term language, but the things it can express are guaranteed
to follow the rules and be "type safe".
Dependent type systems don't use a separate language; they use the same
language for both terms and types. This means that the type-level model of a
program is... the program itself. If you ask a mathematician, they'll explain
this using "dependent sums" and "dependent products". It's non-obvious how
these have a computational interpretation, so I won't use them. An easier way
to describe dependent typing is with the lambda cube. A full dependently typed
language is classified as a
"higher-order dependently typed polymorphic lambda calculus".
In other words, you can use the same language to create a function from
A) term to term
B) term to type
C) type to term
D) type to type
Type assertions in a dependently typed language work just like assertions in
every other language, except that the type in the assertion is a normal
expression in the language. In pseudocode, if I wanted to make a variable that
binds an instance of "A", I might write something like this (where some-expr
stands in for whatever expression produces the value),
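┏╾─────────────────────────────╼┓
╿                               ╿
│ (define (my-a : A) some-expr) │
╽                               ╽
┗╾─────────────────────────────╼┛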
And if I wanted to assert that the result of an expression, say (f x), is an
instance of type B, I might write something like,
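┏╾───────────╼┓
╿             ╿
│ ((f x) : B) │
╽             ╽
┗╾───────────╼┛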
In both examples, A and B are terms, not types. That is to say they're valid
expressions by themselves, outside of a type constraint.
Now, let's say that I had a function (a toy definition, purely for
illustration),
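┏╾───────────────────────────╼┓
╿                             ╿
│ (define double x → (x + x)) │
╽                             ╽
┗╾───────────────────────────╼┛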
What would the type of this function be? Simple (assuming it maps Naturals to
Naturals)...
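┏╾───────────────────╼┓
╿                     ╿
│ (Natural → Natural) │
╽                     ╽
┗╾───────────────────╼┛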
But here's where things get tricky. Let's say I had another function,
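┏╾──────────────────────────╼┓
╿                            ╿
│ (define twice x → (2 * x)) │
╽                            ╽
┗╾──────────────────────────╼┛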
and I wanted to assert that
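┏╾────────────────╼┓
╿                  ╿
│ (twice : double) │
╽                  ╽
┗╾────────────────╼┛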
This is difficult because the constraint is not constructed with a simplified
type language, but with the whole term-level language. I can't create an
abstract model of the constrained term anymore; I have to evaluate the term
language, and I need whatever environment the term depends on to be evaluated
as well.
Here's a new word: extensionality. It's the property that two functions,
which might have different constructions, still share the same image and
preimage. For total functions with finite images and preimages, you can decide
whether two functions are extensionally equal (given enough time). Partial
functions are a different story. Now you have to grapple with the halting
problem. If a function diverges, you will never be able to decide if its image
and preimage are the same as some other function's, because you will never be
able to identify all the members of the image/preimage. You will only have a
partial image/preimage to work with. We got lucky with the example above
because those two functions are trivially total and it's easy to see that they
do the same thing just by looking. In the real world, it's usually much more
complicated.
This is the great big snag with dependently typed languages. It's probably
the biggest reason why they haven't caught on like other paradigms and type
systems. If you want to use the same language at both the term and the type
level, you have to incorporate formal mathematical proofs into your program so
that something like this (a sketch, asserting a closed form against a
recursive definition; the names and syntax are illustrative),
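┏╾────────────────────────────────────────────────────────╼┓
╿                                                          ╿
│ (define sum-to n → ((n > 0) ? (n + (sum-to (n - 1))) 0)) │
│ (define (gauss : sum-to) n → ((n * (n + 1)) / 2))        │
╽                                                          ╽
┗╾────────────────────────────────────────────────────────╼┛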
might actually typecheck before the heat death of the universe. You could say
there are two glaring issues with dependently typed languages...
1) Good luck creating a formal mathematical proof for every little bit of your
   program. Oftentimes there's no proof that you could ever construct which
   will properly verify your program.
2) Formal mathematical proofs are very poor at telling you whether you've met
   product design's requirement that the top-left button should be purple on
   every second Tuesday.
At this point, anyone who's a fan of dependently typed languages might offer
some common rebuttals to all I've been saying, such as, "Most of the time
your code contains things that the compiler can trivially prove," or "It's not
that hard once you get used to it." I beg to differ. The moment you have to
engage with proof tactics, you've lost. A useful dependently typed language
should at least allow you to conduct proofs about your programs in the same
language as your programs. Additionally, a useful dependently typed language
should allow you to write programs as proofs. No dependently typed language
exists which provides this functionality.
Introducing Nominal Constraints and Intensionality
────────────────────────────────────────────────────────
So the problem is that dependent types break down when you try to do any
moderately involved program verification with them. They break down largely
because useful programming languages are full of diverging partial functions.
Especially in the embedded world, which needs reliable, high-uptime software,
the very mechanisms that provide the uptime obfuscate many forms of dependent
type checking: looping and recursion.
My solution to this is to flip dependent type checking upside down. Take
any ol' function. A mathematician might look at it and ask, "Is evaluating
this function the same as evaluating this other function?" A computer scientist
would instead ask, "How can I predict and constrain the effects of evaluating
this function?" These are two entirely different questions. Only the first is
answered by extensionality. Ironically, the latter is what most people are
asking when they turn to dependent types for more control over type checking.
Flip it upside down. Instead of asking about extensionality, ask about
intensionality. How is this function constructed, so that I can make
predictions about what it does?
The trick is that the second question is much, much easier to answer than
straight up extensionality. For example, maybe I have a function "foo" and I
want it to do the same thing as "bar". The easiest way to enforce this is to
make sure that "foo" calls "bar" somewhere in its body. That's exactly what
a Nominal Intensional Dependent Type System does.
Let's build up an intuition by asking a question. If I have a function (I'll
leave its body abstract; the sketches below use illustrative syntax),
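┏╾────────────────────╼┓
╿                      ╿
│ (define foo (A → B)) │
╽                      ╽
┗╾────────────────────╼┛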
how many programs can I write that call foo with A and result in B?
That's a simple question. The easiest answer is,
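┏╾───────╼┓
╿         ╿
│ (foo A) │
╽         ╽
┗╾───────╼┛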
But what about this program?
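┏╾────────────╼┓
╿              ╿
│ (let a = A   │
│  in (foo a)) │
╽              ╽
┗╾────────────╼┛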
Or this program...
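┏╾─────────────────╼┓
╿                   ╿
│ (foo ((x → x) A)) │
╽                   ╽
┗╾─────────────────╼┛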
...or this program...
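┏╾────────────────╼┓
╿                  ╿
│ (let b = (foo A) │
│  in ((x → x) b)) │
╽                  ╽
┗╾────────────────╼┛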
You might start to notice a pattern in these examples. There are certain ways
that you can arrange a program where the flow will preserve some data. In the
case of these examples, I'm preserving A until foo is applied to it, after
which I'm preserving B until it's yielded. A Nominal Intensional Dependent
Type System provides a formal way to check this flow property in a program.
How this works in nᵗʰ
───────────────────────────
In the programming language, nᵗʰ, which I'm developing, Nominal Intensional
Dependent (NID) typing is used when type checking functions and code blocks. You
can engage it simply by using a normal function as a type. Normally, you can
create symbolic functions that match over things abstractly, such as this
"type level" dependent function,
┏╾───────────────────────────────────────────╼┓
╿ ╿
│ (define return-a-type (Boolean → Type) │
│ (True → Natural) │
│ (False → Rational)) │
╽ ╽
┗╾───────────────────────────────────────────╼┛
which can be used as a signature for other, more specific functions,
┏╾────────────────────────────────────────────╼┓
╿ ╿
│ (define (return-an-instance : return-a-type) │
│ b → (b ? 1 1.0)) │
╽ ╽
┗╾────────────────────────────────────────────╼┛
However, what if I take a normally defined function,
┏╾────────────────────────╼┓
╿ ╿
│ (define A x → (f (g x))) │
╽ ╽
┗╾────────────────────────╼┛
and use it as the signature for another "normal" function,
┏╾────────────────────────╼┓
╿ ╿
│ (define (B : A) x → h x) │
╽ ╽
┗╾────────────────────────╼┛
How would I check this if not through dependent type checking with formal
proofs, tactics, and unification? The answer is to compare the definitions of
both functions. This is similar to the normal intensional relation where two
functions have the same image and preimage if they have the same definition.
However, A and B are decidedly different. This is where the Nominal bit comes
into play. If B included all the same function applications as A, B would
appear the same as A extensionally. Using NID typing, we can adjust the
definition of B to have the same effects as A like so,
┏╾──────────────────────────────────╼┓
╿ ╿
│ (define (B : A) x → (f (g (h x)))) │
╽ ╽
┗╾──────────────────────────────────╼┛
assuming that h yields a result type which is accepted by g, the definition,
(f (g (h x))), now conforms to the constraint, (B : A), because A calls g,
then f, then yields the result, and now B does the same. This is much
different from normal checks and proofs for extensionality. There's not even
a guarantee of true extensionality, since A and B may diverge. However, by
constraining what can appear in the definition of B to the definition of A,
we might establish a partial extensional relation where B must have at most
a subset of the preimage and image of A. From the perspective of a
computer scientist interested in constraining program behavior to a verifiable
subset, this partial extensional relation is very desirable.
The informal rules for how to satisfy a NID type constraint are as follows,
1) The constrained function must have the same result type(s) as the
   constraining function.
2) The constrained function is not required to have the same parameter types as
   the constraining function.
3) If the constraining function features a function application, the
   constrained function must also apply that same function, by name.
4) The constrained function may include applications of functions that satisfy
   identity anywhere in its definition.
5) The constrained function may include an arbitrary number of additional
   sequential expressions, at any point, so long as the function applications
   of the constraining function are preserved.
Some More Examples
────────────────────────
Here are some rapid-fire examples which are all valid according to the NID
type rules...
┏╾─────────────────────────────────╼┓
╿ ╿
│ (define foo (x → (x + 1))) │
│ (define (bar : foo) │
│ (x → │
│ (foo (let m = x │
│ in (m ← (m + 1), │
│ m ← (m + 1)))))) │
╽ ╽
┗╾─────────────────────────────────╼┛
┏╾─────────────────────────────────────────────────╼┓
╿ ╿
│ (define foo (x → (A), (B), (C))) │
│ (define (bar : foo) (x → (P), (Q), (R), │
│ (A), (K), (X), │
│ (Y), (Z (Z' (Z'' (B)))), │
│ (C))) │
╽ ╽
┗╾─────────────────────────────────────────────────╼┛
┏╾────────────────────────────────────────────────────────────╼┓
╿ ╿
│ (define foo (x → (P), (Q (R)))) │
│ (define (bar : foo) (x → (I), (J), (K), (P), │
│ (let id = (x → x) │
│ in ((compose id id id id Q id id) (R)) │
│ ))) │
╽ ╽
┗╾────────────────────────────────────────────────────────────╼┛
Formal Abstract Nonsense
──────────────────────────────
Nonsense will be added once I finish making it formally abstract...