Is parsing not an illusory problem caused by using
the wrong data structure? (Harrison Ainsworth)

Thoughts on a well-typed unsafePerformIO
Mon, 17 Aug 2015 21:56:01 GMT

Using state monads works well with Haskell, but as soon as you try to actually verify your software, it gets a lot harder. I am a fan of uniqueness types – they fit well into the framework of MLTT, and can be statically checked. Idris has experimental support. However, it can be undesirable to have purely syntactical extensions that lie outside of the scope of the type system. Extending the type system is possible but not the method of choice – in my opinion, the type system should be as simple as possible.

DiffArrays follow a different pattern: They are destructively modified in the backend, but their interface is referentially transparent: Old versions of the destructively modified array can be recreated. In theory, the overhead is limited to changing an old reference, as long as the array is used in a linear fashion. Native Coq uses them. Their definition is:

type 'a t = ('a kind) ref
and 'a kind =
  | Array of 'a array 
  | Updated of int * 'a * 'a t

That is, a diffarray is a (mutable) reference to an object that is either an array, or a key-value-diffarray-triple. Changing the value of an index in a diffarray "feels" like creating a modified copy of it. As soon as an old version of the array is modified, the array is copied; in this case, the algorithm will be slow.

There is a canonical equivalence relation between two diffarrays, namely the one that relates diffarrays which map the same key-value-pairs. As long as we are only using the interface for creating modified pseudo-copies of an array and reading its elements, there is no way to distinguish between arrays that are equal modulo this relation. If we forget for a moment that every diffarray is a mutable reference that points to an array or another such reference, we get the following algebraic definition of diffarrays:

$\mathbb{A}(B) := \mu_X(\mathbb{A}_1:(\mathbb{N}\rightarrow B)\rightarrow X, \mathbb{A}_2:B\rightarrow\mathbb{N}\rightarrow X\rightarrow X)$

Looking up the nth element would become

$(\mathbb{A}_1 m)_n = mn$

$(\mathbb{A}_2 qmd)_n = \left\{\begin{array}{cl}q&\mbox{ for } m=n \\ d_n&\mbox{ otherwise}\end{array}\right.$

The equivalence would become $a\equiv b :\Leftrightarrow \forall n, a_n = b_n$ . Any function that only uses our defined lookup function to access our array structure cannot distinguish equivalent diffarrays. We could as well work with $\mathbb{A}/\equiv$ . Especially, we could even replace an occurence of one representation of an array by another one, without significantly changing that term's behavior – but replacing one term with another is the same as destructively modifying a reference. And that is exactly what a diffarray is: Destructively modifying parts of terms, but in a way that cannot change the term's behavior.

And I assume that most uses of Haskell's unsafePerformIO are of this kind: They implement some stateful operation, but encapsulate it into a referentially transparent interface that makes different states indistinguishable.

Maybe by using the above construction, it would be possible to encapsulate some of such use cases into a dependently typed framework. Maybe in the form of quotient types, or in the form of higher inductive types.

Update: There appears to be a paper about basically the same thing. It might be even more useful than I thought, persistent arrays can be implemented better with mutable state. And furthermore, I was thinking about how this kind of thing could be used for memoization. I think it is undoubtedly useful.

Show comments (Requires JavaScript, loads external content and cookies from Disqus.com)

Thoughts on a well-typed unsafePerformIOMon, 17 Aug 2015 21:56:01 GMT

Thoughts on a well-typed unsafePerformIO
Mon, 17 Aug 2015 21:56:01 GMT