
On (Functional) Programming And Software In General
Wed, 13 Mar 2013 18:00:00 GMT

This text is, like most of my texts, just a "ranty" compilation of impressions I get when watching current discussions about programming languages and software. I appreciate most comments, as long as they contain interesting points, especially when they change my point of view. It may not be worthwhile to read, but as I have plenty of other posts here, like my comics, this should not be a problem.

So a few weeks ago I switched from Firefox to Chromium, shortly after I updated my installation of Tiny Tiny RSS. My vserver was a bit too slow to answer all the requests immediately, which is not nice, but on the other hand nothing that I had to change immediately. However, Firefox seems to do active waiting when processing the requests (I also noticed this behavior on other occasions). And compared to Chromium, it performs really badly, even on my machine with 8 GB of RAM and an SSD. And if one tab hangs, all others hang, too, while Chromium runs its tabs in separate processes. And even though many of the files in Firefox's directories are SQLite3 databases, which would support transactions, it is impossible to have two instances of Firefox running on the same profile.

Of course, it is no secret that Chromium also has disadvantages. For example, it is not possible to have a multi-line tab bar, there are no tab groups, and tabs are always at the top of the window, not at the bottom where they belong, which in Firefox can be achieved using Tab Mix Plus. The bookmarks sidebar in Firefox is also much better. And tabs are apparently not synced with other instances, as they would be with Firefox Sync.

I hope that with the change of Opera's rendering engine, Opera will become more usable in the future, and will therefore be a real alternative. I always liked the UI of Opera.

I think Firefox is just one instance of a much more general problem. When I run an old Firefox in a Windows 98 VirtualBox, it performs better than a native modern Firefox. Sure, the modern versions of Firefox support some modern APIs that did not exist in 1998 - but essentially, most of the stuff that can be done now could be done in 1998 too, it just had different names. Of course, PNGs, Flash and some other formats were not supported, but I do not really believe that this is the reason. A similar example can be found when looking at Office. Microsoft Word under Windows 3.1 had functionality that would still be sufficient for most of the people who use it. Of course, in the last years, this software has learned a lot. But the basic functionality mostly stayed the same, so this is basically no excuse.

Probably the coding style got less efficient. A common attitude seems to be that "modern" concepts like abstract classes, JIT compilation, highly dynamic data structures, recursion and garbage collection are the cause of this - that is, a lot of abstractions.

However, I do not think that this is the problem. There have been Lisps around since the 1960s, and Smalltalks since the 1970s. Even Haskell is pretty old now. And they all had reasonable performance on machines that were much weaker than today's computers. The problem seems to be that in those days people made good abstraction layers which were usable for both the user and the computer, while today's big software projects attract the stacking of abstractions which are bad for both the computer and the user.

One example of a good abstraction is the use of hierarchical file systems, which is the de facto standard on most systems. It is overhead for the computer, which has to keep track of the names and types of files, but it is easy compared to more sophisticated database models, and it is an understandable model for the user. Of course, this model imposes limits on both the computer and the user, which is why some user interfaces stack abstractions on top of it. However, now that computers are faster and have much more memory, search databases are becoming more predominant, and maybe hierarchical file systems will vanish or at least lose their importance.

Another example of a good abstraction is the usage of functions and dynamically linked libraries, which also was not always as predominant as it is today. Not only does it make it possible to reuse code, it is also a nice way to bring structure to programs.

An example of a bad abstraction is - in my opinion - XML. It grew out of HTML, which had a special purpose for which it was good. However, as a pure data format, it is hard to read for humans as well as for computers, and the whole validation and namespacing scheme is totally over-engineered and bloated. S-expressions and JSON are two alternatives for structured data which are both human- and machine-readable. Another example is C++. The pile of shit that this language forms is really amazing; it appears to me that, except for the parts that come from C, every single concept in this language is not thought through, in some strange and confusing way. Macros are explicitly not Turing complete, but templates are. Templates generate strange error messages, and it is possible to write code for which it is not easily decidable whether it compiles. On the other hand, there now is a form of Turing-complete macros, the so-called constexprs, but also with limitations that make them virtually useless. That is, though there are two Turing-complete techniques, macro programming in C++ is like a rectal amygdalectomy.

The same is the case with LaTeX, which has a nice syntax for mathematical formulae, but is a mess otherwise. I do not know a single person who really knows the internals of LaTeX; it could as well just be a conspiracy, and in reality, LaTeX does not support programming with macros at all.

Speaking of LaTeX, one can see another common antipattern of large software projects: Not only are bad abstractions stacked inside them, they also contain a lot of code nobody understands and nobody wants to maintain. Mostly it is possible to find a hack around every problem that arises, and some people have a good point why there is no urge to change this. It is a matter of taste whether one explicitly likes evolutionarily grown software; Linus Torvalds seems to like it, I do not. But it is a fact that evolution, be it good or bad, occurs, and software is often not used in the way it was designed for, but in whatever way just randomly happened to be the easiest way of achieving a small goal. The Web used to be just a bunch of static pages, and has now evolved into a full-blown semantic remote desktop protocol, even though there were alternative solutions for this, and even though these would have been better suited for that purpose. There are techniques and guidelines which try to prevent such things, but in general, I guess it cannot be prevented, and maybe it is better to just hack everything together until it works, instead of thinking first - it would be interesting to hear about studies that investigate whether hacking and debugging is faster than thinking about everything in the first place; it might be related to Worse Is Better. However, it might be possible to find a way to get the best out of both worlds: formal program verification.

Formal program verification and hacking seem to contradict each other, but in my opinion, it is at least worth a thought: Formal program verification just ensures that certain properties of a piece of software are satisfied. And they will still be satisfied when that piece of software is used for a completely different purpose. For the web, this would mean that you have your old tag-soup parser, but at least with a verified grammar, even though it is not XHTML-compatible. And you would have a clear infrastructure for your database backends, on which you could rely and write wrappers before abandoning old code.

Well, all this sounds nice in theory. In practice, it is a bit more complicated. One major problem is that formal program verification needs a lot more thinking than just hacking down a few lines of code. It requires abstract mathematical knowledge that not everybody has, which makes it more expensive, especially since it is impossible to abstract away from it, as can be done in numerics or linear algebra.

Additionally, most of the research appears to be done either in automated verification, or in dependently typed functional programming languages. While functional programming languages are a good tool for most purposes, the underlying machine remains inherently imperative, and therefore problems remain that cannot (yet) be solved easily using purely functional programming languages. But well, formal program verification also works for imperative programs, it just appears not to be as popular there.

Hopefully, functional languages will become more suitable for low-level stuff in the future. Before they do, there are still a lot of obstacles to overcome, the hardest probably being the scepticism of the applied, practical world. For example, it is a general belief that functional programming always creates slow code. And to be honest, yes, it is probably slower than well-designed C code, but so are Java, Python, PHP, and many libraries that can be used with C++. The functional paradigm enforces (or at least encourages) a level of abstraction which is useful, but not for free, and the real crux is whether it is worth its costs.

However, there is no elementary law of nature that says that functional programming must be slow. There is research being done, for example around the work of Okasaki, and if you think that functional languages are not fast enough for some problem, go on and do research on it, and make it more efficient!
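To make this a bit more concrete: one classic structure from that line of research is a queue built from two immutable lists. The following is only a minimal Haskell sketch, not taken from any particular library; the names Queue, push, pop, fromList and toList are chosen here for illustration. Adding an element costs O(1), and removing one costs O(1) amortized as long as each version of the queue is used only once, because every element gets reversed at most once.

```haskell
-- A queue as two immutable lists: the first is read from its head,
-- the second collects newly pushed elements in reverse order.
data Queue a = Queue [a] [a]

empty :: Queue a
empty = Queue [] []

-- Add an element at the end of the queue: O(1).
push :: a -> Queue a -> Queue a
push x (Queue front back) = Queue front (x : back)

-- Remove the oldest element, if there is one. When the front list runs
-- dry, the back list is reversed once and becomes the new front list.
pop :: Queue a -> Maybe (a, Queue a)
pop (Queue [] [])            = Nothing
pop (Queue [] back)          = pop (Queue (reverse back) [])
pop (Queue (x : front) back) = Just (x, Queue front back)

fromList :: [a] -> Queue a
fromList xs = Queue xs []

-- All elements, oldest first.
toList :: Queue a -> [a]
toList (Queue front back) = front ++ reverse back
```

Nothing is ever overwritten here: push and pop return a new queue and leave the old one intact, so toList (push 3 (fromList [1, 2])) gives [1, 2, 3] while the original two-element queue stays usable.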

Furthermore, something problematic about functional programming is the amount of memory the software consumes when running. Well, there have been functional languages around since the 1970s, and they worked well, but apparently, they suffer the same fate as every other software: They are bloated. However, a good concurrent garbage collector can decrease memory consumption and can optimize caching - but there are plenty of bad garbage collectors out there. Additionally, memory is cheap, and functional programming is - as soon as you are experienced with it - faster and easier than imperative programming. A lot of heuristics for optimizations can help too. And after all, functional programming languages are still programming languages - the coding style affects the quality of the software. I think it scales in the long term. But currently, it may be too early.

Some people criticize purely functional programming languages for wasting memory and being unsafe. Actually, I would guess that functional code is usually a lot more secure than imperative code, because the common weak spots such as heap corruption and buffer overflows are less likely. Danisch criticizes that it is impossible to safely overwrite memory when coding in a purely functional manner. Well, this person appears to criticize a lot of stuff, and mostly in a way that is beyond good conversational style, at least in my opinion, but in this case, he has a good point (even though he did not really get it himself): There are problems that cannot be easily solved using purely functional programming, and safely deleting a cryptographic key from memory seems like one, since this is a stateful operation. Of course, as soon as no reference points to this part of memory anymore, nothing will access it - at least if the whole system was completely correct, and you will hardly find such a system. Well, there are several methods of using stateful operations in purely functional programming languages. Uniqueness types are one, but monads are the most popular of them - and having the key encapsulated in a monad sounds like a nice API, actually: You can only access the key as long as you are inside the monad, and as soon as you leave it, you have the guarantee that the key is destroyed. So, for this special kind of problem, it is possible to give a purely functional API. But this API would still be a binding to something imperative: This kind of problem is below the usual abstraction level of today's functional programming languages. It would be nice to have something like this for system-level operations, too, but I do not know of anything in that direction, except Movitz, which is not purely functional, however. On the other hand, I cannot let this statement pass without a red herring of my own: It appears to not even be trivial in C to ensure that memory is actually deleted. And with all the swapping, paging and confusing indirection of memory in current machines, it is questionable whether this is trivial in assembler.
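Such a key monad could look roughly like the following Haskell sketch. Everything in it is hypothetical and made up for illustration: the type WithKey, the runner runWithKey and the toy operation xorWithKey (which is not real cryptography). It only uses modules from GHC's base library. The key is copied into a pinned buffer, the computation can only reach it through the operations of the monad, and the buffer is overwritten with zeros when the computation ends, even if it ends with an exception.

```haskell
import Control.Exception (finally)
import Data.Bits (xor)
import Data.Word (Word8)
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Marshal.Array (pokeArray)
import Foreign.Marshal.Utils (fillBytes)
import Foreign.Ptr (Ptr)
import Foreign.Storable (peekElemOff)

-- Hypothetical monad: a computation that may use a key of the given length
-- stored behind the pointer. In a real module the constructor would not be
-- exported, so user code never sees the pointer itself.
newtype WithKey a = WithKey { unWithKey :: Ptr Word8 -> Int -> IO a }

instance Functor WithKey where
  fmap f (WithKey g) = WithKey (\p n -> fmap f (g p n))

instance Applicative WithKey where
  pure x = WithKey (\_ _ -> pure x)
  WithKey f <*> WithKey g = WithKey (\p n -> f p n <*> g p n)

instance Monad WithKey where
  WithKey g >>= k = WithKey $ \p n -> do
    x <- g p n
    unWithKey (k x) p n

-- Toy operation, NOT real cryptography: XOR a message with the repeated key.
-- Key bytes are read one at a time and the result is forced immediately.
xorWithKey :: [Word8] -> WithKey [Word8]
xorWithKey msg = WithKey $ \p n ->
  if n == 0
    then pure msg
    else mapM (\(i, b) -> do
                 k <- peekElemOff p (i `mod` n)
                 pure $! (b `xor` k))
              (zip [0 ..] msg)

-- Copy the key into a temporary pinned buffer, run the computation, and
-- zero the buffer afterwards, whether the computation succeeds or throws.
runWithKey :: [Word8] -> WithKey a -> IO a
runWithKey key action =
  allocaBytes n $ \p -> do
    pokeArray p key
    unWithKey action p n `finally` fillBytes p 0 n
  where
    n = length key
```

For example, runWithKey [1, 2] (xorWithKey [1, 2, 3]) yields [0, 0, 2]. The sketch shows the shape of the API and nothing more: the list passed to runWithKey still lives on the garbage-collected heap and is not wiped, and the runtime may hold temporary copies of single bytes, so it illustrates the point above that such an API is only a binding to something imperative underneath.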

I agree that for that kind of thing, functional languages are not (yet) suitable. But not being suitable was never a reason.

Related to this is the common will to program "on the metal". User mode x86 is not really "on the metal"; with all the paging and the several indirections, it is rather a virtual architecture of its own. Kernel programming and programming for embedded devices is probably sort of "on the metal". But most code is not kernel code. And for small embedded devices, one should probably employ a different coding style than usual, anyway.

And while we are at arguments against functional programming, a common one is that no relevant commercial product was written in a functional language. Well, firstly, commercial usage does not imply quality: PHP and Java are both widespread but hated by many people. Commercial usage can depend on so many things, and quality is just one of them. Furthermore, if you count Lisp as a functional programming language, there were several Lisp Machines, there was Crash Bandicoot, there are Allegro and LispWorks. Reddit was originally written in Lisp. Then there is Jane Street Capital (whatever they actually do). And several other companies use functional languages. It may still be a niche, but it is questionable whether it stays one.

I guess it will not stay a niche, but I also do not expect too much positive development from this: You can mess up functional code, too, and I am pretty sure this is what will happen. Instead of using the additional layer of abstraction to write in a simple, short and clear way, I expect the abstraction layers to be stacked and unverified code to be left unmaintained, as is done now. The whole enthusiasm about Haskell and its monads is an instance of this: Some Haskell tutorials read like mathematical papers rather than programming tutorials, and while I have the knowledge to actually understand them, many people may not have it, and most people, including me, are not willing to read a paper about how to embed mutable arrays into a monad instead of just seeing a simple and clear example of their usage. To me it sometimes seems like the Haskell community tries to establish an elitist argot of over-formalization to enforce some strange sort of ideology, but without really providing anything new and worthwhile - a programming concept that is not easily understandable and does not solve a common problem that has no widely accepted different solution is not worth anything at all. For state and IO, I consider uniqueness types the clearer and more useful abstraction. However, monads are useful too, if you do not over-formalize them. For theoretical computer science, talking about monads as monoids in the category of endofunctors is good practice, but the nature of theoretical computer science is to be a bridge between theory and actual programming, and monads are practical, intuitive and natural in many situations, if explained correctly. However, it is a concept that is useful for some problems. Nothing more.
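A usage example for mutable arrays in a monad does not need any category theory. The following is a minimal sketch, assuming GHC with its standard array package; the function names sieve and primesUpTo are made up here. It is the Sieve of Eratosthenes: inside runSTUArray the array is overwritten in place, just as in an imperative language, but from the outside sieve is an ordinary pure function.

```haskell
import Control.Monad (forM_, when)
import Data.Array.ST (newArray, readArray, runSTUArray, writeArray)
import Data.Array.Unboxed (UArray, assocs)

-- Sieve of Eratosthenes over the indices 2..n. The array is mutable only
-- inside runSTUArray; the result is an immutable array.
sieve :: Int -> UArray Int Bool
sieve n = runSTUArray $ do
  isPrime <- newArray (2, n) True          -- start with everything marked prime
  forM_ [2 .. n] $ \i -> do
    p <- readArray isPrime i
    when (p && i * i <= n) $
      -- cross out the multiples of i, starting at i * i
      forM_ [i * i, i * i + i .. n] $ \j ->
        writeArray isPrime j False
  return isPrime

-- Collect the indices that are still marked as prime.
primesUpTo :: Int -> [Int]
primesUpTo n = [i | (i, True) <- assocs (sieve n)]
```

With this, primesUpTo 30 evaluates to [2,3,5,7,11,13,17,19,23,29]. The only monadic vocabulary a reader needs is newArray, readArray and writeArray.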

