Thoughts on Concepts for Programming Paradigms
Fri, 13 Apr 2012 11:38:57 GMT

I think there is a great confusion on programming paradigms right now. "Functional vs. Imperative" is, for example, a great discussion, with a lot of strange arguments, mixing together completely different opinions, which have nothing to do with each other.

What does, for example, "functional" mean? Matthias Benkard has pointed out that "functional programming is algebraic programming". And here is a nice article on what "object oriented" means.

So at first, one might ask what such systems of paradigms are good for, anyway. It is not immediately clear why it is useful to make these rather coarse distinctions between languages. And it is questionable whether such classifications give any new insights rather than just being pointless technobabble, inventing concepts just for their own sake.

I think they are useful. They can help us to understand similarities between programming languages, and also their differences, which means that it is easier to compare programming languages. For a low-level image encoder, for example, one will probably not choose a functional language, while for a complex tree reduction, one might consider it as it is optimized for this kind of problems, and will bring you to your goal faster.
Also, when trying to combine languages or their features, it can help estimating the compatibility. It is rather hard, though possible, to have a purely functional language without garbage collection, and is probably not worth the effort. On the other hand, formal verification of imperative code always somehow yields a functional embedding, which makes it mostly not worth the effort.
Furthermore, such a classification makes it easier to learn about programming languages, as you can easily see their similarities and do not need to learn things twice.

One should not forget, however, that every such classification should only be considered as a model. It cannot be absolute, it must be rethought regularly, and it cannot be exhaustive, as long as new concepts evolve.

I will try to give a classification that sounds reasonable for me, but I will also give examples of languages that do not fit in there. My goal is not to aggressively fit every language into some category for its own sake, but to try to point out similarities and differences.
Also keep in mind that most languages are mixtures, and the distinctions here should not be seen in a black-white-manner, but rather as extreme points in some continuous space.
And, as one further thing before I start, I would like to point out a fallacy that many people have when such distinctions are made: You will hardly find a language in which you can not, in theory, emulate the behavior of another language. That is simply due to the fact that most languages are turing complete. However, the classifications address the way of programming a language encourages, rather than what is possible to do with it.

So, the first distinction that sounds reasonable to me is between functional, function-oriented and flat programming languages.

Functional languages are those languages which have functions as first-class objects, and encourage their use. They usually have closures, some have continuations, most have optimizations for tail recursion, and code written in these languages usually makes heavy use of recursion, currying and lambda-abstraction. All Lisps I know are, by this definition, functional. All functional languages I know, except NewLisp, have a garbage collector. While C++ meanwhile has lambda-forms, and and Java has interfaces with closures in their methods, both do not encourage this kind of programming, so they are not functional programming languages. Same for python and perl, in which lambda forms are just a feature, not a principle.
Inside functional languages, the availability of ephemeral data structures can cause many problems due to the deep nesting of objects and functions which is often involved. Therefore, side effects are discouraged in most modern functional languages.

Function-oriented languages do have functions, and they are used for code reuse and modularization, and the user is not required to know the actual mechanisms of a function call. Many of them support recursion and a crude form of function variables, for example function pointers in C, but they are usually used for callbacks only. Function-oriented languages are currently the most widely used type. Old BASIC dialetcs are not function-oriented. Though they have GoSub and Return as directives, these are instructions to manipulate the program flow, rather than structuring the code. Same goes for x86 assembly.

Flat languages do not have functions at all. Most assembly languages are flat. Old BASIC dialects are flat. MS-Dos-Batch-Files are flat. In some sense, also Prolog and SQL are flat. Spreadsheets are flat.

The second classification is to distinguish data-oriented and code-oriented programming languages, where data-oriented programming languages can be algebraic or relational, with object-oriented languages as a special case, and code-oriented programming languages can be liquid and solid (I could not come up with a better metaphor, sorry). Though it might seem so, these categories are not mutually exclusive.

Data-oriented languages focus on the data they work with, rather than the code.

Algebraic languages, as a form of data-oriented languages, focus on the structure of the data they work with. They have strong type systems. They usually have algebraic inductive types and some form of polymorphism, which is why they are usually also functional. Code often involves a lot of datatype definitions rather than actual functions, and functions often consist of a few recursion operators for the algebraic types, that is, case distinctions on what the actual structure of a set of data is, and recursive calls basing on them. Data is mostly immutable, ephemeral data is, if available at all, encapsulated in special operators and definitions in all implemented languages I know, with Xanadu being a yet unimplemented but proposed exception. Haskell, Agda and ML are canonical examples.

Relational languages, as a form of data-oriented languages, focus on the relations between the data entities they work with. They usually have some sort of notion for records, unions and references, and their program states can often be visualized by graphs. Algorithms often involve walking through the graph of data entities, to find paths to objects with a certain property. Prolog, Minlog and SQL are canonical examples.

Object oriented languages are a special form of relational languages, which focus on equivalence relations between objects. Such equivalence relations are often generated through information hiding. Message-passing and proxifying are techniques to add membership of an object to a class. In most languages, like Common Lisp and Java, there is a cycle-free directed graph of classes of objects which by default inherit, but may overload properties of objects of their classes' parents. To prevent problems like the diamond dispatch problem, it is desirable to keep this graph a tree. In many languages like C++ and Python, these classes are connected to their own namespaces. Object oriented languages are probably the most widely used type. An example for a language which does not have a class graph is the prototype-based language JavaScript. Object systems can be put on almost every other paradigm mentioned here.

Code-oriented languages focus on what is actually done rather than the data involved. They often use techniques of other paradigms, but not as a major tool of programming. They usually do not have strong abstraction barriers over the machine they are run on, and focus on efficiency. They tend to make heavy use of local ephemeral data in loops, with Clojure being a modern exception which enforces encapsulating them using tail recursion, and other Lisp dialects mostly providing possibilities to do so. Code-oriented languages usually require familiarity with the concept of a pointer or reference, even though some of them have garbage collection. All Lisp dialects I know are code-oriented, Lisps are the canonical examples of languages being both code-oriented and functional.

Solid code-oriented languages do not provide a natural way of generating efficient code at runtime. Code and data are treated as separate entities. Therefore, the code can be compiled, but no compiler has to be embedded into the binary, and no dependency added to one. C is a canonical example.

Liquid code-oriented languages allow generation of efficient code at runtime from either source code or abstract syntax trees. Therefore, they must either be interpreted themselves, or the compiled binary must contain some form of compiler. Pico Lisp is a canonical example which uses syntax trees. Nearly every scripting language is an example since it has some form of eval-function. SBCL is an example of a compiler.

So, these were my current thoughts. I might change my opinion, and I am interested in other opinions on that topic.

Show comments (Requires JavaScript, loads external content and cookies from Disqus.com)

Thoughts on Concepts for Programming ParadigmsFri, 13 Apr 2012 11:38:57 GMT

Thoughts on Concepts for Programming Paradigms
Fri, 13 Apr 2012 11:38:57 GMT