Januar 2011 Archive

fork vs. posix_spawn

| Keine Kommentare
It seems like in unix-like systems, the default way of executing another process is using fork(2) and execve(2), and if you want to redirect the input and output, using dup2(2). As far as I read, fork(2) creates a virtual copy of the process image, that is, it mapps the pages of the original process in a copy-on-write mode.

Now, the man page of malloc(3) sais

By default, Linux follows an optimistic memory allocation strategy.  This means that when malloc() returns non-NULL there is no guarantee that the memory really is available.  This is a really bad bug.  In case it turns out that the system is out of memory,  one  or  more  processes  will be killed by the infamous OOM killer.  In case Linux is employed under circumstances where it would be less desirable to suddenly lose some randomly picked processes, and moreover the kernel version is sufficiently recent, one can switch off this overcommitting behavior using a command like:

# echo 2 > /proc/sys/vm/overcommit_memory


This behavior seems to also affect fork(2): When forking a huge process, even though there is no need to duplicate any pages, it will not be allowed, because the kernel must not overcommit. However, I do not really like the idea of having an OOM killer that randomly kills processes if there happens to be less space then necessary. So if one writes software, normally one should have only a small process that forks. On the other hand, sometimes one just wants to execute a little external executable, which is why one probably does not want two processes per default, and fork(2), execve(2) and dup2(2) are a bad choice since they need to have about twice as much available memory as the forking process.

I have heard of this problem the first time when somebody tried to use Runtime.exec in Java, and it failed, because he turned off overcommitting, and the jvm apparently used fork(2) internally - hence, the created process is as large as the whole jvm. Googling around a bit, I found that this problem is known and will probably be changed in further jvm versions.

I wondered about the alternatives, and there seems to be an alternative, called posix_spawn(p), which executes an external command without forking.

The problem is that posix_spawn(p) is not a linux-syscall, and in theory, it could be implemented via fork(2), since POSIX specifies interfaces rather than implementations. So I tried a little test. By

# echo 0 > /proc/sys/vm/overcommit_memory

I can ensure that I have the default behaviour of Linux. The following code works:

#include <malloc.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main (void) {
  volatile void *bla = malloc (1024*1024*1024);
  int f = fork ();
  printf("Pid: %d\nFork: %d\n---\n", getpid(), f);
  scanf("scanf");
}


and has an output like:

Pid: 28875
Fork: 28876
---
Pid: 28876
Fork: 0
---


Now when I turn off overcommiting by

# echo 2 > /proc/sys/vm/overcommit_memory

and run the same again, it fails:

Pid: 29014
Fork: -1
---

fork(2) returns -1, which means that it was not successful, as expected. Of course, therefore, I cannot execute an external process, because one of both processes would have to run execve(2). Now, I have written code that creates a new process using posix_spawn(p). It runs sleep(1), and waits for it. If I tried to do this using fork(2), execve(2) and dup2(2), I would run into exactly that problem. However, the following code works without overcommiting:

#include <malloc.h>
#include <unistd.h>
#include <spawn.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <string.h>

int main (void) {
  volatile void *bla = malloc (1024*1024*1024);
  int status;
  int pid;

  char* spawnedArgs[] = { "/bin/sleep", "2d", NULL };
 
  int f = posix_spawnp(&pid, spawnedArgs[0], NULL, NULL, spawnedArgs, NULL);

  printf("Pid: %d\nposix_spawn: %d\npid: %d\n---\n", getpid(), f, pid);
  scanf("scanf");

  wait(&status);
}

and produces an output like

Pid: 29075
posix_spawn: 0
pid: 29076
---


Nice. However, when it comes to controlling the input and output of the created process, dup2(2) is not enough. One must use posix_spawn_file_actions_t for that. The following code was created by me and Matthias Benkard, and does exactly this. The data structure saves actions that need to be done on the file handles. I dont want to give a deeper introduction into the functions used, they have good manpages, but I think a working example is a good starting point for understanding them anyway.

#include <spawn.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>

int main(int argc, char **argv) {
  int out[2];
  int in[2];
  int pid;
  posix_spawn_file_actions_t action;
  char* spawnedArgs[] = { "/bin/cat", NULL };
  int status;
  char r[1024];
  bzero(r, 1024);

  pipe(out);
  pipe(in);

  posix_spawn_file_actions_init(&action);
  posix_spawn_file_actions_adddup2(&action, out[0], 0);
  posix_spawn_file_actions_addclose(&action, out[1]);

  posix_spawn_file_actions_adddup2(&action, in[1], 1);
  posix_spawn_file_actions_addclose(&action, in[0]);
 
  posix_spawnp(&pid, spawnedArgs[0], &action, NULL, spawnedArgs, NULL);

  close(out[0]);
  close(in[1]);

  write(out[1], "Hallo!", strlen("Hallo!"));
  close(out[1]);


  read(in[0], r, 1024);
  printf("Read data: \"%s\"\n", r);

  wait(&status);

  return EXIT_SUCCESS;
}


The function calls "/bin/cat", which reads from the stdin and writes to the stdout. We are taking these streams and send "Hallo!" to the input, and read the output. And in fact, its output is

Read data: "Hallo!"

Of course, if I had more data to send, this program could run into a deadlock, since the write-call would block if the buffers are full. This can be solved in the usual ways, by asynchronous I/O or by multithreading. But for this example, it is better to use this simpler method.

Blogging Pause

| Keine Kommentare
Once again, I have much too much work to do, thus, for the next month, I will probably not write new posts, at least not many. I hope I will still have some readers afterwards.

Super Mario Land 2 ...

| Keine Kommentare

A nice property of the rational plane

| Keine Kommentare
Assuming the continuum hypothesis, there is a black-withe-coloring of the real plane <m>\mathbb{R}</m> into two colors, such that on every line parallel to the x-axis, there are only countably many white points, andon every line parallel to the y-axis, there are only countably many black points. I have found this, including a proof, on Mathoverflow.

Well, something similar holds for the rational plane <m>\mathbb{Q}^2</m>. In that case, there is a coloring such that on every line parallel to the x-axis we only have finitely many white points, and on every line parallel to the y-axis, we only have finitely many black points.

The proof is simple: We know there exists a bijection <m>b:\mathbb{Q}\rightarrow\mathbb{N}</m>. Now color every point <m>(p,q)\in\mathbb{Q}^2</m> white if <m>b(p)<b(q)</m>, and black otherwise. For every q there can only be finitely many p such that <m>b(p)<b(q)</m>, and for every p there can only be finitely many q such that <m>b(q)\le b(p)</m>.

The interesting part about that is that this is a constructive result - it can actually be calculated - but still, it is highly counter-intuitive that such a coloring exists, and its rather difficult to find it when you do not know it already.

The awesomeness of Common Lisp gives a great opportunity to actually code it, using its included facilities of fractions:

(defun cantor-pair (n m)
  (+ (* (+ n m) (+ n m 1) (/ 2)) m))

(defun z-to-n-0 (z)
  (+ (* 2 (abs z)) (signum z)))

(defun rat-pair (q)
  (cantor-pair (z-to-n-0 (numerator q)) (1+ (denominator q))))

(defun white-p (p q)
  (< (rat-pair p) (rat-pair q)))

It uses Cantor's pairing function. I thought it'd be interesting to actually draw these points. Well, actually, it is not really interesting, anyway, here is the code that uses lispbuilder-sdl:

(sdl:with-init ()
  (sdl:window 400 400)
  (let ((zoom 100))
    (flet ((draw-zoomed ()
         (sdl:clear-display (sdl:color :r 0 :g 0 :b 0))
         (dotimes (i 400)
           (dotimes (j 400)
         (when (white-p (/ i 400 zoom) (/ j 400 zoom))
           (sdl:draw-pixel-* i j :color (sdl:color :r 255 :g 255 :b 255)))))
         (sdl:update-display)))
      (draw-zoomed)
      (sdl:update-display)
      (sdl:with-events ()
    (:mouse-button-down-event
     (:button btn)
     (cond
       ((= btn sdl:mouse-wheel-up)
        (incf zoom)
        (draw-zoomed))
       ((= btn sdl:mouse-wheel-down)
        (decf zoom)
        (draw-zoomed))))
    (:quit-event () t)))))


with the scroll-wheel you can change the interval that is shown (but be careful, if you scroll down 100 times, you get a division by zero, since the zoom will be 0 - as it was not that interesting I did not put that much effort into making it work). For everybody who does not want to do this, here is an image of some screenshot (I do not actually remember the zoom-factor):



It looks sort of nice when you scroll through it. The stripes are probably some boundaries of denominators.

[PS: For some reason, at the time of publishing this post, MathJax does not render the set-symbols correctly. I will probably investigate on this later. It should be clear anyway.]

GUIs ... and stuff

| Keine Kommentare
In a mood of procrastination I had a few thoughts on how to create simple GUIs with Common Lisp. For the leveleditor in Uxul World I use LTK - it is installable via QuickLisp meanwhile, and it seems to be the best and most portable possibility so far. The main disadvantage is that it runs the TCL shell in a separate process. Then there is CLIM of course, and though there seem to be a few quite interesting libraries around CLIM, the only one I have ever used is McCLIM (with Climplayer). McClim seems to have several backends, and a nice API for creating new ones, but in the past I could not manage to run it under MS Windows which was why I focused on LTK - and then never concentrated on CLIM.
But actually, I do not really think that GUIs are that important. I think webinterfaces are meanwhile sufficient for most purposes, they are simple to code, and they are portable (and I am not the only one who thinks this apparently). A problem with local webinterfaces is that I do not know any webbrowser that supports connecting to, say, unix domain sockets. Therefore, there are two possibilities of securing the interface from other users on the same computer: Using IPTables and a local shared secret - and I do not like both possibilities. Maybe someday, some browsers will be able to connect to unix domain sockets, projects like uzbl seem to be predestined for that. What is already possible is to let them open FIFOs. So I was doing a little experiment with it, using SBCL's posix-binding: For every link I create a FIFO and a Thread that tries to write on it. Opening a FIFO blocks until both sides of the FIFO are opened by default, thus, as long as the created thread is blocked, you know that the FIFO was not opened by the browser, as soon as it is not blocked anymore, you know that it was opened and can generate the content. So the basic handling mechanism looks like

(require :sb-posix)

(defun make-handle (f)
  (let
      ((fname (sb-posix:mktemp "/tmp/clgui.XXXXXX"))
       (fkeep t))
    (sb-thread:make-thread
     (lambda ()
       (loop while fkeep do
        (sb-posix:mkfifo fname #o700)
        (with-open-file (str fname :direction :output :if-exists :overwrite)
          (setf fkeep (funcall f str))
          (finish-output str))
        (delete-file fname))))
    fname))


It takes a function that itself takes a stream (the one sent to the browser) as an argument and returns a generalized boolean - if it returns true, then the fifo keeps being listened to, if nil, its deleted and not listened anymore. Then I define a few variables and two handlers that increment and decrement a variable:

(defparameter *ctr* -1)
(defparameter *incf* "")
(defparameter *decf* "")

(defun do-inc (str)
  (incf *ctr*)
  (format str "
<html><body><a href=~s>&lt;&lt;</a>~d<a href=~s>&gt;&gt;</a></body></html>
"
      *decf* *ctr* *incf*) T)

(defun do-dec (str)
  (decf *ctr*)
  (format str "
<html><body><a href=~s>&lt;&lt;</a>~d<a href=~s>&gt;&gt;</a></body></html>
"
      *decf* *ctr* *incf*) T)


Finally, I create the handles and set the appropriate links:

(setf *incf* (format nil "file:///~a" (make-handle #'do-inc)))
(setf *decf* (format nil "file:///~a" (make-handle #'do-dec)))
(format t "Now open ~a in your browser." *incf*)


Opening the link in the browser should show you two links and a number that can be incremented and decremented by these links. Be careful: Reading from a FIFO when no process writes to it may freeze Firefox.

There are a lot of drawbacks in this approach: A lot of files are created. And the amount of data sent back to the application is very limited. Formdata cannot be passed - this could be solved using JavaScript and additional protocol handlers for Firefox - which is very complicated.

After all, it was just a little experiment. In theory, user interfaces that only contain links but no text fields and checkboxes could be implemented with it. In practice, I would not do it, but anyway it would be great to make browsers be able to connect to unix domain sockets I think.

Happy New Year

| Keine Kommentare
Don't forget to check out 2012b.