A few very nice examples on how GCC optimizes code can be found at
ridiculousfish.com.
An example of generalized tail recursion was given, the transformal of
int factorial(int x) {
if (x > 1) return x * factorial(x-1);
else return 1;
}
into
int factorial(int x) {
int result = 1;
while (x > 1) result *= x--;
return result;
}
This is a well-known optimization and Iwas wondering whether SBCL also does this optimization. The second declaration becomes
(defun fact2 (x)
(declare (type fixnum x))
(let ((res 1))
(loop while (> x 1)
do (setf res (* res (decf x))))))
the firts one becomes
(defun fact1 (x)
(declare (type fixnum x))
(if (> x 1) (* x (fact1 (1- x))) 1))
Of course I have to declare the type here, otherwise it will put code to distinguish between several numeric types. And I did a
(declaim (optimize (speed 3)
(safety 0)
(space 0)
(debug 0)
(compilation-speed 0)))
just to be fair. The second one disassembles into
CL-USER> (disassemble #'fact2)
; disassembly for FACT2
; 248173A4: BA04000000 MOV EDX, 4 ; no-arg-parsing entry point
; A9: EB18 JMP L2
; AB: L0: 8BC3 MOV EAX, EBX
; AD: 83E804 SUB EAX, 4
; B0: 8BD8 MOV EBX, EAX
; B2: 895DFC MOV [EBP-4], EBX
; B5: 8BF8 MOV EDI, EAX
; B7: E8F18E7EFD CALL #x220002AD ; GENERIC-*
; BC: 7302 JNB L1
; BE: 8BE3 MOV ESP, EBX
; C0: L1: 8B5DFC MOV EBX, [EBP-4]
; C3: L2: 83FB04 CMP EBX, 4
; C6: 7FE3 JNLE L0
; C8: BA0B001022 MOV EDX, 571473931
; CD: 8BE5 MOV ESP, EBP
; CF: F8 CLC
; D0: 5D POP EBP
; D1: C3 RET
NIL
Somehow I cannot make the
GENERIC-* disappear, not even with a
the or
coerce declaration around the numbers. But the disassembly of the first one is
CL-USER> (disassemble #'fact1)
; disassembly for FACT1
; 24783F0D: L0: L1:83F904 CMP ECX, 4 ; no-arg-parsing entry point
; 10: 7F09 JNLE L3
; 12: B804000000 MOV EAX, 4
; 17: L2: 8BE5 MOV ESP, EBP
; 19: 5D POP EBP
; 1A: C3 RET
; 1B: L3: 894DFC MOV [EBP-4], ECX
; 1E: 8BC1 MOV EAX, ECX
; 20: 8BD0 MOV EDX, EAX
; 22: 83EA04 SUB EDX, 4
; 25: 8BDD MOV EBX, EBP
; 27: 8D4424F8 LEA EAX, [ESP-8]
; 2B: 83EC20 SUB ESP, 32
; 2E: 8BCA MOV ECX, EDX
; 30: 8918 MOV [EAX], EBX
; 32: 8BE8 MOV EBP, EAX
; 34: E819000000 CALL L5
; 39: 8B4DFC MOV ECX, [EBP-4]
; 3C: 8BD1 MOV EDX, ECX
; 3E: 8BF8 MOV EDI, EAX
; 40: E868C387FD CALL #x220002AD ; GENERIC-*
; 45: 7302 JNB L4
; 47: 8BE3 MOV ESP, EBX
; 49: L4: 8BC2 MOV EAX, EDX
; 4B: EBCA JMP L2
; 4D: 8F4504 POP DWORD PTR [EBP+4]
; 50: EBBB JMP L1
; 52: L5: 8F4504 POP DWORD PTR [EBP+4]
; 55: EBB6 JMP L1
NIL
From 10 to 1A, this seems to check whether the first argument is greater than something (1?), and if so, it jumps to L3, if not, it returns. In L3, it does a lot of things, especially it calls L5, which then jumps to L1, the first line, again. That is, it seems like it really does recursion here.
This is not very nice. Why is it that way?