Joe Marshall <eval.apply@gmail.com> wrote:
+---------------
| Kent M Pitman <pit...@nhplace.com> wrote:
| > Consequently, customizing a dialect of Lisp to a particular concrete
| > machine, while it's not done for you, is not impossible to do; it's
| > probably easier than customizing C to a particular new concrete machine.
|
| The fallacy is `Proof by Lack of Imagination'. Just because one
| doesn't know how to write `close to the metal' code in Lisp doesn't
| mean that it is impossible or even impractical.
+---------------
(*sigh*) I find it odd that so far none of the senior Lispers in
this thread have pointed to *THEIR OWN LISPS' COMPILERS* as being
precisely such an example of "customizing a dialect of Lisp to a
particular concrete machine"!!! Come on, guys, Lisp has been doing
this since *DAY ONE*!!! To wit, here are a couple of small snippets
from "cmucl/src/compilercmucl-19c/src/compiler/x86". The INST forms
emit bare x86 instructions, while other forms in :GENERATOR sections
are other VOPs [virtual operations] that emit sequences of INSTs --
effectively, VOPs provide a "smart" assembler with macros. I think
you'll find this is about as "close to the metal" as you're going to
get in *any* language other than raw binary!!
;;;; Type frobbing VOPs
...
(define-vop (get-type)
(:translate get-type)
(:policy :fast-safe)
(:args (object :scs (descriptor-reg)))
(:temporary (:sc unsigned-reg :offset eax-offset :to (:result 0)) eax)
(:results (result :scs (unsigned-reg)))
(:result-types positive-fixnum)
(:generator 6
(inst mov eax object)
(inst and al-tn lowtag-mask)
(inst cmp al-tn other-pointer-type)
(inst jmp :e other-ptr)
(inst cmp al-tn function-pointer-type)
(inst jmp :e function-ptr)
;; pick off structures and list pointers
(inst test al-tn 1)
(inst jmp :ne done)
;; pick off fixnums
(inst and al-tn 3)
(inst jmp :e done)
;; must be an other immediate
(inst mov eax object)
(inst jmp done)
FUNCTION-PTR
(load-type al-tn object (- vm:function-pointer-type))
(inst jmp done)
OTHER-PTR
(load-type al-tn object (- vm:other-pointer-type))
DONE
(inst movzx result al-tn)))
...
(define-vop (make-fixnum)
(:args (ptr :scs (any-reg descriptor-reg) :target res))
(:results (res :scs (any-reg descriptor-reg)))
(:generator 1
;;
;; Some code (the hash table code) depends on this returning a
;; positive number so make sure it does.
(move res ptr)
(inst shl res 3)
(inst shr res 1)))
...
;;;; Other random VOPs.
...
(defknown lisp::%scrub-control-stack () (values))
;;; Scrub the control stack.
;;;
;;; On the x86 port the stack grows downwards, and to support grow on
;;; demand stacks the stack must be decreased as it is scrubbed.
;;;
(define-vop (%scrub-control-stack)
(:policy :fast-safe)
(:translate lisp::%scrub-control-stack)
(:args)
(:results)
(:temporary (:sc unsigned-reg) count)
(:temporary (:sc any-reg) stack-save zero)
(:generator 25
(inst mov stack-save esp-tn)
(inst mov zero 0)
(inst push zero)
;; Scrub the stack.
SCRUB
(inst add esp-tn 4)
(inst mov count 2048)
SCRUB-LOOP
(inst dec count)
(inst push zero)
(inst jmp :nz SCRUB-LOOP)
;; Look for a clear stack unit.
(inst mov count 2048)
LOOK-LOOP
(inst sub esp-tn 4)
(inst cmp (make-ea :dword :base esp-tn) zero)
(inst jmp :ne SCRUB)
(inst dec count)
(inst jmp :nz LOOK-LOOP)
;; Done, restore the stack pointer.
(inst mov esp-tn stack-save)))
...
;; the RDTSC instruction (present on Pentium processors and
;; successors) allows you to access the time-stamp counter, a 64-bit
;; model-specific register that counts executed cycles. The
;; instruction returns the low cycle count in EAX and high cycle count
;; in EDX.
;;
;; In order to obtain more significant results on out-of-order
;; processors (such as the Pentium II and later), we issue a
;; serializing CPUID instruction before reading the cycle counter.
;; This instruction is used for its side effect of emptying the
;; processor pipeline, to ensure that the RDTSC instruction is
;; executed once all pending instructions have been completed.
;;
;; Note that cache effects mean that the cycle count can vary for
;; different executions of the same code (it counts cycles, not
;; retired instructions). Furthermore, the results are per-processor
;; and not per-process, so are unreliable on multiprocessor machines
;; where processes can migrate between processors.
;;
;; This method of obtaining a cycle count has the advantage of being
;; very fast (around 20 cycles), and of not requiring a system call.
;; However, you need to know your processor's clock speed to translate
;; this into real execution time.
(defknown read-cycle-counter
()
(values (unsigned-byte 32) (unsigned-byte 32))
())
(define-vop (read-cycle-counter)
(:translate read-cycle-counter)
(:guard (backend-featurep :pentium))
(:args )
(:policy :fast-safe)
(:results (lo :scs (unsigned-reg))
(hi :scs (unsigned-reg)))
(:result-types unsigned-num unsigned-num)
(:temporary (:sc unsigned-reg :offset eax-offset :target lo) eax)
(:temporary (:sc unsigned-reg :offset edx-offset :target hi) edx)
(:generator 1
(inst cpuid)
(inst rdtsc)
(move hi edx)
(move lo eax)))
#+pentium
(defun read-cycle-counter ()
(read-cycle-counter))
And it's not just CMUCL, of course. *Any* other mature Lisp compiler
will provide countless more examples of such "customizing ... Lisp to
a particular concrete machine".
-Rob
-----
Rob Warnock <rpw3@rpw3.org>
627 26th Avenue <URL:http://rpw3.org/>
San Mateo, CA 94403 (650)572-2607