Scott <xscottg@gmail.com> wrote:
+---------------
| If we drop all the context and qualifiers, can you tell me what
| declarations, optimization settings, or other reasonable things to add
| so that Allegro (or any other implementation) will turn a silly little
| dot product into something that effectively uses the SSE "mulps" and
| "addps" instructions:
|
| (defun dot (a b)
|   (loop
|     for x across a
|     for y across b
|     sum (* x y)))
+---------------
One tiny thing that might help would be to replace "FOR Y" with "AND Y",
which immediately tells the compiler that X & Y are being stepped in
parallel, not sequentially. Yes, a "sufficiently-smart compiler" can
easily figure that out from the lack of data-flow dependency from
X to Y (or A to B), but I suspect that the very first SSE generators
might be more template-based [just to get something useful out the
door -- might even be just a compiler macro on LOOP!] and might be
looking for the AND... ;-}
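
For concreteness, here is a minimal sketch of that variant. The name
DOT-PARALLEL is made up for illustration, and whether any given
compiler treats it differently from the FOR/FOR form is pure
speculation on my part:

```lisp
;; Same dot product, but Y is joined to X with AND, so LOOP steps
;; the two variables in parallel rather than sequentially.
;; DOT-PARALLEL is a hypothetical name, not from the original post.
(defun dot-parallel (a b)
  (loop
    for x across a
    and y across b
    sum (* x y)))
```

The result is the same as with FOR/FOR here, since neither stepping
form refers to the other variable; only the stated stepping order
changes.
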
-Rob
-----
Rob Warnock <rpw3@rpw3.org>
627 26th Avenue <URL:http://rpw3.org/>
San Mateo, CA 94403 (650)572-2607