Subject: Re: lisp as server process for shell scripts
From: rpw3@rpw3.org (Rob Warnock)
Date: Sun, 22 Oct 2006 00:12:56 -0500
Newsgroups: comp.lang.lisp
Message-ID: <SuOdnWP4x7JFYKfYnZ2dnUVZ_qednZ2d@speakeasy.net>
mark.hoemmen@gmail.com <mark.hoemmen@gmail.com> wrote:
+---------------
| I've been poking around the web to look for a howto on
| setting up <your favorite lisp implementation here> as
| a server process, so that I can pipe it shell scripts for
| execution, rather than firing up a new lisp process for
| every shell script that I want to run.  (I know Clisp has
| a pretty short startup time, but it would be nice to know
| how to do this in general.)
+---------------

Actually, my measurements some time ago indicated that
CMUCL's startup is slightly faster than CLISP's [despite
CMUCL's much larger memory footprint], not to mention
that CMUCL has a compiler to machine code, so for the
last several years I've been using CMUCL for my
Lisp-based "shell scripting":

    $ cat ./test3c.lisp
    #!/usr/local/bin/clisp -q
    (format t "hello world!~%")
    $ time-hist ./test3c.lisp
    Timing 100 runs of: ./test3c.lisp
       4 0.019
      26 0.020
      70 0.021
    $ 

versus:

    $ cat ./test3a.lisp
    #!/usr/local/bin/cmucl -script
    (format t "hello world!~%")
    $ time-hist ./test3a.lisp
    Timing 100 runs of: ./test3a.lisp
      66 0.016
      34 0.017
    $ 

[The "-script" switch for CMUCL is a local hack to "site-init.lisp"
that I've been intending to publish for several years. My bad.]
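
The "time-hist" command used above is not shown in this post. As a
rough, hypothetical stand-in (written in Python; the function names
and the choice to bucket wall-clock time to the nearest millisecond
are assumptions, not the original tool), a harness along these lines
would produce a similar count-per-elapsed-time listing:

```python
#!/usr/bin/env python3
"""Hypothetical stand-in for "time-hist": time N runs, print a histogram."""
import subprocess
import sys
import time
from collections import Counter


def time_runs(argv, runs=100):
    """Return the wall-clock seconds taken by each of `runs` executions."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        # Discard the command's output so only its run time is measured.
        subprocess.run(argv, stdout=subprocess.DEVNULL, check=False)
        samples.append(time.perf_counter() - start)
    return samples


def histogram(samples):
    """Bucket samples to the nearest millisecond: sorted (seconds, count)."""
    counts = Counter(round(s * 1000) for s in samples)
    return [(ms / 1000.0, n) for ms, n in sorted(counts.items())]


def main(argv, runs=100):
    print("Timing %d runs of: %s" % (runs, " ".join(argv)))
    for seconds, count in histogram(time_runs(argv, runs)):
        print("%4d %.3f" % (count, seconds))


# Only run when a command to time was given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    main(sys.argv[1:])
```

The absolute numbers from such a harness include the cost of the
fork/exec done by Python itself, so they are better suited to
comparing two scripts than to quoting exact startup times.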

Yes, if your script begins by loading up a bunch of libraries
[the way *so* many Perl scripts do!!], then things slow down
a little, but it's still plenty fast enough for my day-to-day
"scripting" needs:

    $ cat test5.lisp
    #!/usr/local/bin/cmucl -script
    ;;; Do all of the same REQUIRE/USE that are in the "cgi.core" image.
    (require :utils)
    (require :htout)
    (require :uri)
    (require :pg)
    (use-package :utils)
    (use-package :htout)
    (use-package :uri)
    (use-package :pg)
    (format t "hello world!~%")
    $ time ./test5.lisp
    hello world!
    0.099u 0.038s 0:00.15 80.0%     132+2456k 0+0io 0pf+0w
    $ time-hist ./test5.lisp
    Timing 100 runs of: ./test5.lisp
       1 0.128
       1 0.131
      13 0.135
      55 0.136
       4 0.137
       7 0.138
      16 0.139
       3 0.140
    11.358u 3.180s 0:14.72 98.7%    144+2104k 0+0io 0pf+0w
    $ 

[Aside: Leaving out the (REQUIRE :PG) saves ~35ms of that.]

This is fast enough that I have several dozen such scripts
in my personal "bin/" directory, along with other languages,
of course [output lightly edited to remove irrelevancies]:

    $ file ~/bin/* | sed -e 's/^[^:]*: *//' | \
      egrep 'shell|cmu|clisp|perl|mz|awk' | \
      sort | uniq -c
    135 Bourne shell script text executable
     40 a /usr/local/bin/cmucl -script script text executable
     22 a /usr/local/bin/mzscheme -r script text executable
     11 a /usr/bin/perl -w script text executable
      5 a /usr/local/bin/clisp script text executable
      2 new awk script text executable
      2 C shell script text executable
    $ 

Of course, you can always save a heap image that has all the
stuff you commonly use in "scripting" and make that the
default heap for the implementation you use; then the
startup time will be about the same as with the stock
distribution image [though I have not bothered to do that
myself, other than a few early experiments].

Such 20-30ms startup times make Common Lisp quite usable even
for low-traffic CGI scripting [as implied by the "cgi.core"
mentioned above], though as the apps get more complicated it's
probably better to switch to a "mod_lisp"-style CL application
server.

Anyway, enough proselytizing about simple scripting...  ;-}

+---------------
| I've seen some server-client setups, but I want to make
| sure that I'm not opening up some huge security hole for
| arbitrary folks to send me their arbitrary commands to execute.
+---------------

I don't know about CLISP [so you'll need to translate the following
into CLISP equivalents], but with CMUCL the easiest way to be safe
is to make your server listen to a local-domain socket [a.k.a.
Posix-domain or Unix-domain], domain "AF_LOCAL" in a Berkeley-style
"socket()" call, and put the socket file underneath a directory to
which only you [or processes running as you] have access. Yes, I
know this shouldn't be necessary, since as it says in "unix(4)"
[on FreeBSD, or "unix(7)" on Linux]:

    Normal filesystem access-control mechanisms are also
    applied when referencing pathnames; e.g., the destination
    of a connect(2) or sendto(2) must be writable.

So simply setting the file permissions to 0600 should suffice.
But I've been told that some operating systems ignore filesystem
permissions for local-domain sockets; in that case, controlling
access to the enclosing directory can be used for protection.

[Tip: CMUCL's EXT:CREATE-UNIX-LISTENER doesn't provide a :REUSE-ADDR
option to unlink an existing socket of the same name, so be sure
to do that yourself before creating the listener.]

[Tip#2: EXT:CREATE-UNIX-LISTENER also doesn't provide a way
to set the file permissions on the newly-created socket, so be
sure to do that after the EXT:CREATE-UNIX-LISTENER call and
before the first EXT:ACCEPT-UNIX-CONNECTION call.]
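
The same sequence (private directory, unlink any stale socket, bind,
tighten permissions, only then accept) can be sketched outside of
CMUCL. The following is a minimal illustration in Python, not the
EXT: API; the function name, the "repl.sock" default, and the backlog
of 5 are assumptions made for the example:

```python
import os
import socket
import stat
import tempfile


def make_private_listener(sock_dir, name="repl.sock"):
    """Create a listening local-domain socket only the owner can reach."""
    os.makedirs(sock_dir, mode=0o700, exist_ok=True)
    os.chmod(sock_dir, 0o700)       # enforce, even if the directory existed
    path = os.path.join(sock_dir, name)
    try:
        os.unlink(path)             # remove a stale socket from an earlier run
    except FileNotFoundError:
        pass
    listener = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    listener.bind(path)             # this is what creates the socket file
    os.chmod(path, 0o600)           # tighten permissions before any accept()
    listener.listen(5)
    return listener, path


if __name__ == "__main__":
    # Demonstration only: build the listener under a throwaway directory.
    with tempfile.TemporaryDirectory() as tmp:
        listener, path = make_private_listener(os.path.join(tmp, "run"))
        print(path, oct(stat.S_IMODE(os.stat(path).st_mode)))
        listener.close()
```

There is a short window between bind() and chmod() during which the
socket file has whatever mode the umask allows; the mode-0700
enclosing directory is what covers that window, as well as any
system that ignores the socket's own permissions.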

+---------------
| I'd also like the connection to be encrypted if possible.
+---------------

That's completely unnecessary if you use a properly-protected
local-domain socket. Nor do you need any sort of password or
authentication token in this case.

However... If you insist on using an AF_INET socket, then
quite the reverse is true!! Things exposed to the Internet
need to be *very* "hardened"! Even local AF_INET sockets
(address 127.0.0.1 in IPv4) need to be authenticated [though
not encrypted], since any local user may connect to them.
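
One possible way to authenticate a loopback connection is a shared
secret kept in a file that only the owner can read, which the client
sends as its first line. This is a hedged sketch in Python of that
one approach, not a recommendation from the original post; the
function names and the token format are assumptions:

```python
import hmac
import os
import secrets


def write_token(path):
    """Generate a random token and store it in a file readable only by us."""
    token = secrets.token_hex(32)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as f:
        f.write(token + "\n")
    os.chmod(path, 0o600)           # in case the file already existed
    return token


def check_token(expected, first_line):
    """Compare the client's first line (bytes) to the token in constant time."""
    return hmac.compare_digest(expected.encode("ascii"), first_line.strip())
```

The server would call write_token() at startup and refuse any
connection whose first line fails check_token(). Note that this
leans on the same filesystem permissions as the local-domain socket
does, which is part of why the local-domain socket is the simpler
choice.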

+---------------
| ...in some cases I might set the server up on a
| remote machine (which may not have a firewall).
+---------------

Another user mentioned SSH, which recommendation I'll second.
Rather than trying to write your own secure socket protocol,
just use SSH. But since OpenSSH's "-L port:host:hostport"
doesn't support local-domain remote sockets, what you want
to do is use SSH to run a small trampoline program on the
remote system that opens the server's local-domain socket
and then passes data from its standard input to the socket
and passes output from the socket to its standard output.
You might be able to use standard "telnet" for that, but its
attempts to negotiate Telnet Options with the server might
mess up your scripts, so you might be better off using something
like "attachtty" <http://www.cliki.net/detachtty>, e.g.:

    $ attachtty /usr/local/lisp/local/appsrv/run/repl.sock
    ;;; Oct 21 20:56:39 attachtty: connecting directly to /usr/local/lisp/local/appsrv/run/repl.sock
    app_srv> (+ 1 2)

    3
    app_srv> (expt 2 100)

    1267650600228229401496703205376
    app_srv> ;;; Oct 21 20:57:08 attachtty: closed connection due to zero-length read
    $ 

But as you see, even "attachtty" prints messages that might
mess up your scripting, so you might want to just write a
small trampoline of your own.
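
Such a trampoline is only a few lines in most languages. As a
minimal sketch in Python (the function names are assumptions, and
this is not how "attachtty" is implemented), it connects to the
socket, copies standard input to it, copies whatever comes back to
standard output, and prints nothing of its own:

```python
import os
import socket
import sys
import threading


def relay(sock, in_fd=0, out_fd=1, bufsize=4096):
    """Copy in_fd -> sock and sock -> out_fd until the server closes."""

    def pump_input():
        try:
            while True:
                data = os.read(in_fd, bufsize)
                if not data:                    # EOF on our input
                    break
                sock.sendall(data)
            sock.shutdown(socket.SHUT_WR)       # tell the server input ended
        except OSError:
            pass                                # server went away first

    threading.Thread(target=pump_input, daemon=True).start()
    while True:
        data = sock.recv(bufsize)
        if not data:                            # server closed its end
            break
        while data:                             # handle partial writes
            written = os.write(out_fd, data)
            data = data[written:]


def main(path):
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(path)
        relay(sock)


# Only run when a socket path was given on the command line.
if __name__ == "__main__" and len(sys.argv) == 2:
    main(sys.argv[1])
```

Saved on the remote host under some name of your choosing, it would
be run through SSH with the socket path as its only argument, with
the local script's input and output attached to the SSH session.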

Or on second thought, it's probably easier to just add a
"-q" (quiet) option to the existing "attachtty", since
"attachtty" already very conveniently provides for using
SSH to connect to remote systems, e.g.:

    $ attachtty user@hostname:/path/to/socket


-Rob

p.s. If you're using a Lisp application server with a "mod_lisp"-like
protocol on a local-domain socket [as I am], then "attachtty" is
sometimes helpful for debugging that, too:

    $ attachtty rpw3@rpw3.org:/tmp/.cgi_sock
    ;;; Oct 21 21:09:05 attachtty: connecting through ssh to /tmp/.cgi_sock on rpw3@rpw3.org

    ;;; Oct 21 21:09:05 attachtty: Successfully started
    ;;; Oct 21 21:09:06 attachtty: connecting directly to /tmp/.cgi_sock

    REQUEST_METHOD                     <=== [I typed these lines...]
    GET
    SERVER_NAME
    127.0.0.1
    SERVER_PORT
    80
    DOCUMENT_ROOT
    /usr/local/apache/htdocs
    PATH_INFO
    /hacks/lisp/minimal.lhp
    end                                <=== [...down through here.]
    Content-Type: text/html

    <HTML><HEAD><TITLE>Simple Test Page</TITLE></HEAD>
    <BODY><H1>Simple Test Page</H1>
    This is a simple test page with not much on it.
    <P>[Look <A HREF='http://rpw3.org/hacks/lisp/minimal.lisp'>here</A> for the source.]
    </BODY></HTML>
    ;;; Oct 21 21:10:47 attachtty: closed sock connection due to zero-length read
    Got signal 20, closing down
    ;;; Oct 21 21:10:47 attachtty: ssh exited, so closed connection
    $ 

-----
Rob Warnock			<rpw3@rpw3.org>
627 26th Avenue			<URL:http://rpw3.org/>
San Mateo, CA 94403		(650)572-2607