Subject: Re: Lisp for market analysis?
From: rpw3@rpw3.org (Rob Warnock)
Date: Fri, 19 Jun 2009 20:40:39 -0500
Newsgroups: comp.lang.lisp
Message-ID: <rsadneOmbYuKo6HXnZ2dnUVZ_rKdnZ2d@speakeasy.net>
ianeslick  <ianeslick@gmail.com> wrote:
+---------------
| Bata <batabo...@yahoo.ca> wrote:
| > The raw data that I would have to work with is a set of open, high,
| > low, close, time, and volume numbers. I figure that packaging each of
| > these from a CSV file into a class or structure object would be the best
| > way to begin, ...
...
| I've got code for parsing CSV into various formats somewhere, but it's
| a straightforward task so probably a good one to learn a bit more
| about the language, file interaction...
+---------------

I agree: writing a PARSE-LINE-AS-CSV function is an excellent
learning tool! One can start with the simple case, then add
the more complex special cases incrementally [things like double
quotes, commas within double quotes, escape characters other
than double quotes (e.g., some applications permit "\"), escaped
commas & double quotes, fields with a mixture of all of these, etc.].
<http://en.wikipedia.org/wiki/Comma-separated_values> and
<http://tools.ietf.org/rfc/rfc4180.txt> mention some of the
subtleties [except the additional escape characters].

+---------------
| ...and if you like the excellent cl-ppcre library (regular expressions)
| or split-sequence mini-library.
+---------------

Unfortunately, those "more complex special cases" I mentioned
above aren't easy to handle with either CL-PPCRE or SPLIT-SEQUENCE.
I would suggest using a simple TAGBODY-based state machine
[or LOOP (ECASE state), if you don't like TAGBODY].
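
Something along these lines might serve as a starting point. It is only
a sketch, not a complete CSV parser: it assumes one record per line and
handles just commas, double-quoted fields, and doubled double quotes
inside quoted fields [per RFC 4180]. It does not handle backslash
escapes or quoted fields that span lines. The state names (UNQUOTED,
QUOTED, QUOTE-SEEN, DONE) and the helpers NEXT-CHAR and END-FIELD are
made up for this illustration.

```lisp
(defun parse-line-as-csv (line)
  "Split LINE, a string holding one CSV record, into a list of field strings."
  (let ((fields '())
        (field (make-string-output-stream)) ; accumulates the current field
        (pos 0)
        (len (length line))
        (ch nil))
    (flet ((next-char ()
             ;; Set CH to the next character of LINE, or NIL at end of line.
             (setf ch (if (< pos len)
                          (prog1 (char line pos) (incf pos))
                          nil)))
           (end-field ()
             ;; Save the accumulated field; this also resets the stream.
             (push (get-output-stream-string field) fields)))
      (tagbody
       unquoted                         ; outside any double quotes
         (next-char)
         (cond ((null ch) (go done))
               ((char= ch #\,) (end-field) (go unquoted))
               ((char= ch #\") (go quoted))
               (t (write-char ch field) (go unquoted)))
       quoted                           ; inside double quotes
         (next-char)
         (cond ((null ch) (go done))    ; unterminated quote: keep what we have
               ((char= ch #\") (go quote-seen))
               (t (write-char ch field) (go quoted)))
       quote-seen                       ; just saw a quote while inside quotes
         (next-char)
         (cond ((null ch) (go done))
               ((char= ch #\")          ; doubled quote means a literal quote
                (write-char ch field) (go quoted))
               ((char= ch #\,) (end-field) (go unquoted))
               (t (write-char ch field) (go unquoted)))
       done
         (end-field)))
    (nreverse fields)))
```

For example, (PARSE-LINE-AS-CSV "1,\"x, y\",3") returns ("1" "x, y" "3").
Each additional special case then becomes a new state or a new COND
clause, which is what makes the incremental approach workable.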

+---------------
| Usually I start a quick interactive development of file-processing by
| importing files into lists to make sure my parser works since it's so
| easy to introspect on the resulting data.  So, before performance
| tuning, you would simply read a CSV line into a string, use cl-ppcre
| or split-sequence to turn that into a list of symbols (using intern)
| and numbers (using parse-integer), and then use that list to create
| objects.
+---------------

I would differ only slightly, suggesting first a READ-FILE-LINES function
that slurps a whole file into a list of lines, and then writing/debugging
your PARSE-LINE-AS-CSV function with individual test-case lines, then
doing (MAPCAR #'PARSE-LINE-AS-CSV (READ-FILE-LINES "file")). That will
yield a list of lists of strings, which you can further parse as numbers
or names or whatever.
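
A minimal sketch of what READ-FILE-LINES could look like, assuming the
file is small enough to hold in memory and that the implementation's
default external format suits the data:

```lisp
(defun read-file-lines (pathname)
  "Return the contents of the file PATHNAME as a list of strings, one per line."
  (with-open-file (in pathname :direction :input)
    ;; READ-LINE returns NIL at end of file here instead of signalling
    ;; an error, which ends the loop.
    (loop for line = (read-line in nil nil)
          while line
          collect line)))
```

Having the lines as a plain list makes it easy to pull one out at the
REPL, say (THIRD (READ-FILE-LINES "file")), and feed it to the parser
by hand while debugging.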


-Rob

-----
Rob Warnock			<rpw3@rpw3.org>
627 26th Avenue			<URL:http://rpw3.org/>
San Mateo, CA 94403		(650)572-2607