[This is getting pretty far off-topic, so maybe after this we should
go private...]
<luvisi@andru.sonoma.edu> wrote:
+---------------
| rpw3@rigden.engr.sgi.com (Rob Warnock) writes:
| > And on a few architectures & C compilers [I mention both because both
| > matter], the C "setjmp()/longjmp()" can be used with no additional
| > assembly-language code to build coroutines [Unix kernel hackers should
| > be mumbling "resume()" right about now].
|
| could I trouble you to outline how one would go about doing this?
| in particular, how does one create a second stack which would
| automatically grow like the main one does?
+---------------
Those are two separate questions, the first easy and the second much harder.
1. Using C's setjmp/longjmp to build coroutines:
void resume(proc_p p)
{
extern proc_p self; /* the currently running thread */
jmp_buf regs;
if (setjmp(regs))
return; /* we've just been resume'd again */
self->save_area = (long *)®s[0]; /* save pointer to our state */
self = p; /* switch identities */
longjmp(*(jmp_buf*)self->save_area, 1); /* wake our new self */
/*NOTREACHED*/
}
[Note that resume() does *not* save the old "self" anywhere, so the caller
of resume() must do so, usually on a run queue or wait queue or something.]
Obviously, this only works the first time if somebody has created the
new coroutine (a.k.a. thread or process) object's stack and set it up so
that it looks like a setjmp() was previously done. A routine to do that:
proc_p
new_process(int stack_size, char name[], void (*f)(), int arg2, ...)
is a bit more complex than "resume()", so I won't show the whole thing
here, but basically it:
a. Malloc's some space for a new stack;
b. Copies its args into the new stack (for use when calling "f" the
first time, passing "arg2" and some number of further ones).
c. Does a setjmp() into the new stack, then diddles with the saved
value of the stack pointer so that a longjmp() will use the new
stack (with the args "in the right place").
d. Returns the constructed stack object (== thread == coroutine).
e. When resume()'d the very first time, calls "f(arg2, ...)".
On top of new_process() and resume(), one can build [and I have done]
whatever wait & run queues you like. For Simula-like (or these days,
we might say "Verilog-like") simulations, one of the most useful things
is a priority queue for which the key is "virtual time to start" (much
like the old Unix kernel "timeout" facility), and a "busy(n)" which
blocks the current thread until the virtual time in now+n ticks. With
that, plus simple wait-for-event queues, you can build a complete
discrete event simulator.
2. Creating multiple stacks that grow automatically: This ranges from
"trivial" (when using call/cc in Scheme) to "impossible" (running on
systems without multiple virtual memory segments). The above-mentioned
coroutine library set a *fixed* stack size at coroutine-creation time
[see the first arg to "new_process()"], and it was useful for a wide
variety of applications. But it was possible (too easy, even) to overflow
one of the stacks without warning, unless you added some safety code.
The main ways I know of to get multiple independently-growing stacks are:
a. Change the coroutine call so that the currently running thread doesn't
run *on* the stack object, it runs on the "main" stack. This involves
copying the main stack out into the heap & back on every resume()
[and allocating a new larger heap object for a thread's stack whenever
the old one isn't large enough to save a stack being "swapped out"].
This can be cheap or ex*PEN*sive, depending on average stack depth.
[But note that many Scheme implementations that use the C stack as
the Scheme execution stack do precisely this to implement call/cc.]
b. Change the way your compiler(s) work to use "segmented" (or "cactus")
stacks. Not an option for most people.
c. Write in a language that uses a heap-based stack (e.g. some Schemes
or Lisps, and maybe at least one ML?), so that cactus stacks are "free".
d. Play some tricky games (if your O/S lets you) with "mmap()", to allocate
larger-than-you'll-ever-need stacks sparsely in your virtual address
space. Some operating systems implement "allocate on first reference"
pages which are useful for this.
Like I said, it's harder & more environment-dependent than simple
coroutines with fixed-size stacks. (Sorry 'bout that...)
-Rob
-----
Rob Warnock, 8L-855 rpw3@sgi.com
Applied Networking http://reality.sgi.com/rpw3/
Silicon Graphics, Inc. Phone: 650-933-1673
2011 N. Shoreline Blvd. FAX: 650-964-0811
Mountain View, CA 94043 PP-ASEL-IA