Re: fork() problems.

From: Christopher Fraser <chrisf_at_nospam.org>
Date: Thu Dec 01 1994 - 01:11:07 PST

I am by no means an i386 guru ... any or all of this could be
wrong! Hopefully I can help clear up some of the user/kernel state
confusion ...

But I thought Tim Newsham said:
>
>
> I'm having problems again. I know, I know, it's hard to believe.
> "Tim is actually having problems?" Well it's true :)

Haven't you finished it *yet*? Sheesh! :-)
 
> It seems I can fork()/wait()/exit() ok (most of the time) as done
> by the "fork" command in the testsh. The second time I issue the
> command, however, the system just dies. It dies so hard that the
> hardclock() stops running (I have it print dots to the screen so
> I know when something goes bad). The fork() command is behaving
> properly the first time. I know this for sure because I am able
> to fork off a child and exec a file (finally got that working after
> switching to the VSTa ld). My guess is that something is going
> bad in the exit routines. I did check the process table after the
> fork()/exit() and the table seemed to still be intact. In reviewing
> things in machproc.c I've run across some questions:
>
> in dup_stack():
> /*
> * New entity returns with 0 value; ESP is one lower so that
> * the resume() path has a place to write its return address.
> * This simulates the normal context switch mechanism of
> * setjmp/longjmp.
> */
> new->t_kregs->eip = (ulong)retuser;
> new->t_kregs->ebp = (ulong)(new->t_uregs);
> new->t_kregs->esp = (new->t_kregs->ebp) - sizeof(ulong);
> new->t_uregs->eax = 0;
>
> I'm unclear on why the stack pointer is decremented by a word.

(I'm not 100% certain, but I'm fairly sure this is right)

When setjmp() is called to save the kernel state of a thread, esp
points to the slot holding the return address into the calling
function. This is standard i386 calling convention. This is the
value of esp saved in the jmpbuf (in this case the jmpbuf is
t_kregs). longjmp() loads the esp value from the jmpbuf and stores
the return address at the location esp points to (note that it
doesn't push, it overwrites). The last thing it does is execute a
ret instruction, which pops this return address off the stack. A
thread built by dup_stack() never called setjmp(), so it has no
such slot yet; the extra word below the trapframe provides one.
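
To make the overwrite-then-ret step concrete, here is a small
user-space model of it. This is only a sketch: the real resume path
is assembly that isn't shown in this thread, and the names
sim_dup_stack, sim_resume and the word-indexed fake stack are made
up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_WORDS 16              /* size of the fake stack, arbitrary */

/* Hypothetical stand-in for the saved kernel registers (the jmpbuf). */
struct kregs {
    uintptr_t eip;                  /* where the thread continues */
    size_t    esp;                  /* word index into the fake stack */
};

/* Model of the dup_stack() setup: frame_idx is where the trapframe
 * starts; esp is placed one word below it. */
static void sim_dup_stack(struct kregs *k, size_t frame_idx, uintptr_t entry)
{
    k->eip = entry;
    k->esp = frame_idx - 1;         /* room for the return address */
}

/* Model of the resume path: store eip in the slot esp names (an
 * overwrite, not a push), then "ret", which reads it and pops it.
 * Returns the jump target; *esp_out is esp after the pop. */
static uintptr_t sim_resume(uintptr_t *stack, const struct kregs *k,
                            size_t *esp_out)
{
    size_t esp = k->esp;
    uintptr_t target;

    stack[esp] = k->eip;            /* overwrite the slot */
    target = stack[esp];            /* ret: fetch return address ... */
    esp += 1;                       /* ... and pop it */
    *esp_out = esp;
    return target;
}
```

With the trapframe at word 8, sim_dup_stack() sets esp to 7, and
after sim_resume() esp is back at 8 with the trapframe untouched.
Without the one-word gap the overwrite would land on the first word
of the trapframe.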

> This buffer is later used by retuser() to warp to the new context
> (restore registers from t_kregs).

Not really. t_kregs is the *kernel* state of the thread as saved
by setjmp (or set up by dup_stack). resume() is used to load this.
The user state is stored in a trapframe and loaded by retuser().

> When the longjmp occurs the
> instruction pointer gets set to "retuser" which causes the program
> to enter the retuser() function (again). How does a process
> continue execution if it keeps warping back to retuser()?

The above should help clarify this ... basically it's because there
is a distinction between the kernel and user state of a thread.
When a thread enters the kernel (via an exception, interrupt,
system call or whatever) its user state is saved on the kernel
stack in the form of a trapframe. If the kernel decides it needs
to jump to a different thread it (essentially) saves the *kernel*
state of the thread in t_kregs and loads new state out of a
different thread's t_kregs. This new thread returns to user mode
via retuser(), which picks up the user state from the trapframe, so
the user code continues wherever the trapframe says.
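
A rough model of the two kinds of state, in case it helps. This is a
sketch under simplifying assumptions: the structs hold only a few
registers, they are embedded rather than pointed to as t_uregs and
t_kregs are in the quoted code, and struct cpu, sim_switch and
sim_retuser are invented names.

```c
#include <assert.h>
#include <stddef.h>

/* User state, as saved on kernel entry (hypothetical subset). */
struct trapframe { unsigned long eax, eip, esp; };

/* Kernel state, as saved by setjmp (hypothetical subset). */
struct kregs { unsigned long eip, esp, ebp; };

struct thread {
    struct trapframe uregs;         /* stands in for t_uregs */
    struct kregs     kregs;         /* stands in for t_kregs */
};

/* The live kernel registers plus the thread they belong to. */
struct cpu {
    struct kregs   regs;
    struct thread *cur;
};

/* Context switch: save the outgoing thread's kernel state and load
 * the incoming thread's. No trapframe is touched here. */
static void sim_switch(struct cpu *c, struct thread *next)
{
    c->cur->kregs = c->regs;        /* like setjmp */
    c->regs = next->kregs;          /* like longjmp/resume */
    c->cur = next;
}

/* Return to user mode: the user pc comes from the trapframe, not
 * from the kernel registers. */
static unsigned long sim_retuser(const struct cpu *c)
{
    return c->cur->uregs.eip;
}
```

Two threads can have the same kernel eip (say, retuser) and still
resume at different user addresses, because the user address lives
in each thread's own trapframe.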

> What is the function of ebp? I don't believe the 68k has an
> analogous register (a dedicated one at least). The 68k uses
> the link command at the entry of a function which acts similarly
> to ebp (correct?). If the longjmp warps to a function that
> does this then the link register doesn't have to be set does it?

ebp is the frame pointer -- I don't _think_ it has to be used at
all, but it can be useful for debugging. But I'm not really sure,
my 386 book doesn't talk about it much other than the enter
instruction fiddling it as per the standard i386 calling conventions.
I'm not sure what's done on the 68k with the frame pointer either.

(Incidentally, the R3000 does away with frame pointers altogether.
Well-behaved functions decrement all they need from the stack in
one go at the beginning. This value is stored as part of the debugging
info ... so if a debugger wants to walk up the stack it just has
to look up the frame size for the current function.)
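
A toy version of that table-driven stack walk, to show the idea.
Everything here is an assumption for illustration: the table layout,
the word-indexed fake stack, and in particular the ra_slot field
(a real debugger also has to know where, or whether, each function
saved its return address).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-function debugging record. */
struct func_info {
    unsigned long start, end;       /* code range [start, end) */
    size_t frame_words;             /* words the prologue takes from sp */
    size_t ra_slot;                 /* word offset from sp of saved ra */
};

/* Find the function whose code range contains pc, or NULL. */
static const struct func_info *lookup(const struct func_info *tab,
                                      size_t n, unsigned long pc)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (pc >= tab[i].start && pc < tab[i].end)
            return &tab[i];
    return NULL;
}

/* Walk up the stack without a frame pointer: at each level, look up
 * the current function, fetch its saved return address, and add its
 * frame size to sp. Records each function's start address in out[]
 * and returns the number of frames found. */
static size_t walk(const struct func_info *tab, size_t n,
                   const unsigned long *stack, size_t stack_words,
                   unsigned long pc, size_t sp,
                   unsigned long *out, size_t max)
{
    size_t depth = 0;
    const struct func_info *f;

    while (depth < max && (f = lookup(tab, n, pc)) != NULL) {
        out[depth++] = f->start;
        if (sp + f->ra_slot >= stack_words)
            break;                  /* ran off the fake stack */
        pc = stack[sp + f->ra_slot];
        sp += f->frame_words;
    }
    return depth;
}
```

The walk stops when the return address falls outside every known
function, which in this model stands for the bottom of the stack.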

> At any rate here is the code I am working with right now:
>
> /*
> * New entity returns with 0 value; SP is one lower so that
> * the resume() path has a place to write its return address.
> * This simulates the normal context switch mechanism of
> * setjmp/longjmp.
> */
> new->t_kregs->pc = (ulong)retuser;
> new->t_kregs->sp = ((ulong)(new->t_uregs)) - sizeof(ulong);
> new->t_uregs->f_regs[REG_D0] = 0;
>
> you'll notice this is basically just a blind copy of the above.

What does your setjmp/longjmp do?
 
> in resume():
> /*
> * Make kernel stack come in on our own stack now. This
> * isn't used until we switch out to user mode, at which
> * time our stack will always be empty.
> * XXX esp is overkill; only esp0 should ever be used.
> */
> tss->esp0 = tss->esp = (ulong)
> ((char *)(t->t_kstack) + KSTACK_SIZE);
>
> This one has me really confused. I'm not too clear on what the tss
> does. In my port I have this stuff commented out. I have a feeling
> this has something to do with my problems and the 1 word space left
> on the stack after the longjmp.

I think this is the kernel stack which the i386 switches to on
receipt of an interrupt/exception/whatever. I believe the 68k has
a supervisor stack of some description to do this. If you're
currently doing it on the user stack you shouldn't, because you
can't trust it to be right.
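
Here is a small model of what the i386 does with esp0 when a trap
arrives from user mode: it starts at the stack top named by the TSS
and pushes the user ss, esp, eflags, cs and eip. This is a sketch
only; the struct names, sim_trap_entry and the word-indexed stack
are made up, and KSTACK_WORDS is an arbitrary size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define KSTACK_WORDS 64             /* assumed size for the sketch */

/* Only the field that matters here; esp0 is a word index naming the
 * top (one past the last word) of the thread's kernel stack. */
struct tss_sketch { size_t esp0; };

/* What the CPU saves on a trap from user mode. */
struct user_state { uintptr_t ss, esp, eflags, cs, eip; };

/* Model of trap entry: begin at esp0 on an empty kernel stack and
 * push the user state. Returns the resulting kernel esp. */
static size_t sim_trap_entry(uintptr_t *kstack,
                             const struct tss_sketch *tss,
                             const struct user_state *u)
{
    size_t esp = tss->esp0;

    kstack[--esp] = u->ss;
    kstack[--esp] = u->esp;
    kstack[--esp] = u->eflags;
    kstack[--esp] = u->cs;
    kstack[--esp] = u->eip;
    return esp;
}
```

If esp0 is never updated on a context switch, the next trap from
user mode pushes onto whatever stack the stale value names, which
could be another thread's kernel stack.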
 
> What amazes me is that even for my limited understanding of this
> that my port is actually working. Tasks are switching in and out
> fine and I have one process which does a tfork() successfully and
> I can do one fork()/exec() pair fine (although not two in a row).

I'm quite impressed! My R3000 port hasn't successfully made it
into user mode yet, let alone done fork()/exec() ...
 
Cheers,

Christopher.
Received on Thu Dec 1 00:45:27 1994
