Re: Threads race for VM

From: Andy Valencia <vandys_at_nospam.org>
Date: Thu Nov 04 1999 - 11:52:21 PST

[Eric_Jacobs@fc.mcps.k12.md.us (Eric Jacobs) writes:]

>Ah... I see. thread1 hasn't taken the pset lock yet, so he could
>be pre-empted without even falling asleep on a semaphore.
>Probably unlikely, but certainly possible.

In a kernel mod I did for the Oracle production systems (back when I was at
Sequent), I sent out a patch before having it reviewed by "the" SMP VM guy.
In his review, he found the most minuscule of potential race conditions, and
commented (wryly) "it's almost impossible to hit, so they'll have to reboot
only once a day". In fact, it hit twice a day, but fortunately it hung the
CPU in a way which could be cleared by manual intervention (rather than a
reboot).

My lesson: in OSes, everything possible happens.

>I'm curious: how would the per-virtual-page struct help us
>here?

If you set a flag when the ATL is added for a given pview slot, then the
fault handler can check this flag (and, if it is already set, know that it
should just re-run the instruction). It also streamlines address space
cleanup, since you don't have to scan all the ATLs for a given physical page.
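
A rough sketch of what I mean, in C. The names here (struct vpage,
struct pview_sketch, fault_check, view_cleanup) are made up for illustration
and are not the actual kernel structures; it also assumes the caller already
holds whatever lock protects the slot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-virtual-page record; the real kernel layout differs. */
struct vpage {
    bool atl_added;          /* set once an ATL exists for this slot */
};

/* Hypothetical stand-in for a page view: one record per slot. */
struct pview_sketch {
    size_t npages;
    struct vpage *pages;
};

enum fault_action {
    FAULT_RETRY,             /* someone else already mapped it; re-run */
    FAULT_FILL,              /* we are first; fill the slot and map it */
    FAULT_BAD                /* index is outside the view */
};

/*
 * Decide what a fault on slot idx should do. Assumes the slot lock is
 * held, so the flag cannot change underneath us.
 */
enum fault_action fault_check(struct pview_sketch *pv, size_t idx)
{
    assert(pv != NULL);
    if (idx >= pv->npages)
        return FAULT_BAD;
    if (pv->pages[idx].atl_added)
        return FAULT_RETRY;  /* lost the race; translation is in place */
    /* Real code would fill the page and add the ATL here. */
    pv->pages[idx].atl_added = true;
    return FAULT_FILL;
}

/*
 * Cleanup walks the view's own slots instead of searching the ATL list
 * of each physical page. Returns how many slots had to be torn down.
 */
size_t view_cleanup(struct pview_sketch *pv)
{
    size_t n = 0;

    assert(pv != NULL);
    for (size_t i = 0; i < pv->npages; i++) {
        if (pv->pages[i].atl_added) {
            /* Real code would remove the translation here. */
            pv->pages[i].atl_added = false;
            n++;
        }
    }
    return n;
}
```

The second thread to fault on the same slot sees the flag set and simply
returns to re-run the instruction.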

>I'm also wondering if that scenario could be avoided by holding
>off pre-emption until we have the pset lock. It's not very far
>from the sti() in trap() until the find_pview() in vas_fault().
>Allowing another thread in that period has the effect of
>invalidating the processor's decision to enter the trap. Unless
>there's a real easy way to revalidate that decision after the
>slot lock is taken, I don't know if it makes sense to allow
>another thread to come in at that point.

That still wouldn't protect against another CPU running in parallel. And
no, I don't think a global lock would be the right way to go! (The
thread on the other processor is *probably* going to be faulting against
some completely unrelated page. No need to serialize at any point.)
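
To illustrate why per-slot locking is enough, here is a minimal sketch using
C11 atomic flags as try-locks. NSLOTS, struct slot_locks, and the slot_*
functions are hypothetical names, and a real kernel would use its own lock
primitives rather than these.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NSLOTS 4             /* arbitrary size for the sketch */

/* One lock per slot, so faults on different slots never contend. */
struct slot_locks {
    atomic_flag busy[NSLOTS];
};

void slots_init(struct slot_locks *s)
{
    for (unsigned i = 0; i < NSLOTS; i++)
        atomic_flag_clear(&s->busy[i]);
}

/* Returns true if we took the lock for slot idx, false if it is held. */
bool slot_trylock(struct slot_locks *s, unsigned idx)
{
    assert(idx < NSLOTS);
    return !atomic_flag_test_and_set_explicit(&s->busy[idx],
                                              memory_order_acquire);
}

void slot_unlock(struct slot_locks *s, unsigned idx)
{
    assert(idx < NSLOTS);
    atomic_flag_clear_explicit(&s->busy[idx], memory_order_release);
}
```

A CPU holding slot 0 does not stop another CPU from taking slot 1; only a
second fault on slot 0 has to wait.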

Andy
Received on Thu Nov 4 12:51:39 1999
