Re: VSTa, tsync, user level semaphores

From: Andrew Valencia <vandys_at_nospam.org>
Date: Thu Dec 15 1994 - 11:03:58 PST

[Gary Shea <shea@xmission.com> writes:]

>His first mail to me asked why I was using an infinite-loop
>spinlock to protect the user-level semaphore structures:

Because, statistically, it won't be held (since it protects only a few
instructions) and it is much, much cheaper than dropping into the kernel.
Having yield() available might be a plus.
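
A minimal sketch of the kind of spinlock being discussed, assuming C11
atomics. The names uspin_t, uspin_try, uspin_lock and uspin_unlock are
hypothetical and are not taken from the VSTa sources; sched_yield() is a
POSIX stand-in for the yield() mentioned above, not a VSTa call.

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical lock type; guards only a few instructions of state. */
typedef struct {
    atomic_flag held;
} uspin_t;

#define USPIN_INIT { ATOMIC_FLAG_INIT }

/* One atomic test-and-set. Returns true if the caller now owns the lock. */
static bool uspin_try(uspin_t *l)
{
    assert(l != NULL);
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

/* Spin until the lock is free, giving up the CPU between attempts. */
static void uspin_lock(uspin_t *l)
{
    while (!uspin_try(l)) {
        sched_yield();  /* assumption: stand-in for a yield() primitive */
    }
}

/* Release the lock so the next test-and-set succeeds. */
static void uspin_unlock(uspin_t *l)
{
    assert(l != NULL);
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

In the uncontended case uspin_lock() costs a single atomic instruction
and never reaches the yield call.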

>In the absence of such a beast, should I be giving the user more
>control over what the spinlock does? I guess it could simply
>fail after some number of (user-specified, maybe 0 => forever?)
>loops and return an error... EACCES?

This can be useful for some errors, but not otherwise. After all, this is
not the spinlock protecting the ultimate resources, this is the spinlock
protecting the semaphore's own state.

>If you're saying that using a kernel semaphore is too expensive
>sometimes when it needn't be, and that a spinlock is impractical
>because it wastes cycles spinning while the resource is held, well,
>that's exactly what I would have said!

If you can do a user-level atomic operation in a single instruction, and a
reasonable fraction of the time take an uncontended resource, you are way
ahead of a kernel transition. Remember, our friends at Intel have given us
a ring 3->0 transition time of 100-200 cycles (depending on what you save)
and another 30-100 to come back. This is for a null kernel call. An atomic
bit test/set is (from memory) 1? 3? cycles.
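
The arithmetic above can be put into a small model. This is a rough
sketch only: the cycle counts are the ones quoted in this message, the
3-cycle atomic figure is the upper guess, and the linear cost model and
the function names are assumptions made for illustration.

```c
#include <assert.h>

/* Cycle counts as quoted in the message, for a null kernel call. */
#define ENTER_MIN 100.0  /* ring 3 -> 0, low end  */
#define ENTER_MAX 200.0  /* ring 3 -> 0, high end */
#define LEAVE_MIN  30.0  /* ring 0 -> 3, low end  */
#define LEAVE_MAX 100.0  /* ring 0 -> 3, high end */
#define ATOMIC_MAX  3.0  /* atomic bit test/set, upper guess */

/* Cost of entering and leaving the kernel once. */
static double round_trip_min(void) { return ENTER_MIN + LEAVE_MIN; }
static double round_trip_max(void) { return ENTER_MAX + LEAVE_MAX; }

/*
 * Assumed model: every acquire pays for the atomic operation, and only
 * the contended fraction also pays for a kernel round trip.
 */
static double expected_cycles(double p_contended, double round_trip)
{
    assert(p_contended >= 0.0 && p_contended <= 1.0);
    return ATOMIC_MAX + p_contended * round_trip;
}
```

With no contention the model gives 3 cycles per acquire, against 130 to
300 cycles if every acquire went through the kernel.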

>>But if msleep() can give CPU to another thread in the same process
>>(even if it MAY give CPU to another process), it might be
>>favorable to call msleep with a certain fixed period in spin-loop.
>This seems reasonable, and is pretty much what I plan to do when
>an msleep()-like routine appears.

As Dave noted, we already have __msleep() and __usleep(). yield() is on
its way.

>>2) Is it possible to implement kernel level semaphores with the same
>> interface? Thus a user can specify user-level or kernel-level in
>> semaphore creation call. I think kernel-level threads are better
>> for highly contended resources.
>I think the package _is_ that, right now. If a thread tries to
>get the resource, but it's taken, then they call tsleep (via the
>semaphore code) and get blocked on a kernel semaphore (which holds
>all user-level-semaphore-blocked threads). There _is_ the additional
>overhead of creating a user-level pid_t queue entry for the thread...

Right. For highly contended resources, you spend most of your time not
running, so the 1:100 (or more) cycle ratio is a wash. That is, it isn't
worth optimizing. This leaves optimizing the case of uncontended resources.
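
A sketch of the fast path under discussion, assuming C11 atomics. It is
not the code from Gary's package: usem_t and every function name here
are hypothetical, and the two kernel stubs only count calls where a real
version would sleep on, or wake a thread from, the kernel semaphore.

```c
#include <assert.h>
#include <stdatomic.h>

typedef struct {
    atomic_flag guard;         /* spinlock over count and waiters */
    int count;                 /* free units */
    int waiters;               /* threads parked in the kernel */
    atomic_uint kernel_calls;  /* statistic: kernel transitions */
} usem_t;

#define USEM_INIT(n) { ATOMIC_FLAG_INIT, (n), 0, 0 }

static void guard_lock(usem_t *s)
{
    while (atomic_flag_test_and_set_explicit(&s->guard, memory_order_acquire)) {
        /* spin: the guard is held for only a few instructions */
    }
}

static void guard_unlock(usem_t *s)
{
    atomic_flag_clear_explicit(&s->guard, memory_order_release);
}

/* Hypothetical kernel stubs: they count calls instead of blocking. */
static void kernel_sleep_stub(usem_t *s) { atomic_fetch_add(&s->kernel_calls, 1); }
static void kernel_wake_stub(usem_t *s)  { atomic_fetch_add(&s->kernel_calls, 1); }

/* P: take a unit. The uncontended case never leaves user mode. */
static void usem_p(usem_t *s)
{
    guard_lock(s);
    assert(s->count >= 0);
    if (s->count > 0) {
        s->count--;
        guard_unlock(s);
        return;                /* fast path */
    }
    s->waiters++;
    guard_unlock(s);
    kernel_sleep_stub(s);      /* slow path: V hands the unit over */
}

/* V: release a unit, waking a sleeper only if one exists. */
static void usem_v(usem_t *s)
{
    guard_lock(s);
    if (s->waiters > 0) {
        s->waiters--;
        guard_unlock(s);
        kernel_wake_stub(s);   /* unit goes straight to the sleeper */
        return;
    }
    s->count++;
    guard_unlock(s);
}
```

Only the branches that touch waiters pay for a kernel transition, and
those are the cases where the caller was going to wait anyway.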

>Ideally, it should be possible to spin for some user-specified
>amount of time before dropping into the kernel, just in case a
>thread on a different processor gives up the resources. That
>code is not yet in place. The current code should work in the
>SMP case anyway (I think!).

Yes, we already bounced this around. I guess it isn't as important until I
get my dual-Pentium motherboard. Please, no floating point jokes. I am
seriously out wangling for a dual processor system, at which point I can
start playing with SMP again.
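
The spin-for-a-while-then-block idea from Gary's paragraph could look
roughly like this, assuming C11 atomics. The names are hypothetical, and
treating a limit of 0 as "spin forever" follows the suggestion quoted
earlier in this message, not any existing interface.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    atomic_flag held;
} uspin_t;

#define USPIN_INIT { ATOMIC_FLAG_INIT }

/*
 * Try to take the lock at most max_spins times (0 means no limit).
 * Returns true on success. On false the caller is expected to fall
 * back to blocking in the kernel instead of burning more cycles.
 */
static bool uspin_lock_bounded(uspin_t *l, unsigned long max_spins)
{
    assert(l != NULL);
    for (unsigned long i = 0; max_spins == 0 || i < max_spins; i++) {
        if (!atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
            return true;   /* another CPU released it while we spun */
    }
    return false;          /* still held: time to drop into the kernel */
}

/* Release the lock. */
static void uspin_unlock(uspin_t *l)
{
    assert(l != NULL);
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

On a uniprocessor the holder cannot run while the caller spins, so a
small limit (or none at all) makes sense there; the bound pays off only
when another processor can release the lock in the meantime.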

                                                        Regards,
                                                        Andy
Received on Thu Dec 15 10:38:28 1994
