Re: threads

From: Werner Vogels <werner_at_nospam.org>
Date: Sun Feb 27 1994 - 16:52:23 PST

Mike Larson writes:

> Let me rephrase my question in a different way. Assume for a moment
> that a) process context switch time is very fast under VSTA, b) there
> is a mechanism for sharing resources, in particular for sharing memory
> between processes (a useful thing anyway). Then my question is: given
> a) and b) above, does a thread package buy you anything? And if it
> does, is it worth the additional effort/code?
                                              
Your assumptions are based on the idea (Rob Pike & friends share your view,
so you are in good company) that threads are only there to fix performance
problems that existed in heavyweight OSes like SVr4. I do not agree with
this point of view. Threads, in my eyes, are an abstraction that gives the
programmer easy access to parallelism, and as such they have proven their
right to exist in many other environments and programming languages (which
were there before "forking" became a problem). Threads have become much more
popular and visible since the wider availability of multiprocessor systems:
threads were no longer just a "nice programming abstraction" but could
actually buy you additional cycles by exploiting the available parallel
power.

> But, as you mention, it may be possible to emulate a
> threads library using existing process-level interfaces

I think you have four options:

1. you want threads only as a nice programming abstraction. In this case you
   would probably not care whether your run-time system forks shared memory
   processes or uses a coroutine-like structure to manage the threads of
   control. (Sun LWP and Tom Doeppner's Brown threads are simple examples,
   Presto and FastThreads (UW) more elaborate ones).

2. you want your application to have full control over its scheduling. You
   allocate a number of processors to an address space, notify the thread
   run-time system of the current state, and let the run-time system figure
   out the best way to schedule the threads. This is done in Tom Anderson's
   "scheduler activations" and is also available for Mach. The new OS built
   in the Pegasus project (Mullender and Leslie) explores this method even
   further. The kernel needs to be "process aware", but not really
   "thread aware".

3. you want to exploit parallelism as much as possible but are satisfied
   with the system handling this for you, and you do not have a system that
   does (2). Your run-time system either interfaces with kernel threads that
   are wired to different processors, or you use a shared memory process
   interface (where the processes can also be allocated to different
   processors).

4. you want real-time thread support. Forget all previous options. The only
   way to achieve system-wide predictable scheduling is to have the kernel
   manage the threads. Interaction between the synchronization management
   and the scheduler is also inevitable. A hierarchical organization, where
   the processes are "real-timed" and the process run-time system manages
   the threads as in the previous options, is possible in some "very soft"
   cases (e.g. dedicated multimedia systems). But in general anything
   other than centralized threads for real-time is a no-no.
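
To make option 1 a bit more concrete, here is a minimal sketch of the
"coroutine-like structure" in Python, using generators as the threads of
control. The names (worker, run_round_robin) are made up for illustration;
this is not how Sun LWP or any of the packages above are implemented, just
the general shape of a cooperative run-time system.

```python
from collections import deque


def worker(name, steps, log):
    """A 'thread' of control: does one step of work, then yields."""
    for i in range(steps):
        log.append((name, i))
        yield  # voluntary switch point back to the run-time system


def run_round_robin(threads):
    """Run generator-based threads in round-robin order until all finish."""
    ready = deque(threads)
    while ready:
        thread = ready.popleft()
        try:
            next(thread)          # resume the thread until its next yield
        except StopIteration:
            continue              # thread finished; do not requeue it
        ready.append(thread)      # still alive: back of the ready queue
```

With log = [] and run_round_robin([worker("a", 2, log), worker("b", 1, log)])
the log shows the two threads interleaving: a, b, a. The kernel never sees
any of this; it only sees one process.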

My main problem with the shared memory process approach is that I do not
really believe that process context switches will ever be as cheap as
thread switches. In the process case you still have to manage some VM
mapping and do some domain protection work. But the worst thing is that
in most cases the TLB contents will be lost, which will be the death of
your system performance-wise. A thread switch normally just saves a few
registers, loads the new ones, and off you go.
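
A toy model of the difference, sketched in Python. Everything here (Cpu,
Context, switch_thread, switch_process, run_tlb_workload) is hypothetical,
and it assumes an untagged TLB that is flushed completely on an address
space change. It counts TLB misses only and says nothing about real cycle
counts.

```python
class Cpu:
    """Toy CPU: a register file plus a TLB caching page -> frame lookups."""

    def __init__(self):
        self.registers = {}
        self.page_table = {}
        self.tlb = {}
        self.tlb_misses = 0

    def translate(self, page):
        if page not in self.tlb:
            self.tlb_misses += 1                    # walk the page table
            self.tlb[page] = self.page_table[page]
        return self.tlb[page]


class Context:
    def __init__(self, page_table):
        self.registers = {"pc": 0, "sp": 0}
        self.page_table = page_table


def switch_thread(cpu, old, new):
    """Same address space: save a few registers, load the new ones."""
    old.registers = dict(cpu.registers)
    cpu.registers = dict(new.registers)


def switch_process(cpu, old, new):
    """New address space: as above, plus remapping and a lost TLB."""
    switch_thread(cpu, old, new)
    cpu.page_table = new.page_table
    cpu.tlb.clear()               # assumption: untagged TLB, full flush


def run_tlb_workload(switch, contexts, pages, rounds):
    """Touch every page, switch to the next context, repeat; count misses."""
    cpu = Cpu()
    current = contexts[0]
    cpu.registers = dict(current.registers)
    cpu.page_table = current.page_table
    for i in range(rounds):
        for page in pages:
            cpu.translate(page)
        nxt = contexts[(i + 1) % len(contexts)]
        switch(cpu, current, nxt)
        current = nxt
    return cpu.tlb_misses
```

For two contexts touching the same 8 pages over 10 rounds, the thread
version misses 8 times in total (only the first round), while the process
version misses 80 times (every page, every round).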

If VSTa wants to focus on real real-time (I'll start a "thread" on that once
this one dies out) it will be difficult to avoid the last option, where your
threads will have to be kernel based. In other cases I think a "scheduler
activations" approach would be extremely elegant.

> 2) OK, here's an example where threads are better than processes.
> But I wonder if it is common to need so many threads/processes. Two
> applications that I have worked on recently, for example, use in the
> order of 5 - 10 processes.
                           
Many threads, etc.: everything that you can describe as "highly parallel":
distributed object stores, telecommunications switches, banking/money
exchange applications, etc. I must admit I exaggerated a bit by using
200,000 threads; commonly in the telecommunication switch applications we
don't get further than invoking 2,000 - 8,000 threads per second. The object
stores are much more complex and have more threads, but most of them are
suspended on some communication channel.

Especially where threads are run in rapid succession, as in the telecom
switches, keeping the TLB happy is essential. Having the same physical to
virtual mapping all the time helps a lot.

> (3) this is true for a user-level threads library. Is it true for
> a kernel-supported threads environment?

No, kernel threads obviously cost more than user-level ones. This is the
whole idea behind the scheduler activations work. If you do not need
explicit kernel scheduling (as in real-time) you should leave it to the
application run-time system.

> (4) from an implementation standpoint or execution standpoint? If
> the answer is implementation, this leads me to an important question:
> will VSTa ever support process level synchronization primitives?
> If so, then aren't threads-specific synchronization primitives
> redundant?

Again, it all depends on where you do it. Building user-level
synchronization is in principle not very difficult if you can avoid being
preempted while accessing the synchronization primitives. It is also
extremely cheap.
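
A rough sketch of why this is cheap, again in Python with generators as
user-level threads. The Mutex class, the ("lock", m) / ("unlock", m) request
convention, and run_with_mutexes are all invented for this example. The key
assumption is that the run-time system itself is never preempted while it
touches the mutex fields, so no atomic instructions or kernel calls are
needed.

```python
from collections import deque


class Mutex:
    """User-level mutex: plain data, only touched by the scheduler."""

    def __init__(self):
        self.owner = None
        self.waiters = deque()


def run_with_mutexes(threads):
    """Cooperative scheduler; threads yield ("lock", m), ("unlock", m)
    or None (a plain voluntary switch)."""
    ready = deque(threads)
    while ready:
        thread = ready.popleft()
        try:
            request = next(thread)
        except StopIteration:
            continue                          # thread finished
        if request is None:
            ready.append(thread)
            continue
        op, mutex = request
        if op == "lock":
            if mutex.owner is None:
                mutex.owner = thread          # uncontended: a field write
                ready.append(thread)
            else:
                mutex.waiters.append(thread)  # suspended until unlock
        elif op == "unlock":
            # hand the mutex straight to the longest waiter, if any
            mutex.owner = mutex.waiters.popleft() if mutex.waiters else None
            if mutex.owner is not None:
                ready.append(mutex.owner)
            ready.append(thread)


def critical(name, mutex, log):
    """Example thread: enter a critical section, switch once, leave."""
    yield ("lock", mutex)
    log.append(name + " enter")
    yield                                     # others may run meanwhile
    log.append(name + " leave")
    yield ("unlock", mutex)
```

Running two critical() threads on one Mutex gives "a enter", "a leave",
"b enter", "b leave": b is suspended on the wait queue and cannot get in
between, even though a gives up the processor inside the critical section.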

Process synchronization is the worst case because your synchronization
objects have to be available system-wide. But in principle they do not
have to be much less efficient than kernel-scheduled thread
synchronization. For the latter you can make a number of optimizations,
as thread synchronization primitives are only ever used within one
resource container.

> But I think its useful to ask the following question: can
> we support a reasonable variety of applications with a minimum amount of
> redundancy?

If you want to serve predictable & deterministic real-time applications as
well as regular time-sharing applications, you'll have to decide what to put
in run-time support systems and what in the kernel. In general the stringent
timing constraints of real-time applications will be the most demanding with
respect to kernel support.

Remember that the only thing the kernel schedules in VSTa is threads;
processes are only there to indicate the shared resources among threads.

--
Werner
PS: With respect to the QNX question, the last thing I heard is that they
are waiting for the 1003.4a spec to become more stable, which could mean
many more years.
  
Received on Sun Feb 27 10:59:13 1994
