Notify problems and what-not

From: Andrew Valencia <vandys_at_nospam.org>
Date: Mon Nov 14 1994 - 08:03:17 PST

[Gary Shea <shea@xmission.com> writes:]

>Could you give me a clue on how you figured out what it is?
>Maybe I could make use of that info someday :)

The exception type (from "tf") is 255, which is the interrupt vector used
for system calls. So then you look at the "bt" and see which system call
was invoked. If it's just do_exit() then you know he asked to exit. If
it's by way of sendev() then he's dying on a signal. If his current signal
is "abort", you can figure he failed an assertion.
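
As a rough illustration of that reasoning, here is a small C sketch of the decision procedure. It is not debugger code: the enum labels, classify_exit(), and the idea of passing the "bt" output as an array of function names are all assumptions made for the example. Only the vector number 255 and the names do_exit() and sendev() come from the message above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Vector used for system calls, per the message above. */
#define SYSCALL_VECTOR 255

/* Hypothetical labels for the conclusions drawn in the text. */
enum exit_cause {
    CAUSE_NOT_SYSCALL,  /* trap was some other exception */
    CAUSE_ASKED_EXIT,   /* plain do_exit(): the process asked to exit */
    CAUSE_SIGNAL,       /* exit by way of sendev(): dying on a signal */
    CAUSE_ASSERTION,    /* as above, and the current signal is "abort" */
    CAUSE_UNKNOWN       /* a system call, but not an exit path we know */
};

/* Return 1 if the backtrace (an array of function names) mentions name. */
static int bt_has(const char *const *bt, size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(bt[i], name) == 0)
            return 1;
    return 0;
}

/* Apply the rules from the text to a trap type, backtrace and signal name. */
enum exit_cause classify_exit(int trap_type, const char *const *bt,
                              size_t n, const char *cur_signal)
{
    assert(bt != NULL || n == 0);

    if (trap_type != SYSCALL_VECTOR)
        return CAUSE_NOT_SYSCALL;

    /* Check sendev() first: a signal death may also pass through do_exit(). */
    if (bt_has(bt, n, "sendev")) {
        if (cur_signal != NULL && strcmp(cur_signal, "abort") == 0)
            return CAUSE_ASSERTION;
        return CAUSE_SIGNAL;
    }
    if (bt_has(bt, n, "do_exit"))
        return CAUSE_ASKED_EXIT;
    return CAUSE_UNKNOWN;
}
```

Checking for sendev() before do_exit() is a design choice of the sketch, on the assumption that a signal death can show both frames in the backtrace.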

BTW...

I found the bug in p_iowait semaphore counting last night. The problem is
in the interrupted I/O code path of msg_send(). If the p_sema_v_lock with
PRICATCH is interrupted, the semaphore count is incremented to "back out"
the sleeper. However, the portref's p_state continues to indicate that
there's an I/O waiter for the semaphore. If the server runs during this
window (after the sleeper has been interrupted, but before he can run) he
will move the state to "I/O done" and v_sema the semaphore even though
there's no sleeper. This leaves a semaphore at count 1, which will allow
the next guy to fall straight through.
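
The sequence can be replayed in a tiny single-threaded model. This is only a sketch under simplifying assumptions, not the kernel code: struct portref_model, the ST_* state names, and the helper functions are invented for the illustration, and the semaphore is reduced to a bare counter where a wait blocks whenever the count is zero or less.

```c
#include <assert.h>

/* Hypothetical stand-ins for the portref state described above. */
enum model_state { ST_IDLE, ST_IOWAIT, ST_IODONE };

struct portref_model {
    int sema_count;          /* models the p_iowait semaphore count */
    enum model_state state;  /* models p_state */
};

/* Client starts waiting for I/O: mark the state and take the semaphore. */
void client_sleep(struct portref_model *pr)
{
    pr->state = ST_IOWAIT;
    pr->sema_count--;
}

/* Interrupted sleep: the count is backed out, but the state is NOT. */
void client_interrupted(struct portref_model *pr)
{
    pr->sema_count++;
    /* pr->state still says ST_IOWAIT: this is the window. */
}

/* Server completes the I/O and wakes whoever the state says is waiting. */
void server_complete(struct portref_model *pr)
{
    if (pr->state == ST_IOWAIT) {
        pr->state = ST_IODONE;
        pr->sema_count++;    /* v_sema with no sleeper present */
    }
}

/* Would the next wait on the semaphore block? */
int next_wait_blocks(const struct portref_model *pr)
{
    return pr->sema_count <= 0;
}

/* Replay the interleaving from the text; returns the final count. */
int replay_race(struct portref_model *pr)
{
    pr->sema_count = 0;
    pr->state = ST_IDLE;

    client_sleep(pr);                /* count 0 -> -1 */
    assert(pr->sema_count == -1);
    client_interrupted(pr);          /* count -1 -> 0, state stale */
    server_complete(pr);             /* count 0 -> 1 */
    return pr->sema_count;
}
```

Calling replay_race() on a fresh struct portref_model leaves the model count at 1, so next_wait_blocks() reports that the following waiter would not block.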

If he did another message operation it wouldn't matter; msg_send() currently
resets the semaphore count before using the semaphore. But msg_disconnect()
doesn't, which is why this bug has hit so rarely.
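
To show why the reset hides the stale count, here is a minimal sketch with the semaphore again reduced to a bare counter. The names sema_model, wait_with_reset() and wait_without_reset() are hypothetical; they only stand for "a path that resets the count before waiting" and "a path that does not", and are not the actual msg_send() or msg_disconnect() code.

```c
#include <assert.h>

/* Hypothetical model: a semaphore reduced to its count. */
struct sema_model {
    int count;
};

/* In this model a wait blocks only when the count is zero or less. */
static int wait_blocks(const struct sema_model *s)
{
    return s->count <= 0;
}

/* Path that resets the count before using the semaphore. */
int wait_with_reset(struct sema_model *s)
{
    s->count = 0;            /* discards any stale v_sema */
    return wait_blocks(s);
}

/* Path that trusts whatever count was left behind. */
int wait_without_reset(const struct sema_model *s)
{
    return wait_blocks(s);
}

/* Small self-check of the two paths against a stale count of 1. */
void demo_stale_count(void)
{
    struct sema_model stale = { 1 };

    assert(wait_without_reset(&stale) == 0);  /* falls straight through */
    assert(wait_with_reset(&stale) == 1);     /* blocks as intended */
}
```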

                                                Andy
Received on Mon Nov 14 07:42:31 1994
