|
Bugzilla – Full Text Bug Listing |
| Summary: | UnixSystemThread::Join never returns if the thread is doing blocking IO | ||
|---|---|---|---|
| Product: | ns-3 | Reporter: | Mathieu Lacage <mathieu.lacage> |
| Component: | core | Assignee: | ns-bugs <ns-bugs> |
| Status: | RESOLVED FIXED | ||
| Severity: | normal | CC: | craigdo |
| Priority: | P3 | ||
| Version: | pre-release | ||
| Hardware: | All | ||
| OS: | All | ||
| Attachments: |
send signal to unblock blocked thread
Mathieu's original patch + structured thread exit support |
||
|
Description
Mathieu Lacage
2008-10-17 08:12:08 UTC
Created attachment 266 [details]
send signal to unblock blocked thread
craig, could you review and apply ? Mathieu, you're the maintainer of src/core, so I won't make any changes without your okay.
This patch subtly changes the semantics of the Join operation. IMO, Join() is understood to block the calling thread until the other thread terminates. Nothing more. You should be able to spin up thread(s) and then Join() them immediately, waiting for it/them to finish. This is a common worker thread model.
What you are addressing with this patch is the need to have threads terminate in an orderly fashion -- including those that are blocked on IO. In this model, you spin up the threads, and let them run until you thin it is time to close them down. Then you convince them to exit and then you Join() them as they finish up. In this case there needs to be some kind of cooperation between the thread that is doing the shutdown and the threads that are shutting down. You are relying on the correct interpretation of EINTR as returned by a blocking read.
This patch doesn't acknowledge this implicit cooperation. That can cause unexpected problems.
Consider a thread that is spawned and writes a value, reads for a response then exits. What happens if the caalling thread immediately does the Join() as I would intuitively think woud work just fine. In the current patch, the read would be (unexpectedly) interrupted by the Join() and the worker thread could never do its job (or even worse, it would sometimes work depending on kernel thread scheduling order).
So, I would separate out the Join() operation and the shutdown mechanism. I would leave Join() as it is, but add a couple of new methods along the lines of:
/*
* Called by thread during "cooperative" processing to determine
* if it should exit
*/
bool
Break (void)
{
return m_break;
}
/*
* Called by "managing" thread to tell the "managed" thread that it is
* time to exit. Also does a signal to cause any blocking IO in the
* managed thread to be interrupted (will return EINTR).
*/
void
Shutdown (void)
{
m_break = true;
pthread_kill (m_thread, SIGALRM);
}
If you just want to spin up some worker threads and wait for them to finish, you can ignore Shutdown() and Break(). The case above would work.
If you have a situation like in the tap device, you would actually have to cooperate by using the appropriate functions. In the device read thread you would see:
ReadThreadFunction ()
{
for (;;)
{
read (something); /* interruptable by the signal */
if (m_mythread->Break ())
{
break;
}
}
}
In the device shutdown code:
ShutDownDevice ()
{
m_readthread->Shutdown ();
m_readthread->Join ();
}
In this case, Join() means what you think it means.
There is the possibility of conflicting signals, too.
Created attachment 279 [details]
Mathieu's original patch + structured thread exit support
I added the Shutdown() and Break() methods previously discussed.
this looks good to me. |