POSIX signals have a long history and at least a couple unpleasant limitations. For one thing, with some threading implementations (those with fewer processes than threads) you can’t reliably target a specific thread as a signal recipient. However, luckily for me, that is not my problem.
My problem is both organizational and technical. Signal masks are for an entire process, and that means that masking a signal in your code may unintentionally impact code elsewhere in your application that expected signal delivery to work. This could in theory affect any codebase written by more than one person, but it really becomes an issue when your process uses code written by third parties.
The Background
We recently did some work to enable Android applications to use our MCAPI library. Most Android developers work with Java, with each application run in its own virtual machine. However, our MCAPI library is “native code” (i.e. C, not Java), and for that Android uses its own C library called “bionic” and its own threading implementation. The first problem is that bionic doesn’t implement one of the POSIX thread APIs: pthread_cancel().
As it so happens, we use pthreads in MCAPI for internal control messages. When the user de-initializes MCAPI, we need to shut those threads down, and so on Linux we ordinarily use pthread_cancel(). Since that’s unavailable in an Android environment, we implemented our own by sending a signal to wake our control thread. The thread is typically blocked in the kernel waiting for a hardware interrupt, so a signal causes it to be scheduled again, at which point it notices it should exit. Not a lot of code; tested on Linux and worked great; problem solved. When we ran it on Android though, it did nothing at all.
Remember how signals are process-wide? Well, as it turns out, Dalvik uses some signals for itself, including the signal we chose for MCAPI: SIGUSR1. When it came time to kill our thread, we sent the signal… but unbeknown to us, Dalvik code elsewhere in the application had masked SIGUSR1. Our thread never woke up and never exited.
Preparing RecommendationsThe Solution (?)
The fix? Use SIGUSR2 instead. Works great; problem solved.
Longer term though, there’s no guarantee that Dalvik won’t start using that too, or an application will link with some other library that (like us) tries to use SIGUSR2. Since there is no standard API to request and reserve signal numbers, conflicts seem inevitable.
So what to do? The best general solution I can come up with is one that embedded software developers should be familiar with: punt the problem to the integrator. The developer who writes the application using our library should be able to configure MCAPI to use an arbitrary signal, which they ensure won’t conflict with the rest of the application and libraries through code inspection. (Sure hope their third-party libraries come with source code.)
That doesn’t feel very satisfying to me either.
Comments
No one has commented yet on this post. Be the first to comment below.
Commented on 8:54 PM, Aug 19, 2010
By Randell Jesup
Commented on 12:45 PM, Aug 30, 2010
By Thomas Ulber
Commented on 4:30 PM, Aug 30, 2010
By Hollis Blanchard
Add Your Comment
Please complete the following information to comment or sign in.