Nunc Stans 0.2.0 comes with many improvements for job safety. Most consumers of this framework will not notice the difference if they are using it "correctly", but in other cases you may now encounter error conditions.
Jobs now flow through a set of states in their lifetime.
All jobs start in the WAITING state. From here, a job has two possible transitions: it can be sent to ns_job_done and marked as NEEDS_DELETE, or it can be sent to ns_job_rearm and marked as NEEDS_ARM. A job that is WAITING can be safely modified with ns_job_set_* and accessed with ns_job_get_* from any thread.
Once a job is in the NEEDS_ARM state, it cannot be altered by ns_job_set_*, but it can still be read with ns_job_get_*. It can be sent to ns_job_done (which moves it to NEEDS_DELETE), but generally this only happens from within the job callback, with code like the following.
    void callback(ns_job_t *job) {
        ns_job_rearm(job);
        ns_job_done(job);
    }
In most cases, NEEDS_ARM will quickly move to the next state, ARMED.
The ARMED state means that the job has been successfully queued into the event or work queue. In this state, the job can be read with ns_job_get_*, but it cannot be altered with ns_job_set_*. If a job could be altered while queued, this could change the intent of what the job should do (set_data, set_cb, set_done_cb, etc.).
A job that is ARMED and queued can NOT be removed from the queue, or stopped from running. This is a point of no return!
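The transitions described so far can be sketched as a small stand-alone model. This is an illustration only: the `model_*` names are hypothetical and are not part of the nunc-stans API, and any transition the text does not describe is simply rejected here, which is an assumption of the sketch rather than documented behaviour.

```c
#include <stdbool.h>

/* Hypothetical model of the states described so far; not the nunc-stans types. */
typedef enum {
    MODEL_WAITING,
    MODEL_NEEDS_ARM,
    MODEL_ARMED,
    MODEL_NEEDS_DELETE
} model_state_t;

/* Models ns_job_rearm on a waiting job: WAITING -> NEEDS_ARM. */
bool model_rearm(model_state_t *state)
{
    if (*state != MODEL_WAITING) {
        return false; /* assumption: undescribed transitions are rejected */
    }
    *state = MODEL_NEEDS_ARM;
    return true;
}

/* Models ns_job_done: WAITING or NEEDS_ARM -> NEEDS_DELETE. */
bool model_done(model_state_t *state)
{
    if (*state != MODEL_WAITING && *state != MODEL_NEEDS_ARM) {
        return false; /* e.g. an ARMED job cannot be pulled back out of the queue */
    }
    *state = MODEL_NEEDS_DELETE;
    return true;
}

/* Models the framework queueing the job: NEEDS_ARM -> ARMED. */
bool model_queue(model_state_t *state)
{
    if (*state != MODEL_NEEDS_ARM) {
        return false;
    }
    *state = MODEL_ARMED;
    return true;
}
```

In this model a job that has reached MODEL_ARMED accepts none of the three operations, which mirrors the "point of no return" described above.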
In the RUNNING state, the job is in the process of executing the callback that the job contains. While RUNNING, the thread that is executing the callback may call ns_job_done, ns_job_rearm, ns_job_get_* and ns_job_set_* upon the job. Note that if the callback calls both ns_job_done and ns_job_rearm, 'done' is the 'stronger' action, so the job is deleted even though rearm was also called.
While RUNNING, other threads (i.e. not the worker thread executing the callback) may only call ns_job_get_* upon the job. Due to the design of the synchronisation underneath, this call will block until the callback has finished executing, so for all intents and purposes by the time the external thread is able to call ns_job_get_*, the job will have moved to NEEDS_DELETE, NEEDS_ARM or WAITING.
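The rule that 'done' takes precedence over 'rearm' can be illustrated with a minimal stand-alone sketch. The `sketch_*` names are hypothetical and do not exist in nunc-stans. The sketch also assumes that a callback which requests neither action leaves the job in WAITING, which is one reading of the list of resulting states given above.

```c
#include <stdbool.h>

/* Hypothetical states a job can be in once its callback has returned. */
typedef enum {
    SKETCH_WAITING,
    SKETCH_NEEDS_ARM,
    SKETCH_NEEDS_DELETE
} sketch_state_t;

/* Requests recorded while the callback is RUNNING. */
typedef struct {
    bool rearm_requested; /* the callback called ns_job_rearm */
    bool done_requested;  /* the callback called ns_job_done */
} sketch_requests_t;

/* Decide which state the job moves to after the callback returns. */
sketch_state_t sketch_after_callback(sketch_requests_t requests)
{
    if (requests.done_requested) {
        return SKETCH_NEEDS_DELETE; /* done is the stronger action */
    }
    if (requests.rearm_requested) {
        return SKETCH_NEEDS_ARM;
    }
    return SKETCH_WAITING; /* assumption: nothing requested */
}
```

Checking the done request first is what makes the order of the two calls inside the callback irrelevant in this model.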
When you call ns_job_done, this marks the job as NEEDS_DELETE. The deletion actually occurs at "some later point". When a job is set to NEEDS_DELETE, you may not call any of the ns_job_get_* and ns_job_set_* functions on the job.
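The access rules for the states up to this point can be summarised as two predicates. This is a stand-alone sketch with hypothetical `acc_*` names, not code from nunc-stans. It only answers "is the call permitted", and does not model the blocking behaviour of ns_job_get_* from other threads while a job is RUNNING.

```c
#include <stdbool.h>

/* Hypothetical model of the job states; not the nunc-stans types. */
typedef enum {
    ACC_WAITING,
    ACC_NEEDS_ARM,
    ACC_ARMED,
    ACC_RUNNING,
    ACC_NEEDS_DELETE
} acc_state_t;

/* ns_job_get_* is permitted in every state except NEEDS_DELETE. */
bool acc_may_get(acc_state_t state)
{
    return state != ACC_NEEDS_DELETE;
}

/* ns_job_set_* is permitted while WAITING (any thread) or while RUNNING
 * (only from the thread that is executing the callback). */
bool acc_may_set(acc_state_t state, bool is_callback_thread)
{
    switch (state) {
    case ACC_WAITING:
        return true;
    case ACC_RUNNING:
        return is_callback_thread;
    default:
        return false; /* NEEDS_ARM, ARMED and NEEDS_DELETE are read-only or off limits */
    }
}
```

Reading the table this way shows why setters are best called either before the job is first armed or from inside its own callback.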
The final state, in which the job is being deleted internally, only exists on the job briefly. We execute the ns_job_done_cb at this point, so that the user may clean up and free any data as required. Only the ns_job_done_cb thread may access the job at this point.
This state machine encourages certain types of work flows with jobs. This is because the current states are opaque to the caller, and are enforced inside of nunc-stans. The most obvious side effect of a state machine violation is an ASSERT failure with -DDEBUG, or PR_FAILURE from get()/set(). This encourages certain practices:
Some work flows that don't work well here:
Inside of the nunc-stans project, the tests/cmocka/stress_test.c code is a good example of a socket server and socket client using nunc-stans that adheres to these principles.