NT synchronization primitive driver
This page documents the user-space API for the ntsync driver.
ntsync is a support driver for emulation of NT synchronization primitives by user-space NT emulators. It exists because implementation in user-space, using existing tools, cannot match Windows performance while offering accurate semantics. It is implemented entirely in software, and does not drive any hardware device.
This interface is meant as a compatibility tool only, and should not be used for general synchronization. Instead use generic, versatile interfaces such as futex(2) and poll(2).
Synchronization primitives
The ntsync driver exposes three types of synchronization primitives: semaphores, mutexes, and events.
A semaphore holds a single volatile 32-bit counter, and a static 32-bit integer denoting the maximum value. It is considered signaled when the counter is nonzero. The counter is decremented by one when a wait is satisfied. Both the initial and maximum count are established when the semaphore is created.
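The semaphore rules above, together with the overflow behavior described for NTSYNC_IOC_SEM_POST in the ioctl reference, can be sketched as a small user-space model. This is only an illustration: `sem_model` and its helper functions are hypothetical names, not part of the driver or its header. The sketch ignores locking and waiter wakeup, and it returns negative error codes directly, whereas a real ioctl() call returns -1 and sets errno.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical model of an ntsync semaphore; not the driver's own structure. */
struct sem_model {
    uint32_t count; /* current count; the semaphore is signaled while nonzero */
    uint32_t max;   /* maximum count, fixed at creation */
};

/* Creation check: an initial count above the maximum is rejected. */
static int sem_model_init(struct sem_model *s, uint32_t count, uint32_t max)
{
    if (count > max)
        return -EINVAL;
    s->count = count;
    s->max = max;
    return 0;
}

/*
 * Add `add` to the count and store the previous count in *prev.
 * Fails with -EOVERFLOW, leaving the semaphore untouched, if the
 * result would exceed the maximum.
 */
static int sem_model_post(struct sem_model *s, uint32_t add, uint32_t *prev)
{
    assert(s->count <= s->max); /* invariant established by sem_model_init */
    /* Written as a subtraction so the check itself cannot wrap around. */
    if (add > s->max - s->count)
        return -EOVERFLOW;
    *prev = s->count;
    s->count += add;
    return 0;
}

/* A wait is satisfied only while the count is nonzero; it takes one unit. */
static int sem_model_try_acquire(struct sem_model *s)
{
    if (s->count == 0)
        return 0;
    s->count--;
    return 1;
}
```

With a maximum of 2 and a current count of 1, posting 2 is rejected and the count stays at 1, while posting 1 succeeds and reports a previous count of 1.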
A mutex holds a volatile 32-bit recursion count, and a volatile 32-bit identifier denoting its owner. A mutex is considered signaled when its owner is zero (indicating that it is not owned). The recursion count is incremented when a wait is satisfied, and ownership is set to the given identifier.
A mutex also holds an internal flag denoting whether its previous owner has died; such a mutex is said to be abandoned. Owner death is not tracked automatically based on thread death, but rather must be communicated using NTSYNC_IOC_MUTEX_KILL (listed in the ioctl reference below as NTSYNC_IOC_KILL_OWNER). An abandoned mutex is inherently considered unowned.
Except for the “unowned” semantics of zero, the actual value of the owner identifier is not interpreted by the ntsync driver at all. The intended use is to store a thread identifier; however, the ntsync driver does not actually validate that a calling thread provides consistent or unique identifiers.
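The mutex rules (recursion count, owner identifier, abandonment) can be modeled in user space as follows. This is a hedged sketch rather than the driver's implementation: `mutex_model` and its helpers are hypothetical names, the error codes follow the ioctl reference in this document, and -EAGAIN is a model-only stand-in for "a real wait would sleep here".

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of an ntsync mutex; field names are illustrative. */
struct mutex_model {
    uint32_t owner;  /* 0 means unowned; otherwise an opaque identifier */
    uint32_t count;  /* recursion count */
    bool abandoned;  /* previous owner was reported dead */
};

/* Creation check: owner and count must be both zero or both nonzero. */
static int mutex_model_init(struct mutex_model *m, uint32_t owner, uint32_t count)
{
    if ((owner == 0) != (count == 0))
        return -EINVAL;
    m->owner = owner;
    m->count = count;
    m->abandoned = false;
    return 0;
}

/*
 * Try to satisfy a wait on behalf of `owner`. Returns 0 on acquisition,
 * -EOWNERDEAD if an abandoned mutex was acquired (it is still taken),
 * or -EAGAIN (model only) if a different owner holds the mutex.
 */
static int mutex_model_try_acquire(struct mutex_model *m, uint32_t owner)
{
    int ret = 0;

    if (owner == 0)
        return -EINVAL;
    if (m->owner != 0 && m->owner != owner)
        return -EAGAIN;
    if (m->abandoned) {
        m->abandoned = false;
        ret = -EOWNERDEAD;
    }
    m->owner = owner;
    m->count++;
    return ret;
}

/* Release once; the mutex becomes unowned when the count reaches zero. */
static int mutex_model_unlock(struct mutex_model *m, uint32_t owner, uint32_t *prev)
{
    if (owner == 0)
        return -EINVAL;
    if (m->owner != owner)
        return -EPERM;
    assert(m->count > 0); /* an owned mutex always has a nonzero count */
    *prev = m->count;
    if (--m->count == 0)
        m->owner = 0;
    return 0;
}

/* Report the death of `owner`: the mutex becomes unowned and abandoned. */
static int mutex_model_kill(struct mutex_model *m, uint32_t owner)
{
    if (owner == 0)
        return -EINVAL;
    if (m->owner != owner)
        return -EPERM;
    m->owner = 0;
    m->count = 0;
    m->abandoned = true;
    return 0;
}
```

Note that the model, like the driver, never checks whether an owner identifier corresponds to a real thread; it only compares values.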
An event holds a volatile boolean state denoting whether it is signaled or not. There are two types of events, auto-reset and manual-reset. An auto-reset event is designaled when a wait is satisfied; a manual-reset event is not. The event type is specified when the event is created.
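The difference between the two event types shows up only at the moment a wait is satisfied. A minimal user-space model, assuming hypothetical names (`event_model` and its helpers are not part of the driver) and ignoring locking and wakeups, might look like this:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of an ntsync event. */
struct event_model {
    bool signaled; /* current state */
    bool manual;   /* true: manual-reset, false: auto-reset */
};

/* Signal the event and return its previous state. */
static bool event_model_set(struct event_model *e)
{
    bool prev;

    assert(e);
    prev = e->signaled;
    e->signaled = true;
    return prev;
}

/* Designal the event and return its previous state. */
static bool event_model_reset(struct event_model *e)
{
    bool prev;

    assert(e);
    prev = e->signaled;
    e->signaled = false;
    return prev;
}

/*
 * Try to satisfy one wait. Only an auto-reset event is designaled by
 * the acquisition; a manual-reset event stays signaled.
 */
static bool event_model_try_acquire(struct event_model *e)
{
    assert(e);
    if (!e->signaled)
        return false;
    if (!e->manual)
        e->signaled = false;
    return true;
}
```

In this model a signaled auto-reset event satisfies exactly one acquisition, while a signaled manual-reset event satisfies any number until it is explicitly reset.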
Unless specified otherwise, all operations on an object are atomic and totally ordered with respect to other operations on the same object.
Objects are represented by files. When all file descriptors to an object are closed, that object is deleted.
Char device
The ntsync driver creates a single char device /dev/ntsync. Each file description opened on the device represents a unique instance intended to back an individual NT virtual machine. Objects created by one ntsync instance may only be used with other objects created by the same instance.
ioctl reference
All operations on the device are done through ioctls. There are four structures used in ioctl calls:
struct ntsync_sem_args {
    __u32 sem;
    __u32 count;
    __u32 max;
};
struct ntsync_mutex_args {
    __u32 mutex;
    __u32 owner;
    __u32 count;
};
struct ntsync_event_args {
    __u32 event;
    __u32 signaled;
    __u32 manual;
};
struct ntsync_wait_args {
    __u64 timeout;
    __u64 objs;
    __u32 count;
    __u32 owner;
    __u32 index;
    __u32 alert;
    __u32 flags;
    __u32 pad;
};
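The wait structure carries the object array as a 64-bit integer rather than a pointer, which keeps its layout independent of pointer width. The sketch below mirrors the listing above using <stdint.h> types so that it builds without kernel headers; `wait_args_mirror` and `wait_args_fill` are hypothetical names, and the element type of the descriptor array (a 32-bit integer) is an assumption made by analogy with the descriptor fields of the other structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of struct ntsync_wait_args as listed in this document. */
struct wait_args_mirror {
    uint64_t timeout;
    uint64_t objs;   /* user-space pointer, carried as an integer */
    uint32_t count;
    uint32_t owner;
    uint32_t index;
    uint32_t alert;
    uint32_t flags;
    uint32_t pad;
};

/* Two 8-byte members plus six 4-byte members: 40 bytes, no padding. */
static_assert(sizeof(struct wait_args_mirror) == 40,
              "layout must not depend on pointer width");

/*
 * Fill the structure for a wait with no timeout, no alert and no flags.
 * Zeroing first guarantees that `pad` is zero.
 */
static void wait_args_fill(struct wait_args_mirror *a, const uint32_t *fds,
                           uint32_t count, uint32_t owner)
{
    memset(a, 0, sizeof(*a));
    a->timeout = UINT64_MAX;            /* sleep until signaled */
    a->objs = (uint64_t)(uintptr_t)fds; /* widen via uintptr_t, not directly */
    a->count = count;
    a->owner = owner;
}
```

Casting through uintptr_t avoids sign-extension surprises on 32-bit targets, where converting a pointer straight to a 64-bit integer is implementation-defined.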
Depending on the ioctl, members of the structure may be used as input, output, or not at all. All ioctls return 0 on success.
The ioctls on the device file are as follows:
-
NTSYNC_IOC_CREATE_SEM
Create a semaphore object. Takes a pointer to struct ntsync_sem_args, which is used as follows:
sem
On output, contains a file descriptor to the created semaphore.
count
Initial count of the semaphore.
max
Maximum count of the semaphore.
Fails with EINVAL if count is greater than max.
-
NTSYNC_IOC_CREATE_MUTEX
Create a mutex object. Takes a pointer to struct ntsync_mutex_args, which is used as follows:
mutex
On output, contains a file descriptor to the created mutex.
count
Initial recursion count of the mutex.
owner
Initial owner of the mutex.
If owner is nonzero and count is zero, or if owner is zero and count is nonzero, the function fails with EINVAL.
-
NTSYNC_IOC_CREATE_EVENT
Create an event object. Takes a pointer to struct ntsync_event_args, which is used as follows:
event
On output, contains a file descriptor to the created event.
signaled
If nonzero, the event is initially signaled, otherwise nonsignaled.
manual
If nonzero, the event is a manual-reset event, otherwise auto-reset.
The ioctls on the individual objects are as follows:
-
NTSYNC_IOC_SEM_POST
Post to a semaphore object. Takes a pointer to a 32-bit integer, which on input holds the count to be added to the semaphore, and on output contains its previous count.
If adding to the semaphore’s current count would raise the latter past the semaphore’s maximum count, the ioctl fails with EOVERFLOW and the semaphore is not affected. If raising the semaphore’s count causes it to become signaled, eligible threads waiting on this semaphore will be woken and the semaphore’s count decremented appropriately.
-
NTSYNC_IOC_MUTEX_UNLOCK
Release a mutex object. Takes a pointer to struct ntsync_mutex_args, which is used as follows:
mutex
Ignored.
owner
Specifies the owner trying to release this mutex.
count
On output, contains the previous recursion count.
If owner is zero, the ioctl fails with EINVAL. If owner is not the current owner of the mutex, the ioctl fails with EPERM.
The mutex’s count will be decremented by one. If decrementing the mutex’s count causes it to become zero, the mutex is marked as unowned and signaled, and eligible threads waiting on it will be woken as appropriate.
-
NTSYNC_IOC_SET_EVENT
Signal an event object. Takes a pointer to a 32-bit integer, which on output contains the previous state of the event.
Eligible threads will be woken, and auto-reset events will be designaled appropriately.
-
NTSYNC_IOC_RESET_EVENT
Designal an event object. Takes a pointer to a 32-bit integer, which on output contains the previous state of the event.
-
NTSYNC_IOC_PULSE_EVENT
Wake threads waiting on an event object while leaving it in an unsignaled state. Takes a pointer to a 32-bit integer, which on output contains the previous state of the event.
A pulse operation can be thought of as a set followed by a reset, performed as a single atomic operation. If two threads are waiting on an auto-reset event which is pulsed, only one will be woken. If two threads are waiting on a manual-reset event which is pulsed, both will be woken. However, in both cases, the event will be unsignaled afterwards, and a simultaneous read operation will always report the event as unsignaled.
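The effect of a pulse on waiting threads can be illustrated with a small counting model. This is a sketch under simplifying assumptions: every waiter is assumed to be eligible (for example, each is performing an “any” wait on this event alone), and `pulsed_event` and `event_model_pulse` are hypothetical names that do not appear in the driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of an event being pulsed. */
struct pulsed_event {
    bool signaled; /* current state */
    bool manual;   /* true: manual-reset, false: auto-reset */
};

/*
 * Pulse the event while `waiters` eligible threads are waiting on it.
 * Returns how many of them are woken and stores the previous state in
 * *prev. The event is always left unsignaled.
 */
static unsigned int event_model_pulse(struct pulsed_event *e,
                                      unsigned int waiters, bool *prev)
{
    unsigned int woken;

    assert(e && prev);
    *prev = e->signaled;
    if (e->manual)
        woken = waiters;             /* every waiter sees the signal */
    else
        woken = waiters > 0 ? 1 : 0; /* the first wakeup consumes it */
    e->signaled = false;
    return woken;
}
```

With two waiters, the model wakes one thread for an auto-reset event and both threads for a manual-reset event; in either case the state afterwards is unsignaled.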
-
NTSYNC_IOC_READ_SEM
Read the current state of a semaphore object. Takes a pointer to struct ntsync_sem_args, which is used as follows:
sem
Ignored.
count
On output, contains the current count of the semaphore.
max
On output, contains the maximum count of the semaphore.
-
NTSYNC_IOC_READ_MUTEX
Read the current state of a mutex object. Takes a pointer to struct ntsync_mutex_args, which is used as follows:
mutex
Ignored.
owner
On output, contains the current owner of the mutex, or zero if the mutex is not currently owned.
count
On output, contains the current recursion count of the mutex.
If the mutex is marked as abandoned, the function fails with EOWNERDEAD. In this case, count and owner are set to zero.
-
NTSYNC_IOC_READ_EVENT
Read the current state of an event object. Takes a pointer to struct ntsync_event_args, which is used as follows:
event
Ignored.
signaled
On output, contains the current state of the event.
manual
On output, contains 1 if the event is a manual-reset event, and 0 otherwise.
-
NTSYNC_IOC_KILL_OWNER
Mark a mutex as unowned and abandoned if it is owned by the given owner. Takes an input-only pointer to a 32-bit integer denoting the owner. If the owner is zero, the ioctl fails with EINVAL. If the owner does not own the mutex, the function fails with EPERM.
Eligible threads waiting on the mutex will be woken as appropriate (and such waits will fail with EOWNERDEAD, as described below).
-
NTSYNC_IOC_WAIT_ANY
Poll on any of a list of objects, atomically acquiring at most one. Takes a pointer to struct ntsync_wait_args, which is used as follows:
timeout
Absolute timeout in nanoseconds. If NTSYNC_WAIT_REALTIME is set, the timeout is measured against the REALTIME clock; otherwise it is measured against the MONOTONIC clock. If the timeout is equal to or earlier than the current time, the function returns immediately without sleeping. If timeout is U64_MAX, the function will sleep until an object is signaled, and will not fail with ETIMEDOUT.
objs
Pointer to an array of count file descriptors (specified as an integer so that the structure has the same size regardless of architecture). If any object is invalid, the function fails with EINVAL.
count
Number of objects specified in the objs array. If greater than NTSYNC_MAX_WAIT_COUNT, the function fails with EINVAL.
owner
Mutex owner identifier. If any object in objs is a mutex, the ioctl will attempt to acquire that mutex on behalf of owner. If owner is zero, the ioctl fails with EINVAL.
index
On success, contains the index (into objs) of the object which was signaled. If alert was signaled instead, this contains count.
alert
Optional event object file descriptor. If nonzero, this specifies an “alert” event object which, if signaled, will terminate the wait; in that case the descriptor must refer to a valid event.
flags
Zero or more flags. Currently the only flag is NTSYNC_WAIT_REALTIME, which causes the timeout to be measured against the REALTIME clock instead of MONOTONIC.
pad
Unused, must be set to zero.
This function attempts to acquire one of the given objects. If unable to do so, it sleeps until an object becomes signaled, subsequently acquiring it, or the timeout expires. In the latter case the ioctl fails with ETIMEDOUT. The function only acquires one object, even if multiple objects are signaled.
A semaphore is considered to be signaled if its count is nonzero, and is acquired by decrementing its count by one. A mutex is considered to be signaled if it is unowned or if its owner matches the owner argument, and is acquired by incrementing its recursion count by one and setting its owner to the owner argument. An auto-reset event is acquired by designaling it; a manual-reset event is not affected by acquisition.
Acquisition is atomic and totally ordered with respect to other operations on the same object. If two wait operations (with different owner identifiers) are queued on the same mutex, only one is signaled. If two wait operations are queued on the same semaphore, and a value of one is posted to it, only one is signaled. The order in which threads are signaled is not specified.
If an abandoned mutex is acquired, the ioctl fails with EOWNERDEAD. Although this is a failure return, the function may otherwise be considered successful. The mutex is marked as owned by the given owner (with a recursion count of 1) and as no longer abandoned, and index is still set to the index of the mutex.
The alert argument is an “extra” event which can terminate the wait, independently of all other objects. If members of objs and alert are both simultaneously signaled, a member of objs will always be given priority and acquired first.
It is valid to pass the same object more than once, including by passing the same event in the objs array and in alert. If a wakeup occurs due to that object being signaled, index is set to the lowest index corresponding to that object.
The function may fail with EINTR if a signal is received.
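Because the timeout field is an absolute time, a caller that thinks in relative durations has to add the current clock value itself. One possible helper is sketched below; `abs_timeout_ns` is a hypothetical name, and the choice to saturate just below the maximum 64-bit value (so that a very long relative timeout is never mistaken for the “no timeout” value) is a design assumption of this sketch, not documented driver behavior. Pass CLOCK_MONOTONIC by default, or CLOCK_REALTIME when NTSYNC_WAIT_REALTIME is set in flags.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdint.h>
#include <time.h>

/*
 * Convert a relative duration in nanoseconds into an absolute deadline
 * on `clock`, expressed in nanoseconds.
 */
static uint64_t abs_timeout_ns(clockid_t clock, uint64_t rel_ns)
{
    struct timespec now;
    uint64_t now_ns;
    int rc = clock_gettime(clock, &now);

    assert(rc == 0); /* only fails for an invalid clock id or pointer */
    (void)rc;

    now_ns = (uint64_t)now.tv_sec * 1000000000ull + (uint64_t)now.tv_nsec;

    /* UINT64_MAX means "no timeout", so saturate one step below it. */
    if (rel_ns >= UINT64_MAX - now_ns)
        return UINT64_MAX - 1;
    return now_ns + rel_ns;
}
```

A relative duration of zero yields a deadline equal to the current time, which per the description above makes the wait return immediately without sleeping.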
-
NTSYNC_IOC_WAIT_ALL
Poll on a list of objects, atomically acquiring all of them. Takes a pointer to struct ntsync_wait_args, which is used identically to NTSYNC_IOC_WAIT_ANY, except that index is always filled with zero on success if not woken via alert.
This function attempts to simultaneously acquire all of the given objects. If unable to do so, it sleeps until all objects become simultaneously signaled, subsequently acquiring them, or the timeout expires. In the latter case the ioctl fails with ETIMEDOUT and no objects are modified.
Objects may become signaled and subsequently designaled (through acquisition by other threads) while this thread is sleeping. Only once all objects are simultaneously signaled does the ioctl acquire them and return. The entire acquisition is atomic and totally ordered with respect to other operations on any of the given objects.
If an abandoned mutex is acquired, the ioctl fails with EOWNERDEAD. Similarly to NTSYNC_IOC_WAIT_ANY, all objects are nevertheless marked as acquired. Note that if multiple mutex objects are specified, there is no way to know which were marked as abandoned.
As with “any” waits, the alert argument is an “extra” event which can terminate the wait. Critically, however, an “all” wait will succeed if all members in objs are signaled, or if alert is signaled. In the latter case index will be set to count. As with “any” waits, if both conditions are met, the former takes priority, and objects in objs will be acquired.
Unlike NTSYNC_IOC_WAIT_ANY, it is not valid to pass the same object more than once, nor is it valid to pass the same object in objs and in alert. If this is attempted, the function fails with EINVAL.
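The way “any” and “all” waits choose between the objects in objs and the alert event can be summarized in a small decision model. This is a hedged sketch: `wait_resolve` and `wait_all_validate` are hypothetical names, each object is reduced to a single signaled flag, objects are identified by plain integers standing in for the underlying objects, and -EAGAIN is a model-only stand-in for “the real ioctl would sleep”. The case of an empty objs array is deliberately left out of the model.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * "All" waits reject duplicates, both within the array and between the
 * array and the alert event (alert == 0 means no alert was given).
 */
static int wait_all_validate(const uint32_t *ids, uint32_t count, uint32_t alert)
{
    for (uint32_t i = 0; i < count; i++) {
        if (alert != 0 && ids[i] == alert)
            return -EINVAL;
        for (uint32_t j = i + 1; j < count; j++)
            if (ids[i] == ids[j])
                return -EINVAL;
    }
    return 0;
}

/*
 * One non-blocking evaluation of a wait. Returns the value that `index`
 * would receive, or -EAGAIN if nothing can be acquired yet.
 */
static int wait_resolve(bool all, const bool *signaled, uint32_t count,
                        bool alert_signaled)
{
    uint32_t nsignaled = 0;

    assert(count > 0); /* empty arrays are outside this model */

    for (uint32_t i = 0; i < count; i++) {
        if (!signaled[i])
            continue;
        if (!all)
            return (int)i; /* "any": lowest signaled index wins */
        nsignaled++;
    }
    if (all && nsignaled == count)
        return 0;          /* "all": index is reported as zero */
    if (alert_signaled)
        return (int)count; /* woken via alert: index equals count */
    return -EAGAIN;
}
```

In both modes the members of objs are examined before the alert event, which reflects the documented priority of objs over alert when both are signaled at once.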