About Transactional Memory
Over the last decades, much of the gains in CPU performance can be attributed to the increase in clock frequency. The last few years have seen processor frequency leveling out and the focus shifting to multi-core CPUs, i.e., chips that integrate multiple processors, as a way to increase efficiency. To get a continued application speedup, applications need to be able to harness the parallelism of the underlying hardware. This is commonly achieved using multi-threading.
Writing correct and scalable multi-threaded programs is a challenging task. While it is well known that shared resources must be protected from concurrent accesses to avoid data corruption, guarding individual resources is often not sufficient. Sets of semantically related actions may need to execute in mutual exclusion to avoid semantic inconsistencies. Currently, most multi-threaded programs use lock-based synchronization, which is is inherently complex and error-prone. For example, it is typically difficult to determine---and also difficult to derive mechanically---the set of variables in a critical section that are protected by a given lock. As a matter of fact, many software bugs, like the Therac-25 accident, can be traced back to program synchronization issues. Such problems can be effectively dealt with by using ``transactional'' constructs, which provide adequate concurrency control mechanisms to help developers build dependable, yet efficient, multi-threaded applications. They provide better composability in the sense that transactions can be composed into larger transactions without possibly violating correctness.
Transactions have traditionally been used to manage concurrent accesses to a database. They guarantee that operations gathered into transactions satisfy the four so-called ACID properties: atomicity, i.e., transactions executes completely or not at all; consistency, i.e., transactions are a correct transformation of the state; isolation, i.e., even though transactions execute concurrently, it appears for each transaction T that other transactions execute either before T or after T, but not both; and durability, i.e., modifications performed by completed transactions survive failures. Databases implement this behavior by controlling access to shared data, and undoing the actions of a transaction that did not complete successfully (roll-back).
The synchronization problems encountered by multi-threaded applications are mostly similar to concurrency control in a database. Shared objects must be accessed in isolation by multiple threads, and consistency and atomicity must be maintained for sets of semantically-related actions. However, the durability property can be relaxed as shared objects are usually transient and do not need to survive the failure of the application. Transactional Memory (TM) provides transactions for one resource, main memory. Implementations can be in hardware, software, or both. Hardware TM is not yet available in current processors, but there are announcements that upcoming processors will contain hardware TM implementations that can execute smaller transactions.
There has also been a lot of research on Software transactional memory (STM), and a few research prototypes are available. STM provides programmers with constructs to delimit transactional operations and implicitly takes care of the correctness of concurrent accesses to shared data. Most current STM implementations scale better than or as well as fine-grained locking and some implementations have single-thread overheads of as low as 20%-50%. There are two kinds of interfaces for STMs. The first one requires the programmer to add explicit calls to the STM library for every access that is transactional. This approach is applicable in every system, but requires changes to the application's source code and can be error-prone. Here is an example for a function that adds an element to a concurrent linked list:
STM_START_TRANSACTION;
prev = (node_t *)STM_LOAD(&set->head);
next = (node_t *)STM_LOAD(&prev->next);
while (1) {
v = (int)STM_LOAD(&next->val);
if (v >= val) break;
prev = next;
next = (node_t *)STM_LOAD(&prev->next);
}
result = (v != val);
if (result) {
n = new_node(val, next, tx);
STM_STORE(&prev->next, n);
}
STM_COMMIT_TRANSACTION;
The second approach is to use a compiler or preprocessor to automatically convert all accesses to memory that happen within a transaction, so that the programmer only has to mark the boundaries of transactions (i.e., where they start and end). Here is the same code using this approach. Note that this is essentially the unmodified sequential code, plus the tanger_begin() and tanger_commit() calls, which our tool Tanger interprets as declarations for the start/end of a transaction.
tanger_begin();
prev = set->head;
next = prev->next;
while (1) {
v = next->val;
if (v >= val) break;
prev = next;
next = prev->next;
}
result = (v != val);
if (result) {
n = new_node(val, next, tx);
prev->next = n;
}
tanger_commit();
Using this second approach requires tool support (see our tool called Tanger) but is much more convenient for the programmer and less error-prone.
Besides Tanger, we also provide an efficient STM implementation called TinySTM.
If you are interested in TM research, you can have a look at our publications (and follow the work cited in the sections about related work) and the Transactional Memory Bibliography (which doesn't list all papers, though). The TRANSACT workshops is a good resource, and conferences in which TM papers are published are, for example, PPoPP, ISCA, PODC, DISC, PLDI, SPAA, and several others.
Some links to external pages (with comments):
- Wikipedia on Transactional Memory. When I last checked, the basics were there but there were some mistakes, important research results were missing and not so important things explained.
