Commit 16278d5

Add fixes for bug examples and STM
1 parent 9e722eb commit 16278d5

2 files changed

+29
-15
lines changed

README.adoc

Lines changed: 2 additions & 2 deletions
@@ -70,8 +70,8 @@ C. Atomicity and Order Violations
7070
|link:papers/serebry2009threadsanatizer.pdf[]
7171

7272
|Beautiful concurrency
73-
|beytonjones2007beautiful
74-
|link:papers/beytonjones2007beautiful.pdf[]
73+
|peytonjones2007beautiful
74+
|link:papers/peytonjones2007beautiful.pdf[]
7575

7676
|KISS: keep it simple and sequential
7777
|qadeer2004kiss

src/paper.tex

Lines changed: 27 additions & 13 deletions
@@ -73,14 +73,19 @@
7373
Single core processors were widely replaced by multi-core architectures optimized to run processes and threads concurrently.
7474
To utilize the full performance of the given hardware, programmers need to write their software with a high degree of parallelism.
7575
This introduces the potential for bugs that would not occur in sequential programs due to non-determinism in the order of execution.
76-
For this purpose tools and languages have been developed to ease the creation of concurrency-aware programs and help programmers find concurrency bugs effectively.
77-
This paper will evaluate the different tools and techniques that are available to find, reproduce and fix concurrency bugs from the viewpoint of how usable they are in a production environment.
76+
Such bugs can be very challenging to fix because they are not easily reproducible.
77+
For this purpose tools and languages have been developed to ease the creation of concurrency-aware programs and to help programmers find concurrency bugs effectively.
78+
This paper evaluates three techniques, together with concrete tools that implement them, for finding, reproducing and fixing concurrency bugs, with a focus on how usable they are in a production environment.
7879
The requirements for these tools are that they need to be easy to deploy, they should bring only a minimal computational and storage overhead, they should have a high coverage with minimal false-positive reports and they need to enable the developer to quickly find the cause of a bug.
80+
The techniques evaluated are dynamic code analysis with the record and replay tool rr and the data race detector ThreadSanitizer,
81+
concurrency-aware testing that combines the evaluation of thread schedules with delta debugging to automatically pinpoint concurrency bugs,
82+
and static code analysis with the methods of sequentialization and model checking.
7983
All tools are evaluated for the Go programming language but the concepts are mainly applicable to every other language as well.
8084
\end{abstract}
8185

8286
\begin{IEEEkeywords}
83-
[Software Engineering]: Testing and Debugging
87+
[Software Engineering]: Software testing and Debugging
88+
[Computing methodologies]: Concurrent programming languages
8489
\end{IEEEkeywords}
8590

8691

@@ -139,11 +144,11 @@ \subsection{The Go Programming Language}
139144

140145
\subsection{Table of Contents}
141146

142-
\Cref{sct:taxonomy} gives a brief introduction on the different types of concurrency bugs and their main causes.
143-
\Cref{sct:dynamic} covers some techniques to dynamically detect concurrency bugs and reliably reproduce them.
144-
\Cref{sct:testing} shows different methods of concurrency-aware testing to detect concurrency bugs automatically by manipulating the thread scheduler.
145-
And \Cref{sct:static} finally covers how to detect concurrency bugs with static code analysis.
146-
In the end there is a conclusion with a short outlook on the possible future of multi-threaded debugging.
147+
\Cref{sct:taxonomy} gives a brief introduction to the different types of concurrency bugs and their main causes: deadlocks, data races, and atomicity and order violations.
148+
\Cref{sct:dynamic} covers techniques to detect concurrency bugs dynamically: reproducing them reliably through record and replay with the tool \emph{rr}, and detecting data races with the tool \emph{ThreadSanitizer}.
149+
\Cref{sct:testing} shows different methods of concurrency-aware testing that detect concurrency bugs automatically by manipulating the thread scheduler and applying delta debugging.
150+
Finally, \Cref{sct:static} covers how to detect concurrency bugs with static code analysis through sequentialization and model checking.
151+
The paper closes with a conclusion that briefly compares all techniques and gives a short outlook on the possible future of multi-threaded debugging.
147152

148153

149154
% ------------------------------------ %
@@ -166,9 +171,9 @@ \section{Taxonomy of Concurrency Bugs}
166171
Non-blocking bugs are often harder to find because they can occur even when the termination of a program was successful but the result is wrong.
167172
This can also lead to a cascade of bugs where the root cause is a non-blocking concurrency bug which might not be obvious.
168173

169-
For example: A method concurrently creates a list of numbers that is expected to be ordered but due to a non-blocking concurrency bug the list contains unordered elements.
170-
This failure might only occur once in a thousand executions due to the exponentially growing number of possible thread interleavings.
171-
But if other methods depend on the correct order of elements in this list, the program might crash or generate wrong results and the reason can be very hard to find.
174+
For example: A method concurrently creates a list of numbers that is expected to be ordered but due to a non-blocking concurrency bug the list suddenly contains unordered elements.
175+
This failure might occur only once in a thousand executions, or even less often, due to the exponentially growing number of possible thread interleavings.
176+
But if other methods depend on the correct order of the elements in this list, the program might crash or generate wrong results and the reason can be very hard to find.
172177

173178
\subsection{Deadlocks}
174179
\begin{lstlisting}[float=h, language=Go, label=lst:deadlockWG, caption=Deadlock caused by waiting for the \emph{WaitGroup} at a wrong location -- based on \cite{tu2019go}]
@@ -188,6 +193,7 @@ \subsection{Deadlocks}
188193
The most common manifestation of blocking bugs is the \emph{deadlock}, where circular dependencies between resources block the flow of a program.
189194
\Cref{lst:deadlockWG} shows one example of such a deadlock in a Go program.
190195
The problem is a \emph{blocking synchronization} where the \lstinline{group.Wait()} inside the for-loop is causing the block.
196+
This statement has to be moved outside the for loop to resolve the unintentional blocking and fix the concurrency bug.
191197
Although the error seems obvious in this case, such small mistakes happen easily and can slip unnoticed into the production environment if the code is not tested well enough.
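The fix described above can be sketched as follows. This is a minimal illustration and not the paper's listing: the helper name `runWorkers` and the squared-index workload are assumptions made for the example. It only shows the corrected placement of `group.Wait()` after the loop.

```go
package main

import (
	"fmt"
	"sync"
)

// runWorkers is a hypothetical helper: it starts n goroutines and waits
// for all of them once, after the loop has launched every goroutine.
func runWorkers(n int) []int {
	var group sync.WaitGroup
	results := make([]int, n)
	group.Add(n)
	for i := 0; i < n; i++ {
		go func(id int) {
			defer group.Done()
			results[id] = id * id // each goroutine writes only to its own slot
		}(i)
		// A group.Wait() at this point would block in the first iteration
		// (for n > 1), because the remaining goroutines are never started
		// and the counter can never reach zero.
	}
	group.Wait() // wait once, outside the loop
	return results
}

func main() {
	fmt.Println(runWorkers(4)) // prints [0 1 4 9]
}
```

Because `Add(n)` is called before the loop, the counter only reaches zero once all `n` goroutines have called `Done()`, so the single `Wait()` after the loop is sufficient.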
192198

193199
\begin{lstlisting}[float=h, language=Go, label=lst:deadlockCh, caption={Deadlock caused by misuse of an \emph{unbuffered Channel}}]
@@ -206,6 +212,8 @@ \subsection{Deadlocks}
206212
A second example of a deadlock that might not be obvious is \Cref{lst:deadlockCh} which uses two unbuffered channels to transfer information between threads.
207213
The problem here is that without an active listener on an unbuffered channel, any send action will be blocked.
208214
To fix this, one could replace the unbuffered channel with a buffered one so that the execution flow of the program can continue without blocking.
215+
This shows how important it is to know the concrete implementation of a concurrency abstraction.
216+
Even though such abstractions are intended to ease synchronization between threads and make inter-thread communication safer, a developer who does not know how an abstraction is implemented can unknowingly create hard-to-find concurrency bugs.
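The difference between the two channel kinds can be sketched in a few lines. This is a simplified illustration with a single channel rather than the two channels of the listing, and the function name `sendThenReceive` is an assumption made for the example.

```go
package main

import "fmt"

// sendThenReceive is a hypothetical helper that sends one value and then
// receives it again in the same goroutine.
// With capacity 0 (an unbuffered channel) the send blocks forever because
// no receiver is ready; with capacity 1 the value is stored in the buffer
// and execution continues.
func sendThenReceive(capacity int) int {
	ch := make(chan int, capacity)
	ch <- 42 // blocks on an unbuffered channel without an active receiver
	return <-ch
}

func main() {
	fmt.Println(sendThenReceive(1)) // prints 42
	// sendThenReceive(0) would never return; with main as the only
	// goroutine, the Go runtime aborts with a deadlock error.
}
```

A buffer only postpones blocking until it is full, so the capacity has to cover the number of sends that can happen before a receiver becomes active.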
209217

210218
Another common problem in event-driven concurrent programs is the use of \emph{blocking operations}, such as filesystem operations, inside an event handler.
211219
These ``can penalize and even paralyze the entire program execution.''~\cite{tchamgoue2012testing}
@@ -229,7 +237,8 @@ \subsection{Data Races}
229237

230238
\Cref{lst:race} shows an example of a data race that is frequently found.~\cite{serebry2009threadsanitizer}
231239
The data race happens when ``two threads access a non-thread-safe complex object [e.g. a map] without synchronization.''~\cite{serebry2009threadsanitizer}
232-
Even though the two threads in this example write to different keys, this might cause a corruption of data or even crash the program because the default Go map is not concurrency-aware.
240+
Even though the two threads in this example write to different keys of the map \lstinline{m}, this might cause a corruption of data or even crash the program because the default Go map implementation is not concurrency-aware.
241+
To fix this, access to the map needs to be synchronized, for example with a lock.
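One way to apply such a lock is sketched below. The type `safeMap` and the function `fill` are hypothetical names chosen for this example and do not appear in the paper's listing. The sketch assumes two goroutines that write to different keys, as in the scenario described above.

```go
package main

import (
	"fmt"
	"sync"
)

// safeMap is a hypothetical wrapper that guards a plain Go map with a mutex.
type safeMap struct {
	mu sync.Mutex
	m  map[string]int
}

// set writes a key while holding the lock, so concurrent writers never
// touch the map's internal structure at the same time.
func (s *safeMap) set(key string, value int) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.m[key] = value
}

// fill lets two goroutines write to different keys of the same map.
func fill() map[string]int {
	s := &safeMap{m: make(map[string]int)}
	var wg sync.WaitGroup
	wg.Add(2)
	go func() { defer wg.Done(); s.set("a", 1) }()
	go func() { defer wg.Done(); s.set("b", 2) }()
	wg.Wait() // both writers have finished before the map is read
	return s.m
}

func main() {
	fmt.Println(fill())
}
```

Reads that can overlap with writes would need the same lock; a `sync.RWMutex` is a common alternative when reads dominate.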
233242

234243
% TODO: Is it a data race or an atomicity violation?
235244
Multi-variable data races are a special case of data races.
@@ -258,7 +267,8 @@ \subsection{Atomicity and Order Violations}
258267

259268
\Cref{lst:order} shows a common order violation bug pattern called ``Test-and-Use''.
260269
The programmer's intention is to check if a variable is not \lstinline{nil} and then use this variable.
261-
However, due to the thread that was launched before, it could happen that after the check in line 7, the thread of the goroutine gets scheduled and the data variable is set to \lstinline{nil}.
270+
However, due to the thread that was launched before, it could happen that after the \lstinline{if} check in line 7, the thread of the goroutine gets scheduled and the data variable is set to \lstinline{nil}.
271+
To fix this bug, the check and the use of the variable need to become a single atomic operation, so that the goroutine cannot modify the variable between them.
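A possible shape of this fix is sketched below, assuming a mutex is used to make the check and the use one critical section. The type `guarded` and its methods are hypothetical names for this example, not the paper's listing.

```go
package main

import (
	"fmt"
	"sync"
)

// guarded is a hypothetical type that protects a pointer with a mutex.
type guarded struct {
	mu   sync.Mutex
	data *string
}

// clear sets the pointer to nil while holding the lock.
func (g *guarded) clear() {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.data = nil
}

// use performs the nil check and the dereference under the same lock,
// so clear cannot run between the two steps.
func (g *guarded) use() (string, bool) {
	g.mu.Lock()
	defer g.mu.Unlock()
	if g.data == nil {
		return "", false
	}
	return *g.data, true
}

func main() {
	s := "payload"
	g := &guarded{data: &s}

	var wg sync.WaitGroup
	wg.Add(1)
	go func() { defer wg.Done(); g.clear() }()

	// Either the value is still present or it has already been cleared,
	// but a nil dereference is no longer possible.
	if v, ok := g.use(); ok {
		fmt.Println(v)
	}
	wg.Wait()
}
```

The lock does not decide whether `clear` or `use` runs first; it only guarantees that the check and the use are never separated. If a specific order is required, additional synchronization such as a channel or a `sync.WaitGroup` is needed.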
262272

263273
\begin{lstlisting}[float=h, language=Go, label=lst:atomicity, caption=Load-Store bug pattern -- Atomicity violation]
264274
func main() {
@@ -279,11 +289,15 @@ \subsection{Atomicity and Order Violations}
279289
The programmer assumes that ++ is an atomic operation because it is written as a single operator in Go.
280290
However, after compilation this is expanded to 3 instructions: LOAD, INCREMENT and finally STORE.
281291
The thread scheduler could switch the context after any of these instructions, which leads to undefined behavior when multiple threads try to increment the same variable.
292+
To fix this bug pattern, the \lstinline{++} operation also needs to be replaced by an atomic operation that does not allow other threads to access the \lstinline{sum} variable while incrementing.
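A minimal sketch of this replacement, using the standard library package `sync/atomic`, is shown below. The function name `countTo` and the goroutine count are assumptions made for the example; only the variable name `sum` is taken from the text.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// countTo is a hypothetical helper: it starts n goroutines that each
// increment the shared counter exactly once.
func countTo(n int) int64 {
	var sum int64
	var wg sync.WaitGroup
	wg.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			defer wg.Done()
			// atomic.AddInt64 performs load, increment and store as one
			// indivisible step, unlike sum++.
			atomic.AddInt64(&sum, 1)
		}()
	}
	wg.Wait()
	return atomic.LoadInt64(&sum)
}

func main() {
	fmt.Println(countTo(1000)) // prints 1000
}
```

Guarding `sum++` with a `sync.Mutex` would be an equally valid fix; the atomic variant is usually chosen when the critical section is a single counter update.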
282293

283294
Lu, Park, Seo and Zhou conducted a study in 2008 where they analyzed the characteristics of real-world concurrency bugs.~\cite{lu2008mistakes}
284295
One key finding was that:
285296
``Most of the examined non-deadlock concurrency bugs are covered by two simple patterns: atomicity-violation and order-violation''~\cite{lu2008mistakes}
286297

298+
A promising solution to atomicity and order violation bugs is \emph{software transactional memory} (STM) as described by Peyton Jones.~\cite{peytonjones2007beautiful}
299+
It is an alternative to traditional lock-based synchronization where atomic regions get declared explicitly.
300+
Languages such as Haskell provide STM as part of their design, while Go can use this mechanism through external libraries.
287301

288302
% ------------------------------------ %
289303
% ------ DYNAMIC CODE ANALYSIS ------- %
