CS 4650 / CS 5650
(Research in High Performance Computing)
Instructor: Gene Cooperman
Spring, 2012
Tuesdays, 321 Hayden Hall, 6:00 - 9:00
NEWS: The Course Wiki is available.
OpenMP and parallel benchmarking test suites
Organization
- Eliott Wiener (FReD -- reading internals: Python, record-replay
module for DMTCP)
- Samaneh Kazemi - Cilk (software model for multi-threaded programming)
- Adapt to FReD; but needs determinism on replay.
Understand internals: analyze Cilk races: wrapper around
atomic increment (used by Cilk), and rdtsc Intel assembly;
Analyze internals (how and why does Cilk use atomic increment,
rdtsc): analyze Cilk run-time library (where are the Cilk
functions, spawn and sync defined?) [ has already verified
that the Cilk test suite can be checkpointed using DMTCP ]
- Andrew Hannon-Rizza - DThreads - download and try out Dthreads;
Next step: checkpoint Dthreads with DMTCP
- Jonathan Albernaz - Averting TCP bandwidth throttling by enhancing
the UDP protocol, adding functionality so that the
underlying application cannot be bandwidth-throttled; also encryption
(use UDP for testing)
- Rohan Garg - implementation of epoll wrapper (benefits for
Apache, Firefox, and some VMs like user-space Qemu !!)
- Zhengping Jin - Lguest - Reading internals (a virtual machine
in only 5,9000 words) Using DMTCP for snapshots?
- Ashutosh Waikoo - ??? (maybe interested in epoll wrappers, etc. ???);
currently working on adding PID/TID virtualization using DMTCP
modules (an experimental module for single-process checkpointing
by Kapil Arya exists; extend it to distributed checkpointing)
- Komal Sodha - emacs23 bug -- (investigating DMTCP) ; May need
to debug DMTCP with GNU screen first. :-)
- Jim Shargo - X11 graphics, checkpoint using suspend-to-disk ;
understand internals of suspend-to-disk
and adapt to DMTCP; Prof. Bart Massey (Portland State U.)
is an expert on this; We'll ask his advice.
- SAME ORDER AS ABOVE. (Ashutosh Waikoo will be early or later)
- Jonathan Albernaz and Andrew - New Layer 4 protocol
- hiding detection of content,
bandwidth, actual
ports by network switches in the middle: encryption, port hopping,
multi-port channelling, randomized protocol byte order structuring;
Use TCP as model (starting point)
- Samaneh Kazemi - Adapt Cilk for reversible debugging using FReD
- Rohan Garg and Ashutosh Warikoo - Checkpointing Qemu
- Zhengping Jin and Komal Sodha - Lguest
- Jim Shargo - checkpointing 3-D graphics (OpenGL) using suspend-to-disk
- Eliott Wiener - FReD and Haskell
The prerequisites for this course are a familiarity with programming
in C (including pointers) under the Linux operating system, and the
ability to consider a new system call, read the Linux man page for it,
and to then understand how to use it in your program. Ideally, you should
also be comfortable using a symbolic debugger (e.g. gdb), although this
can be assimilated during the course itself. Students may in some cases
also be migrating from a background with a different operating system.
The remaining background knowledge (including
systems concepts) will be introduced/reviewed in the course.
If you want to privately test yourself on the
prerequisites, then read man mmap, and try writing a short
C program that uses the mmap system call. Also, write a short
amount of testing code to verify that the system call produced the
result that you expected.
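As a point of comparison for the self-test above, here is one minimal sketch of what such a program could look like. It assumes Linux with glibc, and it uses an anonymous private mapping so that no file is needed; the function names map_one_page and check_mapping are made up for this illustration. Your own attempt may reasonably look quite different (for example, mapping a file instead).

```c
#define _DEFAULT_SOURCE          /* exposes MAP_ANONYMOUS under strict -std modes */
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map one anonymous, private, read-write page. Returns NULL on failure. */
char *map_one_page(size_t *len_out)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return NULL;
    void *p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    *len_out = (size_t)page;
    return p;
}

/* Testing code: returns 1 if the mapping behaves as the man page says. */
int check_mapping(void)
{
    size_t len = 0;
    char *buf = map_one_page(&len);
    if (buf == NULL)
        return 0;

    /* Anonymous mappings start out zero-filled. */
    for (size_t i = 0; i < len; i++) {
        if (buf[i] != 0) {
            munmap(buf, len);
            return 0;
        }
    }

    /* The page is writable, and what we write can be read back. */
    strcpy(buf, "hello, mmap");
    int ok = (strcmp(buf, "hello, mmap") == 0);

    /* mmap returns page-aligned addresses. */
    ok = ok && ((uintptr_t)buf % len == 0);

    /* munmap returns 0 on success. */
    return munmap(buf, len) == 0 && ok;
}
```

Calling check_mapping() from a small main and printing its result is enough to exercise it; running the program under strace shows the mmap and munmap system calls as they happen.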
If you have taken a systems or operating systems course earlier, or if you are
taking such a course during the same semester, this would normally provide
those prerequisites. If you have questions whether your background
fulfills the prerequisites, please see me.
Course Structure:
As the
official course description states, this course "introduces students
to research in the domain of high-performance computing."
At its core, research is messier than the highly structured courses that
one more typically sees, but it can also be very exciting to see things
that no one in the world has ever seen before. For this reason, the course
requires highly motivated students who will operate semi-autonomously,
while reporting back to the class at regular intervals. The course
will have a small to moderate enrollment with the opportunity for more
personal attention.
There will be a warm-up project in January, followed by a term project.
The warm-up project should be done individually, although discussions
among students are encouraged both in class and outside of class.
The term project will typically be done in teams. The ideal team size
is two or three students.
The warm-up project gives you an opportunity to try out a research
area, and also to bring yourself up-to-date on a selection of techniques
that you may need for systems programming. (See
Debugging and Other Systems Tricks for an example
of some of the useful techniques, many of which will also be
discussed in class.)
For the term project, students may choose to work either on their
own research projects that they bring to the course, or on
research
questions that have evolved from the instructor's own research lab.
Lectures will be customized to present background concepts, theory,
and practical techniques of special value for the term projects as they
develop. The instructor and his students will offer generous amounts
of time to collaborate with the teams in small meetings.
In keeping with the course goal to take students to the forefront of research,
there will be opportunities after the course is over to continue
to collaborate with the goal of a competitive conference publication. However,
to maintain a sharp line between the academic course and extracurricular
work toward a publication, any interest in a collaboration toward a
conference publication should be discussed only after the student
has received his or her final course grade. Interested students
should be forewarned that the effort to produce a competitive conference
publication after the course is at least as great as the effort in the
course itself.
(Note that the overlap of certain weeks is intentional.)
WEEKS 1 and 2: Introduction to research topics; students
choose mini-project
WEEKS 3 and 4: Continuing lectures on research topics; students
complete mini-projects
WEEK 4: Students present results of mini-project
(oral and written)
WEEKS 4 and 5: Students choose course project.
WEEKS 4 through 8: Lectures guided by needs of students for
projects.
WEEK 6: Interim project reports by students (oral and written).
WEEK 9: Further interim project reports (oral and written).
WEEKS 10 through 12: Students lead discussions of lessons from
research: results of research topics to date; potential for
new research directions; interaction with other research results
in the literature
WEEK 12: Final project presentations (oral and written)
Instructor Information:
Office: 336 WVH
(and also look in my High Performance Computing Lab, 370 WVH)
Office Hours:
After class, and 4:20 - 5:30, Tuesday and Thursday; and also by appointment.
If students are having problems with their code, they are encouraged
to stay after class or arrange an appointment, so as to develop some
code jointly with the instructor.
Text:
There is no textbook. Internal documents and pointers to resources
on the Web will be provided. Please also note the two reference books
on systems programming listed at the end of this web page.
Grades:
Grades will be determined by the sophistication of the project, along with
the quality of the reports to the class (both oral and written reports).
Both individual and joint projects are possible. Students will be
encouraged to first do a (warm-up) mini-project, followed by a full
project that need not be on the same topic.
Research consists of exploration into the unknown.
Since all research is speculative, research results consist both of
positive and negative results. In geographical terms, the discovery
of a new mountain range (a new barrier) is just as interesting
as the discovery of a new river (a new exploration route).
For 2012, the course will be project-based, and will leverage the
research of the
High Performance Computing Laboratory. It will
emphasize two research vehicles:
- DMTCP
(Distributed MultiThreaded CheckPointing):
DMTCP is an open source
package freely available from Sourceforge and developed by
a team originating in the High Performance Computing Laboratory.
There is
a video demo of it here. There is also a
description of its internal
architecture.
It transparently
checkpoints the state of a process or computation to disk.
It does so in user space (no modification to the Linux kernel).
dmtcp_checkpoint a.out # run a.out under checkpoint control
rm ckpt_a.out-*.dmtcp # remove any old checkpoint image files
dmtcp_command -c # checkpoint the current process
dmtcp_restart ckpt_a.out-*.dmtcp # restart process from disk
DMTCP transparently follows the creation of new threads,
the forking of child processes, and the spawning of remote
processes via ssh. It currently does not checkpoint certain
processes involving X-Windows, the ptrace system call (e.g. gdb),
or suspended processes (^Z). The research question is how well
DMTCP can checkpoint common processes (without modifying the kernel),
and how well it can be extended to novel applications (checkpointing
GUIs using X-Windows, creation of a reversible debugger by
checkpointing gdb, etc.). For example, an interesting novelty
would be the ability to checkpoint some open windows of your
current session, and carry them home with you on your USB key.
- VISION: Checkpointing has seen three important uses:
restarting long-running computations in the middle after a
computer crash; load balancing and process migration; and
more recently,
restoring an earlier state for purposes of programming or
debugging. DMTCP supports all three modes, but some of the most
interesting research goals lie in the third area. Wouldn't it be
nice to checkpoint an X-Windows application, and move it to another
machine, and restart it? Can one do that with 3-D graphics
(an extension to basic X-Windows)?
Can one checkpoint a virtual machine such as user-space Qemu
or Linux lguest? If one could do this, one could even think
of running malware inside Windows inside a virtual machine.
Why is this useful? We can checkpoint fast (in seconds, unlike
the time for a virtual machine snapshot).
If the malware detects that it is being
spied on, we can back up to a previous checkpoint. If we are not
sure what input to pass to the malware, we can restart from the
checkpoint several times, and play "What if" games.
Don't worry if you have never used Qemu or lguest. All concepts
will be explained in a self-contained manner in the course.
-
EXAMPLE PROJECTS FOR 2012: Checkpointing single X11 apps (e.g.,
checkpointing Firefox: the ultimate bug report
for just before it crashes); Checkpointing a user-space
virtual machine; Infiniband support and
porting projects from expensive Infiniband clusters
to cheaper TCP/IP clusters for leisurely debugging.
- FReD
(Fast Reversible Debugger):
FReD
is an open source reversible debugger. It implements such
commands as reverse-step, reverse-next, and reverse-watch
(a generalization of watchpoints).
Suppose one is using a debugger and the
variable x
has the wrong value. When did it get the wrong value? Wouldn't
it be nice to revert to an earlier state and
examine x?
One can with DMTCP, which immediately yields a reversible debugger.
If we had checkpointed a debugging session 100 commands ago,
and we wish to undo the last debugging command, then just restart
the checkpoint image from 100 commands ago, and re-execute
the first 99 debugging commands. Now, combine the last
two ideas: I'm sure you've all seen how easily web browsers
can crash. Wouldn't it be great to go back and find out at which
statement they did something causing the crash?
An old description of
FReD can be found in these slides
from here. While reversible debuggers have been available
at least since 1970, they have seldom gained widespread use.
Most recently, GDB version-7.2 and later provides excellent support
for reversible debugging using its target record
command. GDB-7.3 is available in Ubuntu 11.10, and you will
find a copy of it in the instructor's directory.
Some strong points of the FReD reversible debugger are:
(i) supports multi-threaded programs at near full speed;
(ii) supports long-running programs (in contrast, GDB reversibility
is not practical for programs running even a few seconds); and
(iii) FReD supports a novel feature,
reverse expression watchpoints. (See the
slides
for a description of this feature.)
FReD is in the last stages before a public release. An alpha copy
of the code, along with two documents describing it are at:
The vision and example projects follow:
- VISION:
FReD provides a Python-based scripting language that allows one
to directly call debugging commands that can manipulate the
debugging history of a process. Using this platform, one can
automatically search for the cause of bugs. For example, if a
program dereferences a NULL pointer, FReD can bring one back
in time within the GDB debugger to the point where the corresponding
pointer variable was being set to NULL. If a buffer is allocated
via malloc, and a program calls
free twice on the same memory buffer, then FReD can bring
one back to a point in time where the first call to free
was made. This is done using reverse expression watchpoints.
FReD can be extended to other debuggers besides GDB, and to
other mechanisms for searching for the cause of a bug, beyond
the examples above.
-
EXAMPLE PROJECTS FOR 2012: extend FReD to work with
multi-threaded languages such as Cilk and OpenMP ;
add reversibility to the functional, lazy language Haskell ;
implement reversible memory leak detector
that will go back in time to the cause of the memory leak
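The "restart from the checkpoint and re-execute all but the last command" idea described for FReD above can be illustrated with a toy model. This is only a sketch under strong simplifying assumptions: the "process state" is a single integer, a "debugging command" is an arithmetic step, and every name below (session, session_undo, and so on) is hypothetical rather than taken from FReD or DMTCP.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_HISTORY 128

/* Toy "debugger session": the state is one integer, and each command
   transforms it. Hypothetical names; not FReD or DMTCP code. */
enum op { ADD, MUL };
struct command { enum op op; long arg; };

struct session {
    long checkpoint;                      /* state saved at checkpoint time */
    struct command history[MAX_HISTORY];  /* commands since the checkpoint  */
    size_t ncommands;
    long state;                           /* current state                  */
};

static long apply(long state, struct command c)
{
    return c.op == ADD ? state + c.arg : state * c.arg;
}

/* Take a checkpoint: remember the state and start a fresh history. */
void session_checkpoint(struct session *s)
{
    s->checkpoint = s->state;
    s->ncommands = 0;
}

/* Execute a command and log it. Returns -1 if the log is full. */
int session_execute(struct session *s, struct command c)
{
    if (s->ncommands == MAX_HISTORY)
        return -1;
    s->history[s->ncommands++] = c;
    s->state = apply(s->state, c);
    return 0;
}

/* Undo the last command: restore the checkpoint, then replay all logged
   commands except the last one. Returns -1 if there is nothing to undo. */
int session_undo(struct session *s)
{
    if (s->ncommands == 0)
        return -1;
    s->ncommands--;
    s->state = s->checkpoint;
    for (size_t i = 0; i < s->ncommands; i++)
        s->state = apply(s->state, s->history[i]);
    return 0;
}
```

Note that a step such as multiplying by zero cannot be undone by running it "backwards", which is why the sketch replays from the checkpoint instead of inverting each command. Replay only reproduces the earlier state if each command is deterministic, which is the same concern raised elsewhere on this page for multi-threaded programs.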
The instructor will cover any missing systems knowledge either in class,
or one-on-one with individual students.
GDB and other UNIX resources:
Some help files for UNIX and its compilers,
editors, etc. are also available.
In particular, the use of gdb (the GNU debugger) is especially encouraged
as an important productivity tool.
Here is also one book that is very nice for learning systems
programming concepts. Choose a chapter of interest, rather than
reading it from front to back. The Rochkind book is an excellent book,
with simple, example source code showing useful programs. The book's web site
has the table of contents, and downloadable example source code.
I also recommend the online book, "The Linux Kernel", below
for an excellent overview of the kernel.
The book by Robert Love
provides more technical details on the Linux Operating System,
but it would only be needed for more unusual aspects of certain projects.
The Linux kernel (online) (my current favorite book on the Linux
kernel --- a gentle introduction without confusing newcomers
with all the gory details)
Online: Linux System Programming by Love, 2007 (free online version
accessible from Northeastern computer network via
Safari Books Online)
- If using from another ISP outside of Northeastern U.,
then try tunneling using your CCIS account:
ssh -L1234:0-proquest.safaribooksonline.com.ilsprod.lib.neu.edu:80 denali.ccs.neu.edu
or:
ssh -L1234:safari.oreilly.com:80 denali.ccs.neu.edu
Then point one's browser at the URL http://localhost:1234/
Note also the notes on debugging below.
Each of these mini-projects is described only in outline. Within the
class, further details will be provided for those mini-projects of
interest to the students. Do not worry about whether you produce working software.
The goal is to understand. It is only incidental if you succeed with
a "deliverable".
- DMTCP (Familiarize yourself now with the code.)
Note especially the subdirectory dmtcp/doc with descriptions
of many parts of the DMTCP internals.
Use tools such as gdb for a deeper understanding.
Read the QUICK-INSTALL file for more
tips about DMTCP and its debugging tools.
Then write an overview of the implementation of DMTCP.
This is a paper-only project. If you take this on,
it will require detailed descriptions of the functionality
of the components of DMTCP.
Below are some alternative mini-projects concerned
more closely with producing code or pseudo-code.
- Use the module facility of DMTCP to build a new module.
An example module might be wrappers for the functions malloc
and free. The wrappers should allocate additional "guard
regions" around the memory buffer. It should catch bugs
like user code that writes beyond the end of allocated memory,
or user code that frees a buffer twice. For interested students,
there is the possibility of building on this mini-project,
to provide a novel, advanced memory leak detector in the
FReD reversible debugger.
- Write a new DMTCP wrapper function for a new system call.
(One suggestion is for man epoll.) There are examples
of wrapper functions in trunk/dmtcp/src/pidwrappers.cpp
and other files with names *wrappers.cpp. If you choose
an advanced system call such as epoll, it is acceptable
to work jointly with another student on a single mini-project.
- DMTCP currently has a bug in checkpointing emacs23
(version 23 of emacs). Investigate the cause of this bug.
The primary responsibility is the diagnosis of the bug. You are
not required to produce a bug fix.
(The bug appears to occur in the context of screen. If you
are interested in this project, tell me, and I will help you
reproduce this bug.)
- Provide a paper design for a new MTCP module. This type of
module is unrelated to the DMTCP modules described above. Currently,
DMTCP has an option for using "gzip" to dynamically compress
files on the fly. It is mostly in MTCP. This is too restrictive.
The MTCP subdirectory should support arbitrary user-defined
modules that are called by the MTCP checkpoint or restart routines.
The modules may save a checkpoint image locally or on a remote
machine; using gzip or a newer fast compression
routine such as
Snappy,
LZO,
FastLZ,
QuickLZ,
or other.
You provide a paper design for the framework, and third parties
build whatever module they want. As part of your paper design,
consider options for third-party module writers to write to RAM
and then fork a child process that saves on disk -- or a third-party
module that mmaps RAM to disk and lets the operating system worry
about the best optimization.
You don't yet write code
in this mini-project, but you should refer to existing
code in the MTCP subdirectory.
- FReD (Familiarize yourself now with the code.) This primarily
involves reading Python code, the C++ record-replay module
for DMTCP, and a rough "black box" understanding of DMTCP.
There is a limited introduction in
this paper from the PLOS-11 workshop.
In particular, read about the primitive reverse-xxx algorithms,
and then study the code to see how reverse-watch works.
Document your findings in a report.
- Dthreads
is a novel idea for determinism: replace a multi-threaded process
by multiple processes with shared memory. There is also
a
full paper on Dthreads. This provides for
efficient deterministic multithreading. A well-known problem with
reversible debuggers is that if you go back in time, and then
execute forwards, do you arrive at the same place, or did the
operating system produce a different thread schedule that changes
the behavior? By adding determinism,
Dthreads provides a "simple" idea for allowing FReD to easily
implement
determinism. Is such a combined implementation possible? How would
it work? This project is purely a paper design.
- Linux LGuest
is a virtual machine written in just 5000 lines of well-documented
code! You can read and understand every line of the code.
Lguest does not provide for snapshots. But using DMTCP, we could
checkpoint the Lguest "process". This provides a fast
checkpoint of an entire virtual operating system. Can this
idea work? Try it. If you try it, probably some things go wrong.
What goes wrong? Propose on paper some approaches to overcome
these difficulties. There are opportunities to continue
this work into the term project.
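For the malloc/free mini-project above, the "guard region" idea can be sketched in plain C, independent of the DMTCP module interface (which is not shown here). Everything in this sketch is an assumption made for illustration: the names guarded_malloc and guarded_free, the 16-byte guard size, the fill byte, and the decision to quarantine freed blocks rather than return them to libc so that a second free can still be recognized.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_LEN  16            /* bytes of guard on each side (assumed)  */
#define GUARD_BYTE 0xAB          /* fill pattern for guard regions         */
#define MAGIC_LIVE 0x4C495645u   /* header tag for an allocated block      */
#define MAGIC_FREE 0x46524545u   /* header tag for an already-freed block  */

/* Layout of each block: [header][front guard][user buffer][back guard] */
struct header { size_t size; size_t magic; };

enum guard_status { GUARD_OK = 0, GUARD_OVERRUN, GUARD_UNDERRUN,
                    GUARD_DOUBLE_FREE };

void *guarded_malloc(size_t size)
{
    unsigned char *raw = malloc(sizeof(struct header) + 2 * GUARD_LEN + size);
    if (raw == NULL)
        return NULL;
    struct header *h = (struct header *)raw;
    h->size = size;
    h->magic = MAGIC_LIVE;
    unsigned char *user = raw + sizeof(struct header) + GUARD_LEN;
    memset(user - GUARD_LEN, GUARD_BYTE, GUARD_LEN);   /* front guard */
    memset(user + size, GUARD_BYTE, GUARD_LEN);        /* back guard  */
    return user;
}

static int guard_intact(const unsigned char *guard)
{
    for (int i = 0; i < GUARD_LEN; i++)
        if (guard[i] != GUARD_BYTE)
            return 0;
    return 1;
}

/* Checks the block instead of blindly freeing it. A block that passes is
   only marked as freed (quarantined), so a later double free is detectable
   without touching memory that libc has already recycled. */
enum guard_status guarded_free(void *ptr)
{
    unsigned char *user = ptr;
    struct header *h =
        (struct header *)(user - GUARD_LEN - sizeof(struct header));
    if (h->magic == MAGIC_FREE)
        return GUARD_DOUBLE_FREE;
    if (!guard_intact(user - GUARD_LEN))
        return GUARD_UNDERRUN;
    if (!guard_intact(user + h->size))
        return GUARD_OVERRUN;
    h->magic = MAGIC_FREE;
    return GUARD_OK;
}
```

A byte-pattern guard only detects a stray write when the block is checked, not when the write happens. Catching the write at the moment it occurs would need page-protected guards (mprotect), at a cost of at least one page per allocation. A real tool would also bound the quarantine so that memory is eventually released.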
The main projects are listed below. As always, you are also welcome to
bring your own research project. The project may be a tool useful for
High Performance Computing or a large computation itself.
We will also set up a course Wiki, where you will describe the
status of your projects. The Wiki will also have a space for general
issues/comments in supporting Roomy and DMTCP.
DMTCP (list of projects still being revised)
- Using suspend-to-disk mode to enable checkpointing of graphics
programs (including OpenGL (3D graphics))
-
MTT (MPI Testing Tool) for automated testing for
Open MPI and DMTCP
- Checkpointing Qemu, first step towards checkpointing malware,
and then running the malware reversibly
- Another option that may or may not work is the
Linux lguest simple hypervisor
- Fast process migration (e.g. for servers):
Step 1: Add an MTCP module capability
Step 2: Write an MTCP checkpoint module
that checkpoints to remote RAM
Step 3: Restart on the remote machine
Step 4: Tune it for speed
- DMTCP attach using new ptrace capability.
- Checkpoint the job control/suspend feature of your favorite shell (^Z)
- Have MTCP use a standard ELF linker script.
- Hijack/Attach to already running process and checkpoint
(There is a question here about how to follow socket connections,
if that process is already talking to other processes.)
- Thread race condition detector: A traditional race condition
eventually causes a crash. But since it's a race condition,
it doesn't always crash at that location.
Experiment with different checkpoints,
until you find a checkpoint location for which the process
always crashes upon restart. Then modify MTCP to only allow
a subset of the threads to resume, and keep the other threads
suspended. By trial and error, discover which two threads
have a race condition.
- Portable Linux Apps: DMTCP checkpoint images include any
libraries that have been loaded. If the environment variable
LD_BIND_NOW is set (set to anything), then the loader will preload
every library that it will need. This should enable one to copy
a checkpoint image from Debian Linux to OpenSuse Linux to RedHat Linux
to Ubuntu Linux to (etc.). Does this work? If not, what's needed
to make it work?
- Incremental Checkpoint: DMTCP may want to keep multiple
checkpoint images, so that it can return to any of several
execution points in the past. This would normally require a lot
of disk space. How does one efficiently store a diff between
checkpoints? (This may be a somewhat easier project, for those
who are looking for that. With other projects, one will often
find that at the end of the semester, one has to report that some
parts are still not working, and why. This project offers the
opportunity of finishing most of the project, if there are no
surprises.)
- Checkpoint valgrind or other binary rewriters. Examples of
programs using dynamic binary translation include:
Valgrind,
Pin, and
Paradyn/Dyninst.
Currently, it appears that DMTCP cannot checkpoint these
packages. Why? Can it be fixed?
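For the incremental checkpoint project in the list above, one simple starting point for "storing a diff between checkpoints" is to compare two images page by page and keep only the pages that changed. The sketch below assumes two in-memory images of equal size with an identical layout and a fixed 4096-byte page. The names diff_images, apply_deltas and page_delta are hypothetical, and this does not reflect the actual DMTCP checkpoint image format, where memory regions can appear, disappear, or move between checkpoints.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE 4096   /* assumed page size for this sketch */

/* One changed page: its index in the image, and its new contents. */
struct page_delta {
    size_t index;
    unsigned char data[PAGE];
};

/* Compare two equal-sized images page by page and copy each changed page
   of `cur` into `out`. Returns the number of deltas written, or
   (size_t)-1 if `out` has room for fewer than the number needed. */
size_t diff_images(const unsigned char *prev, const unsigned char *cur,
                   size_t npages, struct page_delta *out, size_t max_out)
{
    size_t n = 0;
    for (size_t i = 0; i < npages; i++) {
        if (memcmp(prev + i * PAGE, cur + i * PAGE, PAGE) != 0) {
            if (n == max_out)
                return (size_t)-1;
            out[n].index = i;
            memcpy(out[n].data, cur + i * PAGE, PAGE);
            n++;
        }
    }
    return n;
}

/* Rebuild the later image in place from the earlier image plus deltas. */
void apply_deltas(unsigned char *image, const struct page_delta *deltas,
                  size_t n)
{
    for (size_t k = 0; k < n; k++)
        memcpy(image + deltas[k].index * PAGE, deltas[k].data, PAGE);
}
```

Comparing full images costs a pass over all of memory at every checkpoint. An alternative worth considering for the project is to write-protect pages after a checkpoint and record which ones fault, so that only pages known to be dirty are examined.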
FReD (list of projects still being revised)
- Reversibly debugging
Cilk programs
- Reversibly debugging
OpenMP programs
- Extend the FReD reversible debugger to Java programs via 'jdb'
- Haskell and reversible debugging (Due to the functional lazy design
of the language, Haskell has to worry about side effects in debugging)
- Combine FReD with DThreads (highly speculative, but very high impact)
- Memory leak detector: If a memory leak occurs later in the
program, valgrind runs too slowly to easily find it.
So, use a malloc debugger or your own memory/free interceptor.
A good one is the
DUMA library (libduma)
(Detect Unintended Memory Access), which is a newer replacement for
the classic Electric Fence (libefence).
This defines regions created through malloc. Late in the program,
it will be easy to find a region of memory that is a memory
leak (that no one ever touches again).
Alternatively, using checkpoint/restart
tricks, find the last time that anyone touched that memory
segment. Report that line of code using a standard tool to
convert between a line of assembly language and the source code line.
To guarantee that no one ever uses that memory again, remove
read and write permission from that region of memory and add
a segfault handler to trap any accesses. Then automate the
many checkpoint-restart cycles to automatically find where the memory
segment was last touched.
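The "remove access and trap with a segfault handler" step of the memory leak detector above might look roughly like the following on Linux. This is a sketch under several assumptions: the watched region consists of whole pages (here obtained from mmap, since malloc buffers are generally not page-aligned), the handler simply records the faulting address and re-enables access so that the program can continue, and all function names are invented for the example. Calling mprotect from a signal handler is a widely used technique on Linux, but POSIX does not list it as async-signal-safe.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static void *watched_base;                 /* start of the watched region */
static size_t watched_len;                 /* its length in bytes         */
static void *volatile last_fault_addr;     /* address of the last access  */
static volatile sig_atomic_t fault_count;  /* number of trapped accesses  */

static void on_segv(int sig, siginfo_t *info, void *ctx)
{
    (void)sig;
    (void)ctx;
    uintptr_t addr = (uintptr_t)info->si_addr;
    uintptr_t base = (uintptr_t)watched_base;
    if (addr < base || addr >= base + watched_len)
        _exit(1);   /* a genuine crash elsewhere: do not loop forever */
    last_fault_addr = info->si_addr;
    fault_count++;
    /* Restore access so that the faulting instruction can be retried. */
    mprotect(watched_base, watched_len, PROT_READ | PROT_WRITE);
}

/* Allocate one page that can later be protected. NULL on failure. */
void *alloc_watchable(size_t *len_out)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return NULL;
    void *p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    *len_out = (size_t)page;
    return p;
}

/* Install the handler and remove all access to [base, base+len). */
int watch_region(void *base, size_t len)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_segv;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_SIGINFO;
    if (sigaction(SIGSEGV, &sa, NULL) != 0)
        return -1;
    watched_base = base;
    watched_len = len;
    return mprotect(base, len, PROT_NONE);
}

int watch_fault_count(void) { return (int)fault_count; }
void *watch_last_fault(void) { return last_fault_addr; }
```

In the leak-detection setting, a region that never triggers the handler after being protected is a candidate leak. Restarting from successively earlier checkpoints and protecting the region each time would then narrow down the last access.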
DMTCP
If you have questions about DMTCP, please send e-mail to
Kapil Arya and me. The username of Kapil Arya is his
first name (all lower case) and: @ccs.neu.edu
DMTCP is available through
the sourceforge web page.
The easiest way to start is (in Linux) to type:
svn co https://dmtcp.svn.sourceforge.net/svnroot/dmtcp/trunk dmtcp
cd dmtcp
./configure
[ OR: ./configure --enable-debug ]
make
make check [OPTIONAL]
Then read the QUICK-START file in the top-level dmtcp directory.
From there start browsing the source code.
FReD
The FReD software will be made available soon.
If the suggestions are unclear, use "man" to find out more
about the commands.
-
pgrep -n a.out
pkill -9 a.out, where you replace
a.out by the name of your binary. Also, consider
pgrep with -o, and with no flags.
- GDB (beyond the basics):
The use of gdb is essential. Note the
introduction to UNIX tools.
- I invoke
gdb --args <COMMAND STRING>
This form allows you to include the command and its arguments.
- Within gdb, try this ASCII graphics mode: ^Xa
(control-X a)
- Include useful utility functions in your code. You can then do
things like
(gdb) print my_utility_len_linked_list(mylist)
This also works on system calls:
(gdb) print lseek(4, 0, 1)
This finds the file offset for file descriptor number 4,
where the final 1 corresponds to SEEK_CUR, whose value is
found from grep -R SEEK_CUR /usr/include .
(A final 0, SEEK_SET, would instead reset the offset to the start of the file.)
Similarly, you can find out what fd 4 corresponds to:
(gdb) print getpid()
(gdb) shell ls -l /proc/PID/fd
(replacing PID by the printed value, since the shell does not see
gdb convenience variables).
NOTE: getpid() can be called only after
the gdb run command (after the target is running).
-
set follow-fork-mode child will follow the child
process
on fork. parent will have it follow the parent process.
break fork will have GDB stop before executing the
fork system call.
- If your program under GDB exits too soon,
try
(gdb) break _exit . (You can also break on other
system calls in libc.)
-
gdb a.out PID OR
gdb a.out
(gdb) attach PID
(where the attach command is given within gdb); a convenient
single command that finds the PID is:
gdb a.out `pgrep a.out | tail -1`
- Consider delay loops in combination with "gdb attach". These are
better than calls to 'sleep' since they do not access external
libraries. For example, insert into your source code:
{volatile int x = 1; while(x);}
Then do "gdb attach" and
(gdb) print x=0
An alternate form to stop at the 5th occurrence is:
{static volatile int x = 1; if (x++ >= 5) while(x);}
- To use gdb with assembly, consider x/10i $pc, and
x/10i $pc-12, etc. A convenient version is:
(gdb) display/5i $pc
followed by stepi (si) or nexti (ni). To set a breakpoint in
assembly, try:
(gdb) break *0xbfdea000
for a breakpoint
at the given address.
- To use gdb on multi-threaded programs, try:
(gdb) info threads
(gdb) thread <NUM>
(gdb) thread apply all where
(gdb) set scheduler-locking on
NOTE: the last command (scheduler-locking) has the
possibility of creating
deadlock -- for example, if one thread is holding a low-level
libc or C++ lock, and you try to advance a second thread whose
execution requires the lock.
-
strace -o myoutput a.out (trace system calls
based on kernel API: /usr/include/asm/unistd*.h ;
decide in advance if it should trace all child processes or not;
the flags -f and -ff exist for tracing
parent and all children)
-
ltrace -o myoutput a.out (not as useful as
strace, but sometimes interesting: trace library
calls instead of system calls).
-
ps auxw | grep a.out
-
pstree -pu $USER or pstree -lu $USER
(tree of processes and child
process; names in curly braces are additional threads);
Note idioms like: pstree -p | grep -C2 a.out
-
top
- When your program runs too slowly, it might not be CPU-bound. Check
man iostat
man vmstat for
disk/file I/O (Blk_read/s / Blk_wrtn/s),
and paging to disk (bi/bo/id), respectively. A local disk (not
on the network; SANs are different) can sequentially
read or write (not both at once) roughly at a rate
from 50 MB/s to 100 MB/s.
If you are accessing files mostly and you don't
see that bandwidth, then your program is not efficient. If you are
paging to disk and you do see a bandwidth
anywhere near that bandwidth, then you are using too much RAM.
-
watch -d ls -l /tmp/myfile.txt
watch -d "pstree -l | grep -A1 `basename $SHELL`"
(repeatedly execute COMMAND for watch -d COMMAND)
- Search for SUBSTRING in all dmtcp/src/* files:
find dmtcp | xargs grep SUBSTRING
or alternatively: grep -r SUBSTRING dmtcp
Don't forget:
grep -C3 ...; grep -A5 ...; (and so on.)
-
grep and google are your friends when
searching for information. Besides "grep'ing" through source code,
here is a grep trick you may not have seen:
find /usr/share/man/man3 | xargs zgrep MYSTRING
find /usr/share/man/man3 | xargs gzip -dc | grep -C3 MYSTRING
-
less /proc/PID/maps
-
ls -l /proc/PID/fd
- List open file descriptors:
lsof | grep a.out
If you discover an interesting socket with SOCKET_ID
through 'lsof' or 'ls -l /proc/PID/fd', then find
the other end of the socket:
lsof | grep SOCKET_ID
- List environment variables of a process:
cat /proc/PID/environ | tr '\000' '\n'
-
nm a.out (or nm library.so)
Note the form nm -o for printing out filenames. This
can be useful with brute force strategies:
nm -o /usr/lib/lib* | grep MY_SYMBOL
(Also see discussion of readelf and objdump below.)
-
strings -a a.out (for some binary, a.out)
- Sometimes, a macro was expanded, and it's hard to track down what
happened. Try:
cpp -dM
-I. -Iother/path/to/include/files myfile.c
- Sometimes, it's not clear what include paths to use. In the above
example, try:
rm myfile.o; make myfile.o
and copy the command line used by 'make' to build myfile.o.
If 'make' uses libtool, you may also have to remove hidden
directories with names like .libs .
- Get to know your loader. It executes before your executable file:
man ld.so
-
env LD_DEBUG=help a.out
env LD_DEBUG=files a.out
(and try other options to LD_DEBUG)
-
ldd a.out (for some binary, a.out)
- Replace PID in following:
pushd /proc/PID;
ls -l exe;
echo -n "cmdline: "; cat -v cmdline;
echo ""; cat -v environ; echo "";
popd
- DMTCP:
./configure --enable-debug; make clean; make
and then run and look at
/tmp/dmtcp-USER@HOST/jassert* files
for your value of USER and HOST. Before
your next test, rm -rf /tmp/dmtcp-USER@HOST .
- MTCP: Look at mtcp/Makefile and uncomment the line that
adds to CFLAGS the flag:
-DDEBUG
- More on gdb: Using gdb with C++ : For a C++ function
with namespace, class,
and signature (e.g.: dmtcp::myClass::foo(int, bool) ),
try listing it first:
list 'dmtcp::myC<TAB>
It will autocomplete. Extend it, and type the final quote mark (').
Once you're sure you can list it, you can do things like set
a breakpoint:
b 'dmtcp::myC<TAB> (and complete it with quote mark
as before).
- Using gdb with errno: In glibc, the global variable errno (see
man errno) is a macro that is redefined to:
*(int *)__errno_location()
If you want to p errno within gdb, you will have
to modify this into p *(int *)__errno_location()
On 64-bit Linux, glibc seems to do something even more complicated,
requiring a more complicated solution.
- If you look at gdb and some call frames on the stack have
no information (only a hex address and "?"), then find out where
the call frames come from. Look at the hexadecimal address.
Then do:
(gdb) shell cat /proc/PID/maps (where the PID
of the current process is given by
(gdb) info proc ).
Alternatively:
(gdb) info proc mappings
Find which library or
other memory segment the unknown hexadecimal address came from.
Knowing which library was called is useful, but you may be able
to find out more.
If it comes from libc.so (or some other well-known library),
then see the next two tips for how to get the library
to show you its internal debugging information.
- (Continued) If you need a libc.so (or other well-known library)
with debugging symbols, then:
- Install the
package libc6-dbg. (The package name might differ for you.
Also, this assumes you have root privilege on your Linux.)
This will install a special libc.so in the directory
/usr/lib/debug .
Please note that the CCIS Ubuntu Linux machines already have
a debugging version of libc installed, currently as
/usr/lib/debug/libc-2.7.so .
- Next, do:
env LD_LIBRARY_PATH=/usr/lib/debug dmtcp_checkpoint a.out
(Presumably, after you checkpoint, the restarted a.out process
will be using the pre-checkpoint libraries and hence the
debugging versions. So, probably you don't need to
use env LD_LIBRARY_PATH=/usr/lib/debug for the restart
command. But if you're unsure, it doesn't hurt.)
-
The a.out process above should now be using a debugging version
of libc.so and perhaps other libraries.
You can verify this by looking at /proc/PID/maps
for your process. Now, in gdb, you will see the symbol information
in the call frame and a source code file and line number.
-
To read the corresponding source code, you can either download it from
the main source code location:
http://www.gnu.org/software/libc/libc.html#Availability
(try to choose the same libc version, and note that
the line numbers may be different in your Linux distro),
or download the source package for your particular Linux distro.
- (Continued) [You can find a more conceptual version
of this discussion here.]
If gdb still shows some call frames with "??", and
you have the full pathname of the library on disk, then you
can often fix it as follows. (Once you understand this procedure,
you may want to try the bin/gdb-add-symbol-file shell
script found in DMTCP.)
- In /proc/PID/maps look up the full pathname
of the library you need to load. The address of the call frame
with missing information should be in the address range of that
library.
- In gdb, read help add-symbol-file
- In gdb, type
add-symbol-file FILE ADDR
where FILE is the full pathname you identified in the
/proc/PID/maps file. The ADDR
will be the hexadecimal sum of:
- beginning of text segment address (text segment normally
has r-x permission) in /proc/PID/maps; and
- hexadecimal address of the .text section (the Addr column
of readelf, or the VMA column of objdump), as listed under the
section headers by either of the following commands:
readelf -S FILE
objdump -h FILE
- In the last step, the maps file provided the beginning address of
the whole segment, but the binary library on disk contains many
sections for a segment, and the .text section need not be the
first section in the file. So, we must add the offset of the
.text section, found by analyzing the binary library on disk.
- In gdb, a convenient way to add hexadecimal numbers is:
p/x addr1 + addr2
where addr1 and addr2 are the two addresses we discussed. If those
addresses are in hexadecimal, make sure to include 0x
at the beginning of each hexadecimal number.
- Now do 'where' in gdb, and you should see full call
frame information.
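The address arithmetic in the steps above can be sketched as follows. This is only an illustration: the function names, the sample maps text, and the .text address 0x15c40 are all made up. One assumption goes beyond the text: the sketch also subtracts the file offset of the r-x mapping (third column of the maps file), which reduces to the plain sum described above whenever that offset is zero.

```python
# Made-up excerpt of /proc/PID/maps, for illustration only.
SAMPLE_MAPS = """\
08048000-08049000 r-xp 00000000 08:01 12345      /tmp/a.out
b7e00000-b7f50000 r-xp 00000000 08:01 23456      /lib/libc-2.7.so
b7f50000-b7f52000 rw-p 00150000 08:01 23456      /lib/libc-2.7.so
"""

def text_mapping(maps_text, library_path):
    """Return (start address, file offset) of the r-x mapping of a library."""
    for line in maps_text.splitlines():
        fields = line.split(None, 5)
        if (len(fields) == 6 and fields[5].strip() == library_path
                and fields[1].startswith("r-x")):
            start = int(fields[0].split("-")[0], 16)
            file_offset = int(fields[2], 16)
            return start, file_offset
    return None

def add_symbol_file_command(maps_text, library_path, text_addr):
    """Build the gdb command; text_addr is the .text address reported
    by readelf -S or objdump -h for the library on disk."""
    mapping = text_mapping(maps_text, library_path)
    if mapping is None:
        raise ValueError("no r-x mapping found for " + library_path)
    start, file_offset = mapping
    return f"add-symbol-file {library_path} {start - file_offset + text_addr:#x}"

print(add_symbol_file_command(SAMPLE_MAPS, "/lib/libc-2.7.so", 0x15c40))
```

The printed line can be pasted into gdb. The bin/gdb-add-symbol-file script mentioned above is the tool to prefer in practice; this sketch only makes the arithmetic explicit.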
- If gdb is inconvenient, you can set a breakpoint directly in
your program using
a technique of Nikolay Igotti. In short, one defines a handler
for SIGTRAP, forks a child, and uses ptrace on the child to set some
of the child's x86 hardware debug registers using the PTRACE_POKEUSER
request of ptrace and the struct user layout in sys/user.h.
- The two commands readelf and objdump are useful
for inspecting the contents of binary files. They are related
to the simpler commands nm and strings, but
readelf and objdump have many more options, including the ability
to disassemble into assembly code, the ability to display
section headers, etc. Scan the man pages
quickly to see if something might be useful for you.
- For an assembly level listing as you do stepi in gdb,
try
objdump -S a.out > a.out.listing
where a.out should be replaced by your binary.
For a more verbose form, try one of:
gcc -c -g -Wa,-alh,-L file.c > file.s
gcc -c -g -Wa,-ahls=file.s file.c
Variations of this can also produce assembly code that can be
directly assembled by gcc or by as.
For example, if you want to modify and re-compile the source
code for libc.so,
this is normally quite painful. A nice trick is to disassemble
libc.so into assembly, and then cut or copy out the particular assembly
routines that you want to assemble into a modified library.
-
UNIX system calls, by Open Group;
(enter system call in search box);
These are the clearest, most precise man pages for system calls
you will ever find.
- Valgrind (Memory and leak detection
utility); This is easy-to-use and surprisingly powerful.
- If you want to see the stack just before a segfault, a quick idea that
may help is:
catchsegv COMMAND_LINE
- Another method for catching segfaults that may give you more control is
to try the glibc
call backtrace:
man backtrace
It shows C++ names in mangled form,
but they are mostly readable (and utilities exist
for demangling the names). Read the notes of man backtrace
(for example, compile with gcc -rdynamic to get
symbol names). Also, note man addr2line.
Look at the example file, backtrace.c for this course.
Also, for any call frames with no symbol
name, look up the hex address in /proc/<PID>/maps.
Use addr2line to translate hex addresses into
line numbers in source code. (If it's a .so dynamic library,
give it the offset: the hex address minus the beginning
library address as shown by /proc/<PID>/maps.)
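Looking up a hex address in the maps file and computing the offset to hand to addr2line can be sketched as below. The function names and the sample maps text are made up for this illustration, and "beginning library address" is taken to mean the lowest start address among the mappings of that library.

```python
# Made-up excerpt of /proc/PID/maps, for illustration only.
SAMPLE_MAPS = """\
08048000-08049000 r-xp 00000000 08:01 12345      /tmp/a.out
b7e00000-b7f50000 r-xp 00000000 08:01 23456      /lib/libc-2.7.so
b7f50000-b7f52000 rw-p 00150000 08:01 23456      /lib/libc-2.7.so
bffdf000-c0000000 rw-p 00000000 00:00 0          [stack]
"""

def parse_maps(maps_text):
    """Yield (start, end, path) for each line of /proc/PID/maps text."""
    for line in maps_text.splitlines():
        fields = line.split(None, 5)
        if len(fields) < 5:
            continue
        start, end = (int(x, 16) for x in fields[0].split("-"))
        path = fields[5].strip() if len(fields) > 5 else ""
        yield start, end, path

def library_offset(maps_text, addr):
    """Return (path, offset) for the mapping containing addr, or None.
    The offset is addr minus the lowest start address of that path."""
    regions = list(parse_maps(maps_text))
    for start, end, path in regions:
        if start <= addr < end:
            if not path:                      # anonymous mapping
                return path, addr - start
            base = min(s for s, _, p in regions if p == path)
            return path, addr - base
    return None

path, offset = library_offset(SAMPLE_MAPS, 0xb7e15c40)
print(f"addr2line -C -e {path} {offset:#x}")
```

For a real process, pass the contents of /proc/PID/maps instead of SAMPLE_MAPS. For an executable loaded at a fixed address, give addr2line the original address rather than the offset.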
- Understand addresses of symbols:
- Your executable will be loaded into RAM at an unknown base address.
But once it is loaded, you can find the base address it was loaded to:
less /proc/PID/maps
- Your executable file provides the offset from the base
address at which you will find the beginning of:
'text', 'data' (and possibly 'bss' segment).
Commands like 'objdump' and 'readelf' will show you this
offset in the file.
readelf -S a.out | grep '\.text '
- You may want to know the address of a particular symbol within
the 'text', 'data', or 'bss' segment. Use 'nm', 'objdump',
or 'readelf'.
SHELL% nm a.out | grep main
080483e4 T main
(As described in the 'man' pages, 't', 'T', 'd', 'D', 'b', 'B', 'U'
tell you if the symbol is in text, data, bss, or undefined
(presumably defined in a different library). Lower case means
file-private, and upper case means a globally visible symbol.
Look up __attribute__ ((visibility ("hidden"))) for
declaring a symbol library-private:
visible to the other .o files linked into the .so (library) file,
but not exported from the .so file.)
- In gdb, if it fails to correctly show you the stack, maybe some
memory was mmapped in (e.g. by DMTCP) that confused it.
The command
(gdb) help add-symbol-file
along with the above information will allow you to tell gdb
at what address in RAM the executable or library file on disk
was loaded. The file on disk contains the symbol information.
- Some of these calculations can be automated for you by
a DMTCP utility:
(gdb) shell utils/gdb-add-symbol-file
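The nm type letters discussed above can be decoded mechanically. The sketch below covers only the letters named in this section ('t', 'T', 'd', 'D', 'b', 'B', 'U'); the function names are made up, and any other letter is reported as "other" rather than guessed at.

```python
SECTIONS = {"t": "text", "d": "data", "b": "bss"}

def parse_nm_line(line):
    """Parse one line of nm output into (address or None, type, name).
    Undefined symbols have no address column."""
    parts = line.split()
    if len(parts) == 2:
        return None, parts[0], parts[1]
    return int(parts[0], 16), parts[1], parts[2]

def describe(sym_type):
    """Describe a type letter: section, plus local versus global scope."""
    if sym_type == "U":
        return "undefined"
    section = SECTIONS.get(sym_type.lower())
    if section is None:
        return "other"
    scope = "global" if sym_type.isupper() else "local"
    return f"{scope} {section}"

addr, sym_type, name = parse_nm_line("080483e4 T main")
print(name, hex(addr), describe(sym_type))
```

Here "local" corresponds to what the text calls file-private. See man nm for the remaining type letters.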
- An interesting Linux command: addr2line
('main' is on line 2 of tmp.c in the example below.)
SHELL% nm a.out
...
080483e4 T main
SHELL% addr2line -C -e /tmp/a.out 080483e4
/tmp/tmp.c:2
- In comparing two versions of a file, consider programs such as:
kompare, kdiff3, meld, gvimdiff (or text-based vimdiff).
- To examine the Linux source code,
try "google LXR" (Linux Cross Reference), which should
lead you to lxr.linux.no .
LXR is free software for hyper-linking large code bases. Another
popular choice is Doxygen
(available as a package in many Linux distros).
- To see which virtual memory pages are currently mapped to physical memory,
see /proc/PID/pagemap .
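A minimal sketch of reading one pagemap entry follows. It assumes the entry layout given in the Linux kernel's pagemap documentation: one 64-bit entry per virtual page, bit 63 set when the page is present in RAM, bit 62 set when it is swapped out, and bits 0-54 holding the page frame number when present. The function names are made up. Recent kernels report the page frame number as zero to unprivileged readers, so only the present and swapped bits should be relied on without root.

```python
import os
import struct

PAGE_PRESENT = 1 << 63
PAGE_SWAPPED = 1 << 62
PFN_MASK = (1 << 55) - 1      # bits 0-54

def decode_entry(entry):
    """Decode one 64-bit pagemap entry into its main fields."""
    return {
        "present": bool(entry & PAGE_PRESENT),
        "swapped": bool(entry & PAGE_SWAPPED),
        "pfn": entry & PFN_MASK,   # meaningful only if present
    }

def read_entry(pid, vaddr):
    """Read the pagemap entry for the page containing virtual address vaddr."""
    page_size = os.sysconf("SC_PAGE_SIZE")
    with open(f"/proc/{pid}/pagemap", "rb") as f:
        f.seek((vaddr // page_size) * 8)   # 8 bytes per virtual page
        data = f.read(8)
    if len(data) != 8:
        raise ValueError("no pagemap entry at this address")
    return struct.unpack("=Q", data)[0]    # native byte order
```

A typical use is decode_entry(read_entry("self", addr)) for an address taken from /proc/self/maps.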
- To find out information known to BIOS:
sudo dmidecode -t help
- When using google to get technical information,
stackoverflow.com
tends to have high quality answers. Try those hits first.