CS 4650 / CS 5650 (Research in High Performance Computing)

Instructor: Gene Cooperman
Spring, 2012
Tuesdays, 321 Hayden Hall, 6:00 - 9:00

NEWS: The Course Wiki is available.
OpenMP and parallel benchmarking test suites

Organization


Mini-Projects (chosen)


Mini-Projects (talks)


Term Projects


PREREQUISITES:

The prerequisites for this course are a familiarity with programming in C (including pointers) under the Linux operating system, and the ability to consider a new system call, read the Linux man page for it, and to then understand how to use it in your program. Ideally, you should also be comfortable using a symbolic debugger (e.g. gdb), although this can be assimilated during the course itself. Students may in some cases also be migrating from a background with a different operating system. The remaining background knowledge (including systems concepts) will be introduced/reviewed in the course. If you want to privately test yourself on the prerequisites, then read man mmap, and try writing a short C program that uses the mmap system call. Also, write a short amount of testing code to verify that the system call produced the result that you expected.
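As a rough illustration of the scale of the self-test described above, here is a minimal sketch in C. It assumes Linux with glibc; the function name mmap_roundtrip is hypothetical, and an anonymous mapping is only one of several ways to exercise mmap (a file-backed mapping would work equally well).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: map `len` bytes of anonymous private memory,
 * check that the kernel zero-filled it, write a pattern, read the
 * pattern back, and unmap. Returns 0 on success and -1 on any failure. */
int mmap_roundtrip(size_t len)
{
    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    size_t i;
    int ok;

    if (p == MAP_FAILED)
        return -1;

    /* Fresh anonymous mappings are zero-filled. */
    for (i = 0; i < len; i++) {
        if (p[i] != 0) {
            munmap(p, len);
            return -1;
        }
    }

    /* Write a pattern and verify that it can be read back. */
    memset(p, 0xAB, len);
    ok = (p[0] == 0xAB && p[len - 1] == 0xAB);

    if (munmap(p, len) != 0)
        return -1;
    return ok ? 0 : -1;
}
```

The testing code can be as small as a main function containing assert(mmap_roundtrip(4096) == 0); the point is to check the result rather than assume it.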

If you have taken a systems or operating systems course earlier, or if you are taking such a course during the same semester, this would normally provide those prerequisites. If you have questions about whether your background fulfills the prerequisites, please see me.

Course Structure:

As the official course description states, this course "introduces students to research in the domain of high-performance computing." At its core, research is messier than the highly structured courses that one more typically sees, but it can also be very exciting to see things that no one in the world has ever seen before. For this reason, the course requires highly motivated students who will operate semi-autonomously, while reporting back to the class at regular intervals. The course will have a small to moderate enrollment with the opportunity for more personal attention.

There will be a warm-up project in January, followed by a term project. The warm-up project should be done individually, although discussions among students are encouraged both in class and outside of class. The term project will typically be done in teams. The ideal team size is two or three students.

The warm-up project gives you an opportunity to try out a research area, and also to bring yourself up-to-date on a selection of techniques that you may need for systems programming. (See Debugging and Other Systems Tricks for examples of some of the useful techniques, many of which will also be discussed in class.)

For the term project, students may choose to work either on their own research projects that they bring to the course, or on research questions that have evolved from the instructor's own research lab. Lectures will be customized to present background concepts, theory, and practical techniques of special value for the term projects as they develop. The instructor and his students will offer generous amounts of time to collaborate with the teams in small meetings.

In keeping with the course goal to take students to the forefront of research, there will be opportunities after the course is over to continue to collaborate with the goal of a competitive conference publication. However, to maintain a sharp line between the academic course and extracurricular work toward a publication, any interest in a collaboration toward a conference publication should be discussed only after the student has received his or her final course grade. Interested students should be forewarned that the effort to produce a competitive conference publication after the course is at least as great as the effort in the course itself.


Syllabus:

(Note that the overlap of certain weeks is intentional.)
WEEKS 1 and 2: Introduction to research topics; students choose mini-project
WEEKS 3 and 4: Continuing lectures on research topics; students complete mini-projects
WEEK 4: Students present results of mini-project (oral and written)
WEEKS 4 and 5: Students choose course project.
WEEKS 4 through 8: Lectures guided by needs of students for projects.
WEEK 6: Interim project reports by students (oral and written).
WEEK 9: Further interim project reports (oral and written).
WEEKS 10 through 12: Students lead discussions of lessons from research: results of research topics to date; potential for new research directions; interaction with other research results in the literature
WEEK 12: Final project presentations (oral and written)

Instructor Information: Office: 336 WVH (and also look in my High Performance Computing Lab, 370 WVH)

Office Hours: After class, and 4:20 - 5:30, Tuesday and Thursday; and also by appointment. If students are having problems with their code, they are encouraged to stay after class or arrange an appointment, so as to develop some code jointly with the instructor.

Text: There is no textbook. Internal documents and pointers to resources on the Web will be provided. Please also note the two reference books on systems programming listed at the end of this web page.

Grades:

Grades will be determined by the sophistication of the project, along with the quality of the reports to the class (both oral and written reports). Both individual and joint projects are possible. Students will be encouraged to first do a (warm-up) mini-project, followed by a full project that need not be on the same topic.

Research consists of exploration into the unknown. Since all research is speculative, research results consist both of positive and negative results. In geographical terms, the discovery of a new mountain range (a new barrier) is just as interesting as the discovery of a new river (a new exploration route).


Two Project Themes Evolving from Instructor's Research Laboratory:

For 2012, the course will be project-based, and will leverage the research of the High Performance Computing Laboratory. It will emphasize two research vehicles:
  1. DMTCP (Distributed MultiThreaded CheckPointing):
    DMTCP is an open source package freely available from Sourceforge and developed by a team originating in the High Performance Computing Laboratory. There is a video demo of it here. There is also a description of its internal architecture. It transparently checkpoints the state of a process or computation to disk. It does so in user space (no modification to the Linux kernel).
    dmtcp_checkpoint a.out # run a.out under checkpoint control
    rm ckpt_a.out-*.dmtcp # remove any old checkpoint image files
    dmtcp_command -c # checkpoint the current process
    dmtcp_restart ckpt_a.out-*.dmtcp # restart process from disk
    DMTCP transparently follows the creation of new threads, the forking of child processes, and the spawning of remote processes via ssh. It currently does not checkpoint certain processes involving X-Windows, the ptrace system call (e.g. gdb), or suspended processes (^Z). The research question is how well DMTCP can checkpoint common processes (without modifying the kernel), and how well it can be extended to novel applications (checkpointing GUIs using X-Windows, creation of a reversible debugger by checkpointing gdb, etc.). For example, an interesting novelty would be the ability to checkpoint some open windows of your current session, and carry them home with you on your USB key.
    • VISION: Checkpointing has seen three important uses: restarting long-running computations in the middle after a computer crash; load balancing and process migration; and more recently, restoring an earlier state for purposes of programming or debugging. DMTCP supports all three modes, but some of the most interesting research goals lie in the third area. Wouldn't it be nice to checkpoint an X-Windows application, and move it to another machine, and restart it? Can one do that with 3-D graphics (an extension to basic X-Windows)?

      Can one checkpoint a virtual machine such as user-space Qemu or Linux lguest? If one could do this, one could even think of running malware inside Windows inside a virtual machine. Why is this useful? We can checkpoint fast (in seconds, unlike the time for a virtual machine snapshot). If the malware detects that it is being spied on, we can back up to a previous checkpoint. If we are not sure what input to pass to the malware, we can restart from the checkpoint several times, and play "What if" games. Don't worry if you have never used Qemu or lguest. All concepts will be explained in a self-contained manner in the course.
    • EXAMPLE PROJECTS FOR 2012: Checkpointing single X11 apps (e.g., checkpointing Firefox: the ultimate bug report for just before it crashes); Checkpointing a user-space virtual machine; Infiniband support and porting projects from expensive Infiniband clusters to cheaper TCP/IP clusters for leisurely debugging.
  2. FReD (Fast Reversible Debugger):
    FReD is an open source reversible debugger. It implements such commands as reverse-step, reverse-next, and reverse-watch (a generalization of watchpoints).

    Suppose one is using a debugger and the variable x has the wrong value. When did it get the wrong value? Wouldn't it be nice to revert to an earlier state and examine x? One can with DMTCP, which immediately yields a reversible debugger. If we had checkpointed a debugging session 100 commands ago, and we wish to undo the last debugging command, then just restart the checkpoint image from 100 commands ago, and re-execute the first 99 debugging commands. Now, combine the last two ideas: I'm sure you've all seen how easily web browsers can crash. Wouldn't it be great to go back and find out at which statement they did something causing the crash?

    An old description of FReD can be found in these slides. While reversible debuggers have been available at least since 1970, they have seldom gained widespread use. Most recently, GDB versions 7.2 and later provide excellent support for reversible debugging using the target record command. GDB 7.3 is available in Ubuntu 11.10, and you will find a copy of it in the instructor's directory.

    Some strong points of the FReD reversible debugger are:
    (i) supports multi-threaded programs at near full speed;
    (ii) supports long-running programs (in contrast, GDB reversibility is not practical for programs running even a few seconds); and
    (iii) supports a novel feature, reverse expression watchpoints. (See the slides for a description of this feature.)

    FReD is in the last stages before a public release. An alpha copy of the code, along with two documents describing it, is available. The vision and example projects follow:
    • VISION: FReD provides a Python-based scripting language that allows one to directly call debugging commands that can manipulate the debugging history of a process. Using this platform, one can automatically search for the cause of bugs. For example, if a program dereferences a NULL pointer, FReD can bring one back in time within the GDB debugger to the point where the corresponding pointer variable was being set to NULL. If a buffer is allocated via malloc, and a program calls free twice on the same memory buffer, then FReD can bring one back to a point in time where the first call to free was made. This is done using reverse expression watchpoints. FReD can be extended to other debuggers besides GDB, and to other mechanisms for searching for the cause of a bug, beyond the examples above.
    • EXAMPLE PROJECTS FOR 2012: extend FReD to work with multi-threaded languages such as Cilk and OpenMP; add reversibility to the functional, lazy language Haskell; implement a reversible memory leak detector that will go back in time to the cause of the memory leak
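The checkpoint-and-replay idea described for FReD above (restart from an earlier checkpoint, then re-execute all but the last command) can be illustrated with a toy model. The sketch below is not FReD or DMTCP code: the struct, the function names, and the "command" arithmetic are all hypothetical, and the whole "process state" is a single integer so that the mechanism stays visible.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LOG 128

/* Toy "debugging session" (hypothetical): the entire state is one integer. */
struct session {
    unsigned long state;        /* current state */
    unsigned long ckpt_state;   /* state saved at the last checkpoint */
    unsigned long log[MAX_LOG]; /* commands executed since the checkpoint */
    size_t nlog;                /* number of logged commands */
};

/* A command changes the state in a way that is awkward to invert directly. */
static unsigned long apply_cmd(unsigned long state, unsigned long cmd)
{
    return state * 31u + cmd;
}

/* Save the current state and start a fresh command log. */
void session_checkpoint(struct session *s)
{
    s->ckpt_state = s->state;
    s->nlog = 0;
}

/* Execute one command and record it so that it can be replayed later. */
int session_exec(struct session *s, unsigned long cmd)
{
    if (s->nlog == MAX_LOG)
        return -1;
    s->state = apply_cmd(s->state, cmd);
    s->log[s->nlog++] = cmd;
    return 0;
}

/* Undo the last command: go back to the checkpoint, replay all but the last. */
int session_undo(struct session *s)
{
    size_t i, keep;

    if (s->nlog == 0)
        return -1;                      /* nothing to undo since the checkpoint */
    keep = s->nlog - 1;
    s->state = s->ckpt_state;           /* "restart" from the checkpoint image */
    for (i = 0; i < keep; i++)
        s->state = apply_cmd(s->state, s->log[i]);
    s->nlog = keep;
    return 0;
}
```

The cost of one undo grows with the number of commands since the checkpoint, which is one reason a real reversible debugger would want to take checkpoints at intervals rather than only once.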


Course Resources:

The instructor will cover any missing systems knowledge either in class, or one-on-one with individual students.

GDB and other UNIX resources: Some help files for UNIX and its compilers, editors, etc. are also available. In particular, the use of gdb (the GNU debugger) is especially encouraged as an important productivity tool.

Here is also one book that is very nice for learning systems programming concepts. Choose a chapter of interest, rather than reading it from front to back. The Rochkind book is an excellent book, with simple example source code showing useful programs. The book's web site has the table of contents and downloadable example source code. I also recommend the online book, "The Linux Kernel", below for an excellent overview of the kernel. The book by Robert Love provides more technical details on the Linux operating system, but it would only be needed for more unusual aspects of certain projects.


Mini-Projects:

Each of these mini-projects is described only in outline. Within the class, further details will be provided for those mini-projects of interest to the students. Do not worry about whether you produce working software. The goal is to understand. It is only incidental if you succeed with a "deliverable".
  1. DMTCP (Familiarize yourself now with the code.) Note especially the subdirectory dmtcp/doc with descriptions of many parts of the DMTCP internals. Use tools such as gdb for a deeper understanding. Read the QUICK-INSTALL file for more tips about DMTCP and its debugging tools. Then write an overview of the implementation of DMTCP. This is a paper-only project. If you take this on, it will require detailed descriptions of the functionality of the components of DMTCP. Below are some alternative mini-projects concerned more closely with producing code or pseudo-code.
    1. Use the module facility of DMTCP to build a new module. An example module might be wrappers for the functions malloc and free. The wrappers should allocate additional "guard regions" around the memory buffer. They should catch bugs like user code that writes beyond the end of allocated memory, or user code that frees a buffer twice. For interested students, there is the possibility of building on this mini-project to provide a novel, advanced memory leak detector in the FReD reversible debugger.
    2. Write a new DMTCP wrapper function for a new system call. (One suggestion is for man epoll.) There are examples of wrapper functions in trunk/dmtcp/src/pidwrappers.cpp and other files with names *wrappers.cpp. If you choose an advanced system call such as epoll, it is acceptable to work jointly with another student on a single mini-project.
    3. DMTCP currently has a bug in checkpointing emacs23 (version 23 of emacs). Investigate the cause of this bug. The primary responsibility is the diagnosis of the bug. You are not required to produce a bug fix. (The bug appears to occur in the context of screen. If you are interested in this project, tell me, and I will help you reproduce this bug.)
    4. Provide a paper design for a new MTCP module. This type of module is unrelated to the DMTCP modules described above. Currently, DMTCP has an option for using "gzip" to dynamically compress files on the fly. It is mostly in MTCP. This is too restrictive. The MTCP subdirectory should support arbitrary user-defined modules that are called by the MTCP checkpoint or restart routines. The modules may save a checkpoint image locally or on a remote machine; using gzip or a newer fast compression routine such as Snappy, LZO, FastLZ, QuickLZ, or other. You provide a paper design for the framework, and third parties build whatever module they want. As part of your paper design, consider options for third-party module writers to write to RAM and then fork a child process that saves on disk -- or a third-party module that mmaps RAM to disk and lets the operating system worry about the best optimization. You don't yet write code in this mini-project, but you should refer to existing code in the MTCP subdirectory.
  2. FReD (Familiarize yourself now with the code.) This primarily involves reading Python code, the C++ record-replay module for DMTCP, and a rough "black box" understanding of DMTCP. There is a limited introduction in this paper from the PLOS-11 workshop. In particular, read about the primitive reverse-xxx algorithms, and then study the code to see how reverse-watch works. Document your findings in a report.
  3. Dthreads is a novel idea for determinism: replace a multi-threaded process by multiple processes with shared memory. There is also a full paper on Dthreads. This provides for efficient deterministic multithreading. A well-known problem with reversible debuggers is that if you go back in time, and then execute forwards, do you arrive at the same place, or did the operating system produce a different thread schedule that changes the behavior? By adding determinism, Dthreads provides a "simple" idea for allowing FReD to easily implement determinism. Is such a combined implementation possible? How would it work? This project is purely a paper design.
  4. Linux LGuest is a virtual machine written in just 5000 lines of well-documented code! You can read and understand every line of the code. Lguest does not provide for snapshots. But using DMTCP, we could checkpoint the Lguest "process". This provides a fast checkpoint of an entire virtual operating system. Can this idea work? Try it. If you try it, probably some things go wrong. What goes wrong? Propose on paper some approaches to overcome these difficulties. There are opportunities to continue this work into the term project.
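To make the "guard region" idea from the malloc/free mini-project above more concrete, here is a minimal standalone sketch in C. It does not use the DMTCP module facility, and every name in it (guarded_malloc, guarded_free, the guard_status values, the magic numbers) is hypothetical. As a simplifying assumption, freed blocks are never returned to the real allocator, so that the header stays readable and a second free can be detected safely; a real tool would bound that quarantine.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_SIZE  16
#define GUARD_BYTE  0xA5
#define MAGIC_LIVE  0x4C495645u
#define MAGIC_FREED 0x46524545u

/* Block layout: [header][front guard][user data][back guard] */
struct guard_header {
    size_t size;      /* bytes requested by the user */
    unsigned magic;   /* MAGIC_LIVE or MAGIC_FREED */
};

enum guard_status { GUARD_OK = 0, GUARD_UNDERFLOW, GUARD_OVERFLOW, GUARD_DOUBLE_FREE };

static unsigned char *front_guard(struct guard_header *h)
{
    return (unsigned char *)(h + 1);
}

static unsigned char *user_data(struct guard_header *h)
{
    return front_guard(h) + GUARD_SIZE;
}

static unsigned char *back_guard(struct guard_header *h)
{
    return user_data(h) + h->size;
}

static int guard_intact(const unsigned char *g)
{
    size_t i;
    for (i = 0; i < GUARD_SIZE; i++)
        if (g[i] != GUARD_BYTE)
            return 0;
    return 1;
}

/* Allocate `size` user bytes surrounded by two guard regions. */
void *guarded_malloc(size_t size)
{
    struct guard_header *h = malloc(sizeof *h + 2 * GUARD_SIZE + size);
    if (h == NULL)
        return NULL;
    h->size = size;
    h->magic = MAGIC_LIVE;
    memset(front_guard(h), GUARD_BYTE, GUARD_SIZE);
    memset(back_guard(h), GUARD_BYTE, GUARD_SIZE);
    return user_data(h);
}

/* Check the guards and mark the block freed. The memory is deliberately
 * kept (quarantined) so that a later double free can still be detected. */
enum guard_status guarded_free(void *ptr)
{
    struct guard_header *h =
        (struct guard_header *)((unsigned char *)ptr - GUARD_SIZE) - 1;
    enum guard_status st = GUARD_OK;

    if (h->magic == MAGIC_FREED)
        return GUARD_DOUBLE_FREE;
    if (!guard_intact(front_guard(h)))
        st = GUARD_UNDERFLOW;
    else if (!guard_intact(back_guard(h)))
        st = GUARD_OVERFLOW;
    h->magic = MAGIC_FREED;
    return st;
}
```

For example, after char *q = guarded_malloc(8); q[8] = 'y'; the call guarded_free(q) reports GUARD_OVERFLOW. This sketch only notices the damage at free time; finding the statement that caused it is where a reversible debugger would come in.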

Main Projects:

The main projects are listed below. As always, you are also welcome to bring your own research project. The project may be a tool useful for High Performance Computing or a large computation itself.

We will also set up a course Wiki, where you will describe the status of your projects. The Wiki will also have a space for general issues/comments in supporting Roomy and DMTCP.

DMTCP (list of projects still being revised)

  1. Using suspend-to-disk mode to enable checkpointing of graphics programs (including OpenGL (3D graphics))
  2. MTT (MPI Testing Tool) for automated testing for Open MPI and DMTCP
  3. Checkpointing Qemu, first step towards checkpointing malware, and then running the malware reversibly
  4. Another option that may or may not work is the Linux lguest simple hypervisor
  5. Fast process migration (e.g. for servers):
    Step 1: Add an MTCP module capability
    Step 2: Write an MTCP checkpoint module that checkpoints to remote RAM
    Step 3: Restart on the remote machine
    Step 4: Tune it for speed
  6. DMTCP attach using new ptrace capability.
  7. Checkpoint the job control/suspend feature of your favorite shell (^Z)
  8. Have MTCP use a standard ELF linker script.
  9. Hijack/Attach to already running process and checkpoint (There is a question here about how to follow socket connections, if that process is already talking to other processes.)
  10. Thread race condition detector: A traditional race condition eventually causes a crash. But since it's a race condition, it doesn't always crash at that location. Experiment with different checkpoints, until you find a checkpoint location for which the process always crashes upon restart. Then modify MTCP to only allow a subset of the threads to resume, and keep the other threads suspended. By trial and error, discover which two threads have a race condition.
  11. Portable Linux Apps: DMTCP checkpoint images include any libraries that have been loaded. If the environment variable LD_BIND_NOW is set (set to any nonempty string), then the loader will resolve every dynamic symbol at program startup instead of lazily, so the libraries the program is linked against are already loaded and bound when the checkpoint is taken. This should enable one to copy a checkpoint image from Debian Linux to OpenSuse Linux to RedHat Linux to Ubuntu Linux to (etc.). Does this work? If not, what's needed to make it work?
  12. Incremental Checkpoint: DMTCP may want to keep multiple checkpoint images, so that it can return to any of several execution points in the past. This would normally require a lot of disk space. How does one efficiently store a diff between checkpoints? (This may be a somewhat easier project, for those who are looking for that. With other projects, one will often find that at the end of the semester, one has to report that some parts are still not working, and why. This project offers the opportunity of finishing most of the project, if there are no surprises.)
  13. Checkpoint valgrind or other binary rewriters. Examples of programs using dynamic binary translation include: Valgrind, Pin, and Paradyn/Dyninst. Currently, it appears that DMTCP cannot checkpoint these packages. Why? Can it be fixed?
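For the incremental checkpoint project in the list above, one possible starting point is a page-by-page comparison of two images, storing only the pages that changed. The sketch below is an assumption about how one might begin, not a description of DMTCP: the name diff_pages and the fixed 4096-byte page size (CKPT_PAGE) are hypothetical, and both images are assumed to have the same length and to be fully in memory.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CKPT_PAGE 4096   /* assumed page size for this sketch */

/* Compare two equal-length images page by page. The indices of differing
 * pages are stored in `changed` (at most `max_changed` entries); the return
 * value is the total number of differing pages, which may exceed
 * `max_changed`. */
size_t diff_pages(const unsigned char *old_img, const unsigned char *new_img,
                  size_t len, size_t *changed, size_t max_changed)
{
    size_t ndiff = 0;
    size_t off;

    for (off = 0; off < len; off += CKPT_PAGE) {
        /* The final page may be shorter than CKPT_PAGE. */
        size_t n = (len - off < CKPT_PAGE) ? (len - off) : CKPT_PAGE;
        if (memcmp(old_img + off, new_img + off, n) != 0) {
            if (ndiff < max_changed)
                changed[ndiff] = off / CKPT_PAGE;
            ndiff++;
        }
    }
    return ndiff;
}
```

An incremental image would then consist of the list of changed page indices plus the contents of those pages. Whether comparing full images is cheaper than tracking writes as they happen is one of the questions the project would need to answer.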

FReD (list of projects still being revised)

  1. Reversibly debugging Cilk programs
  2. Reversibly debugging OpenMP programs
  3. Extend the FReD reversible debugger to Java programs via 'jdb'
  4. Haskell and reversible debugging (Due to the functional lazy design of the language, Haskell has to worry about side effects in debugging)
  5. Combine FReD with DThreads (highly speculative, but very high impact)
  6. Memory leak detector: If a memory leak occurs later in the program, valgrind runs too slowly to easily find it. So, use a malloc debugger or your own malloc/free interceptor. A good one is the DUMA library (libduma) (Detect Unintended Memory Access), which is a newer replacement for the classic Electric Fence (libefence). This defines regions created through malloc. Late in the program, it will be easy to find a region of memory that is a memory leak (that no one ever touches again).

    Alternatively, using checkpoint/restart tricks, find the last time that anyone touched that memory segment. Report that line of code using a standard tool to convert between a line of assembly language and the source code line. To guarantee that no one ever uses that memory again, remove read and write permission from that region of memory and add a segfault handler to trap any accesses. Then automate the many checkpoint-restart cycles to find where the memory segment was last touched.
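The trapping step of the memory leak detector above (remove access to a region, then catch any access in a segfault handler) might look roughly like the following sketch. It assumes Linux, where mprotect is a plain system call and is commonly used this way inside a SIGSEGV handler even though POSIX does not list it as async-signal-safe. All names (watch_region_create, on_segv, and the globals) are hypothetical, and the sketch watches a single freshly mapped page rather than an existing malloc region.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static unsigned char *watched;             /* start of the protected region */
static size_t watched_len;                 /* its length in bytes */
static void *volatile last_fault_addr;     /* address of the trapped access */
static volatile sig_atomic_t fault_count;  /* number of trapped accesses */

/* SIGSEGV handler: record the access, then re-enable the region so that
 * the faulting instruction can be retried when the handler returns. */
static void on_segv(int sig, siginfo_t *info, void *ctx)
{
    unsigned char *addr = (unsigned char *)info->si_addr;
    (void)sig;
    (void)ctx;
    if (addr < watched || addr >= watched + watched_len) {
        /* A genuine crash elsewhere: restore default behavior and refault. */
        signal(SIGSEGV, SIG_DFL);
        return;
    }
    last_fault_addr = addr;
    fault_count++;
    mprotect(watched, watched_len, PROT_READ | PROT_WRITE);
}

/* Map one page, install the handler, and revoke all access to the page.
 * Returns the start of the page, or NULL on failure. */
unsigned char *watch_region_create(void)
{
    struct sigaction sa;
    long pagesz = sysconf(_SC_PAGESIZE);
    unsigned char *p;

    if (pagesz <= 0)
        return NULL;
    p = mmap(NULL, (size_t)pagesz, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, NULL) != 0) {
        munmap(p, (size_t)pagesz);
        return NULL;
    }

    watched = p;
    watched_len = (size_t)pagesz;
    if (mprotect(p, watched_len, PROT_NONE) != 0) {
        munmap(p, watched_len);
        return NULL;
    }
    return p;
}
```

After unsigned char *p = watch_region_create();, a write such as ((volatile unsigned char *)p)[10] = 42; traps once, the handler records the address, and the write then completes. This shows only the trap itself; locating the last access before a leak would still require repeating the experiment from checkpoints, as the project description suggests.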

Project Software:

DMTCP

If you have questions about DMTCP, please send e-mail to Kapil Arya and me. The username of Kapil Arya is his first name (all lower case) and: @ccs.neu.edu

DMTCP is available through the sourceforge web page. The easiest way to start is (in Linux) to type:

  svn co https://dmtcp.svn.sourceforge.net/svnroot/dmtcp/trunk dmtcp
  cd dmtcp
  ./configure
  [ OR:   ./configure --enable-debug ]
  make
  make check  [OPTIONAL]
Then read the QUICK-START file in the top-level dmtcp directory. From there start browsing the source code.

FReD

The FReD software will be made available soon.

Debugging and Other Systems Tricks:

If the suggestions are unclear, use "man" to find out more about the commands.
  1. pgrep -n a.out
    pkill -9 a.out
    where you replace a.out by the name of your binary. Also, consider pgrep with -o, and with no flags.
  2. GDB (beyond the basics): The use of gdb is essential. Note the introduction to UNIX tools.
    1. Invoke gdb as: gdb --args <COMMAND STRING>
      This form allows you to include the command and its arguments.
    2. Within gdb, try this ASCII graphics mode: ^Xa (control-X a)
    3. Include useful utility functions in your code. You can then do things like:
      (gdb) print my_utility_len_linked_list(mylist)
      This also works on system calls:
      (gdb) print lseek(4, 0, 1)
      This finds the current file offset for file descriptor number 4 without moving it, where the final 1 corresponds to SEEK_CUR, whose value is found from grep -R SEEK_CUR /usr/include . (A final 0, SEEK_SET, would instead reset the offset to 0.) Similarly, you can find out what fd 4 corresponds to:
      (gdb) set $pid = getpid()
      (gdb) eval "shell ls -l /proc/%d/fd", $pid
      (The eval command, available in GDB 7.2 and later, is needed because gdb does not expand its own variables such as $pid inside a shell command.)
      NOTE: getpid() can be called only after the gdb run (after the target is running).
    4. set follow-fork-mode child will follow the child process on fork. parent will have it follow the parent process. break fork will have GDB stop before executing the fork system call.
    5. If your program under GDB exits too soon, try (gdb) break _exit . (You can also break on other system calls in libc.)
    6. gdb a.out PID OR
      gdb a.out
      (gdb) attach PID
      (where the attach command is given within gdb); a convenient single command that finds the PID is: gdb a.out `pgrep a.out | tail -1`
    7. Consider delay loops in combination with "gdb attach". These are better than calls to 'sleep' since they do not access external libraries. For example, insert into your source code: {volatile int x = 1; while(x);} Then do "gdb attach" and (gdb) print x=0 (The volatile qualifier keeps an optimizing compiler from turning the loop into an unconditional infinite loop.) An alternate form to stop at the 5th occurrence is: {static volatile int x = 1; if (x++ >= 5) while(x);}
    8. To use gdb with assembly, consider x/10i $pc, and x/10i $pc-12, etc. A convenient version is: (gdb) display/5i $pc followed by stepi (si) or nexti (ni). To set a breakpoint in assembly, try: (gdb) break *0xbfdea000 for a breakpoint at the given address.
    9. To use gdb on multi-threaded programs, try:
      (gdb) info threads
      (gdb) thread <NUM>
      (gdb) thread apply all where
      (gdb) set scheduler-locking on
      NOTE: the last command (scheduler-locking) has the possibility of creating deadlock -- for example, if one thread is holding a low-level libc or C++ lock, and you try to advance a second thread whose execution requires the lock.
  3. strace -o myoutput a.out (trace system calls based on kernel API: /usr/include/asm/unistd*.h ; decide in advance if it should trace all child processes or not; the flags -f and -ff exist for tracing parent and all children)
  4. ltrace -o myoutput a.out (not as useful as strace, but sometimes interesting: trace library calls instead of system calls).
  5. ps auxw | grep a.out
  6. pstree -pu $USER or pstree -lu $USER (tree of processes and child process; names in curly braces are additional threads); Note idioms like: pstree -p | grep -C2 a.out
  7. top
  8. When your program runs too slowly, it might not be CPU-bound. Check man iostat
    man vmstat
    for disk/file I/O (Blk_read/s / Blk_wrtn/s), and paging to disk (si/so), respectively. A local disk (not on the network; SANs are different) can sequentially read or write (not both at once) roughly at a rate from 50 MB/s to 100 MB/s. If you are accessing files mostly and you don't see that bandwidth, then your program is not efficient. If you are paging to disk and you do see a rate anywhere near that bandwidth, then you are using too much RAM.
  9. watch -d ls -l /tmp/myfile.txt
    watch -d "pstree -l | grep -A1 `basename $SHELL`"
    (repeatedly execute COMMAND for watch -d COMMAND)
  10. Search for SUBSTRING in all files under the dmtcp directory: find dmtcp -type f | xargs grep SUBSTRING
    or alternatively: grep -r SUBSTRING dmtcp
    Don't forget: grep -C3 ...; grep -A5 ...; (and so on.)
  11. grep and google are your friends when searching for information. Besides "grep'ing" through source code, here is a grep trick you may not have seen:
    find /usr/share/man/man3 | xargs zgrep MYSTRING
    find /usr/share/man/man3 | xargs gzip -dc | grep -C3 MYSTRING
  12. less /proc/PID/maps
  13. ls -l /proc/PID/fd
  14. List open file descriptors: lsof | grep a.out If you discover an interesting socket with SOCKET_ID through 'lsof' or 'ls -l /proc/PID/fd', then find the other end of the socket: lsof | grep SOCKET_ID
  15. List environment variables of a process: cat /proc/PID/environ | tr '\000' '\n'
  16. nm a.out (or nm library.so) Note the form nm -o for printing out filenames. This can be useful with brute force strategies:
    nm -o /usr/lib/lib* | grep MY_SYMBOL
    (Also see discussion of readelf and objdump below.)
  17. strings -a a.out (for some binary, a.out)
  18. If there is a syntax error in a .h file, try: cpp -I. -Iother/path/to/include/files myfile.c and you can see the expanded C or C++ code with no #include files. This often makes it easier to find the syntax error.
  19. Sometimes, a macro was expanded, and it's hard to track down what happened. Try: cpp -dM -I. -Iother/path/to/include/files myfile.c
  20. Sometimes, it's not clear what include paths to use. In the above example, try: rm myfile.o; make myfile.o and copy the command line used by 'make' to build myfile.o. If 'make' uses libtool, you may also have to remove hidden directories with names like .libs .
  21. Get to know your loader. It executes before your executable file: man ld.so
  22. env LD_DEBUG=help a.out
    env LD_DEBUG=files a.out
    (and try other options to LD_DEBUG)
  23. ldd a.out (for some binary, a.out)
  24. Replace PID in following:
    pushd /proc/PID;
    ls -l exe;
    echo -n "cmdline: "; cat -v cmdline;
    echo ""; cat -v environ; echo "";
    popd
  25. DMTCP: ./configure --enable-debug; make clean; make and then run and look at /tmp/dmtcp-USER@HOST/jassert* files for your value of USER and HOST. Before your next test, rm -rf /tmp/dmtcp-USER@HOST .
  26. MTCP: Look at mtcp/Makefile and uncomment the line that adds to CFLAGS the flag: -DDEBUG
  27. More on gdb: Using gdb with C++ : For a C++ function with namespace, class, and signature (e.g.: dmtcp::myClass::foo(int, bool) ), try listing it first:
    list 'dmtcp::myC<TAB>
    It will autocomplete. Extend it, and type the final quote mark ('). Once you're sure you can list it, you can do things like set a breakpoint:
    b 'dmtcp::myC<TAB> (and complete it with quote mark as before).
  28. Using gdb with errno: In glibc, the global variable errno (see man errno) is a macro that is redefined to:
    *(int *)__errno_location()
    If you want to p errno within gdb, you will have to modify this into p *(int *)__errno_location() On 64-bit Linux, glibc seems to do something even more complicated, requiring a more complicated solution.
  29. If you look at gdb and some call frames on the stack have no information (only a hex address and "?"), then find out where the call frames come from. Look at the hexadecimal address. Then do:
    (gdb) shell cat /proc/PID/maps (where the PID of the current process is given by (gdb) info proc ). Alternatively:
    (gdb) info proc mappings
    Find which library or other memory segment the unknown hexadecimal address came from. Knowing which library was called is useful, but you may be able to find out more. If it comes from libc.so (or some other well-known library), then see the next two tips for how to get the library to show you its internal debugging information.
  30. (Continued) If you need a libc.so (or other well-known library) with debugging symbols, then:
    1. Install the package libc6-dbg. (The package name might differ for you. Also, this assumes you have root privilege on your Linux.) This will install a special libc.so in the directory /usr/lib/debug . Please note that the CCIS Ubuntu Linux machines already have a debugging version of libc installed, currently as /usr/lib/debug/libc-2.7.so .
    2. Next, do:
      env LD_LIBRARY_PATH=/usr/lib/debug dmtcp_checkpoint a.out (Presumably, after you checkpoint, the restarted a.out process will be using the pre-checkpoint libraries and hence the debugging versions. So, probably you don't need to use env LD_LIBRARY_PATH=/usr/lib/debug for the restart command. But if you're unsure, it doesn't hurt.)
    3. The a.out process above should now be using a debugging version of libc.so and perhaps other libraries. You can verify this by looking at /proc/PID/maps for your process. Now, in gdb, you will see the symbol information in the call frame and a source code file and line number.
    4. To read the corresponding source code, you can either download it from the main source code location: http://www.gnu.org/software/libc/libc.html#Availability (try to choose the same libc version, and note that the line numbers may be different in your Linux distro), or download the source package for your particular Linux distro.
  31. (Continued) [You can find a more conceptual version of this discussion here.] If gdb still shows some call frames with "?", and you have the full pathname of the library on disk, then you can often fix it as follows. (Once you understand this procedure, you may want to try the bin/gdb-add-symbol-file shell script found in DMTCP.)
    1. In /proc/PID/maps look up the full pathname of the library you need to load. The address of the call frame with missing information should be in the address range of that library.
    2. In gdb, read help add-symbol-file
    3. In gdb, type add-symbol-file FILE ADDR where FILE is the full pathname you identified in the /proc/PID/maps file. The ADDR will be the hexadecimal sum of:
      1. beginning of text segment address (text segment normally has r-x permission) in /proc/PID/maps; and
      2. the hexadecimal address of the .text section in FILE, as listed in the section headers (the Addr column in readelf, the VMA column in objdump) printed by either of the following commands: readelf -S FILE
        objdump -h FILE
    4. In the last step, the maps file provided the beginning address of the whole segment, but the binary library on disk contains many sections for a segment, and the .text section need not be the first section in the file. So, we must add the offset of the .text section, found by analyzing the binary library on disk.
    5. In gdb, a convenient way to add hexadecimal numbers is:
      p/x addr1 + addr2 where addr1 and addr2 are the two addresses we discussed. If those addresses are in hexadecimal, make sure to include 0x at the beginning of each hexadecimal number.
    6. Now do 'where' in gdb, and you should see full call frame information.
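The address arithmetic in tip 31 can be sketched in C as follows. This is only an illustration under stated assumptions: the function name add_symbol_file_addr is made up for this example, the maps line is a string that you copied by hand from /proc/PID/maps, and the second argument is the .text address that you read from readelf -S FILE. Like the tip itself, it assumes that the r-x line is the right starting point for the library.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not part of gdb or DMTCP).
 * Parse the start address from one /proc/PID/maps line, for example
 *   "b7e5c000-b7fb1000 r-xp 00000000 08:01 1234 /lib/libc-2.7.so",
 * and add the .text section address reported by readelf -S FILE.
 * Returns 0 if the line cannot be parsed or is not an r-x mapping. */
unsigned long add_symbol_file_addr(const char *maps_line,
                                   unsigned long text_section_addr)
{
    unsigned long start, end;
    char perms[5];

    assert(maps_line != NULL);
    if (sscanf(maps_line, "%lx-%lx %4s", &start, &end, perms) != 3)
        return 0;
    if (perms[0] != 'r' || perms[2] != 'x')
        return 0;   /* the tip asks for the text (r-x) segment */
    return start + text_section_addr;
}
```

Printing the result with printf("%#lx\n", ...) gives a value in the form that add-symbol-file FILE ADDR expects. The p/x command in gdb does the same addition without leaving the debugger.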
  32. If gdb is inconvenient, you can set a breakpoint directly in your program using a technique of Nikolay Igotti. In short, one defines a handler for SIGTRAP, forks a child, and uses ptrace on the child to set some of the child's x86 hardware debug registers using the POKEUSER option of ptrace and include/sys/user.h.
  33. The two commands readelf and objdump are useful for inspecting the contents of binary files. They are related to nm and strings, but readelf and objdump have many more options, including the ability to disassemble into assembly code, the ability to display section headers, etc. Scan the man pages quickly to see if something might be useful for you.
  34. For an assembly level listing as you do stepi in gdb, try objdump -S a.out > a.out.listing where a.out should be replaced by your binary. For a more verbose form, try one of: gcc -c -g -Wa,-alh,-L file.c > file.s
    gcc -c -g -Wa,-ahls=file.s file.c
    Variations of this can also produce assembly code that can be directly assembled by gcc or by as. For example, if you want to modify and re-compile the source code for libc.so, this is normally quite painful. A nice trick is to disassemble libc.so into assembly, and then cut or copy out the particular assembly routines that you want to assemble into a modified library.
  35. UNIX system calls, by Open Group (enter the system call in the search box). These are the clearest, most precise man pages for system calls you will ever find.
  36. Valgrind (Memory and leak detection utility); This is easy to use and surprisingly powerful.
  37. If you want to see the stack just before a segfault, a quick idea that may help is: catchsegv COMMAND_LINE
  38. Another method for catching segfaults that may give you more control is the glibc call backtrace: man backtrace It shows any C++ names in mangled form, but they are mostly readable (and utilities exist for demangling the names). Read the notes in man backtrace (for example, compile with gcc -rdynamic to get symbol names). Also, note man addr2line.
    Look at the example file, backtrace.c for this course.
    Also, for any call frames with no symbol name, look up the hex address in /proc/<PID>/maps. Use addr2line to translate hex addresses into line numbers in source code. (If it's a .so dynamic library, give it the offset: the hex address minus the beginning address of the library as shown by /proc/<PID>/maps.)
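As a minimal sketch of the glibc calls named in tip 38 (this is not the course's backtrace.c, and the function name dump_backtrace and the limit MAX_FRAMES are made up for this example), the following function writes the current call stack to a file descriptor. It assumes glibc, since execinfo.h is a glibc header.

```c
#include <assert.h>
#include <execinfo.h>

#define MAX_FRAMES 64   /* arbitrary limit chosen for this sketch */

/* Write the current call stack to file descriptor fd, one frame per line.
 * Returns the number of frames that were captured. */
int dump_backtrace(int fd)
{
    void *frames[MAX_FRAMES];
    int n;

    assert(fd >= 0);
    n = backtrace(frames, MAX_FRAMES);
    /* backtrace_symbols_fd writes directly to fd and does not call malloc. */
    backtrace_symbols_fd(frames, n, fd);
    return n;
}
```

One way to use it is to call dump_backtrace(2) from a SIGSEGV handler so that the stack goes to stderr before the process dies. Compile with gcc -rdynamic if you want function names rather than bare hex addresses in the output.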
  39. Understand addresses of symbols:
    1. Your executable will be loaded into RAM at an unknown base address. But once it is loaded, you can find the base address it was loaded to: less /proc/PID/maps
    2. Your executable file provides the offset from the base address at which you will find the beginning of: 'text', 'data' (and possibly 'bss' segment). Commands like 'objdump' and 'readelf' will show you this offset in the file. readelf -S a.out | grep '\.text '
    3. You may want to know the address of a particular symbol within the 'text', 'data', or 'bss' segment. Use 'nm', 'objdump', or 'readelf'. SHELL% nm a.out | grep main
      080483e4 T main
      (As described in the 'man' pages, the letters 't', 'T', 'd', 'D', 'b', 'B', and 'U' tell you whether the symbol is in text, data, or bss, or is undefined (presumably defined in a different library). Lower case means file-private, and upper case means a globally visible symbol. Look up __attribute__ ((visibility ("hidden"))) for declaring a symbol library-private: globally visible at the level of the .o file, and so usable by the other .o files linked into the same library, but not exported from the .so (library) file.)
    4. In gdb, if it fails to correctly show you the stack, maybe some memory was mmapped in (e.g. by DMTCP) that confused it. The command (gdb) help add-symbol-file along with the above information will allow you to tell gdb at what address in RAM the executable or library file on disk was loaded. The file on disk contains the symbol information.
    5. Some of these calculations can be automated for you by a DMTCP utility: (gdb) shell utils/gdb-add-symbol-file
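The lookup in /proc/PID/maps described in tip 39 can also be done from inside the program. The sketch below is an illustration only: the name find_mapping is made up, it reads /proc/self/maps (the calling process's own mappings) rather than an arbitrary PID, and it stops at the first space in a pathname.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: look up addr in /proc/self/maps.
 * On success, store the start address of the containing mapping in *start,
 * copy its pathname ("" for an anonymous mapping) into path, and return 1.
 * Return 0 if no mapping contains addr, or -1 if the file cannot be opened. */
int find_mapping(const void *addr, unsigned long *start,
                 char *path, size_t path_len)
{
    char line[4096];
    char name[4096];
    unsigned long lo, hi, target = (unsigned long)addr;
    int found = 0;
    FILE *fp;

    assert(start != NULL && path != NULL && path_len > 0);
    fp = fopen("/proc/self/maps", "r");
    if (fp == NULL)
        return -1;
    while (!found && fgets(line, sizeof line, fp) != NULL) {
        name[0] = '\0';
        /* Fields: start-end perms offset dev inode [pathname] */
        if (sscanf(line, "%lx-%lx %*s %*s %*s %*s %4095s",
                   &lo, &hi, name) >= 2 &&
            lo <= target && target < hi) {
            *start = lo;
            snprintf(path, path_len, "%s", name);
            found = 1;
        }
    }
    fclose(fp);
    return found;
}
```

The difference (unsigned long)addr - start is the offset within that one mapping. A library usually appears as several lines in the maps file, so for the offset that addr2line wants, subtract the start address of the library's first line instead.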
  40. An interesting Linux command: addr2line
    ('main' is on line 2 of tmp.c in the example below.)
    SHELL% nm a.out
    ...
    080483e4 T main
    SHELL% addr2line -C -e /tmp/a.out 080483e4
    /tmp/tmp.c:2
  41. In comparing two versions of a file, consider programs such as: kompare, kdiff3, meld, gvimdiff (or text-based vimdiff).
  42. To examine the Linux source code, try "google LXR" (Linux Cross Reference), which should lead you to lxr.linux.no . LXR is free software for hyper-linking large code bases. Another popular choice is Doxygen (available as a package in many Linux distros).
  43. To see which virtual memory pages are currently mapped to physical memory, see /proc/PID/pagemap .
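A short sketch of reading the pagemap file from tip 43 follows. It relies on the layout given in the Linux kernel documentation: one 64-bit entry per virtual page, with bit 63 set when the page is present in RAM. The function name page_is_present is made up for this example, and it reads /proc/self/pagemap, so it only inspects the calling process.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the page containing addr is present
 * in RAM, 0 if it is not, and -1 if /proc/self/pagemap cannot be read. */
int page_is_present(const void *addr)
{
    long page_size = sysconf(_SC_PAGESIZE);
    uint64_t entry;
    off_t offset;
    ssize_t got;
    int fd;

    assert(page_size > 0);
    fd = open("/proc/self/pagemap", O_RDONLY);
    if (fd < 0)
        return -1;
    /* One 8-byte entry per virtual page, indexed by virtual page number. */
    offset = (off_t)((uintptr_t)addr / (uintptr_t)page_size)
             * (off_t)sizeof entry;
    if (lseek(fd, offset, SEEK_SET) == (off_t)-1) {
        close(fd);
        return -1;
    }
    got = read(fd, &entry, sizeof entry);
    close(fd);
    if (got != (ssize_t)sizeof entry)
        return -1;
    return (int)((entry >> 63) & 1);   /* bit 63: page present */
}
```

A variable that was just written, such as a local on the stack, should report 1. A freshly mmapped region that has not yet been touched would normally report 0.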
  44. To find out information known to the BIOS: sudo dmidecode -t help
  45. When using google to get technical information, stackoverflow.com tends to have high-quality answers. Try those hits first.