Homework 0

(No submission)

Please check back regularly for updates!

I expect all students to have an initial familiarity with the more common Linux (UNIX) system calls in the C language before the first class of CS 5600. This homework is meant to help you reach that level of familiarity with the Linux/UNIX operating system and to get you started on future homeworks.

Setting up the workstation

  1. Have a Linux based system accessible. There are several options:
    1. Use the CCIS Linux machines. You can find more information about CCIS Linux support including remote access at: http://howto.ccs.neu.edu/howto/unixlinux/
    2. Install a Linux distribution on your personal computer. On a Windows computer, you can enable dual booting for this purpose. On Apple computers, dual booting is possible but slightly trickier (see option 3 below).
    3. Install a Linux virtual machine on your Windows or Mac computer. For Windows computers, one can use VirtualBox (freely available) or VMware Workstation (freely available for download via the Software Downloads link on the myNEU portal).
    The CCIS machines may occasionally be unavailable, so it is highly recommended to go with either option 2 or option 3.

  2. Familiarize yourself with the Linux shell and command line. There are several online tutorials and guides. See, for example, the tutorials listed at: http://howto.ccs.neu.edu/howto/unixlinux/learning-linux/

  3. If you are not using the CCIS machines, install the "gcc" compiler on your Linux installation or the Linux virtual machine.
    • For Debian/Ubuntu Linux distribution, use:
      sudo apt-get install build-essential
    • For Fedora, use:
      sudo dnf group install 'Development Tools'
    • For OpenSUSE, use:
      sudo zypper install -t pattern devel_basis

  4. Compile a simple "hello world" C program on Linux using the "gcc" compiler and execute it.

    You can compile the program using:
    gcc hello.c -o hello

    This generates an executable named 'hello' (without the -o option, gcc would name it 'a.out'), which can be executed using:
    ./hello

Setting up the MARS Simulator

Download the MIPS assembly language simulator as instructed on the MARS page and try to run a simple Fibonacci program on it.

Readings

  1. Read the man page for the mmap system call.
  2. Familiarize yourself with the more common Linux system calls in the C language: fork/execve/waitpid, open/close, read/write/dup, and malloc/free (the last pair are C library functions rather than system calls, documented in section 3 of the manual). This prerequisite also includes a knowledge of C pointers. In each case, you can find a detailed description by studying the man pages. For example, running man 2 open on the command line describes the call to open() in full. You can then experiment with these calls in your own C programs.

    There are several good tutorials on systems programming in C. Some possibilities are:
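
    Independent of any tutorial, the fork/execve/waitpid pattern named in item 2 can be sketched as follows. This is only a minimal illustration, not a reference implementation: the helper name run_and_wait and the choice to return -1 on any failure are assumptions made for this example.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ; /* environment passed on to the child */

/* Hypothetical helper: run the program at `path` with arguments `argv`
 * in a child process. Returns the child's exit status, or -1 if the
 * child could not be created or did not exit normally. */
int run_and_wait(const char *path, char *const argv[])
{
  pid_t pid = fork();
  if (pid < 0) {
    perror("fork");
    return -1;
  }
  if (pid == 0) {
    /* Child: replace this process image with the requested program. */
    execve(path, argv, environ);
    perror("execve"); /* reached only if execve failed */
    _exit(127);
  }

  /* Parent: wait for that specific child to finish. */
  int status;
  if (waitpid(pid, &status, 0) < 0) {
    perror("waitpid");
    return -1;
  }
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

    For instance, calling run_and_wait("/bin/sh", args) with args set to { "sh", "-c", "exit 3", NULL } should return 3, because the parent reads the child's exit status through waitpid.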

Familiarity with tools

  1. Familiarize yourself with make syntax. This will save you a lot of time and effort. In fact, no one should ever need to compile a file directly without using make. Several tutorials are available online.
  2. Familiarize yourself with clang-format to autoformat your code before submission.
  3. We will follow Mozilla's style guide for code formatting.

Exercise

On Linux systems, the /proc/PID/maps file contains the memory maps of the process with pid PID. A running process can inspect its own memory maps by using /proc/self/maps (see man proc for more information). For example, the following lists the memory maps of the cat process:
login-students.ccs.neu.edu:~> cat /proc/self/maps
00400000-0040b000 r-xp 00000000 fd:00 33561256              /usr/bin/cat
0060b000-0060c000 r--p 0000b000 fd:00 33561256              /usr/bin/cat
0060c000-0060d000 rw-p 0000c000 fd:00 33561256              /usr/bin/cat
01cca000-01ceb000 rw-p 00000000 00:00 0                     [heap]
7f178544d000-7f178bc79000 r--p 00000000 fd:00 33560742      /usr/lib/locale/locale-archive
7f178bc79000-7f178be30000 r-xp 00000000 fd:00 67184079      /usr/lib64/libc-2.21.so
7f178be30000-7f178c030000 ---p 001b7000 fd:00 67184079      /usr/lib64/libc-2.21.so
7f178c030000-7f178c034000 r--p 001b7000 fd:00 67184079      /usr/lib64/libc-2.21.so
7f178c034000-7f178c036000 rw-p 001bb000 fd:00 67184079      /usr/lib64/libc-2.21.so
7f178c036000-7f178c03a000 rw-p 00000000 00:00 0 
7f178c03a000-7f178c05b000 r-xp 00000000 fd:00 75669346      /usr/lib64/ld-2.21.so
7f178c214000-7f178c239000 rw-p 00000000 00:00 0 
7f178c259000-7f178c25a000 rw-p 00000000 00:00 0 
7f178c25a000-7f178c25b000 r--p 00020000 fd:00 75669346      /usr/lib64/ld-2.21.so
7f178c25b000-7f178c25c000 rw-p 00021000 fd:00 75669346      /usr/lib64/ld-2.21.so
7f178c25c000-7f178c25d000 rw-p 00000000 00:00 0 
7ffd98d8f000-7ffd98db0000 rw-p 00000000 00:00 0             [stack]
7ffd98dc8000-7ffd98dca000 r--p 00000000 00:00 0             [vvar]
7ffd98dca000-7ffd98dcc000 r-xp 00000000 00:00 0             [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0     [vsyscall]
Each line describes a distinct memory region in that process. The first line of the above output (00400000-0040b000 r-xp ...) can be interpreted as follows:
  • 00400000 represents the starting memory address in hex.
  • 0040b000 represents the ending memory address in hex. The end address is exclusive: the last byte of the region is at 0040afff.
  • r-xp represents the set of permissions on the memory region.
    • r: read
    • w: write
    • x: execute
    • p: private (copy-on-write); an s in this position would mean shared
    The given region has read and execute permissions, but doesn't have write permission.
  • 00000000 represents the offset within the file.
  • fd:00 represents the device id (major:minor).
  • 33561256 represents the inode number.
  • /usr/bin/cat is the name of the file from which this region was populated. If the name is missing, the region is anonymous.
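
One possible way to pull the first two fields out of such a line is sscanf with hexadecimal conversions. The sketch below is illustrative only: the struct MapsLine type and the parse_maps_line function are hypothetical names, and it assumes a 64-bit Linux system where unsigned long is wide enough to hold an address.

```c
#include <stdio.h>

/* Hypothetical record for the address range and permissions of one line. */
struct MapsLine {
  unsigned long start; /* first byte of the region */
  unsigned long end;   /* one past the last byte (exclusive) */
  char perms[5];       /* e.g. "r-xp" plus the terminating NUL */
};

/* Parse "start-end perms" from the beginning of one maps line.
 * Returns 1 on success and 0 if the line does not match. */
int parse_maps_line(const char *line, struct MapsLine *out)
{
  return sscanf(line, "%lx-%lx %4s",
                &out->start, &out->end, out->perms) == 3;
}
```

Applied to the first line of the listing above, this yields start = 0x400000 and end = 0x40b000, so the region size is end - start = 0xb000 bytes (45056 in decimal), and perms[1] == '-' shows that the region is not writable.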

Part A

Write a C program that takes the PID of a running process and prints the virtual memory size of that process. The virtual memory size of a process can be computed by adding up the sizes of all memory regions (end address - start address) listed in its /proc/PID/maps file. In addition, the program should print the total virtual memory size of the read-only regions and the total virtual memory size of the read-write regions.
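
As a starting point, the summation over all regions might be sketched as below. The function name sum_region_sizes, its return convention, and the line buffer size are assumptions for this illustration; it computes only the overall total and does not separate the read-only and read-write regions.

```c
#include <stdio.h>

/* Hypothetical helper: add up (end - start) for every line of a
 * maps-format file. Stores the total number of bytes in *total and
 * returns 0, or returns -1 if the file cannot be opened. */
int sum_region_sizes(const char *path, unsigned long long *total)
{
  FILE *fp = fopen(path, "r");
  if (fp == NULL)
    return -1;

  char line[4096 + 128]; /* assumed large enough for one maps line */
  unsigned long long start, end;

  *total = 0;
  while (fgets(line, sizeof line, fp) != NULL) {
    /* Each line begins with "start-end" in hexadecimal. */
    if (sscanf(line, "%llx-%llx", &start, &end) == 2)
      *total += end - start;
  }

  fclose(fp);
  return 0;
}
```

The path for a given PID can be built with snprintf(buf, sizeof buf, "/proc/%d/maps", pid) before calling the helper. Reading another user's process may fail with a permission error, which this sketch reports only as -1.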

Part B

Consider the following data structure:
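struct MemoryRegion
{
  void *startAddr;   /* start address of the region */
  size_t size;       /* size in bytes (end address - start address) */
  int isReadable;    /* nonzero if the region has the r permission */
  int isWriteable;   /* nonzero if the region has the w permission */
  int isExecutable;  /* nonzero if the region has the x permission */
};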
  1. Write a program that populates an instance of MemoryRegion for each memory region of a given PID (passed in via the command line, e.g., ./my_prog 4242) and writes each instance to a file. If no PID is provided, the program should read its own memory maps. The program should also print the total size of all read-only memory regions and the total size of all read-write regions.
  2. Write another program that reads the file you just created and prints all the information in a human-readable format. It should also print the total size of the read-only and read-write regions. Validate your programs by comparing the two outputs.
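
One way to move MemoryRegion records between the two programs is to write the raw struct bytes with fwrite and read them back with fread. The sketch below is only one possible approach: the helper names write_regions and read_regions are hypothetical, and it assumes both programs are compiled for the same machine and ABI, so that the struct layout is identical on both sides.

```c
#include <stddef.h>
#include <stdio.h>

struct MemoryRegion {
  void *startAddr;
  size_t size;
  int isReadable;
  int isWriteable;
  int isExecutable;
};

/* Hypothetical helper: write `count` records to `path` in binary form,
 * replacing any existing file. Returns 0 on success, -1 on failure. */
int write_regions(const char *path, const struct MemoryRegion *regions,
                  size_t count)
{
  FILE *fp = fopen(path, "wb");
  if (fp == NULL)
    return -1;
  size_t written = fwrite(regions, sizeof *regions, count, fp);
  if (fclose(fp) != 0 || written != count)
    return -1;
  return 0;
}

/* Hypothetical helper: read at most `max` records from `path`.
 * Returns the number of records read, or -1 if the file cannot be opened. */
long read_regions(const char *path, struct MemoryRegion *regions, size_t max)
{
  FILE *fp = fopen(path, "rb");
  if (fp == NULL)
    return -1;
  size_t n = fread(regions, sizeof *regions, max, fp);
  fclose(fp);
  return (long)n;
}
```

The startAddr value read back is only a number for display: it refers to the address space of the inspected process and must not be dereferenced by the reader.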