Part III

Input and Output

Anyone can write a badly organized program that reads a bunch of numbers or prints a few lines. In reality, though, reading and writing from a program means translating between a format that the rest of the computer can understand and the data representation of your chosen language. Fortunately, Scheme has a rich, expressive data sublanguage and, therefore, everyone can do a reasonable job, even with complex forms of information, as long as nobody else imposes constraints on us.

We therefore start this section with the way Schemers would deal with input and output when they are on their own. Then we move on to some standard situations, especially reading plain text lines and formatted text lines. We also cover Scheme's general model of thinking about input and output and how it applies to files and network devices in a uniform way.

1  Input and Output: The Scheme Way

In 1958 or some time around then, the designers of LISP faced the question of how to save their list data on permanent storage devices (disks). Given that they represented structured data as S-expressions internally, it was natural to invent a similar format for storing data externally to the program and to add procedures for storing and retrieving those pieces of data. This way of thinking about input and output is one of the things that Scheme inherited from LISP [10] (besides some basic syntax). We cover this way first because it is the best way of dealing with input/output when we're free to design our own format.

1.1  Input and Output via External S-expressions

Help Desk: input, output, read, write, display, newline

Roughly speaking, external S-expressions are parenthesized pieces of information. They are thus a useful compromise between human readability and program convenience. Humans can read and create this form of structured information easily. Programmers can turn this form of information into internal data (and vice versa) with a single command, which solves the translation problem once and for all.

An external S-expression is a convenient textual representation of a tree of conses in Scheme. The basic pieces of information are tokens of text; parentheses provide the structuring mechanism. Following the presentation of Scheme's internal S-expressions, we say that the atomic pieces of information are numbers (Scheme syntax), symbols (without quote), and strings.

a-symbol      10.20      "hello world"
The first is an external symbol, the second a number, and the third a string.

Placing a pair of parentheses around any number of S-expressions forms another S-expression:

(world hello)      ()      (life ((is) 1) "mess")
The first is a sequence of two symbols. The second one is the empty sequence of S-expressions and corresponds to empty. The third one consists of three S-expressions, including a string; its second component is a complex S-expression that contains a number.

Every S-expression naturally corresponds to a piece of Scheme data:

External S-expression        Scheme data
a-symbol                     'a-symbol
10.20                        10.20
"hello world"                "hello world"
(world hello)                (list 'world 'hello)
()                           (list)
(life ((is) 1) "mess")       (list 'life (list (list 'is) 1) "mess")
In general, each atomic sequence of keyboard characters that is neither a number nor a string is represented by a quoted symbol, and every other atomic S-expression (a number or a string) is represented by itself. To translate a parenthesized S-expression into Scheme, we add list to the right of ``('' and then translate the pieces inside in the same manner.
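The correspondence can also be checked mechanically. The following is a minimal sketch, assuming a Scheme with string ports (open-input-string exists in PLT Scheme and in R7RS systems), so that read consumes the text of a string rather than keyboard input. The helper name external->internal is made up for this illustration.

```scheme
;; String  -->  S-expression
;; to read one external S-expression from the text in s
;; (external->internal is a hypothetical helper, not part of Scheme)
(define (external->internal s)
  (read (open-input-string s)))

;; each of the following comparisons holds
(equal? (external->internal "a-symbol") 'a-symbol)
(equal? (external->internal "\"hello world\"") "hello world")
(equal? (external->internal "(world hello)") (list 'world 'hello))
(equal? (external->internal "()") (list))
(equal? (external->internal "(life ((is) 1) \"mess\")")
        (list 'life (list (list 'is) 1) "mess"))
```

Inside a Scheme string the quotes of an external string must be escaped with a backslash; in a file or at the keyboard they are typed as they appear in the table.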

To make this whole discussion concrete, let's look at a typical toy problem from a first-semester programming course:

Develop a program that reads a bunch of numbers and prints their average.
In part I, we have seen the average function several times. Here it is again:
;; (Listof Number)  -->  Number 
(define (average alon) (/ (apply + alon) (length alon)))

Hence, if we want our friends to use this function, we just tell them to open the file with DrScheme, click Execute, apply average to a list of numbers, and admire the result:

> (average (list 1 2 3))
2

While we, the authors, consider it desirable for everyone in the world to know Scheme, nobody should have to wait that long for Scheme programs to become useful. In short, we need to wrap some code around this function that reads a bunch of numbers and prints their average -- without much ado.

;;  -->  S-expression
;; reads an external S-expression and produces it in its internal form
(define (read) ...)

;; S-expression  -->  Void 
;; to print a Scheme value as an external S-expression on a single line
(define (write S) ...)

;; S-expression  -->  Void 
;; to print a Scheme value as an external S-expression on a single line, 
;; stripping a string's quotes
(define (display S) ...)

;;  -->  Void 
;; to end a line of printed matter
(define (newline) ...)

Figure 22:  Scheme's basic functions for input and output

Figure 22 specifies the four basic functions for I/O based on S-expressions. All four functions interact with the default input and output devices of a computer, which are the keyboard and the monitor in this day and age. The Scheme report suggests that write is for producing machine-readable output and display is for human-readable output.

Figure 23:  Reading numbers and printing their average

Since the goal is to create a program that our friends can use, we need to compose read with average and then display. The first function reads an external list of numbers, the second one computes the average, and the third one writes it to the standard output device: [11]

;;  -->  Void
;; to read an external list of numbers and to print its average 
(define (main)
  (display (average (read))))

If we now wish to use the average function, we type (main) in the Interactions window and then type in an external list. The screen shot in figure 23 illustrates how it all works. The box in the Interactions window frames the interaction between the user and the program. For this particular interaction, we typed the list (1 2 3), and main responded with 2, the average of the list. More generally, the application (read) intercepts what we type at the keyboard and converts it into Scheme data. Then the main function processes the data structure and, finally, display consumes the result and prints it to the monitor.

Although using main is slightly simpler for a user without DrScheme knowledge, it is still obscure because the user must still click Execute and enter inputs without being prompted. It is much more polite to prompt the user with a short phrase:

;;  -->  Void
;; to prompt the user for, and to read, an external list and 
;; to print its average on a new line 
(define (main-with-prompt)
  (display "Enter a list of numbers: ")
  (display (average (read)))
  (newline))

When we now evaluate (main-with-prompt) in the Interactions window, we get the dialog in the top half of figure 24. That is, the function prints the short phrase, and then waits for the user to enter a parenthesized S-expression on the same line. Once the user enters a complete external S-expression, the program reads the list of numbers, computes the average, prints it, and terminates the line.

Figure 24:  Input and output in DrScheme's Interactions window

;;  -->  Void
;; to continuously prompt the user for, and to read, an external list and 
;; to print its average on a new line 
(define (main-forever)
  (display "Enter a list of numbers: ")
  (let ([the-input (read)])
    (cond
      [(eq? 'x the-input)
       (display "Good bye.")
       (newline)]
      [else
       (display (average the-input))
       (newline)
       ;; start over with a new prompt
       (main-forever)])))

Figure 25:  An average-of-numbers computer

At this point, the user must still know how to start a DrScheme program. Ideally, the user should be able to walk up to a computer and just use it as an ``average computing machine.'' That is, someone should start a program once and that program should compute averages until the user gets tired of it. Figure 25 contains the complete function for this purpose; the bottom half of figure 24 illustrates how this function works.

The function uses (a trivial form of) generative recursion (see section 10.2). Following a rather common convention, the program prompts the user for data and deals with two different kinds of user inputs: the symbol 'x and a list of numbers. If it reads 'x, the program exits. Otherwise it computes an average and starts over.

Bad Inputs:  Following the spirit of HtDP, the programs we have written so far did not deal with bad inputs. To deal with bad inputs, say, a list that contains non-numbers, we can either write a checked version of average, or we can set up an exception handler that deals with the exceptions that apply or + throw when given bad inputs.

Specifically, when apply is given a non-list or when + finds a non-number in a list, Scheme raises an exn:application:type exception. Using an exception handler, average can catch this kind of exception and inform the user about the problem:

;; (Listof X)  -->  Number
;; effect: signal an error when given a non-number
(define (average alon)
  (with-handlers ([exn:application:type?
                   (lambda (x)
                     (display "expected: list of numbers; found: ")
                     (display (exn:application-value x))
                     (error "in average"))])
    (/ (apply + alon) (length alon))))

Here the revised version of average prints a string that specifies what kind of input it expected and what kind of problem data it found and then stops the computation. 
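The other option mentioned above, a checked version of average, can be sketched without any exception machinery. The names list-of-numbers? and checked-average are assumptions made for this illustration, and the error message format is only one possibility. The sketch also rejects the empty list, for which the division in average has no meaningful result.

```scheme
;; Any  -->  Boolean
;; to determine whether v is a non-empty list that contains only numbers
;; (list-of-numbers? is a hypothetical helper)
(define (list-of-numbers? v)
  (and (pair? v)
       (list? v)
       (let loop ([l v])
         (or (null? l)
             (and (number? (car l)) (loop (cdr l)))))))

;; Any  -->  Number
;; effect: signal an error unless v is a non-empty list of numbers
(define (checked-average v)
  (if (list-of-numbers? v)
      (/ (apply + v) (length v))
      (error "checked-average: expected a non-empty list of numbers; found:" v)))

;; produces 2
(checked-average (list 1 2 3))
```

With this version, an input such as (1 two 3) or the symbol x is rejected before apply or + ever see it, so no handler is needed.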

1.2  External S-expressions and Files

Help Desk: with-input-from-file, with-output-to-file

Now suppose you have become the head grader for the famous HtDP introductory programming course. One of your chores is to produce grade-point averages for all of the students in class. To do that, you keep track of homework grades in a file, grades.dat, and since Scheme is the only language you know, you keep the grades in an S-expression format:

(("Adam" 78 88 69 66)
 ("Brad" 88 87 86 22)
 ("Cath" 99 88 88 90)
 ("Dave" 77 78 77 78)
 ("Fawn" 90 89 81 60)
 ("Gege" 67 78 81 85) 
 ("Zoro" 33 44 55 66))

Perhaps you want to be able to modify the data with a plain text editor. Or perhaps you are thinking of expanding the program so that anyone on the course staff can fire it up and enter grades without knowing the exact format of the grade database. In any case, you're now facing the problem of getting the data from the file into the program.

;; String ( -->  X)  -->  X
;; to turn the file f into the default input device 
;; during the evaluation of (thunk) 
(define (with-input-from-file f thunk) ...)

;; String ( -->  X) [Symbol]  -->  X
;; to turn the file f into the default output device 
;; during the evaluation of (thunk) 
;; The optional symbol specifies what to do when the file exists. 
(define (with-output-to-file f thunk) ...)

Figure 26:  Scheme's functions for redirecting file input and output

A program like the one that averages a bunch of numbers doesn't solve our problem. Copying and pasting the grade file to and from DrScheme is just too cumbersome. Instead, we want the program to read and write directly to this file. Put differently, we don't want read to get inputs from the standard input device and we don't want write (or display) to put outputs onto the standard output device.

For this purpose, Scheme provides functions that redirect read, write, and other input/output functions. Figure 26 specifies the two new functions. Both consume a string, which should be the name of a file, and a thunk, a function of no arguments. They use the string to open a file for reading or writing as the input or output for the thunk. When the thunk returns a value, the redirection primitives close the connections to the files, re-establish the standard I/O devices, and finally return the value of the thunk. Thus, the expression

(with-input-from-file "grades.dat" read)

forces the read expression to read a parenthesized S-expression from the file and to turn it into an internal S-expression:

(list (list "Adam" 78 88 69 66)
      (list "Brad" 88 87 86 22)
      (list "Cath" 99 88 88 90)
      (list "Dave" 77 78 77 78)
      (list "Fawn" 90 89 81 60)
      (list "Gege" 67 78 81 85) 
      (list "Zoro" 33 44 55 66))
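To experiment with this kind of redirection without creating a file, one can mimic it with a string. The sketch below assumes that current-input-port is a parameter that parameterize can rebind, as in PLT Scheme and R7RS. The name with-input-from-text is hypothetical, and the code is not meant to show how with-input-from-file itself is implemented.

```scheme
;; String ( -->  X)  -->  X
;; to turn the text in s into the default input device
;; during the evaluation of (thunk)
;; (with-input-from-text is a hypothetical helper for experiments)
(define (with-input-from-text s thunk)
  (parameterize ([current-input-port (open-input-string s)])
    (thunk)))

;; produces (list (list "Adam" 78 88) (list "Brad" 88 87))
(with-input-from-text "((\"Adam\" 78 88) (\"Brad\" 88 87))" read)

;; the redirection lasts for the whole thunk, so two reads consume
;; two consecutive S-expressions: produces (list 'x (list 1 2 3))
(with-input-from-text "x (1 2 3)"
  (lambda ()
    (let* ([first-item (read)]
           [second-item (read)])
      (list first-item second-item))))
```

As with the file version, the value of the whole expression is the value of the thunk, and read goes back to the standard input device afterwards.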

;;  -->  (Listof (list String Number))
(define (gpas) (compute-gpas (with-input-from-file DB read)))

;; (Listof (cons String (Listof Number)))  -->  (Listof (list String Number))
;; to compute the homework gpa for each record in g
(define (compute-gpas g)
  (map (lambda (a-record) (list (first a-record) (average (rest a-record)))) g))

;; (Listof Number)  -->  Number
(define (average alon) (exact->inexact (/ (apply + alon) (length alon))))

;; Constants: 
(define DB "grades.dat")

;; run program run:

Figure 27:  Computing grades
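To see what compute-gpas produces without touching the file system, it can be applied to a small in-memory database. The definitions below repeat those of figure 27, with car and cdr standing in for first and rest so that the sketch runs in any Scheme; sample-db is a made-up name holding two records from the sample file.

```scheme
;; (Listof Number)  -->  Number
(define (average alon)
  (exact->inexact (/ (apply + alon) (length alon))))

;; (Listof (cons String (Listof Number)))  -->  (Listof (list String Number))
;; to compute the homework gpa for each record in g
(define (compute-gpas g)
  (map (lambda (a-record)
         (list (car a-record) (average (cdr a-record))))
       g))

;; two records from the sample file (sample-db is a hypothetical name)
(define sample-db
  (list (list "Adam" 78 88 69 66)
        (list "Zoro" 33 44 55 66)))

;; produces (list (list "Adam" 75.25) (list "Zoro" 49.5))
(compute-gpas sample-db)
```

Because average converts its result with exact->inexact, the gpas come out as decimal numbers rather than exact fractions such as 301/4.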

To make sure that others know what kind of file formats your program expects or produces, it is good practice to specify the format in something like a data definition. For example, the data in a grade database could be specified in a README file as

  The program expects grades.dat to contain a DB.
  A DB is 
   (GradeRecord ...)
  A GradeRecord is 
   (String Grade ...)
  A Grade is a 
   number between 0 and 100 (inclusive)

The difference between a data definition and a file format definition (or external data definition) is that the former specifies constructors for compound data and the latter only specifies where to put the parentheses. It is then up to the programmer to translate the information into constructed data.
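A file format definition like this one translates almost line by line into predicates that a program could apply to whatever read produces. The following is a sketch under that assumption; the names grade?, all?, grade-record?, and db? are invented for this illustration and do not appear in the grading programs.

```scheme
;; Any  -->  Boolean
;; to determine whether v is a Grade: a number between 0 and 100 (inclusive)
(define (grade? v)
  (and (real? v) (<= 0 v 100)))

;; (Any  -->  Boolean) (Listof Any)  -->  Boolean
;; to determine whether every item on l satisfies p?
(define (all? p? l)
  (or (null? l)
      (and (p? (car l)) (all? p? (cdr l)))))

;; Any  -->  Boolean
;; to determine whether v is a GradeRecord: (String Grade ...)
(define (grade-record? v)
  (and (pair? v)
       (list? v)
       (string? (car v))
       (all? grade? (cdr v))))

;; Any  -->  Boolean
;; to determine whether v is a DB: (GradeRecord ...)
(define (db? v)
  (and (list? v) (all? grade-record? v)))

(db? (list (list "Adam" 78 88) (list "Brad")))  ; holds
(db? (list (list 'Adam 78 88)))                 ; fails: the name is a symbol
(db? (list (list "Adam" 101)))                  ; fails: 101 is not a Grade
```

A program could apply db? to the result of reading the file and stop with an error message if the file does not follow the advertised format.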

;;  -->  Void
;; to add a grade to each record in DB via a query program
(define (add-grades-to-all)
  (let ([new (map add-one-grade (with-input-from-file DB read))])
    (with-output-to-file DB (lambda () (write new)) 'replace)))

;; GradeRecord  -->  GradeRecord
;; read and add one grade to a record 
(define (add-one-grade x)
  (let ([name (first x)])
    (display "Please enter a grade for ") (display name) 
    (display ": ") (flush-output)
    (cons name (cons (read-grade) (rest x)))))

;;  -->  Grade
;; to read a homework grade (a number between 0 and 100)
(define (read-grade)
  (let ([try (read)])
    (cond
      [(and (number? try) (<= 0 try 100)) try]
      [else
       (display "expected: a number between 0 and 100; found: ") (display try)
       (newline)
       ;; ask again until the input is a valid grade
       (read-grade)])))

;; Constants
(define DB "grades.dat")

;; run program run

Figure 28:  Managing grades

Let's look at a natural extension of the example. In addition to computing the gpas, you also need a program that adds one homework grade to each student's record. We already know how to read the database from the file. Once we have the database, we need to construct a new record for each student. This suggests an iteration with map over the database. The function that is mapped over the database records must prompt the instructor for the grade of the student, read a grade, and splice the grade into the record between the name and the list of numbers.
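The splicing step can be studied on its own, separated from prompting and reading. The sketch below uses a hypothetical helper, splice-grade, that receives the grade as an argument instead of reading it from the keyboard.

```scheme
;; GradeRecord Grade  -->  GradeRecord
;; to insert the grade g between the name and the existing grades of r
;; (splice-grade is a hypothetical helper)
(define (splice-grade r g)
  (cons (car r) (cons g (cdr r))))

;; produces (list "Adam" 88 78 88 69 66)
(splice-grade (list "Adam" 78 88 69 66) 88)
```

The newest grade therefore ends up right after the name, in front of all older grades.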

The code in figure 28 accomplishes all this. It employs with-input-from-file to redirect the read for the database to the file grades.dat. Then it maps add-one-grade over the records in a context where read uses the default input device (the keyboard), and finally it writes the new data to the file grades.dat. The redirection of write to the file uses the 'replace option, because the file must be overwritten.

Putting compute-gpas and add-grades-to-all together, we get a simple but effective pair of programs for administering homework grades. One adds homework grades to the database, the other one computes the average homework grade. They exchange data through the file system, which means that what one program writes the other one must read. For that reason, the programs use write rather than display to ensure that other Scheme programs properly recognize the names as strings. Of course, this plan also assumes that humans can read the output of write, even if it is not as pretty as the output that display produces.
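The reason for preferring write can be observed by printing a record into a string and reading it back. The sketch assumes string ports (open-output-string and get-output-string, as found in PLT Scheme and R7RS) and relies on write and display accepting a port as an optional second argument; round-trip is a hypothetical helper.

```scheme
;; (S-expression Output-Port  -->  Void) S-expression  -->  S-expression
;; to print v with printer into a string and to read the text back in
;; (round-trip is a hypothetical helper)
(define (round-trip printer v)
  (let ([out (open-output-string)])
    (printer v out)
    (read (open-input-string (get-output-string out)))))

(define a-record (list "Adam" 78 88))

;; write keeps the quotes, so the record survives unchanged: this holds
(equal? (round-trip write a-record) a-record)

;; display strips the quotes, so the name comes back as a symbol:
;; both of these hold
(symbol? (car (round-trip display a-record)))
(equal? (cdr (round-trip display a-record)) (list 78 88))
```

A program that expects a string in the first position of each record would thus fail on a file produced with display.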

1.3  Pretty Printing

Help Desk: pretty print, pretty-print


Your professor runs your program and then opens grades.dat with some text editor to find this mess:

(("Adam" 88 78 88 69 66) ("Brad" 100 88 87 ... 

That is, running the add-grades-to-all program flattened the file into one long line of text. The file has become almost unmanageable for a human user.

;; S-expression  -->  Void
;; to pretty-print the value v using the same printed form as write, but with
;; newlines and whitespace inserted to make the output well-formatted
(define (pretty-print v) ...)

;; S-expression  -->  Void
;; same as pretty-print, 
;; but using display's style rather than write's
(define (pretty-display v) ...)

Figure 29:  Functions for pretty-printing S-expressions

What you want to do now is to print the S-expression in a pretty, readable shape. Naturally, Scheme comes with a library for PRETTY PRINTING S-expressions. The most useful functions in this library are pretty-print and pretty-display, which are specified in figure 29. Using pretty-print, we can easily format the output in a reasonable manner with one change to add-grades-to-all:

;;  -->  Void
;; to add a grade to each record in DB via a query program
(define (add-grades-to-all)
  (let ([new (map add-one-grade (with-input-from-file DB read))])
    (with-output-to-file DB (lambda () (pretty-print new)) 'replace)))

Using this revision, grades.dat remains in the original format and is thus readable by both programs and professors.

The other function from the library, pretty-display, also writes S-expressions in a pretty format but uses display instead of write to write out atomic pieces. As a result, the printed output is usually what people want to read, but the read function can no longer input this data in the original form. Using this function for the gpa program, like this,

;;  -->  Void
;; to pretty-display the homework gpa of each student
(define (gpas) (pretty-display (compute-gpas (with-input-from-file DB read))))

produces an S-expression for the class gpa that everyone can read.

Historically, pretty-printing was important for printing programs, which of course are also in S-expression form. People soon realized that it is just as useful for ordinary data. Pretty-printing libraries therefore provide many ways of specifying the properties of the output that the pretty-printing functions produce. For more information, consult the help desk and experiment with the tools that the library provides.

[10] And we give credit where credit is due.

[11] Calling this function main is a convention, not a necessity, in Scheme.