Data Definitions, Functions and "The Design Recipe"

The purpose of this lab is to both introduce the "Design Recipe", a step-by-step prescription for designing programs, and gain some practice using it. For a more thorough treatment you are encouraged to look through How to Design Programs by, Felleisen, Findler, Flatt and Krishnamurthi.

The Design Recipe

The design of every procedure involves the following steps:
  1. Contract
  2. Header
  3. Purpose Statement
  4. Examples
  5. Procedure body
  6. Tests

Contract

A contract specifies the set of values that the procedure expects as input and the set of values that the procedure guarantees to produce as outputs. For example:

          ;; sales-tax : Number * Number -> Number
        

specifies a contract for the function sales-tax which takes two arguments both Numbers and produces a Number as an answer. The notation is borrowed from mathematics

   f : Number × Number → Number.

We use the name Number to denote the set of values made up of integers, reals etc. Some other examples are String, Char, Boolean, Symbol, SchemeValue. Later on in this lab we will also see how to define such a name and the set of values it denotes.

Header

The header is simply the name and arguments of the procedure. Typically a comment line that shows how to use the function we are about to define giving a name to the function (i.e. sales-tax) and its argument(s) (i.e. amt, perc). We use ellipsis (...) to show that the function is incomplete. Each step in the design recipe will help us to fill these in.
          ;; sales-tax : Number * Number -> Number
          ;; usage : (sales-tax amt perc)

          (define sales-tax
            (lambda (amt perc)
              ...))
        

Purpose

Well, what is this function expected to do? Keep it simple, this should be a one (or two) line description explaining what the function should produce. In this description we can refer to argument names defined at the previous step of the Design Recipe.
          ;; sales-tax : Number * Number -> Number
          ;; usage : (sales-tax amt perc)
          ;; produces : sales tax to be paid on given amt (amt*perc)

          (define sales-tax
            (lambda (amt perc)
              ...))
        

Examples

Before typing in the body of your function, write down examples of how this function can be used, giving values for input(s) and the expected values for output(s). Make sure that you create enough examples that exercise most (if not all) of your code. These examples will later become tests for your function.
          ;; sales-tax : Number * Number -> Number
          ;; usage : (sales-tax amt perc)
          ;; produces : sales tax to be paid on given amt (amt*perc)

          ;; examples: 
          ;; (sales-tax 100 0.1) ==> 10
          ;; (sales-tax 100 0.3) ==> 30
          ;; (sales-tax 230 0.15) ==> 34.5

          (define sales-tax
            (lambda (amt perc)
              ...))
        

Procedure Body

At these point we have all the information we need to complete the function's definition. We have to write a function that satisfies the contract, can be called as specified by the header, it does exactly what the purpose states and produces the expected output for the examples that we have written down.
          ;; sales-tax : Number * Number -> Number
          ;; usage : (sales-tax amt perc)
          ;; produces : sales tax to be paid on given amt (amt*perc)

          ;; examples: 
          ;; (sales-tax 100 0.1) ==> 10
          ;; (sales-tax 100 0.3) ==> 30
          ;; (sales-tax 230 0.15) ==> 34.5
          
          (define sales-tax
            (lambda (amt perc)
              (* amt perc)))
        

Test

The examples that we have can be turned into tests. Run each example and make sure that the function returns the expected results. (For the purpose of this course a test-harness will be provided to which you will have to add your test cases and test your code)

Data Definitions

For any datatype that you are considering (or designing) you need to be clear on how to
So with each datatype we associate the following functions
These functions together form the protocol that programs have to use when manipulating instances of a datatype. Breaking this protocol and accessing internal data through other means is a recipe for disaster!

Before we can even talk about a solution we need to be clear about what kind of data we are working with. For example

          ;;  A day-of-the-week is one of:
          ;;     --  'monday
          ;;     --  'tuesday
          ;;     --  'wednesday
          ;;     --  'thursday
          ;;     --  'friday
          ;;     --  'saturday
          ;;     --  'sunday
         

The data definition above defines day-of-the-week, a kind of value that we can use in our contracts much like String, Number etc. An instance of day-of-the-week can be one of the choices listed e.g. 'monday, 'tuesday etc. These choices are all Symbol values.
A function that takes as input a datatype with multiple variants must check and handle each variant. Along with the datatype a set of predicates must be provided to distinguish each variant. In this case we can use scheme's predicate eq? to test a day-of-the-week. We also use Scheme's cond with a datatype's predicates to test for each variant.

[Exercise 1] Design a function weekend?, that takes in a day-of-the-week and produces #t if the day given represents a weekend and #f otherwise.

Recursive Datatypes

These are datatypes whose definition is recursive, i.e. refers back to itself. A very common datatype of this kind is list.
          ;; An Los (list of strings) is one of: 
          ;; -- '()
          ;; -- (cons String Los)
        
The second variant is made up of two parts, a String and another Los, the same Los that we are trying to define! This circularity in its definition is important and it will be reflected in the functions that consume an Los. For instance lets design a function that takes an Los and produces a String made up of all the elements in Los.
          ;; los->string: Los -> String 
          ;; usage: (los->string los) 
          ;; produces: appends the elements of los into one big string

          ;; examples: 
          ;; (los->string '("she" "sells" "seashells")) ==> "shesellsseashells"
          ;; (los->string '()) ==> ""
          ;; (los->string '("ab" "ba")) ==> "abba"

          (define los->string
            (lambda (los)
              (cond 
                [(null? los) ""]
                [else (string-append (car los)
                                     (los->string (cdr los)))])))
        
Note how the structure of the code mimics the structure of the data definition. The first cond-clause deals with the first variant of Los. The second cond-clause deals with the second variant of Los. More importantly, notice the recursive call to los->string that deals with the recursive datatype (los->string (cdr los)).

[Exercise 2] Design a function that takes an Los and produces the longest string in the Los. In the case were your function is given the empty list, you should output #f. [Have a look at the scheme function string-length]

[Exercise 3] Design a function that takes an NELos and produces the smallest string in the NELos.
    ;; An NELos (non empty Los) is 
    ;; - (cons String '())
    ;; - (cons String NELos)
    

[Exercise 4] Homework is defined as follows
          ;; A Homework is (list Student Grade)
          ;; where Student is a String
          ;;       Grade is a Number
          ;; (list s n) creates a Homework for student s with a grade n.
          ;; (list "john" 90) represents John's homework grade of 90 
Define a datatype Loh to represent a list of Homework.

[Exercise 5] Design a function that consumes an Loh and produces the average of all grades in the Loh.

[Exercise 6] Design a function that consumes an Loh and a student s and produces a list of all Homeworks for student s.

[Exercise 7] Write a new data definition for Homework-by-Student that contains a student and a list of grades, representing a student with a list of grades received for each part of a homework. Design a function that takes a Homework-by-Student and produces the students average grade.

[Exercise 8]Define a datatype LoHbS that represents a list of Homework-by-Students. Design a function called highest-grade that consumes an LoHbS and produces the maximum grade.

[Exercise 9] Write a new data definition for Homework-by-Grade that contains a list of Students and a grade, representing all the students that received the same grade for this homework. Design a function that consumes a Homework-by-Grade and a Student and produces the given student's grade.

[Exercise 10] Define a datatype LoHbG that represents a list of Homework-by-Grades. Design a function that takes in an LoHbG and a Student and produces a list with all the grades obtained by the given student.

A Helping Hand from EoPL

The EoPL language inside DrScheme provides define-datatype that helps define datatypes both with and without multiple variants. Along with define-datatype, the EoPL language level also provides cases, which can be used to decompose instances of a define-datatype definition.

Lets define an Assignment with define-datatype

          ;; An Assignment is (hw-record Student Grade)
          ;; where Student is a String 
          ;;       Grade is a Number.
          ;; (hw-record s n) creates an Assignment for Student s with Grade n

          (define-datatype assignment assignment? 
            [hw-record (name string?)
                       (grade number?)])
        
So what does the above code tell Scheme to do? Well, we want to define a datatype called assignment along with a predicate, assignment?, that will check whether a Scheme value is an assignment or not. This new datatype will have one variant (enclosed in [ ]) which we can construct with a constructor whose name is hw-record. This constructor takes 2 arguments: In order to access sub elements of a value created with define-datatype we have to use cases. For example here is a function that consumes an Assignment and produces a Homework.
 
      ;; assignment->homework : Assignment -> Homework
      ;; usage : (assignment->homework a) 
      ;; produces: an instance of Homework with the same values as the 
      ;;           given assignment
  
      ;; examples: 
      ;; (assignment->homework (hw-record "john" 90)) ==> ("john" 90)
      ;; (assignment->homework (hw-record "" 0)) ==> ("" 0)
      ;; (assignment->homework (hw-record "mary" 100)) ==> ("mary" 100)
  
      (define assignment->homework 
        (lambda (a) 
          (cases assignment a 
            [hw-record (n g) (list n g)])))
    
The keyword cases is followed by the name of the datatype that we are about to decompose assignment, followed by the actual value to be decomposed a followed by a list of clauses, one for each variant of the datatype. Cases will actually perform a runtime test to verify that the value given is in fact an instance of the datatype expected (i.e. it will call (assignment? as)). For each clause we have We can use define-datatype to define datatypes with multiple variants as well
          ;; A Filesystem is one of 
          ;; - EmptyFS 
          ;; - (nonemptyfs FSElement Filesystem)

          ;; EmptyFS is (emptyfs) 

          ;; A FSElement is one of 
          ;; - File
          ;; - Directory
          ;; - SymbolicLink 

          ;; A File is (file String Number)
          ;; (file s n) creates a file with name s and size n

          ;; A Directory is (dir String String Number Filesystem)
          ;; (dir s1 s2 n fs) creates the directory s1 owned by user s2 
          ;; with size n and  contents fs.

          ;; A SymbolicLink is (slink String String) 
          ;; (slink s1 s2) creates a link with name s1 pointing to s2

          (define-datatype filesystem filesystem?
            [emptyfs]
            [nonemptyfs (ele fselement?)
                        (fs filesystem?)])
  
          (define-datatype fselement fselement?
            [file (name string?)
                  (size number?)]
            [dir (name string?) 
                 (owner string?)
                 (size number?) 
                 (contents filesystem?)]
            [slink (name string?)
                   (path string?)])
        

[Exercise 11] Write example instances of filesystem. Rewrite the data definition for Filesystem as a set of rules of inference, and as a grammar.

[Exercise 12] Design a function disk-usage that takes as input a Filesystem and produces the total size occupied by all its files. Symbolic links and directories occupy no space.

[Exercise 13] Design a function which checks that each directory size is equal to the total size of its contents. Your function should consume a filesystem and produce a Boolean.

[Exercise 14] Design a function fsck which consumes a filesystem and produces a filesystem like the original but any inconsistent directories found in the input filesystem have been corrected. A directory is inconsistent when actual occupied size of its contents is not equal to the value stored in dir's size element.

[Exercise 15] Design a function that consumes a filesystem and a user name and produces a list of directory names that belong to the user.

[Exercise 16] Design a function that consumes a filesystem and a user name and produces the total size occupied by directories owned by this user.

[Exercise 17] Design a function that consumes a filesystem and produces a list of all the files that are pointed to by a symbolic link.