The SRFI-1 list library                                 -*- outline -*-
Olin Shivers
98/10/16
Last Update: 99/10/3

Emacs should display this document in outline mode. Say c-h m for
instructions on how to move through it by sections (e.g., c-c c-n, c-c c-p).

During the SRFI discussion period, the current draft may be found at
    ftp://ftp.ai.mit.edu/people/shivers/srfi/srfi-1/srfi-1.txt

* Table of contents
-------------------
Abstract
Introduction
Procedure index
General discussion
    "Linear update" procedures
    Improper lists
    Errors
    Not included in this library
The procedures
    Constructors
    Predicates
    Selectors
    Miscellaneous: length, append, concatenate, reverse, zip & count
    Fold, unfold & map
    Filtering & partitioning
    Searching
    Deletion
    Association lists
    Set operations on lists
    Primitive side-effects
Acknowledgements
References & links
Copyright

* Abstract
----------
R5RS Scheme has an impoverished set of list-processing utilities, which is a
problem for authors of portable code. This SRFI proposes a coherent and
comprehensive set of list-processing procedures; it is accompanied by a
reference implementation of the spec. The reference implementation is
    - portable
    - efficient
    - completely open, public-domain source

* Introduction
--------------
The set of basic list and pair operations provided by R4RS/R5RS Scheme is far
from satisfactory. Because this set is so small and basic, most
implementations provide additional utilities, such as a list-filtering
function, or a "left fold" operator, and so forth. But, of course, this
introduces incompatibilities -- different Scheme implementations provide
different sets of procedures.

I have designed a full-featured library of procedures for list processing.
While putting this library together, I checked as many Schemes as I could get
my hands on. (I have a fair amount of experience with several of these
already.) I missed Chez -- no on-line manual that I can find -- but I hit most
of the other big, full-featured Schemes.
The complete list of list-processing systems I checked is:
    R4RS/R5RS Scheme, MIT Scheme, Gambit, RScheme, MzScheme, slib,
    Common Lisp, Bigloo, guile, T, APL and the SML standard basis

As a result, the library I am proposing is fairly rich.

Following this initial design phase, this library went through several months
of discussion on the SRFI mailing lists, and was altered in light of the ideas
and suggestions put forth during this discussion.

In parallel with designing this API, I have also written a reference
implementation. I have placed this source on the Net with an unencumbered,
"open" copyright. A few notes about the reference implementation:

- Although I got procedure names and specs from many Schemes, I wrote this
  code myself. Thus, there are *no* entanglements. Any Scheme implementor
  can pick this library up with no worries about copyright problems -- both
  commercial and non-commercial systems.

- The code is written for portability and should be trivial to port to
  any Scheme. It has only four deviations from R4RS, clearly discussed in
  the comments:
    - Use of an ERROR procedure;
    - Use of the R5RS VALUES and a simple RECEIVE macro for producing
      and consuming multiple return values;
    - Use of simple :OPTIONAL and LET-OPTIONALS macros for optional
      argument parsing and defaulting;
    - Use of a simple CHECK-ARG procedure for argument checking.

- It is written for clarity and well-commented. The current source is
  768 lines of source code and 826 lines of comments and white space.

- It is written for efficiency. Fast paths are provided for common cases.
  Side-effecting procedures such as FILTER! avoid unnecessary, redundant
  SET-CDR!s which would thrash a generational GC's write barrier and the
  store buffers of fast processors. Functions reuse longest common tails
  from input parameters to construct their results where possible.
  Constant-space iterations are used in preference to recursions; local
  recursions are used in preference to consing temporary intermediate data
  structures.

  This is not to say that the implementation can't be tuned up for
  a specific Scheme implementation. There are notes in comments addressing
  ways implementors can tune the reference implementation for performance.

In short, I've written the reference implementation to make it as painless
as possible for an implementor -- or a regular programmer -- to adopt this
library and get good results with it.

* Procedure index
-----------------
Here is a short list of the procedures provided by the list-lib package.
"#" marks R5RS procedures; "+" marks extended R5RS procedures

Constructors
#   cons list
    xcons cons* make-list list-tabulate
    list-copy circular-list iota

Predicates
#   pair? null?
    proper-list? circular-list? dotted-list?
    not-pair? null-list?
    list=

Selectors
#   car cdr ... cdddar cddddr list-ref
    first second third fourth fifth sixth seventh eighth ninth tenth
    car+cdr
    take       drop
    take-right drop-right
    take!      drop-right!
    split-at   split-at!
    last last-pair

Miscellaneous: length, append, concatenate, reverse, zip & count
#   length
    length+
#   append  reverse
    append! reverse!
    concatenate concatenate!
    append-reverse append-reverse!
    zip unzip1 unzip2 unzip3 unzip4 unzip5
    count

Fold, unfold & map
+   map for-each
    fold       unfold       pair-fold       reduce
    fold-right unfold-right pair-fold-right reduce-right
    append-map append-map!
    map! pair-for-each filter-map map-in-order

Filtering & partitioning
    filter  partition  remove
    filter! partition! remove!

Searching
+   member
#   memq memv
    find any every
    list-index
    take-while drop-while take-while!
    span break span! break!

Deleting
    delete  delete-duplicates
    delete! delete-duplicates!

Association lists
+   assoc
#   assq assv
    alist-cons alist-copy
    alist-delete alist-delete!

Set operations on lists
    lset<= lset= lset-adjoin
    lset-union        lset-union!
    lset-intersection lset-intersection!
    lset-difference   lset-difference!
    lset-xor          lset-xor!
    lset-diff+intersection lset-diff+intersection!

Primitive side effects
#   set-car! set-cdr!

------
Four R4RS/R5RS list-processing procedures are extended by this library in
backwards-compatible ways:
    map for-each    (Extended to take lists of unequal length)
    member assoc    (Extended to take an optional comparison procedure)

The following R4RS/R5RS list- and pair-processing procedures are also part of
list-lib's exports, as defined by the R5RS report:
    cons pair? null? list length append reverse
    car cdr ... cdddar cddddr
    set-car! set-cdr! list-ref
    memq memv assq assv

The remaining two R4RS/R5RS list-processing procedures are *not* part of this
library:
    list-tail       (renamed DROP)
    list?           (see PROPER-LIST?, CIRCULAR-LIST? and DOTTED-LIST?)

* General discussion
--------------------
A set of general criteria guided the design of this library.

I don't require "destructive" (what I call "linear update") procedures to
alter and recycle cons cells from the argument lists. They are allowed to, but
not required to. (The reference implementations I have written *do* recycle
the argument lists.) See below for further discussion.

List-filtering procedures such as FILTER or DELETE do not disorder
lists. Elements appear in the answer list in the same order as they appear in
the argument list. This constrains implementation, but seems like a desirable
feature, since in many uses of lists, order matters. (In particular,
disordering an alist is definitely a bad idea.)

Contrariwise, although the reference implementations of the list-filtering
procedures share longest common tails between argument and answer lists,
this is not part of the spec.

Because lists are an inherently sequential data structure (unlike, say,
vectors), list-inspection functions such as FIND, FIND-TAIL, FOR-EACH, ANY
and EVERY commit to a left-to-right traversal order of their argument list.
However, constructor functions, such as LIST-TABULATE and the mapping
procedures (APPEND-MAP, APPEND-MAP!, MAP!, PAIR-FOR-EACH, FILTER-MAP,
MAP-IN-ORDER) do *not* specify the dynamic order in which their procedural
argument is applied to its various values.

Predicates return useful true values wherever possible. Thus ANY must return
the true value produced by its predicate, and EVERY returns the final true
value produced by applying its predicate argument to the last element of its
argument list.

Functionality is provided both in pure and linear-update (potentially
destructive) forms wherever this makes sense.

No special status is accorded Scheme's built-in equality functions.
Any functionality provided in terms of EQ?, EQV?, EQUAL? is also
available using a client-provided equality function.

Proper design counts for more than backwards compatibility, but I have tried,
ceteris paribus, to be as backwards-compatible as possible with existing
list-processing libraries, in order to facilitate porting old code to run as a
client of the procedures in this library. Name choices and semantics are, for
the most part, in agreement with existing practice in many current Scheme
systems. I have indicated some incompatibilities in the following text.

These procedures are *not* "sequence generic" -- i.e., procedures that
operate on either vectors or lists. They are list-specific. I prefer to
keep the library simple and focussed.

I have named these procedures without a qualifying initial "list-" lexeme,
which is in keeping with the existing set of list-processing utilities in
Scheme.

I follow the general Scheme convention (VECTOR-LENGTH, STRING-REF) of
placing the type-name before the action when naming procedures -- so
we have LIST-COPY and PAIR-FOR-EACH rather than the perhaps more fluid, but
less consistent, COPY-LIST, or FOR-EACH-PAIR.

I have generally followed a regular and consistent naming scheme, composing
procedure names from a set of basic lexemes.
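
As an illustration of the "useful true values" criterion above, here is a
minimal sketch restricted to a single list argument. The names ANY1 and EVERY1
are hypothetical and are not exports of this library; the sketch is not the
reference implementation, only one way the stated behaviour could be obtained.

```scheme
;; Hypothetical single-list sketches; ANY1 and EVERY1 are illustrative names.

;; Returns the first true value PRED produces, or #f if there is none.
(define (any1 pred lis)
  (and (pair? lis)
       (or (pred (car lis))
           (any1 pred (cdr lis)))))

;; Returns #t on the empty list; otherwise the value PRED produces on the
;; last element, provided PRED was true on every earlier element.
(define (every1 pred lis)
  (or (null? lis)
      (let loop ((lis lis))
        (if (null? (cdr lis))
            (pred (car lis))              ; final value is returned as-is
            (and (pred (car lis))
                 (loop (cdr lis)))))))

(any1 (lambda (x) (memv x '(2 4 6))) '(1 4 5))            ; => (4 6)
(every1 (lambda (x) (and (number? x) (* x x))) '(1 2 3))  ; => 9
```

The caller can therefore use the result directly, instead of repeating the
search to recover the value that satisfied the predicate.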
** "Linear update" procedures
=============================
Many procedures in this library have "pure" and "linear update" variants. A
"pure" procedure has no side-effects, and in particular does not alter its
arguments in any way. A "linear update" procedure is allowed -- but *not*
required -- to side-effect its arguments in order to construct its
result. "Linear update" procedures are typically given names ending with an
exclamation point. So, for example, (APPEND! list1 list2) is allowed to
construct its result by simply using SET-CDR! to set the cdr of the last pair
of list1 to point to list2, and then returning list1 (unless list1 is the
empty list, in which case it would simply return list2). However, APPEND! may
also elect to perform a pure append operation -- this is a legal definition
of APPEND!:
    (define append! append)

This is why we do not call these procedures "destructive" -- because they
aren't *required* to be destructive. They are *potentially* destructive.

What this means is that you may only apply linear-update procedures to
values that you know are "dead" -- values that will never be used again
in your program. This must be so, since you can't rely on the value passed
to a linear-update procedure after that procedure has been called. It
might be unchanged; it might be altered.

The "linear" in "linear update" doesn't mean "linear time" or "linear space"
or any sort of multiple-of-n kind of meaning. It's a fancy term that type
theorists and pure functional programmers use to describe systems where you
are only allowed to have exactly one reference to each variable. This provides
a guarantee that the value bound to a variable is bound to no other
variable. So when you *use* a variable in a variable reference, you "use it
up." Knowing that no one else has a pointer to that value means the system
primitive is free to side-effect its arguments to produce what is,
observationally, a pure-functional result.
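
To make the APPEND! discussion above concrete, here is a minimal sketch of a
two-argument linear-update append that does recycle its first argument. The
name APPEND2! is hypothetical; the library's APPEND! is n-ary and is free to
behave differently, including not mutating anything at all.

```scheme
;; Hypothetical two-list sketch; APPEND2! is an illustrative name only.
(define (append2! list1 list2)
  (if (null? list1)
      list2                               ; the empty list can't be side-effected
      (let loop ((last list1))
        (if (pair? (cdr last))
            (loop (cdr last))
            (begin (set-cdr! last list2)  ; splice LIST2 onto the last pair
                   list1)))))

;; Build the arguments with LIST so they are fresh, mutable pairs.
(define xs (list 1 2 3))
(define ys (list 4 5))

;; Always use the returned value; never rely on the old binding of XS.
(set! xs (append2! xs ys))
xs                                        ; => (1 2 3 4 5)

(append2! '() ys)                         ; => (4 5), which is YS itself
```

Because the caller rebinds XS to the result, the same client code stays
correct if APPEND2! is replaced by a pure append.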
In the context of this library, "linear update" means you, the programmer,
know there are *no other* live references to the value passed to the
procedure -- after passing the value to one of these procedures, the
value of the old pointer is indeterminate. Basically, you are licensing
the Scheme implementation to alter the data structure if it feels like
it -- you have declared you don't care either way.

You get no help from Scheme in checking that the values you claim are "linear"
really are. So you better get it right. Or play it safe and use the non-!
procedures -- it doesn't do any good to compute quickly if you get the wrong
answer.

Why go to all this trouble to define the notion of "linear update" and use it
in a procedure spec, instead of the more common notion of a "destructive"
operation? First, note that destructive list-processing procedures are almost
always used in a linear-update fashion. This is in part required by the
special case of operating upon the empty list, which can't be side-effected.
This means that destructive operators are not pure side-effects -- they have
to return a result. Second, note that code written using linear-update
operators can be trivially ported to a pure, functional subset of Scheme by
simply providing pure implementations of the linear-update operators. Finally,
requiring destructive side-effects ruins opportunities to parallelise these
operations -- and the places where one has taken the trouble to spell out
destructive operations are usually exactly the code one would want a
parallelising compiler to parallelise: the efficiency-critical kernels of the
algorithm. Linear-update operations are easily parallelised. Going with a
linear-update spec doesn't close off these valuable alternative implementation
techniques. This list library is intended as a set of low-level, basic
operators, so we don't want to exclude these possible implementations.

The linear-update procedures in this library are
    take! drop-right! append!
    reverse! append-reverse! append-map! map!
    filter! partition! remove!
    delete! alist-delete! delete-duplicates!
    lset-adjoin! lset-union! lset-intersection!
    lset-difference! lset-xor! lset-diff+intersection!

** Improper lists
=================
Scheme does not properly have a list type, just as C does not have a string
type. Rather, Scheme has a binary-tuple type, from which one can build binary
trees. There is an *interpretation* of Scheme values that allows one to
treat these trees as lists. Further complications ensue from the fact that
Scheme allows side-effects to these tuples, raising the possibility of lists
of unbounded length, and trees of unbounded depth (that is, circular data
structures).

However, there is a simple view of the world of Scheme values that considers
every value to be a list of some sort. That is, every value is either
  - a "proper list" -- a finite, nil-terminated list, such as:
        (a b c)
        ()
        (32)
  - a "dotted list" -- a finite, non-nil terminated list, such as
        (a b c . d)
        (x . y)
        42
        george
  - or a "circular list" -- an infinite, unterminated list.

Note that the zero-length dotted lists are simply all the non-null, non-pair
values.

This view is captured by the predicates PROPER-LIST?, DOTTED-LIST?, and
CIRCULAR-LIST?. List-lib users should note that dotted lists are not commonly
used, and are considered by many Scheme programmers to be an ugly artifact of
Scheme's lack of a true list type. However, dotted lists do play a noticeable
role in the *syntax* of Scheme, in the "rest" parameters used by n-ary
lambdas: (lambda (x y . rest) ...).

Dotted lists are *not* fully supported by list-lib. Most procedures are
defined only on proper lists -- that is, finite, nil-terminated lists. The
procedures that will also handle circular or dotted lists are specifically
marked.
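
The three-way classification described above can be sketched with a "tortoise
and hare" walk, in which one pointer advances two cdrs for every one cdr of
the other. The procedure name LIST-KIND is hypothetical and is not part of
this library; this is only one possible way to tell the three cases apart,
not a statement of how the library's predicates are implemented.

```scheme
;; Hypothetical helper; LIST-KIND is an illustrative name only.
;; Returns one of the symbols PROPER, DOTTED or CIRCULAR for any value X.
(define (list-kind x)
  (let loop ((slow x) (fast x))
    (cond ((null? fast) 'proper)
          ((not (pair? fast)) 'dotted)
          (else
           (let ((fast (cdr fast)))       ; first of two steps for FAST
             (cond ((null? fast) 'proper)
                   ((not (pair? fast)) 'dotted)
                   (else
                    (let ((fast (cdr fast))   ; second step for FAST
                          (slow (cdr slow)))  ; single step for SLOW
                      (if (eq? fast slow)     ; the pointers met: a cycle
                          'circular
                          (loop slow fast))))))))))

(list-kind '(a b c))        ; => proper
(list-kind '())             ; => proper
(list-kind '(a b c . d))    ; => dotted
(list-kind 42)              ; => dotted

;; Build a circular list from fresh pairs, then classify it.
(define ring (list 1 2 3))
(set-cdr! (cddr ring) ring)
(list-kind ring)            ; => circular
```

Every value lands in exactly one of the three classes, which matches the
statement that the three predicates partition the universe of Scheme values.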
While this design decision restricts the domain of possible arguments one can
pass to these procedures, it has the benefit of allowing the procedures to
catch the error cases where programmers inadvertently pass scalar values to a
list procedure, e.g. by switching the arguments to a procedure call.

** Errors
=========
Note that statements of the form "it is an error" merely mean "don't
do that." They are not a guarantee that a conforming implementation will
"catch" such improper use by, for example, raising some kind of exception.
Regrettably, R5RS Scheme requires no firmer guarantee even for basic operators
such as CAR and CDR, so there's little point in requiring these procedures to
do more. Here is the relevant section of the R5RS report:

    When speaking of an error situation, this report uses the phrase "an
    error is signalled" to indicate that implementations must detect and
    report the error. If such wording does not appear in the discussion of
    an error, then implementations are not required to detect or report the
    error, though they are encouraged to do so. An error situation that
    implementations are not required to detect is usually referred to simply
    as "an error."

    For example, it is an error for a procedure to be passed an argument that
    the procedure is not explicitly specified to handle, even though such
    domain errors are seldom mentioned in this report. Implementations may
    extend a procedure's domain of definition to include such arguments.

** Not included in this library
===============================
The following items are not in this library:
    - Sort routines
    - Destructuring/pattern-matching macro
    - Tree-processing routines
They should have their own SRFI specs.

* The procedures
----------------
In a Scheme system that has a module or package system, these procedures
should be contained in a module named "list-lib".
The templates given below obey the following conventions for procedure
formals:
    list            A proper (finite, nil-terminated) list
    clist           A proper or circular list
    flist           A finite (proper or dotted) list
    pair            A pair
    x, y, d, a      Any value
    object, value   Any value
    n, i            A natural number (an integer >= 0)
    proc            A procedure
    pred            A procedure whose return value is treated as a boolean
    =               A boolean procedure taking two arguments

It is an error to pass a circular or dotted list to a procedure not
defined to accept such an argument.

** Constructors
===============

cons a d -> pair                                        R5RS
    The primitive constructor. Returns a newly allocated pair whose car is A
    and whose cdr is D. The pair is guaranteed to be different (in the sense
    of EQV?) from every existing object.

    (cons 'a '())         ==>  (a)
    (cons '(a) '(b c d))  ==>  ((a) b c d)
    (cons "a" '(b c))     ==>  ("a" b c)
    (cons 'a 3)           ==>  (a . 3)
    (cons '(a b) 'c)      ==>  ((a b) . c)

list object ... -> list                                 R5RS
    Returns a newly allocated list of its arguments.

    (list 'a (+ 3 4) 'c)  ==>  (a 7 c)
    (list)                ==>  ()

xcons d a -> pair
    (lambda (d a) (cons a d))
    Of utility only as a value to be conveniently passed to higher-order
    procedures.

    (xcons '(b c) 'a) => (a b c)

    The name stands for "eXchanged CONS."

cons* elt1 elt2 ... -> object
    Like LIST, but the last argument provides the tail of the constructed
    list, returning
        (cons elt1 (cons elt2 (cons ... eltn))).
    This function is called LIST* in Common Lisp and about half of the
    Schemes that provide it; and CONS* in the other half.

    (cons* 1 2 3 4) => (1 2 3 . 4)
    (cons* 1) => 1

make-list n [fill] -> list
    Returns an N-element list, whose elements are all the value FILL.
    If the FILL argument is not given, the elements of the list may
    be arbitrary values.

    (make-list 4 'c) => (c c c c)
    (make-list 10) => (2 3 5 7 11 13 17 19 23 29)  ; one possible result:
                                                   ; the values are unspecified

list-tabulate n init-proc -> list
    Returns an N-element list. Element i of the list, where 0 <= i < N,
    is produced by (INIT-PROC i).
    No guarantee is made about the dynamic order in which INIT-PROC is
    applied to these indices.

    (list-tabulate 4 values) => (0 1 2 3)

list-copy flist -> flist
    Copies the "spine" of the argument.

circular-list elt1 elt2 ... -> clist
    Constructs a circular list of the elements.

    (circular-list 'z 'q) => (z q z q z q ...)

iota count [start step] -> list
    Returns a list containing the elements
        (start start+step ... start+(count-1)*step)
    The START and STEP parameters default to 0 and 1, respectively.
    This procedure takes its name from the APL primitive.

    (iota 5) => (0 1 2 3 4)
    (iota 5 0 -0.1) => (0 -0.1 -0.2 -0.3 -0.4)

** Predicates
=============
Note: the predicates PROPER-LIST?, CIRCULAR-LIST?, and DOTTED-LIST?
partition the entire universe of Scheme values.

proper-list? x -> boolean
    Returns true iff X is a proper list -- a finite, nil-terminated list.

    More carefully: The empty list is a proper list. A pair whose cdr is a
    proper list is also a proper list:
        <proper-list> ::= ()                            (Empty proper list)
                       |  (cons <x> <proper-list>)      (Proper-list pair)
    Note that this definition rules out circular lists. This
    function is required to detect this case and return false.

    Nil-terminated lists are called "proper" lists by R5RS and Common Lisp.
    The opposite of proper is improper.

    R5RS binds this function to the variable LIST?.

    (not (proper-list? x)) = (or (dotted-list? x) (circular-list? x))

circular-list? x -> boolean
    True if X is a circular list. A circular list is a value such that
    for every n >= 0, cdr^n(x) is a pair.

    Terminology: The opposite of circular is finite.

    (not (circular-list? x)) = (or (proper-list? x) (dotted-list? x))

dotted-list? x -> boolean
    True if X is a finite, non-nil-terminated list. That is, there exists
    an n >= 0 such that cdr^n(x) is neither a pair nor (). This includes
    non-pair, non-() values (e.g. symbols, numbers), which are considered to
    be dotted lists of length 0.

    (not (dotted-list? x)) = (or (proper-list? x) (circular-list? x))

pair?
object -> boolean                                       R5RS
    Returns #t if OBJECT is a pair; otherwise, #f.

    (pair? '(a . b))  ==>  #t
    (pair? '(a b c))  ==>  #t
    (pair? '())       ==>  #f
    (pair? '#(a b))   ==>  #f
    (pair? 7)         ==>  #f
    (pair? 'a)        ==>  #f

null? object -> boolean                                 R5RS
    Returns #t if OBJECT is the empty list; otherwise, #f.

null-list? list -> boolean
    LIST is a proper or circular list. This procedure returns true if
    the argument is the empty list (), and false otherwise. It is an
    error to pass this procedure a value which is not a proper or
    circular list.

    This procedure is recommended as the termination condition for
    list-processing procedures that are not defined on dotted lists.

not-pair? x -> boolean
    (lambda (x) (not (pair? x)))
    Provided as a procedure as it can be useful as the termination condition
    for list-processing procedures that wish to handle all finite lists,
    both proper and dotted.

list= elt= list1 ... -> boolean
    Determines list equality, given an element-equality procedure.
    Proper list A equals proper list B if they are of the same length,
    and their corresponding elements are equal, as determined by ELT=.
    If the element-comparison procedure's first argument is from LISTi,
    then its second argument is from LISTi+1, i.e. it is always called as
        (elt= a b)
    for a an element of list A, and b an element of list B.

    In the n-ary case, every LISTi is compared to LISTi+1 (as opposed, for
    example, to comparing LIST1 to every LISTi, for i>1). If there are no
    list arguments at all, LIST= simply returns true.

    It is an error to apply LIST= to anything except proper lists. While
    implementations may choose to extend it to circular lists, note that it
    cannot reasonably be extended to dotted lists, as it provides no way to
    specify an equality procedure for comparing the list terminators.

    Note that the dynamic order in which the ELT= procedure is applied to
    pairs of elements is not specified.
    For example, if LIST= is applied to three lists, A, B, and C, it may
    first completely compare A to B, then compare B to C, or it may compare
    the first elements of A and B, then the first elements of B and C, then
    the second elements of A and B, and so forth.

    The equality procedure must be consistent with EQ?. That is, it must be
    the case that
        (eq? x y) => (elt= x y).
    Note that this implies that two lists which are EQ? are always LIST=,
    as well; implementations may exploit this fact to "short-cut" the
    element-by-element comparisons.

    (list= eq?) => #t           ; Trivial cases
    (list= eq? '(a)) => #t

** Selectors
============

car pair -> value                                       R5RS
cdr pair -> value                                       R5RS
    These procedures return the contents of the car and cdr field of their
    argument, respectively. Note that it is an error to apply them to the
    empty list.

    (car '(a b c))      ==>  a          (cdr '(a b c))      ==>  (b c)
    (car '((a) b c d))  ==>  (a)        (cdr '((a) b c d))  ==>  (b c d)
    (car '(1 . 2))      ==>  1          (cdr '(1 . 2))      ==>  2
    (car '())           ==>  *error*    (cdr '())           ==>  *error*

caar pair -> value                                      R5RS
cadr pair -> value
  :
cdddar pair -> value
cddddr pair -> value
    These procedures are compositions of CAR and CDR, where for example
    CADDR could be defined by
        (define caddr (lambda (x) (car (cdr (cdr x))))).
    Arbitrary compositions, up to four deep, are provided. There are
    twenty-eight of these procedures in all.

list-ref clist i -> value                               R5RS
    Returns the Ith element of CLIST. (This is the same as the car of
    (DROP CLIST I).) It is an error if I >= N, where N is the length of
    CLIST.

    (list-ref '(a b c d) 2) ==> c

first second third fourth fifth sixth seventh eighth ninth tenth:
    pair -> value
    Synonyms for car, cadr, caddr, ...

    (third '(a b c d e)) => c

car+cdr pair -> [x y]
    The fundamental pair deconstructor:
        (lambda (p) (values (car p) (cdr p)))
    This can, of course, be implemented more efficiently by a compiler.

take x i -> list
drop x i -> object
    TAKE returns the first I elements of list X.
    DROP returns all but the first I elements of list X.
    (take '(a b c d e) 2) => (a b)
    (drop '(a b c d e) 2) => (c d e)

    X may be any value -- a proper, circular, or dotted list:
    (take '(1 2 3 . d) 2) => (1 2)
    (drop '(1 2 3 . d) 2) => (3 . d)
    (take '(1 2 3 . d) 3) => (1 2 3)
    (drop '(1 2 3 . d) 3) => d

    For a legal I, TAKE and DROP partition the list in a manner which
    can be inverted with APPEND:
        (append (take x i) (drop x i)) = x

    DROP is exactly equivalent to performing I cdr operations on X; the
    returned value shares a common tail with X.

    If the argument is a list of non-zero length, TAKE is guaranteed to
    return a freshly-allocated list, even in the case where the entire
    list is taken, e.g. (TAKE LIS (LENGTH LIS)).

take-right flist i -> object
drop-right flist i -> list
    TAKE-RIGHT returns the last I elements of FLIST.
    DROP-RIGHT returns all but the last I elements of FLIST.
    The returned list may share a common tail with the argument list.

    (take-right '(a b c d e) 2) => (d e)
    (drop-right '(a b c d e) 2) => (a b c)

    FLIST may be any finite list, either proper or dotted:
    (take-right '(1 2 3 . d) 2) => (2 3 . d)
    (drop-right '(1 2 3 . d) 2) => (1)
    (take-right '(1 2 3 . d) 0) => d
    (drop-right '(1 2 3 . d) 0) => (1 2 3)

    For a legal I, TAKE-RIGHT and DROP-RIGHT partition the list in a manner
    which can be inverted with APPEND:
        (append (drop-right flist i) (take-right flist i)) = flist

    TAKE-RIGHT's return value is guaranteed to share a common tail with
    FLIST.

    If the argument is a list of non-zero length, DROP-RIGHT is guaranteed
    to return a freshly-allocated list, even in the case where nothing is
    dropped, e.g. (DROP-RIGHT LIS 0).

take! x i -> list
drop-right! flist i -> list
    TAKE! and DROP-RIGHT! are "linear-update" variants of TAKE and
    DROP-RIGHT: the procedure is allowed, but not required, to alter the
    argument list to produce the result.

    If X is circular, TAKE! may return a shorter-than-expected list.
    Either of these results is possible:
    (take! (circular-list 1 3 5) 8) => (1 3)
    (take! (circular-list 1 3 5) 8) => (1 3 5 1 3 5 1 3)

split-at x i -> [list object]
split-at!
x i -> [list object]
    SPLIT-AT splits the list X at index I, returning a list of the
    first I elements, and the remaining tail. It is equivalent to
        (values (take x i) (drop x i))
    SPLIT-AT! is the linear-update variant. It is allowed, but not
    required, to alter the argument list to produce the result.

    (split-at '(a b c d e f g h) 3) =>
        (a b c)
        (d e f g h)

last pair -> object
last-pair pair -> pair
    LAST returns the last element of the non-empty, finite list PAIR.
    LAST-PAIR returns the last pair in the non-empty, finite list PAIR.

    (last '(a b c)) => c
    (last-pair '(a b c)) => (c)
    (last-pair '(a b c . d)) => (c . d)

** Miscellaneous: length, append, concatenate, reverse, zip & count
===================================================================

length  list -> integer                                 R5RS
length+ clist -> integer or #f
    Both LENGTH and LENGTH+ return the length of the argument.
    It is an error to pass a value to LENGTH which is not a proper
    list (finite and nil-terminated). In particular, this means an
    implementation may diverge or signal an error when LENGTH is
    applied to a circular list.

    LENGTH+, on the other hand, returns #F when applied to a circular
    list.

    The length of a proper list is a non-negative integer N such that CDR
    applied N times to the list produces the empty list.

    (length '(a b c))          ==>  3
    (length '(a (b) (c d e)))  ==>  3
    (length '())               ==>  0

append  list1 ... -> value                              R5RS
append! list1 ... -> value
    APPEND returns a list consisting of the elements of LIST1
    followed by the elements of the other list parameters.

    (append '(x) '(y))        ==>  (x y)
    (append '(a) '(b c d))    ==>  (a b c d)
    (append '(a (b)) '((c)))  ==>  (a (b) (c))

    The resulting list is always newly allocated, except that it
    shares structure with the final LISTi argument. This last argument
    may be any value at all; an improper list results if it is not
    a proper list. All other arguments must be proper lists.

    (append '(a b) '(c . d))  ==>  (a b c . d)
    (append '() 'a)           ==>  a
    (append '(x y))           ==>  (x y)
    (append)                  ==>  ()

    APPEND!
    is the "linear-update" variant of APPEND -- it is allowed, but not
    required, to alter cons cells in the argument lists to construct the
    result list. The last argument is never altered; the result
    list shares structure with this parameter.

concatenate  list-of-lists -> value
concatenate! list-of-lists -> value
    These functions append the elements of their argument together.
    That is, CONCATENATE returns
        (apply append list-of-lists)
    or, equivalently,
        (reduce-right append '() list-of-lists)

    CONCATENATE! is the linear-update variant, defined in
    terms of APPEND! instead of APPEND.

    Note that some Scheme implementations do not support passing more than a
    certain number (e.g., 64) of arguments to an n-ary procedure. In these
    implementations, the (APPLY APPEND ...) idiom would fail when applied to
    long lists, but CONCATENATE would continue to function properly.

    As with APPEND and APPEND!, the last element of the input list
    may be any value at all.

reverse  list -> list                                   R5RS
reverse! list -> list
    REVERSE returns a newly allocated list consisting of the elements of
    LIST in reverse order.

    (reverse '(a b c))              ==>  (c b a)
    (reverse '(a (b c) d (e (f))))  ==>  ((e (f)) d (b c) a)

    REVERSE! is the linear-update variant of REVERSE. It is permitted, but
    not required, to alter the argument's cons cells to produce the reversed
    list.

append-reverse  rev-head tail -> value
append-reverse! rev-head tail -> value
    APPEND-REVERSE returns
        (append (reverse rev-head) tail)
    It is provided because it is a common operation -- a common
    list-processing style calls for this exact operation to transfer values
    accumulated in reverse order onto the front of another list, and because
    the implementation is significantly more efficient than the simple
    composition it replaces.
    (But note that this pattern of iterative computation followed by a
    reverse can frequently be rewritten as a recursion, dispensing with the
    REVERSE and APPEND-REVERSE steps, and shifting temporary, intermediate
    storage from the heap to the stack, which is typically a win for reasons
    of cache locality and eager storage reclamation.)

    APPEND-REVERSE! is just the linear-update variant -- it is allowed, but
    not required, to alter REV-HEAD's cons cells to construct the result.

zip clist1 clist2 ... -> list
    (lambda lists (apply map list lists))
    If ZIP is passed N lists, it returns a list as long as the shortest
    of these lists, each element of which is an N-element list comprised
    of the corresponding elements from the parameter lists.

    (zip '(one two three)
         '(1 2 3)
         '(odd even odd even odd even odd even))
        => ((one 1 odd) (two 2 even) (three 3 odd))

    (zip '(1 2 3)) => ((1) (2) (3))

    At least one of the argument lists must be finite:

    (zip '(3 1 4 1) (circular-list #f #t))
        => ((3 #f) (1 #t) (4 #f) (1 #t))

unzip1 list -> list
unzip2 list -> [list list]
unzip3 list -> [list list list]
unzip4 list -> [list list list list]
unzip5 list -> [list list list list list]
    UNZIP1 takes a list of lists, where every list must contain at least
    one element, and returns a list containing the initial element of each
    such list. That is, it returns (MAP CAR LISTS).
    UNZIP2 takes a list of lists, where every list must contain at least two
    elements, and returns two values: a list of the first elements, and a
    list of the second elements. UNZIP3 does the same for the first three
    elements of the lists, and so forth.

    (unzip2 '((1 one) (2 two) (3 three))) =>
        (1 2 3)
        (one two three)

count pred clist1 clist2 ... -> integer
    PRED is a procedure taking as many arguments as there are lists and
    returning a single value. It is applied element-wise to the elements of
    the LISTs, and a count is tallied of the number of elements that produce
    a true value. This count is returned.
    COUNT is "iterative" in that it is guaranteed to apply PRED to the LIST
    elements in a left-to-right order. The counting stops when the shortest
    list expires.
        (count even? '(3 1 4 1 5 9 2 5 6)) => 3
        (count < '(1 2 4 8) '(2 4 6 8 10 12 14 16)) => 3

    At least one of the argument lists must be finite:
        (count < '(3 1 4 1) (circular-list 1 10)) => 2

** Fold, unfold & map
=====================
fold kons knil clist1 clist2 ... -> value
    The fundamental list iterator.

    First, consider the single list-parameter case. If
    CLIST1 = (e1 e2 ... en), then this procedure returns
        (kons en ... (kons e2 (kons e1 knil)) ... )
    That is, it obeys the (tail) recursion
        (fold kons knil lis) = (fold kons (kons (car lis) knil) (cdr lis))
        (fold kons knil '()) = knil

    Examples:
        (fold + 0 lis)              ; Add up the elements of LIS.

        (fold cons '() lis)         ; Reverse LIS.

        (fold cons tail rev-head)   ; See APPEND-REVERSE.

        ;; How many symbols in LIS?
        (fold (lambda (x count) (if (symbol? x) (+ count 1) count))
              0
              lis)

        ;; Length of the longest string in LIS:
        (fold (lambda (s max-len) (max max-len (string-length s)))
              0
              lis)

    If N list arguments are provided, then the KONS function must take N+1
    parameters: one element from each list, and the "seed" or fold state,
    which is initially KNIL. The fold operation terminates when the shortest
    list runs out of values:
        (fold cons* '() '(a b c) '(1 2 3 4 5)) => (c 3 b 2 a 1)

    At least one of the list arguments must be finite.

fold-right kons knil clist1 clist2 ... -> value
    The fundamental list recursion operator.

    First, consider the single list-parameter case. If
    CLIST1 = (e1 e2 ... en), then this procedure returns
        (kons e1 (kons e2 ... (kons en knil)))
    That is, it obeys the recursion
        (fold-right kons knil lis) = (kons (car lis) (fold-right kons knil (cdr lis)))
        (fold-right kons knil '()) = knil

    Examples:
        (fold-right cons '() lis)   ; Copy LIS.

        ;; Filter the even numbers out of LIS.
        (fold-right (lambda (x l) (if (even?
                                      x) (cons x l) l)) '() lis)

    If N list arguments are provided, then the KONS function must take N+1
    parameters: one element from each list, and the "seed" or fold state,
    which is initially KNIL. The fold operation terminates when the shortest
    list runs out of values:
        (fold-right cons* '() '(a b c) '(1 2 3 4 5)) => (a 1 b 2 c 3)

    At least one of the list arguments must be finite.

pair-fold kons knil clist1 clist2 ... -> value
    Analogous to FOLD, but KONS is applied to successive sublists of the
    lists, rather than successive elements -- that is, KONS is applied to
    the pairs making up the lists, giving this (tail) recursion:
        (pair-fold kons knil lis) = (let ((tail (cdr lis)))
                                      (pair-fold kons (kons lis knil) tail))
        (pair-fold kons knil '()) = knil

    For finite lists, the KONS function may reliably apply SET-CDR! to the
    pairs it is given without altering the sequence of execution.

    Example:
        ;;; Destructively reverse a list.
        (pair-fold (lambda (pair tail) (set-cdr! pair tail) pair) '() lis)

    At least one of the list arguments must be finite.

pair-fold-right kons knil clist1 clist2 ... -> value
    Holds the same relationship with FOLD-RIGHT that PAIR-FOLD holds with
    FOLD. Obeys the recursion
        (pair-fold-right kons knil lis) =
            (kons lis (pair-fold-right kons knil (cdr lis)))
        (pair-fold-right kons knil '()) = knil

    Example:
        (pair-fold-right cons '() '(a b c)) => ((a b c) (b c) (c))

    At least one of the list arguments must be finite.

reduce f ridentity list -> value
    REDUCE is a variant of FOLD.

    RIDENTITY should be a "right identity" of the procedure F -- that is,
    for any value X acceptable to F,
        (f x ridentity) = x

    REDUCE has the following definition:
        If LIST = (), return RIDENTITY.
        Otherwise, return (fold F (car LIST) (cdr LIST)).
    ...in other words, we compute (fold F RIDENTITY LIST).

    Note that RIDENTITY is used *only* in the empty-list case.
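    The definition of REDUCE just given can be transcribed almost literally.
    The following is a minimal single-list sketch using only R5RS
    primitives; SKETCH-FOLD and SKETCH-REDUCE are hypothetical names used
    for illustration, not the procedures of the reference implementation.

```scheme
;; Single-list FOLD, following the (tail) recursion given for FOLD.
(define (sketch-fold kons knil lis)
  (if (null? lis)
      knil
      (sketch-fold kons (kons (car lis) knil) (cdr lis))))

;; REDUCE: RIDENTITY is consulted only when the list is empty;
;; otherwise the first element serves as the initial fold state.
(define (sketch-reduce f ridentity lis)
  (if (null? lis)
      ridentity
      (sketch-fold f (car lis) (cdr lis))))

(sketch-reduce + 0 '(1 2 3 4))   ; => 10
(sketch-reduce max 0 '())        ; => 0
```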
    You typically use REDUCE when applying F is expensive and you'd like to
    avoid the extra application incurred when FOLD applies F to the head of
    LIST and the identity value, redundantly producing the same value passed
    in to F. For example, if F involves searching a file directory or
    performing a database query, this can be significant. In general,
    however, FOLD is useful in many contexts where REDUCE is not (consider
    the examples given in the FOLD definition -- only one of the five folds
    uses a function with a right identity. The other four may not be
    performed with REDUCE).

    Note: MIT Scheme and Haskell flip F's arg order for their REDUCE and
    FOLD functions.

        ;; Take the max of a list of non-negative integers.
        (reduce max 0 nums) ; i.e., (apply max 0 nums)

reduce-right f ridentity list -> value
    REDUCE-RIGHT is the fold-right variant of REDUCE. It obeys the following
    definition:
        (reduce-right f ridentity '()) = ridentity
        (reduce-right f ridentity '(e1)) = (f e1 ridentity) = e1
        (reduce-right f ridentity '(e1 e2 ...)) =
            (f e1 (reduce-right f ridentity '(e2 ...)))
    ...in other words, we compute (fold-right F RIDENTITY LIST).

        ;; Append a bunch of lists together.
        ;; I.e., (apply append list-of-lists)
        (reduce-right append '() list-of-lists)

unfold p f g seed [tail-gen] -> list
    UNFOLD is best described by its basic recursion:
        (unfold p f g seed) =
            (if (p seed) (tail-gen seed)
                (cons (f seed)
                      (unfold p f g (g seed))))

    P:        Determines when to stop unfolding.
    F:        Maps each seed value to the corresponding list element.
    G:        Maps each seed value to next seed value.
    SEED:     The "state" value for the unfold.
    TAIL-GEN: creates the tail of the list; defaults to (lambda (x) '())

    In other words, we use G to generate a sequence of seed values
        SEED, (G SEED), (G^2 SEED), (G^3 SEED), ...
    These seed values are mapped to list elements by F, producing the
    elements of the result list in a left-to-right order. P says when to
    stop.
    UNFOLD is the fundamental recursive list constructor, just as FOLD-RIGHT
    is the fundamental recursive list consumer. While UNFOLD may seem a bit
    abstract to novice functional programmers, it can be used in a number of
    ways:

        (unfold (lambda (x) (> x 10))   ; List of squares: 1^2 ... 10^2.
                (lambda (x) (* x x))
                (lambda (x) (+ x 1))
                1)

        (unfold null-list? car cdr lis) ; Copy a proper list.

        ;; Read current input port into a list of values.
        (unfold eof-object? values (lambda (x) (read)) (read))

        ;; Copy a possibly non-proper list:
        (unfold not-pair? car cdr lis
                values)

        ;; Append HEAD onto TAIL:
        (unfold null-list? car cdr head
                (lambda (x) tail))

    Interested functional programmers may enjoy noting that FOLD-RIGHT and
    UNFOLD are in some sense inverses. That is, given operations KNULL?,
    KAR, KDR, KONS, and KNIL satisfying
        (kons (kar x) (kdr x)) = x and (knull? knil) = #t
    then
        (FOLD-RIGHT kons knil (UNFOLD knull? kar kdr x)) = x
    and
        (UNFOLD knull? kar kdr (FOLD-RIGHT kons knil x)) = x.

    This combinator sometimes is called an "anamorphism;" when an explicit
    TAIL-GEN procedure is supplied, it is called an "apomorphism."

unfold-right p f g seed [tail] -> value
    UNFOLD-RIGHT constructs a list with the following loop:
        (let lp ((seed seed) (lis tail))
          (if (p seed) lis
              (lp (g seed)
                  (cons (f seed) lis))))

    P:    Determines when to stop unfolding.
    F:    Maps each seed value to the corresponding list element.
    G:    Maps each seed value to next seed value.
    SEED: The "state" value for the unfold.
    TAIL: list terminator; defaults to '().

    In other words, we use G to generate a sequence of seed values
        SEED, (G SEED), (G^2 SEED), (G^3 SEED), ...
    These seed values are mapped to list elements by F, producing the
    elements of the result list in a right-to-left order. P says when to
    stop.

    UNFOLD-RIGHT is the fundamental iterative list constructor, just as FOLD
    is the fundamental iterative list consumer.
    While UNFOLD-RIGHT may seem a bit abstract to novice functional
    programmers, it can be used in a number of ways:

        (unfold-right zero?                 ; List of squares: 1^2 ... 10^2
                      (lambda (x) (* x x))
                      (lambda (x) (- x 1))
                      10)

        (unfold-right null-list? car cdr lis) ; Reverse a proper list.

        ;; Read current input port into a list of values.
        (unfold-right eof-object? values (lambda (x) (read)) (read))

        ;; (APPEND-REVERSE rev-head tail)
        (unfold-right null-list? car cdr rev-head tail)

    Interested functional programmers may enjoy noting that FOLD and
    UNFOLD-RIGHT are in some sense inverses. That is, given operations
    KNULL?, KAR, KDR, KONS, and KNIL satisfying
        (kons (kar x) (kdr x)) = x and (knull? knil) = #t
    then
        (FOLD kons knil (UNFOLD-RIGHT knull? kar kdr x)) = x
    and
        (UNFOLD-RIGHT knull? kar kdr (FOLD kons knil x)) = x.

    This combinator presumably has some pretentious mathematical name;
    interested readers are invited to communicate it to the author.

map proc clist1 clist2 ... -> list                                  R5RS+
    PROC is a procedure taking as many arguments as there are list arguments
    and returning a single value. MAP applies PROC element-wise to the
    elements of the lists and returns a list of the results, in order. The
    dynamic order in which PROC is applied to the elements of the lists is
    unspecified.

        (map cadr '((a b) (d e) (g h))) ==> (b e h)

        (map (lambda (n) (expt n n))
             '(1 2 3 4 5))
            ==> (1 4 27 256 3125)

        (map + '(1 2 3) '(4 5 6)) ==> (5 7 9)

        (let ((count 0))
          (map (lambda (ignored)
                 (set! count (+ count 1))
                 count)
               '(a b))) ==> (1 2) OR (2 1)

    This procedure is extended from its R5RS specification to allow the
    arguments to be of unequal length; it terminates when the shortest list
    runs out.

    At least one of the argument lists must be finite:
        (map + '(3 1 4 1) (circular-list 1 0)) => (4 1 5 1)

for-each proc clist1 clist2 ... -> unspecified                      R5RS+
    The arguments to FOR-EACH are like the arguments to MAP, but FOR-EACH
    calls PROC for its side effects rather than for its values.
    Unlike MAP, FOR-EACH is guaranteed to call PROC on the elements of the
    CLISTs in order from the first element(s) to the last, and the value
    returned by FOR-EACH is unspecified.

        (let ((v (make-vector 5)))
          (for-each (lambda (i)
                      (vector-set! v i (* i i)))
                    '(0 1 2 3 4))
          v) ==> #(0 1 4 9 16)

    This procedure is extended from its R5RS specification to allow the
    arguments to be of unequal length; it terminates when the shortest list
    runs out.

    At least one of the argument lists must be finite.

append-map  f clist1 clist2 ... -> value
append-map! f clist1 clist2 ... -> value
    Equivalent to
        (apply append (map f clist1 clist2 ...))
    and
        (apply append! (map f clist1 clist2 ...))

    Map F over the elements of the lists, just as in the MAP function.
    However, the results of the applications are appended together to make
    the final result. APPEND-MAP uses APPEND to append the results together;
    APPEND-MAP! uses APPEND!.

    The dynamic order in which the various applications of F are made is not
    specified.

    Example:
        (append-map! (lambda (x) (list x (- x))) '(1 3 8))
            => (1 -1 3 -3 8 -8)

    At least one of the list arguments must be finite.

map! f list1 clist2 ... -> list
    Linear-update variant of MAP -- MAP! is allowed, but not required, to
    alter the cons cells of LIST1 to construct the result list.

    The dynamic order in which the various applications of F are made is not
    specified. In the n-ary case, CLIST2, CLIST3, ... must have at least as
    many elements as LIST1.

map-in-order f clist1 clist2 ... -> list
    A variant of the MAP procedure that guarantees to apply F across the
    elements of the LISTi arguments in a left-to-right order. This is useful
    for mapping procedures that both have side effects and return useful
    values.

    At least one of the list arguments must be finite.

pair-for-each f clist1 clist2 ... -> unspecific
    Like FOR-EACH, but F is applied to successive sublists of the argument
    lists. That is, F is applied to the cons cells of the lists, rather than
    the lists' elements.
    These applications occur in left-to-right order.

    The F procedure may reliably apply SET-CDR! to the pairs it is given
    without altering the sequence of execution.

        (pair-for-each (lambda (pair) (display pair) (newline)) '(a b c)) ==>
            (a b c)
            (b c)
            (c)

    At least one of the list arguments must be finite.

filter-map f clist1 clist2 ... -> list
    Like MAP, but only true values are saved.
        (filter-map (lambda (x) (and (number? x) (* x x))) '(a 1 b 3 c 7))
            => (1 9 49)
    The dynamic order in which the various applications of F are made is not
    specified.

    At least one of the list arguments must be finite.

** Filtering & partitioning
===========================
filter pred list -> list
    Return all the elements of LIST that satisfy predicate PRED. The list is
    not disordered -- elements that appear in the result list occur in the
    same order as they occur in the argument list. The returned list may
    share a common tail with the argument list. The dynamic order in which
    the various applications of PRED are made is not specified.

        (filter even? '(0 7 8 8 43 -4)) => (0 8 8 -4)

partition pred list -> [list list]
    Partitions the elements of LIST with predicate PRED, and returns two
    values: the list of in-elements and the list of out-elements. The list
    is not disordered -- elements occur in the result lists in the same
    order as they occur in the argument list. The dynamic order in which the
    various applications of PRED are made is not specified. One of the
    returned lists may share a common tail with the argument list.

        (partition symbol? '(one 2 3 four five 6)) =>
            (one four five)
            (2 3 6)

remove pred list -> list
    Returns LIST without the elements that satisfy predicate PRED:
        (lambda (pred list) (filter (lambda (x) (not (pred x))) list))
    The list is not disordered -- elements that appear in the result list
    occur in the same order as they occur in the argument list. The returned
    list may share a common tail with the argument list.
    The dynamic order in which the various applications of PRED are made is
    not specified.

        (remove even? '(0 7 8 8 43 -4)) => (7 43)

filter!    pred list -> list
partition! pred list -> [list list]
remove!    pred list -> list
    Linear-update variants of FILTER, PARTITION and REMOVE. These procedures
    are allowed, but not required, to alter the cons cells in the argument
    list to construct the result lists.

** Searching
============
The following procedures all search lists for a leftmost element satisfying
some criteria. This means they do not always examine the entire list; thus,
there is no efficient way for them to reliably detect and signal an error
when passed a dotted or circular list. Here are the general rules describing
how these procedures work when applied to different kinds of lists:

Proper lists:
    The standard, canonical behavior happens in this case.

Dotted lists:
    It is an error to pass these procedures a dotted list that does not
    contain an element satisfying the search criteria. That is, it is an
    error if the procedure has to search all the way to the end of the
    dotted list. However, this SRFI does *not* specify anything at all about
    the behavior of these procedures when passed a dotted list containing an
    element satisfying the search criteria. It may finish successfully,
    signal an error, or perform some third action. Different implementations
    may provide different functionality in this case; code which is
    compliant with this SRFI may not rely on any particular behavior. Future
    SRFI's may refine SRFI-1 to define specific behavior in this case.

    In brief, SRFI-1 compliant code may not pass a dotted list argument to
    these procedures.

Circular lists:
    It is an error to pass these procedures a circular list that does not
    contain an element satisfying the search criteria. Note that the
    procedure is not required to detect this case; it may simply diverge.
    It is, however, acceptable to search a circular list *if the search is
    successful* -- that is, if the list contains an element satisfying the
    search criteria.

Here are some examples, using the FIND and ANY procedures as canonical
representatives:

    ;; Proper list -- success
    (find even? '(1 2 3)) => 2
    (any even? '(1 2 3)) => #t

    ;; proper list -- failure
    (find even? '(1 7 3)) => #f
    (any even? '(1 7 3)) => #f

    ;; Failure is error on a dotted list.
    (find even? '(1 3 . x)) => error
    (any even? '(1 3 . x)) => error

    ;; The dotted list contains an element satisfying the search.
    ;; This case is not specified -- it could be success, an error,
    ;; or some third possibility.
    (find even? '(1 2 . x)) => error/undefined
    (any even? '(1 2 . x)) => error/undefined ; success, error or other.

    ;; circular list -- success
    (find even? (circular-list 1 6 3)) => 6
    (any even? (circular-list 1 6 3)) => #t

    ;; circular list -- failure is error. Procedure may diverge.
    (find even? (circular-list 1 3)) => error
    (any even? (circular-list 1 3)) => error

find pred clist -> value
    Return the first element of CLIST that satisfies predicate PRED; false
    if no element does.

        (find even? '(3 1 4 1 5 9)) => 4

    Note that FIND has an ambiguity in its lookup semantics -- if FIND
    returns #F, you cannot tell (in general) if it found a #F element that
    satisfied PRED, or if it did not find any element at all. In many
    situations, this ambiguity cannot arise -- either the list being
    searched is known not to contain any #F elements, or the list is
    guaranteed to have an element satisfying PRED. However, in cases where
    this ambiguity can arise, you should use FIND-TAIL instead of FIND --
    FIND-TAIL has no such ambiguity:

        (cond ((find-tail pred lis) => (lambda (pair) ...)) ; Handle (CAR PAIR)
              (else ...)) ; Search failed.

find-tail pred clist -> pair or false
    Return the first pair of CLIST whose car satisfies PRED. If no pair
    does, return false.

    FIND-TAIL can be viewed as a general-predicate variant of the MEMBER
    function.
    Examples:
        (find-tail even? '(3 1 37 -8 -5 0 0)) => (-8 -5 0 0)
        (find-tail even? '(3 1 37 -5)) => #f

        ;; MEMBER X LIS:
        (find-tail (lambda (elt) (equal? x elt)) lis)

    In the circular-list case, this procedure "rotates" the list.

    FIND-TAIL is essentially DROP-WHILE, where the sense of the predicate is
    inverted: FIND-TAIL searches until it finds an element satisfying the
    predicate; DROP-WHILE searches until it finds an element that *doesn't*
    satisfy the predicate.

take-while  pred clist -> list
take-while! pred clist -> list
    Returns the longest initial prefix of CLIST whose elements all satisfy
    the predicate PRED.

    TAKE-WHILE! is the linear-update variant. It is allowed, but not
    required, to alter the argument list to produce the result.

        (take-while even? '(2 18 3 10 22 9)) => (2 18)

drop-while pred clist -> list
    Drops the longest initial prefix of CLIST whose elements all satisfy the
    predicate PRED, and returns the rest of the list.

        (drop-while even? '(2 18 3 10 22 9)) => (3 10 22 9)

    The circular-list case may be viewed as "rotating" the list.

span   pred clist -> [list clist]
span!  pred list  -> [list list]
break  pred clist -> [list clist]
break! pred list  -> [list list]
    SPAN splits the list into the longest initial prefix whose elements all
    satisfy PRED, and the remaining tail. BREAK inverts the sense of the
    predicate: the tail commences with the first element of the input list
    that satisfies the predicate.

    In other words: SPAN finds the initial span of elements satisfying PRED,
    and BREAK breaks the list at the first element satisfying PRED.

    SPAN is equivalent to
        (VALUES (TAKE-WHILE PRED CLIST) (DROP-WHILE PRED CLIST)).

    SPAN! and BREAK! are the linear-update variants. They are allowed, but
    not required, to alter the argument list to produce the result.

        (span even? '(2 18 3 10 22 9)) =>
            (2 18)
            (3 10 22 9)

        (break even? '(3 1 4 1 5 9)) =>
            (3 1)
            (4 1 5 9)

any pred clist1 clist2 ...
-> value
    Applies the predicate across the lists, returning true if the predicate
    returns true on any application.

    If there are N list arguments CLIST1 ... CLISTn, then PRED must be a
    procedure taking N arguments and returning a boolean result.

    ANY applies PRED to the first elements of the CLISTi parameters. If this
    application returns a true value, ANY immediately returns that value.
    Otherwise, it iterates, applying PRED to the second elements of the
    CLISTi parameters, then the third, and so forth. The iteration stops
    when a true value is produced or one of the lists runs out of values; in
    the latter case, ANY returns #F. The application of PRED to the last
    element of the lists is a tail call.

    Note the difference between FIND and ANY -- FIND returns the element
    that satisfied the predicate; ANY returns the true value that the
    predicate produced.

    Like EVERY, ANY's name does not end with a question mark -- this is to
    indicate that it does not return a simple boolean (#T or #F), but a
    general value.

        (any integer? '(a 3 b 2.7)) => #T
        (any integer? '(a 3.1 b 2.7)) => #F
        (any < '(3 1 4 1 5)
               '(2 7 1 8 2)) => #T

every pred clist1 clist2 ... -> value
    Applies the predicate across the lists, returning true if the predicate
    returns true on every application.

    If there are N list arguments CLIST1 ... CLISTn, then PRED must be a
    procedure taking N arguments and returning a boolean result.

    EVERY applies PRED to the first elements of the CLISTi parameters. If
    this application returns false, EVERY immediately returns false.
    Otherwise, it iterates, applying PRED to the second elements of the
    CLISTi parameters, then the third, and so forth. The iteration stops
    when a false value is produced or one of the lists runs out of values.
    In the latter case, EVERY returns the true value produced by its final
    application of PRED. The application of PRED to the last element of the
    lists is a tail call.

    If one of the CLISTi has no elements, EVERY simply returns #T.
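    The rules for EVERY stated so far can be illustrated for the single-list
    case with a small sketch. SKETCH-EVERY is a hypothetical name, written
    here with R5RS primitives only; it is not the reference implementation
    and does not handle the n-ary case.

```scheme
;; Single-list sketch: returns #t for the empty list, #f as soon as
;; PRED fails, and otherwise whatever PRED returns on the last element
;; (that final application is a tail call).
(define (sketch-every pred lis)
  (or (null? lis)
      (let lp ((lis lis))
        (if (null? (cdr lis))
            (pred (car lis))
            (and (pred (car lis))
                 (lp (cdr lis)))))))

(sketch-every even? '(2 4 6))                          ; => #t
(sketch-every even? '(2 3 6))                          ; => #f
(sketch-every even? '())                               ; => #t
(sketch-every (lambda (x) (and (even? x) x)) '(2 4 6)) ; => 6
```

    The last call shows why the result is described as a general value
    rather than a simple boolean.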
    Like ANY, EVERY's name does not end with a question mark -- this is to
    indicate that it does not return a simple boolean (#T or #F), but a
    general value.

list-index pred clist1 clist2 ... -> integer or false
    Return the index of the leftmost element that satisfies PRED.

    If there are N list arguments CLIST1 ... CLISTn, then PRED must be a
    function taking N arguments and returning a boolean result.

    LIST-INDEX applies PRED to the first elements of the CLISTi parameters.
    If this application returns true, LIST-INDEX immediately returns zero.
    Otherwise, it iterates, applying PRED to the second elements of the
    CLISTi parameters, then the third, and so forth. When it finds a tuple
    of list elements that cause PRED to return true, it stops and returns
    the zero-based index of that position in the lists.

    The iteration stops when one of the lists runs out of values; in this
    case, LIST-INDEX returns #F.

        (list-index even? '(3 1 4 1 5 9)) => 2
        (list-index < '(3 1 4 1 5 9 2 5 6) '(2 7 1 8 2)) => 1
        (list-index = '(3 1 4 1 5 9 2 5 6) '(2 7 1 8 2)) => #f

member x list [=] -> list or #f                                     R5RS+
memq   x list     -> list or #f                                     R5RS
memv   x list     -> list or #f                                     R5RS
    These procedures return the first sublist of LIST whose car is X, where
    the sublists of LIST are the non-empty lists returned by (DROP LIST I)
    for I less than the length of LIST. If X does not occur in LIST, then #f
    is returned. MEMQ uses EQ? to compare X with the elements of LIST, while
    MEMV uses EQV? and MEMBER uses EQUAL?.

        (memq 'a '(a b c)) ==> (a b c)
        (memq 'b '(a b c)) ==> (b c)
        (memq 'a '(b c d)) ==> #f
        (memq (list 'a) '(b (a) c)) ==> #f
        (member (list 'a)
                '(b (a) c)) ==> ((a) c)
        (memq 101 '(100 101 102)) ==> *unspecified*
        (memv 101 '(100 101 102)) ==> (101 102)

    MEMBER is extended from its R5RS definition to allow the client to pass
    in an optional equality procedure = used to compare keys.

    The comparison procedure is used to compare the elements Ei of LIST to
    the key X in this way:
        (= X Ei) ; list is (E1 ...
                                    En)
    That is, the first argument is always X, and the second argument is one
    of the list elements. Thus one can reliably find the first element of
    LIST that is greater than five with
        (member 5 LIST <)

    Note that fully general list searching may be performed with the
    FIND-TAIL and FIND procedures, e.g.
        (find-tail even? list) ; Find the first elt with an even key.

** Deletion
===========
delete  x list [=] -> list
delete! x list [=] -> list
    DELETE uses the comparison procedure =, which defaults to EQUAL?, to
    find all elements of LIST that are equal to X, and deletes them from
    LIST. The dynamic order in which the various applications of = are made
    is not specified.

    The list is not disordered -- elements that appear in the result list
    occur in the same order as they occur in the argument list. The result
    may share a common tail with the argument list.

    Note that fully general element deletion can be performed with the
    REMOVE and REMOVE! procedures, e.g.:
        ;; Delete all the even elements from LIS:
        (remove even? lis)

    The comparison procedure is used in this way:
        (= X Ei)
    That is, X is always the first argument, and a list element is always
    the second argument. The comparison procedure will be used to compare
    each element of LIST exactly once; the order in which it is applied to
    the various Ei is not specified. Thus, one can reliably remove all the
    numbers greater than five from a list with
        (delete 5 list <)

    DELETE! is the linear-update variant of DELETE. It is allowed, but not
    required, to alter the cons cells in its argument list to construct the
    result.

delete-duplicates  list [=] -> list
delete-duplicates! list [=] -> list
    DELETE-DUPLICATES removes duplicate elements from the list argument. If
    there are multiple equal elements in the argument list, the result list
    contains only the first (leftmost) of these elements.
    The order of these surviving elements is the same as in the original
    list -- DELETE-DUPLICATES does not disorder the list (hence it is useful
    for "cleaning up" association lists).

    The = parameter is used to compare the elements of the list; it defaults
    to EQUAL?. If X comes before Y in LIST, then the comparison is performed
        (= X Y)
    The comparison procedure will be used to compare each pair of elements
    in LIST no more than once; the order in which it is applied to the
    various pairs is not specified.

    Implementations of DELETE-DUPLICATES are allowed to share common tails
    between argument and result lists -- for example, if the list argument
    contains only unique elements, it may simply return exactly this list.

    Be aware that, in general, DELETE-DUPLICATES runs in time O(n^2) for
    N-element lists. Uniquifying long lists can be accomplished in
    O(n lg n) time by sorting the list to bring equal elements together,
    then using a linear-time algorithm to remove equal elements.
    Alternatively, one can use algorithms based on element-marking, with
    linear-time results.

    DELETE-DUPLICATES! is the linear-update variant of DELETE-DUPLICATES; it
    is allowed, but not required, to alter the cons cells in its argument
    list to construct the result.

        (delete-duplicates '(a b a c a b c z)) => (a b c z)

        ;; Clean up an alist:
        (delete-duplicates '((a . 3) (b . 7) (a . 9) (c . 1))
                           (lambda (x y) (eq? (car x) (car y))))
            => ((a . 3) (b . 7) (c . 1))

** Association lists
====================
An "association list" (or "alist") is a list of pairs. The car of each pair
contains a key value, and the cdr contains the associated data value. They
can be used to construct simple look-up tables in Scheme. Note that
association lists are probably inappropriate for performance-critical use on
large data; in these cases, hash tables or some other alternative should be
employed.
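As a small illustration of an alist used as a look-up table, the following
sketch uses only the R5RS procedure ASSQ. The table contents and the names
PRICES and PRICES2 are made up for the example.

```scheme
;; A hypothetical table mapping symbols to numbers. The key APPLE
;; appears twice; look-up finds the leftmost entry.
(define prices '((apple . 3) (pear . 7) (apple . 9)))

(cdr (assq 'apple prices))   ; => 3
(assq 'plum prices)          ; => #f  (no such key)

;; Consing a new entry onto the front shadows any older entry with
;; the same key, without altering the original alist.
(define prices2 (cons (cons 'pear 1) prices))

(cdr (assq 'pear prices2))   ; => 1
(cdr (assq 'pear prices))    ; => 7
```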
assoc key alist [=] -> pair or #f                                   R5RS+
assq  key alist     -> pair or #f                                   R5RS
assv  key alist     -> pair or #f                                   R5RS
    ALIST must be an association list -- a list of pairs. These procedures
    find the first pair in ALIST whose car field is KEY, and return that
    pair. If no pair in ALIST has KEY as its car, then #f is returned. ASSQ
    uses EQ? to compare KEY with the car fields of the pairs in ALIST, while
    ASSV uses EQV? and ASSOC uses EQUAL?.

        (define e '((a 1) (b 2) (c 3)))
        (assq 'a e) ==> (a 1)
        (assq 'b e) ==> (b 2)
        (assq 'd e) ==> #f
        (assq (list 'a) '(((a)) ((b)) ((c)))) ==> #f
        (assoc (list 'a) '(((a)) ((b)) ((c)))) ==> ((a))
        (assq 5 '((2 3) (5 7) (11 13))) ==> *unspecified*
        (assv 5 '((2 3) (5 7) (11 13))) ==> (5 7)

    ASSOC is extended from its R5RS definition to allow the client to pass
    in an optional equality procedure = used to compare keys.

    The comparison procedure is used to compare the elements Ei of ALIST to
    the KEY parameter in this way:
        (= KEY (CAR Ei)) ; list is (E1 ... En)
    That is, the first argument is always KEY, and the second argument is
    one of the list elements. Thus one can reliably find the first entry of
    ALIST whose key is greater than five with
        (assoc 5 ALIST <)

    Note that fully general alist searching may be performed with the
    FIND-TAIL and FIND procedures, e.g.
        ;; Look up the first association in ALIST with an even key:
        (find (lambda (a) (even? (car a))) alist)

alist-cons key datum alist -> alist
        (lambda (key datum alist) (cons (cons key datum) alist))
    Cons a new alist entry mapping KEY -> DATUM onto ALIST.

alist-copy alist -> alist
    Make a fresh copy of ALIST. This means copying each pair that forms an
    association as well as the spine of the list, i.e.
        (lambda (a) (map (lambda (elt) (cons (car elt) (cdr elt))) a))

alist-delete  key alist [=] -> alist
alist-delete! key alist [=] -> alist
    ALIST-DELETE deletes all associations from ALIST with the given KEY,
    using key-comparison procedure =, which defaults to EQUAL?.
    The dynamic order in which the various applications of = are made is not
    specified.

    Return values may share common tails with the ALIST argument. The alist
    is not disordered -- elements that appear in the result alist occur in
    the same order as they occur in the argument alist.

    The comparison procedure is used to compare the element keys Ki of
    ALIST's entries to the KEY parameter in this way:
        (= KEY Ki)
    Thus, one can reliably remove all entries of ALIST whose key is greater
    than five with
        (alist-delete 5 alist <)

    ALIST-DELETE! is the linear-update variant of ALIST-DELETE. It is
    allowed, but not required, to alter the cons cells from the ALIST
    parameter to construct the result.

** Set operations on lists
==========================
These procedures implement operations on sets represented as lists of
elements. They all take an = argument used to compare elements of lists.
This equality procedure is required to be consistent with EQ?. That is, it
must be the case that
    (eq? x y) => (= x y).
Note that this implies, in turn, that two lists that are EQ? are also
set-equal by any legal comparison procedure. This allows for constant-time
determination of set operations on EQ? lists.

Be aware that these procedures typically run in time O(n * m) for N- and
M-element list arguments. Performance-critical applications operating upon
large sets will probably wish to use other data structures and algorithms.

lset<= = list1 ... -> boolean
    Returns true iff every LISTi is a subset of LISTi+1, using = for the
    element-equality procedure. List A is a subset of list B if every
    element in A is equal to some element of B. When performing an element
    comparison, the = procedure's first argument is an element of A; its
    second, an element of B.

        (lset<= eq? '(a) '(a b a) '(a b c c)) => #t

        (lset<= eq?) => #t ; Trivial cases
        (lset<= eq? '(a)) => #t

lset= = list1 ... -> boolean
    Returns true iff every LISTi is set-equal to LISTi+1, using = for the
    element-equality procedure.
    "Set-equal" simply means that LISTi is a subset of LISTi+1, and LISTi+1
    is a subset of LISTi.

        (lset= eq? '(b e a) '(a e b) '(e e b a)) => #t

        (lset= eq?) => #t ; Trivial cases
        (lset= eq? '(a)) => #t

lset-adjoin = list elt1 ... -> list
    Adds the ELTi elements not already in the list parameter to the result
    list. The result shares a common tail with the list parameter. The new
    elements are added to the front of the list, but no guarantees are made
    about their order. The = parameter is an equality procedure used to
    determine if an ELTi is already a member of LIST. Its first argument is
    an element of LIST; its second is one of the ELTi.

    The list parameter is always a suffix of the result -- even if the list
    parameter contains repeated elements, these are not reduced.

        (lset-adjoin eq? '(a b c d c e) 'a 'e 'i 'o 'u) => (u o i a b c d c e)

lset-union = list1 ... -> list
    Returns the union of the lists, using = for the element-equality
    procedure.

    The union of lists A and B is constructed as follows:
    - If A is the empty list, the answer is B (or a copy of B).
    - Otherwise, the result is initialised to be list A (or a copy of A).
    - Proceed through the elements of list B in a left-to-right order. If b
      is such an element of B, compare every element r of the current result
      list to b: (= r b). If all comparisons fail, b is consed onto the
      front of the result.
    However, there is no guarantee that = will be applied to every pair of
    arguments from A and B. In particular, if A is EQ? to B, the operation
    may immediately terminate.

    In the n-ary case, the two-argument list-union operation is simply
    folded across the argument lists.

        (lset-union eq? '(a b c d e) '(a e i o u)) => (u o i a b c d e)

        ;; Repeated elements in LIST1 are preserved.
        (lset-union eq? '(a a c) '(x a x)) => (x a a c)

        (lset-union eq?) => () ; Trivial cases
        (lset-union eq? '(a b c)) => (a b c)

lset-intersection = list1 list2 ... -> list
    Returns the intersection of the lists, using = for the element-equality
    procedure.
    The intersection of lists A and B consists of every element of A that
    is = to some element of B: (= a b), for a in A, and b in B. Note this
    implies that an element which appears in B and multiple times in list A
    will also appear multiple times in the result.

    The order in which elements appear in the result is the same as they
    appear in LIST1 -- that is, LSET-INTERSECTION essentially filters LIST1,
    without disarranging element order. The result may share a common tail
    with LIST1.

    In the n-ary case, the two-argument list-intersection operation is simply
    folded across the argument lists. However, the dynamic order in which the
    applications of = are made is not specified. The procedure may check an
    element of LIST1 for membership in every other list before proceeding to
    consider the next element of LIST1, or it may completely intersect LIST1
    and LIST2 before proceeding to LIST3, or it may go about its work in some
    third order.

        (lset-intersection eq? '(a b c d e) '(a e i o u)) => (a e)

        ;; Repeated elements in LIST1 are preserved.
        (lset-intersection eq? '(a x y a) '(x a x z)) => (a x a)

        (lset-intersection eq? '(a b c)) => (a b c)     ; Trivial case

lset-difference = list1 list2 ... -> list
    Returns the difference of the lists, using = for the element-equality
    procedure -- all the elements of LIST1 that are not = to any element from
    one of the other LISTi parameters.

    The = procedure's first argument is always an element of LIST1; its second
    is an element of one of the other LISTi. Elements that are repeated
    multiple times in the LIST1 parameter will occur multiple times in the
    result. The order in which elements appear in the result is the same as
    they appear in LIST1 -- that is, LSET-DIFFERENCE essentially filters
    LIST1, without disarranging element order. The result may share a common
    tail with LIST1.

    The dynamic order in which the applications of = are made is not specified.
    The procedure may check an element of LIST1 for membership in every other
    list before proceeding to consider the next element of LIST1, or it may
    completely compute the difference of LIST1 and LIST2 before proceeding to
    LIST3, or it may go about its work in some third order.

        (lset-difference eq? '(a b c d e) '(a e i o u)) => (b c d)

        (lset-difference eq? '(a b c)) => (a b c) ; Trivial case

lset-xor = list1 ... -> list
    Returns the exclusive-or of the sets, using = for the element-equality
    procedure. If there are exactly two lists, this is all the elements
    that appear in exactly one of the two lists. The operation is associative,
    and thus extends to the n-ary case -- the elements that appear in an
    odd number of the lists. The result may share a common tail with any of
    the LISTi parameters.

    More precisely, for two lists A and B, A xor B is a list of
      - every element a of A such that there is no element b of B
        such that (= a b)
      - every element b of B such that there is no element a of A
        such that (= b a)
    However, an implementation is allowed to assume that = is
    symmetric -- that is, that
        (= a b) => (= b a).
    This means, for example, that if a comparison (= a b) produces
    true for some a in A and b in B, both a and b may be removed from
    inclusion in the result.

    In the n-ary case, the binary-xor operation is simply folded across
    the lists.

        (lset-xor eq? '(a b c d e) '(a e i o u)) => (d c b i o u)

        ;; Trivial cases.
        (lset-xor eq?) => ()
        (lset-xor eq? '(a b c d e)) => (a b c d e)

lset-diff+intersection = list1 list2 ... -> [list list]
    Returns two values -- the difference and the intersection of the lists.
    Is equivalent to
        (values (lset-difference = list1 list2 ...)
                (lset-intersection = list1
                                     (lset-union = list2 ...)))
    but can be implemented more efficiently.

    The = procedure's first argument is an element of LIST1; its second
    is an element of one of the other LISTi.

    Either of the answer lists may share a common tail with LIST1.
    This operation essentially partitions LIST1.
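
    As a rough illustration of the partitioning behaviour of
    LSET-DIFF+INTERSECTION, the following sketch computes both answer lists
    in a single pass over LIST1, using only R5RS procedures. The names
    MY-MEMBER? and MY-DIFF+INTERSECTION are hypothetical, chosen so as not to
    clash with a real SRFI-1 binding. This is not the reference
    implementation: it always builds fresh lists and makes no attempt to
    share a tail with LIST1.

```scheme
;; Hypothetical helper (not part of SRFI-1): true iff X is = to some
;; element of LIS. X is passed as the first argument of =, as the spec
;; requires for elements of LIST1.
(define (my-member? = x lis)
  (let loop ((lis lis))
    (and (pair? lis)
         (or (= x (car lis))
             (loop (cdr lis))))))

;; Hypothetical one-pass sketch of the partition. Returns two values:
;; the elements of LIST1 found in none of LISTS, and the elements of
;; LIST1 found in at least one of LISTS. Element order is preserved.
(define (my-diff+intersection = list1 . lists)
  (let loop ((l list1) (diff '()) (int '()))
    (if (null? l)
        (values (reverse diff) (reverse int))
        (let ((x (car l)))
          (if (let in-some? ((ls lists))
                (and (pair? ls)
                     (or (my-member? = x (car ls))
                         (in-some? (cdr ls)))))
              (loop (cdr l) diff (cons x int))
              (loop (cdr l) (cons x diff) int))))))

;; Collect the two returned values into one list for display.
(call-with-values
  (lambda () (my-diff+intersection eq? '(a b c d e) '(a e i o u)))
  list)
;; => ((b c d) (a e))
```

    The two answers agree with the LSET-DIFFERENCE and LSET-INTERSECTION
    examples given for the same arguments. Every element of LIST1 lands in
    exactly one of the two answer lists.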

lset-union!             = list1 ... -> list
lset-intersection!      = list1 list2 ... -> list
lset-difference!        = list1 list2 ... -> list
lset-xor!               = list1 ... -> list
lset-diff+intersection! = list1 list2 ... -> [list list]
    These are linear-update variants. They are allowed, but not required,
    to use the cons cells in their first list parameter to construct their
    answer. LSET-UNION! is permitted to recycle cons cells from *any*
    of its list arguments.

** Primitive side-effects
=========================
These two procedures are the primitive, R5RS side-effect operations on pairs.

set-car! pair object -> unspecified                             R5RS
set-cdr! pair object -> unspecified                             R5RS
    These procedures store OBJECT in the car and cdr field of PAIR,
    respectively. The value returned is unspecified.

        (define (f) (list 'not-a-constant-list))
        (define (g) '(constant-list))
        (set-car! (f) 3) ==>  *unspecified*
        (set-car! (g) 3) ==>  *error*

* Acknowledgements
------------------
The design of this library benefited greatly from the feedback provided during
the SRFI discussion phase. Among those contributing thoughtful commentary and
suggestions, both on the mailing list and by private discussion, were Mike
Ashley, Darius Bacon, Alan Bawden, Phil Bewig, Jim Blandy, Dan Bornstein, Per
Bothner, Anthony Carrico, Doug Currie, Kent Dybvig, Sergei Egorov, Doug Evans,
Marc Feeley, Matthias Felleisen, Will Fitzgerald, Matthew Flatt, Dan Friedman,
Lars Thomas Hansen, Brian Harvey, Erik Hilsdale, Wolfgang Hukriede, Richard
Kelsey, Donovan Kolbly, Shriram Krishnamurthi, Dave Mason, Jussi Piitulainen,
David Pokorny, Duncan Smith, Mike Sperber, Maciej Stachowiak, Harvey J. Stein,
John David Stone, and Joerg F. Wittenberger. I am grateful to them for their
assistance.

I am also grateful to the authors, implementors and documentors of all the
systems mentioned in the introduction. Aubrey Jaffer and Kent Pitman should be
noted for their work in producing Web-accessible versions of the R5RS and
Common Lisp spec, which was a tremendous aid.

This is not to imply that these individuals necessarily endorse the final
results, of course.

* References & Links
--------------------
This document, in HTML:
    http://srfi.schemers.org/srfi-1/srfi-1.html
    ftp://ftp.ai.mit.edu/people/shivers/srfi/srfi-1/srfi-1.html (draft)

This document, in simple text format:
    http://srfi.schemers.org/srfi-1/srfi-1.txt
    ftp://ftp.ai.mit.edu/people/shivers/srfi/srfi-1/srfi-1.txt (draft)

Source code for the reference implementation:
    http://srfi.schemers.org/srfi-1/srfi-1-reference.scm
    ftp://ftp.ai.mit.edu/people/shivers/srfi/srfi-1/srfi-1-reference.scm (draft)

Archive of SRFI-1 discussion-list email:
    http://srfi.schemers.org/srfi-1/mail-archive/maillist.html

SRFI web site:
    http://srfi.schemers.org/

[CommonLisp]
    Common Lisp: the Language
    Guy L. Steele Jr. (editor).
    Digital Press, Maynard, Mass., second edition 1990.
    Available at http://www.elwood.com/alu/table/references.htm#cltl2

    The Common Lisp "HyperSpec," produced by Kent Pitman, is essentially
    the ANSI spec for Common Lisp:
    http://www.harlequin.com/education/books/HyperSpec/

[R5RS]
    Revised^5 Report on the Algorithmic Language Scheme,
    R. Kelsey, W. Clinger, J. Rees (editors).
    Higher-Order and Symbolic Computation, Vol. 11, No. 1, September, 1998.
    and ACM SIGPLAN Notices, Vol. 33, No. 9, October, 1998.
    Available at http://www.schemers.org/Documents/Standards/

* Copyright
-----------
Certain portions of this document -- the specific, marked segments of text
describing the R5RS procedures -- were adapted with permission from the R5RS
report.

All other text is copyright (C) Olin Shivers (1998, 1999).
All Rights Reserved.

This document and translations of it may be copied and furnished to others, and
derivative works that comment on or otherwise explain it or assist in its
implementation may be prepared, copied, published and distributed, in whole or
in part, without restriction of any kind, provided that the above copyright
notice and this paragraph are included on all such copies and derivative
works. However, this document itself may not be modified in any way, such as by
removing the copyright notice or references to the Scheme Request For
Implementation process or editors, except as needed for the purpose of
developing SRFIs in which case the procedures for copyrights defined in the
SRFI process must be followed, or as required to translate it into languages
other than English.

The limited permissions granted above are perpetual and will not be revoked by
the authors or their successors or assigns.

This document and the information contained herein is provided on an "AS IS"
basis and THE AUTHORS AND THE SRFI EDITORS DISCLAIM ALL WARRANTIES, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

* Ispell "buffer local" dictionary
----------------------------------
Ispell dumps "buffer local" words here. Please ignore.

 LocalWords: RS SRFI Chez RScheme MzScheme slib Bigloo APL SML API CDR GC's Ei
 LocalWords: EQ consing lib xcons unzip del delq delv mem lset lset xor diff lp
 LocalWords: alist assq assv assoc cdr cdddar cddddr ref memq memv george iff
 LocalWords: proc lis accessor ary TAIL's NCONS EQV rcons Contrariwise clist
 LocalWords: paribus lexeme parallelise Destructuring init FP flist eof CLISTn
 LocalWords: generalisation elt cadr caddr rev kons knil len rzero LZERO Ki Ith
 LocalWords: arg LISTi pred cond LISTn ANY's EVERY's Uniquifying lg ridentity
 LocalWords: eq netnews generalise Maciej Stachowiak al Bewig LocalWords ELTi
 LocalWords: anamorphism apomorphism CLISTi ALIST's url ceteris eltn caar KNULL
 LocalWords: deconstructor RIGHT's KAR KDR kar kdr knull HTML CLtL Clinger gen
 LocalWords: Rees Bawden Blandy Bornstein Bothner Carrico Currie Dybvig expt
 LocalWords: Egorov Feeley Matthias Felleisen Flatt Hilsdale Hukriede CLISTs
 LocalWords: Kolbly Shriram Krishnamurthi Jussi Piitulainen Pokorny Joerg Todo
 LocalWords: Sperber Wittenberger documentors Jaffer initialised consed IE
 LocalWords: disarranging SIGPLAN CommonLisp cltl HyperSpec