BAT
Bit-level Analysis Tool, version 0.2

Documentation

Installing BAT

Installing BAT is relatively simple.
  1. Download BAT. You can get it here.
  2. Decompress the tarball using the command "tar xzf".
  3. Put the resulting "bat/" directory wherever you would like. You can move the files out of the bat directory, but you must put them all in the same directory.
  4. run the "bat" executable from the directory.

Command Line Arguments

The correct command line usage for BAT is
bat [-solver solver | -/+a | -/+u | -/+r | -/+s]* file

The -solver option tells BAT which solver to use. You can specify minisat, siege, zchaff, or none. Unless you choose none, the solver you choose must be in your path under its original executable name. That is, to use minisat, you must have the executable minisat in your path, to use siege, you need to have the executable siege_v4 in your path, and to use zchaff, you must have the executable zchaff in your path. The none option writes the CNF file and then stops.

The +a turns on a technique called and-propagation, whereas -a turns it off. By default, it is turned off.

The -/+u options turn the uniqueness techniques off/on. By default they are on. These are techniques for hashing terms to increase sharing of terms in the formula, which decreases the resulting CNF problem.

The -/+r options turn the rewriting techniques off/on. They are on by default. These techniques increase term sharing and improve the memory reduction technique.

The -/+s flags turn off/on the option of using the SatELite CNF preprocessor. SatELite effectively reduces the CNF problem and makes it simpler for the SAT solver. However, SatELite contains some bugs, so we turn it off by default.

Finally, the file argument is the file describing the problem you want BAT to solve. The format for this file is described in the next section.

File Format

BAT takes files in one of 3 formats. The first is a machine description for bounded model checking. A file in this format contains 3 items. The first is the keyword ":machine" (without the quotes). The second is the machine description (described below). The third is a natural number which represents the number of steps you want BAT to check the property for.

The other two formats are very similar. They are used to check if a formula holds for some values of the variables (existential), or if a formula holds for all values of the variables (universal). These files contain 4 items. The first is either ":exists" or ":forall" (without the quotes). The next is a list of variable declarations for the formula. The third is a list of function definitions for use in the formula (this can be () if there are no functions). The final argument is the formula itself, which is over the variables and functions declared earlier in the file.

For examples of all these formats, see the lisp files for download on our examples page.

Language Introduction

BAT takes in a machine description and LTL specification, and tries to find a counterexample in the number of steps provided by the user. The machine specification has 4 required sections, :vars, :init, :trans, and :spec. These correspond to the declaration of the variables making up the machine state, a boolean formula describing valid initial states, a boolean formula describing the transition relation, and an LTL formula giving the desired formula, respectively.

The BAT language is strongly typed. Variables are either bit vectors or memories. The :vars section is a list of variable declarations that specify the types of each variable. Each variable declaration is either

A :vars section then looks like this: (:vars (x 2) y (z 8 16)). In addition to variables, there are bit vector and integer constants. Bit vectors can be given in binary, hex, or octal. The binary representation is written as 0b followed by a sequence of 0s and 1s. Likewise, hex numbers are written as 0x followed by a sequence of hex digits, and octal numbers are written as 0o followed by octal digits. The type of these is simply the number of binary digits needed to represent the binary, hex, or octal numbers. The most significant digit is always on the left. Hence 0b000011110011, 0x0F3, 0o0363 are all the same 12-bit bit vector.

Integers are represented by signed bit vectors. The size of the bit vector is determined by BAT's unique type-inferencing mechanism. The appropriate size is determined by the context in which the integer is used. For example, if x is a 4-bit bit vector, then if we bitwise-and it with 3, it is written as (and x 3). Then in this context, 3 is represented by the bit-vector 0b0011, since bit-vectors that are bitwise-anded together must be of the same type. The only restriction in this case is that the integer must be representable in signed binary notation (2's compliment) in the number of bits dictated by the context. For example, (and x 8) would not be allowed, since 8 needs 5 bits to be represented in 2's compliment (0b0100). In addition, users can specify integers to be turned into their unsigned representation by following the integer with a u. That is, 8 would be turned into 0b01000 in its minimum representation, but 8u could be turned into 0b1000.

Boolean Formulas

The other 3 sections contain boolean formulas. These are given in S-Expression format. There are a variety of useful built-in functions for building up these formulas. All the basic bitwise boolean functions are represented. The functions and, or, and xor all take an arbitrary number of arguments and perform the appropriate operations. In addition, -> (implication), and <-> (iff) take exactly two arguments. The not function takes exactly one argument. All of these functions take bit-vectors of the same size, and return a bit vector of that size.

There are also arithmetic operations, such as =, <, >, <= (less than or equal to), >= (greater than or equal to), add, sub, +, -, inc, and dec. The + function takes a list of bit vectors of the same size and returns a bit vector that contains the ceiling of the log of the number of arguments to the addition more than the size of the inputs. For example, if a, b, c, d, e, and f are 4-bit bit vectors, (+ a b) returns a 5-bit vector, (+ a b c d) returns a 6-bit vector, and (+ a b c d e f) returns a 7-bit vector. The + function returns the sum of the numbers given, just as one would expect from integer addition. The add function, on the other hand, performs modular addition and adds the carry-out bit to the front of the result. The - and sub functions work in the same way. The inc and dec functions simply add or subtract 1 from the input (using + and - respectively). The =, <, >, <=, and >= functions all return 1 bit that is 1 if the given relation is true given a 2's compliment interpretation of the values of bit vectors, and 0 otherwise.

BAT contains bit vector related functions as well. These include >>, <<, >>>, <<<, cat, and ext. The >> and << functions perform right and left shifts, respectively. They take a bit-vector and a constant integer and shift the bit-vector by the number of bits specified, padding the rest with 0s. The <<< and >>> functions are similar, but perform circular shifts. The bit and bits functions get segments of bit vectors. The bit function takes in a bit vector and a natural number less than the number of bits in the vector, and return the bit corresponding to that number, where bit 0 is the least significant digit. The bits function takes a vector and two natural numbers, i and j, such that 0 <= i < j and j is less than the size of the bit vector. It returns bits i to j (inclusive) of the bit vector. It is also possible to get a single bit or a sequence of bits from a variable without the bit or bits function by using the variable as the function name. For example, (a 3) gets bit 3 from bit vector a, and (a 3 6) gets bits 3 to 6 of variable a. The cat function concatenates bit vectors, returning a bit vector with size equal to the sum of the inputs to the cat function. Again, the most significant bits are to the left so the earlier arguments to the cat formula are more significant than the later arguments. The ext function is signed vector extension. It takes a bit vector and an integer that must be greater than the size of the vector and it returns a new vector of the given size whose most significant digits are padded with the high order bit of the vector. So (ext 0b1100 6) would return 0b111100.

Conditional statements include if and cond. The if takes three arguments. The first is the test and must be a bit vector of 1 bit. The second and third arguments are the then and else clauses respectively and must be the same type. The cond works like a cond in Lisp. It takes a list of 2 element lists. The first element of each of these lists must be a 1-bit bit vector, and all of the second arguments must be of the same type. The semantics of a cond are like a series of if-then-else clauses. For example, consider the following.

(cond ((< x y) -1)
      ((> x y) 1)
      (0b1     0))

Here, if x is less than y, then -1 is returned. Otherwise, if x is greater than y, 1 is returned. If neither of these tests is true, then 0 is returned. If none of the tests is true, then the 0 bit-vector of the appropriate size is returned.

In addition, there are foldl and foldr functions that take the name of a binary function that takes in and returns 1-bit bit vectors, and a bit vector and returns the result of applying the function to all the bits of the bit vector, associating to the left or right respectively.

Memories

Memories are treated in a special way in the BAT language, which allows us to greatly reduce the size of the memories when converting the BAT specification to SAT. Memories are therefore restricted to the functions get, set, =, and if. The get function takes a memory and a bit vector and returns the word of the memory addressed by the bit vector. The set function takes a memory and two bit vectors. It returns a memory equivalent to the original memory except that the word addressed by the first bit vector is set to the value of the second bit vector. In both cases the addresses must have the number of bits equal to the log of the number of words in the memory, and in the case of the set the last argument must be the same size as the words of the memory. Memory equality can be tested by = if the two memories being compared have the same word size and number of words. Likewise, an if can have memory-typed then and else clauses as long as their types match up.

Multiple Value Return

BAT provides a way to return multiple values from an expression (this becomes helpful in conjunction with user-defined functions). This is done simply by wrapping a sequence of values in an mv form:

(mv (+ a b) (set m x y))

This returns both the sum and the result of the set form.

Local

The most complex structure of the BAT language is local. In its simplest form, it operates like a let* in Lisp. For example,

(local ((nsel0 (not (sel 0)))
        (nsel1 (not (sel 1)))
        (v0 (and in0 nsel0 nsel1))
        (v1 (and in1 (sel 0) nsel1))
        (v2 (and in2 nsel0 (sel 1)))
        (v3 (and in3 (sel 0) (sel 1))))
   (or v0 v1 v2 v3))

In this example, the first argument to the local is a list of variable bindings, and the second is the body of the local. So, nsel0 is set to be the negation of the 0th bit of sel. Likewise nsel1 is set to the negation of the 1st bit of sel. Then v0 through v3 are set to different combinations of the bits of sel, in0 through in3 and their negation. Finally, the disjunction of v0 through v3 is returned. This is the implementation of a 4-bit mux, with inputs in0 through in3 and selector input sel. If the type of a value to which a variable is being bound cannot be determined by the context, the type must be given. Such a binding would look like: (x 4 8) this would bind the variable x to be the 4-bit vector 0b0100.

The following implementation of an ALU slice demonstrates one of the more complex abilities of the local.

(local ((nb (xor bnegate b))
        (res0 (and a nb))
        (res1 (or a nb))
        (((cout 1) (res2 1)) (fa a nb cin)))
   (cat cout (mux-4 res0 res1 res2 1u op))))

Here, the last binding binds variables cout and res2 simultaneously. It declares each to be 1 bit, and binds them to the 2-bit output of the fa function (a user-defined function). This splits up the output of the fa function between cout and res2 according to their sizes. Another feature of the local is illustrated by the following.

(local ((c 2))
       (((t0 (c 0))
         (alu-slice (a 0) (b 0) bnegate bnegate op))
        ((t1 (c 1))
         (alu-slice (a 1) (b 1) t0 bnegate op))
        (zero (= c 0)))
    (cat t1 c zero))

Here an extra argument appears at the beginning of the local. This is a list of bit vector variable declarations. The idea is that these variables can be bound by bits and pieces through the bindings. The first binding binds several values, as in the last example. However, in this example the second value being bound is not a variable, but a bit of the variable, c, declared in the first argument to the local. Likewise, the other bit of c is set in the second binding. It is also possible to set a sequence of bits in a similar way by giving 2 integers: ((c 0 1) (and a b)).

Finally, it is possible to set multiple values to the result of an mv form:

(local ((aa mm) (mv (inc a) (set m a b)))
    (set mm c aa))

Here the types of the variables being bound are inferred from the type of the mv form.

Temporal Operators

In order to express transition relations and LTL specifications, BAT has a next operator, which can be applied only to variables. Examples include (= (next m) (set m a b)), (= (next a) (and a b)), and (< (next a) a).

In addition, BAT allows the LTL operators AG and AF for always and sometimes. For example (AG (< a (next a))) says that the next value of a is always greater than the previous value.

User Defined Functions

In addition to the :vars, :init, :trans, and :spec sections of a specification, the user can define his or her own functions in the :functions section. Consider the following example.

((:functions (maj (1) ((a 1) (b 1) (c 1))
                  (or (and a b) (and b c) (and a c)))
             (fa (2)
                 ((a 1) (b 1) (cin 1)) (cat (maj a b cin) (xor a b cin)))
             (mux-4 (1) ((in0 1) (in1 1) (in2 1) (in3 1) (sel 2))
                    (local ((nsel0 (not (sel 0)))
                            (nsel1 (not (sel 1)))
                            (v0 (and in0 nsel0 nsel1))
                            (v1 (and in1 (sel 0) nsel1))
                            (v2 (and in2 nsel0 (sel 1)))
                            (v3 (and in3 (sel 0) (sel 1))))
                       (or v0 v1 v2 v3)))
             (alu-slice (2) ((a 1) (b 1) (cin 1) (bnegate 1) (op 2))
                        (local ((nb (xor bnegate b))
                                (res0 (and a nb))
                                (res1 (or a nb))
                                (((cout 1) (res2 1)) (fa a nb cin)))
                           (cat cout (mux-4 res0 res1 res2 1u op))))
             (alu-2-bit (4) ((a 2) (b 2) (bnegate 1) (op 2))
                        (local ((c 2))
                               (((t0 (c 0))
                                 (alu-slice (a 0) (b 0) bnegate bnegate op))
                                ((t1 (c 1))
                                 (alu-slice (a 1) (b 1) t0 bnegate op))
                                (zero (= c 0)))
                            (cat t1 c zero))))
 (:vars (i1 2) (i2 2) (bn 1) (op 2) (out 2) (cout 1) (zero 1))
 (:init (and (= out 0) (= cout 1b1) (= zero 1b0)))
 (:trans (= (cat (next cout) (next out) (next zero)) (alu-2-bit i1 i2 bn op)))
 (:spec (AG (<-> zero (not cout)))))

This example describes an ALU. We have covered pieces of this code in previous examples. The functions section takes a list of function definitions. In this example, we define 5 functions. A function definition starts with the function name. Our first function is called maj. The second element in the definition is the type. This is a list containing 1 positive integer for a bit vector function (maj, for example returns a 1-bit bit vector), 2 positive integers if the return type is a memory, and a list of 1 integer lists and 2 integer lists if there are multiple values returned. For example ((1) (8 4)) would specify that the function returns a 1-bit bit vector and a memory with 8 4-bit words.

The third part of a function definition is its arguments. This is just a list of variable definitions just like the ones in the :vars section. In the case of maj, the inputs are 3 1-bit bit vectors, a, b, and c. The final element of a function definition is the body. Its return type must be compatible with that of the function.

In our example, the maj function calculates the majority function on the inputs, fa is a full adder that returns the carry out bit and the sum bit, alu slice is a 1-bit piece of the alu, and alu-2-bit is the 2-bit alu constructed out of the alu slices. The :vars section includes inputs i1 and i2, an output out, the negation bit (whether i2 should be negated in the case of addition), the operation code, the carry out bit, and the zero bit. The :init specifies that initial states must have an out of 0, a cout of 1, and a zero of 0. The transition relation sets the next values of cout, out, and zero according to the outputs of the alu-2-bit function. The :spec checks that zero is true if and only if cout is not.

Constants

In order to ease parameterization, BAT provides the ability to define constants that can then be used throughout the definition. This is done in the :constants section, which contains a list of constant declarations. Each constant declaration is a list of 2 elements. The first is the constant name and the second is the constant value. This can be a bit vector, signed integer, or unsigned integer. These can be used anywhere where such values can be used. For example, you can include the declaration (sh 4), and then anywhere in the spec, you can write (<< a sh). This will shift a by 4 bits. You can also declare a function to be of type (sh). This means that the function returns a 4-bit value. This allows for easy parameterization. You can create a constant to represent the word size of a processor design, (ws 4), and then use ws throughout their processor specification. Then, by changing ws to 32, the word size goes from 4 to 32.

Definitions

Definitions give the user the ability to define values in terms of the initial values of the variables in the :vars section that do not change through the operation of the machine. For example consider the following:

((:constants (ws 8) (dws 7) (as-out 17))
 (:functions (adder-step (ws) (i1 ws) (i2 ws) ...))
 (:vars (x ws) (y ws) done)
 (:definitions (z (bits (+ x y) 0 dws))
 (:init (= done 0b0))
 (:trans (= (cat (next x) (next y) (next done))
            (addr-step x y)))
 (:spec (AG (-> done (= y z)))))

Suppose the user defined a new adder that took multiple steps to complete. At each step, the adder alters x and y and returns their new values along with a Boolean specifying whether the addition has been completed. We want to tell if this new adder does the same thing as + does, but we need to know that the final answer is the sum in terms of the original values of x and y, not their values as they are altered by adder-step. The way we do this is by defining z to be the sum of x and y (chopping off the extra bit added on by addition) in the :definitions section. This says that z is the sum of the original values of x and y. Then in the end we compare the final value of y (where the sum is put in the end) to the value of z.

Primitives

This section describes all the primitive functions and operators of the BAT specification language. We use a simple notation for describing these primitives. The notation used is as follows. Tt is a BAT term or expression, which can be either a bit vector or a bit vector memory. t indicates the type of the term. If t is of the form ⟨bv, n⟩, then it indicates that the term is a bit vector having n bits. Note that the term can also be a constant whose value can be represented by an n bit vector. If t is of the form ⟨mem, s, n⟩, then it indicates that the term is a bit vector memory, where s is the word size of the memory and n is the number of words in the memory. D is used to represent decimal constants.