From blgrise@mediaone.net Thu Dec 14 22:18:33 2000
Message-ID: <3A398E30.1EECC288@mediaone.net>
Date: Thu, 14 Dec 2000 22:21:20 -0500
From: Bud and Lori Grise
To: Karl Lieberherr
CC: blgrise@mediaone.net
Subject: COM3360 project
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="------------6974F9DDE22646390F225ACF"

This is a multi-part message in MIME format.
--------------6974F9DDE22646390F225ACF
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Project Overview
================
The purpose of this project was to rewrite an existing program using
adaptive methods. The program is used to make a personal financial plan
or budget with the goal of purchasing a home. The input to the program
is a file containing a scripting language that defines variables and
actions on those variables. The output of the program is a table
showing a month-by-month projection of various variable values. Given a
set of requirements for a home and mortgage, such as price and interest
rate, the program can predict when the home will become affordable.

This project description also includes implementation notes and a
comparison of the original code to the new code.

Program Behavior
================
There are five main concepts in the program: Events, Statements,
Variables, Expressions, and Columns. The input file describes these and
how they relate.
When the simulation runs, it steps through the calendar one month at a
time. For each month it searches for events that occur on that date
and, for all such events, executes the statements associated with the
event. Then the simulation prints a row in a table containing the
current values of some variables, and advances to the next month.

Events
------
Events are used to define a point in time where something significant
occurs. The resolution of this "point in time" is one month, because
this is a convenient unit for budget planning. When an event occurs, the
set of statements associated with the event is executed.

Events can be defined so that they happen only once, at a specified
month and year. The month can be specified as a number in the range
1-12, or by using the conventional three-letter abbreviation. Events
can also be recurring, such as in the same month every year, or they
can occur every month. Finally, an event can be triggered on a given
event name. There are some predefined event names that the simulator
defines and triggers automatically:

    startup    - occurs in the first month of the simulation
    term_start - occurs when the mortgage is initiated
    term_end   - occurs when the mortgage is complete

You can also define your own named events. In this case the event name
is simply a variable, and the event is triggered when the value of that
variable equals 1.

Here are some examples of events. No statements are associated with
them for the sake of simplicity.

    event new_car is once at jul 2002 end
    event Raise is yearly every oct end
    event inflation is monthly end
    event init is once at startup end
    event my_event is once at my_trigger end

Statements
----------
Each event may have a list of statements associated with it. When the
event occurs, the simulator executes these statements. If multiple
events occur on the same date, the statements are executed in the order
they appear in the input file.

The supported statements are: assignment, if-then-else, while-loop,
print, dump, and stop.
There is no statement terminator (such as ';').

Assignment has the form "<variable> = <expression>". When executed, the
expression is evaluated and the resulting value is assigned to the
variable.

If-then-else has the form "if <expression> then <statements> [else
<statements>] end". The else part is optional. The expression is
evaluated and if the value is non-zero the first set of statements is
executed. Otherwise, the second set of statements (if present) is
executed. If and While statements can be nested in each other.

While-loop has the form "while <expression> loop <statements> end".
The expression is evaluated and if the result is non-zero then the
statements are executed. This repeats until the expression result is 0.

The print statement has the form "print <string> <expression>". Either
the string or the expression may be omitted, but not both. If present,
the string is first printed to the output file. Then the expression, if
present, is evaluated and the result printed to the output file.
Finally a new-line is printed, so the output of every print statement
is on a line by itself.

The dump statement has the form "dump symboltable". When executed, it
prints the names and values of all defined variables to the output
file. It is intended to be used for debugging.

The stop statement has the form "stop". When executed, the simulation
will terminate at the end of the current month. The simulation will
also stop if the (hard-coded) year 2030 is reached, to prevent the
simulator from running forever if a stop is not executed.

Here are some examples of statements:

    a = b

    if weight > 100 then
        print "That rock is too big. I can lift a smaller one."
    end

    while x < 5 loop
        x = x + 1
        y = y * 1.5
        if y > 50 then
            y = 50
            stop
        end
    end

Variables
---------
The only data type supported is a simple scalar double-precision float.
Variables must be declared, using the construct "declare <name>
[attribute] [= <expression>]". The initial value is optional. However,
if the variable is referenced before any value is assigned then the
simulation will print an error and stop.

A variable can have an attribute of "asset" or "liability".
There is a predefined variable named "savings" which represents the
amount of money currently saved. At the end of each month, the simulator
adds the current value of all asset variables to savings, and subtracts
the value of all liabilities from savings. The idea is that recurring
monthly income/expenses can be conveniently treated this way. For
example:

    declare paycheck asset = 2000
    declare rent liability = 1000

Attributes are optional. If a variable does not have an attribute it is
simply ignored in this end-of-month treatment.

There are a number of predefined variables. These do not have to be
declared, but they can be if you want to override the default values.

savings - This holds the amount of money saved to date. It is
    automatically updated with assets and liabilities at the end of
    each month. The mortgage part of the simulation uses this variable
    to tell whether or not the home is affordable. If not explicitly
    declared, its initial value is 0.

startup - This is a flag used by the simulator. It is set to 1 before
    the first month of the simulation and set to 0 after that month.
    This can be used as an event trigger. There is really no reason to
    write to it.

start_month, start_year - These are the month and year the simulation
    will start. If not specified, they default to 12/2000 (December
    2000).

current_month, current_year - These are the month and year that the
    simulation is currently processing. There is really no reason to
    write to them.

sumOfAssets, sumOfLiabilities - These are used when summing the values
    of all assets and liabilities at the end of each month. They are
    set to 0 at the start of each month. These are not really that
    useful for the user. (However, they could be used to add one-time
    debts or income if you don't want to declare a variable, as in:

        sumOfAssets = sumOfAssets + 1500
    )

There are also predefined variables that are used specifically for
dealing with the mortgage:

house_price - price of the house you want to buy.

minimum_percent_down - what you want to use as a down payment, as a
    percentage of the purchase price of the house. For example:

        declare minimum_percent_down = 20

closing_costs - an estimate of the closing costs of the mortgage. This
    of course is just a rough estimate.

loan_term - the number of years for which the loan will be taken out.

mortgage_rate - the annual interest rate for the loan. The program
    assumes that a simple, fixed-rate loan will be used. Unlike
    minimum_percent_down, the rate is written as a decimal fraction
    rather than a whole-number percentage. For example:

        declare mortgage_rate = 0.09   // 9% rate

reserve_savings - this is the minimum amount of savings you wish to
    have when you take out the loan.

term_start - This is intended to be used as an event trigger. It is set
    to 1 just once, the first time that

        savings > (house_price * minimum_percent_down/100
                   + closing_costs + reserve_savings)

term_end - This is intended to be used as an event trigger. It is set
    to 1 just once, "loan_term" years after the term_start event
    occurs.

mortgage_payment - This is a predefined liability with an initial value
    of 0. When term_start occurs, this variable is set to the computed
    principal+interest payment for the loan, based on the above
    variables. When term_end occurs it is automatically set to 0.

Expressions
-----------
Expressions are value-producing operations on variables and constants.
Expressions use the conventional operators such as add, subtract, etc.
in infix notation. Operator precedence is enforced, and a chain of
operators at the same precedence level is evaluated in the conventional
left-to-right order. For example, 3+4*5=23 (not 35), and 8-4+2=6 (not
2). The operators are:

    * boolean: and, or, not
      A boolean expression is deemed to be true if the value is 1.0.
    * relational: <, <=, >, >=, ==, !=
    * multiply *, divide /
    * addition +, subtraction -
    * unary -
    * parenthesized ()

Expression terminals are variables and constants. Constants can be
floating-point or integer literals. Integers are automatically
converted to floating-point values.
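
As a rough illustration of the precedence and left-to-right rules
described above, here is a minimal Java sketch of an evaluator. It is
not the DemeterJ code from this project: the class name ExprSketch and
its methods are hypothetical, it handles only numeric literals with
+, -, *, /, unary minus and parentheses (no variables, relational or
boolean operators), and it assumes the conventional ordering in which
* and / bind tighter than + and -.

```java
// Illustrative sketch only; names are hypothetical and this is not the
// project's DemeterJ implementation.
final class ExprSketch {
    private final String src; // expression text with whitespace removed
    private int pos = 0;      // index of the next unread character

    private ExprSketch(String text) {
        this.src = text.replaceAll("\\s+", "");
    }

    /** Evaluates an arithmetic expression such as "8-4+2". */
    static double eval(String text) {
        ExprSketch p = new ExprSketch(text);
        double value = p.sum();
        if (p.pos != p.src.length()) {
            throw new IllegalArgumentException("unexpected input at " + p.pos);
        }
        return value;
    }

    // Lowest precedence level here: fold + and - from left to right.
    private double sum() {
        double acc = product();
        while (pos < src.length() && (peek() == '+' || peek() == '-')) {
            char op = src.charAt(pos++);
            double rhs = product();
            acc = (op == '+') ? acc + rhs : acc - rhs;
        }
        return acc;
    }

    // Next level: fold * and / from left to right.
    private double product() {
        double acc = unary();
        while (pos < src.length() && (peek() == '*' || peek() == '/')) {
            char op = src.charAt(pos++);
            double rhs = unary();
            acc = (op == '*') ? acc * rhs : acc / rhs;
        }
        return acc;
    }

    // Unary minus binds tighter than the binary operators.
    private double unary() {
        if (pos < src.length() && peek() == '-') {
            pos++;
            return -unary();
        }
        return primary();
    }

    // Terminals: a parenthesized expression or a numeric literal.
    private double primary() {
        if (pos < src.length() && peek() == '(') {
            pos++;
            double value = sum();
            if (pos >= src.length() || peek() != ')') {
                throw new IllegalArgumentException("missing ')'");
            }
            pos++;
            return value;
        }
        int start = pos;
        while (pos < src.length()
                && (Character.isDigit(peek()) || peek() == '.')) {
            pos++;
        }
        if (start == pos) {
            throw new IllegalArgumentException("number expected at " + pos);
        }
        return Double.parseDouble(src.substring(start, pos));
    }

    private char peek() {
        return src.charAt(pos);
    }

    public static void main(String[] args) {
        System.out.println(eval("3+4*5")); // multiplication first: 23.0
        System.out.println(eval("8-4+2")); // left to right: 6.0
    }
}
```

Each precedence level accumulates its result in a loop, so a chain such
as 8-4+2 is folded from the left rather than nested to the right.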

Here are some examples of expressions:

    2*3*4/5
    a>2 and pi==3.14159
    (a+b+c) / (d-e)

Columns
-------
Columns are used to define what is printed in the output file. Again,
there are predefined columns as well as user-defined ones. User-defined
columns have the form: "column <title> [width] is <expression>". The
title is a string that will be displayed in the table heading. The
width of the column is defined by the length of the title string (with
a lower limit of 7 characters). The expression is the value that will
be printed each month in that column. The expression will be evaluated
at the end of each month. Some examples of column definitions are:

    column "rent" is rent
    column "mortpymt" is mortgage_payment
    column "hobby costs" is (games + supplies + paper)

The predefined columns are as follows:

    Year      - the current year
    Month     - the current month
    Savings   - the value of the variable savings
    NetIncome - the value of (sumOfAssets - sumOfLiabilities)
    Income    - the value of sumOfAssets
    Expenses  - the value of sumOfLiabilities

These appear in the above order from left to right. User-defined
columns then appear in the order they are defined in the input file.
Finally there is a column named "Events". The names of all events that
occurred this month are printed in this column, separated by spaces.
(Events that occur every month are not printed in order to save space.)
This allows you to see which events happen when.

File Format
-----------
In the input file, declarations of events, variables, and columns can
occur in any order. The usual comment delimiters "//" and "/* ... */"
can be used as well. See the program.cd file for the precise input
format. Also see program.input for a sample input file, and
program.output for the corresponding output.

Implementation Notes
====================
The structure of the program is contained in the behavior files:

columns.beh - contains the traversal that prints the title lines and
    another that prints a row of the table.

doevents.beh - contains the traversal that identifies events that
    should trigger on the current date. It executes the statements
    associated with these events.

dostatements.beh - contains a traversal that executes each statement in
    a list.

dumpst.beh - contains a traversal that dumps the name and value of each
    symbol in the symbol table.

expreval.beh - contains a traversal that evaluates an expression.

getattrib.beh - contains a traversal that sums the values of all assets
    and all liabilities.

initst.beh - contains a traversal that processes variable declarations
    in the input file. For each declaration, it evaluates the initial
    value expression (if present) and adds a corresponding symbol to
    the symbol table.

month.beh - contains a mapping from each month to a number
    corresponding to its order (jan=1, feb=2, etc).

nameof.beh - contains a visitor that fetches the name of a symbol.

program.beh - the main program. Contains the parsing of the input file,
    initialization of default variables and columns, and the date loop.
    Also contains the mortgage computations.

searchst.beh - a traversal that fetches the symbol with the given name
    from the symbol table.

valueof.beh - a traversal that fetches the value of a symbol from the
    symbol table.

Development Challenges
======================
Probably the most difficult part of the implementation was the syntax
and evaluation of expressions. The original program only provided for
simple binary expressions, such as "a+3" or "3*4". Computing more
complicated expressions required the user to declare intermediate
temporary variables. Overcoming this limitation would require either
that all binary operators be enclosed in parentheses, or that the
program enforce operator precedence rules. Fortunately I was able to
find the correct grammar pattern for enforcing precedence. However,
this produced parse trees that were not that easy to evaluate.
In particular, an expression with operators at the same precedence
level will be constructed more like a list of operators than an
expression tree. This list is formed left-to-right, whereas an
expression tree would be derived right-to-left. For example, the
expression "8-4+3" has the result 7. In an ideal world, the data
structure would look like this:

          +
         / \
        -   3
       / \
      8   4

Due to the precedence-enforcing grammar, we actually get:

      NumExpr
       /   \
      8   NumExpr_ (-)
           /   \
          4   NumExpr_ (+)
               /   \
              3   NumExpr_ (empty)

Simply evaluating this like an expression tree computes 8-(4+3) instead
of (8-4)+3 and so produces the wrong result -- 1. After wrestling with
this for a while, I realized that to treat this properly, I needed to
define actions on the right construction edges from a given operator
node instead of on the node itself. So instead of computing a sum on
the Add node, it is computed on the edge Add,<next>,NumExpr_. (All the
right edges are named "next".)

As I mentioned in my interim status report, I started with the most
basic behaviors, tested them, then added and tested the others
one-by-one. I really didn't have much trouble with other parts of the
simulator. I did need to find out the syntax for the various parts of
DemeterJ, but I was able to locate examples in the DemeterJ
documentation and in the homeworks.

Comparison with Original
========================
The original program was written six years ago in Ada. The parser was
hand-coded using recursive descent. It was written and improved upon
over a period of several months. The source files consist of 3500 lines
of code.

The new version was written in Java using DemeterJ. The rewrite
included extra functionality that was not present in the original
version. These features were added because DemeterJ made it very easy
to do so:

    full expression syntax
    user-defined columns
    user-defined events
    the while-loop statement
    implicit declaration of predefined variables

The DemeterJ version consists of 1100 lines of code.
This includes the *.beh files and the .cd file.

A few things were lost in the new version, though these are minor. One
is that the original parser provided better syntax error messages; I
had included hints for correcting some common errors. Also, the
original version was not case sensitive, though whether that is
actually useful or not is debatable.

In summary, using DemeterJ was a success. Not only was the
functionality of the program extended, but it was written with only
about 1/3 of the lines of code of the original. It is difficult to
quantify the difference in development effort, since the original
program was not done all at once. However, the rewrite definitely
seemed easier, especially since the automatically generated parser
virtually eliminated errors in that area and the print visitor allowed
quick display of the data being acted upon.

--------------6974F9DDE22646390F225ACF