Two-hour Review of Java for sophomore-level programmers

****************************************************************

Primary reference.

James Gosling, Bill Joy, Guy Steele, Gilad Bracha.
The Java Language Specification, Third Edition.
Online at
http://java.sun.com/docs/books/jls/third_edition/html/j3TOC.html

[The Third Edition describes the version 5.0 release of 2004.
These notes, however, describe Java as it was specified by the
Second Edition, version 1.4.x and older.]

****************************************************************

Outline of review.

    Values
    Types
    Names
    Program structure
    Execution rules
    Common mistakes

****************************************************************

Values.

A value is a datum that can be stored in a variable, passed as
an argument, returned by a method, or operated upon.

In general, we describe a value by writing an expression, and a
large part of the computer's job is to compute the values of
expressions.

The values of Java fall into two major categories, primitive
and reference. These categories can be divided further into
subcategories as follows:

    primitive values
        boolean values
            true
            false
        numeric values
            integral values
                byte values (8-bit signed)
                short values (16-bit signed)
                int values (32-bit signed)
                long values (64-bit signed)
                char values (16-bit unsigned)
            floating point values
                float values (IEEE single precision)
                double values (IEEE double precision)
    reference values
        null
        pointers to class instances
        pointers to arrays

In Java's official terminology, an object is defined as a class
instance or an array. Officially, therefore, a pointer to an
object is a value, but an object is not itself a value.

Unofficially, just about everyone ignores this distinction.
It's just too much to have to say "pointer to an object" all the
time, when we could say "an object" instead. When we speak as
though a Java object were a value, what we mean is that a
pointer to the object is a value.

We couldn't afford to ignore this distinction if we were C++
programmers, because the values of C++ include both objects and
pointers to objects, and the distinction between objects and
pointers is essential to understanding C++. In Java, however,
the distinction between objects and pointers is not very
important, and we can ignore it most of the time.

****************************************************************

Objects.

In Java, an object is like a piece of paper. Several things are
written on that piece of paper in indelible ink:

    an identification number
    lines that divide the paper into locations
    the type of each location

In addition, each location contains a value, which is normally
written in pencil. So long as the value is written in pencil,
the value written in a location can be changed by erasing it and
writing another value in its place.

The type of a location is just a constraint on what kinds of
values can be written into the location. If the type of a
location is int, for example, then programs are not allowed to
write a boolean value into that location.

In Java, objects are created by evaluating a new expression.

    new int[15];     // allocates a new array
    new Vector();    // allocates a new instance

Evaluating a new expression is like taking a clean sheet of
paper of the appropriate size and using indelible ink to

    write a unique identification number, different from the
        identification numbers of all other objects
    divide the paper into locations
    write the type of each location

and then using pencil to write an appropriate value into each
location. For example, the Java system would write 15 into one
location of a 15-element array to indicate its length, and would
write 0 into each of the 15 element locations.

Unofficially, I use the word "allocation" to refer to the part
of this process that uses indelible ink, and the word
"construction" to refer to the part that uses pencil.
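
The paper metaphor can be tried out directly. Here is a small
sketch, assuming nothing beyond the new int[15] expression shown
above; the class AllocationDemo and its method names are made up
for illustration. It checks the values that construction writes
into a fresh array, and shows that two variables can hold
pointers to the same sheet of paper.

```java
// Hypothetical demo class; the names are illustrative only.
class AllocationDemo {

    // Allocates a fresh int array and returns the sum of its elements.
    // Construction writes 0 into every element location, so the sum is 0.
    static int sumOfFreshArray (int n) {
        int[] a = new int[n];
        int sum = 0;
        for (int i = 0; i < a.length; i = i + 1)
            sum = sum + a[i];
        return sum;
    }

    // A value written "in pencil" can be erased and rewritten, and the
    // change is visible through every pointer to the same object.
    static int writeThroughAlias () {
        int[] a = new int[15];
        int[] b = a;          // copies the pointer, not the array
        b[0] = 42;            // overwrites the 0 in location 0
        return a[0];          // same object, so this is 42
    }

    public static void main (String[] args) {
        int[] a = new int[15];
        System.out.println (a.length);              // prints 15
        System.out.println (sumOfFreshArray (15));  // prints 0
        System.out.println (writeThroughAlias ());  // prints 42
        System.out.println (new int[15] == a);      // prints false
    }
}
```

The last line prints false because == on reference values
compares pointers: two arrays with identical contents are still
two different sheets of paper.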

After the object has been allocated and constructed, the new
expression returns a pointer to the object.

Java programs typically use an object for a while and then
forget about it completely. Part of the computer's job is to
locate objects that the program has forgotten, and to recycle
their paper by turning it into clean blank paper that can be
used to create new objects. Their forgotten identification
numbers can be recycled as well. For historical reasons, these
recycling processes are known as garbage collection.

****************************************************************

Types.

Types serve several distinct purposes in Java. On one level,
types constrain the set of values that can be written into a
location, passed to a method, or used in some other context.
These constraints make programming easier by improving

    reliability
    readability
    performance
    ease of expression (we'll come back to this)

In general, two Java types T1 and T2 are the same if and only if
they name the same primitive type, the same class, or the same
interface. This can be a little confusing because Java allows
the names of classes and interfaces to be abbreviated.

Example. Math and java.lang.Math usually name the same type.

Example. If a file of Java code imports java.util.Collection,
then Collection and java.util.Collection name the same interface
type within that file.

The types of Java are

    primitive types
        boolean
        numeric types
            integral types
                byte
                short
                int
                long
                char
            floating point types
                float
                double
    reference types
        null type
        class types
        interface types
        array types

Intuitively, we can often regard each type as the subset of all
Java values that is allowed by the type constraint. For
example, we can regard the int type as standing for the set of
all int values. In general, each of the primitive types stands
for the set of all primitive values of that type.

The reference types are more interesting, and therefore more
confusing.
For example, there is a one-to-one correspondence between
classes and class types, but classes are not at all the same as
class types.

The null type stands for the set that contains only the null
value.

Each class type A stands for the set of values

    { null } \union { p | B is A or a subclass of A,
                          and p is a pointer to an instance of B }

Example. If classes D and E extend class C, and classes B and C
extend class A, and these are the only classes that extend A or
B or C or D or E, then the type A stands for the set of values
that includes null together with all pointers to an instance of
classes A, B, C, D, or E.

Each interface type T stands for the union of all class types A
such that class A was declared to implement the interface T.

Example. If T is a Java interface, and classes A and B are
declared to implement T, and these are the only classes that
implement T, then T stands for the union of the sets of values
represented by the types A and B.

Each array type T[] stands for the set of arrays whose elements
are constrained to hold values of type U, where U is a subtype
of T.

U is a subtype of T if and only if the set of values for which U
stands is a subset of the values for which T stands.

Warning: This isn't quite right. The truth is much more
complicated than this, but most Java programmers will never need
to know the real truth.

Example.

    public interface Shape {
        double area ();
    }

    class Circle implements Shape { ... }
    class Rectangle implements Shape { ... }
    class Square extends Rectangle implements Shape { ... }
    class Triangle extends Rectangle implements Shape { ... }

With these declarations,

    Circle is a subtype of Shape
    Rectangle is a subtype of Shape
    Square is a subtype of Shape
    Triangle is a subtype of Shape
    Square is a subtype of Rectangle
    Triangle is a subtype of Rectangle

The last of these subtyping relationships may be a bad idea, but
Java allows programmers to express bad ideas as well as good.
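
The Shape declarations above leave their class bodies elided.
Here is one way to fill some of them in, just to watch the
subtyping relationships at work. Treat it as a sketch: the
fields, the constructors, the totalArea method, and the
SubtypeDemo class are all assumptions of mine, Triangle is left
out, and Shape is declared without the public modifier so that
everything fits in one file.

```java
// A sketch only: the class bodies below are assumed, not given.
interface Shape {
    double area ();
}

class Circle implements Shape {
    double diameter;
    Circle (double diameter) { this.diameter = diameter; }
    public double area () {
        double radius = diameter / 2.0;
        return Math.PI * radius * radius;
    }
}

class Rectangle implements Shape {
    double width;
    double height;
    Rectangle (double width, double height) {
        this.width = width;
        this.height = height;
    }
    public double area () { return width * height; }
}

class Square extends Rectangle implements Shape {
    Square (double side) { super (side, side); }
}

class SubtypeDemo {
    // Accepts any values of type Shape: pointers to instances of
    // any class that implements Shape (null would fail at area ()).
    static double totalArea (Shape[] shapes) {
        double total = 0.0;
        for (int i = 0; i < shapes.length; i = i + 1)
            total = total + shapes[i].area ();
        return total;
    }

    public static void main (String[] args) {
        Rectangle r = new Square (2.0);   // Square is a subtype of Rectangle
        Shape s = r;                      // Rectangle is a subtype of Shape
        Shape[] shapes = { s, new Rectangle (2.0, 3.0) };
        System.out.println (totalArea (shapes));    // prints 10.0
        System.out.println (s instanceof Square);   // prints true
        System.out.println (s instanceof Circle);   // prints false
    }
}
```

No casts are needed in main because each assignment moves a
value from a subtype to a supertype; going the other way, from
Shape back to Square, would require an explicit cast.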

****************************************************************

Variables.

In Java, a variable is defined as the combination of a location
together with the type that constrains the values that can be
written into the location.

****************************************************************

Names.

In Java, a name is either a single identifier or a qualified
name, which consists of a sequence of identifiers separated by
periods.

By convention, the names of classes and interfaces begin with an
upper case letter. If the name is formed from words, the first
letter of each word is capitalized but the subsequent letters of
each word are in lower case.

Example.

    String
    TreeSet
    HashMap

By convention, the names of packages, methods, variables, and
values begin with a lower case letter. If the name is formed
from words, then the first letter of each word except the first
is capitalized.

Example.

    java
    java.lang
    java.util
    gui
    main
    intValue
    toArray
    length
    System.out (standard output stream)
    System.out.println

By convention, the names of constants and final variables are in
upper case, with words or components separated by underscores.

Example.

    Math.PI
    java.lang.Integer.MAX_VALUE

****************************************************************

Structure of a Java program.

A Java program is divided into packages, which, like
directories, can be nested hierarchically.

With Sun's JDK, the package structure of a program normally
corresponds to the directory structure of the program's source
code. That is, the default package corresponds to the directory
that holds the program's source code, and each named package
corresponds to a subdirectory whose name is exactly the same as
the name of the package to which it corresponds.

If this correspondence is broken, the program will not compile
properly; this is a common mistake.

Example. A student's compiler might have the following package
structure.

    ast
        attributes
        parser
        scanner
    codegenerator
        env
        target
    typechecker
        typenv
        types

The fully qualified package names of this program would be

    ast
    ast.attributes
    ast.parser
    ast.scanner
    codegenerator
    codegenerator.env
    codegenerator.target
    typechecker
    typechecker.typenv
    typechecker.types

Each subpackage is considered to be part of its parent package.
For example, both typechecker.typenv and typechecker.types would
be part of the typechecker package, which is itself part of the
default (top-level) package.

The Java code in each package is usually divided into files.
Each file should declare at least one interface or class. The
name of a Java file should be the same as the name of the first
interface or class that it declares, followed by a .java suffix.

Example.

    Parser.java

****************************************************************

Compiling and running a Java application.

Every Java application must declare at least one public class
that declares a static method named main whose declaration looks
like

    public static void main (String[] args) { ... }

Suppose this method is declared in the TestShape class, which is
in the default (top-level) package, and is declared in a file
named TestShape.java. On our Unix systems, the program can be
compiled (translated) from Java to byte code by going into the
directory that corresponds to the default package and saying

    % javac TestShape.java

This creates TestShape.class and a bunch of other .class files,
one for each class that will be required to run the program.
The program is then run by saying

    % java TestShape

The argument to main will be an array of strings, one for each
of the command-line arguments.

Example.
If the application is run by saying

    % java TestShape 1 2 3

then

    args.length will be 3
    args[0] will be "1"
    args[1] will be "2"
    args[2] will be "3"

In Sun's JDK, the javac compiler will not work properly if it is
executed from within a directory that does not correspond to the
default package of the program being compiled. This is a common
mistake.

****************************************************************

Structure of a Java file.

Files normally begin with a block comment that explains the
purpose of the code in the file, names the authors, gives the
version number, and so on.

After that comment comes a package declaration that names the
package in which the file's code belongs. If the file is part
of the default (top-level) package, then this declaration is
omitted.

After the package declaration (if any) comes a series of import
declarations that name all of the types from other packages that
are mentioned by the file's code.

After the import declarations come the class declarations. The
name of the class that is declared by the very first class
declaration should match the name of the file.

Example. Here is a hypothetical (untested!) file named
CodeGenerator.java.

    /**
     * Interface to code generators for a Simula 67 compiler.
     *
     * @author Ole-Johan Dahl
     * @author Kristen Nygaard
     * @version %I%, %G%
     */

    package codegenerator;

    import java.io.PrintStream;
    import ast.Ast;

    public interface CodeGenerator {
        public void generateCode (Ast pgm, PrintStream out);
    }

****************************************************************

Structure of a class declaration.

A class declaration describes a set of objects that have similar
behavior. Each class declaration also creates a type whose name
is the same as the name of the class.

In its simplest form, a class declaration consists of the word
class, followed by the name of the class, followed by a pair of
matching curly braces that enclose the class body declarations.

The class declaration may also include one or more of these
modifiers before the word class:

    public
    abstract
    final

The public modifier should be used if the class is intended to
be used by code in other packages. (With Sun's JDK, a public
class should be the first class declared within its file.)

The abstract modifier should be used if the class has any
abstract methods, and may also be used to prevent any instances
of the class from being created.

The final modifier may be used to prevent any subclasses of the
class from being declared.

Java does not allow a class to be both abstract and final.

The class declaration may also include one of these clauses
after the name of the class:

    extends
    implements

An extends clause names the immediate superclass, or parent, of
the class. An implements clause names an interface that the
class implements. Both kinds of clause imply that the type
declared by the class declaration will be a subtype of the class
or interface type that the declaration extends or implements.

A class may extend only one class, but it may implement more
than one interface, in which case the interfaces are separated
by commas in the implements clause.

If no extends clause is present, then the class extends the
Object class, which is the root of the Java class hierarchy.

Example.

    class C { }
    public class C { }
    abstract class C { }
    final class C { }
    public abstract class C { }
    public class C extends B { }
    public class C extends B implements CodeGenerator { }

In the last example, the type C will be a subtype of the class
type B and also a subtype of the interface type CodeGenerator.

****************************************************************

Class body declarations.

The body of a class declaration may contain declarations of the
following things:

    initializers [rare, so I'll ignore them]
    constructors
    members

Constructors are not members. In particular, a constructor is
not a method, even though a constructor declaration resembles a
method declaration.
Constructors are never inherited, so they cannot be overridden
or hidden within a subclass.

The visibility of a constructor or member can be controlled by
prefixing its declaration with a visibility modifier:

    modifier         visibility
    --------         ----------
    public           universal
    protected        package + subclasses
    (no modifier)    package
    private          class

If you explicitly say that something is protected, then it is
less protected than it would have been had you not specified the
visibility at all. This is a peculiarity of Java.

****************************************************************

Constructor declarations.

At least one constructor is called whenever an instance of the
class is created by a new expression. The purpose of a
constructor is to initialize the non-static members of an
instance of the class.

The name of a constructor is the same as the name of the class
in which it is declared. A constructor declaration looks like a
method declaration, except that a constructor declaration has no
return type.

Example.

    class Circle implements Shape {      // class declaration
        Circle (double diameter) {       // constructor
            this.diameter = diameter;
        }
        ...
    }

If a class does not explicitly declare any constructors, then
the Java compiler will automatically generate a default
constructor with no arguments that does nothing except to invoke
the superclass's constructor with no arguments. The visibility
of a default constructor is the same as the visibility of the
class.

Example. If the class Circle did not declare any constructors,
then the effect would be the same as if it had declared

    Circle () {
        super();
    }

To prevent any instances of class Circle from being created by
code that is outside that class, a programmer can make all of
its constructors private. To prevent the compiler from creating
a non-private default constructor, programmers can declare an
explicit private constructor with no arguments that does
nothing.

Example.

    class Circle implements Shape {
        // Don't let anyone else create instances of this class!
        private Circle () { }
        ...
    }

****************************************************************

Member declarations.

The members of a class are not only the members that are
explicitly declared within the class declaration, but also any
members that are inherited from the class's superclass.

The following things can be declared as members:

    variables (also known as fields)
    methods
    classes
    interfaces

Example.

    double PI = Math.PI;
    double diameter;
    double area () {
        double radius = diameter / 2.0;
        return PI * radius * radius;
    }

Members can be declared static, which means there is only one
thing that is declared by the member declaration, and that one
thing is associated with the class in which the declaration
occurs. If a member is not declared static, then each instance
of the class will have its own member, distinct from the members
of other instances. Exception: an interface is always static,
even if it is not declared static.

The distinction between static and non-static members is so
important that it has given rise to special terminology:

                  static                 non-static
    variables     class variable         instance variable
    methods       static method          dynamic method
    classes       static nested class    inner class
                  interface

This review will not cover inner classes.

A member may be declared final.

If a variable is declared final, then the Java compiler will not
allow its value to be changed after the variable has been
initialized. Constructors are allowed to change the value of a
final instance variable, but methods are not.

A static final variable should have an explicit initial value.
If that initial value is an obvious constant, then the static
final variable is effectively a constant, and may be used in the
case labels of a switch statement, for example.

Example.

    static final int READ_ONLY = 1;
    static final int WRITE_ONLY = 2;
    static final int READ_WRITE = 3;

So far as I can tell, there is no good reason to declare that a
static method is final.

If a dynamic method is declared final, then the method cannot be
overridden within a subclass.

If a class is declared final, then it cannot be extended by a
subclass.

****************************************************************
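
To make the static, non-static, and final distinctions concrete,
here is a small sketch. Only the three constants come from the
example above; the class FileMode, its instancesCreated counter,
its mode field, and its describe method are hypothetical names
chosen for illustration.

```java
// Hypothetical class built around the three constants shown earlier.
class FileMode {
    static final int READ_ONLY = 1;
    static final int WRITE_ONLY = 2;
    static final int READ_WRITE = 3;

    static int instancesCreated = 0;   // class variable: one per class
    final int mode;                    // instance variable: one per instance

    FileMode (int mode) {
        this.mode = mode;              // a constructor may set a final field
        instancesCreated = instancesCreated + 1;
    }

    // Dynamic method: it consults this instance's own mode.
    // Declaring it final keeps subclasses from overriding it.
    final String describe () {
        switch (mode) {
            case READ_ONLY:  return "read only";
            case WRITE_ONLY: return "write only";
            case READ_WRITE: return "read and write";
            default:         return "unknown";
        }
    }

    public static void main (String[] args) {
        FileMode a = new FileMode (READ_ONLY);
        FileMode b = new FileMode (READ_WRITE);
        System.out.println (a.describe ());              // prints read only
        System.out.println (b.describe ());              // prints read and write
        System.out.println (FileMode.instancesCreated);  // prints 2
    }
}
```

The case labels compile only because the three constants are
static final variables with constant initial values. Each
instance carries its own mode, while all instances share the one
instancesCreated counter.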