CS 5004: Object Oriented Design and Analysis

MODULES

Overview
Course Map
Objectives

In this module we take a look at Java's Collections library discussing sets, maps, queues, and other commonly used data structures. We discuss our previous implementations, and look at the JDK's implementations.

We introduce big-O notation, and evaluate some of our previous programs in terms of their runtime performance. We then use JUnit tests to validate our results from our runtime performance analysis.

Given a custom Map implementation, refactor the code to use the JDK standard Map.
Given a custom Queue implementation, refactor the code to use the JDK standard Queue.
Given a custom Set implementation, refactor the code to use the JDK standard Set.
Given a Collection, use the JDK Comparator to sort the elements.
Given a sorting specification, order a collection of data by using the Java standard library.
Given a Java class, write down the time complexity for a specific method using big-O notation.
Given a Java class, write down the time complexity for a specific method using big-Omega notation.
Given the complexity analysis of a method, interpret the performance.
Given a Java class with a method whose implementation uses recursion, write down the recurrence relation.
Given a set of Java classes, evaluate their performance by writing performance tests.
Given a Java class, use JUnit to write performance tests.
Given a set of Java classes with performance tests, plot their performance results.

Reading
Labs
Assignment 7
Assignment 8

Java Tutorial: Collections

DUE DATE: Monday 13th of March @ 11:59pm

Repository Contents

Your repositories must contain only the following

pom.xml file and
src folder with appropriate sub-folders with your code and test code only.

Code Criteria

Your solution for each problem should be in its own package. The package names should follow this naming convention edu.neu.ccs.cs5004.assignmentN.problemM where you replace N with the assignment number and M with the problem number, e.g., all your code for problem 1 for this assignment must be in a package named edu.neu.ccs.cs5004.assignment7.problem1.
Your project should successfully build using maven generating all the default reports.
Your Javadoc generation should complete with no errors or warnings.
Your Checkstyle report must have no violations.
Your JaCoCo report must indicate 75% code coverage per package for "Branches" and "Instructions" or more.
Your FindBugs report must have no violations (except for violations categorized as PERFORMANCE or SECURITY, you can ignore those).
Your PMD report should have no violations.

Custom implementation of HashSet

A company who is working on a new small device are using Java want a specialized implementation of a HashSet library. As such they are looking for people to implement specialized code for their application to use.

They would like an implementation of a HashSet for their application and they have hired you to implement it. Here is the specification that they provided.

We would like a new implementation for a HashSet for Integers only. We would like the new implementation to only support the following operations

the ability to create a new HashSet with its initial capacity.
the ability to add a new element. The add operation should return a boolean, true when the operation succeeds and false otherwise. If the element passed to the add operation that is already a member of our HashSet the operation must succeed and leave the HashSet unchanged. If the add operation is called on a HashSet that is at 80% capacity, your implementation must increase the capacity and allow the addition of the element.
the ability to check if the HashSet is empty or not. This operation should return a boolean.
the ability to provide an element to be removed. If the element is not in the HashSet then the HashSet must remain unchanged.
the ability to check if an element is already a member of our HashSet
the ability to return the total number of elements in our HashSet
the ability to return the union with another HashSet. The result should be a valid HashSet
the ability to return the intersection with another HashSet. The result should be a valid HashSet.

Other requirements for your implementation include

You implementation should provide a Set interface. Your customized HashSet class must be a concrete implementation of the Set interface.
The load factor for your HashSet implementation should be 80%.
Your implementation should be space efficient.
Your implementation for the contains, add and remove operations should on average have a complexity of O(1)
For the union and intersection operations operations your documentation should state clearly which objects are being mutated (altered)

NOTE: The notes covered in one of our labs are a very good starting point for a hash table implementation. Your implementation must support chaining. There is no requirement to implement any of the interfaces from java.util. You are also not allowed to extend java.util.HashSet or any other java.util class.

Using java.util

For this question you are required to use the types in java.util for your solutions.

A publishing house would like to generate the index for books that they publish. The index is a set of key words in the book and for each key word there is a list of page numbers that contain the key word.

Their current programs analyze the contents of a book and generate a java.util.Map<K,V> that contains pages as keys and a list of words as values. Here is an example of what the map contains once the program processes a book

               Page No.   | Word 
               ---------------------------------------------------------
                 1        | "abstract", "class", "fields", "methods"
                 2        | "class", "fields", "types"
                 3        | "method", "argument", "expression", "return" 
                 4        | "abstract", "abstraction", "parameter",
                 7        | "interface", "signature", "inheritance"

What they would like is for you to design a program that given the Map that they already have, returns a new Map that is the book index. So given the preceding example, your program should produce a Map that should hold the following data

                 Word          | Page Nos.
                 -------------------------
                 "abstract"    | 1, 4
                 "abstraction" | 4
                 "class"       | 1, 2
                 "fields"      | 1, 2
                 "methods"     | 1
                 "types"       | 2
                 "method"      | 3
                 "argument"    | 3
                 "expression"  | 3
                 "return"      | 3
                 "parameter"   | 4
                 "interface"   | 7
                 "signature"   | 7
                 "inheritance" | 7

Note that our output should contain the page numbers for a word in sorted order from smallest to largest page number.

DUE DATE: Monday 20th of March @ 11:59pm

Repository Contents

Your repositories must contain only the following

pom.xml file and
src folder with appropriate sub-folders with your code and test code only.

Code Criteria

Your solution for each problem should be in its own package. The package names should follow this naming convention edu.neu.ccs.cs5004.assignmentN.problemM where you replace N with the assignment number and M with the problem number, e.g., all your code for problem 1 for this assignment must be in a package named edu.neu.ccs.cs5004.assignment7.problem1.
Your project should successfully build using maven generating all the default reports.
Your Javadoc generation should complete with no errors or warnings.
Your Checkstyle report must have no violations.
Your JaCoCo report must indicate 75% code coverage per package for "Branches" and "Instructions" or more.
Your FindBugs report must have no violations (except for violations categorized as PERFORMANCE or SECURITY, you can ignore those).
Your PMD report should have no violations.

Iterators

For this question you are allowed to reuse code from previous assignment. If you do reuse your code from previous assignments make sure you update your old solutions to address any issues pointed out by the graders. You can also design and implement a new solution from scratch.

From assignment 5, problem 3, you designed an integer binary search tree (ibst). Develop the following features/operations for your ibst.

Develop an iterator that iterates over your ibst using a pre-order tree walk.
Develop an iterator that iterates over your ibst using a post-order tree walk.
Develop an iterator that iterates over your ibst using a in-order tree walk.

Your BST should provide 3 separate methods, one for each iterator so that clients can call the method that returns the iterator that they prefer.

Video Library

One of the professors from the Media department would like you to develop a small, simple library to manage a video library. Here is the description that they provided.

The video library is a collection of movies. There is no order in our library. For each movie however we would like to store the following information.

The movie's alias. This is a short name that I want to give to the movie. I would like to have the video library have unique aliases. So if I try to add a new movie and use an alias that is already in the video library, then I should be told that my alias is already used and then I will come up with a new alias.
The movie's title.
The movie's release year as a four digit number.
The movie's directors, there can be one or more. A director is typically a person's name.
The movie's main actors, there can be one or more. For each actor we provide their name.

Now once I populate my video library I would like to be able to perform the following tasks.

Given a director's name, I should be able to get all the movies that they directed. If there is more than 1 movie directed the results should be sorted by the movies release year. Most recent movie first.
I should be able to add a new movie to my video library. See my earlier comment on aliases.
I should be able to specify that I used the movie in my class. For example I want to keep track how many times a given movie was used. I should be able to provide the movie alias and the library should add 1 to the number of times that the movie was used.
I should be able to ask the video library to return the most used movie in my class. This is the movie that I used the most number of times in my class. If there are more than 1 movie then return any one of them. more than 1 most used

The professor is planning to have one of his students take over your solution and extend it in the future. For this reason, for each of the operations on the video library he would like you to add the runtime and space complexity of the operation in your documentation.