ISU535 06X1 Information Retrieval

Homework 3

Created: Mon 2 May 2005
Last modified: 

Be sure to check the syllabus for the due date.


Problem 1. (20 points) Huffman coding.

Consider the following text fragment:

To be or not to be

This text fragment contains 18 total characters, including the five internal space characters, and eight unique characters, assuming case-sensitivity so that "T" and "t" are treated as separate characters.

Problem 2. (20 points) Lempel-Ziv: Encoding.

Consider your three initials, lower case. (If you do not have a middle name, use the first two letters from your first and the first letter from your last name.) For my name, Javed A. Aslam, the three lower case initials are "jaa".

Problem 3. (20 points) Lempel-Ziv: Decoding.

Consider the string:

00110010010010010011011010011100010011010010001010001100001111111

This bit string is the result of encoding a piece of text using the Lempel-Ziv algorithm described above (and in class).