Syllabus/Schedule - IS1320 Information Retrieval - Spring 2003

Professor Futrelle

Version of 25 March 2003

This schedule lists important exam and project due dates. I will send email to the class whenever important updates are made to this schedule. Your responsibilities are laid out here and on the assignments page. Be sure to always check both. The items below focus on reading; the assignments page focuses more on things you are to hand in or present.

Week 1. Wed. March 26
Course Introduction. Getting started on: Creating, Retrieving and Assessing Information. The CCIS Unix system. Programming in Java (required). Group projects. Your web site (required). Get the textbook now.
Thurs. March 27
Details of your assignments due next week. More on Unix, Java and your website.
Week 2. Mon. March 31
Basic IR techniques. You will choose your project and team today. Overview of the field. Read all of Chapter 1. Look at some of the Reference entries (pg. 455+) for items on pgs. 17-18.
Read Chap. 2 to end of Sec. 2.5.3. Look at all Ref. entries for these topics.
Wed. April 2
CLASS WILL BE HELD IN 120 SNELL LIBRARY. Attendance will be taken.
Thurs. April 3
How do we know if an IR system is any good? Retrieval evaluation. Read through Sec. 3.2.2. Project Lottery today.
Week 3. Mon. April 7
Reference collections. Read the remainder of Chap. 3. Study the TREC site (http://trec.nist.gov/) and the links there, and be familiar with the Glasgow site (http://www.dcs.gla.ac.uk/idom/ir_resources/test_collections/). We will go over what will be covered on the Thursday quiz.
Wed. April 9
PROJECT PRESENTATIONS BEGIN. The types of query languages. Read through Sec. 4.3 carefully. Then read the remainder of the chapter.
Thurs. April 10
Quiz #1. Covers all material to date.
Week 4. Mon. April 14
Quiz #1 returned and discussed. Algorithms for pattern matching in Java (without indexes).
Wed. April 16
Logging into a web site. HTTP. Writing/using Java-based web access and crawlers.
Thurs. April 17
HTML, XML, XML Schema and more. Read all of Chap. 6, focusing on the portions before Sec. 6.5. Google for "xml" and also for "xml schema" to find out more. (Chap. 5 will be mentioned but there is no reading assignment for it.)
Week 5. Mon. April 21
Using JAXB to create, parse and manipulate XML documents. Read some material from the following, your choice: http://www.google.com/search?q=jaxb&ie=UTF-8&oe=UTF-8
Wed. April 23
Text operations. Read up to Sec. 7.3. Also study Porter's stemming algorithm, pgs. 433-436. Practice applying it to twenty very differently structured words.
Thurs. April 24
I will expand on document clustering (Sec. 7.3). Read through Sec. 7.4.4.
Week 6. Mon. April 28
Architecture of the web. The anatomy of Google. Indexing and searching. Read up to Sec. 8.3 and also Sec. 8.6.3. Then skim through Sec. 8.3.1 to gather some ideas about suffix trees and arrays.
Wed. April 30
More on Google
Thurs. May 1
User interfaces and visualization. Chapter 10 on this topic, by Marti Hearst, is long and full of figures. It is not difficult reading, so spend a few days reading the entire chapter a couple of times.
Week 7. Mon. May 5
Suffix trees and arrays -- I will describe these more carefully. Why raw search is not used in large applications.
Wed. May 7
Review for the Midterm.
Thurs. May 8
MIDTERM EXAM
Week 8. Mon. May 12
Return and discuss the Midterm Exam
Wed. May 14
Architecture of the web.
Thurs. May 15
Quiz #2 "pre-final-quiz" (tentatively scheduled)
Week 9. Mon. May 19
Quiz #2 results. Multimedia IR.
Wed. May 21
More on web searching. Read Chap. 13.
Thurs. May 22
Libraries, digital libraries and intellectual property issues. Read as much as you can of Chapters 14 and 15.
Week 10. Mon. May 26
MEMORIAL DAY -- SCHOOL CLOSED
Wed. May 28
Final Student Project Summary Whirlwind
Thurs. May 29
Last class. Review for the Final Exam.
June 2-7 Final Exam Week
The IS1320 Final will be at 8am, Monday, June 2nd, in Room 237 FR (Forsyth).
