Course Information
Course title
Information Retrieval (資訊檢索)
Semester
95-1
Intended students
Program: Knowledge Management Program
Instructor
唐牧群 
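Course number
LIS4012
Course identifier
106 47000
Section

Credits
Full/half year
Half year
Required/elective
Elective
Class time
Tuesday, periods 2, 3, 4 (9:10-12:10)
Location
Room 普402
Remarks
Elective course in the Resources area of the Knowledge Management Program.
Enrollment limit: 80 students
Ceiba course website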
http://ceiba.ntu.edu.tw/951ir 
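Course introduction video

Core competencies
Diagram relating core competencies to course planning
Course Syllabus
To protect everyone's rights, please respect intellectual property rights and do not make illegal photocopies.
Course description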
The course is designed to provide an introduction to the use, design and
evaluation of information retrieval (IR) systems. It covers major components in the IR
process such as information needs, search strategies, indexing as well as
retrieval evaluation. Special attention will be given to users’ information
environment within which IR is situated.
 

Course objectives
1. To develop knowledge and skills to conduct effective online retrieval.
2. To acquire a basic understanding of the inner workings of information
retrieval systems and the knowledge to assess their functions and features.
3. To be aware of the relationships between different types of search tasks and
search tactics.
4. To gain hands-on experience in building a digitized collection.
 
Course requirements

Expected weekly study hours outside class

Office Hours
By appointment
References
Readings
Hersh, William R. (1996). Information Retrieval: A Health and Biomedical
Perspective. New York: Springer-Verlag.
Belew, Richard K. (2000). Finding Out About: A Cognitive Perspective on
Search Engine Technology and the WWW. Cambridge: Cambridge University Press.
Evaluation of Web-Based Search Engines Using User-Effort Measures. Available
online: http://libres.curtin.edu.au/libres13n2/tang.htm

PubMed
PubMed tutorials, available at http://www.nlm.nih.gov/bsd/disted/pubmed.html
PubMed Help, available at
http://www.ncbi.nlm.nih.gov/books/bookres.fcgi/helppubmed/pubmedhelp.pdf
Greenstone
The software can be downloaded at
http://www.greenstone.org/cgi-bin/library?e=p-en-home-utfZz-8&a=p&p=download
Manuals (the “User’s guide” is most relevant to our purpose)
http://greenstone.sourceforge.net/wiki/index.php/Manual
Witten, Ian H., and Bainbridge, David (2003). How to Build a Digital Library.
Amsterdam: Morgan Kaufmann Publishers.
Required readings

Grading
(for reference only)

No.
Item
Percentage
Description
1. 
Group project: IR evaluation 
30% 
Students will form groups of 4 to 5 to carry out the project. Each group will conduct an IR evaluation comparing 3 major Web-based search engines on two search topics.
a. To obtain the search topics, interview two users (preferably graduate students or faculty members), each on one research topic they are interested in. Collect from each user a search statement and associated query terms that you both agree best represent her information need.
b. For each search topic, submit the queries on the user’s behalf to the three search engines you are testing. Collect the first 20 links from each of the three returned sets.
c. Determine the degree of overlap among the three returned sets.
d. Mix the non-duplicate links (at most 20 × 3 = 60) together and strip the graphic cues, so that the user cannot tell which search engine each link came from.
e. For each link, record its source search engine and rank position.
f. Present the URLs in Microsoft Word files that allow the users to examine the actual webpage by clicking on its hyperlink. Ask them to judge the relevance of the pages on a 0-4 scale (0 = not relevant at all; 4 = very relevant).
g. Create an Excel or SPSS data file to record the relevance scores.
h. Compare the performance of the search engines based on 1) first-20 "full" precision and 2) search length 2 (i.e., the number of links the user has to go through to find two relevant documents).
i. Turn in an 8-page written report on your findings and present them in class.
2. 
In-class quiz 
10% 
The quiz is based on the lecture notes and handouts. It will be held in class on November 21 and will consist of 4 short questions.
3. 
Online tutorial  
10% 
Search feature/command demo (accounts for 10% of your final grade). Students will work in pairs to create and present a video demo that explains a search tactic or function available in the PubMed database.
4. 
Group project: Digital library 
40% 
Students will form groups of 4 to 5 to carry out the project. Each group will collaboratively build a functional online information retrieval system using the Greenstone digital library (GSDL) open source software. The project consists of three components: the implementation of a digital collection on a topic of your own choosing, a 10-15 page written report, and an oral presentation of the project at the end of the semester.
The digital collection should include:
a. A minimum of 50 documents representing different document formats such as PDF, Word, and HTML.
b. An index structure that enables browsing of the collection.
c. The provision of fielded search.
The written report should:
d. Explain the aim, purpose, and intended users of the collection and their information needs. It is best to come up with an institutional context (real or imaginary) for the use of the collection.
e. Define your selection and indexing policies based on the aim and purpose stated above.
f. Include a graphic presentation of the browsable index structure and the rationale behind your design (i.e., explain why you chose certain facets/attributes to represent your collection).
5. 
Class participation 
10% 
Attendance at all class sessions is mandatory. Your grade will be based on your attendance and participation in class discussion. If you don’t get the chance to participate in class, submit your comments or questions to the online forum.
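For the IR evaluation project (item 1), the overlap, pooling, and scoring steps (c, d, e, and h) could be scripted instead of tallied by hand. The Python sketch below is only an illustration under stated assumptions, not the required method. The function names and data layout are hypothetical. "Full" precision is not defined in this syllabus, so the sketch assumes one plausible reading: the sum of the 0-4 scores over the first 20 links divided by the maximum attainable score (20 × 4). The cutoff for counting a link as "relevant" in search length (score of 3 or higher) is likewise an assumption that each group should set for itself.

```python
import random
from itertools import combinations


def pairwise_overlap(results):
    """Step c: count the URLs shared by each pair of search engines.

    `results` maps an engine name to its ranked list of URLs.
    """
    return {
        (a, b): len(set(results[a]) & set(results[b]))
        for a, b in combinations(sorted(results), 2)
    }


def pool_links(results, seed=0):
    """Steps d and e: build a shuffled, de-duplicated pool of links.

    Returns (pool, provenance). `provenance` maps each URL to a list of
    (engine, rank) pairs, so the source stays hidden from the judge but
    can be restored when scoring.
    """
    provenance = {}
    for engine, urls in results.items():
        for rank, url in enumerate(urls, start=1):
            provenance.setdefault(url, []).append((engine, rank))
    pool = sorted(provenance)  # fixed order before shuffling
    random.Random(seed).shuffle(pool)
    return pool, provenance


def full_precision(urls, scores, k=20, max_score=4):
    """Step h(1), ASSUMED definition: graded scores of the first k links,
    summed and divided by the maximum attainable score (k * max_score)."""
    return sum(scores.get(u, 0) for u in urls[:k]) / (k * max_score)


def search_length(urls, scores, n=2, threshold=3):
    """Step h(2): links examined until n relevant ones are found.

    A link counts as relevant when its score is >= threshold (the cutoff
    is an assumption). Returns None if fewer than n relevant links exist.
    """
    found = 0
    for position, url in enumerate(urls, start=1):
        if scores.get(url, 0) >= threshold:
            found += 1
            if found == n:
                return position
    return None
```

Each engine's ranked list is passed to `full_precision` and `search_length` together with the user's relevance scores keyed by URL. Links judged once in the pooled list are thereby scored for every engine that returned them.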
 
Course schedule
Week
Date
Topic
Week 1
9/19  Introduction to the syllabus; information retrieval in the broad context of human information seeking.
Week 2
9/26  Introduction to PubMed
Demo of Camtasia

Week 3
10/03  Search tactics and strategies
Week 4
10/10  No class
Week 5
10/17  Relevance; evaluation and performance criteria
Demo of GSDL (Greenstone digital library)

*Turn in the search topics for your evaluation project (including search statement and query terms)

Week 6
10/24  Indexing: machine vs. human
Week 7
10/31  PubMed demo presentation
Week 8
11/07  Types and structures of vocabularies
*Turn in the topic for your final project (the aim, scope and intended users of your collection)

Week 9
11/14  Indexing policy: specificity and exhaustivity
Week 10
11/21  IR models (partial vs. exact match): vector space, probabilistic, PageRank (cognitive authority)
In-class quiz

Week 11
11/28  Web search
Google syntax

Week 12
12/05  IR evaluation presentation
Week 13
12/12  Federated search
Week 14
12/19  Interface design and usability
Week 15
12/26  Extension: collaborative filtering, citation indexing, information visualization
Week 16
2007/1/02  Group project presentation
Week 17
1/09  Group project presentation