Course title |
Information Retrieval |
Semester |
109-1 |
Offered to |
College of Liberal Arts, Department of Library and Information Science |
Instructor |
唐牧群 |
Course number |
LIS4012 |
Course identifier |
106 47000 |
Class section |
|
Credits |
3.0 |
Full/half year |
Half year |
Required/elective |
Required (pre-assigned) |
Class time |
Wednesday, periods 6, 7, 8 (13:20~16:20) |
Classroom |
Library and Information Science audiovisual room |
Remarks |
Enrollment cap: 70 students |
Ceiba course website |
http://ceiba.ntu.edu.tw/1091LIS4012_ |
Course introduction video |
|
Core competency mapping |
Diagram relating core competencies to curriculum planning |
Course syllabus
|
To protect everyone's rights, please respect intellectual property rights and do not make illegal photocopies.
|
Course description |
The course provides an introduction to the use, design, and evaluation of information retrieval (IR) systems. It covers the major components of the IR process, such as search strategies, indexing, IR models, and IR evaluation. Students will also gain hands-on experience with IR evaluation and with designing a digital library system. Special attention will be given to comparing different indexing methods and IR models and to how they might complement each other. |
Course objectives |
To provide an introduction to the use, design, and evaluation of information retrieval (IR) systems. |
Course requirements |
To be announced |
Expected weekly study hours outside class |
|
Office Hours |
|
References |
Bell, S. S.(2006). Librarian's guide to online searching.
Bhavnani, S. K., K. Drabenstott, D. Radev (2000). Towards a unified framework of IR tasks and strategies.
Manning, Raghavan, Schütze (2008). Introduction to Information Retrieval. Cambridge.
Chowdhury, G.G. (2004), Introduction to modern information retrieval. London: Facet publishing.
Hersh, W. R. (1996). Information retrieval: a health and biomedical perspective. New York: Springer-Verlag New York, Inc.
Salton & McGill (1983). Introduction to modern information retrieval. McGraw-Hill.
Grossman and Frieder (2004). Information retrieval: algorithms and heuristics.
Belew, Richard K. (2000). Finding out about: a cognitive perspective on search engine technology and the WWW. Cambridge: Cambridge University Press.
O'Connor, B. (1996). Explorations in indexing and abstracting.
Lancaster (2003). Indexing and abstracting in theory and practice.
Evaluation of Web-Based Search Engines Using User-Effort Measures. Available online: http://libres.curtin.edu.au/libres13n2/tang.htm
Ian H. Witten, David Bainbridge (2003). How to Build a Digital Library, Amsterdam: Morgan Kaufmann Publishers.
Jannach, D., M. Zanker, A. Felfernig, G. Friedrich (2011). Recommender systems: an introduction. Cambridge.
Soergel (1985). Organizing information: principles of data base and retrieval systems. San Diego, CA: Academic Press Professional, Inc.
Camtasia
Download ; Video tutorial
PubMed
PubMed tutorials, available at http://www.nlm.nih.gov/bsd/disted/pubmed.html
OVID SP tutorial from Yale University Library
SCOPUS tutorial
PubMed help Available at Online_help
http://www.ncbi.nlm.nih.gov/books/bookres.fcgi/helppubmed/pubmedhelp.pdf
Greenstone: The software can be downloaded at
http://www.greenstone.org/cgi-bin/library?e=p-en-home-utfZz-8&a=p&p=download
Manuals (the “User’s guide” is most relevant to our purpose) http://greenstone.sourceforge.net/wiki/index.php/Manual |
Required readings |
To be announced |
Grading (for reference only) |
No. |
Item |
Percentage |
Description |
1. |
|
0% |
Attendance at all class sessions is mandatory. Your grade will be based on your participation, homework, and in-class assignments. Your group projects will be judged by both the instructor |
2. |
Homework and participation |
10% |
There will be one homework assignment practicing text tokenization and TF×IDF calculation. |
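As an illustration of what tokenization and TF×IDF weighting involve, here is a minimal Python sketch. It is not the assignment's prescribed method: the function names `tokenize` and `tf_idf`, the English-only regular-expression tokenizer, the use of raw term counts for TF, and the `log10(N / df)` form of IDF are all assumptions made for this example, and the three toy documents are invented. Chinese text would need a word segmenter in place of the regular expression.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    Assumed weighting: weight = raw term count * log10(N / document frequency).
    """
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term occur?
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)
        weights.append(
            {term: count * math.log10(n_docs / df[term]) for term, count in tf.items()}
        )
    return weights


# Toy collection (invented for illustration).
docs = [
    "Information retrieval systems",
    "Retrieval evaluation",
    "Digital library systems",
]
for doc, doc_weights in zip(docs, tf_idf(docs)):
    print(doc, {term: round(w, 3) for term, w in doc_weights.items()})
```

With this weighting, a term that occurs in every document gets a weight of zero, while a term unique to one document gets the highest IDF.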
3. |
Group projects |
0% |
Students will form groups of 3 to 5 to carry out 3 group projects. For each project, besides the group report,
*** each group member should prepare a one- to two-paragraph personal report explaining their contributions and what they learned from the assignment. |
4. |
Search feature/command demo |
0% |
Create and present a video demo that explains a search tactic or function available in Ovid/Medline, Ebsco/Medline, Embase, or Scopus (accounts for 10% of your final grade).
See example |
5. |
Simulated literature search evaluation |
30% |
a. To obtain the search topics, interview two users (preferably graduate students or faculty members in the sciences), each about one research topic they are interested in.
Collect from each user a search statement and associated query terms that you both agree best represent the user's information need. Also try to characterize the information need using attributes such as "topic familiarity" and "uncertainty".
b. For each search topic, submit the queries on the user's behalf to Google Scholar, Microsoft Academic Search, Semantic Scholar, or other major citation databases (e.g. Scopus, WOS). Collect the first 30 links from each of the two returned sets.
c. Determine the degree of overlap between the two returned sets.
d. Mix the non-duplicate links (at most 30 × 2 = 60) together and strip the graphic cues.
This is done so that the user cannot tell which search engine each link came from.
e. For each link, record its source search engine and rank position.
f. Present the URLs in Microsoft Word files that allow the users to examine the actual web page by clicking its hyperlink. Ask them to judge the relevance (topical as well as situational) of the pages on a 0-4 scale (0 stands for not relevant at all; 4, very relevant).
g. Create an Excel or SPSS data file and enter the relevance scores.
h. Compare the performance of the search engines based on 1) Mean Average Precision and 2) cumulative gain (CG) and discounted cumulative gain (DCG).
i. Next, submit the same query to Scopus and Web of Science and conduct a domain analysis, in which you identify the publication trends, major authors, institutions, journals, countries, and disciplines that have published in this area. |
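The overlap, blinded pooling, and metric steps of this project can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the required procedure. The function names, the Jaccard measure of overlap, the threshold of 2 on the 0-4 scale used to turn graded scores into relevant/not relevant for average precision, and the `log2(rank + 1)` discount for DCG are all choices made for this sketch. The engine names, links, and scores are invented. Average precision is normalized here by the number of relevant links in the engine's own list, since only the retrieved links are judged.

```python
import math
import random


def overlap(links_a, links_b):
    """Return the shared links and the Jaccard overlap of two result lists."""
    set_a, set_b = set(links_a), set(links_b)
    shared = set_a & set_b
    union = set_a | set_b
    return shared, (len(shared) / len(union) if union else 0.0)


def blinded_pool(results, seed=0):
    """Merge result lists, drop duplicates, and shuffle them for the judge.

    Returns the shuffled pool plus a key mapping each link to the rank it
    had in each engine, which is kept hidden from the user.
    """
    key = {}
    for engine, links in results.items():
        for rank, link in enumerate(links, start=1):
            key.setdefault(link, {})[engine] = rank
    pool = sorted(key)
    random.Random(seed).shuffle(pool)
    return pool, key


def average_precision(scores, threshold=2):
    """Average precision of one ranked list; score >= threshold is relevant."""
    hits, total = 0, 0.0
    for rank, score in enumerate(scores, start=1):
        if score >= threshold:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def cumulative_gain(scores, k=None):
    """Sum of the graded relevance scores in the top k positions."""
    return sum(scores[:k])


def discounted_cumulative_gain(scores, k=None):
    """Graded scores discounted by log2(rank + 1), summed over the top k."""
    return sum(s / math.log2(rank + 1) for rank, s in enumerate(scores[:k], start=1))


# Toy data (invented): two engines, three links each, one user's judgments.
results = {"engine_a": ["u1", "u2", "u3"], "engine_b": ["u2", "u4", "u1"]}
judgments = {"u1": 4, "u2": 0, "u3": 2, "u4": 1}

shared, jaccard = overlap(results["engine_a"], results["engine_b"])
pool, key = blinded_pool(results)
print("shared:", sorted(shared), "jaccard:", jaccard)
for engine, links in results.items():
    scores = [judgments[link] for link in links]
    print(engine, average_precision(scores), cumulative_gain(scores),
          discounted_cumulative_gain(scores))
```

Mean Average Precision for an engine is then the mean of its `average_precision` values across all search topics.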
6. |
Digital library construction |
30% |
Each group will collaboratively build a functional online digital library using WordPress, Joomla, or the Greenstone digital library (GSDL) open source content management system.
DL_project_example1 DL_project_example2 DL_project_example3
The project consists of three components: the implementation of a digital collection on a topic of your own choosing, a written report (5-6 pages), and an oral presentation of the project.
The digital collection should include:
a. A minimum of 70 documents representing different document formats such as PDF, Word, and HTML.
b. An index structure that enables browsing of the collection
c. The provision of faceted and fielded search
The written report (4-6 pages) should:
d. Explain the collection's aim, purpose, and sources, as well as its intended users and their information needs.
Ideally, come up with an institutional context (real or imaginary) in which the collection would be used.
e. Define your selection and indexing policies (human and machine indexing components; metadata structure) based on the aim and purpose stated above.
f. Include a graphic presentation of the browsable index structure and the rationale behind your design
(i.e. explain why you chose certain browsable facets and searchable fields to represent your collection) |
7. |
Final exam |
30% |
The exam is based on the lecture notes and readings. A review will be given before the exam to help you prepare for it. |
|
Week |
Date |
Topic |
Week 1 |
9/16 |
Introduction to syllabus
History of IR; data vs. information retrieval |
Week 2 |
9/23 |
Advanced search with PubMed; introduction to search features with PubMed/Ovid/Ebsco/EMBASE
Discussion of your search demo project |
Week 3 |
9/30, LAB |
Search strategies and tactics; PICO;
Camtasia demo (laptop)
Discussion of your search demo project |
Week 4 |
10/07 |
Indexing exhaustivity vs. specificity
Automatic indexing basics (text analysis, term weighting) |
Week 5 |
10/14 |
Search feature/command demo due |
Week 6 |
10/21 |
Demo of ctext.org at the lab; TF*IDF tool
Demo Corpro |
Week 7 |
10/28 |
IR evaluation;
Discussion of your second project (IR evaluation) |
Week 8 |
11/04 |
IR models I: Boolean; term weighting and vector space model; similarity measures;
Discussion of your IR evaluation project
Homework/automatic indexing due |
Week 9 |
11/11 |
Relevance feedback and query expansion;
Discussion of your IR evaluation project |
Week 10 |
11/18 |
Simulated search evaluation presentation |
Week 11 |
11/25 |
Facet analysis and information architecture
WordPress demo at computer lab
Discussion of your DL project |
Week 12 |
12/02 |
IR models II: Probabilistic model
Discussion of your DL project |
Week 13 |
12/09 |
IR models: probabilistic and language models
Discussion of your DL project |
Week 14 |
12/16 |
Lab session with your DL project |
Week 15 |
12/23 |
DL assignment presentation |
Week 16 |
12/30 |
Web search and link structure |
Week 17 |
1/06 |
Final review |
Week 18 |
1/13 |
Final exam |
|