
Information School
INF6060 Information Retrieval
Postgraduate coursework (Part II): Expert Assessment of an Information Retrieval System
Date due: December 16, 2015
Length: 1500 words

Purpose of Coursework

The objective of this assessment is to apply the theoretical knowledge that you have learned from this module to a real-world use case: the evaluation of an information retrieval system. To do this you must integrate your learning on how an information retrieval system works, what types of function are typically used, and thus how the system supports users in finding the information they need. For the rest of this document, we will refer to an information retrieval system as a search system.

All websites, intranets, and information sources use search systems to enable their user and client base to find items among the content, from webpages to documents to snippets of information. For this coursework, and as a soon-to-be information manager, you will evaluate how effectively a system appears to be meeting user needs, and then recommend future improvements/developments.

Note that the document you provide may seem short, but this is in keeping with the real world, which expects reports that are focused, succinct and with a clear direction of action. The process that you will need to deploy to get to those 1500 words must not be underestimated. We anticipate that this will be about a week's work, from designing the test, conducting it, and checking and confirming with the research literature, to writing the report.

This expert assessment evaluation can be summarised in five steps:
A) A system to evaluate will be assigned to you.
B) Decide the purpose of the evaluation and the criteria used to assess the objectives.
C) Design the evaluation, which includes: 1) identifying a set of generic (but typical) tasks that a user of the website would use; and 2) identifying the measures to be used to assess the system response and thus respond to the criteria and objectives.
D) Conduct the evaluation, collect the data, and analyse the data.
E) Write the report.
Each of these steps is described in the following sections.

A) Search System

We have selected a range of websites, i.e., information sources, that use an information retrieval system. Each member of the class will be randomly assigned one of the websites listed in Appendix A. As a graduate of this School, you should be able to apply your skills to any information system, regardless of your background and experience, and the course that you are currently doing.

Explore the website so that you understand the source and its intended user group, and closely examine its search system so that you understand how it works and what its key functions are. Refer to lecture notes and to the academic research literature for help in understanding what is important. As you experiment with the search system, you should consider what functions are provided that help people during the search process, specifically with formulating queries, refining them, and examining the search results.

Each of the assigned websites enables additional types of functionality, such as purchasing and submitting an application, and may have multiple menus and navigational toolbars. The intent is not to evaluate the website, the interface, or the source; the intent is to focus only on the search system, which usually starts with the search box that enables query entry.

B) Design the Expert Assessment

B.1 Purpose

What is the purpose of the evaluation?
With the limited time for this project, you will not be able to evaluate the IR system in a holistic way. Instead, this is an expert assessment (where you are the expert) with a tightly focused purpose. You get to choose that tightly focused purpose. What do you want to learn from the evaluation? Given what you now know about information retrieval systems, what should you evaluate: relevance of results? Efficiency? Effectiveness? Usefulness of the query box? Informativeness of the snippets? There are many, many aspects that one could consider.

Specify two objectives that you would like to consider. An objective needs to be specific and measurable (either qualitatively or quantitatively). The following are examples of objectives:
1. To evaluate how healthy the University's lunch "Meal Deal" is;
2. To assess the efficiency of the search engine;
3. To evaluate whether the search engine performs better for specific focused tasks, e.g., known items and facts, than for more generally focused tasks, e.g., finding descriptions and introductory material.

These objectives will either directly or indirectly identify the criteria. Given the objective of the test, what criteria will you use? In the three examples above, the criteria are:
1. healthiness
2. efficiency
3. performance
In all cases, the criteria emerge from the objectives. The next challenge is to clearly identify what each means using the research literature.

B.2 Information Tasks

What tasks will you use to test the IR system? These tasks should be those expected of typical users of this website, and considered exemplars. Some may be the most popular and frequently used ones, but some may be irregularly used yet essential. For example, one may access a University site only once during a programme to find out the date of graduation, but look for the seminar timetable on a weekly basis.

Ideally we would have real tasks used by real users, but this would require a formal task analysis that is more like a dissertation topic and outside the scope of this assignment. For this assignment, you need useful surrogates that one could reasonably assume a typical user would do with that website.

All tasks are not equal. There are many different types; here are six that are widely used throughout the web and inside organisations:
• Known item search: the search for a particular information item that is known to exist, e.g. the search for a specific book, or a particular image, etc.
• Factual search: the search for a factual piece of information, e.g. the population of Sheffield, the number of 2-bedroom flats for rent in Crookes, etc.
• Search for instruction: the search for a set of instructions or an explanation of how to achieve something, e.g. how to change the oil in a car, make a cake, plan a trip to Rome, etc.
• Search for description: the search for a rich description of an object, place or other item, e.g. find a description of the Mona Lisa, a description of the landscape in Hawaii, etc.
• Location search: the search for the location of an object or item, e.g. where can I find the original Mona Lisa painting? Where is the closest library with a copy of the book "The art of motorcycle maintenance"? etc.
• Finding introductory material: the search for an introduction to a topic, e.g. an introduction to "complex numbers", or an introduction to Shannon and Weaver's definition of entropy, etc.

Identify a representative instance of each that could be used in testing the system. An example is provided for each task type above.
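To make this concrete, the sketch below shows one possible set of task instances and single queries, assuming a hypothetical assignment of the property site Zoopla (listed in Appendix A). The tasks and queries are illustrative only and are not prescribed by the brief.

```python
# Hypothetical task instances and queries for an assumed assignment
# of a property-search site (e.g., Zoopla). Illustrative only.
tasks = {
    "known item":   ("Find the listing for a named development you saw advertised",
                     "Kelham Island new apartments"),
    "factual":      ("Find how many 2-bedroom flats are for rent in Crookes",
                     "2 bedroom flats rent Crookes"),
    "instruction":  ("Find guidance on how to arrange a property viewing",
                     "how to book viewing"),
    "description":  ("Find a rich description of a particular neighbourhood",
                     "Crookes area guide"),
    "location":     ("Find where the nearest branch of a named estate agent is",
                     "estate agents near S10"),
    "introductory": ("Find introductory material on buying a first home",
                     "first time buyer guide"),
}

# Print one instance per task type, with the query used to test it.
for task_type, (task, query) in tasks.items():
    print(f"{task_type}: {task} -> query: '{query}'")
```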
Use that pattern to customise the task for your assigned site. If you believe that a task type is not relevant to your system, then indicate this in the report: insert "not applicable" and provide a reason for why you think so. For example, if you were asked to identify a shopping task on the YouTube website, then we would say that a purchasing task is not applicable because the site does not provide for shopping and purchasing products.

B.3 Measures

How will you assess each task? What measures will you use? Many measures were introduced in the two class lectures on evaluation. Which are the most appropriate for the evaluation, given the criteria that you have identified for assessment? In the examples given in section B.1:
• Healthiness could be measured by the quantity of fat and sugar as a percentage of the daily food requirements.
• Efficiency could be measured by the number of mouse clicks, the number of queries, or the amount of time taken.
• Performance could be measured by the system's ability to put the best (e.g., most relevant, most useful) document within the top three ranked items on the results list. You would still need to define what you mean by relevant. Performance might also be measured by the number of mouse clicks.

Identify five measures that you will use. Clearly and parsimoniously define each using the research literature.

B.4 Design the Evaluation

Once you have specified all of the elements discussed above, then design your test. Here are some things to think about, and this list is by no means complete:
• Will you use only one query per task?
• Will you control the number of words in each query?
• Will you only look at the first page of results?
• Will you only look at snippets for the answer?
• Will you check how the system handles mis-spelling, error detection and correction?
• Will you check how the system handles various forms of a word, e.g., plurals?

An evaluation such as this should be replicable by others, in case your summation of the system is challenged. Thus it is very important that the precise steps that you take are provided. For example, a very, very simple test could have these steps:
1. One instance of each task was created by looking at the scope of the website and exploring the menu structure.
2. All tasks contained a similar pattern, with a maximum of three key concepts.
3. One query of three words was created for each task.
4. Each query was entered into the search box, and the results were displayed. Only snippets were used to assess the response.
5. The five measures were then entered into the table.
6. Steps 3-5 were repeated for each information task.

Provide a similar list, being very precise about the steps that a person would take to perform the assessment.

C) Conduct the Evaluation and Analyse the Data

C.1 Conduct the Evaluation

Using the procedure that you created in B.4, conduct the test. Do this in one sitting, rather than haphazardly. Collect all of the measures and insert them into a table (see the template) as you go. Consistency is important in doing an evaluation.

C.2 Analyse the Results

A first step is synthesising the results. The table that you populated in C.1 will now be full of numbers or text. What do the results mean? How well did the system perform overall? Objectivity in discussing results is important.
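Purely as an illustration, and not required by the brief, the sketch below shows how a populated results table could be summarised against measures of the kind described in B.3. The task names, measure names and figures are all hypothetical.

```python
# Hypothetical results table: one row per information task, with three
# example measures (time on task in seconds, queries issued, and the
# rank of the best result found). Values are invented for illustration.
results = [
    {"task": "known item",   "time_s": 95,  "queries": 1, "best_rank": 1},
    {"task": "factual",      "time_s": 140, "queries": 2, "best_rank": 3},
    {"task": "instruction",  "time_s": 310, "queries": 3, "best_rank": 7},
    {"task": "description",  "time_s": 205, "queries": 2, "best_rank": 2},
    {"task": "location",     "time_s": 120, "queries": 1, "best_rank": 4},
    {"task": "introductory", "time_s": 260, "queries": 3, "best_rank": 5},
]

n = len(results)
mean_time = sum(r["time_s"] for r in results) / n
mean_queries = sum(r["queries"] for r in results) / n
# Proportion of tasks whose best result appeared within the top three
# ranked items (cf. the "performance" criterion sketched in B.3).
success_at_3 = sum(r["best_rank"] <= 3 for r in results) / n

print(f"Mean time on task: {mean_time:.0f} s")
print(f"Mean queries per task: {mean_queries:.1f}")
print(f"Best result in top three: {success_at_3:.0%}")
```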
For example, one could imagine statements such as these included in the results section: "On average the tasks took 3 minutes to complete"; "Most tasks took three queries to acquire one useful webpage"; "The best response was consistently in the top three ranked documents."

Then, examine the table in more detail. Did the system perform similarly for all tasks, or did the system handle some tasks better than others? Do you see any common patterns emerging, by measure, or across the set of measures? Do you see any pattern by information task? Or by query? This is where you get to use those critical analysis skills that are so important in graduate school.

Finally, how will you respond to each objective? All conclusions about the system must be based on evidence. Evidence will be one of two types: a) results of the test using the tasks; b) results from the research literature that examined a similar problem.

C.3 Provide Two to Three Recommendations

All reports are completed for a purpose. What should occur as a result of the assessment? What will you recommend to the website owner that provides the search system? Provide two to three recommendations that you can link directly to your results, or to the research literature. All recommendations must be supported by research.

D) Write the Report

The report submitted for this coursework will be created using the template provided. You must:
• use the template provided on the Mole website. Do not change the headings or formatting in the template. Simply answer the questions.
• use the required word count, within 5%, for each section. This is very good practice for being succinct and focused in your writing.
• change the name of the file you submit to your student identification number. For example, if your number is 150213350, then convert the filename to 150213350.docx.

Report Sections and Marking Scheme

1. Name of information source and its URL
Value = Required, but ungraded.

2. Website & IR system [50 words]
Briefly describe the website and identify why it has a search system.
Value = 5%. This requires a clear, succinct description of the system and an objective statement of the purpose of a search system for this website.

3. Objectives & Criteria [100 words]
Specify precisely and definitively two objectives for the evaluation, and define the criteria used in the evaluation.
Value = 10%. Objectives must be precise, and the criteria clearly defined.

4. Information Tasks [no word restriction, but should not require more than 25 words per task]
Provide one instance of each task type. Specify the queries used for each task. There is no right or wrong answer for this.
Value = Required, but ungraded. The tasks will inform the evaluation. Poorly selected tasks and inconsistent, illogical queries will affect all other aspects of the evaluation.

5. Measures [200 words]
Identify the five measures used to assess each task. Provide a clear definition. Each measure must relate to the previously defined criteria.
Value = 10%. Each measure must be clearly defined, with appropriate citation from the research literature. We must know how it was applied.

6. Method or Procedures [200 words]
Explain how the information retrieval system was evaluated. Since this is very procedural, use a list format.
Value = 10%. The description must be logical, and must be clearly presented in such a way that someone else could repeat it.

7. Results [200 words]
a. Insert the result for each task and measure into the table. Change the headings to be consistent with the measures you selected.
For example, change "Criteria 1" to "Time on task."
b. Briefly summarise the results based on the data.
Value = 20%. Data must be accurate: if the marker enters the queries and uses the definitions found in Measures, the same results should be obtained. This requires an objective, succinct description of the results table.

8. Analysis [450 words]
Provide the details of the analysis. Include a response to your objectives. Given your analysis, what will you conclude?
Value = 30%. This section requires a demonstration of your ability to analyse and, at the same time, evidence of your knowledge gained from the module. An analysis without evidence to support it will lose marks.

9. Recommendations [300 words]
Identify recommendations for the website owner. Please ensure that your recommendations can be supported by past research and include the appropriate citations.
Value = 10%. Similar to the Analysis section, this section requires a demonstration of your ability to think critically about the results and of your knowledge gained from the module. Recommendations that do not emerge from the analysis, or from the research literature, will be downgraded. Like the previous section, you need evidence to support your position.

10. References [not included in word count]
All references cited in the report must be included in a bibliography. Use the APA format. Given the type of report, one would expect to see between 6 and 10 references used.
Value = 5%. The report requires use of the research literature. Non-use of the academic research literature will result in no marks for this section, as well as a 5% reduction in each of the Results and Analysis sections. Using news articles, blogs, and opinion pieces is also inappropriate, as they are not evidence, and they will be dismissed in the marking. In addition, this section needs a consistent and standard presentation of all references. Providing only an author, title and URL is insufficient.

Structure, Language and Writing Style

The structure of this report has been clearly defined for you. Concentrate on comprehension, grammar and language. Marks will be deducted from the total mark for:
a) inadequate writing
b) lack of comprehension
c) inappropriate word use

Information School Coursework Submission Requirements

It is the student's responsibility to ensure no aspect of their work is plagiarised or the result of other unfair means. The University's and Information School's advice on unfair means can be found in your Student Handbook, available via http://www.sheffield.ac.uk/is/current.

Your assignment has a word count limit. A deduction of 3 marks will be applied to coursework that is 5% or more above or below the word count as specified above, or that does not state the word count.

It is your responsibility to ensure your coursework is correctly submitted before the deadline. It is highly recommended that you submit well before the deadline. Coursework submitted after 2pm on the stated submission date will result in a deduction of 5% of the mark awarded for each working day after the submission date/time, up to a maximum of 5 working days, where "working day" includes Monday to Friday (excluding public holidays) and runs from 2pm to 2pm. Coursework submitted after the maximum period will receive zero marks.

Work submitted electronically, including through Turnitin, should be reviewed to ensure it appears as you intended. Before the submission deadline, you can submit coursework to Turnitin numerous times.
Each submission will overwrite the previous submission, and only your most recent submission will be assessed. However, after the submission deadline, the coursework can only be submitted once.

During your first Semester at the School, when submitting a piece of work through Turnitin, you will only be able to view a "similarity report" when submitting your Test Essay. You can then edit and resubmit your Test Essay. For other coursework you will not be able to view a Turnitin "similarity report". Details about the submission of work via Turnitin can be found at http://youtu.be/C_wO9vHHheo

If you encounter any problems during the electronic submission of your coursework, you should immediately contact the module coordinator and one of the Information School Exams Secretaries (Julie Priestley, J.Priestley@sheffield.ac.uk, 0114 2222839, or Larah Arvandi, l.arvandi@sheffield.ac.uk, 0114 2222640). This does not negate your responsibility to submit your coursework on time and correctly.

Appendix A. List of Sources for Evaluation
• Amazon (http://www.amazon.com)
• BBC (http://www.bbc.co.uk/search)
• Bloomberg Business Europe (http://www.bloomberg.com/europe)
• British Petroleum (BP) (http://www.bp.com)
• Deloitte (http://www2.deloitte.com/uk/en.html)
• Elsevier (http://www.elsevier.com)
• Epicurious (http://www.epicurious.com)
• Europeana (http://www.europeana.eu/portal/)
• Ford cars (http://www.ford.co.uk)
• Getty Images (http://www.gettyimages.co.uk/editorialimages/archival)
• Microsoft Research (http://research.microsoft.com/en-us/)
• Olympic Movement (http://www.olympic.org)
• Reuters (http://uk.reuters.com/)
• Rightmove (www.rightmove.co.uk)
• Shell UK (http://www.shell.co.uk)
• Trove Australian Newspaper archive (https://trove.nla.gov.au/newspaper)
• UK Science Museum (http://www.sciencemuseum.org.uk)
• Vauxhall cars (http://www.vauxhall.co.uk/)
• YouTube (https://www.youtube.com)
• Zoopla (http://www.zoopla.co.uk/)
