Information School
INF6060 Information Retrieval
Postgraduate coursework (Part II)
Expert Assessment of an Information Retrieval System
Date due: December 16, 2015
Length: 1500 words
Purpose of Coursework

The objective of this assessment is to apply the theoretical knowledge that you have learned from this module to a real-world use case: the evaluation of an information retrieval system. To do this you must integrate your learning on how an information retrieval system works, what types of functions are typically used, and thus how the system supports users in finding the information they need. For the rest of this document, we will refer to an information retrieval system as a search system.

All websites, intranets, and information sources use search systems to enable their user and client base to find items among the content, from webpages to documents to snippets of information. For this coursework, and as a soon-to-be information manager, you will evaluate how effectively a system appears to be meeting user needs, and then recommend future improvements/developments.
Note that the document you provide may seem short, but this is in keeping with the real world, which expects reports that are focused and succinct, with a clear direction of action. The process needed to arrive at those 1500 words must not be underestimated. We anticipate that this will be about a week’s work: designing the test, conducting it, checking and confirming against the research literature, and writing the report.
This expert assessment evaluation can be summarized in five steps:
A) A system to evaluate will be assigned to you.
B) Decide the purpose of the evaluation and the criteria used to assess the objectives.
C) Design the evaluation, which includes:
   1) identifying a set of generic (but typical) tasks that a user of the website would perform;
   2) identifying the measures to be used to assess the system response and thus respond to the criteria and objectives.
D) Conduct the evaluation, collect the data, and analyse the data.
E) Write the report.

Each of these steps is described in the following sections.
A) Search System.

We have selected a range of websites, i.e., information sources, that use an information retrieval system. Each member of the class will be randomly assigned one of the websites listed in Appendix A. As a graduate of this School, you should be able to apply your skills to any information system, regardless of your background and experience, and the course that you are currently doing.
Explore the website so that you understand the source and its intended user group, and closely examine its search system so that you understand how it works and what its key functions are. Refer to lecture notes and to the academic research literature for help in understanding what is important. As you experiment with the search system, you should consider what functions are provided that help people during the search process, specifically with formulating queries, refining them and examining the search results.

Each of the assigned websites enables additional types of functionality, such as purchasing or submitting an application, and may have multiple menus and navigational toolbars. The intent is not to evaluate the website, the interface, or the source; the intent is to focus only on the search system, which usually starts with the search box that enables query entry.
B) Design the Expert Assessment

B.1 Purpose

What is the purpose of the evaluation? With the limited time for this project, you will not be able to evaluate the IR system in a holistic way. Instead, this is an expert assessment (where you are the expert) with a tightly focused purpose. You get to choose that tightly focused purpose.

What do you want to learn from the evaluation? Given what you now know about information retrieval systems, what should you evaluate: Relevance of results? Efficiency? Effectiveness? Usefulness of the query box? Informativeness of the snippets? There are many aspects that one could consider. Specify two objectives that you would like to consider.
An objective needs to be specific and measurable (either qualitatively or quantitatively). The following are examples of objectives:
1. To evaluate how healthy the University’s lunch “Meal Deal” is;
2. To assess the efficiency of the search engine;
3. To evaluate whether the search engine performs better for specific focused tasks, e.g., known item and facts, than for more generally focused tasks, e.g., finding descriptions and introductory material.

These objectives will either directly or indirectly identify the criteria. Given the objective of the test, what criteria will you use? In the three examples above, the criteria are:
1. healthiness
2. efficiency
3. performance

In all cases, the criteria emerge from the objectives. The next challenge is to clearly identify what each means using the research literature.
B.2 Information Tasks

What tasks will you use to test the IR system? These tasks should be those expected of typical users of this website, and considered exemplars. Some may be the most popular and frequently used ones, while others may be used irregularly but are essential. For example, one may access a University site only once during a programme to find out about the date of graduation, but look for the seminar timetable on a weekly basis.

Ideally we would have real tasks used by real users. But this would require a formal task analysis that is more like a dissertation topic and outside the scope of this assignment. For this assignment, you need useful surrogates that one could reasonably assume a typical user would perform on that website.
Not all tasks are equal. There are many different types; here are six that are widely used throughout the web and inside organisations:
• Known item search: the search for a particular information item that is known to exist, e.g. the search for a specific book, or a particular image, etc.
• Factual search: the search for a factual piece of information, e.g. the population of Sheffield, the number of 2 bedroom flats for rent in Crookes, etc.
• Search for instruction: the search for a set of instructions or an explanation of how to achieve something, e.g. how to change the oil in a car, make a cake, plan a trip to Rome, etc.
• Search for description: the search for a rich description of an object, place or other item, e.g. find a description of the Mona Lisa, a description of the landscape in Hawaii, etc.
• Location search: the search for the location of an object or item, e.g. where can I find the original Mona Lisa painting? Where is the closest library with a copy of the book “The art of motorcycle maintenance”? etc.
• Finding introductory material: the search for an introduction to a topic, e.g. an introduction to “complex numbers”, or an introduction to Shannon and Weaver’s definition of entropy, etc.
Identify a representative instance of each that could be used in testing the system. An example is provided for each task type above. Use that pattern to customise the task for your assigned site. If you believe that a task type is not relevant to your system, then indicate this in the report: insert “not applicable” and provide a reason why you think so. For example, if you were asked to identify a shopping task on the Youtube website, then we would say that a purchasing task is not applicable because the site does not provide for shopping and purchasing products.
B.3 Measures

How will you assess each task? What measures will you use? Many measures were introduced in the two class lectures on evaluation. Which are the most appropriate for the evaluation given the criteria that you have identified for assessment? In the examples given in section B.1:
• Healthiness could be measured by the quantity of fat and sugar as a percentage of the daily food requirements.
• Efficiency could be measured by the number of mouseclicks, the number of queries, or the amount of time.
• Performance could be measured by its ability to put the best (e.g., most relevant, most useful) document within the top three ranked items on the results list. You would still need to define what you mean by relevant. Performance might also be measured by the number of mouseclicks.

Identify five measures that you will use. Clearly and parsimoniously define each using the research literature.
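As an illustration only, the "best document within the top three" measure and a simple precision count could be computed as in the Python sketch below. The function names, the graded relevance scale (0 to 2) and the sample judgements are assumptions made for this example and are not part of the coursework; your own definitions should come from the research literature.

```python
from typing import Optional, Sequence


def rank_of_best(judgements: Sequence[int]) -> Optional[int]:
    """Return the 1-based rank of the highest-judged result.

    `judgements` holds one assessor score per result, in ranked order.
    Assumed scale: 0 = not relevant, 1 = partially relevant, 2 = relevant.
    Returns None when no result was judged relevant at all.
    """
    if not judgements or max(judgements) == 0:
        return None
    # index() returns the first (highest-ranked) occurrence of the best score
    return judgements.index(max(judgements)) + 1


def best_in_top_k(judgements: Sequence[int], k: int = 3) -> bool:
    """True when the best-judged result appears within the top k ranks."""
    rank = rank_of_best(judgements)
    return rank is not None and rank <= k


def precision_at_k(judgements: Sequence[int], k: int = 10) -> float:
    """Fraction of the top k ranks holding a result judged at least partially relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    relevant = sum(1 for score in judgements[:k] if score > 0)
    return relevant / k


# Invented judgements for the first five snippets returned for one query
sample = [0, 2, 1, 0, 0]
print(rank_of_best(sample), best_in_top_k(sample), precision_at_k(sample, k=5))
```

Whichever measures you choose, the definition (including the relevance scale) needs to be stated precisely enough that another assessor would record the same values.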
B.4 Design the Evaluation

Once you have specified all of the elements discussed above, design your test. Here are some things to think about; this list is by no means complete:
• Will you use only one query per task?
• Will you control the number of words in each query?
• Will you only look at the first page of results?
• Will you only look at snippets for the answer?
• Will you check how the system handles misspelling, error detection and correction?
• Will you check how the system handles various forms of a word, e.g., plurals?
An evaluation such as this should be replicable by others, in case your summation of the system is challenged. Thus it is very important that the precise steps that you take are provided. For example, a very, very simple test could have these steps:
1. One instance of each task was created by looking at the scope of the website, and exploring the menu structure.
2. All tasks contained a similar pattern with a maximum of three key concepts.
3. One query of 3 words was created for each task.
4. Each query was entered into the search box, and the results were displayed. Only snippets were used to assess response.
5. The five measures were then entered into the table.
6. Steps 3-5 were repeated for each information task.

Provide a similar list, being very precise about the steps that a person would take to perform the assessment.
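One way to keep such a procedure consistent is to record every observation in the same fixed structure. The Python sketch below is a hypothetical illustration, not a required tool: the five measure names, the `Observation` record and the sample values are all invented for the example, and the three-word query limit simply mirrors step 3 of the sample procedure above.

```python
import csv
import io
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical measure names; replace them with the five measures you define.
MEASURES = ["time_on_task_s", "queries", "clicks", "rank_of_best", "relevant_in_top_10"]


@dataclass
class Observation:
    """One row of the results table: a task, its query and the recorded values."""
    task_type: str
    query: str
    values: Dict[str, float] = field(default_factory=dict)


def check_query(query: str, max_words: int = 3) -> None:
    """Enforce the fixed query length chosen in the procedure."""
    if len(query.split()) > max_words:
        raise ValueError(f"query exceeds {max_words} words: {query!r}")


def to_csv(observations: List[Observation]) -> str:
    """Render the observations as CSV text with one column per measure."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["task_type", "query"] + MEASURES)
    for obs in observations:
        check_query(obs.query)
        # Missing measures are left blank rather than guessed
        writer.writerow([obs.task_type, obs.query] + [obs.values.get(m, "") for m in MEASURES])
    return buffer.getvalue()


# Invented example values for a single task
row = Observation(
    "known item",
    "sheffield graduation date",
    {"time_on_task_s": 95, "queries": 1, "clicks": 2, "rank_of_best": 1, "relevant_in_top_10": 4},
)
print(to_csv([row]))
```

Fixing the column set before the test starts makes it harder to record a measure for some tasks and forget it for others.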
C) Conduct the Evaluation and Analyse the Data

C.1 Conduct the Evaluation

Using the procedure that you created in B.4, conduct the test. Do this in one sitting, rather than haphazardly. Collect all of the measures and insert them into a table (see the template) as you go. Consistency is important in doing an evaluation.
C.2 Analyse the Results

A first step is synthesizing the results. The table that you populated in C.1 will now be full of numbers or text. What do the results mean? How well did the system perform overall? Objectivity in discussing results is important. For example, one could imagine statements such as these included in the results section: “On average the tasks took 3 minutes to complete;” “Most tasks took three queries to acquire one useful webpage;” “The best response was consistently in the top three ranked documents.”

Then, examine the table in more detail. Did the system perform similarly for all tasks, or did the system handle some tasks better than others? Do you see any common patterns emerging, by measure, or across the set of measures? Do you see any pattern by information task? Or by query? This is where you get to use those critical analysis skills that are so important in graduate school.

Finally, how will you respond to each objective? All conclusions about the system must be based on evidence. Evidence will be one of two types: a) results of the test using the tasks; b) results from the research literature that examined a similar problem.
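The synthesis described above (an overall figure per measure, then a comparison across task types) can be sketched in a few lines of Python. Every number below is invented purely for illustration, and the measure names and the split into "focused" and "general" tasks are assumptions borrowed from example objective 3, not a prescribed analysis.

```python
from statistics import mean
from typing import Dict

# Invented results table: one row per task type, one value per measure
results: Dict[str, Dict[str, float]] = {
    "known item":   {"time_s": 60,  "queries": 1, "rank_of_best": 1},
    "factual":      {"time_s": 120, "queries": 2, "rank_of_best": 2},
    "description":  {"time_s": 240, "queries": 3, "rank_of_best": 5},
    "introductory": {"time_s": 300, "queries": 4, "rank_of_best": 4},
}

# Assumed grouping: which task types count as "specific focused" tasks
FOCUSED = {"known item", "factual"}


def mean_by_measure(table: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Average each measure over all tasks (the 'overall' view)."""
    measures = next(iter(table.values())).keys()
    return {m: mean(row[m] for row in table.values()) for m in measures}


def mean_by_group(table: Dict[str, Dict[str, float]], measure: str) -> Dict[str, float]:
    """Average one measure separately for focused and general tasks."""
    focused = [row[measure] for task, row in table.items() if task in FOCUSED]
    general = [row[measure] for task, row in table.items() if task not in FOCUSED]
    return {"focused": mean(focused), "general": mean(general)}


print(mean_by_measure(results))
print(mean_by_group(results, "time_s"))
```

With only one instance per task type, averages like these describe what was observed; they do not establish a statistically reliable difference, so conclusions still need support from the research literature.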
C.3 Provide two to three Recommendations

All reports are completed for a purpose. What should occur as a result of the assessment? What will you recommend to the website owner that provides the search system? Provide two to three recommendations that you can link directly to your results, or to the research literature. All recommendations must be supported by research.
D) Write the Report

The report submitted for this coursework will be created using the template provided. You must
• use the template provided on the Mole website. Do not change the headings or formatting in the template. Simply answer the questions.
• keep within 5% of the required word count for each section. This is very good practice for being succinct and focused in your writing.
• change the name of the file you submit to your student identification number. For example, if your number is 150213350, then convert the filename to 150213350.docx.
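A quick self-check of the section word counts can be sketched as follows. This is a hypothetical helper, not part of the template: the section limits are copied from the marking scheme in this document, and the sketch assumes that a deviation of exactly 5% counts as outside the tolerance, following the stricter "5% or more" wording in the submission requirements.

```python
from typing import Dict

# Section word limits as listed in the marking scheme of this brief
SECTION_LIMITS: Dict[str, int] = {
    "Website & IR system": 50,
    "Objectives & Criteria": 100,
    "Measures": 200,
    "Method or Procedures": 200,
    "Results": 200,
    "Analysis": 450,
    "Recommendations": 300,
}


def within_tolerance(count: int, limit: int, tolerance_percent: int = 5) -> bool:
    """True when `count` deviates from `limit` by less than the tolerance.

    Integer arithmetic avoids floating-point surprises at the boundary.
    Assumption: a deviation of exactly 5% is treated as out of tolerance.
    """
    return abs(count - limit) * 100 < tolerance_percent * limit


def out_of_range(counts: Dict[str, int]) -> Dict[str, int]:
    """Return the sections whose word count falls outside the tolerance."""
    return {
        section: count
        for section, count in counts.items()
        if not within_tolerance(count, SECTION_LIMITS[section])
    }


# Invented draft counts for illustration
draft = {"Measures": 205, "Analysis": 480, "Website & IR system": 47}
print(out_of_range(draft))
```

The limits sum to the overall length of 1500 words, so a report whose sections all pass this check will also sit close to the total.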
Report section / Marking Scheme

1 Name of information source and its URL
Value = Required, but ungraded

2 Website & IR system [50 words]
Briefly describe the website and identify why it has a search system.
Value = 5%
Marking scheme: This requires a clear, succinct description of the system and an objective statement of the purpose of a search system for this website.
3 Objectives & Criteria [100 words]
Specify precisely and definitively two objectives for the evaluation, and define the criteria used in the evaluation.
Value = 10%
Marking scheme: Objectives must be precise, and the criteria clearly defined.

4 Information Tasks [no word restriction, but should not require more than 25 words per task]
Provide one instance of each task type. Specify the queries used for each task. There is no right or wrong answer for this.
Value = Required, but ungraded.
Marking scheme: The tasks will inform the evaluation. Poorly selected tasks and inconsistent and illogical queries will affect all other aspects of the evaluation.

5 Measures [200 words]
Identify the five measures used to assess each task. Provide a clear definition. Each measure must relate to the previously defined criteria.
Value = 10%
Marking scheme: Each measure must be clearly defined with appropriate citation from the research literature. We must know how it was applied.
6 Method or Procedures [200 words]
Explain how the information retrieval system was evaluated. Since this is very procedural, use a list format.
Value = 10%
Marking scheme: The description must be logical, and must be clearly presented in such a way that someone else could repeat it.

7 Results [200 words]
a. Insert the result for each task and measure into the table. Change the headings to be consistent with the measures you selected. For example, change “Criteria 1” to “Time on task.”
b. Briefly summarise the results based on the data.
Value = 20%
Marking scheme: Data must be accurate. If the marker enters the queries and uses the definition found in Measures, the same result should be obtained. This requires an objective, succinct description of the results table.
8 Analysis [450 words]
Provide the details of the analysis. Include a response to your objectives. Given your analysis, what will you conclude?
Value = 30%
Marking scheme: This section requires a demonstration of your ability to analyse, and at the same time evidence of your knowledge gained from the module. An analysis without evidence to support it will lose marks.

9 Recommendations [300 words]
Identify recommendations for the website owner. Please ensure that your recommendations can be supported by past research and include the appropriate citations.
Value = 10%
Marking scheme: Similar to the Analysis section, this section requires a demonstration of your ability to think critically about the results and your knowledge gained from the module. Recommendations that do not emerge from the analysis or from the research literature will be downgraded. As in the previous section, you need evidence to support your position.
10 References [not included in word count]
All references cited in the report must be included in a bibliography. Use the APA format. Given the type of report, one would expect to see between 6 and 10 references used.
Value = 5%
Marking scheme: The report requires use of the research literature. Non-use of the academic research literature will result in no marks for this section as well as a 5% reduction in each of the Results and Analysis sections. Using news articles, blogs, and opinion pieces is also inappropriate as they are not evidence and will be dismissed in the marking. In addition, this section needs a consistent and standard presentation of all references. Providing only an author, title and URL is insufficient.
Structure, Language and Writing Style

The structure of this report has been clearly defined for you. Concentrate on comprehension, grammar and language. Marks will be deducted from the total mark for:
a) inadequate writing
b) lack of comprehension
c) inappropriate word use
Information School Coursework Submission Requirements

It is the student’s responsibility to ensure no aspect of their work is plagiarised or the result of other unfair means. The University’s and Information School’s Advice on unfair means can be found in your Student Handbook, available via http://www.sheffield.ac.uk/is/current.

Your assignment has a word count limit. A deduction of 3 marks will be applied for coursework that is 5% or more above or below the word count as specified above or that does not state the word count.

It is your responsibility to ensure your coursework is correctly submitted before the deadline. It is highly recommended that you submit well before the deadline.
Coursework submitted after 2pm on the stated submission date will incur a deduction of 5% of the mark awarded for each working day after the submission date/time, up to a maximum of 5 working days, where ‘working day’ includes Monday to Friday (excluding public holidays) and runs from 2pm to 2pm. Coursework submitted after the maximum period will receive zero marks.
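The arithmetic of the late penalty can be illustrated with a short Python sketch. It rests on two assumptions that the text above does not spell out: the deductions are applied linearly to the mark originally awarded (not compounded day on day), and the caller has already counted the number of working days late using the 2pm-to-2pm rule. The function name is hypothetical, and the School's own calculation takes precedence.

```python
def late_adjusted_mark(mark: float, working_days_late: int) -> float:
    """Apply a 5% deduction of the awarded mark per working day late.

    Assumptions: deductions are linear on the awarded mark, and
    `working_days_late` was counted by the caller (Monday to Friday,
    excluding public holidays, each day running 2pm to 2pm).
    """
    if working_days_late < 0:
        raise ValueError("working_days_late cannot be negative")
    if working_days_late > 5:
        # Beyond the maximum period the coursework receives zero marks
        return 0.0
    return mark * (100 - 5 * working_days_late) / 100


# An awarded mark of 60, submitted two working days late
print(late_adjusted_mark(60, 2))
```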
Work submitted electronically, including through Turnitin, should be reviewed to ensure it appears as you intended.

Before the submission deadline, you can submit coursework to Turnitin numerous times. Each submission will overwrite the previous submission. Only your most recent submission will be assessed. However, after the submission deadline, the coursework can only be submitted once.

During your first Semester at the School, when submitting a piece of work through Turnitin, you will only be able to view a ‘similarity report’ when submitting your Test Essay. You can then edit and resubmit your Test Essay. For other coursework you will not be able to view a Turnitin ‘similarity report’.

Details about the submission of work via Turnitin can be found at http://youtu.be/C_wO9vHHheo
If you encounter any problems during the electronic submission of your coursework, you should immediately contact the module coordinator and one of the Information School Exams Secretaries (Julie Priestley, J.Priestley@sheffield.ac.uk, 0114 2222839 or Larah Arvandi, l.arvandi@sheffield.ac.uk, 0114 2222640). This does not negate your responsibility to submit your coursework on time and correctly.
Appendix A. List of Sources for Evaluation

Amazon (http://www.amazon.com)
BBC (http://www.bbc.co.uk/search)
Bloomberg business Europe (http://www.bloomberg.com/europe)
British Petroleum (BP) (http://www.bp.com)
Deloitte (http://www2.deloitte.com/uk/en.html)
Elsevier (http://www.elsevier.com)
Epicurious (http://www.epicurious.com)
Europeana (http://www.europeana.eu/portal/)
Ford cars (http://www.ford.co.uk)
Getty Images (http://www.gettyimages.co.uk/editorialimages/archival)
Microsoft Research (http://research.microsoft.com/en-us/)
Olympic Movement (http://www.olympic.org)
Reuters (http://uk.reuters.com/)
Rightmove (www.rightmove.co.uk)
Shell UK (http://www.shell.co.uk)
Trove Australian Newspaper archive (https://trove.nla.gov.au/newspaper)
UK Science museum (http://www.sciencemuseum.org.uk)
Vauxhall cars (http://www.vauxhall.co.uk/)
Youtube (https://www.youtube.com)
Zoopla (http://www.zoopla.co.uk/)