Abstract:
This paper presents an empirical study of the temporal patterns characterizing the requests submitted by users to Wikipedia. The study is based on the
analysis of the log lines registered by the Wikimedia Foundation Squid servers after serving the appropriate content in response to users' requests. The analysis
was conducted on the ten most visited editions of Wikipedia and
involved more than 14,000 million log lines corresponding to the traffic of the entire year 2009. The methodology mainly consisted of parsing and
filtering users' requests according to the study directives. As a result, relevant
information fields were finally stored in a database for persistence and further
characterization. In this way, we first assessed whether the traffic to Wikipedia
could serve as a reliable estimator of the overall traffic to all the Wikimedia Foundation projects. Our subsequent analysis of the temporal evolutions corresponding
to the different types of requests to Wikipedia revealed interesting differences and
similarities among them that can be related to the users’ attention to the Encyclopedia. In addition, we have performed separated characterizations of each Wikipedia
edition to compare their respective evolutions over time.
Conference: 5th International Workshop on new Challenges in Distributed Information Filtering and Retrieval (DART'11)
Language: English