ABSTRACT
In this paper, we address the problem of developing actionable quality models for Wikipedia: models whose features directly suggest strategies for improving the quality of a given article. We first survey the literature to understand the notion of article quality in the context of Wikipedia and existing approaches to assessing it automatically. We then develop classification models with varying combinations of more or less actionable features, and find that a model containing only clearly actionable features delivers solid performance. Lastly, we discuss how these results can help improve the quality of articles across Wikipedia.
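To make the idea of an actionable feature concrete, the following is a minimal sketch rather than the model described in this paper. The feature set (reference tags, level-2 headings, images, wikilinks), the regular expressions, and the target values in `HYPOTHETICAL_TARGETS` are all assumptions chosen for illustration. The point is only that each feature maps to something an editor can change directly.

```python
import re

# Hypothetical target values, chosen only for illustration; they are not
# thresholds reported in the paper.
HYPOTHETICAL_TARGETS = {"references": 3, "headings": 2, "images": 1}


def actionable_features(wikitext: str) -> dict:
    """Count simple, directly editable properties of an article's wikitext."""
    return {
        "words": len(re.findall(r"\w+", wikitext)),
        # Opening <ref> or <ref name="..."> tags; closing </ref> tags are not counted.
        "references": len(re.findall(r"<ref[\s>]", wikitext)),
        # Level-2 headings only, e.g. "== History ==".
        "headings": len(
            re.findall(r"^==[^=].*?==\s*$", wikitext, flags=re.MULTILINE)
        ),
        "images": len(re.findall(r"\[\[(?:File|Image):", wikitext)),
        # Internal links, excluding File:/Image: embeds.
        "wikilinks": len(re.findall(r"\[\[(?!(?:File|Image):)", wikitext)),
    }


def suggestions(features: dict, targets: dict = HYPOTHETICAL_TARGETS) -> list:
    """Turn each feature that falls below its target into an editing suggestion."""
    return [
        f"add {targets[name] - features[name]} more {name}"
        for name in targets
        if features[name] < targets[name]
    ]


sample = """Intro text with a [[link]].<ref>Source A</ref>
== History ==
More text.<ref name="b">Source B</ref>
[[File:Example.png|thumb|Caption]]
"""

features = actionable_features(sample)
print(features)
print(suggestions(features))
```

A classifier trained on counts like these could report which features fall short for a given article, which is what makes its output usable as editing advice rather than only a quality score.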
Tell me more: an actionable quality model for Wikipedia