Controversy Detection in Wikipedia Using Collective Classification

Authors:
Shiri Dori-Hacohen, University of Massachusetts, Amherst, Amherst, MA, USA
David Jensen, University of Massachusetts, Amherst, Amherst, MA, USA
James Allan, University of Massachusetts, Amherst, Amherst, MA, USA

SIGIR '16: Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval
July 2016, Pages 797–800
https://doi.org/10.1145/2911451.2914745

Published: 07 July 2016

ABSTRACT

Concerns over personalization in IR have sparked an interest in detection and analysis of controversial topics. Accurate detection would enable many beneficial applications, such as alerting search users to controversy. Wikipedia's broad coverage and rich metadata offer a valuable resource for this problem. We hypothesize that intensities of controversy among related pages are not independent; thus, we propose a stacked model which exploits the dependencies among related pages. Our approach improves classification of controversial web pages when compared to a model that examines each page in isolation, demonstrating that controversial topics exhibit homophily. Using notions of similarity to construct a subnetwork for collective classification, rather than using the default network present in the relational data, leads to improved classification with wider applications for semi-structured datasets, with the effects most pronounced when a small set of neighbors is used.
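The stacked idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the single intrinsic feature per page, the topic vectors, the labels, the choice of cosine similarity, and the tiny logistic regression are all assumptions made for illustration. A base model first scores each page in isolation; a second model then sees the page's own features plus the mean base score of its k most similar pages. For brevity the sketch trains and scores on the same pages, whereas a careful stacked setup would feed the second stage held-out base predictions.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def top_k_neighbors(vectors, k):
    """For each page, return the indices of its k most similar other pages."""
    result = []
    for i, u in enumerate(vectors):
        ranked = sorted(
            ((cosine(u, v), j) for j, v in enumerate(vectors) if j != i),
            reverse=True,
        )
        result.append([j for _, j in ranked[:k]])
    return result

def sigmoid(z):
    # Clamp the input so math.exp cannot overflow.
    return 1.0 / (1.0 + math.exp(-max(-60.0, min(60.0, z))))

def predict_one(w, x):
    """Probability of the positive class; w holds feature weights, then a bias."""
    return sigmoid(sum(wd * xd for wd, xd in zip(w, x)) + w[-1])

def train_logreg(X, y, epochs=500, lr=0.5):
    """Tiny logistic regression trained by stochastic gradient descent."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        for x, target in zip(X, y):
            err = predict_one(w, x) - target
            for d, xd in enumerate(x):
                w[d] -= lr * err * xd
            w[-1] -= lr * err
    return w

def stacked_scores(features, topic_vectors, labels, k=1):
    """Return (base, stacked) controversy scores for every page."""
    # Stage 1: score each page in isolation from its own features.
    base_w = train_logreg(features, labels)
    base = [predict_one(base_w, x) for x in features]
    # Build the neighbor graph from similarity, not from existing links.
    neighbors = top_k_neighbors(topic_vectors, k)
    # Stage 2: append the mean base score of the k nearest neighbors.
    stacked_X = [
        x + [sum(base[j] for j in nb) / len(nb)]
        for x, nb in zip(features, neighbors)
    ]
    stacked_w = train_logreg(stacked_X, labels)
    return base, [predict_one(stacked_w, x) for x in stacked_X]

# Invented toy data: one hypothetical intrinsic feature per page,
# a 2-d topic vector per page, and a binary controversy label.
features = [[0.9], [0.4], [0.1], [0.5]]
topic_vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labels = [1, 1, 0, 0]

base_p, stacked_p = stacked_scores(features, topic_vectors, labels, k=1)
for i, (b, s) in enumerate(zip(base_p, stacked_p)):
    print(f"page {i}: base={b:.2f} stacked={s:.2f} label={labels[i]}")
```

In this toy setup pages 1 and 3 have ambiguous intrinsic features, so any gain from the second stage comes from their neighbors' scores. Keeping k small mirrors the abstract's remark that the effect is most pronounced with a small set of neighbors.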


Published in
SIGIR '16: Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval
July 2016
1296 pages
ISBN: 9781450340694
DOI: 10.1145/2911451
General Chairs: Raffaele Perego (ISTI-CNR, Italy), Fabrizio Sebastiani (Qatar Computing Research Institute, HBKU, Qatar)
Program Chairs: Javed Aslam (Northeastern University, US), Ian Ruthven (University of Strathclyde, UK), Justin Zobel (University of Melbourne, Australia)
Copyright © 2016 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
collective classification
controversy detection
semi-structured data
similarity
stacked classification
Qualifiers
- short-paper
Acceptance Rates
SIGIR '16 Paper Acceptance Rate: 62 of 341 submissions, 18%
Overall Acceptance Rate: 792 of 3,983 submissions, 20%