ABSTRACT
Concerns over personalization in information retrieval (IR) have sparked interest in the detection and analysis of controversial topics. Accurate detection would enable many beneficial applications, such as alerting search users to controversy. Wikipedia's broad coverage and rich metadata offer a valuable resource for this problem. We hypothesize that intensities of controversy among related pages are not independent; thus, we propose a stacked model that exploits the dependencies among related pages. Our approach improves classification of controversial web pages compared to a model that examines each page in isolation, demonstrating that controversial topics exhibit homophily. Constructing a subnetwork for collective classification from notions of similarity, rather than using the default network present in the relational data, leads to improved classification, with the effects most pronounced when a small set of neighbors is used; the technique has wider applications for semi-structured datasets.
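The stacked, similarity-based idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The page names, feature vectors, base scores, the cosine similarity measure, the neighbor count k, and the fixed mixing weight are all assumptions chosen for illustration. In particular, a real stacked model would learn its second stage from data, whereas this sketch uses a hand-set weighted average as a stand-in.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def top_k_neighbors(vectors, k):
    """Build a similarity subnetwork: each page keeps its k most similar pages."""
    neighbors = {}
    for page, vec in vectors.items():
        scored = [(cosine(vec, other), name)
                  for name, other in vectors.items() if name != page]
        scored.sort(reverse=True)
        neighbors[page] = [name for _, name in scored[:k]]
    return neighbors

def stacked_scores(base_scores, neighbors, weight=0.5):
    """Second stage: mix each page's own base score with its neighbors' mean.

    The fixed `weight` is an illustrative stand-in for a learned second-stage model.
    """
    stacked = {}
    for page, own in base_scores.items():
        nbrs = neighbors.get(page, [])
        nbr_mean = sum(base_scores[n] for n in nbrs) / len(nbrs) if nbrs else own
        stacked[page] = (1 - weight) * own + weight * nbr_mean
    return stacked

# Hypothetical toy data: two clusters of pages with made-up feature vectors.
vectors = {
    "a1": [1.0, 0.9, 0.0],
    "a2": [0.9, 1.0, 0.1],
    "a3": [0.8, 0.8, 0.0],
    "b1": [0.0, 0.1, 1.0],
    "b2": [0.1, 0.0, 0.9],
}

# Hypothetical per-page controversy scores from a classifier that sees each
# page in isolation; "a3" is scored low despite resembling "a1" and "a2".
base_scores = {"a1": 0.9, "a2": 0.8, "a3": 0.3, "b1": 0.1, "b2": 0.2}

neighbors = top_k_neighbors(vectors, k=2)
stacked = stacked_scores(base_scores, neighbors, weight=0.5)

for page in sorted(stacked):
    print(page, neighbors[page], round(stacked[page], 3))
```

In this toy setup, the page "a3" falls below a 0.5 threshold when scored in isolation but rises above it once the scores of its two most similar pages are taken into account, which is the kind of homophily effect the abstract refers to.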