The double power law in human collaboration behavior: The case of Wikipedia

https://doi.org/10.1016/j.physa.2016.05.010

Highlights

  • We study inter-event time distribution of editing behavior on Wikipedia.

  • We observe a double power law distribution for the inter-editing behavior at the population level.

  • We observe a single power law distribution at the individual level.

  • We reveal that the synchronized editing behavior among users plays a key role.

Abstract

We study human behavior in terms of the inter-event time distribution of revision behavior on Wikipedia, an online collaborative encyclopedia. We observe a double power law distribution for the inter-editing behavior at the population level and a single power law distribution at the individual level. Although interactions between users are indirect or moderate on Wikipedia, we determine that the synchronized editing behavior among users plays a key role in determining the slope of the tail of the double power law distribution.

Introduction

The current availability of enormous amounts of digital information on human activities facilitates the analysis of temporal activity patterns of various human dynamics. Bursty behavior, which leads to a heavy-tailed inter-event or response time distribution, has been observed in a wide range of human dynamics; these include communication patterns in mail [1], [2], [3], [4], [5], [6], short message correspondence [7], mobile phone records [8], [9], [10], web surfing [5], [11], [12], [13], library loans [5], printing requests [14], networked games [15], online chatting [16], and file downloads [17].

To the best of our knowledge, there are three approaches to studying bursty behavior. The first approach investigates empirical patterns in terms of inter-event (or response) time distributions [1], [3], [4], [5], [7]. Although there has been a longstanding controversy regarding the power law behavior of the inter-event time distribution, various universality classes of the power law distribution have been reported for different temporal human behaviors [5]. Moreover, empirical evidence has revealed a bimodal distribution that is neither purely Poissonian nor purely power law [7]. The second approach seeks the origins of bursty behavior. A queueing model for a to-do list with a highest-priority-first protocol [3], [5] is the best-known model that explains heavy tails in inter-event time distributions. Periodic cycles of human activity also provide a good alternative explanation for the heavy tails [6], [18]. The last approach proposes a new definition or rescaling method for the time interval in order to uncover hidden scaling patterns and universality [19], [20], [21]. For example, underlying similarities between letters sent by surface mail and e-mail were revealed by rescaling time intervals [19]. Furthermore, a new timing method that eliminates activity heterogeneity in temporal human behavior was recently introduced [20], [21].

Wikipedia is an online collaborative encyclopedia that has grown exponentially in terms of its number of users and information since 2001 [22]. It is considered to be a success story of low-cost collaborative knowledge systems. Anyone with access to this online encyclopedia can edit almost all of its articles. Wikipedia also provides a revision history for its articles that includes when an article was revised and who revised it. Therefore, the revision history is one of the best tools for determining temporal activity patterns of collaborative human behavior on the Internet.

In this work, we investigate the distribution of inter-editing time intervals between two successive edits of Wikipedia articles at both the individual and population levels. Interestingly, at the individual level, a single power law distribution with an exponent of α1 is observed. However, at the population level, a double power law distribution is observed. The latter distribution exhibits two exponents, α1 for shorter time intervals and α2 for longer time intervals. Previous studies on the distribution of inter-event times between two consecutive events for e-mail, mobile phones, and short message correspondence [6], [7], [8] provide empirical results for the temporal activity pattern of a single person when there is explicit correspondence with other individuals. In contrast to direct communication, editors of Wikipedia articles interact indirectly with other individuals; this interaction is implicitly mediated by the articles that are continuously revised by Wikipedia users. Our study reveals new evidence and insight on the temporal patterns of collective human behavior when there are indirect or moderate interactions between users.
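As a sketch of the two regimes described above, the distributions can be written as follows. The labels P_ind and P_pop and the crossover time τ_c separating shorter from longer intervals are symbols introduced here for illustration only; they are not defined in this excerpt, and normalization constants are omitted.

```latex
% Individual level: single power law with exponent \alpha_1
P_{\mathrm{ind}}(\tau) \propto \tau^{-\alpha_1}

% Population level: double power law with an assumed crossover time \tau_c
P_{\mathrm{pop}}(\tau) \propto
\begin{cases}
  (\tau/\tau_c)^{-\alpha_1}, & \tau < \tau_c, \\
  (\tau/\tau_c)^{-\alpha_2}, & \tau > \tau_c.
\end{cases}
```

Writing both branches in terms of τ/τ_c keeps the two regimes continuous at the assumed crossover.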

Methods

Wikipedia’s editors select featured articles, which are considered the best articles on Wikipedia; as of March 26, 2014, there were 4200 featured articles out of a total of 4,479,603 articles on the English Wikipedia. In this study, we considered only featured articles. Each article displays heterogeneous statistical properties in terms of the number of users and editing frequency according to its popularity [23]; however, the time of the first editing event differs greatly from

Results and discussion

Patterns of inter-editing time intervals at the individual level. To observe patterns in the inter-editing time interval, τ, at the individual user level, we extracted a single user’s editing history for a particular article and measured the inter-editing time interval of that user. We collected the inter-editing time intervals of all single users who participated in collective editing behavior for the same article. Using these time intervals, we obtained a distribution of inter-editing time
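The extraction of inter-editing time intervals described above can be sketched as follows. This is a minimal illustration, not the procedure used in the study: the toy revision history, the user names, and the function names are all hypothetical, and timestamps are plain seconds.

```python
from collections import defaultdict

# Hypothetical toy revision history for one article: (timestamp in seconds, user).
revisions = [
    (0, "alice"),
    (50, "bob"),
    (100, "alice"),
    (400, "bob"),
    (1000, "alice"),
]

def population_intervals(revs):
    """Intervals between successive edits of the article, regardless of editor."""
    times = sorted(t for t, _ in revs)
    return [b - a for a, b in zip(times, times[1:])]

def individual_intervals(revs):
    """Intervals between successive edits by the same user, pooled over all users."""
    by_user = defaultdict(list)
    for t, user in revs:
        by_user[user].append(t)
    pooled = []
    for times in by_user.values():
        times.sort()
        pooled.extend(b - a for a, b in zip(times, times[1:]))
    return pooled

pop = population_intervals(revisions)
ind = individual_intervals(revisions)
print("population level:", pop)
print("individual level:", sorted(ind))
```

In this toy history, the 900-second gap between two edits by one user is interrupted at the article level by another user's edit, so the longest population-level interval is shorter than the longest individual-level interval.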

Synchronized editing activity

In this section, we consider the synchronous editing activity of the four most active users of the article on “Global warming”. This article was selected because we believe it to be the most representative example of synchronous editing activity between users.

To investigate the interactions between users, we estimated the editing rate of each user. The editing rate is defined as the number of edits executed by a user in a certain period. The editing rate was measured daily for
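The editing rate defined above, counted per day, can be sketched as follows. This is an illustration under stated assumptions: the edit log, the user names, and the choice of UTC calendar days as the counting window are hypothetical and are not taken from the study.

```python
from collections import Counter
from datetime import datetime, timezone

# Hypothetical edit log for one article: (user, ISO 8601 timestamp).
edits = [
    ("alice", "2014-03-01T09:15:00+00:00"),
    ("alice", "2014-03-01T21:40:00+00:00"),
    ("bob", "2014-03-01T10:05:00+00:00"),
    ("alice", "2014-03-02T08:00:00+00:00"),
    ("bob", "2014-03-03T12:30:00+00:00"),
]

def daily_editing_rate(edit_log):
    """Count the edits made by each user on each calendar day (UTC, an assumption)."""
    counts = Counter()
    for user, stamp in edit_log:
        day = datetime.fromisoformat(stamp).astimezone(timezone.utc).date()
        counts[(user, day.isoformat())] += 1
    return counts

rates = daily_editing_rate(edits)
for (user, day), n in sorted(rates.items()):
    print(user, day, n)
```

Comparing the resulting daily series of different users is one simple way to look for days on which their activity rises together.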

Conclusion

In this study, we examined the inter-editing behavior of Wikipedia, an online collaborative encyclopedia. Our findings revealed a double power law distribution for the inter-editing behavior at the population level and a single power law distribution at the individual level. This result suggested that long individual editing intervals are likely split into shorter editing intervals by other users’ editing behaviors. This mechanism of reducing the probability density of long inter-editing times

Acknowledgments

This research was supported by the National Institute for Mathematical Sciences (NIMS), funded by the Ministry of Science, ICT and Future Planning (B21501). This work was also supported by the Mid-career Researcher Program through a National Research Foundation of Korea (NRF) grant funded by the Ministry of Science, ICT and Future Planning (2013R1A2A2A04017095) and by the Basic Science Research Program through the NRF, funded by the Ministry of Science, ICT and Future Planning (2010-0021987).

References (26)

  • A. Johansen, Physica A (2004)

  • U. Harder et al., Physica A (2006)

  • A. Johansen et al., Physica A (2000)

  • J.-P. Eckmann et al., Proc. Natl. Acad. Sci. USA (2004)

  • A.-L. Barabási, Nature (2005)

  • J.G. Oliveira et al., Nature (2005)

  • A. Vázquez et al., Phys. Rev. E (2006)

  • R.D. Malmgren et al., Proc. Natl. Acad. Sci. USA (2008)

  • Y. Wu et al., Proc. Natl. Acad. Sci. USA (2010)

  • J. Candia et al., J. Phys. A (2008)

  • M. Karsai et al., Phys. Rev. E (2011)

  • H.-H. Jo et al., PLoS One (2011)

  • Z. Dezsö et al., Phys. Rev. E (2006)