| [1] |
M. E. J. Newman and M. Girvan.
Finding and evaluating community structure in networks.
Phys. Rev. E, 69(2), 2004.
[ bib |
DOI ]
We propose and study a set of algorithms for discovering community structure in networks—natural divisions of network nodes into densely connected subgroups. Our algorithms all share two definitive features: first, they involve iterative removal of edges from the network to split it into communities, the edges removed being identified using any one of a number of possible ‘‘betweenness’’ measures, and second, these measures are, crucially, recalculated after each removal. We also propose a measure for the strength of the community structure found by our algorithms, which gives us an objective metric for choosing the number of communities into which a network should be divided. We demonstrate that our algorithms are highly effective at discovering community structure in both computer-generated and real-world network data, and show how they can be used to shed light on the sometimes dauntingly complex structure of networked systems.
|
| [2] |
Jon M. Kleinberg.
Authoritative sources in a hyperlinked environment.
In Journal of the ACM, volume 46, 1999.
[ bib |
DOI ]
The network structure of a hyperlinked environment can be a rich source of information about the content of the environment, provided we have effective means for understanding it. We develop a set of algorithmic tools for extracting information from the link structures of such environments, and report on experiments that demonstrate their effectiveness in a variety of contexts on the World Wide Web. The central issue we address within our framework is the distillation of broad search topics, through the discovery of “authoritative” information sources on such topics. We propose and test an algorithmic formulation of the notion of authority, based on the relationship between a set of relevant authoritative pages and the set of “hub pages” that join them together in the link structure. Our formulation has connections to the eigenvectors of certain matrices associated with the link graph; these connections in turn motivate additional heuristics for link-based analysis.
|
| [3] |
Vincent D. Blondel, Jean-Loup Guillaume, Renaud Lambiotte, and Etienne
Lefebvre.
Fast unfolding of community hierarchies in large networks.
Preprint, 2008.
[ bib ]
Social, technological and information systems can often be described in terms of complex networks that have a topology of interconnected nodes that combines organization and randomness. The typical size of large networks such as social network services, mobile phone networks or the web now counts in millions when not billions of nodes and these scales demand new methods to retrieve comprehensive information from their structure. A promising approach consists in decomposing the networks into sub-units or communities, which are sets of highly connected nodes. The identification of these communities is of crucial importance as they may help to uncover a-priori unknown functional modules such as topics in information networks or cyber-communities in social networks. Moreover, the resulting meta-network, whose nodes are the communities, may then be used to visualize the original network structure. Here we propose a simple community detection method that reveals the hierarchical community structure of networks and that outperforms all other known community detection methods. We use our method to identify language communities and analyze community interactions in a Belgian mobile phone network of 2.6 million customers and we apply it to a web network of 118 million nodes and more than one billion links.
|
| [4] |
David M. Pennock, Gary W. Flake, Steve Lawrence, Eric J. Glover, and C. Lee
Giles.
Winners don't take all: Characterizing the competition for links on
the web.
Proceedings of the National Academy of Sciences of the United
States, 99(8), 2002.
[ bib |
DOI ]
As a whole, the World Wide Web displays a striking “rich get richer” behavior, with a relatively small number of sites receiving a disproportionately large share of hyperlink references and traffic. However, hidden in this skewed global distribution, we discover a qualitatively different and considerably less biased link distribution among subcategories of pages—for example, among all university homepages or all newspaper homepages. Although the connectivity distribution over the entire web is close to a pure power law, we find that the distribution within specific categories is typically unimodal on a log scale, with the location of the mode, and thus the extent of the rich get richer phenomenon, varying across different categories. Similar distributions occur in many other naturally occurring networks, including research paper citations, movie actor collaborations, and United States power grid connections. A simple generative model, incorporating a mixture of preferential and uniform attachment, quantifies the degree to which the rich nodes grow richer, and how new (and poorly connected) nodes can compete. The model accurately accounts for the true connectivity distributions of category-specific web pages, the web as a whole, and other social networks.
|
| [5] |
M. E. J. Newman.
Fast algorithm for detecting community structure in networks.
Physical Review E, 69, 2004.
[ bib ]
It has been found that many networks display community structure - groups of vertices within which connections are dense but between which they are sparser - and highly sensitive computer algorithms have in recent years been developed for detecting such structure. These algorithms however are computationally demanding, which limits their application to small networks. Here we describe a new algorithm which gives excellent results when tested on both computer-generated and real-world networks and is much faster, typically thousands of times faster than previous algorithms. We give several example applications, including one to a collaboration network of more than 50000 physicists.
|
| [6] | David Gibson, Jon Kleinberg, and Prabhakar Raghavan. Inferring web communities from link topology. In HYPERTEXT '98: Proceedings of the ninth ACM conference on Hypertext and hypermedia : links, objects, time and space-structure in hypermedia systems, pages 225-234, New York, NY, USA, 1998. ACM. [ bib | DOI ] |
| [7] |
Santo Fortunato and Marc Barthelemy.
Resolution limit in community detection.
In Proceedings of the National Academy of Sciences of the United
States, 2006.
[ bib |
DOI ]
Detecting community structure is fundamental for uncovering the links between structure and function in complex networks and for practical applications in many disciplines such as biology and sociology. A popular method now widely used relies on the optimization of a quantity called modularity, which is a quality index for a partition of a network into communities. We find that modularity optimization may fail to identify modules smaller than a scale which depends on the total size of the network and on the degree of interconnectedness of the modules, even in cases where modules are unambiguously defined. This finding is confirmed through several examples, both in artificial and in real social, biological, and technological networks, where we show that modularity optimization indeed does not resolve a large number of modules. A check of the modules obtained through modularity optimization is thus necessary, and we provide here key elements for the assessment of the reliability of this community detection method.
|
| [8] | Ravi Kumar, Prabhakar Raghavan, Sridhar Rajagopalan, and Andrew Tomkins. Trawling the web for emerging cyber-communities. In WWW '99: Proceedings of the eighth international conference on World Wide Web, pages 1481-1493, New York, NY, USA, 1999. Elsevier North-Holland, Inc. [ bib | DOI ] |
| [9] |
M. E. J. Newman.
Modularity and community structure in networks.
In PNAS, 2006.
[ bib |
DOI ]
Many networks of interest in the sciences, including social networks, computer networks, and metabolic and regulatory networks, are found to divide naturally into communities or modules. The problem of detecting and characterizing this community structure is one of the outstanding issues in the study of networked systems. One highly effective approach is the optimization of the quality function known as “modularity” over the possible divisions of a network. Here I show that the modularity can be expressed in terms of the eigenvectors of a characteristic matrix for the network, which I call the modularity matrix, and that this expression leads to a spectral algorithm for community detection that returns results of demonstrably higher quality than competing methods in shorter running times. I illustrate the method with applications to several published network data sets.
|
| [10] | Jon Kleinberg and Steve Lawrence. The structure of the web. Science, 294:1849-1850, 2001. [ bib ] |
| [11] | Soume Chakrabarti, Byron E. Dom, David Gibson, Jon Kleinberg, Ravi Kumar, Prabhakar Raghavan, Sridhar Rajagopalan, and Andrew Tomkins. Mining the link structure of the world wide web. 1999. [ bib ] |
| [12] | Gregoire Allaire and Sidi Mahmoud Kaber. Numerical Linear Algebra. Springer, 2008. [ bib ] |
| [13] |
Pascal Pons and Matthieu Latapy.
Computing communities in large networks using random walks.
Lecture notes in computer science, 2005.
[ bib |
DOI ]
Dense subgraphs of sparse graphs (communities), which appear in most real-world complex networks, play an important role in many contexts. Computing them however is generally expensive. We propose here a measure of similarities between vertices based on random walks which has several important advantages: it captures well the community structure in a network, it can be computed efficiently, it works at various scales, and it can be used in an agglomerative algorithm to compute efficiently the community structure of a network. We propose such an algorithm which runs in time O(mn2) and space O(n2) in the worst case, and in time O(n2log n) and space O(n2) in most real-world cases (n and m are respectively the number of vertices and edges in the input graph).
|
| [14] | Feng Luo, James Z. Wang, and Eric Promislow. Exploring local community structures in large networks. In WI '06: Proceedings of the 2006 IEEE/WIC/ACM International Conference on Web Intelligence, pages 233-239, Washington, DC, USA, 2006. IEEE Computer Society. [ bib | DOI ] |
| [15] | Ken Wakita and Toshiyuki Tsurumi. Finding community structure in mega-scale social networks: [extended abstract]. In WWW '07: Proceedings of the 16th international conference on World Wide Web, pages 1275-1276, New York, NY, USA, 2007. ACM. [ bib | DOI ] |
| [16] |
Gergely Palla, Illes J. Farkas, Peter Pollner, Imre Derenyi, and Tamas Vicsek.
Directed network modules.
New journal of physics, 9(6):186, 2007.
[ bib |
DOI ]
A search technique locating network modules, i.e. internally densely connected groups of nodes in directed networks is introduced by extending the clique percolation method originally proposed for undirected networks. After giving a suitable definition for directed modules we investigate their percolation transition in the Erdos Rényi graph both analytically and numerically. We also analyse four real-world directed networks, including Google's own web-pages, an email network, a word association graph and the transcriptional regulatory network of the yeast Saccharomyces cerevisiae. The obtained directed modules are validated by additional information available for the nodes. We find that directed modules of real-world graphs inherently overlap and the investigated networks can be classified into two major groups in terms of the overlaps between the modules. Accordingly, in the word-association network and Google's web-pages, overlaps are likely to contain in-hubs, whereas the modules in the email and transcriptional regulatory network tend to overlap via out-hubs.
|
| [17] |
Aaron Clauset, M. E. J. Newman, and Christopher Moore.
Finding community structure in very large networks.
Physical Review E., 70(6), 2004.
[ bib ]
The discovery and analysis of community structure in networks is a topic of considerable recent interest within the physics community, but most methods proposed so far are unsuitable for very large networks because of their computational cost. Here we present a hierarchical agglomeration algorithm for detecting community structure which is faster than many competing algorithms: its running time on a network with n vertices and m edges is O (md log n) where d is the depth of the dendrogram describing the community structure. Many real-world networks are sparse and hierarchical, with m approximately n and d approximately log n, in which case our algorithm runs in essentially linear time, O (n log(2) n). As an example of the application of this algorithm we use it to analyze a network of items for sale on the web site of a large on-line retailer, items in the network being linked if they are frequently purchased by the same buyer. The network has more than 400 000 vertices and 2 x 10(6) edges. We show that our algorithm can extract meaningful communities from this network, revealing large-scale patterns present in the purchasing habits of customers.
|
This file was generated by bibtex2html 1.91.