Why Mapping the Internet is Hard (2004)

Abstract
Despite great effort spent measuring topological features of large networks like the Internet, it was recently argued that sampling based on taking paths through the network (e.g., traceroutes) introduces a fundamental bias in the observed degree distribution. We examine this bias analytically and experimentally. For classic random graphs with mean degree c, we show analytically that traceroute sampling gives an observed degree distribution P(k) ~ 1/k for k < c, even though the underlying degree distribution is Poisson. For graphs whose degree distributions have power-law tails P(k) ~ k^-alpha, the accuracy of traceroute sampling is highly sensitive to the population of low-degree vertices. In particular, when the graph has a large excess (i.e., many more edges than vertices), traceroute sampling can significantly misestimate alpha. Comment: supersedes cond-mat/0312674
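
The sampling bias described in the abstract can be explored with a small simulation. The sketch below is illustrative only and is not taken from the paper: it assumes traceroute sampling can be modeled as a breadth-first search tree grown from a single source, and the function names `random_graph` and `bfs_tree_degrees`, the graph size, and the seed are hypothetical choices. It builds a classic random graph with mean degree c, records only the edges on the search tree, and tabulates the degrees a single-source observer would see next to the true degrees.

```python
import random
from collections import Counter, deque


def random_graph(n, c, rng):
    """Classic random graph G(n, p) with p = c / n, as adjacency lists."""
    p = c / n
    adj = [[] for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].append(v)
                adj[v].append(u)
    return adj


def bfs_tree_degrees(adj, source):
    """Degrees of reached vertices, counting only BFS-tree edges.

    Assumption: one shortest path per destination is observed, so the
    sample is the BFS tree rooted at `source`.
    """
    parent = {source: None}
    observed = Counter({source: 0})
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                # The tree edge (u, v) is the only edge the sample sees here.
                observed[u] += 1
                observed[v] += 1
                queue.append(v)
    return observed, parent


if __name__ == "__main__":
    rng = random.Random(0)  # hypothetical seed, for repeatability only
    adj = random_graph(1500, 10, rng)
    observed, _ = bfs_tree_degrees(adj, 0)
    true_hist = Counter(len(adj[v]) for v in observed)
    seen_hist = Counter(observed.values())
    for k in sorted(set(true_hist) | set(seen_hist)):
        print(k, true_hist.get(k, 0), seen_hist.get(k, 0))
```

Each printed row gives a degree k, the number of reached vertices whose true degree is k, and the number whose observed degree is k. Comparing the two columns shows how a tree-shaped sample under-reports degrees; whether the observed column follows the 1/k form stated in the abstract depends on n and c and should be checked rather than assumed from one run.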

Publication details
Download http://arxiv.org/abs/cond-mat/0407339
Repository arXiv (United States)
Keywords Condensed Matter - Disordered Systems and Neural Networks
Type text