Faculty Recruiting Make a Gift

Mining Graphs at Scale: Ego-Networks, Clusters and Privacy

19 Sep
Tuesday, 09/19/2017 4:00pm to 5:00pm
Computer Science Building, Rooms 150/151
Data Science Tea

Graph analysis is a fundamental primitive in several applications of data mining and machine learning. We present recent advances in this area based on the analysis of local structures in graphs known as ego-nets (i.e. the subgraph induced by the neighborhood of each node). In particular, we present ego-splitting, a novel framework for detecting clusters in complex networks which leverages ego-network analysis to de-couple overlapping clusters. Ego-splitting is a highly scalable and flexible framework, with provable theoretical guarantees that allows to reduce the complex overlapping clustering problem to a simpler and more amenable non-overlapping (partitioning) problem. This allows us to scale overlapping community detection to graphs with tens of billions of edges outperforming previous solutions based on ego-nets analysis.

We will also show how similar ego-net based techniques allow improvements in the important task of graph ranking with applications to friend suggestions. Finally, time permitting, we will briefly discuss some interesting issues in social networks analysis raised by user privacy concerns and some potential algorithmic approaches on how to address them.

Based on joint work with Silvio Lattanzi, Renato Paes Leme, KDD'17 http://epasto.org/papers/kdd2017.pdf and other papers
http://epasto.org/papers/vldb2016-egonet.pdf VLDB'16,
http://epasto.org/papers/kdd2015.pdf KDD'15 (best paper award)

Alessandro Epasto is a research scientist at Google NYC in the graph mining team (part of the Google NYC Algorithms and Optimization team) lead by Vahab Mirrokni. Before joining Google in 2016 he was a postdoc at Brown University advised by prof. Eli Upfal. He received a Ph.D. in computer science from Sapienza University of Rome where he was advised by prof. Alessandro Panconesi and supported by the Google European Doctoral Fellowship. His research interests include algorithmic problems in large-scale data mining.