The analysis of social and information networks has attracted wide attention in computer science, physics, social science, biology, and other research communities, leading to exciting discoveries and successful applications. Most existing network research assumes that networks are homogeneous, with nodes that are objects of a single entity type (e.g., person) and links that are relationships of a single relation type (e.g., friendship). In reality, however, objects of different types interact with each other via relationships of different types, forming heterogeneous, semi-structured information networks. Such heterogeneous networks are ubiquitous, representing real-world systems that range from social, scientific, engineering, and medical systems to e-commerce systems.
We investigate the problem of mining heterogeneous information networks by leveraging the semantic meanings of the types of objects and links in the network, and we propose principles and technologies that exploit these rich semantics to solve large-scale, real-world problems. In particular, we show how different types of relationships carry different semantics and strengths in determining the similarity or influence between linked objects. These studies lay the foundation for in-depth analysis of heterogeneous information networks, including similarity search, ranking, clustering, classification, prediction, and outlier detection. Our experiments on large-scale networks such as the DBLP bibliographic network, the Flickr image network, and the Yelp review network demonstrate the effectiveness of our models and the efficiency of our algorithms, as well as the potential of our methodologies to be applied to a broader range of applications.
Yizhou Sun is a fifth-year Ph.D. candidate in the Department of Computer Science, University of Illinois at Urbana-Champaign. Her principal research interest is in mining information and social networks, and more generally in data mining, database systems, statistics, machine learning, information retrieval, and network science, with a focus on modeling novel problems and proposing scalable algorithms for large-scale, real-world applications. Yizhou has over 30 publications, including book chapters, journal articles, and papers at major conferences. Tutorials based on her thesis work on mining heterogeneous information networks have been given at several premier conferences, such as SIGMOD’10, SIGKDD’10, and ICDE’12.