Link Prediction Revisited: From Evaluation Pitfalls to Language Model Synergies

Artificial intelligence (AI) and machine learning (ML) have significantly impacted many aspects of daily life, and many of their methods operate on graph-structured data. Graphs, which model relationships between entities, are widely used to represent real-world data, including social networks, transportation systems, chemical molecules, power and communication networks, and user-item interactions in recommendation systems. A fundamental task in graph data analysis is link prediction, which predicts connections between entities and is essential for understanding their relationships. For example, link prediction allows us to determine whether two individuals are friends in a social network or whether a user will purchase an item in a recommendation system. Consequently, link prediction is crucial for advancing graph-based ML in real-world applications. However, several challenges impede progress in this area. Specifically, 1) existing evaluation settings are neither unified nor rigorous, leading to inconsistent and sometimes suboptimal results, and 2) graph nodes are frequently associated with increasingly abundant textual attributes that contain rich semantic information; language models excel at capturing the semantics of such text, yet effectively integrating textual information with graph data to enhance real-world applications remains under-explored. In light of these challenges, this dissertation seeks to advance link prediction from two main perspectives: 1) identifying evaluation pitfalls across various graph types to inspire more advanced methods for link prediction, and 2) leveraging language models in synergy with link prediction techniques to enhance a range of real-world applications. From the first perspective, we investigate evaluation pitfalls in both uni-relational and multi-relational graphs.
Based on these findings, we either propose methods to mitigate the identified issues or develop more effective and efficient models. From the second perspective, we explore the use of language models to improve link prediction in recommendation systems and, conversely, apply link prediction techniques to enhance language models in query understanding tasks. By pursuing these two perspectives, this dissertation not only contributes to the development of more robust link prediction methods but also facilitates their application in real-world scenarios.
