autonlab.org

Research Thrust

Social Network Analysis/Link Analysis/Group Detection

Social Network Analysis/Link Analysis/Group Detection seeks to discover interesting relationships and patterns among people or other entities, for example:

  • Who communicates with whom?  And who appears to avoid communicating with whom?
  • Are there cliques of people who mostly communicate among themselves and rarely with others, or is communication more evenly distributed?
  • Are there "stars" who are linked with a very large number of people, and/or isolated people who are only linked with one or two others?
  • Might there be aliases?  That is, if we see two people with essentially the same link patterns, but who are never linked with each other, might they in fact be the same person?
  • How do patterns of association among entities evolve over time?
  • Can we identify groups of entities, based on link data and/or demographic properties?  If we know that a communication took place, but we don't know the identity of one of the participants, can we infer who that entity was?
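As a rough illustration of the alias question above, the sketch below flags pairs of entities whose link patterns are nearly identical but who are never linked with each other. It is not the Auton Lab's MNOP algorithm. It is a plain Jaccard-similarity heuristic, and the toy `links` data, the function names, and the 0.8 threshold are all assumptions made up for this example.

```python
from itertools import combinations

# Hypothetical toy link data: each link is a set of entities observed together.
links = [
    {"alice", "bob"},
    {"alice", "carol"},
    {"alice", "dave"},
    {"al", "bob"},
    {"al", "carol"},
    {"al", "dave"},
    {"carol", "dave"},
    {"erin", "bob"},
]


def neighbors(links):
    """Map each entity to the set of entities it has been linked with."""
    nbrs = {}
    for link in links:
        for entity in link:
            nbrs.setdefault(entity, set()).update(link - {entity})
    return nbrs


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def alias_candidates(links, threshold=0.8):
    """Return (entity, entity, score) for pairs that are never linked
    with each other but share nearly the same neighbors."""
    nbrs = neighbors(links)
    candidates = []
    for a, b in combinations(sorted(nbrs), 2):
        if b in nbrs[a]:
            continue  # directly linked, so unlikely to be the same person
        score = jaccard(nbrs[a], nbrs[b])
        if score >= threshold:
            candidates.append((a, b, score))
    return candidates


if __name__ == "__main__":
    for a, b, score in alias_candidates(links):
        print(f"{a} / {b}: similarity {score:.2f}")
```

The same `neighbors` map also speaks to the "stars" question: the size of each entity's neighbor set is its degree, so unusually large or small sets point to stars or near-isolated entities. A pairwise scan like this is quadratic in the number of entities, so it is only meant for small examples.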

Auton Lab researchers have developed--and continue to develop--many algorithms and associated software packages for investigating these kinds of questions.  As usual at the Auton Lab, these technologies place great emphasis on efficient analysis of large datasets.

Software

AFDL - Activity From Demographics and Links
Bayes Net Learner - Learns Bayesian networks, as the name suggests
SBNS - Screen-based Bayes Net Structure search
GDA/k-groups - Group Detection Algorithm
MNOP - Many Names, One Person alias detection
XGDA - A fast group detection algorithm

Datasets

Alias Detection Dataset - input for the Many Names, One Person (MNOP) software
Link Datasets - for Link Detection, GDA, k-groups, cGraph, and Sparse Bayes Net search

More to come...

Papers

A Comparison of Statistical and Machine Learning Algorithms on the Task of Link Completion - Link data, consisting of a collection of subsets of entities, can be an important source of information for a variety of fields including the social sciences, biology, criminology, and business intelligence. However, these links may be incomplete, containing one or more unknown members. We consi...
Alias Detection in Link Data Sets - The problem of detecting aliases - multiple text string identifiers corresponding to the same entity - is increasingly important in the domains of biology, intelligence, marketing, and geoinformatics. This report investigates the extent to which probabilistic methods can help. Aliases arise from...
Alias Detection in Link Data Sets - The problem of detecting aliases - multiple text string identifiers corresponding to the same entity - is increasingly important in the domains of biology, intelligence, marketing, and geoinformatics. Aliases arise from entities who are trying to hide their identities, from a person with multipl...
cGraph: A Fast Graph-Based Method for Link Analysis and Queries - Many techniques in the social sciences and graph theory deal with the problem of examining and analyzing patterns found in the underlying structure and associations of a group of entities. However, much of this work assumes that this underlying structure is known or can easily be inferred from d...
Dynamic Social Network Analysis using Latent Space Models - This paper explores two aspects of social network modeling. First, we generalize a successful static model of relationships into a dynamic model that accounts for friendships drifting over time. Second, we show how to make it tractable to learn such models from data, even as the number of entiti...
Empirical Bayes Screening for Link Analysis - The domain of link analysis has recently re-ignited interest among researchers due to its applicability to new areas such as intelligence analysis (for example, identifying cliques of suspicious people), large scale social network analysis and genomics. The area of link analysis is not new and c...
Finding Underlying Connections: A Fast Graph-Based Method for Link Analysis and Collaboration Queries - Many techniques in the social sciences and graph theory deal with the problem of examining and analyzing patterns found in the underlying structure and associations of a group of entities. However, much of this work assumes that this underlying structure is known or can easily be inferred from d...
Learning Automated Product Recommendations Without Observable Features: An Initial Investigation - It is appealing to imagine software packages that provide personally tailored product recommendations to a consumer. One way to predict the rating of a particular product by a particular consumer is through inference from a database of previous ratings by many consumers of many products. Such a ...
Making Logistic Regression A Core Data Mining Tool: A Practical Investigation of Accuracy, Speed, and Simplicity - Binary classification is a core data mining task. For large datasets or real-time applications, desirable classifiers are accurate, fast, and automatic (i.e. no parameter tuning). Naive Bayes and decision trees are fast and parameter-free, but their accuracy is often below state-of-the-art. Line...
Stochastic Link and Group Detection - Link detection and analysis has long been important in the social sciences and in the government intelligence community. A significant effort is focused on the structural and functional analysis of "known" networks. Similarly, the detection of individual links is important but is usually done wi...
Tractable Group Detection on Large Link Data Sets - Discovering underlying structure from co-occurrence data is an important task in a variety of fields, including: insurance, intelligence, criminal investigation, epidemiology, human resources, and marketing. Previously Kubica et al. presented the group detection algorithm (GDA) - an algorithm f...

Copyright 2008, Carnegie Mellon University, Auton Lab. All Rights Reserved.