Query-Driven Discovery of Semantically Similar Substructures in Heterogeneous Networks (KDD'12)
Information Network Analysis
- Y. Sun, Y. Yu, and J. Han, “Ranking-Based Clustering of Heterogeneous Information Networks with Star Network Schema”, KDD’09
- Y. Sun, J. Han, P. Zhao, Z. Yin, H. Cheng, and T. Wu, “RankClus: Integrating Clustering with Ranking for Heterogeneous Information Network Analysis”, EDBT'09
- Y. Sun, T. Wu, H. Cheng, J. Han, X. Yin, and P. Zhao, “BibNetMiner: Mining Bibliographic Information Networks”, SIGMOD’08 (demo)
- X. Yin, J. Han, and P. S. Yu, “CrossClus: User-Guided Multi-Relational Clustering”, Data Mining and Knowledge Discovery, 16(1), 2007.
- X. Yin, J. Han, and P. S. Yu, “LinkClus: Efficient Clustering via Heterogeneous Semantic Links”, VLDB'06
OLAP and Mining of Multidimensional Text Databases
- C. X. Lin, B. Ding, J. Han, F. Zhu, and B. Zhao. “Text Cube: Computing IR Measures for Multidimensional Text Database Analysis”, ICDM’08
- D. Zhang, C. Zhai, and J. Han, "Topic Cube: Topic Modeling for OLAP on Multidimensional Text Databases", SDM’09 (Best of SDM’09)
- X. Yan, H. Cheng, J. Han, and P. S. Yu, “Mining Significant Graph Patterns by Scalable Leap Search”, SIGMOD'08.
- C. Chen, X. Yan, F. Zhu, J. Han, and P. S. Yu, “Graph OLAP: Towards Online Analytical Processing on Graphs”, ICDM’08
- C. Chen, C. X. Lin, X. Yan, and J. Han, “On Effective Presentation of Graph Patterns: A Structural Representative Approach”, CIKM’08
- C. Chen, X. Yan, P. S. Yu, J. Han, D. Zhang, and X. Gu, “Towards Graph Containment Search and Indexing”, VLDB'07
- X. Yan, F. Zhu, P. S. Yu, and J. Han, “Feature-based Substructure Similarity Search”, ACM Transactions on Database Systems (TODS), 31: 1418-1453, 2006
Mining Moving Objects, Trajectories, RFID, and Traffic Data
- X. Li, Z. Li, J. Han, and J.-G. Lee, “Temporal Outlier Detection in Vehicle Traffic Data”, ICDE’09
- J.-G. Lee, J. Han, X. Li, and H. Gonzalez, “TraClass: Trajectory Classification Using Hierarchical Region-Based and Trajectory-Based Clustering”, VLDB’08
- J.-G. Lee, J. Han, and X. Li, "Trajectory Outlier Detection: A Partition-and-Detect Framework", ICDE’08
- J.-G. Lee, J. Han, and K.-Y. Whang, “Trajectory Clustering: A Partition-and-Group Framework”, SIGMOD'07
- H. Gonzalez, J. Han, X. Li, M. Myslinska, and J. P. Sondag, “Adaptive Fastest Path Computation on a Road Network: A Traffic Mining Approach”, VLDB'07
- X. Li, J. Han, S. Kim, and H. Gonzalez, “ROAM: Rule- and Motif-Based Anomaly Detection in Massive Moving Object Data Sets”, SDM'07. (Best of SDM’07)
- H. Gonzalez, J. Han, X. Li, and D. Klabjan, “Warehousing and Analysis of Massive RFID Data Sets”, in Proc. 2006 Int. Conf. on Data Engineering (ICDE'06), Atlanta, Georgia, April 2006. (Best Student Paper Award)
Image and Video Mining
We investigate efficient image and video pattern mining, clustering, classification, and indexing methods. This work includes an image frequent spatial pattern mining algorithm, SpIBag (Spatial Item Bag Mining); an image clustering algorithm, SpaRClus (Spatial Relationship Pattern-Based Hierarchical Clustering), which is robust to shifting, scaling, and rotation transformations; and a multi-layer ring-based index structure for both r-Range search and k-NN search.
- X. Jin, S. Kim, J. Han, L. Cao, and Z. Yin, “GAD: General Activity Detection for Fast Clustering on Large Data”, SDM'09.
- R. Malik, S. Kim, X. Jin, C. Ramachandran, J. Han, I. Gupta, and K. Nahrstedt, "MLR-Index: An Index Structure for Fast and Scalable Similarity Search in High Dimensions", SSDBM'09.
- S. Kim, X. Jin, and J. Han, “SpaRClus: Spatial Relationship Pattern-Based Hierarchical Clustering”, SDM'08
Stream Data Mining
- L. Mendes, B. Ding, and J. Han, "Stream Sequential Pattern Mining with Precise Error Bounds", ICDM'08.
- J. Gao, W. Fan, and J. Han, “On Appropriate Assumptions to Mine Data Streams: Analysis and Practice”, ICDM'07
- J. Gao, W. Fan, J. Han, and P. S. Yu, “A General Framework for Mining Concept-Drifting Data Streams with Skewed Distributions”, SDM'07
Data Mining Applications
Sequential pattern mining: Motivated by long sequences in text data, biological data, software engineering, and sensor networks, we study mining repetitive gapped subsequences to capture the occurrences of sequential patterns repeating within each sequence of a large database and use them as features for classification or prediction.
Biological and medical data mining: We investigate medical classification problems, including gene prediction based on microarray data and cancer prediction based on medical images, and develop discriminative pattern-based methods that improve the accuracy of medical data classification and provide useful discriminative patterns to help medical experts with their decisions.
Software engineering and sensor network mining: We investigate statistical analysis and sequence/graph mining methods for software bug detection, failure indexing, troubleshooting and root-cause analysis in sensor networks and data streams.
Cyberphysical systems: A cyberphysical system consists of a large number of interacting physical and information components. For example, a patient-care system may link a patient monitoring system with a network of patients, their associated medical information, and an emergency handling system. We investigate data mining in cyberphysical networks, including real-time analysis of massive amounts of streaming data, reliable and trusted data analysis, and effective spatiotemporal data analysis.
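As a rough illustration of the sequential pattern mining item above, the following sketch counts how often a gapped pattern repeats within each sequence and uses those counts as feature vectors. It is a simplified stand-in, not the algorithm from the papers listed below: the function names `count_gapped_occurrences` and `repetition_features` are hypothetical, the candidate patterns are assumed to be given rather than mined, and occurrences are counted with a plain greedy left-to-right scan.

```python
from typing import List, Sequence


def count_gapped_occurrences(sequence: Sequence, pattern: Sequence) -> int:
    """Count disjoint occurrences of `pattern` in `sequence`, allowing gaps.

    Simplifying assumption: occurrences are found by a greedy left-to-right
    scan, so each occurrence starts only after the previous one has ended.
    """
    if not pattern:
        return 0
    count = 0
    next_index = 0  # position in `pattern` that must be matched next
    for item in sequence:
        if item == pattern[next_index]:
            next_index += 1
            if next_index == len(pattern):
                # A full occurrence has been matched; start looking for another.
                count += 1
                next_index = 0
    return count


def repetition_features(database: List[Sequence],
                        patterns: List[Sequence]) -> List[List[int]]:
    """Build one feature vector per sequence: repetition count per pattern."""
    return [
        [count_gapped_occurrences(seq, pat) for pat in patterns]
        for seq in database
    ]


if __name__ == "__main__":
    # Toy database of event sequences (hypothetical data).
    database = ["ABCABCA", "AABB"]
    patterns = ["AB", "CA"]
    for seq, features in zip(database, repetition_features(database, patterns)):
        print(seq, features)
```

The resulting vectors could then be passed to any standard classifier; how patterns are selected and how overlapping instances are handled is left to the cited work.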
- D. Lo, H. Cheng, J. Han, S. Khoo, and C. Sun, "Classification of Software Behaviors for Failure Detection: A Discriminative Pattern Mining Approach", KDD'09
- B. Ding, D. Lo, J. Han, and S.-C. Khoo, "Efficient Mining of Closed Repetitive Gapped Subsequences from a Sequence Database", ICDE'09
- M. M. H. Khan, T. Abdelzaher, J. Han, and H. Ahmadi, "Finding Symbolic Bug Patterns in Sensor Networks", DCOSS'09.
- M. M. H. Khan, H. Le, H. Ahmadi, T. Abdelzaher, and J. Han, "DustMiner: Troubleshooting Interactive Complexity Bugs in Sensor Networks", Sensys'08
- F. Zhu, X. Yan, J. Han, P. S. Yu, and H. Cheng, “Mining Colossal Frequent Patterns by Core Pattern Fusion”, ICDE'07. (Best Student Paper Award)