data mining
Data & Information System (DAIS) Research Lab
Department of Computer Science | University of Illinois at Urbana-Champaign
Data Mining Research Group, Spring 09
Preparing for the challenges in information access, retrieval, and management that lie ahead requires a coordinated and multi-faceted approach.
The Data Mining Research Group at the University of Illinois Department of Computer Science is proud of the successful research partnerships and research initiatives that are a part of our hallmark of excellence.
TABLE OF CONTENTS
About the Data Mining Research Group
Jiawei Han
Students, Alumni, and Visiting Scholars
Awards and Publications
Projects
Funding
The Data Mining Research Group in the Department of Computer Science, University of Illinois at Urbana-Champaign, conducts leading edge research in the areas of data mining, data warehousing, database systems, and Web-based information systems.
Work conducted by the group is pioneering new directions in the field, and is pushing the boundaries of data mining techniques. Their work aims to integrate and advance the knowledge produced in multiple disciplines, including database systems, statistics, machine learning, algorithms, information theory, spatial and multimedia databases, and Web technology, among others.
With more than 20 members, the group is characterized by their breadth and depth of excellence, and their integrated approach to complex problem chains. The group is associated with the Data and Information System Laboratory.
Its research projects include:
• information network analysis
• OLAP and mining of multidimensional text databases
• graph mining
• privacy and trust validation by data mining
• mining moving objects, trajectories, RFID, and traffic data
• image and video mining
• multidimensional promotion and ranking analysis
• transfer learning, dimensionality reduction, and pattern-based classification
• stream data mining
• data mining in biomedical, software engineering, and cyberphysical system applications
Jiawei Han
Professor Han is a world-recognized leader in the data mining field. His ground-breaking work includes pioneering techniques for frequent, sequential, and graph pattern mining; heterogeneous information network analysis; spatiotemporal and stream data mining; and text cube, ranking cube, and data cube computation. His contributions and discoveries have been characterized by an integrative approach, advancing knowledge produced in multiple disciplines.
Professor Han is one of the most cited authors in data mining, has written more than 400 papers for conferences and journals, has organized a number of international conferences, and is the Editor-in-Chief of ACM Transactions on Knowledge Discovery from Data.
Working with government funding agencies and industry partners, Professor Han has extensive experience in managing large-scale, complex projects that take a multi-disciplinary approach.
Awards:
• SIGKDD Innovations Award (2004)
• ACM Fellow (2004)
• IEEE CS Technical Achievement Award (2005)
• IEEE Fellow (2009)
students
Dustin Bortner
• Network mining
Deng Cai
• Machine learning, especially manifold learning and dimensionality reduction
• Information retrieval
Chen Chen
• Graph mining and related data management problems
Bolin Ding
• Pattern mining algorithms
• Theoretical aspects of data mining and database problems
Jing Gao
• Ensemble learning, transfer learning
• Data stream mining
• Anomaly detection
Chandrasekar Ramachandran
• Video/image mining
• Dimensionality reduction on sparse datasets
• Indexing and search
Sangkyum Kim
• Image/video mining
• High-dimensional indexing
Zhenhui Li
• Mining moving objects
• Spatiotemporal data mining
Cindy Xide Lin
• Graph mining
• Web mining
• Multidimensional analysis
Xin Jin
• Image/video mining and retrieval
Sebastian Seith
• Moving object and traffic mining
Yizhou Sun
• Link analysis and information network analysis
• Graph mining and Web mining
• Machine learning
LuAn Tang
• Spatial data mining
• Privacy-preserving data mining
• Data mining with biomedical applications
Tianyi Wu
• Ranking query processing
• Association analysis
• Information network analysis
Zhijun Yin
• Web mining
• Information retrieval
• Machine learning
Xiao Yu
• Anomaly detection
• Web mining
Feida Zhu
• Structural pattern mining
• Approximation and complexity analysis for data mining problems
Yintao Yu
• Information network and social network analysis
• Web mining
Bo Zhao
• Multidimensional text database systems
• Web mining, entity search and extraction
• Information network analysis
Peixiang Zhao
• Structural data mining
• Algorithms on massive data sets
visiting scholars
Min-Soo Kim
• Graph/network data mining
• Bioinformatics
• Indexing & query processing
• Information retrieval & search engines
Lu Liu
• Web video analysis and mining
• Topic modeling
• Social-network analysis
Recent visiting scholars
R. Alves (Portugal)
R. Angryk (Montana State U.)
F. Berzal (Spain)
Jianlin Feng (China)
Jae-Gil Lee (IBM Research)
Cuiping Li (China)
alumni
Ph.D.
Hong Cheng, Ph.D. 2008, Chinese University of Hong Kong
Hector Gonzalez, Ph.D. 2008, Google Research
Xiaolei Li, Ph.D. 2008, Microsoft
Chao Liu, Ph.D. 2007, Microsoft Research
Dong Xin, Ph.D. 2007, Microsoft Research
Xiaoxin Yin, Ph.D. 2007, Microsoft Research
Xifeng Yan, Ph.D. 2006, University of California at Santa Barbara
Hwanjo Yu, Ph.D. 2004, POSTECH University, Korea
Recent Masters and undergraduate alumni
Luiz Mendes
Jacob Lee
Margaret Myslinska
Ricardo Redder
John Paul Sondag
distinguished honors: Jiawei Han
IEEE Fellow (2009)
IEEE Computer Society Technical Achievement Award (2005)
ACM SIGKDD Innovations Award (2004)
ACM Fellow (2004)
IBM Faculty Awards (2002, 2003, 2004)
The Outstanding Contribution Award (2002, IEEE Computer Society, International Conference on Data Mining)
UIUC Teachers Ranked as Excellent (2002-2007)
distinguished honors: students
Microsoft Research Graduate Women’s Scholarship (2009): Cindy Xide Lin
ACM SIGKDD Dissertation Award (2008): Xiaoxin Yin
ACM SIGMOD Ph.D. Dissertation Runner-Up Award (2007): Xifeng Yan
IBM Scholarship (2007): Hong Cheng
Midwest Database Symp. Best Presentation Award (2007): Feida Zhu
Henry Ford II Award (2006): Deng Cai
conference awards
D. Zhang, C. Zhai, and J. Han, “Topic Cube: Topic Modeling for OLAP on Multidimensional Text Databases”, in Proc. 2009 SIAM Int. Conf. on Data Mining (SDM’09) (One of “Best of SDM’09”)
F. Zhu, X. Yan, J. Han, and P. S. Yu, “gPrune: A Constraint Pushing Framework for Graph Pattern Mining”, in Proc. 2007 Pacific-Asia Conf. on Knowledge Discovery and Data Mining (PAKDD’07) (Best Student Paper Award)
X. Li, J. Han, S. Kim, and H. Gonzalez, “ROAM: Rule- and Motif-Based Anomaly Detection in Massive Moving Object Data Sets”, in Proc. 2007 SIAM Int. Conf. on Data Mining (SDM’07) (One of “Best of SDM’07”)
F. Zhu, X. Yan, J. Han, P. S. Yu, and H. Cheng, “Mining Colossal Frequent Patterns by Core Pattern Fusion”, in Proc. 2007 Int. Conf. on Data Engineering (ICDE’07) (Best Student Paper Award)
Q. Mei, D. Xin, H. Cheng, J. Han, and C. Zhai, “Generating Semantic Annotations for Frequent Patterns with Context Analysis”, in Proc. 2006 ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining (KDD’06) (Best Student Paper Runner-Up Award)
H. Liu, J. Han, D. Xin, and Z. Shao, “Mining Interesting Patterns from Very High Dimensional Data: A Top-Down Row Enumeration Approach”, in Proc. 2006 SIAM Int. Conf. on Data Mining (SDM’06) (One of “Best of SDM’06”)
H. Gonzalez, J. Han, X. Li, and D. Klabjan, “Warehousing and Analysis of Massive RFID Data Sets”, in Proc. 2006 Int. Conf. on Data Engineering (ICDE’06) (Best Student Paper Award)
X. Yan, H. Cheng, J. Han, and D. Xin, “Summarizing Itemset Patterns: A Profile-Based Approach”, in Proc. 2005 Int. Conf. on Knowledge Discovery and Data Mining (KDD’05) (Best Student Paper Runner-Up Award)
conference tutorials
D. Cai, X. He, and J. Han, “A Geometric Perspective on Dimensionality Reduction”, SDM’09, Sparks, NV, April 2009
J. Pei, Y. Tao, and J. Han, “Preference Queries from OLAP and Data Mining Perspective”, ICDE’09, Shanghai, China, March 2009
J. Han, X. Yan, and P. S. Yu, “Scalable OLAP and Mining of Information Networks”, EDBT’09, St. Petersburg, Russia, March 2009
H. Cheng, J. Han, X. Yan, and P. S. Yu, “Integration of Classification and Pattern Mining: A Discriminative and Frequent Pattern-based Approach”, ICDM’08, Pisa, Italy, December 2008
J. Han, J.-G. Lee, H. Gonzalez, and X. Li, “Mining Massive RFID, Trajectory, and Traffic Data Sets”, ACM SIGKDD’08, Las Vegas, NV, August 2008
J. Han, X. Yin, and P. S. Yu, “Exploring the Power of Links in Data Mining”, ICDE’08, Cancun, Mexico, April 2008 (Also, ECML/PKDD’07, Warsaw, Poland, Sept. 2007)
C. Liu, T. Xie, and J. Han, “Mining for Software Reliability”, ICDM’07, Omaha, NE, Oct. 2007
J. Han, X. Yan, and P. S. Yu, “Mining and Searching Graphs and Structures”, KDD’06, Philadelphia, PA, August 2006 (Also, ICDE’06, Atlanta, GA, April 2006, and ICDM’05, Houston, TX, Nov. 2005)
project list
Information Network Analysis
OLAP and Mining of Multidimensional Text Databases
Graph Mining
Privacy and Trust Validation by Data Mining
Mining Moving Objects, Trajectories, RFID, and Traffic Data
Image and Video Mining
Multidimensional Promotion and Ranking Analysis
Transfer Learning, Dimensionality Reduction, and Pattern-Based Classification
Stream Data Mining
Data Mining Applications
information network analysis
description:
Information network analysis investigates effective discovery of patterns and knowledge from large-scale networks that consist of interconnected physical, technological, conceptual, and human/societal components. The major themes in our study include: (1) ranking-based clustering on different types of objects in heterogeneous information networks; (2) hierarchical network structure analysis for OLAP, multidimensional text database analysis, and ranking promotion; (3) query-based information network extraction and analysis; and (4) link-based veracity analysis for bibliographic networks and news information networks.
researchers: Yizhou Sun, Yintao Yu, Chen Chen, Cindy Xide Lin, Tianyi Wu, Bo Zhao, Dustin Bortner, and Jiawei Han
selected publications:
Y. Sun, Y. Yu, and J. Han, “Ranking-Based Clustering of Heterogeneous Information Networks with Star Network Schema”, KDD’09
Y. Sun, J. Han, P. Zhao, Z. Yin, H. Cheng, and T. Wu, “RankClus: Integrating Clustering with Ranking for Heterogeneous Information Network Analysis”, EDBT’09
Y. Sun, T. Wu, H. Cheng, J. Han, X. Yin, and P. Zhao, “BibNetMiner: Mining Bibliographic Information Networks” (demo paper), SIGMOD’08
X. Yin, J. Han, and P. S. Yu, “LinkClus: Efficient Clustering via Heterogeneous Semantic Links”, VLDB’06
OLAP and mining of multidimensional text databases
description:
A multidimensional text database, such as a collection of customer reviews, flight reports, job descriptions, or service feedback, is a database that consists of both multidimensional categorical attributes and narrative text attributes. We investigate how to construct text or topic data cubes; how to perform effective information retrieval, OLAP, and text mining on such data cubes; and how textual and structured multidimensional information can work together to enhance information retrieval and knowledge discovery.
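As a rough illustration of the idea, the sketch below builds a tiny cube over a multidimensional text database, aggregating term counts for every combination of categorical dimensions. It is a minimal sketch under stated assumptions, not the group's Text Cube or Topic Cube method: the records, the dimension names, and the function build_text_cube are all hypothetical, and raw term frequency stands in for the IR measures a real text cube would store.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy records: categorical dimensions plus a narrative text field.
RECORDS = [
    {"airline": "A", "phase": "landing", "text": "gear warning on landing"},
    {"airline": "A", "phase": "takeoff", "text": "engine warning on takeoff"},
    {"airline": "B", "phase": "landing", "text": "hard landing reported"},
]
DIMENSIONS = ("airline", "phase")


def build_text_cube(records, dimensions):
    """Map each cell (one value or '*' per dimension) to aggregated term counts."""
    cube = {}
    # Enumerate every subset of dimensions to keep; the rest roll up to '*'.
    for r in range(len(dimensions) + 1):
        for kept in combinations(dimensions, r):
            for rec in records:
                cell = tuple(rec[d] if d in kept else "*" for d in dimensions)
                cube.setdefault(cell, Counter()).update(rec["text"].split())
    return cube


cube = build_text_cube(RECORDS, DIMENSIONS)
# Roll-up over all airlines, restricted to the landing phase.
landing_terms = cube[("*", "landing")]
```

Each cell behaves like an OLAP aggregate whose measure is a bag of words, so a roll-up or drill-down simply selects a different cell key. Materializing every cell this way grows exponentially with the number of dimensions, which is why it is only suitable as a toy.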
researchers: Cindy Xide Lin, Bo Zhao, Bolin Ding, Duo Zhang, ChengXiang Zhai, and Jiawei Han
selected publications:
C. X. Lin, B. Ding, J. Han, F. Zhu, and B. Zhao, “Text Cube: Computing IR Measures for Multidimensional Text Database Analysis”, ICDM’08
D. Zhang, C. Zhai, and J. Han, “Topic Cube: Topic Modeling for OLAP on Multidimensional Text Databases”, SDM’09 (Best of SDM’09)
graph mining
description:
Graph mining extracts patterns, classification models, clusters, and other kinds of knowledge from massive graph data sets, and develops indexing, similarity search, and OLAP tools for graph data. Applications include bioinformatics, computer system diagnostics, social network analysis, and Web search and mining.
researchers: Chen Chen, Feida Zhu, Cindy Xide Lin, Peixiang Zhao, Xifeng Yan (Univ. of California at Santa Barbara), Jiawei Han, and Philip S. Yu (Univ. of Illinois at Chicago)
selected publications:
X. Yan, H. Cheng, J. Han, and P. S. Yu, “Mining Significant Graph Patterns by Scalable Leap Search”, SIGMOD’08
C. Chen, X. Yan, F. Zhu, J. Han, and P. S. Yu, “Graph OLAP: Towards Online Analytical Processing on Graphs”, ICDM’08
C. Chen, C. X. Lin, X. Yan, and J. Han, “On Effective Presentation of Graph Patterns: A Structural Representative Approach”, CIKM’08
C. Chen, X. Yan, P. S. Yu, J. Han, D. Zhang, and X. Gu, “Towards Graph Containment Search and Indexing”, VLDB’07
X. Yan, F. Zhu, P. S. Yu, and J. Han, “Feature-based Substructure Similarity Search”, ACM Transactions on Database Systems (TODS), 31: 1418-1453, 2006
privacy and trust validation by data mining
description:
Can we trust pieces of information provided by other parties and other information providers, such as newspapers, the Web, and TV?
We investigate this issue and develop techniques to analyze the truthfulness of information from multiple information providers and to automatically identify the trustworthy information. Alternatively, can we develop data mining mechanisms that find interesting information and still preserve the required privacy specified by information providers?
We study privacy-preserving data mining and have developed a constraint-based clustering approach for privacy-preserving data publishing. We are also working on privacy-preserving data cubes that provide multidimensional aggregate information while preserving the privacy of sensitive data.
researchers: Bolin Ding, Zhijun Yin, and Jiawei Han
selected publications:
A. K. H. Tung, J. Han, L. V. S. Lakshmanan, and R. T. Ng, “Privacy-Preserving Data Publishing: A Constraint-Based Clustering Approach”, in S. Basu, et al. (eds.), Constrained Clustering: Advances in Algorithms, Theory, and Applications, Taylor and Francis, 2008
X. Yin, J. Han, and P. S. Yu, “Truth Discovery with Multiple Conflicting Information Providers on the Web”, TKDE’08
X. Yin, J. Han, and P. S. Yu, “Object Distinction: Distinguishing Objects with Identical Names by Link Analysis”, ICDE’07
mining moving objects, trajectories, RFID, and traffic data
description:
The world is becoming increasingly mobile. We design and develop effective and scalable methods for mining massive moving-object data, trajectory data, RFID data, and traffic data to uncover clusters, classification models, frequent and sequential patterns, and outliers in large sets of moving objects, with applications in homeland security, law enforcement, traffic control, animal/bird migration analysis, and environmental studies.
researchers: Zhenhui Li, LuAn Tang, Sebastian Seith, and Jiawei Han
selected publications:
X. Li, Z. Li, J. Han, and J.-G. Lee, “Temporal Outlier Detection in Vehicle Traffic Data”, ICDE’09
J.-G. Lee, J. Han, X. Li, and H. Gonzalez, “TraClass: Trajectory Classification Using Hierarchical Region-Based and Trajectory-Based Clustering”, VLDB’08
J.-G. Lee, J. Han, and X. Li, “Trajectory Outlier Detection: A Partition-and-Detect Framework”, ICDE’08
H. Gonzalez, J. Han, X. Li, M. Myslinska, and J. P. Sondag, “Adaptive Fastest Path Computation on a Road Network: A Traffic Mining Approach”, VLDB’07
J.-G. Lee, J. Han, and K.-Y. Whang, “Trajectory Clustering: A Partition-and-Group Framework”, SIGMOD’07
image and video mining
description:
We investigate efficient image and video pattern mining, clustering, classification, and indexing methods. This includes developing SpIBag (Spatial Item Bag Mining), an image frequent spatial pattern mining algorithm; SpaRClus (Spatial Relationship Pattern-Based Hierarchical Clustering), an image clustering algorithm whose results persist under shifting, scaling, and rotation transformations; and a multi-layer ring-based index structure for both r-range search and k-NN search.
researchers: Sangkyum Kim, Xin Jin, Chandrasekar Ramachandran, Liangliang Cao, and Klara Nahrstedt
selected publications:
X. Jin, S. Kim, J. Han, L. Cao, and Z. Yin, “GAD: General Activity Detection for Fast Clustering on Large Data”, SDM’09
R. Malik, S. Kim, X. Jin, C. Ramachandran, J. Han, I. Gupta, and K. Nahrstedt, “MLR-Index: An Index Structure for Fast and Scalable Similarity Search in High Dimensions”, SSDBM’09
S. Kim, X. Jin, and J. Han, “SpaRClus: Spatial Relationship Pattern-Based Hierarchical Clustering”, SDM’08
multidimensional promotion and ranking analysis
description:
As decision support and business intelligence applications become increasingly large-scale, it is critical to support effective and efficient search and knowledge discovery through online multidimensional analysis. Promotion and ranking are indispensable functions of such an analysis engine: ranking aims at enabling analysts to explore top-k interesting aggregate or non-aggregate answers at multiple resolutions; and promotion helps decision makers promote any given object of interest through discovering the best subspaces or data regions where the object becomes prominent, without manually navigating the data set. We have developed Ranking-Cube and PromotionCube methods that are efficient and scalable at processing flexible queries in multidimensional space.
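To make the notion of promotion concrete, the sketch below brute-forces every subspace of a toy fact table and reports where a given object ranks best by summed score. It is a minimal sketch under stated assumptions, not the group's cube-based method: the rows, dimension names, and the function promote are hypothetical, and exhaustive enumeration replaces the materialization and pruning that a scalable engine would need.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical fact rows: (object, {dimension: value}, score).
ROWS = [
    ("alice", {"region": "east", "year": 2008}, 5),
    ("alice", {"region": "west", "year": 2008}, 1),
    ("bob", {"region": "east", "year": 2008}, 3),
    ("bob", {"region": "west", "year": 2008}, 9),
    ("carol", {"region": "east", "year": 2009}, 4),
]
DIMENSIONS = ("region", "year")


def promote(rows, dimensions, target, k=3):
    """Return up to k (rank, subspace) pairs where `target` ranks best by summed score."""
    totals = defaultdict(lambda: defaultdict(int))  # subspace -> object -> sum
    for obj, dims, score in rows:
        # Each row contributes to every subspace it belongs to, including ().
        for r in range(len(dimensions) + 1):
            for kept in combinations(dimensions, r):
                subspace = tuple((d, dims[d]) for d in kept)
                totals[subspace][obj] += score
    ranked = []
    for subspace, sums in totals.items():
        if target in sums:
            # Rank 1 means no other object has a strictly larger total.
            rank = 1 + sum(1 for s in sums.values() if s > sums[target])
            ranked.append((rank, subspace))
    # Prefer better ranks, then more general (shorter) subspaces.
    ranked.sort(key=lambda pair: (pair[0], len(pair[1]), pair[1]))
    return ranked[:k]


best_for_alice = promote(ROWS, DIMENSIONS, "alice")
```

In this toy data "alice" is second overall but first when the analysis is restricted to the east region, which is the kind of subspace a promotion query is meant to surface without manual navigation.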
researchers: Tianyi Wu, Dong Xin (Microsoft Research), and Jiawei Han
selected publications:
T. Wu, D. Xin, and J. Han, “ARCube: Supporting Ranking Aggregate Queries in Partially Materialized Data Cubes”, SIGMOD’08
D. Xin and J. Han, “P-Cube: Answering Preference Queries in Multi-Dimensional Space”, ICDE’08
T. Wu, X. Li, D. Xin, J. Han, J. Lee, and R. Redder, “DataScope: Viewing Database Contents in Google Maps’ Way”, VLDB’08 (demo)
transfer learning, dimensionality reduction, and pattern-based classification
description:
Classification is a core problem widely studied in machine learning, statistical learning, and data mining. Real-world applications, such as text, image, and web categorization, gene prediction, and system and network intrusion detection, can be cast as classification problems. Although many learning algorithms, such as Support Vector Machines, logistic regression, and decision tree induction, have been developed, there are still numerous challenges in effective classification. We investigate methods for improving classification accuracy by exploring knowledge embedded in data, and we develop novel methods to construct discriminative and compact feature sets for complex structured data, explore manifold structure for learning, and combine multiple sources or learning models for better predictions.
researchers: Jing Gao, Deng Cai, Hong Cheng (Chinese Univ. of Hong Kong), and Jiawei Han
selected publications:
J. Gao, W. Fan, J. Jiang, and J. Han, “Knowledge Transfer via Multiple Model Local Structure Mapping”, KDD’08
D. Cai, X. He, and J. Han, “Training Linear Discriminant Analysis in Linear Time”, ICDE’08
H. Cheng, X. Yan, J. Han, and P. S. Yu, “Direct Discriminative Pattern Mining for Effective Classification”, ICDE’08
D. Cai, X. He, and J. Han, “SRDA: An Efficient Algorithm for Large Scale Discriminant Analysis”, TKDE’08
H. Cheng, X. Yan, J. Han, and C.-W. Hsu, “Discriminative Frequent Pattern Analysis for Effective Classification”, ICDE’07
stream data mining
description:
In many real-time applications, such as network traffic monitoring, credit card fraud detection, and web click stream analysis, data arrive continuously and in large volumes, forming data streams. We investigate stream data mining principles and algorithms, and develop effective and scalable methods for mining the dynamics of data streams in multi-dimensional space, including discovering changes, trends, and evolution characteristics in data streams, constructing clusters and classification models, and exploring frequent patterns and similarities among data streams.
researchers: Jing Gao, Wei Fan (IBM Research), and Jiawei Han
selected publications:
L. Mendes, B. Ding, and J. Han, “Stream Sequential Pattern Mining with Precise Error Bounds”, ICDM’08
J. Gao, W. Fan, and J. Han, “On Appropriate Assumptions to Mine Data Streams: Analysis and Practice”, ICDM’07
J. Gao, W. Fan, J. Han, and P. S. Yu, “A General Framework for Mining Concept-Drifting Data Streams with Skewed Distributions”, SDM’07
data mining applications
sequential pattern mining (Bolin Ding and Jiawei Han):
Motivated by long sequences in text data, biological data, software engineering, and sensor networks, we study mining repetitive gapped subsequences to capture the occurrences of sequential patterns repeating within each sequence of a large database and use them as features for classification or prediction.
biological and medical data mining (Jing Gao, Xiao Yu, Min-Soo Kim, Zhijun Yin, Jiawei Han):
We investigate medical classification problems, including gene prediction based on microarray data and cancer prediction based on medical images, and develop discriminative pattern-based methods to improve the accuracy of medical data classification, as well as to provide useful discriminative patterns that help medical experts with their decisions.
software engineering and sensor network mining (Xin Jin, Jiawei Han, and Tarek Abdelzaher):
We investigate statistical analysis and sequence/graph mining methods for software bug detection, failure indexing, troubleshooting, and root-cause analysis in sensor networks and data streams.
cyberphysical systems (LuAn Tang and Jiawei Han):
A cyberphysical system consists of a large number of interacting physical and information components. For example, a patient-care system may link a patient monitoring system with a network of patients and associated medical information and an emergency handling system. We investigate data mining in cyberphysical networks, including real-time analysis of massive amounts of streaming data, reliable and trusted data analysis, and effective spatiotemporal data analysis.
selected publications:
D. Lo, H. Cheng, J. Han, S. Khoo, and C. Sun, “Classification of Software Behaviors for Failure Detection: A Discriminative Pattern Mining Approach”, KDD’09
B. Ding, D. Lo, J. Han, and S.-C. Khoo, “Efficient Mining of Closed Repetitive Gapped Subsequences from a Sequence Database”, ICDE’09
M. M. H. Khan, T. Abdelzaher, J. Han, and H. Ahmadi, “Finding Symbolic Bug Patterns in Sensor Networks”, DCOSS’09
M. M. H. Khan, H. Le, H. Ahmadi, T. Abdelzaher, and J. Han, “DustMiner: Troubleshooting Interactive Complexity Bugs in Sensor Networks”, SenSys’08
F. Zhu, X. Yan, J. Han, P. S. Yu, and H. Cheng, “Mining Colossal Frequent Patterns by Core Pattern Fusion”, ICDE’07 (Best Student Paper Award)
research funding
NASA (with ChengXiang Zhai, et al.): “Event Cube: An Organized Approach for Mining and Understanding Anomalous Aviation Events” (2008-2010)
Air Force (MURI, with Tim Finin as PI, et al.): “A Framework for Managing the Assured Information Sharing Lifecycle” (2008-2012)
NSF: “SGER: CS-BibCube: OLAPing and Mining of Computer Science Literature” (2008-2010)
NSF (with Roland Kays et al.): “BDI: Movebank: Integrated Database for Networked Organism Tracking” (2007-2010)
NSF: “SGER: DataScope: Viewing Database Contents in Multi-Resolution at Your Finger Tips” (2006-2007)
NSF (with Jasmine Zhou): “Collaborative Research: Endowing Biological Databases With Analytical Power: Indexing, Querying, and Mining of Complex Biological Structures” (2005-2009)
NSF (with Ouri Wolfson): “SEI(IIS): MotionEye: Querying and Mining Large Datasets of Moving Objects” (2005-2008)
NSF (with Xiaosong Ma): “Collaborative Research: Reusable, Observation-based Performance Prediction across Platforms” (2004-2005)
DHS (with Dan Roth as PI, et al.): Multimodal Information Access and Synthesis Center (2007-2010)
ONR/NCASSR (with Michael Welge): “Detection and Apprehension of Rare Events in Data Streams” (2008-2009)
Boeing: “On-Line Mining of Strange Moving Objects for Security Protection” (2007-2010)
U.S. Air Force (with IAI Inc): “Distributed High-Dimensional Mining Tool for Bioscience Data Analysis” (2006-2009)
NSF (with Josep Torrellas as PI, et al.): “ITR: Automatic On-the-fly Detection, Characterization, Recovery, and Correction of Software Bugs in Production Runs” (2003-2008)
NSF: “Mining Dynamics of Data Streams in Multi-Dimensional Space” (2003-2006)
ONR (with Michael Welge): “Mining Changes and Alarming Events in Streaming Data” (2003-2006)
NSF: “Mining Sequential and Structured Patterns: Scalability, Flexibility, Extensibility and Applicability” (2002-2006)
Research gifts and grants from industry: Microsoft Research, Intel, IBM (Faculty Award, Innovation Award), Google, Yahoo!, NCSA (Faculty Fellowship), HP-Labs.
Data Mining Research Group • Department of Computer Science, UIUC
http://dm1.cs.uiuc.edu