Edited by Mohammed J. Zaki, Ke Wang, Chid Apte, and Haesun Park
Proceedings in Applied Mathematics 130
Symposium held in Atlanta, GA, April 24-26, 2008.
Contents: Message from the Conference Co-Chairs; Preface; SDM 2008 Conference Organization; Program Committee; External Reviewers; Semi-Supervised Clustering via Matrix Factorization; Creating a Cluster Hierarchy under Constraints of a Partially Known Hierarchy; Constrained Co-clustering of Gene Expression Data; DATA PEELER: Constraint-Based Closed Pattern Mining in n-ary Relations; SpaRClus: Spatial Relationship Pattern-Based Hierarchical Clustering; Mining Tree Patterns with Almost Smallest Supertrees; Maximal Quasi-Bicliques with Balanced Noise Tolerance: Concepts and Co-clustering Applications; CISpan: Comprehensive Incremental Mining Algorithms of Closed Sequential Patterns for Multi-Versional Software Mining; Mining Association Rules of Simple Conjunctive Queries; Discovering Relational Item Sets Efficiently; A Stagewise Least Square Loss Function for Classification; Semi-Supervised Learning Based on Semiparametric Regularization; Roughly Balanced Bagging for Imbalanced Data; An Efficient Local Algorithm for Distributed Multivariate Regression in Peer-to-Peer Networks; Aerosol Optical Depth Prediction from Satellite Observations by Multiple Instance Regression; Feature Selection with the logRatio Kernel; A RELIEF Based Feature Extraction Algorithm; Deterministic Latent Variable Models and Their Pitfalls; Massive-Scale Kernel Discriminant Analysis: Mining for Quasars; Dynamic Non-Parametric Mixture Models and Recurrent Chinese Restaurant Process: With Applications to Evolutionary Clustering; Latent Variable Mining with Its Applications to Anomalous Behavior Detection; Similarity Measures for Categorical Data: A Comparative Evaluation; Gaussian Process Learning for Cyber-Attack Early Warning; Practical Private Computation and Zero-Knowledge Tools for Privacy-Preserving Distributed Data Mining; A Spamicity Approach to Web Spam Detection; Semantic Smoothing for Bayesian Text Classification with Small Training Data; Clustering from Constraint Graphs; Efficiently
Mining Closed Subsequences with Gap Constraints; Semi-Supervised Classification with Universum; Finding Subgroups Having Several Descriptions: Algorithms for Redescription Mining; The PageTrust Algorithm: How to Rank Web Pages When Negative Links Are Allowed?; A Pattern Mining Approach toward Discovering Generalized Sequences Signatures; The Asymmetric Approximate Anytime Join: A New Primitive with Applications to Data Mining; Preemptive Measures against Malicious Party in Privacy-Preserving Data Mining; A Range Query Approach for High Dimensional Euclidean Space Based on EDM Estimation; A Bayesian Technique for Estimating the Credibility of Question Answerers; Semi-supervised Multi-label Learning by Solving a Sylvester Equation; Exploiting Structured Reference Data for Unsupervised Text Segmentation with Conditional Random Fields; Graph Mining with Variational Dirichlet Process Mixture Models; Direct Density Ratio Estimation for Large-scale Covariate Shift Adaptation; ROC-tree: A Novel Decision Tree Induction Algorithm Based on Receiver Operating Characteristics to Classify Gene Expression Data; Semi-supervised Learning of a Markovian Metric; Mining Abnormal Patterns from Heterogeneous Time-Series with Irrelevant Features for Fault Event Detection; Outlier Detection with Uncertain Data; Randomization of Real-Valued Matrices for Assessing the Significance of Data Mining Results; Theoretical Analysis of Subsequences Time-Series Clustering from a Frequency-Analysis Viewpoint; Active Learning with Model Selection in Linear Regression; A Feature Selection Algorithm Capable of Handling Extremely Large Data Dimensionality; Generic Methods for Multi-criteria Evaluation; A New Method for Rule Finding via Bootstrapped Confidence Intervals; Mining and Ranking Generators of Sequential Patterns; Type Independent Correction of Sample Selection Bias via Structural Discovery and Re-balancing; Exploration and Reduction of the Feature Space by Hierarchical Clustering; On the Dangers
of Cross-Validation. An Experimental Evaluation; Mining Complex, Maximal and Complete Sub-graphs and Sets of Correlated Variables with Applications to Feature Subset Selection; Spatio-Temporal Partitioning for Improving Aerosol Prediction Accuracy; On Indexing High Dimensional Data with Uncertainty; Efficient Distribution Mining Classification; Mining Sequence Classifiers for Early Prediction; Exact and Approximate Reverse Nearest Neighbor Search for Multimedia Data; Finding a Haystack in Haystacks—Simultaneous Identification of Concepts in Large Bio-Medical Corpora; Learning Markov Network Structure Using Few Independence Tests; Statistical Density Prediction in Traffic Networks; Proximity Tracking on Time-Evolving Bipartite Graphs; Integration of Multiple Networks for Robust Label Propagation; Spatial Scan Statistics for Graph Clustering; Randomizing Social Networks: A Spectrum Preserving Approach; Efficient Maximum Margin Clustering via Cutting Plane Algorithm; Robust Clustering in Arbitrarily Oriented Subspaces; The Relevant-set Correlation Model for Data Clustering; Cluster Ensemble Selection; Weighted Consensus Clustering; A General Framework for Estimating Similarity of Datasets and Decision Trees: Exploring Semantic Similarity; A General Model for Multiple View Unsupervised Learning; Unsupervised Segmentation of Conversational Transcripts; Large-Scale Many-Class Learning; Simultaneous Unsupervised Learning of Disparate Clusterings; Author Index.
2008 / 869 pages + index / CD / ISBN: 978-089871-654-2 / List Price $174.00 / Member Price $121.80 / Order Code PR130