Machine Learning Research
Unique Recovery of Transport Maps and Vector Fields from Finite Data
This paper establishes conditions for the unique identification of diffeomorphisms and vector fields using finite measure-valued data. It introduces a new metric to compare diffeomorphisms based on discrepancies in pushforward densities. The analysis leverages Whitney and Takens embedding theorems t…
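The Takens-style delay embedding that the analysis leverages can be sketched in a few lines: a scalar time series is mapped into R^d using lagged copies of itself. This is a generic illustration of the embedding construction, not the paper's specific recovery procedure; the function name is ours.

```python
# Generic sketch of a Takens-style delay-coordinate embedding.
import numpy as np

def delay_embed(x, d, tau):
    """Rows are (x[t], x[t+tau], ..., x[t+(d-1)*tau])."""
    n = len(x) - (d - 1) * tau
    return np.column_stack([x[i * tau : i * tau + n] for i in range(d)])

t = np.linspace(0, 20, 500)
x = np.sin(t)                      # scalar observable of the dynamics
E = delay_embed(x, d=3, tau=5)
print(E.shape)                     # (500 - 2*5, 3) = (490, 3)
```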
Geometric Framework for Prototype Clustering Accuracy
This paper introduces a geometric framework to analyze the relationship between objective accuracy and structural recovery in prototype-based clustering. It defines a clustering condition number that quantifies the difficulty of separating clusters, showing that a small suboptimality gap implies low…
Re-evaluating Data Leakage Severities in Machine Learning
This study systematically quantifies the impact of four classes of data leakage in machine learning across diverse datasets. It reveals that selection leakage, often overlooked, is the most significant, while estimation leakage (e.g., fitting a scaler on the full dataset), commonly emphasized in textbooks, ha…
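The "scaler fitting on full data" form of estimation leakage mentioned above is easy to reproduce. A minimal sketch, using synthetic data and our own variable names: fitting normalization statistics on all rows (including test rows) versus fitting them on the training rows only yields measurably different test inputs.

```python
# Sketch of estimation leakage: scaling statistics fit on the full dataset
# versus fit on the training split only.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(loc=5.0, scale=2.0, size=(1000, 3))
split = 800
X_train, X_test = X[:split], X[split:]

# Leaky: mean/std computed on ALL rows, including the test rows.
mu_leak, sd_leak = X.mean(axis=0), X.std(axis=0)
X_test_leaky = (X_test - mu_leak) / sd_leak

# Clean: mean/std computed on the training rows only.
mu, sd = X_train.mean(axis=0), X_train.std(axis=0)
X_test_clean = (X_test - mu) / sd

# The two scaled test sets differ: test information leaked into the scaler.
print(np.abs(X_test_leaky - X_test_clean).max() > 0)
```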
Information-Theoretic Limits and QP Relaxation for Attributed Network Alignment
This research introduces the featured correlated Gaussian Wigner model to optimize attributed network alignment by integrating node features with graph topology. The authors establish the information-theoretic limits for exact and partial recovery and present QPAlign, a quadratic programming relaxat…
FLOWGEM: A Principled Solution for Non-Monotonic MAR Missingness in Data
FLOWGEM is a novel, iterative method addressing non-monotonic Missing at Random (MAR) data by minimizing Kullback-Leibler divergence through approximate Wasserstein Gradient Flows. This approach utilizes a discretized particle evolution and a local linear estimator for the density ratio, enabling the ge…
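To give a feel for the discretized particle evolution, here is a generic sketch, not FLOWGEM itself: the unadjusted Langevin step is a standard time-discretization of the Wasserstein gradient flow that minimizes KL(q || p) toward a target density p, taken here to be a unit Gaussian with known score.

```python
# Discretized particle evolution for the KL-minimizing Wasserstein gradient
# flow (unadjusted Langevin discretization, unit-Gaussian target).
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, size=2000)   # particles initialized far from target

step = 0.1
for _ in range(500):
    grad_log_p = -x                  # score of N(0, 1): d/dx log p(x) = -x
    x = x + step * grad_log_p + np.sqrt(2 * step) * rng.normal(size=x.size)

# The particle cloud drifts toward the target's mean 0 and variance ~1.
print(abs(x.mean()) < 0.2, abs(x.var() - 1.0) < 0.3)
```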
Optimized Partially Deterministic Sampling Improves Compressed Sensing
This paper introduces a novel partially deterministic sampling scheme for compressed sensing, combining random and deterministic selection of sampling vectors from rows of a unitary matrix. This method offers improved sample complexity and novel denoising guarantees. Numerical experiments demonstrat…
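The row-selection scheme can be sketched concretely. The split between deterministic and random rows below (lowest frequencies kept deterministically) is our illustrative assumption, not necessarily the paper's optimized choice:

```python
# Sketch of partially deterministic sampling: m rows of a unitary DFT
# matrix, part chosen deterministically, part drawn uniformly at random.
import numpy as np

n, m_det, m_rand = 64, 8, 16
U = np.fft.fft(np.eye(n)) / np.sqrt(n)   # unitary DFT matrix

det_rows = np.arange(m_det)              # deterministic: lowest frequencies
rng = np.random.default_rng(1)
rand_rows = rng.choice(np.arange(m_det, n), size=m_rand, replace=False)
rows = np.concatenate([det_rows, rand_rows])

A = U[rows]                              # (m_det + m_rand) x n sensing matrix
# Every row keeps unit norm, inherited from the unitary matrix.
print(A.shape, np.allclose(np.linalg.norm(A, axis=1), 1.0))
```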
Individual-Heterogeneous Sub-Gaussian Mixture Models Outperform Homogeneous Models in Clustering
The paper introduces individual-heterogeneous sub-Gaussian mixture models (IHSGMM) to address limitations of traditional Gaussian mixture models (GMM) which assume cluster homogeneity. IHSGMMs assign a unique heterogeneity parameter to each observation, allowing for better capture of real-world data…
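A minimal numpy sketch of the modeling idea, not the paper's estimator: data where each observation has its own spread, and a hard-assignment step that produces a per-observation scale estimate rather than one shared variance per cluster. All names (`s_hat`, `centers`) are ours.

```python
# Sketch: per-observation heterogeneity in a two-cluster mixture.
import numpy as np

rng = np.random.default_rng(0)
s_true = rng.uniform(0.2, 2.0, size=200)          # per-point scales
labels = rng.integers(0, 2, size=200)
centers_true = np.array([[-3.0, 0.0], [3.0, 0.0]])
X = centers_true[labels] + s_true[:, None] * rng.normal(size=(200, 2))

# Hard assignment to the nearest center, then a per-point scale estimate.
centers = np.array([[-2.5, 0.5], [2.5, -0.5]])    # rough initial centers
d = np.linalg.norm(X[:, None, :] - centers[None], axis=2)
assign = d.argmin(axis=1)
s_hat = d[np.arange(len(X)), assign] / np.sqrt(2)  # one scale per observation

acc = max((assign == labels).mean(), 1 - (assign == labels).mean())
print(s_hat.shape, acc)
```

A homogeneous GMM would instead pool all within-cluster distances into a single variance per cluster, which is exactly the restriction the summary says IHSGMMs lift.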
Weighted Bayesian Conformal Prediction Generalizes Uncertainty Quantification Under Distribution Shift
Weighted Bayesian Conformal Prediction (WBCP) extends traditional Bayesian Conformal Prediction (BQ-CP) to handle distribution shifts by incorporating importance weights. This method replaces the uniform Dirichlet prior with a weighted Dirichlet, using Kish's effective sample size. WBCP improves con…
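Two of the ingredients named in this summary, Kish's effective sample size and an importance-weighted quantile of calibration scores, can be sketched directly. This is a generic illustration of those quantities, not the WBCP algorithm itself; the function names are ours.

```python
# Sketch: Kish's ESS and a weighted quantile of nonconformity scores.
import numpy as np

def kish_ess(w):
    """Kish's effective sample size: (sum w)^2 / sum(w^2)."""
    w = np.asarray(w, float)
    return w.sum() ** 2 / (w ** 2).sum()

def weighted_quantile(scores, w, q):
    """Smallest score whose cumulative normalized weight reaches q."""
    order = np.argsort(scores)
    s, w = np.asarray(scores, float)[order], np.asarray(w, float)[order]
    cum = np.cumsum(w) / w.sum()
    return s[np.searchsorted(cum, q)]

rng = np.random.default_rng(0)
scores = rng.exponential(size=1000)     # calibration nonconformity scores
w = rng.exponential(size=1000)          # importance weights under shift

print(kish_ess(np.ones(5)))             # uniform weights: ESS = 5.0
alpha = 0.1
print(weighted_quantile(scores, w, 1 - alpha))
```

With uniform weights the ESS equals the sample count and the weighted quantile reduces to the ordinary conformal quantile; heavier weight concentration shrinks the ESS, signaling fewer "effective" calibration points.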
