paper / bertdejong / 29d ago
Reward-based fine-tuning steers pretrained diffusion and flow generative models toward higher-reward samples. The paper unifies many such methods under reward score matching (RSM), which recasts alignment as score matching toward a reward-guided target. Existing methods from soft RL, GFlowNets, and related lines share this framework and differ mainly in their value-guidance estimators and in how strongly they optimize at each timestep. RSM makes the bias-variance-compute tradeoffs explicit, enabling simpler redesigns that improve alignment effectiveness and efficiency for both differentiable and black-box rewards.
reward-score-matching, diffusion-models, flow-models, reward-fine-tuning, generative-models, machine-learning
“Many reward-based fine-tuning methods for diffusion and flow models can be unified under reward score matching (RSM).”
paper / bertdejong / 29d ago
Mn interdiffusion from MnTe into (Bi,Sb)2Te3 forms self-organized Mn(Bi,Sb)2Te4 septuple lamellae alternating with (Bi,Sb)2Te3 quintuple layers, as verified by STEM and polarized neutron reflectometry. Above its Néel temperature of 20 K, Mn(Bi,Sb)2Te4 mediates exchange coupling that induces an anomalous Hall effect at the (Bi,Sb)2Te3/MnTe interface, with an enhanced interfacial Néel temperature above 200 K. This architecture supports robust, deterministic spin-orbit torque switching without external fields at a critical current density of 300 kA/cm².
magnetic-topological-insulators, heterostructures, proximity-magnetism, spintronics, anomalous-hall-effect, materials-science, strongly-correlated-electrons
“Mn interdiffusion from MnTe forms self-organized Mn(Bi,Sb)2Te4 septuple lamellae alternating with (Bi,Sb)2Te3 quintuple layers”
paper / bertdejong / 29d ago
The paper proposes a semi-supervised framework for maxillary sinus segmentation in panoramic X-rays. It uses weighted knowledge distillation to suppress unreliable teacher signals and SinusCycle-GAN to refine pseudo-labels via unpaired image translation. The framework addresses structural overlap, ambiguous boundaries, and limited labeled data, using data from 2,511 patients. It outperforms state-of-the-art methods with a 96.35% Dice score and reduced boundary error under low-label conditions.
knowledge-distillation, semi-supervised-learning, medical-image-segmentation, dental-imaging, panoramic-xray, sinus-segmentation, cycle-gan
“Weighted knowledge distillation suppresses unreliable signals from structural discrepancies in semi-supervised segmentation.”
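For readers unfamiliar with the metric behind the 96.35% figure, the Dice score of two binary masks A and B is 2|A∩B| / (|A| + |B|). The sketch below is a minimal illustration on flattened 0/1 masks; the function name `dice_score` and the convention of returning 1.0 when both masks are empty are assumptions made here, not details taken from the paper.

```python
def dice_score(pred, target):
    """Dice coefficient between two flattened binary masks (sequences of 0/1)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    # Pixels marked foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total foreground pixels across both masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return 2.0 * intersection / total


if __name__ == "__main__":
    # One overlapping pixel, 2 + 1 foreground pixels in total: 2 * 1 / 3.
    print(dice_score([1, 1, 0, 0], [1, 0, 0, 0]))
```

A score of 1.0 means the predicted and reference masks coincide exactly, and 0.0 means they share no foreground pixels.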
paper / bertdejong / 29d ago
Integrated transmission and distribution (T&D) co-simulation suffers from a timestep mismatch between phasor-domain transmission simulation (10 ms) and electromagnetic transient (EMT) domain distribution simulation (100 μs), which leads to inaccurate phase-locked loop (PLL) based frequency estimation for inverter-based resources (IBRs). The EWMA-RTTA method uses quadratic extrapolation with exponentially weighted moving averages and real-time threshold adaptation to predict voltage magnitude and phase angle within each transmission interval. Validated on the IEEE 118-bus transmission and 123-bus distribution systems via Opal-RT, it reduces normalized mean absolute error (nMAE) by 25.7× relative to constant-value baselines, enabling precise modeling of IBR frequency responses.
power-systems, frequency-response, co-simulation, inverter-based-resources, ewma-rtTA, quadratic-extrapolation, transmission-distribution
“Transmission system uses 10 ms timestep in phasor domain, distribution uses 100 μs in EMT domain”
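To make the timestep mismatch concrete: one 10 ms transmission step spans 100 distribution steps of 100 μs, so the distribution side needs 100 intermediate boundary values per transmission update. The sketch below shows plain quadratic extrapolation from the last three transmission samples, plus a basic EWMA smoother. It is an illustration under stated assumptions, not the paper's EWMA-RTTA method: the function names are hypothetical, how the paper combines smoothing with extrapolation is not reproduced, and the real-time threshold adaptation is omitted entirely.

```python
T_TRANSMISSION = 10e-3   # phasor-domain timestep in seconds (from the summary)
T_DISTRIBUTION = 100e-6  # EMT-domain timestep in seconds (from the summary)
SUBSTEPS = round(T_TRANSMISSION / T_DISTRIBUTION)  # distribution steps per transmission step


def ewma(samples, alpha):
    """Plain exponentially weighted moving average; alpha in (0, 1] weights the newest sample."""
    smoothed = []
    state = samples[0]
    for x in samples:
        state = alpha * x + (1.0 - alpha) * state
        smoothed.append(state)
    return smoothed


def quadratic_extrapolate(y_m2, y_m1, y_0, s):
    """Evaluate the quadratic through three equally spaced samples.

    y_m2, y_m1, y_0 are the samples at times -2T, -T, 0, and s = t / T is the
    normalized time (s in (0, 1] looks ahead into the next transmission interval).
    Uses the Newton backward-difference form.
    """
    d1 = y_0 - y_m1                # first backward difference
    d2 = y_0 - 2.0 * y_m1 + y_m2   # second backward difference
    return y_0 + s * d1 + 0.5 * s * (s + 1.0) * d2


def predict_within_interval(y_m2, y_m1, y_0, substeps=SUBSTEPS):
    """Predicted values at each distribution substep of the next transmission interval."""
    return [quadratic_extrapolate(y_m2, y_m1, y_0, k / substeps)
            for k in range(1, substeps + 1)]


if __name__ == "__main__":
    # Hypothetical voltage magnitudes (per unit) at the last three transmission steps.
    predictions = predict_within_interval(1.000, 1.002, 1.006)
    print(len(predictions), predictions[-1])
```

A constant-value baseline would instead hold `y_0` for all 100 substeps; the extrapolation follows the recent trend and curvature of the signal within the interval.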
paper / bertdejong / 29d ago
The paper introduces nonparametric e-processes for sequential, anytime-valid tests of stochastic dominance (SD), which differs from mean dominance in that it compares full distributions rather than means alone. For first-order SD, these e-processes mix asymptotically growth-rate optimal e-variables and achieve power one. The framework extends to higher-order SD and is empirically competitive with non-sequential tests while remaining valid under continuous monitoring. The paper also outlines challenges in testing the non-SD null.
stochastic-dominance, sequential-testing, anytime-valid-tests, e-processes, nonparametric-statistics, statistical-hypothesis-testing
“The proposed e-processes provide anytime-valid tests for first-order stochastic dominance under continuous monitoring.”
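As general background on why e-processes permit continuous monitoring: an e-process is a nonnegative process whose expected value under the null is at most 1 at any stopping time, so by Ville's inequality rejecting once it reaches 1/α keeps the type-I error at or below α no matter when monitoring stops. The sketch below illustrates the mechanism on a toy Bernoulli null using likelihood-ratio e-values. It is not the paper's stochastic-dominance construction; the null `p0`, the alternative `p1`, and both function names are assumptions chosen for illustration.

```python
def bernoulli_e_value(x, p0, p1):
    """Likelihood-ratio e-value for one 0/1 observation.

    Under the null success probability p0 its expectation is exactly 1:
    p0 * (p1 / p0) + (1 - p0) * ((1 - p1) / (1 - p0)) = 1.
    """
    return p1 / p0 if x == 1 else (1.0 - p1) / (1.0 - p0)


def run_e_process(observations, p0=0.5, p1=0.75, alpha=0.05):
    """Multiply e-values as data arrive and stop the first time the product reaches 1/alpha.

    Returns (stopping_index, wealth); stopping_index is 1-based, or None if
    the threshold is never reached.
    """
    wealth = 1.0
    threshold = 1.0 / alpha
    for n, x in enumerate(observations, start=1):
        wealth *= bernoulli_e_value(x, p0, p1)
        if wealth >= threshold:
            return n, wealth
    return None, wealth


if __name__ == "__main__":
    # Ten successes in a row: each e-value is 1.5, and 1.5 ** 8 is the first power above 20.
    print(run_e_process([1] * 10))
```

Because the threshold check is valid after every observation, the analyst may inspect the running product at each step without any correction for repeated looks.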