absorb.md — A knowledge graph of what AI thinkers are actually saying

paper / primeministernetanyahu2 / Apr 11

Adaptive Cross-Modal Patch Matching via HyperNetwork-Modulated Siamese CNNs

The authors propose a lightweight descriptor-learning framework for cross-modal patch matching that uses HyperNetworks and conditional instance normalization to modulate a Siamese CNN. The architecture enables adaptive per-channel scaling and modality-specific alignment in shallow layers, improving robustness to appearance shifts (e.g., VIS-IR) without significant inference overhead. The approach achieves state-of-the-art performance on VIS-NIR benchmarks, and the authors introduce the GAP-VIR dataset for cross-platform evaluation.

hypernetworks multi-sensor-matching computer-vision deep-learning image-processing neural-networks

“Hypernetworks can improve multimodal patch matching by providing adaptive, per-channel scaling and shifting to a Siamese CNN.”
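The per-channel scaling and shifting described in this entry can be illustrated with a minimal sketch. It is not the authors' implementation: the one-layer `hyper_linear` hypernetwork, the one-hot modality codes, and all weight values are hypothetical, chosen only to show how a modality code could produce the scale and shift applied after instance normalization.

```python
import math

def instance_norm(channel, eps=1e-5):
    """Normalize one channel of one sample to zero mean and unit variance."""
    mean = sum(channel) / len(channel)
    var = sum((x - mean) ** 2 for x in channel) / len(channel)
    return [(x - mean) / math.sqrt(var + eps) for x in channel]

def hyper_linear(code, weights, bias):
    """Hypothetical one-layer hypernetwork: modality code -> one value per channel."""
    return [sum(w * c for w, c in zip(row, code)) + b
            for row, b in zip(weights, bias)]

def conditional_instance_norm(features, code, w_gamma, b_gamma, w_beta, b_beta):
    """Instance-normalize each channel, then scale and shift it with
    parameters generated from the modality code."""
    gammas = hyper_linear(code, w_gamma, b_gamma)
    betas = hyper_linear(code, w_beta, b_beta)
    return [[g * x + b for x in instance_norm(ch)]
            for ch, g, b in zip(features, gammas, betas)]

# Assumed one-hot modality codes (illustrative only).
VIS, IR = [1.0, 0.0], [0.0, 1.0]

# Two channels, four spatial positions each (made-up activations).
features = [[0.1, 0.4, 0.3, 0.8], [2.0, 1.0, 3.0, 2.5]]

# Made-up hypernetwork weights: one row per channel, one column per code entry.
w_gamma, b_gamma = [[1.0, 2.0], [1.0, 0.5]], [0.0, 0.0]
w_beta, b_beta = [[0.0, 1.0], [0.0, -1.0]], [0.0, 0.0]

out_vis = conditional_instance_norm(features, VIS, w_gamma, b_gamma, w_beta, b_beta)
out_ir = conditional_instance_norm(features, IR, w_gamma, b_gamma, w_beta, b_beta)
```

In this sketch the same features are normalized identically for both modalities; only the generated scale and shift differ, which is the sense in which the modulation is modality-specific.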

paper / primeministernetanyahu2 / Apr 11

Spatio-Temporal Transformer for Long-Term NDVI Forecasting: A Novel Approach to Satellite Image Time Series Analysis

The Spatio-Temporal Transformer for Long-Term Forecasting (STT-LTF) is a novel framework that integrates spatial and temporal context modeling for long-term satellite image time series (SITS) analysis. It processes multi-scale spatial patches and extensive temporal sequences (up to 20 years) within a unified transformer architecture. The model is trained with self-supervision on 40 years of unlabeled Landsat imagery and predicts future time points directly, which avoids error accumulation and accommodates irregular temporal sampling and variable prediction horizons. For next-year predictions, STT-LTF achieved a Mean Absolute Error (MAE) of 0.0328 and an R^2 of 0.8412, outperforming existing methods.

spatio-temporal-transformers ndvi-forecasting remote-sensing satellite-imagery deep-learning environmental-monitoring computer-vision

“STT-LTF processes multi-scale spatial patches alongside temporal sequences (up to 20 years) through a unified transformer architecture.”
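For readers unfamiliar with the two metrics quoted in this entry, the sketch below shows how MAE and R^2 are conventionally computed. The NDVI values are invented for illustration and are unrelated to the paper's data or results.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Made-up observed and predicted NDVI values (illustrative only).
observed = [0.2, 0.4, 0.6, 0.8]
predicted = [0.25, 0.35, 0.65, 0.75]

mae = mean_absolute_error(observed, predicted)
r2 = r_squared(observed, predicted)
```

A lower MAE and an R^2 closer to 1 both indicate predictions that track the observed series more closely.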

paper / primeministernetanyahu2 / Apr 11

Physics-Grounded Bayesian Inverse Planning for Social Perception

Social perception in physical environments requires the inversion of a generative model that combines intuitive physics with Bayesian inverse planning. Experimental results using the PHASE dataset demonstrate that physics-grounded computational models (SIMPLE) align with human judgment, whereas feedforward vision-language models and physics-agnostic planners fail to capture the causal constraints of the physical world.

intuitive-physics social-perception computational-modeling bayesian-inference agent-based-models ai-reasoning human-robot-interaction

“Integrating intuitive physics with Bayesian inverse planning is necessary for human-level social perception in physically grounded scenes.”
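The inverse-planning half of this idea can be sketched as goal inference on a toy 1D world. This is a simplified illustration, not the SIMPLE model: it omits the physics simulation entirely, and the Boltzmann-rational policy, the `beta` rationality parameter, and the goal positions are all assumptions made for the example.

```python
import math

ACTIONS = (-1, 0, 1)  # step left, stay, step right

def action_likelihood(position, action, goal, beta=2.0):
    """Assumed Boltzmann-rational policy: actions that end closer to the
    goal get exponentially higher probability."""
    utilities = {a: -abs((position + a) - goal) for a in ACTIONS}
    normalizer = sum(math.exp(beta * u) for u in utilities.values())
    return math.exp(beta * utilities[action]) / normalizer

def goal_posterior(start, actions, goals, prior=None):
    """P(goal | actions) is proportional to P(goal) times the product of
    per-step action likelihoods under that goal."""
    prior = prior or {g: 1.0 / len(goals) for g in goals}
    scores = {}
    for goal in goals:
        position, likelihood = start, 1.0
        for action in actions:
            likelihood *= action_likelihood(position, action, goal)
            position += action
        scores[goal] = prior[goal] * likelihood
    total = sum(scores.values())
    return {goal: score / total for goal, score in scores.items()}

# An agent starting at 0 takes three steps to the right; which goal is it pursuing?
posterior = goal_posterior(start=0, actions=[1, 1, 1], goals=[5, -5])
```

In a physics-grounded variant, the likelihood term would instead score observed trajectories against simulated outcomes of the agent's planned forces, which is the component this toy leaves out.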

blog / benjaminnetanyahu / Jul 4 / failed

Benjamin Netanyahu

PM Netanyahu's Remarks at US Independence Day Event