absorb.md

Jim Fan

Chronological feed of everything captured from Jim Fan.

Michelle Receives Congratulations from Jim Fan in Hourly X Feed Poll

Jim Fan's hourly poll on his X feed features a user note congratulating Michelle. This indicates a celebratory mention amid routine feed monitoring. No further context on Michelle's achievement is provided.

drjimfan starred huggingface/pytorch-image-models

The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNetV4, MobileNet-V3 & V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more. Stars: 36682

drjimfan starred hashicorp/terraform

Terraform enables you to safely and predictably create, change, and improve infrastructure. It is a source-available tool that codifies APIs into declarative configuration files that can be shared amongst team members, treated as code, edited, reviewed, and versioned. Stars: 48174

Chain-of-Sight Enables VLM-Native 3D Detection via Sequential Token Prediction

LocateAnything3D reframes 3D object detection as a next-token prediction task for vision-language models (VLMs) using a Chain-of-Sight (CoS) sequence. The model first generates 2D detections as a visual chain of thought, then predicts 3D boxes in near-to-far object order, factorizing each box into center, dimensions, and rotation for stability. The approach reaches state-of-the-art results on Omni3D with 38.90 AP_3D, outperforming prior methods by +13.98, even baselines that are given ground-truth 2D inputs, while preserving open-vocabulary and zero-shot generalization.
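The ordering described in this summary can be illustrated with a small sketch. It is not the LocateAnything3D implementation: the `SceneObject` class, the `chain_of_sight` function, and the angle-bracket token strings are hypothetical names invented here, and the sketch assumes that "near-to-far" means sorting by camera-frame depth and that all 2D entries precede the 3D entries. The actual tokenization and interleaving may differ.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SceneObject:
    """Hypothetical container for one annotated object (not from the paper)."""
    label: str
    box2d: Tuple[int, int, int, int]     # x1, y1, x2, y2 in pixels
    center: Tuple[float, float, float]   # x, y, z in the camera frame; z is depth
    dims: Tuple[float, float, float]     # width, height, length
    yaw: float                           # rotation about the vertical axis


def _fmt(values) -> str:
    """Format numbers with two decimals, comma-separated."""
    return ",".join(f"{v:.2f}" for v in values)


def chain_of_sight(objects: List[SceneObject]) -> List[str]:
    """Serialize objects into an illustrative Chain-of-Sight style sequence.

    Assumption: objects are ordered near-to-far by depth (center z).
    """
    ordered = sorted(objects, key=lambda o: o.center[2])

    tokens: List[str] = []
    # Stage 1: 2D detections, playing the role of the visual chain of thought.
    for o in ordered:
        coords = ",".join(str(c) for c in o.box2d)
        tokens.append(f"<2d:{o.label}:{coords}>")

    # Stage 2: 3D boxes, each factorized into center, dimensions, rotation.
    for o in ordered:
        tokens.append(f"<center:{o.label}:{_fmt(o.center)}>")
        tokens.append(f"<dims:{o.label}:{_fmt(o.dims)}>")
        tokens.append(f"<rot:{o.label}:{o.yaw:.2f}>")
    return tokens


# Hypothetical example scene: the table is listed first but is farther away.
scene = [
    SceneObject("table", (50, 60, 200, 220), (0.5, 0.0, 4.0), (1.2, 0.8, 2.0), 0.1),
    SceneObject("chair", (10, 20, 30, 40), (-0.3, 0.0, 1.5), (0.5, 0.9, 0.5), 1.57),
]
sequence = chain_of_sight(scene)
```

In this sketch the nearer chair is serialized before the table in both stages, so a model trained on such sequences would see the closest object first and emit one factor of each 3D box at a time.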
