I'm an incoming PhD student in the Electrical and Computer Engineering department at Northeastern University's College of Engineering. I work under the supervision of Prof. Sarah Ostadabbas in the Augmented Cognition Lab (ACLab). My research centers on motion-centric video understanding and reasoning — building systems that don't merely extrapolate visual patterns, but deduce causal structure from observed motion. I'm particularly interested in how vision-language models can move beyond statistical pattern completion toward genuine physical and spatial reasoning. This connects to broader work in multi-object tracking, human-machine interaction, and medical AI.
Bishoy M. Galoaa
PhD Student • Machine Learning Researcher • Engineer
"The real question is not whether machines think but whether men do. The mystery which surrounds a thinking machine already surrounds a thinking man."
– B.F. Skinner
About Me
Motion-Centric Reasoning & Vision-Language Models
A core challenge I'm pursuing: most current VLMs act as stochastic extrapolators. Trained on massive visual corpora, they are biased toward the statistically probable — a pendulum seen swinging to Point B is predicted to return exactly to Point A, because that completes a symmetric arc. But this ignores the physics: initial conditions, energy dissipation, non-conservative forces. The model pattern-matches; it doesn't reason.
The Extrapolator vs. The Visual Observer
Extrapolator (current VLMs): sees the arc → assumes a periodic function → predicts completion. Prioritizes visual symmetry over physical constraints.
Visual Observer (the goal): observes the initial state, recognizes constraints (fixed pivot, gravity, friction), accounts for hidden dissipative variables → deduces that the return height must satisfy h_final < h_initial.
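The observer's deduction can be checked numerically. Here is a minimal sketch of a damped pendulum integrated with semi-implicit Euler (all parameter values are illustrative, not from any specific experiment): with any dissipation present, the first return peak lands strictly below the release height.

```python
import math

# Damped pendulum: theta'' = -(g/L) * sin(theta) - c * theta'
# Integrate with small time steps and record the swing amplitude at each turning point.
g, L, c, dt = 9.81, 2.0, 0.05, 0.001   # illustrative parameters

theta, omega = math.radians(60.0), 0.0  # released from rest at 60 degrees
peaks = []
prev_omega = omega
for _ in range(int(20.0 / dt)):         # simulate 20 seconds
    alpha = -(g / L) * math.sin(theta) - c * omega
    omega += alpha * dt                 # semi-implicit Euler: update velocity first
    theta += omega * dt
    # A sign change in angular velocity marks a turning point (a swing peak)
    if prev_omega != 0.0 and prev_omega * omega < 0.0:
        peaks.append(abs(theta))
    prev_omega = omega

def h(t):
    """Height of the bob above its lowest point, for amplitude angle t."""
    return L * (1.0 - math.cos(t))

print(h(math.radians(60.0)))  # release height
print(h(peaks[0]))            # height at the first return: strictly lower
```

Each successive peak is lower than the last — the "symmetric arc" an extrapolator would complete never actually happens.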
Walter Lewin famously put his nose — and his life — on the line to demonstrate this principle:
Prof. Walter Lewin releases a 15 kg pendulum from his chin. He trusts the physics: released from rest, the bob can never return above its starting height, while even a slight push at release would smash his face. Energy is conserved, not extrapolated.
My work on motion-centric systems aims to bridge this gap: from pattern completion to causal deduction.
- Motion-Centric Video Understanding: Query-free motion discovery and description systems that autonomously identify and describe events in videos, and text-to-motion generation (Lang2Motion) enabling natural language control of motion synthesis.
- Multi-Object Tracking: Transformer-enhanced and graph-based algorithms for tracking in complex environments — including occlusions, crowded scenes, and multi-camera setups.
- Spatial Reasoning in VLMs: Learning structured spatial and counting reasoning from pedagogically organized video content.
- Human-Machine Interaction: Motion analysis and uncertainty-aware anomaly detection for exoskeleton control and rehabilitation.
- Medical AI: Personalized prognostic models for oncology via interpretable machine learning.
Publications
2026
- MoReGen: Multi-Agent Motion-Reasoning Engine for Code-based Text-to-Video Synthesis
  X. Bai, H. Liang, B. Galoaa, et al. — CVPR 2026
- UniTrack: Differentiable Graph Representation Learning for Multi-Object Tracking
  B. Galoaa, X. Bai, U. Nandi, S. Amraee, S. Ostadabbas — ICLR 2026 Nectar Track · Spotlight Oral · 3DV 2026
- K-Track: Kalman-Enhanced Tracking for Accelerating Deep Point Trackers on Edge Devices
  B. Galoaa, P. Closas, S. Ostadabbas — WACVW 2026
- LAPA: Look Around and Pay Attention — Multi-camera Point Tracking Reimagined with Transformers
  B. Galoaa, X. Bai, S. Moezzi, et al. — 3DV 2026 · Oral · Best Paper Nominee
2025
- SPARTAN: Spatiotemporal Pose-Aware Retrieval for Text-Guided Autonomous Navigation
  X. Bai, S. A. Sreeramagiri, B. Galoaa, et al. — BMVC 2025
- More Than Meets the Eye: Enhancing Multi-Object Tracking with Softmax Splatting and Optical Flow
  B. Galoaa, S. Amraee, S. Ostadabbas — ICML 2025
- Dragontrack: Transformer-Enhanced Graphical Multi-Person Tracking
  B. Galoaa, S. Amraee, S. Ostadabbas — WACV 2025
- Classification of Infant Sleep–Wake States from Natural Overnight In-Crib Videos
  S. Moezzi, M. Wan, B. Galoaa, et al. — WACVW 2025
- Advancing Prognostics in Oncology: ML Models for Predicting Survival in Undifferentiated Pleomorphic Sarcoma
  A. G. Girgis, B. M. Galoaa, et al. — Annals of Surgical Oncology, 2025
- Predicting Long-Term Survival in Myxofibrosarcoma
  S. Rampam, A. G. Girgis, B. M. Galoaa, et al. — Surgical Oncology, 2025
- Extraskeletal Osteosarcoma: MicroRNA Patterns
  S. A. Lozano-Calderon, B. M. Galoaa, et al. — CTOS 2025
- Real-Time Uncertainty Detection for Safe, Adaptive Exoskeleton Control
  B. Galoaa et al. — ICRA Workshops 2025
2024
- Multiple Toddler Tracking in Indoor Videos
  S. Amraee, B. Galoaa, et al. — WACVW 2024
- A Personalized Predictive Model for Salivary Gland Cancer
  A. Girgis, B. Galoaa, A. Devaiah — COSM 2024
- A Novel AI Model for Optimizing Treatment of Salivary Gland Malignancies
  A. Girgis, B. Galoaa, A. Devaiah — AAO-HNSF 2024
- Bias or Best Fit? SEER vs. NCDB in ML for Osteosarcoma Survival
  A. G. Girgis, B. M. Galoaa, et al. — Clinical Orthopaedics and Related Research, 2024
- Machine Learning–Assisted Decision Making in Orthopaedic Oncology
  P. A. Rizk, M. R. Gonzalez, B. M. Galoaa, et al. — accepted for publication
Preprints & Under Review
- Structured Over Scale: Learning Spatial Reasoning from Educational Video
  B. Galoaa, X. Bai, S. Ostadabbas — under review, 2026
- Track and Caption Any Motion: Query-Free Motion Discovery and Description in Videos
  B. Galoaa, S. Ostadabbas — under review, 2026
- Lang2Motion: Bridging Language and Motion Through Joint Embedding Spaces
  B. Galoaa, X. Bai, S. Ostadabbas — under review, 2026
- Uncertainty-Aware Ankle Exoskeleton Control
  F. M. Tourk, B. Galoaa, S. Shajan, A. J. Young, M. Everett, M. K. Shepherd — arXiv:2508.21221, 2025
- Cognitive Learning through Hierarchical Prototypes and Dynamic Focus
  B. Galoaa, S. Ostadabbas — under review, 2025
- ML Algorithms for Survival Prediction in Synovial Sarcoma
  J. O. Werenski, S. Rampam, B. Galoaa, et al. — under review
30-Day Novel Ideas Challenge
A creative experiment: over one month, I set out to develop one novel idea per day.
- Inattention NotaBene – A regularization method that strategically "forgets" less important features through a stacked dropout mechanism, offering an alternative to traditional attention mechanisms.
- ROCKET – A path-planning system that identifies collision paths first, using inverse collision sampling to recover optimal trajectories in complex environments.
- Secretary Template Matching – An online template-matching algorithm inspired by the Secretary Problem, dynamically adjusting its acceptance threshold from observed data for improved real-time decision-making.
- 2F1B – An optimization technique that introduces controlled oscillation into neural-network training by alternating two forward steps with one backward step.
- Knock-Knock – An optimization algorithm inspired by bat echolocation, emitting "echo signals" to navigate complex loss landscapes.
Originally shared as a public challenge on LinkedIn.
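To give a flavor of one of these ideas, Secretary Template Matching can be sketched with classic secretary-style thresholding (my own minimal illustration; the original prototype's details may differ): observe the first n/e candidate match scores without committing, then accept the first later score that beats the best observed so far.

```python
import math

def secretary_pick(scores):
    """Online selection over a stream of match scores.

    Observe the first n/e scores to calibrate a threshold, then commit to
    the first later score exceeding it. Falls back to the final candidate
    if none qualifies (the standard secretary-problem rule).
    Returns the index of the chosen score.
    """
    n = len(scores)
    cutoff = max(1, int(n / math.e))        # observation phase length
    threshold = max(scores[:cutoff])        # best score seen while observing
    for i in range(cutoff, n):
        if scores[i] > threshold:
            return i                        # commit irrevocably
    return n - 1                            # never saw a better one

# n = 6 -> cutoff = 2, threshold = max(0.3, 0.1) = 0.3;
# the first later score above 0.3 is index 2 (0.4), so we commit early
# and miss the global best (0.9 at index 4) — the price of deciding online.
print(secretary_pick([0.3, 0.1, 0.4, 0.2, 0.9, 0.5]))  # → 2
```

The n/e cutoff is the classical choice: for the problem of picking the single best candidate from a uniformly random stream, it selects the best with probability approaching 1/e.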
Awards
- Best Paper Award Nominee — International Conference on 3D Vision (3DV 2026)
  For "Look Around and Pay Attention: Multi-camera Point Tracking Reimagined with Transformers"
- Best Poster Presentation Award — COSM-AHNSF (Spring 2025)
- COE Outstanding Graduate Student Award — Northeastern University (2025)
- COE Outstanding Graduate Student Award — Northeastern University (2024)
- Best of Scientific Orals — AAO-HNSF Annual Meeting & OTO EXPO (2024)