Bishoy M. Galoaa

PhD Student • Machine Learning Researcher • Engineer

"The real question is not whether machines think but whether men do. The mystery which surrounds a thinking machine already surrounds a thinking man."
– B.F. Skinner

About Me

I'm an incoming PhD student in the Electrical and Computer Engineering department at Northeastern University's College of Engineering. I work under the supervision of Prof. Sarah Ostadabbas in the Augmented Cognition Lab (ACLab). My research centers on motion-centric video understanding and reasoning — building systems that don't merely extrapolate visual patterns, but deduce causal structure from observed motion. I'm particularly interested in how vision-language models can move beyond statistical pattern completion toward genuine physical and spatial reasoning. This connects to broader work in multi-object tracking, human-machine interaction, and medical AI.

Motion-Centric Reasoning & Vision-Language Models

A core challenge I'm pursuing: most current VLMs act as stochastic extrapolators. Trained on massive visual corpora, they are biased toward the statistically probable — a pendulum seen swinging to Point B is predicted to return to Point A, because that completes a symmetric arc. But this ignores the physics: initial conditions, energy dissipation, non-conservative forces. The model pattern-matches; it doesn't reason.

The Extrapolator vs. The Visual Observer

Extrapolator (current VLMs): sees the arc → assumes a periodic function → predicts completion. Prioritizes visual symmetry over physical constraints.

Visual Observer (the goal): observes initial state, recognizes constraints (fixed pivot, gravity, friction), accounts for hidden dissipative variables → deduces that the return height must be h_final < h_initial.

Walter Lewin famously put his nose — and his life — on the line to demonstrate this principle:

Prof. Walter Lewin releases a 15 kg pendulum from his chin and stands perfectly still. He trusts the physics: with only dissipative losses, the ball can never return higher than its release point, while even a slight push at release would add energy and smash his face. Energy is conserved, not extrapolated.

My work on motion-centric systems aims to bridge this gap: from pattern completion to causal deduction.
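The observer's deduction above — that dissipation forces the return amplitude below the release amplitude — can be checked numerically. This is a minimal illustrative sketch, not code from my research: a damped pendulum integrated with semi-implicit Euler, where the length, damping coefficient, and release angle are arbitrary choices for the demonstration.

```python
import math

# Damped pendulum: theta'' = -(g/L) * sin(theta) - b * theta'
g, L, b = 9.81, 1.0, 0.2          # gravity, length, damping coefficient
dt = 1e-4                         # integration step (seconds)
theta, omega = math.radians(60), 0.0   # released from rest at 60 degrees
theta0 = theta

# Record the first two turning points (where angular velocity changes sign):
# the far-side peak of the first swing, then the return peak on the near side.
turning_points = []
prev_omega = omega
for _ in range(200_000):
    omega += (-(g / L) * math.sin(theta) - b * omega) * dt  # semi-implicit Euler
    theta += omega * dt
    if prev_omega != 0.0 and prev_omega * omega < 0:
        turning_points.append(theta)
        if len(turning_points) == 2:
            break
    prev_omega = omega

far_side, return_peak = turning_points
# Damping guarantees the pendulum comes back below its release angle,
# hence h_final = L * (1 - cos(return_peak)) < h_initial.
print(math.degrees(return_peak) < math.degrees(theta0))
```

A pure pattern-completer would predict `return_peak == theta0` (the symmetric arc); the simulation, like the observer, deduces a strictly lower return.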

  • Motion-Centric Video Understanding: Query-free motion discovery and description systems that autonomously identify and describe events in videos, and text-to-motion generation (Lang2Motion) enabling natural language control of motion synthesis.
  • Multi-Object Tracking: Transformer-enhanced and graph-based algorithms for tracking in complex environments — including occlusions, crowded scenes, and multi-camera setups.
  • Spatial Reasoning in VLMs: Learning structured spatial and counting reasoning from pedagogically-organized video content.
  • Human-Machine Interaction: Motion analysis and uncertainty-aware anomaly detection for exoskeleton control and rehabilitation.
  • Medical AI: Personalized prognostic models for oncology via interpretable machine learning.

Publications

2026

  • MoReGen: Multi-Agent Motion-Reasoning Engine for Code-based Text-to-Video Synthesis
    X. Bai, H. Liang, B. Galoaa, et al. — CVPR 2026
  • UniTrack: Differentiable Graph Representation Learning for Multi-Object Tracking
    B. Galoaa, X. Bai, U. Nandi, S. Amraee, S. Ostadabbas — ICLR 2026 Nectar Track · Spotlight Oral · 3DV 2026
  • K-Track: Kalman-Enhanced Tracking for Accelerating Deep Point Trackers on Edge Devices
    B. Galoaa, P. Closas, S. Ostadabbas — WACVW 2026
  • LAPA: Look Around and Pay Attention — Multi-camera Point Tracking Reimagined with Transformers
    B. Galoaa, X. Bai, S. Moezzi, et al. — 3DV 2026 (Oral, Best Paper Nominee)

2025

  • SPARTAN: Spatiotemporal Pose-Aware Retrieval for Text-Guided Autonomous Navigation
    X. Bai, S. A. Sreeramagiri, B. Galoaa, et al. — BMVC 2025
  • More Than Meets the Eye: Enhancing Multi-Object Tracking with Softmax Splatting and Optical Flow
    B. Galoaa, S. Amraee, S. Ostadabbas — ICML 2025
  • Dragontrack: Transformer-Enhanced Graphical Multi-Person Tracking
    B. Galoaa, S. Amraee, S. Ostadabbas — WACV 2025
  • Classification of Infant Sleep–Wake States from Natural Overnight In-Crib Videos
    S. Moezzi, M. Wan, B. Galoaa, et al. — WACVW 2025
  • Advancing Prognostics in Oncology: ML Models for Predicting Survival in Undifferentiated Pleomorphic Sarcoma
    A. G. Girgis, B. M. Galoaa, et al. — Annals of Surgical Oncology, 2025
  • Predicting Long-Term Survival in Myxofibrosarcoma
    S. Rampam, A. G. Girgis, B. M. Galoaa, et al. — Surgical Oncology, 2025
  • Extraskeletal Osteosarcoma: MicroRNA Patterns
    S. A. Lozano-Calderon, B. M. Galoaa, et al. — CTOS 2025
  • Real-Time Uncertainty Detection for Safe, Adaptive Exoskeleton Control
    B. Galoaa et al. — ICRA Workshops 2025

2024

  • Multiple Toddler Tracking in Indoor Videos
    S. Amraee, B. Galoaa, et al. — WACVW 2024
  • A Personalized Predictive Model for Salivary Gland Cancer
    A. Girgis, B. Galoaa, A. Devaiah — COSM 2024
  • A Novel AI Model for Optimizing Treatment of Salivary Gland Malignancies
    A. Girgis, B. Galoaa, A. Devaiah — AAO-HNSF 2024
  • Bias or Best Fit? SEER vs. NCDB in ML for Osteosarcoma Survival
    A. G. Girgis, B. M. Galoaa, et al. — Clinical Orthopaedics and Related Research, 2024
  • Machine Learning–Assisted Decision Making in Orthopaedic Oncology
    P. A. Rizk, M. R. Gonzalez, B. M. Galoaa, et al. — accepted for publication

Preprints & Under Review

  • Structured Over Scale: Learning Spatial Reasoning from Educational Video
    B. Galoaa, X. Bai, and S. Ostadabbas — under review, 2026
  • Track and Caption Any Motion: Query-Free Motion Discovery and Description in Videos
    B. Galoaa and S. Ostadabbas — under review, 2026
  • Lang2Motion: Bridging Language and Motion Through Joint Embedding Spaces
    B. Galoaa, X. Bai, and S. Ostadabbas — under review, 2026
  • Uncertainty-Aware Ankle Exoskeleton Control
    F. M. Tourk, B. Galoaa, S. Shajan, A. J. Young, M. Everett, and M. K. Shepherd — arXiv preprint arXiv:2508.21221, 2025
  • Cognitive Learning through Hierarchical Prototypes and Dynamic Focus
    B. Galoaa, S. Ostadabbas — under review, 2025
  • ML Algorithms for Survival Prediction in Synovial Sarcoma
    J. O. Werenski, S. Rampam, B. Galoaa, et al. — under review

30-Day Novel Ideas Challenge

This section highlights a creative experiment in which I set out to develop one novel idea per day for a month.

  • Inattention NotaBene – A novel regularization method that strategically "forgets" less important features through a stacked dropout mechanism, offering an alternative to traditional attention mechanisms.
  • ROCKET – An innovative path planning system that identifies collision paths first to find optimal trajectories in complex environments using inverse collision sampling.
  • Secretary Template Matching – An online template matching algorithm inspired by the Secretary Problem, dynamically adjusting thresholds based on observed data patterns for improved real-time decision-making.
  • 2F1B – A novel optimization technique introducing controlled oscillation in neural network training by alternating two forward steps with one backward step, enhancing the optimization trajectory.
  • Knock-Knock – An optimization algorithm inspired by bat echolocation, emitting "echo signals" to navigate complex loss landscapes effectively.
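Of these, the secretary-style rule is the most self-contained to illustrate. This is a toy sketch of the classic Secretary Problem policy it builds on — observe the first n/e candidates without committing, then accept the first score that beats the best seen — using synthetic match scores. The dynamic threshold adjustment that distinguishes Secretary Template Matching is not modeled here.

```python
import math
import random

def secretary_select(scores):
    """Pick a candidate index online: observe, set a benchmark, then commit."""
    n = len(scores)
    cutoff = max(1, int(n / math.e))          # observation phase: first n/e items
    benchmark = max(scores[:cutoff])          # best score seen while only watching
    for i in range(cutoff, n):
        if scores[i] > benchmark:             # first later score beating the benchmark
            return i
    return n - 1                              # forced to take the last candidate

random.seed(0)
scores = [random.random() for _ in range(100)]  # synthetic template-match scores
idx = secretary_select(scores)
print(idx, round(scores[idx], 3))
```

The appeal for real-time matching is that the rule commits in a single pass with no lookahead, trading a bounded chance of suboptimality for constant memory and immediate decisions.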

Originally shared as a public challenge on LinkedIn.

Awards

  • Best Paper Award Nominee — International Conference on 3D Vision (3DV 2026)
    For "Look Around and Pay Attention: Multi-camera Point Tracking Reimagined with Transformers"
  • Best Poster Presentation Award — COSM-AHNSF (Spring 2025)
  • COE Outstanding Graduate Student Award — Northeastern University (2025)
  • COE Outstanding Graduate Student Award — Northeastern University (2024)
  • Best of Scientific Orals — AAO-HNSF Annual Meeting & OTO EXPO (2024)