ML Researcher & Data Scientist

Arjun Raj

I work on Machine Learning

I'm graduating from the Australian National University in 2026 with a Bachelor of Advanced Computing (R&D) (Honours), specialising in machine learning. My research sits in computer vision and 3D reconstruction: my honours thesis adapts 3D foundation models to recover the hidden structure of indoor scenes, and I have published work on object tracking and video understanding at venues including ICASSP and NeurIPS. Alongside research, I work as a data scientist building large-scale geospatial and biodiversity data pipelines. I like working at the boundary between research and engineering, taking an idea from a paper through to something that runs reliably at scale.

Research Interests

Machine Learning · Computer Vision · 3D Reconstruction · Biodiversity & Ecological ML

Education

Bachelor of Advanced Computing (R&D) (Honours)

Jul 2022 to Jul 2026

Australian National University

Machine Learning Specialisation

  • Completed an honours thesis on room envelope 3D reconstruction using 3D foundation models (VGGT).
  • Recipient of the Chancellor's International Scholarship.
  • Former President, Formula Motorsports Club.
  • Former Research Portfolio Officer, Undergraduate Research Society.

Research

ANU Honours Thesis 2026

LayoutLens: Room Envelope Reconstruction with VGGT

Arjun Raj, Dylan Campbell

Honours thesis on predicting the amodal room envelope (the walls, floor, and ceiling hidden behind furniture and clutter) as dense per-pixel geometry. Adapts the VGGT 3D foundation model with a learned layout-depth head, roughly halving reconstruction error versus the base model and outperforming classical plane, Manhattan, and cuboid fitting.

Thesis/Paper coming soon · Code

To Appear in PlantCLEF 2026 Working Notes

Fine-Tuning of BioCLIP 2.5 with Taxonomic Heads for Multi-Species Plant Identification

Arjun Raj, Manindra de Mel, Razeen Wasif, William Brake

ANU's 7th-place submission to the PlantCLEF 2026 multi-species plant identification challenge.

ICASSP 2025

TrackNetV4: Enhancing Fast Sports Object Tracking with Motion Attention Maps

Arjun Raj, Lei Wang, Tom Gedeon

A motion-attention approach for fast, accurate sports object tracking.

NeurIPS 2024 (D&B Track)

Advancing Video Anomaly Detection: A Concise Review and a New Dataset

Liyun Zhu, Lei Wang, Arjun Raj, Tom Gedeon, Chen Chen

A concise review of video anomaly detection accompanied by a new benchmark dataset.

ICONAT 2022

Comparing Computer Resource Usage Through Interpolating Global Ecosystem Dynamics Investigation (GEDI) LiDAR Waveform Data

Arjun Raj, Andrew Charles Baker

Comparing compute resource usage across methods for interpolating GEDI LiDAR waveform data.

Experience

Data Scientist

Nov 2024 to Present

NatureHelm

Building and operating production geospatial ETL pipelines in Python on AWS (Batch, Lambda, Step Functions) with a PostGIS data store and Docker, processing large-scale satellite and biodiversity datasets. I develop spatial-analysis modules with the geospatial Python stack (GeoPandas, Shapely, rasterio/GDAL, scikit-image) for land-cover change and landscape-connectivity analyses, integrate public data sources such as GBIF, GRIIS, and Google Earth Engine Dynamic World, and harden the pipeline with automated data-quality checks (Soda Core) and tests. I also contribute to the backend API and web portal (TypeScript, NestJS, React), including interactive map visualisations.

Data Scientist Intern

Jul 2024 to Oct 2024

NatureHelm

Integrated biodiversity scoring, vegetation analytics, and interactive ecological visualisations into the NatureHelm platform. Built scalable ELT pipelines, integrated AWS services (S3, Lambda, EC2), and designed RESTful APIs, alongside large-scale performance optimisations, documentation, and cross-functional testing.

Projects

Context-First Prompting for LLM Unit-Test Generation

A five-stage, context-first pipeline that generates JUnit tests with GPT-4o-mini using structured program-analysis context (Jimple IR, predicate strings, branch summaries) and a feedback-driven compile-and-repair loop. Evaluated on Defects4J, reaching around 94% branch and line coverage and exposing all targeted faults. University research project at ANU.

LLMs · Prompt Engineering · Program Analysis · Java · Defects4J
Slides · Code coming soon · Paper coming soon

Local AI Health Assistant

RAG-powered health assistant with SolidPod storage and multi-platform querying, exposed through a Flutter application and CLI.

RAG · Flutter · Python · Ollama · LangChain

AI-Powered Surgery Scheduler

Predicts surgical case length to optimise operating-list planning and throughput by skill level. Top-5 finalist at AI Hack Melbourne 2023.

Python · TensorFlow · OpenAI API

Maze of Enchantment

A Java-based desktop maze game featuring strategic, puzzle-driven challenges.

Java · Maven · Docker

Quirkus Educational App

Android application designed for students, staff, and educational institutions to communicate, interact, and learn more efficiently.

Android · Mobile

Certifications

  • Microsoft Certified: Power Platform Fundamentals · Microsoft · Apr 2025
  • TensorFlow Developer · Google · Jun 2021
  • Oracle Certified Associate, Java SE 8 Programmer · Oracle · Jan 2021
  • Certified Associate in Python Programming (PCAP) · Python Institute · Dec 2020