ENPH 353 - HTTP 418 Autonomous Robot

Simulated world map

Overview

This project was our final competition entry for ENPH 353. The goal was to build a fully autonomous robot in simulation that could drive around a city map, read clue boards, and avoid pedestrians and vehicles using only its onboard camera. We named our team "HTTP 418" and built the system as a set of ROS nodes that could be tested independently but run as one coordinated pipeline.

I worked with Joshua Himmens on the project. Josh led the driving stack, and I built the vision and OCR pipeline for clue detection.

This is a small post about the project; the majority of the work is documented in the final report below.


Project Overview

The core loop looked like this:

  1. Drive using an imitation-learning model running in an ONNX inference node.
  2. Detect crosswalk activity and pause for pedestrians and vehicles.
  3. Spot the blue clue boards, crop them, and run OCR to read letters.
  4. Aggregate clue detections over time and publish the best guess to the scoring node.
  5. Recover from crashes by detecting when the robot is stuck and resetting its pose in simulation.
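Step 4, aggregating noisy per-frame OCR readings into one best guess, can be sketched as a per-character majority vote. This is an illustrative sketch, not our exact aggregator; the function name and the length-filtering heuristic are assumptions:

```python
from collections import Counter

def aggregate_clue(readings: list[str]) -> str:
    """Majority-vote each character position across noisy OCR readings.

    Readings are first filtered to the most commonly observed length, so
    a single mis-segmented frame cannot shift every character position.
    """
    if not readings:
        return ""
    # Keep only readings of the most frequently observed length.
    common_len = Counter(len(r) for r in readings).most_common(1)[0][0]
    candidates = [r for r in readings if len(r) == common_len]
    # Vote independently at each character position.
    return "".join(
        Counter(chars).most_common(1)[0][0]
        for chars in zip(*candidates)
    )

print(aggregate_clue(["TUNNEL", "TUNNE1", "TUNNEL", "TUNMEL"]))  # TUNNEL
```

Voting over time like this lets individual frames be wrong as long as the detector is right more often than not, which is why publishing to the scoring node only after several consistent frames is safer than trusting any single read.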

My Contributions


Challenges

  1. OCR in the wild: individual characters were easy to recognize in isolation during training, but reading full clue boards in context, with perspective, motion blur, and lighting variation, was much harder.
  2. Performance tradeoffs: we needed fast inference for driving and slower, higher-accuracy models for clue reading.
  3. Reproducibility: models, ROS, and GPU dependencies made "it works on my machine" a real risk.
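For the reproducibility risk, pinning every dependency in one place helped keep environments in sync. A hypothetical sketch of that idea (package names and versions here are illustrative, not our actual lockfile):

```text
# requirements.txt -- versions shown are illustrative pins
onnxruntime==1.16.3
opencv-python==4.8.1.78
numpy==1.24.4
```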

Technical Highlights

ROS System Architecture

We split the system into nodes for driving, pedestrian tracking, clue detection, clue collection, and crash recovery. This made the system easier to debug and helped keep slow perception nodes from blocking real-time control.
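Outside of ROS, the core idea of keeping a slow perception node from blocking real-time control can be sketched with a size-one queue between two threads; in the real system this role was played by ROS topics with small queue sizes, and all names below are illustrative:

```python
import queue
import threading

# A queue of size 1 mimics a topic with queue_size=1: the control loop
# always sees the newest perception result and never waits on a slow node.
latest = queue.Queue(maxsize=1)

def slow_perception(frames):
    """Stand-in for a slow clue-detection node (seconds per frame)."""
    for frame in frames:
        result = f"clue-from-{frame}"
        # Drop the stale result if the control loop hasn't consumed it yet.
        try:
            latest.get_nowait()
        except queue.Empty:
            pass
        latest.put(result)

def control_step(default="keep-driving"):
    """Fast control loop: use the newest clue if ready, else a default action."""
    try:
        return latest.get_nowait()
    except queue.Empty:
        return default

def run_demo(frames):
    t = threading.Thread(target=slow_perception, args=(frames,))
    t.start()
    t.join()
    # First call sees only the newest result; second finds the queue empty.
    return control_step(), control_step()

print(run_demo(["f1", "f2"]))  # ('clue-from-f2', 'keep-driving')
```

The drop-oldest behavior is the important part: control acts on fresh data or a safe default, rather than queueing up stale perception results.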

Vision and OCR

Training and Tooling


Repository


Other Media