
Autonomous Robotic Foosball

C++ / Nvidia Jetson / TensorRT / Arduino

Hardware Implementation

3D Rendering
Fig 1. Initial 3D CAD design for the actuation assembly, targeting the dimensions of a standard Tornado table.
Lab Setup
Fig 2. The constructed system in the Duke CPSL. The frame supports the NEMA-17 stepper motors and belt drives required for lateral and rotational control.

The hardware architecture separates high-level perception from low-level motor actuation. An Nvidia Jetson AGX Orin handles vision processing on a Logitech BRIO camera stream cropped to the playfield. We convert each frame to the HSV color space and apply color-range thresholding to isolate the blue ball and the yellow end-of-rod markers; a centroid calculation on the largest contour gives the ball's position. We estimate velocity by averaging the positional delta over a sliding window of recent frames and project the linear trajectory to find where it intersects each rod's x-plane. The Jetson then transmits move and rotate commands to the Arduino, which drives the stepper motors for lateral and rotational (kick) movement.
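A minimal sketch of this ball-tracking path using OpenCV's C++ API (C++17). The HSV bounds, window handling, and function names are illustrative assumptions, not the values tuned on the actual table, and wall bounces are ignored in the trajectory projection:

```cpp
#include <opencv2/opencv.hpp>
#include <algorithm>
#include <cmath>
#include <deque>
#include <optional>
#include <vector>

// Hypothetical HSV bounds for the blue ball; the real values are tuned
// to the table lighting and are not given in this write-up.
static const cv::Scalar kBallLo(100, 120, 70);
static const cv::Scalar kBallHi(130, 255, 255);

// Returns the centroid of the largest blue contour in the frame, if any.
std::optional<cv::Point2f> detectBall(const cv::Mat& bgrFrame) {
    cv::Mat hsv, mask;
    cv::cvtColor(bgrFrame, hsv, cv::COLOR_BGR2HSV);
    cv::inRange(hsv, kBallLo, kBallHi, mask);

    std::vector<std::vector<cv::Point>> contours;
    cv::findContours(mask, contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);
    if (contours.empty()) return std::nullopt;

    // Pick the largest contour and take its centroid from image moments.
    auto largest = std::max_element(contours.begin(), contours.end(),
        [](const std::vector<cv::Point>& a, const std::vector<cv::Point>& b) {
            return cv::contourArea(a) < cv::contourArea(b);
        });
    cv::Moments m = cv::moments(*largest);
    if (m.m00 <= 0.0) return std::nullopt;
    return cv::Point2f(static_cast<float>(m.m10 / m.m00),
                       static_cast<float>(m.m01 / m.m00));
}

// Averages frame-to-frame deltas over a sliding window of recent positions.
cv::Point2f averageVelocity(const std::deque<cv::Point2f>& recent) {
    if (recent.size() < 2) return cv::Point2f(0.f, 0.f);
    cv::Point2f v(0.f, 0.f);
    for (size_t i = 1; i < recent.size(); ++i) v += recent[i] - recent[i - 1];
    float n = static_cast<float>(recent.size() - 1);
    return cv::Point2f(v.x / n, v.y / n);
}

// Projects the straight-line path forward to a rod's x-plane and returns
// the y-coordinate the rod should move to (wall bounces not handled here).
float interceptYAtRod(const cv::Point2f& pos, const cv::Point2f& vel, float rodX) {
    if (std::abs(vel.x) < 1e-3f) return pos.y;   // ball moving parallel to the rod
    float t = (rodX - pos.x) / vel.x;            // time-to-intercept in frames
    return pos.y + vel.y * t;
}
```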

Vision Pipeline
Fig 3. The dual-path perception pipeline. Ball tracking uses traditional CV (HSV thresholding) for low latency, while rod rotation is inferred by a quantized neural network.

State Estimation & Control

For rotational control, we trained four EfficientNet-B1 regression models on a custom dataset collected by commanding the motors to sweep through the full range of angular and lateral states. We optimized the models with Post-Training Static Quantization to INT8 and exported them as TensorRT engines; inference runs in parallel CUDA streams to sustain 30 fps throughput.

An Arduino Mega receives target step positions over serial. To correct stepper drift, the vision system periodically compares the y-position of the yellow tape markers against the position implied by the expected motor step count. If the error exceeds a pixel threshold, the Jetson sends a RESETPOS command that overwrites the Arduino's internal step counter, effectively closing the loop without hardware encoders.
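A sketch of the low-level firmware side, under a few assumptions: the serial protocol is line-based, the MOVE and ROTATE command names and pin numbers are illustrative (only RESETPOS is named in this write-up), and one rod's pair of steppers is driven with the AccelStepper library:

```cpp
#include <AccelStepper.h>

// Illustrative pin assignments for one rod (lateral slide + rotation/kick);
// the real table drives multiple rods and uses its own wiring.
AccelStepper lateral(AccelStepper::DRIVER, /*step*/ 2, /*dir*/ 3);
AccelStepper rotation(AccelStepper::DRIVER, /*step*/ 4, /*dir*/ 5);

void setup() {
  Serial.begin(115200);
  Serial.setTimeout(5);          // keep readStringUntil from stalling the motion loop
  lateral.setMaxSpeed(4000);
  lateral.setAcceleration(20000);
  rotation.setMaxSpeed(6000);
  rotation.setAcceleration(40000);
}

void loop() {
  // Parse one newline-terminated command per iteration, e.g.
  //   "MOVE 120"     -> slide the rod to absolute step 120
  //   "ROTATE 200"   -> rotate the rod to absolute step 200 (kick)
  //   "RESETPOS 87"  -> overwrite the internal step counter with the vision
  //                     system's estimate to cancel accumulated drift
  if (Serial.available()) {
    String line = Serial.readStringUntil('\n');
    long value = line.substring(line.indexOf(' ') + 1).toInt();

    if (line.startsWith("MOVE")) {
      lateral.moveTo(value);
    } else if (line.startsWith("ROTATE")) {
      rotation.moveTo(value);
    } else if (line.startsWith("RESETPOS")) {
      lateral.setCurrentPosition(value);   // closes the loop without encoders
    }
  }

  // Non-blocking stepping: run() issues at most one step per call.
  lateral.run();
  rotation.run();
}
```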

Camera Stream Debug
Fig 4. Live debug feed of the perception engine. The green circle isolates the ball via color filtering. Red dots track the yellow tape markers, which serve as anchors for inferring the blue players' positions. The overlays display the regression model's output for rotational state (0-200 steps). Here, the striker rod registers ~110 steps (players inverted), while the other rods hold near 0 or 200 (upright).