32-bit RISC CPU & FPGA Pong

Verilog / Vivado / FPGA / Computer Architecture

Implementation Details

The core is a five-stage pipelined processor with Fetch, Decode, Execute, Memory, and Writeback stages. We implemented hazard detection and forwarding logic so that the pipeline stalls only when it must. A forwarding unit compares the destination registers of the instructions in the Memory and Writeback stages against the source registers of the instruction entering the Execute stage. If a match is found, the newer result is bypassed directly to the ALU inputs, resolving the Read-After-Write hazard without a stall. A Load-Use hazard cannot be resolved by bypassing alone, because the loaded value does not exist until the load has read data memory, so a stall unit inserts NOP bubbles into the pipeline until the value is available.
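The comparison and stall conditions above can be sketched as a small behavioral model. This is not the project's Verilog. It is a minimal Python illustration, and the names `StageInfo`, `forward_select`, and `load_use_stall`, the mux encoding, and the convention that register 0 is hardwired to zero are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class StageInfo:
    """Control fields carried in one pipeline latch (hypothetical names)."""
    rd: int = 0               # destination register number
    writes_reg: bool = False  # instruction will write rd
    is_load: bool = False     # instruction reads data memory


# Select values for the bypass mux on one ALU input (illustrative encoding).
FROM_REGFILE = 0
FROM_MEMORY_STAGE = 1
FROM_WRITEBACK_STAGE = 2


def forward_select(src: int, mem: StageInfo, wb: StageInfo) -> int:
    """Choose where one ALU operand comes from.

    src is a source register of the instruction at the ALU; mem and wb
    describe the older instructions in the Memory and Writeback stages.
    """
    # Assumption: register 0 always reads as zero, so it is never forwarded.
    if src == 0:
        return FROM_REGFILE
    # The Memory stage holds the most recent producer, so it takes priority.
    if mem.writes_reg and mem.rd == src:
        return FROM_MEMORY_STAGE
    if wb.writes_reg and wb.rd == src:
        return FROM_WRITEBACK_STAGE
    return FROM_REGFILE


def load_use_stall(rs: int, rt: int, ex: StageInfo) -> bool:
    """True when the instruction in Decode needs the result of a load
    that is still in Execute, so forwarding cannot help yet.

    Simplification: both rs and rt are treated as sources, even for
    instruction formats that read only one of them.
    """
    return ex.is_load and ex.rd != 0 and ex.rd in (rs, rt)
```

In this model, an instruction that reads a register written by the instruction immediately ahead of it selects `FROM_MEMORY_STAGE`, while a consumer directly behind a load makes `load_use_stall` return `True` for the cycle in which the load is in Execute.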

RTL Diagram
Fig 1. The RTL datapath design showing the five stages and forwarding logic. (Credits to Dr. Bletsch, as well as Dr. Sorin, Dr. Roth, and Dr. Lebeck.)

The system interfaces with a custom VGA controller driven by a 25 MHz pixel clock, obtained by dividing the 100 MHz system clock by four. We used memory-mapped I/O to expose the video memory and PS/2 keyboard state in the processor's address space. The Pong game logic is written in MIPS assembly: it polls the keyboard address for paddle input and moves the ball by writing its updated coordinates to specific VGA memory addresses. A hardware divider supports the ball angle calculations. The design synthesizes onto a Xilinx FPGA and uses block RAM for instruction and data memory.
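To make the memory-mapped I/O pattern concrete, here is a minimal Python model of one pass through such a game loop. It is a sketch under stated assumptions, not the project's assembly: the address map, the key bit layout, the 480-row screen, the paddle height, and the names `Bus` and `game_step` are all hypothetical, since the source does not list the real values.

```python
# Hypothetical address map; the project's real addresses are not given.
KEYBOARD_ADDR = 0x1000  # read-only; assumed bit 0 = up key, bit 1 = down key
PADDLE_Y_ADDR = 0x2000  # VGA register: paddle row
BALL_X_ADDR = 0x2004    # VGA register: ball column
BALL_Y_ADDR = 0x2008    # VGA register: ball row

SCREEN_H = 480          # assumed visible rows
PADDLE_H = 60           # assumed paddle height in rows


class Bus:
    """Routes loads and stores by address, as an address decoder would."""

    def __init__(self):
        self.keys = 0  # state latched by the keyboard interface
        self.vga = {PADDLE_Y_ADDR: 0, BALL_X_ADDR: 0, BALL_Y_ADDR: 0}
        self.ram = {}

    def load(self, addr):
        if addr == KEYBOARD_ADDR:
            return self.keys
        if addr in self.vga:
            return self.vga[addr]
        return self.ram.get(addr, 0)

    def store(self, addr, value):
        if addr == KEYBOARD_ADDR:
            return  # the keyboard register ignores writes
        if addr in self.vga:
            self.vga[addr] = value
        else:
            self.ram[addr] = value


def game_step(bus, paddle_y, ball, vel, paddle_speed=4):
    """Poll the keyboard, move the paddle and ball, and write the new
    positions to the VGA registers. Returns (paddle_y, ball)."""
    keys = bus.load(KEYBOARD_ADDR)  # an ordinary load from a device address
    if keys & 0b01:
        paddle_y = max(0, paddle_y - paddle_speed)
    if keys & 0b10:
        paddle_y = min(SCREEN_H - PADDLE_H, paddle_y + paddle_speed)
    bus.store(PADDLE_Y_ADDR, paddle_y)

    ball = (ball[0] + vel[0], ball[1] + vel[1])
    bus.store(BALL_X_ADDR, ball[0])  # ordinary stores move the ball on screen
    bus.store(BALL_Y_ADDR, ball[1])
    return paddle_y, ball
```

The point of the pattern is that the program needs no special I/O instructions: the same load and store operations that access data memory also reach the keyboard and the display, and only the address decides which device responds.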

Hardware Prototype
Fig 2. Early hardware prototype running on the Xilinx FPGA. The 7-segment displays are memory-mapped to show the live score. The software also handles state management, letting users choose between single-player (AI) and multiplayer modes from an initial menu screen (not pictured).