Beginning Logic Design – Part 11

Hello and welcome to Part 11 of my Beginning Logic Design series. In this episode, I will continue implementing the CPU I planned and started in the previous post.

The first CPU operations I’d like to have working are going to be my LOAD and STORE type instructions, as these provide the basic reading and writing operations to interact with my system bus. This design will not be very efficient nor the most clever implementation, but it will work!



The LOAD Instruction

I want each of my registers to have the same LOAD capabilities, in contrast to the 6502 instruction set, which has 8 types of LOAD instructions for A and 5 each for X and Y.

I have 5 types of load instruction in mind:

  • Load register with next byte in program code (Immediate Load)
  • Load register using next two bytes of program code as a memory address. (Memory Load)
  • Load register using next byte as the upper 8 bits of a memory address, and the A register as the lower 8 bits (A indexed load)
  • Load register using next byte as the upper 8 bits of a memory address, and the B register as the lower 8 bits (B indexed load)
  • Load register using next byte as the upper 8 bits of a memory address, and the C register as the lower 8 bits (C indexed load)

With my 3 registers and these 5 different types of load instructions, this consumes 15 of the 16 possible LOAD instructions.

0 - Immediate Load A
1 - Immediate Load B
2 - Immediate Load C
3 - Memory Load A
4 - Memory Load B
5 - Memory Load C
6 - A Index Load A
7 - A Index Load B
8 - A Index Load C
9 - B Index Load A
a - B Index Load B
b - B Index Load C
c - C Index Load A
d - C Index Load B
e - C Index Load C
f - undefined
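Because the table is laid out in groups of three, the addressing mode and the destination register can both be recovered from the lower 4 bits with a divide and a remainder. The following is only a sketch of that observation, not part of the CPU itself; the package, enum and function names are all hypothetical.

```systemverilog
// Hypothetical helper package; none of these names exist in the CPU design.
package load_decode;
  typedef enum logic [2:0] {
    MODE_IMMEDIATE,
    MODE_MEMORY,
    MODE_A_INDEX,
    MODE_B_INDEX,
    MODE_C_INDEX,
    MODE_UNDEFINED
  } load_mode;

  // Opcodes 0 through e come in groups of three, one group per load type.
  function automatic load_mode mode_of(input logic [3:0] low_nibble);
    case (low_nibble / 3)
      0: return MODE_IMMEDIATE;
      1: return MODE_MEMORY;
      2: return MODE_A_INDEX;
      3: return MODE_B_INDEX;
      4: return MODE_C_INDEX;
      default: return MODE_UNDEFINED; // only reached for 'hf
    endcase
  endfunction

  // Within each group the destination order is A, B, C (0, 1, 2).
  function automatic logic [1:0] dest_of(input logic [3:0] low_nibble);
    return 2'(low_nibble % 3);
  endfunction
endpackage

module load_decode_tb;
  import load_decode::*;

  initial begin
    assert (mode_of(4'h0) == MODE_IMMEDIATE && dest_of(4'h0) == 0); // Immediate Load A
    assert (mode_of(4'h5) == MODE_MEMORY && dest_of(4'h5) == 2);    // Memory Load C
    assert (mode_of(4'ha) == MODE_B_INDEX && dest_of(4'ha) == 1);   // B Index Load B
    assert (mode_of(4'hf) == MODE_UNDEFINED);
    $finish();
  end
endmodule
```

I won't use this in the implementation below, which sticks with one case entry per opcode, but it shows the numbering is regular enough to decode mechanically later.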

Immediate Load

With a rough plan, I’m ready to start implementing! My first goal is to just get the Immediate Load A instruction to work. I’ll write a small program that should load A with 00 then load it with 42.

c0 00
c0 42

Now I’ll start building out what will end up being a huge tree of case statements implementing my various operations. This isn’t the most elegant way to organize the code, but it’s simple and it will work for a start.

PERFORM: begin
  case (op_type)
    LOAD: begin
      case (instruction[3:0])
        0: begin
          if (!read) begin
            read <= 1;
            address_bus <= program_counter + 1;
          end else begin
            read <= 0;
            a <= data_bus;
            program_counter += 2;
            state <= FETCH;
          end
        end
      endcase
    end
  endcase
end

Similar to the FETCH state, this is a two-cycle operation. First the CPU starts a memory read request for the next byte of program code; on the next cycle the result is stored into the A register. The program_counter is then advanced by 2, past the opcode and its parameter, to the next instruction.

Testing this bit of code in simulation verifies it works as intended!

From here we can use our copy-pasta skills to do the same for the immediate load operations for the B and C registers.

LOAD: begin
  case (instruction[3:0])
    0: begin
      if (!read) begin
        read <= 1;
        address_bus <= program_counter + 1;
      end else begin
        read <= 0;
        a <= data_bus;
        program_counter += 2;
        state <= FETCH;
      end
    end
    1: begin
      if (!read) begin
        read <= 1;
        address_bus <= program_counter + 1;
      end else begin
        read <= 0;
        b <= data_bus;
        program_counter += 2;
        state <= FETCH;
      end
    end
    2: begin
      if (!read) begin
        read <= 1;
        address_bus <= program_counter + 1;
      end else begin
        read <= 0;
        c <= data_bus;
        program_counter += 2;
        state <= FETCH;
      end
    end
  endcase
end

I’ll write a new program to test this out:

c0 aa
c1 bb
c2 cc

Stepping through these opcodes, I should end up with the A register set to aa, B to bb and C to cc.

Woohoo! These load commands work and were not too difficult to implement. At this point I’m feeling pretty excited about my first CPU design.

Memory Load

For my next trick, I will implement my memory load operations. These will fetch a memory address after the current instruction and set the register to the number at that location. This operation is going to take more than two CPU cycles. With this in mind, I’m going to add a new internal variable logic [1:0] cycle; to track each CPU cycle. In my FETCH state, I will set cycle to 0 as I transition to the PERFORM state so that all instructions can use this same variable. After the main case statement within PERFORM, I’ll add cycle++; to increment cycle every clock cycle. Being 2 bits wide, cycle allows an instruction up to 4 steps.
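To make the placement of the counter concrete, here is a stripped-down sketch of the state machine with only the cycle handling left in. It is an illustration under assumptions: the module name is made up, FETCH is collapsed to a single cycle, the 4-step placeholder instruction does no bus work, and the increment is written as a non-blocking assignment rather than cycle++.

```systemverilog
// Sketch only: shows where `cycle` is cleared and incremented.
module cycle_sketch (
  input logic clock,
  input logic reset
);
  typedef enum logic [1:0] { RESET, FETCH, PERFORM, HALT } cpu_state;

  cpu_state state;
  logic [1:0] cycle;

  always_ff @ (posedge clock or posedge reset) begin
    if (reset) begin
      state <= RESET;
    end else begin
      case (state)
        RESET: state <= FETCH;
        FETCH: begin
          cycle <= 0;           // every instruction starts counting from 0
          state <= PERFORM;
        end
        PERFORM: begin
          case (cycle)
            3: state <= FETCH;  // last step of the placeholder instruction
            default: ;          // earlier steps would drive the bus here
          endcase
          cycle <= cycle + 1;   // runs after the case, once per clock
        end
        HALT: ;
      endcase
    end
  end
endmodule
```

Since the increment takes effect on the clock edge, each pass through PERFORM sees the count of the step it is currently on.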

Next, before I implement my memory load operations, I’ll modify the immediate load implementation to follow this model for consistency.

LOAD: begin
  case (instruction[3:0])
    0: begin
      case (cycle)
        0: begin
          read <= 1;
          address_bus <= program_counter + 1;
        end
        1: begin
          read <= 0;
          a <= data_bus;
          program_counter += 2;
          state <= FETCH;
        end
      endcase
    end
  ...

Now for the memory load! It will start off identical to the immediate load by reading the next byte in code.

// Memory load A
3: begin
  case (cycle)
    0: begin
      read <= 1;
      address_bus <= program_counter + 1;
    end
  endcase
end

On the next cycle I’ll have the most significant address byte returned via the data bus and I’ll need another register to store it. I’ll add logic [7:0] x; near my other CPU internal registers and request the next byte.

1: begin
  x <= data_bus;
  address_bus <= program_counter + 2;
end

On the 3rd cycle, I’ll have the lower address byte and can concatenate it with the x register to read that memory address.

2: begin
  address_bus <= {x,data_bus};
end

Finally, on the last cycle of the operation, I will have the value from the specified memory location on the data_bus. I can store that value, increment the program_counter by the total length of the instruction, clear the read signal and transition back into FETCH.

3: begin
  program_counter += 3;
  read <= 0;
  a <= data_bus;
  state <= FETCH;
end
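To see the four steps working together, the sketch below runs them against a tiny stand-in ROM outside of the real CPU. Everything around the four case entries is an assumption made for the example: the testbench name, the 8-byte ROM contents, the done flag, and the use of non-blocking assignments for program_counter.

```systemverilog
module memory_load_tb;
  logic clock = 0;
  logic read = 0;
  logic done = 0;
  logic [1:0] cycle = 0;
  logic [15:0] address_bus = 0;
  logic [15:0] program_counter = 16'h8000;
  logic [7:0] a;
  logic [7:0] x;

  // Hypothetical ROM image: "c3 80 04" is Memory Load A from address 8004,
  // and the byte at offset 4 is the value expected to land in A.
  logic [7:0] rom [0:7] = '{8'hc3, 8'h80, 8'h04, 8'h00, 8'h42, 8'h00, 8'h00, 8'h00};
  wire [7:0] data_bus = read ? rom[address_bus[2:0]] : 'z;

  always #1 clock = ~clock;

  always_ff @ (posedge clock) begin
    if (!done) begin
      case (cycle)
        0: begin // request the upper address byte
          read <= 1;
          address_bus <= program_counter + 1;
        end
        1: begin // keep the upper byte, request the lower byte
          x <= data_bus;
          address_bus <= program_counter + 2;
        end
        2: begin // request the byte at the assembled address
          address_bus <= {x, data_bus};
        end
        3: begin // store the value and step past the 3-byte instruction
          program_counter <= program_counter + 3;
          read <= 0;
          a <= data_bus;
          done <= 1;
        end
      endcase
      cycle <= cycle + 1;
    end
  end

  always @ (posedge done) begin
    assert (a == 8'h42 && program_counter == 16'h8003);
    $finish();
  end
endmodule
```

The stand-in ROM answers combinationally, as my real ROM module does, so each value requested on one clock edge is available on data_bus by the next.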

Now to test it! I’ll extend my previous program to include this new operation. It’ll load the first byte of the program into A.

Alright! It does successfully pull the memory address and uses it to load the value at that address into the register. With some more copy paste I can replicate this for the B and C registers.

Offset Memory Load

With the basic memory load operation figured out, the offset memory load is a small modification. I only need to read the most significant address byte then I can concatenate that with the appropriate register to read the desired offset address.

// A offset load A
6: begin
  case (cycle)
    0: begin
      read <= 1;
      address_bus <= program_counter + 1;
    end
    1: begin
      address_bus <= {data_bus, a};
    end
    2: begin
      program_counter += 2;
      read <= 0;
      a <= data_bus;
      state <= FETCH;
    end
  endcase
end

As before I can duplicate this for the various permutations of the load command. I validated this in the simulator as well and it looks to work just as intended.
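The address arithmetic is just a concatenation of two bytes. As a small worked example (the testbench and signal names are made up for illustration), an operand byte of 80 with A holding 02 reads from address 8002:

```systemverilog
module index_address_tb;
  logic [7:0] operand = 8'h80; // byte following the opcode: upper 8 address bits
  logic [7:0] a = 8'h02;       // index register: lower 8 address bits
  logic [15:0] address;

  initial begin
    address = {operand, a};
    assert (address == 16'h8002);
    $finish();
  end
endmodule
```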

Store Operations

The STORE operations are nearly identical to the load operations, though there are no Immediate Store instructions. Because the operations are so similar, I will use the same numbers for the lower 4 operation bits to indicate the type of operation.

0 - undefined
1 - undefined
2 - undefined
3 - Memory Store A
4 - Memory Store B
5 - Memory Store C
6 - A Index Store A
7 - A Index Store B
8 - A Index Store C
9 - B Index Store A
a - B Index Store B
b - B Index Store C
c - C Index Store A
d - C Index Store B
e - C Index Store C
f - undefined

I’ll first implement the Memory Store A operation. It starts off pretty similar to the load, as it needs to begin by reading the memory address from the program code.

0: begin
  read <= 1;
  address_bus <= program_counter + 1;
end
1: begin
  x <= data_bus;
  address_bus <= program_counter + 2;
end

On the next cycle, I’ll have the full address and can stop reading to start writing A to the data_bus. On the last cycle I’ll increment the program counter and return to FETCH to grab the next bit of code.

2: begin
  address_bus <= {x,data_bus};
  read <= 0;
  write <= 1;
  write_data <= a;
end
3: begin
  program_counter += 3;
  write <= 0;
  state <= FETCH;
end

Easy enough! I’ll extend my last program to end with an operation to write A to the first byte of my RAM.

c0 02
c6 80
d3 00 00

Via simulation I can confirm it’s stashing the A register into the first byte of RAM.

As before we can use this as the basis for the memory store calls for the B and C registers.

Offset Store Operations

As the offset load was a small variation on memory load, the same will be true for offset store. With some small modifications to the regular memory store call, the offset store is easily implemented.

// A offset store A
6: begin
  case (cycle)
    0: begin
      read <= 1;
      address_bus <= program_counter + 1;
    end
    1: begin
      address_bus <= {data_bus, a};
      read <= 0;
      write <= 1;
      write_data <= a;
    end
    2: begin
      program_counter += 2;
      write <= 0;
      state <= FETCH;
    end
  endcase
end

 

With the STORE and LOAD operations implemented I will call it a wrap for this post. As always I would love any feedback or questions you may have. Keep tinkering!

Beginning Logic Design – Part 10

Hello and welcome to part 10 of my Beginning Logic Design series!

There’s been a lot of ground covered so far. Through these posts I’ve implemented an ALU, a system bus, a RAM module and a ROM module. I explored state machines in the last post to build a design that can implement a more complex processing flow by breaking a process into individual operations.

All of this has laid a significant foundation for the final design of the series: a basic CPU and computer system!

Over the next few posts I will cover my planning, design and testing for this CPU. The planning will be very light, and this will certainly be an inefficient design in many ways… but I’ll keep changing it until it works!



System Design

I’m starting this design by planning my system architecture, how my largest building blocks will communicate.

For this design I’ve decided to give my ROM the full upper half of my address space, 32KB of ROM. I decided to split the lower half between 16KB of RAM and 16KB of I/O space. I felt this was a good balance of the address space and it’ll be easy to implement.

My system architecture
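The split comes down to looking at the top two address bits. Here is a rough sketch of that decode, assuming RAM takes the lowest quarter and I/O the quarter above it (which is how the modules further down check their addresses); the package, enum and function names are hypothetical.

```systemverilog
package memory_map;
  typedef enum logic [1:0] { REGION_RAM, REGION_IO, REGION_ROM } region;

  function automatic region region_of(input logic [15:0] address);
    if (address[15]) return REGION_ROM;     // 8000-ffff, 32KB
    else if (address[14]) return REGION_IO; // 4000-7fff, 16KB
    else return REGION_RAM;                 // 0000-3fff, 16KB
  endfunction
endpackage

module memory_map_tb;
  import memory_map::*;

  initial begin
    assert (region_of(16'h0000) == REGION_RAM);
    assert (region_of(16'h3fff) == REGION_RAM);
    assert (region_of(16'h4000) == REGION_IO);
    assert (region_of(16'h7fff) == REGION_IO);
    assert (region_of(16'h8000) == REGION_ROM);
    $finish();
  end
endmodule
```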

CPU Design and Instruction Set

There are many decisions to make in designing a CPU and computer system. For this design I’m going to build an 8-bit CPU with a 16-bit address bus and an 8-bit data bus, to stick with the system bus from my previous posts.

It will have 3 general purpose registers to use with the various instructions; they will be referred to as A, B and C.

Internally there will also be a PC (program counter) register to keep track of the memory address of the current instruction. An 8-bit stack register, S, will be held for stack related operations. There will be an instruction register to hold the current CPU instruction. It will also have the same status flags as my ALU had, zero, sign, overflow and carry.

The next major piece I have in mind is the instruction set architecture (ISA) itself. The ISA describes the machine code language that the processor will support. I’ve stewed over my ISA design and feel pretty certain that I will end up implementing some operations I don’t need while not thinking of others that would be extremely useful. I accept this and will push forward with a less elegant design to learn the lessons that could be applied to a future one.

My instruction format is split between 4 bits for the instruction family, and 4 bits to be interpreted differently by each family.

Here are the instruction families I’m initially running with.

0:  ADD
1:  SUBTRACT
2:  INCREMENT
3:  DECREMENT
4:  BIT_AND
5:  BIT_OR
6:  BIT_XOR
7:  BIT_NOT
8:  SHIFT_LEFT
9:  SHIFT_RIGHT
10: ROTATE_LEFT
11: ROTATE_RIGHT
12: LOAD
13: STORE
14: BRANCH
15: EXTRA

The first 12 operations are intentionally identical to the ALU operations. The latter 4 are for loading and storing register values, program code branching and other miscellaneous processor operations.
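Splitting an instruction byte into its two halves is a pair of bit selects. As a quick illustration (the testbench and variable names are made up), the byte de falls into family d, which is 13 or STORE in the list above, with e left over for that family to interpret:

```systemverilog
module instruction_split_tb;
  logic [7:0] instruction = 8'hde;
  logic [3:0] family;
  logic [3:0] detail;

  initial begin
    family = instruction[7:4]; // upper 4 bits: instruction family
    detail = instruction[3:0]; // lower 4 bits: family-specific meaning
    assert (family == 4'hd && detail == 4'he);
    $finish();
  end
endmodule
```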

The CPU will also have a state machine to manage its flow of operation.

CPU state flow

I think that’s about enough planning for now. I want to start laying down the foundations so I can begin the actual designing.

Ground Work

I’m going to use my go-to Makefile to organize my building and testing.

Then I will set up my top module to include my CPU and system bus components.

`timescale 1ns / 1ps

module top ();
  logic clock;
  logic reset;

  // System Bus
  logic slave_clock;
  logic read;
  logic write;
  logic [15:0] address_bus;
  wire logic [7:0] data_bus;

  assign slave_clock = ~clock;

  cpu processor (
    reset,
    clock,
    read,
    write,
    address_bus,
    data_bus
  );

  ram memory (
    slave_clock,
    read,
    write,
    address_bus,
    data_bus
  );

  io devices (
    slave_clock,
    read,
    write,
    address_bus,
    data_bus
  );

  rom storage (
    slave_clock,
    read,
    write,
    address_bus,
    data_bus
  );

  initial begin
    clock = 0;
    reset = 1;
    #2 reset = 0;
  end

  always begin
    #1 clock = ~clock;
  end

endmodule

Then I’ll start my cpu module with a few of the planned features. I’ll include my cpu_state enumeration, the various registers/flags and I’ll add support for system bus write operations.

`timescale 1ns / 1ps

typedef enum logic [1:0] {
  RESET,
  FETCH,
  PERFORM,
  HALT
} cpu_state;

module cpu (
  input logic reset,
  input logic clock,
  output logic read,
  output logic write,
  output logic [15:0] address_bus,
  inout logic [7:0] data_bus
);
  // CPU internals
  cpu_state state;
  logic [15:0] program_counter;
  logic [7:0] stack;
  logic [7:0] instruction;

  // General purpose registers
  logic [7:0] a;
  logic [7:0] b;
  logic [7:0] c;

  // System Bus support
  logic [7:0] write_data;
  assign data_bus = write ? write_data : 'bZ;

endmodule

Next I’ll pull in my RAM module from post 8. I’ll modify it slightly to look at the top two address bits and to contain 16KB of memory instead of 32KB.

`timescale 1ns / 1ps

module ram (
  input logic clock,
  input logic read,
  input logic write,
  input logic [15:0] address_bus,
  inout logic [7:0] data_bus
);

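  // Declared before the assign that reads it, since not every tool
  // accepts a use that comes ahead of the declaration.
  logic [7:0] memory [0:(1<<14)-1];

  assign data_bus = (read && address_bus[15:14] == 0) ? memory[address_bus[13:0]] : 'bZ;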

  always_ff @ (posedge clock) begin
    if (address_bus[15:14] == 0 && write) begin
      memory[address_bus[13:0]] <= data_bus;
    end
  end


endmodule

I’ll also modify the writer module from post 8 to listen to the appropriate addresses and will change the output format to show both the address and the data.

`timescale 1ns / 1ps

module io (
  input logic clock,
  input logic read,
  input logic write,
  input logic [15:0] address_bus,
  inout logic [7:0] data_bus
);

  always_ff @ (posedge clock) begin
    if (address_bus[15:14] == 1 && write) begin
      $display("%h: %h (%c)", address_bus, data_bus, data_bus);
    end
  end


endmodule

The ROM module can be directly pulled in from the previous post.

`timescale 1ns / 1ps

module rom (
  input logic clock,
  input logic read,
  input logic write,
  input logic [15:0] address_bus,
  inout logic [7:0] data_bus
);

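  // Declared before the assign that reads it, since not every tool
  // accepts a use that comes ahead of the declaration.
  logic [7:0] memory [0:(1<<15)-1];

  assign data_bus = (read && address_bus[15] == 1) ? memory[address_bus[14:0]] : 'bZ;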

  initial begin
    $readmemh("rom.hex", memory);
  end


endmodule

With these base modules in place, I attempt a build to catch the various syntax errors I’ve made along the way to ensure the code I’m sharing actually works 🙂

Processor Functionality

With the foundation set, we can start building! The first part I want to implement is the case statement to handle the various states and the reset logic.

always_ff @ (posedge clock or posedge reset) begin
  if (reset) begin
    state <= RESET;
  end else begin
    case (state)
      RESET: begin
        state <= FETCH;
        program_counter <= 'h8000;
        stack <= 0;
        read <= 0;
        write <= 0;
        address_bus <= 0;
        write_data <= 0;
      end
      FETCH: begin
        state <= PERFORM;
      end
      PERFORM: begin
        state <= HALT;
      end
      HALT: begin
        $finish();
      end
    endcase
  end
end

For now the processing loop just progresses one step at a time until it halts. Via simulation I can verify all the important things are cleared, and I can see what’s left unset as well.

Next up, I want to start fetching instructions from ROM similar to how I fetched data from ROM in the previous post. I have the program_counter register as my instruction pointer. The fetch will have 2 cycles: a read request, and the read itself. Since an operation will eventually transition back to FETCH and this state never writes, I’ll ensure write is set low here as well.

FETCH: begin
  write <= 0;
  if (!read) begin
    read <= 1;
    address_bus <= program_counter;
  end else begin
    read <= 0;
    instruction <= data_bus;
    state <= PERFORM;
  end
end

So this should fetch our first instruction, I’ll test it! Here’s my first ROM (in hex format):

de ad be ef

It’s a total garbage program 🙂 but it’s just here to make sure our instruction register becomes set to de, the first byte. In simulation it does work just fine, as FETCH transitions to PERFORM my instruction register becomes de!

Now, to more easily identify what this de instruction is, I’m going to add another enumeration for my operation family.

typedef enum logic [3:0] {
  CPU_ADD,
  CPU_SUBTRACT,
  CPU_INCREMENT,
  CPU_DECREMENT,
  CPU_AND,
  CPU_OR,
  CPU_XOR,
  CPU_NOR,
  CPU_SHIFT_LEFT,
  CPU_SHIFT_RIGHT,
  CPU_ROTATE_LEFT,
  CPU_ROTATE_RIGHT,
  LOAD,
  STORE,
  BRANCH,
  EXTRA
} instruction_type;

Then, in my CPU internals, I’ll add an instance of this type. My intention is that this will refer to the 4 most significant bits of my instruction register. I’ll add instruction_type op_type; to my CPU internal variables.

Then, I’ll use $cast(); to map the upper 4 bits of the 8-bit logic type to my instruction_type variable. If I did not cast this, Vivado would be unhappy with me. I’ll put this bit of code in my FETCH state.

FETCH: begin
  write <= 0;
  if (!read) begin
    read <= 1;
    address_bus <= program_counter;
  end else begin
    read <= 0;
    instruction <= data_bus;
    $cast(op_type, data_bus[7:4]);
    state <= PERFORM;
  end
end
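For comparison, SystemVerilog also has a static cast, written as the type name followed by an apostrophe and a parenthesized expression. Unlike $cast it performs no run-time check, which is harmless here because all 16 values of the 4-bit field are named. This is only a sketch of the alternative under that assumption, with a made-up testbench around it; my CPU code keeps $cast.

```systemverilog
module cast_sketch_tb;
  typedef enum logic [3:0] {
    CPU_ADD,
    CPU_SUBTRACT,
    CPU_INCREMENT,
    CPU_DECREMENT,
    CPU_AND,
    CPU_OR,
    CPU_XOR,
    CPU_NOR,
    CPU_SHIFT_LEFT,
    CPU_SHIFT_RIGHT,
    CPU_ROTATE_LEFT,
    CPU_ROTATE_RIGHT,
    LOAD,
    STORE,
    BRANCH,
    EXTRA
  } instruction_type;

  logic [7:0] data = 8'hde;
  instruction_type op_type;

  initial begin
    // Static cast: converts the upper nibble to the enum type at compile time.
    op_type = instruction_type'(data[7:4]);
    assert (op_type == STORE);
    $display("%s", op_type.name());
    $finish();
  end
endmodule
```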

Now I’ll check this in simulation.

Huzzah! I have fetched the instruction and can identify its family. Here’s my cpu module at this point:

`timescale 1ns / 1ps

typedef enum logic [1:0] {
  RESET,
  FETCH,
  PERFORM,
  HALT
} cpu_state;

typedef enum logic [3:0] {
  CPU_ADD,
  CPU_SUBTRACT,
  CPU_INCREMENT,
  CPU_DECREMENT,
  CPU_AND,
  CPU_OR,
  CPU_XOR,
  CPU_NOR,
  CPU_SHIFT_LEFT,
  CPU_SHIFT_RIGHT,
  CPU_ROTATE_LEFT,
  CPU_ROTATE_RIGHT,
  LOAD,
  STORE,
  BRANCH,
  EXTRA
} instruction_type;

module cpu (
  input logic reset,
  input logic clock,
  output logic read,
  output logic write,
  output logic [15:0] address_bus,
  inout logic [7:0] data_bus
);
  // CPU internals
  cpu_state state;
  logic [15:0] program_counter;
  logic [7:0] stack;
  logic [7:0] instruction;
  instruction_type op_type;

  // General purpose registers
  logic [7:0] a;
  logic [7:0] b;
  logic [7:0] c;

  // System Bus support
  logic [7:0] write_data;
  assign data_bus = write ? write_data : 'bZ;

  always_ff @ (posedge clock or posedge reset) begin
    if (reset) begin
      state <= RESET;
    end else begin
      case (state)
        RESET: begin
          state <= FETCH;
          program_counter <= 'h8000;
          stack <= 0;
          read <= 0;
          write <= 0;
          address_bus <= 0;
          write_data <= 0;
        end
        FETCH: begin
          write <= 0;
          if (!read) begin
            read <= 1;
            address_bus <= program_counter;
          end else begin
            read <= 0;
            instruction <= data_bus;
            $cast(op_type, data_bus[7:4]);
            state <= PERFORM;
          end
        end
        PERFORM: begin
          state <= HALT;
        end
        HALT: begin
          $finish();
        end
      endcase
    end
  end

endmodule

With a significant start to the organization and flow to this CPU, I will call that a wrap for this post. In the next post I will begin implementing some of the planned instructions. As always, I welcome your feedback and questions in the comments. Keep tinkering!

Beginning Logic Design – Part 9

Hello and welcome to Part 9 of my Beginning Logic Design series!

In this post, I’ll build a design that utilizes the system bus from the previous post and the ALU I started in Post 5. This design will read data from a ROM mapped to the system bus address space, add the numbers together, and store the results in RAM.

Design Plan

To build this design I will need to decide how I want my system bus address space mapped. I’ll need to implement a new ROM module that can reside in that address space. For this build I will give the RAM all addresses that start with 0, and the ROM will handle addresses that start with 1 to evenly split my address space between them.

Memory Map Diagram

For my primary system controller, I’ll build a state machine that will follow the process flow I am looking to design.

State machine diagram for this controller

This machine will begin with a START state to set up its internal variables. Next, in the REQUEST_A state, the next a value will be requested from ROM via the system bus. In READ_A the value from ROM will be read on the system bus; if a is zero the system will stop, otherwise it will request b and transition to READ_B. In READ_B the b value will be read, and in STORE the sum of a and b will be written to RAM. The process loops until a is zero.

This setup will allow use of an arbitrarily long null-terminated list of values to add, limited by the size of the ROM.



Reusing Components

It can be significantly more time efficient to reuse designs that are already available so I will start this build by pulling in my top and ram modules from the previous post and the final ALU design from post 7.

I’ll modify my top module to add a reset signal, and will mock out my new controller and rom modules. I’ll also add to my initial block some simulation steps that toggle the reset signal on for 4 nanoseconds.

`timescale 1ns / 1ps
module top ();
  logic reset;
  logic clock;

  // System Bus
  logic slave_clock;
  logic read;
  logic write;
  logic [15:0] address_bus;
  wire logic [7:0] data_bus;

  assign slave_clock = ~clock;
  master controller (
    reset,
    clock,
    read,
    write,
    address_bus,
    data_bus
  );

  ram memory (
    slave_clock,
    read,
    write,
    address_bus,
    data_bus
  );

  rom storage (
    slave_clock,
    read,
    write,
    address_bus,
    data_bus
  );

  initial begin
    clock = 0;
    reset = 1;
    #4 reset = 0;
  end

  always begin
    #1 clock = ~clock;
  end

endmodule

The RAM module remains the same as before.

`timescale 1ns / 1ps

module ram (
  input logic clock,
  input logic read,
  input logic write,
  input logic [15:0] address_bus,
  inout logic [7:0] data_bus
);

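  // Declared before the assign that reads it, since not every tool
  // accepts a use that comes ahead of the declaration.
  logic [7:0] memory [0:(1<<15)-1];

  assign data_bus = (read && address_bus[15] == 0) ? memory[address_bus[14:0]] : 'bZ;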

  always_ff @ (posedge clock) begin
    if (address_bus[15] == 0 && write) begin
      memory[address_bus[14:0]] <= data_bus;
    end
  end


endmodule

Building a ROM

Before building the state machine, I’ll build my ROM to store the numbers I’d like to add together.

The SystemVerilog language provides two handy system tasks for implementing ROMs, $readmemb and $readmemh, which load a memory with the values from a local file. The difference between them is the expected format of the file to be read. $readmemb expects binary notation in ASCII, while $readmemh expects the file to be in hexadecimal notation. I prefer to work in hexadecimal so I’ll be using $readmemh.
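To show the two notations side by side, this sketch writes the same four bytes to two scratch files and loads each with the matching task. The file names and testbench name are invented for the example; in my actual design the file is written by hand.

```systemverilog
module readmem_tb;
  logic [7:0] from_hex [0:3];
  logic [7:0] from_bin [0:3];
  int fd;

  initial begin
    // Write two small files holding the same four bytes in different notations.
    fd = $fopen("values_hex.txt", "w");
    $fwrite(fd, "02 09 20 00\n");
    $fclose(fd);

    fd = $fopen("values_bin.txt", "w");
    $fwrite(fd, "00000010 00001001 00100000 00000000\n");
    $fclose(fd);

    // Each task parses its own notation into the memory array.
    $readmemh("values_hex.txt", from_hex);
    $readmemb("values_bin.txt", from_bin);

    for (int i = 0; i < 4; i++) begin
      assert (from_hex[i] == from_bin[i]);
    end
    $finish();
  end
endmodule
```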

The ROM design is very similar to RAM, but we can drop the write logic and change the read handling to only handle addresses that start with 1.

The main thing we’ll need to add is an initial block that sets our rom module’s memory to the values from our specified file. In this case I’ll have it read from a file named rom.hex.

`timescale 1ns / 1ps

module rom (
  input logic clock,
  input logic read,
  input logic write,
  input logic [15:0] address_bus,
  inout logic [7:0] data_bus
);

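  // Declared before the assign that reads it, since not every tool
  // accepts a use that comes ahead of the declaration.
  logic [7:0] memory [0:(1<<15)-1];

  assign data_bus = (read && address_bus[15] == 1) ? memory[address_bus[14:0]] : 'bZ;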

  initial begin
    $readmemh("rom.hex", memory);
  end


endmodule

Now I’ll write a small rom.hex file to perform a few operations. The file is plain text holding hexadecimal values; since each memory location is 8 bits wide, every 2 digits represent one byte. Whitespace and line breaks only separate the values.

02 02
09 07
20 02
00

This sequence should instruct our controller to perform the operations 2 + 2, 9 + 7 and 32 + 2.
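Before building the real thing, the expected results can be worked out with a purely behavioral model of the loop: read pairs until a zero shows up, and store each sum. This is a sketch, not the controller; the array, pointer and testbench names are placeholders, and the sums come out as 04, 10 and 22 in hexadecimal.

```systemverilog
module add_list_tb;
  // Same bytes as the rom.hex listing: 02 02 09 07 20 02 00
  logic [7:0] rom [0:6] = '{8'h02, 8'h02, 8'h09, 8'h07, 8'h20, 8'h02, 8'h00};
  logic [7:0] ram [0:2];
  int read_pointer = 0;
  int write_pointer = 0;

  initial begin
    // A zero in the a position terminates the list.
    while (rom[read_pointer] != 0) begin
      ram[write_pointer] = rom[read_pointer] + rom[read_pointer + 1];
      read_pointer += 2;
      write_pointer += 1;
    end

    assert (ram[0] == 8'h04); // 2 + 2
    assert (ram[1] == 8'h10); // 9 + 7 = 16
    assert (ram[2] == 8'h22); // 32 + 2 = 34
    $finish();
  end
endmodule
```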

The Controller

The most interesting part of this design is the new controller.

I’ll start off by cleaning out most of the controller from the previous post and adding my ALU module along with the signals to communicate with it. In this case I don’t care about all of the outputs, so I will use a different syntax to map the signals connecting to the ALU by name instead of by the order they are defined in the ALU module. I’ll also set my operation to ADD and my carry_in to 0.

`timescale 1ns / 1ps

import ALU::*;

module master (
  input logic reset,
  input logic clock,
  output logic read,
  output logic write,
  output logic [15:0] address_bus,
  inout logic [7:0] data_bus
);

  opcode operation;
  logic [7:0] a;
  logic [7:0] b;
  logic carry_in;
  logic [7:0] y;

  assign operation = ADD,
    carry_in = 0;

  alu ALU (
    .clock(clock),
    .operation(operation),
    .a(a),
    .b(b),
    .carry_in(carry_in),
    .y(y)
  );

endmodule

Next I will add an enumeration to give a unique number to each of my states. Since I have 6 states, I can use a 3 bit number to support all of my desired states.

typedef enum logic [2:0] {
  START,
  REQUEST_A,
  READ_A,
  READ_B,
  STORE,
  STOP
} master_state;

Inside of the module definition, I’ll add a new variable to hold the current state.

master_state state;

Next, I’ll start my sequential logic with always_ff. I’ll have the logic follow the rising edge of clock or reset. Including the rising edge for reset lets this module asynchronously reset immediately when reset goes high, instead of waiting for the next clock signal. Inside the block I’ll have a reset handler that sets the state to START and I’ll mock out the remaining states.

always_ff @ (posedge clock or posedge reset) begin
  if (reset) begin
    state <= START;
  end else begin
    case(state)
      START: begin
        
      end
      REQUEST_A: begin
        
      end
      READ_A: begin
        
      end
      READ_B: begin
        
      end
      STORE: begin
        
      end
      STOP: begin
        
      end
    endcase
  end
end
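The effect of putting reset in the sensitivity list is easiest to see next to the synchronous alternative. The module below is only an illustration with made-up names: two flip-flops fed the same input, one reset each way.

```systemverilog
module reset_styles (
  input logic clock,
  input logic reset,
  input logic d,
  output logic q_async,
  output logic q_sync
);
  // Asynchronous: reset is in the sensitivity list, so q_async clears
  // as soon as reset rises, without waiting for a clock edge.
  always_ff @ (posedge clock or posedge reset) begin
    if (reset) q_async <= 0;
    else q_async <= d;
  end

  // Synchronous: reset is only sampled on the rising edge of clock.
  always_ff @ (posedge clock) begin
    if (reset) q_sync <= 0;
    else q_sync <= d;
  end
endmodule
```

If reset rises between clock edges, q_async drops right away while q_sync holds its old value until the next rising edge of clock.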

Implementing the State Machine

For proper operation here, I’ll need a few internal variables. I already have registers to hold my a and b values, but I’ll need two more to use as pointers to where I’m currently reading from in ROM, and where I’m storing the result in RAM.

logic [15:0] read_pointer;
logic [15:0] write_pointer;

To support writing, I also need a means to put the output of the ALU onto the data bus. For this I’ll use an assign to put the y signal on data_bus when the write signal is high since the only writes I’ll have are from y.

assign data_bus = write ? y : 'bz;

Now I’ll add the logic for my START state. It will set the system bus signals to a known state and initialize my read and write pointers. It will also change the state variable to transition to the next step.

START: begin
  read <= 0;
  write <= 0;
  read_pointer <= 'h8000;
  write_pointer <= 0;
  state <= REQUEST_A;
end

The REQUEST_A state will issue a request to start reading from the address that read_pointer is set to.

REQUEST_A: begin
  address_bus <= read_pointer;
  read <= 1;
  state <= READ_A;
end

In the READ_A state, I’ll look at what was returned from ROM on the data bus. If zero, the state will change to STOP. If the value returned on data_bus is non-zero, that value will be stored in a, I’ll increment the read_pointer and update the address_bus to point to the next value and transition to the READ_B state.

READ_A: begin
  if (data_bus == 0) begin
    state <= STOP;
  end else begin
    a <= data_bus;
    address_bus = ++read_pointer;
    state <= READ_B;
  end
end

For READ_B I’ll store the returned value from data_bus into b, increment read_pointer in anticipation of the next loop and transition to STORE. I’ll also stop my read operation since I am done reading for this iteration.

READ_B: begin
  b <= data_bus;
  read_pointer++;
  read <= 0;
  state <= STORE;
end

Finally, in STORE my ALU will have updated its output on y to reflect the inputs on a and b. I’ll set the address bus to my write_pointer and set write so that my y value will be available on data_bus. I’ll transition back to REQUEST_A to continue the loop.

STORE: begin
  address_bus <= write_pointer++;
  state <= REQUEST_A;
  write <= 1;
end

With this as-is, I have a design flaw as nothing within the loop is clearing my write signal. To fix this I’ll add a statement to REQUEST_A to make sure write is 0. Lastly, I’ll add a $finish(); to my STOP state so that simulation stops there.

Here’s how my master module is defined after all of this implementation.

`timescale 1ns / 1ps

import ALU::*;

typedef enum logic [2:0] {
  START,
  REQUEST_A,
  READ_A,
  READ_B,
  STORE,
  STOP
} master_state;

module master (
  input logic reset,
  input logic clock,
  output logic read,
  output logic write,
  output logic [15:0] address_bus,
  inout logic [7:0] data_bus
);

  opcode operation;
  logic [7:0] a;
  logic [7:0] b;
  logic carry_in;
  logic [7:0] y;

  master_state state;
  logic [15:0] read_pointer;
  logic [15:0] write_pointer;

  assign operation = ADD,
    carry_in = 0,
    data_bus = write ? y : 'bz;

  alu ALU (
    .clock(clock),
    .operation(operation),
    .a(a),
    .b(b),
    .carry_in(carry_in),
    .y(y)
  );

  always_ff @ (posedge clock or posedge reset) begin
    if (reset) begin
      state <= START;
    end else begin
      case(state)
        START: begin
          read <= 0;
          write <= 0;
          read_pointer <= 'h8000;
          write_pointer <= 0;
          state <= REQUEST_A;
        end
        REQUEST_A: begin
          write <= 0;
          address_bus <= read_pointer;
          read <= 1;
          state <= READ_A;
        end
        READ_A: begin
          if (data_bus == 0) begin
            state <= STOP;
          end else begin
            a <= data_bus;
            address_bus = ++read_pointer;
            state <= READ_B;
          end
        end
        READ_B: begin
          b <= data_bus;
          read_pointer++;
          read <= 0;
          state <= STORE;
        end
        STORE: begin
          address_bus <= write_pointer++;
          state <= REQUEST_A;
          write <= 1;
        end
        STOP: begin
          $finish();
        end
      endcase
    end
  end

endmodule

It’s certainly become a bit more sophisticated, with each of these steps implementing a piece of the process that grabs my numbers to add from ROM and stores the results in RAM.

Finally, I’ll simulate this design to look at how my states and various signals are changing over time.

Waveforms from simulation

In building this design I did create several flawed implementations, and this wave viewer helped me inspect the changes that were going on in each cycle. Ultimately I’ll look at the RAM to make sure I’m only seeing the values that are expected.

Inspecting the memory at the end of simulation

The proper results of the 3 addition operations from my ROM are in RAM right where expected, validating this design! I hope this design helps give some understanding of how ROMs and state machines can be used in SystemVerilog. If you have any questions or feedback I welcome your response in the comments. Keep tinkering!
