
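The complete SDK and software ecosystem around ARM CPUs are a huge help in speeding up MCU design. As an IC engineer, once I finish the MCU RTL I can use the ARM-provided SDK to complete system-level verification of the MCU quickly, without the repetitive and tedious work of writing every software driver myself.

Whether ARM will be acquired by NVIDIA is still unknown, and the CPU and Mali GPU in HiSilicon's Kirin chips are still stock ARM soft cores, which leaves HiSilicon dependent on others.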

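I once predicted three chokepoints for HiSilicon's phone chips:

1. The ARM-licensed CPU and Mali GPU

2. The Android operating system (HarmonyOS was born out of this adversity)

3. TSMC foundry manufacturing (I was flamed for this prediction at the time)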

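The advanced-node foundry problem is very painful, and the ARM CPU and GPU problem is just as thorny. I believe HiSilicon will eventually develop its own independently controlled instruction set, CPU, and GPU, as self-reliant as HarmonyOS.

For the sake of self-control, RISC-V is on the rise but still needs a stronger ecosystem, so the subject of this article is the MIPS instruction set, which is widely used in university research and teaching.

Taking the MIPS architecture used by Loongson as an example, the architecture of this CPU design is as follows: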

[Figure: CPU design architecture diagram]

The Instruction Format and Instruction Set Architecture for the 16-bit single-cycle MIPS are as follows:

Instruction set for the MIPS processor


Instruction Set Architecture for the MIPS processor


Instruction descriptions

We build the CPU around single-cycle instructions, which are easier to implement:

Add : R[rd] = R[rs] + R[rt]

Subtract : R[rd] = R[rs] - R[rt]

And: R[rd] = R[rs] & R[rt]

Or : R[rd] = R[rs] | R[rt]

SLT: R[rd] = 1 if R[rs] <  R[rt] else 0

Jr: PC=R[rs]

Lw: R[rt] = M[R[rs]+SignExtImm]

Sw : M[R[rs]+SignExtImm] = R[rt]

Beq : if(R[rs]==R[rt]) PC=PC+1+BranchAddr

Addi: R[rt] = R[rs] + SignExtImm

J :  PC=JumpAddr

Jal : R[7]=PC+2;PC=JumpAddr

SLTI: R[rt] = 1 if R[rs] < imm else 0

SignExtImm = { 9{immediate[6]}, immediate }

JumpAddr = { (PC+1)[15:13], address }

BranchAddr = { 8{immediate[6]}, immediate, 1'b0 }
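The immediate formulas above can be checked with a few lines of Verilog. This is only a minimal sketch, not part of the original design: the module name `imm_ext_sketch` and its port names are hypothetical, and it assumes the 7-bit immediate sits in `instr[6:0]`, as the top-level module later in this article does.

```verilog
// Hypothetical helper (not in the original article): forms the
// sign-extended immediate and the shifted branch offset from a
// 16-bit instruction word. ASSUMPTION: the immediate is instr[6:0].
module imm_ext_sketch (
    input  [15:0] instr,
    output [15:0] sign_ext_imm,   // immediate sign-extended to 16 bits
    output [15:0] branch_offset   // sign-extended immediate shifted left by 1
);
    // Replicate the immediate's sign bit (bit 6) nine times.
    assign sign_ext_imm  = {{9{instr[6]}}, instr[6:0]};
    // Shift left by one bit, dropping the top sign bit.
    assign branch_offset = {sign_ext_imm[14:0], 1'b0};
endmodule
```

For example, an immediate of `7'b1111111` (that is, -1) gives `sign_ext_imm = 16'hFFFF` and `branch_offset = 16'hFFFE`, while `7'b0000011` gives `16'h0003` and `16'h0006`.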

CPU datapath and control path

[Figure: CPU datapath and control path diagrams]

    // Submodule: Data memory in Verilog
    module data_memory (
        input         clk,
        // address input, shared by read and write port
        input  [15:0] mem_access_addr,
        // write port
        input  [15:0] mem_write_data,
        input         mem_write_en,
        input         mem_read,
        // read port
        output [15:0] mem_read_data
    );
        integer i;
        reg [15:0] ram [255:0];
        // word address: bit 0 of the byte address is dropped
        wire [7:0] ram_addr = mem_access_addr[8:1];

        initial begin
            for (i = 0; i < 256; i = i + 1)
                ram[i] <= 16'd0;
        end

        // synchronous write
        always @(posedge clk) begin
            if (mem_write_en)
                ram[ram_addr] <= mem_write_data;
        end

        // asynchronous read, gated by mem_read
        assign mem_read_data = (mem_read == 1'b1) ? ram[ram_addr] : 16'd0;
    endmodule

Verilog code for ALU Control unit:

    // Submodule: ALU Control Unit in Verilog
    module ALUControl (ALU_Control, ALUOp, Function);
        output reg [2:0] ALU_Control;
        input      [1:0] ALUOp;
        input      [3:0] Function;
        wire       [5:0] ALUControlIn;

        assign ALUControlIn = {ALUOp, Function};

        always @(ALUControlIn)
            casex (ALUControlIn)
                6'b11xxxx: ALU_Control = 3'b000;
                6'b10xxxx: ALU_Control = 3'b100;
                6'b01xxxx: ALU_Control = 3'b001;
                6'b000000: ALU_Control = 3'b000;
                6'b000001: ALU_Control = 3'b001;
                6'b000010: ALU_Control = 3'b010;
                6'b000011: ALU_Control = 3'b011;
                6'b000100: ALU_Control = 3'b100;
                default:   ALU_Control = 3'b000;
            endcase
    endmodule

Verilog code for JR control unit:

    module JR_Control (
        input  [1:0] alu_op,
        input  [3:0] funct,
        output       JRControl
    );
        // jr is the R-type instruction (alu_op = 00) with funct = 1000
        assign JRControl = ({alu_op, funct} == 6'b001000) ? 1'b1 : 1'b0;
    endmodule

Verilog code for control unit:

    // Submodule: Control Unit in Verilog
    module control (
        input      [2:0] opcode,
        input            reset,
        output reg [1:0] reg_dst, mem_to_reg, alu_op,
        output reg       jump, branch, mem_read, mem_write, alu_src, reg_write, sign_or_zero
    );
        always @(*) begin
            if (reset == 1'b1) begin
                reg_dst = 2'b00; mem_to_reg = 2'b00; alu_op = 2'b00;
                jump = 1'b0; branch = 1'b0; mem_read = 1'b0; mem_write = 1'b0;
                alu_src = 1'b0; reg_write = 1'b0; sign_or_zero = 1'b1;
            end else begin
                case (opcode)
                    3'b000: begin // R-type (add and the other funct-decoded instructions)
                        reg_dst = 2'b01; mem_to_reg = 2'b00; alu_op = 2'b00;
                        jump = 1'b0; branch = 1'b0; mem_read = 1'b0; mem_write = 1'b0;
                        alu_src = 1'b0; reg_write = 1'b1; sign_or_zero = 1'b1;
                    end
                    3'b001: begin // slti
                        reg_dst = 2'b00; mem_to_reg = 2'b00; alu_op = 2'b10;
                        jump = 1'b0; branch = 1'b0; mem_read = 1'b0; mem_write = 1'b0;
                        alu_src = 1'b1; reg_write = 1'b1; sign_or_zero = 1'b0;
                    end
                    3'b010: begin // j
                        reg_dst = 2'b00; mem_to_reg = 2'b00; alu_op = 2'b00;
                        jump = 1'b1; branch = 1'b0; mem_read = 1'b0; mem_write = 1'b0;
                        alu_src = 1'b0; reg_write = 1'b0; sign_or_zero = 1'b1;
                    end
                    3'b011: begin // jal
                        reg_dst = 2'b10; mem_to_reg = 2'b10; alu_op = 2'b00;
                        jump = 1'b1; branch = 1'b0; mem_read = 1'b0; mem_write = 1'b0;
                        alu_src = 1'b0; reg_write = 1'b1; sign_or_zero = 1'b1;
                    end
                    3'b100: begin // lw
                        reg_dst = 2'b00; mem_to_reg = 2'b01; alu_op = 2'b11;
                        jump = 1'b0; branch = 1'b0; mem_read = 1'b1; mem_write = 1'b0;
                        alu_src = 1'b1; reg_write = 1'b1; sign_or_zero = 1'b1;
                    end
                    3'b101: begin // sw
                        reg_dst = 2'b00; mem_to_reg = 2'b00; alu_op = 2'b11;
                        jump = 1'b0; branch = 1'b0; mem_read = 1'b0; mem_write = 1'b1;
                        alu_src = 1'b1; reg_write = 1'b0; sign_or_zero = 1'b1;
                    end
                    3'b110: begin // beq
                        reg_dst = 2'b00; mem_to_reg = 2'b00; alu_op = 2'b01;
                        jump = 1'b0; branch = 1'b1; mem_read = 1'b0; mem_write = 1'b0;
                        alu_src = 1'b0; reg_write = 1'b0; sign_or_zero = 1'b1;
                    end
                    3'b111: begin // addi
                        reg_dst = 2'b00; mem_to_reg = 2'b00; alu_op = 2'b11;
                        jump = 1'b0; branch = 1'b0; mem_read = 1'b0; mem_write = 1'b0;
                        alu_src = 1'b1; reg_write = 1'b1; sign_or_zero = 1'b1;
                    end
                    default: begin
                        reg_dst = 2'b01; mem_to_reg = 2'b00; alu_op = 2'b00;
                        jump = 1'b0; branch = 1'b0; mem_read = 1'b0; mem_write = 1'b0;
                        alu_src = 1'b0; reg_write = 1'b1; sign_or_zero = 1'b1;
                    end
                endcase
            end
        end
    endmodule
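The top-level processor instantiates an `alu` module with ports `a`, `b`, `alu_control`, `result`, and `zero`, but this article does not list its source. The following is a minimal sketch of what such a module could look like. The operation encoding (000 add, 001 subtract, 010 AND, 011 OR, 100 set-on-less-than) is an assumption inferred from the ALU control unit above and the order of the R-type instructions. The signed comparison for SLT is likewise an assumption.

```verilog
// Sketch of the ALU instantiated by mips_16 (source not shown in the article).
// ASSUMPTION: the alu_control encoding below is inferred, not confirmed.
module alu (
    input      [15:0] a,            // first operand (register rs)
    input      [15:0] b,            // second operand (register rt or immediate)
    input      [2:0]  alu_control,  // operation select from ALUControl
    output reg [15:0] result,
    output            zero          // high when result is all zeros (used by beq)
);
    always @(*) begin
        case (alu_control)
            3'b000:  result = a + b;   // add, lw, sw, addi
            3'b001:  result = a - b;   // sub, beq comparison
            3'b010:  result = a & b;   // and
            3'b011:  result = a | b;   // or
            // ASSUMPTION: signed compare, as in standard MIPS slt
            3'b100:  result = ($signed(a) < $signed(b)) ? 16'd1 : 16'd0;
            default: result = a + b;
        endcase
    end

    assign zero = (result == 16'd0);
endmodule
```

Keeping `zero` as a continuous assignment on `result` means the branch decision for `beq` falls out of the subtract path without extra comparison logic.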

Verilog code for the single-cycle MIPS processor:

    module mips_16 (
        input         clk, reset,
        output [15:0] pc_out, alu_result
        //, reg3, reg4
    );
        reg  [15:0] pc_current;
        wire signed [15:0] pc_next, pc2;
        wire [15:0] instr;
        wire [1:0]  reg_dst, mem_to_reg, alu_op;
        wire        jump, branch, mem_read, mem_write, alu_src, reg_write;
        wire [2:0]  reg_write_dest;
        wire [15:0] reg_write_data;
        wire [2:0]  reg_read_addr_1;
        wire [15:0] reg_read_data_1;
        wire [2:0]  reg_read_addr_2;
        wire [15:0] reg_read_data_2;
        wire [15:0] sign_ext_im, read_data2, zero_ext_im, imm_ext;
        wire        JRControl;
        wire [2:0]  ALU_Control;
        wire [15:0] ALU_out;
        wire        zero_flag;
        wire signed [15:0] im_shift_1, PC_j, PC_beq, PC_4beq, PC_4beqj, PC_jr;
        wire        beq_control;
        wire [14:0] jump_shift_1;
        wire [15:0] mem_read_data;
        wire [15:0] no_sign_ext;
        wire        sign_or_zero;

        // PC
        always @(posedge clk or posedge reset) begin
            if (reset)
                pc_current <= 16'd0;
            else
                pc_current <= pc_next;
        end

        // PC + 2
        assign pc2 = pc_current + 16'd2;

        // instruction memory
        instr_mem instrucion_memory (.pc(pc_current), .instruction(instr));

        // jump shift left 1
        assign jump_shift_1 = {instr[13:0], 1'b0};

        // control unit
        control control_unit (
            .reset(reset), .opcode(instr[15:13]), .reg_dst(reg_dst),
            .mem_to_reg(mem_to_reg), .alu_op(alu_op), .jump(jump), .branch(branch),
            .mem_read(mem_read), .mem_write(mem_write), .alu_src(alu_src),
            .reg_write(reg_write), .sign_or_zero(sign_or_zero));

        // multiplexer regdest
        assign reg_write_dest = (reg_dst == 2'b10) ? 3'b111 :
                                ((reg_dst == 2'b01) ? instr[6:4] : instr[9:7]);

        // register file
        assign reg_read_addr_1 = instr[12:10];
        assign reg_read_addr_2 = instr[9:7];
        register_file reg_file (
            .clk(clk), .rst(reset), .reg_write_en(reg_write),
            .reg_write_dest(reg_write_dest),
            .reg_write_data(reg_write_data),
            .reg_read_addr_1(reg_read_addr_1),
            .reg_read_data_1(reg_read_data_1),
            .reg_read_addr_2(reg_read_addr_2),
            .reg_read_data_2(reg_read_data_2));
            //.reg3(reg3),
            //.reg4(reg4));

        // sign extend
        assign sign_ext_im = {{9{instr[6]}}, instr[6:0]};
        assign zero_ext_im = {{9{1'b0}}, instr[6:0]};
        assign imm_ext     = (sign_or_zero == 1'b1) ? sign_ext_im : zero_ext_im;

        // JR control
        JR_Control JRControl_unit (.alu_op(alu_op), .funct(instr[3:0]), .JRControl(JRControl));

        // ALU control unit
        ALUControl ALU_Control_unit (.ALUOp(alu_op), .Function(instr[3:0]), .ALU_Control(ALU_Control));

        // multiplexer alu_src
        assign read_data2 = (alu_src == 1'b1) ? imm_ext : reg_read_data_2;

        // ALU
        alu alu_unit (.a(reg_read_data_1), .b(read_data2), .alu_control(ALU_Control),
                      .result(ALU_out), .zero(zero_flag));

        // immediate shift 1
        assign im_shift_1 = {imm_ext[14:0], 1'b0};
        //
        assign no_sign_ext = ~(im_shift_1) + 1'b1;

        // PC beq add
        assign PC_beq = (im_shift_1[15] == 1'b1) ? (pc2 - no_sign_ext) : (pc2 + im_shift_1);

        // beq control
        assign beq_control = branch & zero_flag;

        // PC_beq
        assign PC_4beq = (beq_control == 1'b1) ? PC_beq : pc2;

        // PC_j
        assign PC_j = {pc2[15], jump_shift_1};

        // PC_4beqj
        assign PC_4beqj = (jump == 1'b1) ? PC_j : PC_4beq;

        // PC_jr
        assign PC_jr = reg_read_data_1;

        // PC_next
        assign pc_next = (JRControl == 1'b1) ? PC_jr : PC_4beqj;

        // data memory
        data_memory datamem (
            .clk(clk), .mem_access_addr(ALU_out),
            .mem_write_data(reg_read_data_2), .mem_write_en(mem_write),
            .mem_read(mem_read), .mem_read_data(mem_read_data));

        // write back
        assign reg_write_data = (mem_to_reg == 2'b10) ? pc2 :
                                ((mem_to_reg == 2'b01) ? mem_read_data : ALU_out);

        // output
        assign pc_out     = pc_current;
        assign alu_result = ALU_out;
    endmodule
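The processor above also instantiates a `register_file` module whose source is not listed in this article. The following is a minimal sketch that matches the port names used in the instantiation. It assumes eight 16-bit registers (the address ports are 3 bits wide), synchronous writes, combinational reads, and register 0 hard-wired to zero as in classic MIPS; none of these choices is confirmed by the article.

```verilog
// Sketch of the register file instantiated by mips_16 (not listed in the article).
// ASSUMPTIONS: 8 x 16-bit registers, write on the rising clock edge,
// combinational read, and register 0 always reads as zero.
module register_file (
    input         clk,
    input         rst,
    // write port
    input         reg_write_en,
    input  [2:0]  reg_write_dest,
    input  [15:0] reg_write_data,
    // read port 1
    input  [2:0]  reg_read_addr_1,
    output [15:0] reg_read_data_1,
    // read port 2
    input  [2:0]  reg_read_addr_2,
    output [15:0] reg_read_data_2
);
    reg [15:0] reg_array [7:0];
    integer i;

    always @(posedge clk or posedge rst) begin
        if (rst) begin
            // clear every register on reset
            for (i = 0; i < 8; i = i + 1)
                reg_array[i] <= 16'd0;
        end else if (reg_write_en) begin
            reg_array[reg_write_dest] <= reg_write_data;
        end
    end

    // reads of register 0 return zero regardless of what was written
    assign reg_read_data_1 = (reg_read_addr_1 == 3'd0) ? 16'd0 : reg_array[reg_read_addr_1];
    assign reg_read_data_2 = (reg_read_addr_2 == 3'd0) ? 16'd0 : reg_array[reg_read_addr_2];
endmodule
```

Because the reads are combinational and the write is clocked, an instruction sees the register values from before its own write-back, which is what a single-cycle datapath needs.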

Verilog testbench code for the single-cycle MIPS processor:

    `timescale 1ns / 1ps
    // Verilog project: Verilog code for 16-bit MIPS Processor
    // Testbench Verilog code for 16 bit single cycle MIPS CPU
    module tb_mips16;
        // Inputs
        reg clk;
        reg reset;
        // Outputs
        wire [15:0] pc_out;
        wire [15:0] alu_result; //,reg3,reg4;

        // Instantiate the Unit Under Test (UUT)
        mips_16 uut (
            .clk(clk),
            .reset(reset),
            .pc_out(pc_out),
            .alu_result(alu_result)
            //.reg3(reg3),
            //.reg4(reg4)
        );

        // 20 ns clock period
        initial begin
            clk = 0;
            forever #10 clk = ~clk;
        end

        initial begin
            // Initialize Inputs
            //$monitor ("register 3=%d, register 4=%d", reg3, reg4);
            reset = 1;
            // Wait 100 ns for global reset to finish
            #100;
            reset = 0;
            // Add stimulus here
        end
    endmodule
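The testbench only toggles the clock and reset, so what the processor actually executes depends on the `instr_mem` module, which this article does not list. The following is a minimal sketch of such a ROM. The depth of 16 words, the use of `pc[4:1]` as the word address (the PC advances by 2), and the four-instruction demo program are all assumptions made for illustration. The instruction words are hand-assembled from the field positions used in `mips_16`: opcode in `instr[15:13]`, rs in `instr[12:10]`, rt in `instr[9:7]`, rd in `instr[6:4]`, and funct in `instr[3:0]`.

```verilog
// Sketch of an instruction ROM for mips_16 (not listed in the article).
// ASSUMPTIONS: 16 words deep, word address = pc[4:1], and a hypothetical
// hand-assembled demo program.
module instr_mem (
    input  [15:0] pc,
    output [15:0] instruction
);
    reg [15:0] rom [15:0];
    integer i;

    initial begin
        // unused words hold all zeros
        for (i = 0; i < 16; i = i + 1)
            rom[i] = 16'd0;
        rom[0] = 16'b111_000_001_0000101;   // addi R1, R0, 5
        rom[1] = 16'b111_000_010_0000011;   // addi R2, R0, 3
        rom[2] = 16'b000_001_010_011_0000;  // add  R3, R1, R2
        rom[3] = 16'b010_0000000000011;     // j    byte address 6 (loops on itself)
    end

    // addresses past the end of the ROM return an all-zero word
    assign instruction = (pc < 16'd32) ? rom[pc[4:1]] : 16'd0;
endmodule
```

With an ALU that adds for `addi` and `add`, and a register file in which R0 reads as zero, `alu_result` would be expected to step through 5, 3, and 8 on the first three cycles after reset is released.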


Reference:

https://www.fpga4student.com/2017/01/verilog-code-for-single-cycle-MIPS-processor.html

Reposted from: 全栈芯片工程师
