Simulating CAPI Designs with PSLSE and Vivado

Following up on the Hello AFU tutorial, this post covers the process of simulating that design in Vivado’s xsim.

Setting up PSLSE

Assuming readers may not have set up PSLSE before, I will start by cloning it again and building it with support for Vivado.

First, I’ll clone the repo and enter its directory.

kwilke@kbawx:~/projects$ git clone
Cloning into 'pslse'...
remote: Counting objects: 2789, done.
remote: Compressing objects: 100% (11/11), done.
remote: Total 2789 (delta 0), reused 0 (delta 0), pack-reused 2778
Receiving objects: 100% (2789/2789), 954.75 KiB | 0 bytes/s, done.
Resolving deltas: 100% (1952/1952), done.
Checking connectivity... done.

Next, I’ll set VPI_USER_H_DIR to point to the xsim include directory from my local Vivado installation and build afu_driver, pslse and libcxl from the PSLSE repo.

kwilke@kbawx:~/projects/pslse$ export VPI_USER_H_DIR=/opt/Xilinx/Vivado/2015.4/data/xsim/include
kwilke@kbawx:~/projects/pslse$ cd afu_driver/src
kwilke@kbawx:~/projects/pslse/afu_driver/src$ make
 [CC]    afu_driver.o
kwilke@kbawx:~/projects/pslse/afu_driver/src$ cd ../../pslse/
kwilke@kbawx:~/projects/pslse/pslse$ make
 [CC]    shim_host.o
 [CC]    pslse
kwilke@kbawx:~/projects/pslse/pslse$ cd ../libcxl/
kwilke@kbawx:~/projects/pslse/libcxl$ make
 [CC]    libcxl.o
 [AR]    libcxl.a 
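
The three builds above can be folded into one small shell function. This is only a sketch under assumptions: build_pslse and the MAKE override are hypothetical names that do not come from the PSLSE repo, and it assumes the directory layout shown in the transcript (afu_driver/src, pslse, libcxl).

```shell
# Sketch: build afu_driver, pslse and libcxl in the order used above.
# build_pslse and the MAKE override are illustrative names, not part of PSLSE.
build_pslse() {
  local root="$1"   # path to the cloned pslse repo
  local dir
  # The post exports VPI_USER_H_DIR before building, so check it first.
  if [ -z "${VPI_USER_H_DIR:-}" ] || [ ! -d "$VPI_USER_H_DIR" ]; then
    echo "VPI_USER_H_DIR is unset or not a directory" >&2
    return 1
  fi
  for dir in afu_driver/src pslse libcxl; do
    echo "Building $dir"
    # Run make in a subshell so the caller's working directory is untouched.
    ( cd "$root/$dir" && "${MAKE:-make}" ) || return 1
  done
}
```

With VPI_USER_H_DIR exported as shown earlier, this would be called as build_pslse ~/projects/pslse. It stops at the first component that fails to build.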

Next I’ll add symlinks to the psl_interface headers in my pslse/afu_driver/src directory to ease compiling the library with Vivado’s xsc.

kwilke@kbawx:~/projects/pslse/libcxl$ cd ../afu_driver/src/
kwilke@kbawx:~/projects/pslse/afu_driver/src$ ln -s ../../common/psl_interface.h .
kwilke@kbawx:~/projects/pslse/afu_driver/src$ ln -s ../../common/psl_interface_t.h .

Now PSLSE should be ready for us.

Compiling the DPI library and AFU

The next step is to enter the Hello AFU directory and build the AFU driver for Vivado.

kwilke@kbawx:~/projects/hello-afu$ xsc ~/projects/pslse/afu_driver/src/afu_driver.c
Multi-threading is on. Using 2 slave threads.
Running compilation flow
Done compilation
Done linking: "/home/kwilke/projects/hello-afu/xsim.dir/xsc/"

I’ll create a symlink to the into my Hello AFU directory.

kwilke@kbawx:~/projects/hello-afu$ ln -s ~/projects/pslse/afu_driver/src/ .

Next I’ll use xvlog to build the AFU code.

kwilke@kbawx:~/projects/hello-afu$ xvlog --sv *.sv *.v
INFO: [VRFC 10-2263] Analyzing SystemVerilog file "/home/kwilke/projects/hello-afu/" into library work
INFO: [VRFC 10-311] analyzing module afu
INFO: [VRFC 10-2263] Analyzing SystemVerilog file "/home/kwilke/projects/hello-afu/" into library work
INFO: [VRFC 10-443] port direction not specified for function/task, assuming input [/home/kwilke/projects/hello-afu/]
INFO: [VRFC 10-2263] Analyzing SystemVerilog file "/home/kwilke/projects/hello-afu/" into library work
INFO: [VRFC 10-311] analyzing module mmio
INFO: [VRFC 10-2263] Analyzing SystemVerilog file "/home/kwilke/projects/hello-afu/" into library work
INFO: [VRFC 10-311] analyzing module parity_afu
INFO: [VRFC 10-2263] Analyzing SystemVerilog file "/home/kwilke/projects/hello-afu/" into library work
INFO: [VRFC 10-443] port direction not specified for function/task, assuming input [/home/kwilke/projects/hello-afu/]
INFO: [VRFC 10-311] analyzing module parity_workelement
INFO: [VRFC 10-2263] Analyzing SystemVerilog file "/home/kwilke/projects/hello-afu/" into library work
INFO: [VRFC 10-311] analyzing module shift_register
INFO: [VRFC 10-2263] Analyzing SystemVerilog file "/home/kwilke/projects/hello-afu/top.v" into library work
INFO: [VRFC 10-311] analyzing module top

Finally, I’ll use xelab to elaborate the design.

kwilke@kbawx:~/projects/hello-afu$ xelab -timescale 1ns/1ps -svlog ~/projects/pslse/afu_driver/verilog/top.v -sv_root . -sv_lib libdpi -debug all
Vivado Simulator 2015.4
Copyright 1986-1999, 2001-2015 Xilinx, Inc. All Rights Reserved.
Running: /opt/Xilinx/Vivado/2015.4/bin/unwrapped/lnx64.o/xelab -timescale 1ns/1ps -svlog /home/kwilke/projects/pslse/afu_driver/verilog/top.v -sv_root . -sv_lib libdpi -debug all 
Multi-threading is on. Using 2 slave threads.
INFO: [VRFC 10-2263] Analyzing SystemVerilog file "/home/kwilke/projects/pslse/afu_driver/verilog/top.v" into library work
INFO: [VRFC 10-311] analyzing module top
Starting static elaboration
Completed static elaboration
Starting simulation data flow analysis
Completed simulation data flow analysis
Time Resolution for simulation is 1ps
Compiling module work.shift_register
Compiling module work.shift_register(width=64)
Compiling module work.mmio
Compiling module work.shift_register(width=512)
Compiling module work.parity_workelement
Compiling module work.parity_afu
Compiling module work.afu
Compiling module
Compiling package work.CAPI
Compiling package work.$unit_1
Built simulation snapshot

We should now be in a good state to begin simulation.

Running the simulation

To kick off the simulation, I’ll point xsim to the project.

kwilke@kbawx:~/projects/hello-afu$ xsim -g

****** xsim v2015.4 (64-bit)
  **** SW Build 1412921 on Wed Nov 18 09:44:32 MST 2015
  **** IP Build 1412160 on Tue Nov 17 13:47:24 MST 2015
    ** Copyright 1986-2015 Xilinx, Inc. All Rights Reserved.


At this point the Vivado tools will come up and enter simulation for our project.

Vivado simulation window

Hitting the Run All button in the toolbar (it looks like a Play button with a square wave) starts the simulation, which then waits for a connection from PSLSE.

Simulation Console Output

I can now kick off PSLSE and it’ll connect to xsim and wait for a connection from my application.

Starting PSLSE with Vivado

At this time the master branch of PSLSE isn’t building, so I’ve pointed my application to the library of a previous build and it’s worked just fine.

Hello simulation in Vivado

Hooray! I now have my system set up to simulate CAPI designs in Vivado! I hope this post is useful for others working to get their CAPI designs simulated in Vivado. If you have any questions or comments, please pass them my way!

Peeking into the Alpha-Data KU3

At this point in my digital design adventure, I wanted to get my feet wet learning how to debug on live hardware. My goal was simple: I wanted to interactively write to and read from a single register in an implemented design. This post documents the process that can be followed to replicate my end results with Vivado and an Alpha-Data KU3 card.

Starting fresh

I like to start learning new things with a fairly bare-bones approach. So my first step here is to create a new project in Vivado. I named my project ku3-peeking, chose RTL Project for my project type and picked the FPGA part that’s on the KU3, xcku060-ffva1156-2-e.

New Project

Adding some source

Next I’ll add a couple of SystemVerilog files: one for my top-level source and one for the module that I’ll be peeking and poking at through the debugger.

My test module is pretty simple, just a positive-edge-triggered register.

`timescale 1ns / 1ps

module testmodule (
  input clk,
  input data_in,
  output logic data_out
);

  // Single flip-flop: data_out follows data_in on each rising clock edge.
  always_ff @ (posedge clk) begin
    data_out <= data_in;
  end

endmodule

My top module initially will just instantiate my testmodule and a few wires to eventually hook everything up.

`timescale 1ns / 1ps

module top ();

  wire clk;
  wire data_in;
  wire data_out;

  testmodule my_test (
    .clk(clk),
    .data_in(data_in),
    .data_out(data_out)
  );

endmodule

By default Vivado will look for the ‘top’ module when building the project, so this is good for now.

Adding a debug core

With the basic design in place, you can pull up the IP Catalog and select from a few different debug cores. For my goal, the VIO core is nice as it’ll let me read and drive signals.

VIO in catalog

Double clicking on the VIO core will bring up a prompt of options that can be set for the debug core.

VIO Customization Wizard

All of these defaults are fine for my case of a 1-bit register, but you can easily add more input and output probes. After hitting OK here you’ll get a prompt for starting an out-of-context synthesis run, which will synthesize the VIO core while you continue to work in Vivado.

Within the IP Sources tab, you can drill down into the hierarchy and find templates to instantiate this new core in your design files.

VIO Instance Template

So I’ll modify my top file to add a new block that wires the VIO core to the clock and my testmodule instance.

  vio_0 my_test_vio (

Clocking in

For me, pulling in the clock for a legit hardware design was the most difficult part to figure out. I spelunked documentation and did much Googling, but eventually I reached out to some more experienced folks that helped lead me in the right direction.

The first step in this process is to check the user manual for the device at hand. In this case I’m looking to use the Fabric Clocks described in section 3.2.2 of Alpha Data’s ADM-PCIE-KU3 User Manual. The relevant bits here say there are two available fabric clocks, a 200MHz and a 250MHz clock. It lets me know that these pins use the LVDS I/O standard and which pins are available for each clock. It also notes that I must add a constraint setting DIFF_TERM_ADV to TERM_100, because these clocks are required to be terminated within the FPGA. I don’t fully comprehend what that means quite yet but the doc tells me to do it so I oblige.

To use the LVDS structures built into the FPGA, I need to instantiate a module that can take the LVDS input signals and provide a simple clock output. The module for this is called a Differential Input Buffer; it can be found in Vivado’s Language Templates window.

Differential Input Buffer Template

I’ll copy the template into my top module file and clean it up. I’ll also add clk_p and clk_n as inputs to the top module so I can route them into the IBUFDS instance.

module top (
  input clk_p,
  input clk_n

  wire clk;
  wire data_in;
  wire data_out;

  ) IBUFDS_inst (

Since I’m using this clock to drive multiple blocks, it’s best practice to use a General Clock Buffer (BUFG) so that Vivado will choose an appropriate buffer for this FPGA and minimize clock skew. This is found in the Language Templates as well, under Verilog->Device Primitive Instantiation->Kintex UltraScale->CLOCK->BUFFER->General Clock Buffer (BUFG).

I’ll add this to my top module as well, routing the output of my IBUFDS buffer to the BUFG. The output of BUFG will be the clock signal used by my module and by the VIO core.

module top (
  input clk_p,
  input clk_n

  wire clk_int;
  wire clk;
  wire data_in;
  wire data_out;

  ) IBUFDS_inst (

  BUFG BUFG_inst (

At this point I can run Synthesis and inspect the design. Here’s the schematic view that shows the big picture.


Selecting the clock pins

From the schematic view I can click on the clk_p pin and it’ll select that pin from the I/O Ports view below the schematic in the IDE. From there, I can set the Site, I/O Std and DIFF_TERM_ADV as described in the KU3 User Manual. In this case I’m choosing to use the 250MHz Fabric Clock.

IO Port Planning

After making these changes I used CTRL+S to save my settings, and Vivado prompted me to save a .xdc constraints file which I named myconstraints.xdc.

The generated file has the contents below. Some lines come from the selections I made in the IDE, and some were added by Vivado for the VIO core I instantiated.

set_property PACKAGE_PIN AA24 [get_ports clk_p]
set_property IOSTANDARD LVDS [get_ports clk_p]
set_property DIFF_TERM_ADV TERM_100 [get_ports clk_p]
set_property C_CLK_INPUT_FREQ_HZ 300000000 [get_debug_cores dbg_hub]
set_property C_ENABLE_CLK_DIVIDER false [get_debug_cores dbg_hub]
set_property C_USER_SCAN_CHAIN 1 [get_debug_cores dbg_hub]
connect_debug_port dbg_hub/clk [get_nets clk]

With that in place I can re-run synthesis, run implementation and generate a bitstream for my KU3.

Deploying and testing

With the bitstream generated, I can open the Hardware Manager. After connecting to my device I can use the Program Device option, which auto-populates with my bitfile and debug probes file. Then I hit Program and wait for the magic to happen.

Program Device

Once programmed, if all works well Vivado will automatically open a VIO dashboard window. In that window you can hit the green + to add the input and output probes.

Adding Probes

With that open I can start poking at the data_in signal and watch the updates reflect in the data_out signal.

Peeking and poking signals


With all this set up, I have an easy means to do some interactive control with designs that are running in a live FPGA. I hope other Vivado newbies can follow this guide to help in their digital design adventures. If you follow this guide and run into any issues, reach out to me and I’ll try to help you out.

I’d like to thank JT Kellington, Kevin Irick and Mark Paluszkiewicz for offering their help and experience. I ran into many issues trying to hack my way through this and their assistance was extremely helpful in getting this up and running. Thank you!

Hello AFU on Alpha-Data KU3

Picking up on the Hello AFU project, I’ve recently gone through the motions of building it for an actual CAPI device and testing it out. This post documents the process I followed to build and deploy this on real hardware.


To complete this process you’ll need a few things:
* A POWER8-based machine; I’m using a Barreleye server
* An Alpha-Data KU3 card
* The latest HDK archive from Alpha Data’s support site, at this time that file is named
* A licensed version of Xilinx’s Vivado

Preparing files for the build

First off, we need to extract the HDK


By default, the HDK ships with some AFU source files in adku060_capi_1_1_release/Sources/afu/. We’ll jump in there, delete them, then copy over the SystemVerilog files from the hello-afu repository.

cd adku060_capi_1_1_release/Sources/afu/
rm *
cp ~/projects/hello-afu/*.sv .
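
One hazard in the commands above is that if the cd fails, rm * runs in whatever directory you happen to be in. Here is a hedged sketch that guards against that. The function name replace_afu_sources is hypothetical, and like the rm * above it assumes the afu directory holds only plain files.

```shell
# Sketch: replace the HDK's sample AFU sources with our own .sv files.
# replace_afu_sources is an illustrative helper, not part of the HDK.
replace_afu_sources() {
  local src_dir="$1"   # e.g. ~/projects/hello-afu
  local afu_dir="$2"   # e.g. adku060_capi_1_1_release/Sources/afu
  # Refuse to delete anything unless both directories really exist.
  [ -d "$src_dir" ] || { echo "no such source dir: $src_dir" >&2; return 1; }
  [ -d "$afu_dir" ] || { echo "no such AFU dir: $afu_dir" >&2; return 1; }
  # Make sure there is something to copy before removing the originals.
  ls "$src_dir"/*.sv >/dev/null 2>&1 || {
    echo "no .sv files in $src_dir" >&2
    return 1
  }
  # Explicit paths instead of cd + rm *, so a wrong directory is never wiped.
  rm -f -- "$afu_dir"/* || return 1
  cp -- "$src_dir"/*.sv "$afu_dir"/
}
```

From the directory holding the extracted HDK, the call would be replace_afu_sources ~/projects/hello-afu adku060_capi_1_1_release/Sources/afu.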

Next, open the project file adku060_capi_1_1_release/Sources/prj/psl_fpga.prj in a text editor to change a few lines. Remove all of the lines that start with verilog work, then add lines to reference the source files we copied into the afu directory. Some bash-fu for that:

cd ../prj
sed -i '/^verilog work/d' psl_fpga.prj
for i in `ls ../afu/*.sv | cut -d'/' -f3`; do echo "verilog work \"afu/$i\"" >> psl_fpga.prj; done
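
The loop above parses the output of ls, which works here but breaks on file names containing spaces. A slightly more defensive sketch of the same edit follows. The function name rewrite_prj is hypothetical, and the afu/ prefix is simply the one used in the loop above.

```shell
# Sketch: drop every "verilog work" line from the project file, then add one
# line per .sv file in the afu directory. rewrite_prj is an illustrative name.
rewrite_prj() {
  local prj="$1"       # e.g. psl_fpga.prj
  local afu_dir="$2"   # e.g. ../afu
  local f
  # Same deletion as above; -i.bak keeps a backup copy of the original file.
  sed -i.bak '/^verilog work/d' "$prj" || return 1
  # Iterate over the glob directly instead of parsing ls output, so each
  # file name is handled as a single entry.
  for f in "$afu_dir"/*.sv; do
    [ -e "$f" ] || continue   # the glob matched nothing
    printf 'verilog work "afu/%s"\n' "$(basename "$f")" >> "$prj"
  done
}
```

From the prj directory this would be run as rewrite_prj psl_fpga.prj ../afu, leaving the untouched original in psl_fpga.prj.bak.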

That should have us set up to build our AFU in lieu of the one that comes with the HDK!

Build and flash the binfile

With our files in the right spot and our project file modified, we just need to run a couple of the Tcl scripts in the HDK through Vivado.

vivado -mode batch -source psl_fpga.tcl -notrace
vivado -mode batch -source write_bitstream.tcl -notrace
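
Since the first step runs for a long time, it can help to chain the two commands so the second only runs if the first succeeds. This is a minimal sketch. The function name run_hdk_build and the VIVADO override variable are hypothetical, while the flags are exactly the ones used above.

```shell
# Sketch: run the two HDK Tcl scripts in order, stopping at the first failure.
# run_hdk_build and the VIVADO override are illustrative, not part of the HDK.
run_hdk_build() {
  local vivado="${VIVADO:-vivado}"
  local script
  for script in psl_fpga.tcl write_bitstream.tcl; do
    echo "== $script =="
    "$vivado" -mode batch -source "$script" -notrace || {
      echo "$script failed" >&2
      return 1
    }
  done
}
```

Run it from the same directory as the two commands above, for example as time run_hdk_build to see how long the whole build takes.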

The first run does the heavy lifting of synthesis, place and route, etc. The second command generates the actual binfile and bitfile that we can use to flash the device. The first command takes a significant amount of time on my i7-equipped laptop, about 40 minutes; the second completed in about 9 seconds. Maybe someday we’ll have a CAPI-based accelerator for synthesis and place & route! Now that the build is complete, I have my binfile at capi-adku060/psl_fpga_flash.bin

To flash this to a card that already has the PSL working, you can use the capi-flash-script utility. If your card is factory-fresh or in a bad state, you can use a JTAG programmer and Vivado’s Hardware Manager to flash directly from your laptop, or remotely via xvcserver.

Using the AFU

After I flashed my AFU, I ensured libcxl was set up on my server. Since I’m running Ubuntu 16.04 I simply installed it via apt.

apt-get install -y libcxl-dev

Next I rebooted the machine so that everything is nice and fresh. As part of the PCIe reset, the image stored in the KU3’s flash chip is loaded onto the FPGA. I can verify the card is in a good state because I have my cxl device at /dev/cxl/afu0.0d.
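
That check for the device node can be scripted, which is handy if you reflash often. This is a sketch only. wait_for_afu is a hypothetical helper, the default path is the one from this post, and the retry count is an arbitrary choice.

```shell
# Sketch: wait for a cxl AFU device node to show up after a reboot.
# wait_for_afu is an illustrative helper, not part of libcxl.
wait_for_afu() {
  local dev="${1:-/dev/cxl/afu0.0d}"   # device node to look for
  local tries="${2:-10}"               # number of one-second attempts
  local i=0
  while [ "$i" -lt "$tries" ]; do
    if [ -e "$dev" ]; then
      echo "found $dev"
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  echo "$dev did not appear" >&2
  return 1
}
```

Calling wait_for_afu with no arguments checks /dev/cxl/afu0.0d for up to ten seconds and returns non-zero if the device never shows up.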

I run my test_afu binary from the hello-afu project and boom! The same result as I get from simulation, woo-hoo!