Using CAPI within a Docker container

In this post I will cover the process I used to get a program that uses CAPI to run within a Docker container.

Base system config

The system I’m working on has been used in the past to deploy my Hello AFU example project, which I will get to run in this Docker container. The host system already has CXL support in the kernel, as well as my Hello AFU design flashed onto my Alpha-Data KU3 device.

I first tested to ensure my AFU and CAPI application were still running properly on the host system, then installed Docker using Ubuntu’s docker package from apt.
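For reference, a quick sanity check that the kernel’s cxl driver is in place is to look for the device nodes it creates; the AFU names (like afu0.0m) depend on your card and slot, so the path below is illustrative.

# Check that the cxl driver has created device nodes for the card
# (AFU names such as afu0.0m will vary with your card and slot)
ls -l /dev/cxl/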



Base Docker image for ppc64le

It’s been a while since I’ve used Docker, so I’m a bit rusty. I decided to use the ppc64le/ubuntu image from Docker Hub as my base image.

I started with a fairly minimal image, just to test that everything works as expected. It simply pulls the Docker Hub image and runs a few apt commands.

FROM ppc64le/ubuntu

RUN apt update -y
RUN apt upgrade -y
RUN apt install -y git build-essential

CMD ["/bin/echo", "hellllooooooo!"]

Next I build a new image tagged baseimage with docker build; the output below is truncated for brevity.

root@Barreleye-15:~/docker-capi-test# docker build -t baseimage .
Sending build context to Docker daemon 2.048 kB
Step 1 : FROM ppc64le/ubuntu
latest: Pulling from ppc64le/ubuntu
0847857e6401: Pull complete 
f8c18c152457: Pull complete 
8643975d001d: Pull complete 
d5802da4b3a0: Pull complete 
fe172ed92137: Pull complete 
Digest: sha256:5349f00594c719455f2c8e6f011b32758dcd326d8e225c737a55c15cf3d6948c
Status: Downloaded newer image for ppc64le/ubuntu:latest
 ---> 1967d889e07f
Step 2 : RUN apt update -y
 ---> Running in 8bb2d361d36c

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

Get:1 http://ports.ubuntu.com/ubuntu-ports xenial InRelease [247 kB]

[...]

Fetched 24.1 MB in 4s (4993 kB/s)
Reading package lists...
Building dependency tree...
Reading state information...
25 packages can be upgraded. Run 'apt list --upgradable' to see them.
 ---> e6e3775a3cd9
Removing intermediate container 8bb2d361d36c
Step 3 : RUN apt upgrade -y
 ---> Running in b617cbfa00f8

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

Reading package lists...
Building dependency tree...
Reading state information...
Calculating upgrade...
The following packages will be upgraded:
  apt base-files bsdutils gcc-5-base libapt-pkg5.0 libblkid1 libc-bin libc6

[...]

Processing triggers for libc-bin (2.23-0ubuntu5) ...
 ---> 6cc12a86896f
Removing intermediate container b617cbfa00f8
Step 4 : RUN apt install -y git build-essential
 ---> Running in c42e2e2f5e52

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

Reading package lists...
Building dependency tree...
Reading state information...
The following additional packages will be installed:
  binutils bzip2 ca-certificates cpp cpp-5 dpkg-dev fakeroot g++ g++-5 gcc

[...]

173 added, 0 removed; done.
Running hooks in /etc/ca-certificates/update.d...
done.
 ---> a3c8b863af74
Removing intermediate container c42e2e2f5e52
Step 5 : CMD /bin/bash echo hellllooooooo!
 ---> Running in e320ed315285
 ---> 7f633abdf66d
Removing intermediate container e320ed315285
Successfully built 7f633abdf66d

And I test to see if my image is working:

root@Barreleye-15:~/docker-capi-test# docker run -t baseimage
hellllooooooo!

Huzzah!

Building my CAPI application into the image

Next I will extend my Dockerfile to pull down my hello-afu code and build it. First, I’ll add an additional apt call to pull in the necessary library and headers from the libcxl-dev package. Then I use git and make commands to build my application the same as I would anywhere else. I also set the command line to automatically execute the test_afu application when running the container.

FROM ppc64le/ubuntu

RUN apt update -y
RUN apt upgrade -y
RUN apt install -y git build-essential 
RUN apt install -y libcxl-dev

RUN git clone https://github.com/KennethWilke/hello-afu
RUN cd hello-afu && make

CMD ["/hello-afu/test_afu"]

I’ll build this image and tag it as hello-afu:

root@Barreleye-15:~/docker-capi-test# docker build -t hello-afu .
Sending build context to Docker daemon 2.048 kB
Step 1 : FROM ppc64le/ubuntu
 ---> 1967d889e07f
Step 2 : RUN apt update -y
 ---> Using cache
 ---> daf3a9437751
Step 3 : RUN apt upgrade -y
 ---> Using cache
 ---> 3882f8f83b78
Step 4 : RUN apt install -y git build-essential
 ---> Using cache
 ---> 2cf498f74f15
Step 5 : RUN apt install -y libcxl-dev
 ---> Running in dbeb87ec38fd

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

Reading package lists...
Building dependency tree...
Reading state information...
The following additional packages will be installed:
  libcxl1
The following NEW packages will be installed:
  libcxl-dev libcxl1
0 upgraded, 2 newly installed, 0 to remove and 0 not upgraded.
Need to get 58.5 kB of archives.
After this operation, 178 kB of additional disk space will be used.
Get:1 http://ports.ubuntu.com/ubuntu-ports xenial/universe ppc64el libcxl1 ppc64el 1.3-0ubuntu2 [12.6 kB]
Get:2 http://ports.ubuntu.com/ubuntu-ports xenial/universe ppc64el libcxl-dev ppc64el 1.3-0ubuntu2 [45.9 kB]
debconf: delaying package configuration, since apt-utils is not installed
Fetched 58.5 kB in 0s (124 kB/s)
Selecting previously unselected package libcxl1.
(Reading database ... 16348 files and directories currently installed.)
Preparing to unpack .../libcxl1_1.3-0ubuntu2_ppc64el.deb ...
Unpacking libcxl1 (1.3-0ubuntu2) ...
Selecting previously unselected package libcxl-dev.
Preparing to unpack .../libcxl-dev_1.3-0ubuntu2_ppc64el.deb ...
Unpacking libcxl-dev (1.3-0ubuntu2) ...
Processing triggers for libc-bin (2.23-0ubuntu5) ...
Setting up libcxl1 (1.3-0ubuntu2) ...
Setting up libcxl-dev (1.3-0ubuntu2) ...
Processing triggers for libc-bin (2.23-0ubuntu5) ...
 ---> db6ef39a11d3
Removing intermediate container dbeb87ec38fd
Step 6 : RUN git clone https://github.com/KennethWilke/hello-afu
 ---> Running in ca75aa96477f
Cloning into 'hello-afu'...
 ---> 4471e2887fad
Removing intermediate container ca75aa96477f
Step 7 : RUN cd hello-afu && make
 ---> Running in e9cd952ef56c
gcc -Wall -o test_afu test_afu.c -I ~/workprojects/pslse/libcxl -L ~/workprojects/pslse/libcxl -lcxl -lpthread 
test_afu.c: In function 'main':
test_afu.c:56:9: warning: format '%llu' expects argument of type 'long long unsigned int', but argument 2 has type '__u64 {aka long unsigned int}' [-Wformat=]
  printf("  example->size: %llu\n", example->size);
         ^
 ---> 75346e99c3a7
Removing intermediate container e9cd952ef56c
Step 8 : CMD /hello-afu/test_afu
 ---> Running in 53c0cb116657
 ---> dd2ea88f3a11
Removing intermediate container 53c0cb116657
Successfully built dd2ea88f3a11
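As an aside, that -Wformat warning from the gcc step is harmless, but if I wanted to quiet it, an explicit cast in test_afu.c would do the trick. A sketch, not something the repo does:

/* __u64 is unsigned long on ppc64le, so cast explicitly to match %llu */
printf("  example->size: %llu\n", (unsigned long long) example->size);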

And I’ll test out this new image:

root@Barreleye-15:~/docker-capi-test# docker run -t hello-afu
Failed to open AFU: No such file or directory

So far my Dockerfile seems to be set up properly, but the test_afu application is not finding the CAPI device, which is expected as it’s not yet within the container’s view of the world.
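A quick way to confirm that is to override the image’s CMD and look for the device from inside the container; this should report that no such path exists.

# Override the CMD to poke around inside the container;
# /dev/cxl shouldn't exist until we share it from the host
docker run -t hello-afu ls -l /dev/cxl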

Sharing the CXL device

In my first attempt, I tried to directly mount /dev/cxl from host to container. This resulted in a different error, this time regarding permissions:

root@Barreleye-15:~/docker-capi-test# docker run -v /dev/cxl/:/dev/cxl -t hello-afu
Failed to open AFU: Operation not permitted

After some documentation perusing, I found the --privileged flag, which allows this type of device sharing.

root@Barreleye-15:~/docker-capi-test# docker run -v /dev/cxl/:/dev/cxl --privileged -t hello-afu
[example structure
  example: 0x1002c4d0200
  example->size: 128
  example->stripe1: 0x1002c4d0300
  example->stripe2: 0x1002c4d0400
  example->parity: 0x1002c4d0580
  &(example->done): 0x1002c4d0220
Attached to AFU
Waiting for completion by AFU
PARITY:
That is some proper parity! This is exactly what I'm expecting to see. I'd also like to see this running on some real gear soon
Releasing AFU

Success! I now have an application in my Docker container that can interface with the host CAPI device. I hope this post proves helpful; please leave any feedback you may have in the comments!
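One footnote on --privileged: it grants the container quite broad access, and Docker’s --device flag may be a narrower way to expose an individual device node. I haven’t tested this with the CXL device, so treat the following as an unverified sketch:

# Untested alternative to --privileged: expose a specific AFU node
# (the device name here is illustrative and will vary by system)
docker run --device /dev/cxl/afu0.0m -t hello-afu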

Simulating CAPI Designs with PSLSE and Vivado

Following up on the Hello AFU tutorial, this post covers the process to simulate that design in Vivado’s xsim.

Setting up PSLSE

Assuming readers may not have set up PSLSE before, I will start by cloning it down again and building it with support for use with Vivado.

First, I’ll just clone down the repo and enter its directory.

kwilke@kbawx:~/projects$ git clone https://github.com/ibm-capi/pslse
Cloning into 'pslse'...
remote: Counting objects: 2789, done.
remote: Compressing objects: 100% (11/11), done.
remote: Total 2789 (delta 0), reused 0 (delta 0), pack-reused 2778
Receiving objects: 100% (2789/2789), 954.75 KiB | 0 bytes/s, done.
Resolving deltas: 100% (1952/1952), done.
Checking connectivity... done.

Next, I’ll set the VPI_USER_H_DIR environment variable to point to the xsim include directory of my local Vivado installation, then build the afu_driver, pslse and libcxl pieces from the PSLSE repo.

kwilke@kbawx:~/projects/pslse$ export VPI_USER_H_DIR=/opt/Xilinx/Vivado/2015.4/data/xsim/include
kwilke@kbawx:~/projects/pslse$ cd afu_driver/src
kwilke@kbawx:~/projects/pslse/afu_driver/src$ make
 [CC]    afu_driver.o
...
 [CC]    libdpi.so
kwilke@kbawx:~/projects/pslse/afu_driver/src$ cd ../../pslse/
kwilke@kbawx:~/projects/pslse/pslse$ make
 [CC]    shim_host.o
 ...
 [CC]    pslse
kwilke@kbawx:~/projects/pslse/pslse$ cd ../libcxl/
kwilke@kbawx:~/projects/pslse/libcxl$ make
 [CC]    libcxl.o
...
 [AR]    libcxl.a 

Next I’ll add some symlinks to the psl_interface headers in my pslse/afu_driver/src directory to ease the library compilation via Vivado’s xsc.

kwilke@kbawx:~/projects/pslse/libcxl$ cd ../afu_driver/src/
kwilke@kbawx:~/projects/pslse/afu_driver/src$ ln -s ../../common/psl_interface.h .
kwilke@kbawx:~/projects/pslse/afu_driver/src$ ln -s ../../common/psl_interface_t.h .

Now PSLSE should be ready for us.



Compiling the DPI library and AFU

The next step is to enter the Hello AFU directory and build the AFU driver for Vivado.

kwilke@kbawx:~/projects/hello-afu$ xsc ~/projects/pslse/afu_driver/src/afu_driver.c
Multi-threading is on. Using 2 slave threads.
Running compilation flow
Done compilation
Done linking: "/home/kwilke/projects/hello-afu/xsim.dir/xsc/dpi.so"

I’ll create a symlink to libdpi.so in my Hello AFU directory.

kwilke@kbawx:~/projects/hello-afu$ ln -s ~/projects/pslse/afu_driver/src/libdpi.so .

Next I’ll use xvlog to build the AFU code.

kwilke@kbawx:~/projects/hello-afu$ xvlog --sv *.sv *.v
INFO: [VRFC 10-2263] Analyzing SystemVerilog file "/home/kwilke/projects/hello-afu/afu.sv" into library work
INFO: [VRFC 10-311] analyzing module afu
INFO: [VRFC 10-2263] Analyzing SystemVerilog file "/home/kwilke/projects/hello-afu/capi.sv" into library work
INFO: [VRFC 10-443] port direction not specified for function/task, assuming input [/home/kwilke/projects/hello-afu/capi.sv:136]
INFO: [VRFC 10-2263] Analyzing SystemVerilog file "/home/kwilke/projects/hello-afu/mmio.sv" into library work
INFO: [VRFC 10-311] analyzing module mmio
INFO: [VRFC 10-2263] Analyzing SystemVerilog file "/home/kwilke/projects/hello-afu/parity_afu.sv" into library work
INFO: [VRFC 10-311] analyzing module parity_afu
INFO: [VRFC 10-2263] Analyzing SystemVerilog file "/home/kwilke/projects/hello-afu/parity_workelement.sv" into library work
INFO: [VRFC 10-443] port direction not specified for function/task, assuming input [/home/kwilke/projects/hello-afu/parity_workelement.sv:27]
INFO: [VRFC 10-311] analyzing module parity_workelement
INFO: [VRFC 10-2263] Analyzing SystemVerilog file "/home/kwilke/projects/hello-afu/shift_register.sv" into library work
INFO: [VRFC 10-311] analyzing module shift_register
INFO: [VRFC 10-2263] Analyzing SystemVerilog file "/home/kwilke/projects/hello-afu/top.v" into library work
INFO: [VRFC 10-311] analyzing module top

Finally, I’ll use xelab to elaborate the design.

kwilke@kbawx:~/projects/hello-afu$ xelab -timescale 1ns/1ps -svlog ~/projects/pslse/afu_driver/verilog/top.v -sv_root . -sv_lib libdpi -debug all
Vivado Simulator 2015.4
Copyright 1986-1999, 2001-2015 Xilinx, Inc. All Rights Reserved.
Running: /opt/Xilinx/Vivado/2015.4/bin/unwrapped/lnx64.o/xelab -timescale 1ns/1ps -svlog /home/kwilke/projects/pslse/afu_driver/verilog/top.v -sv_root . -sv_lib libdpi -debug all 
Multi-threading is on. Using 2 slave threads.
INFO: [VRFC 10-2263] Analyzing SystemVerilog file "/home/kwilke/projects/pslse/afu_driver/verilog/top.v" into library work
INFO: [VRFC 10-311] analyzing module top
Starting static elaboration
Completed static elaboration
Starting simulation data flow analysis
Completed simulation data flow analysis
Time Resolution for simulation is 1ps
Compiling module work.shift_register
Compiling module work.shift_register(width=64)
Compiling module work.mmio
Compiling module work.shift_register(width=512)
Compiling module work.parity_workelement
Compiling module work.parity_afu
Compiling module work.afu
Compiling module work.top
Compiling package work.CAPI
Compiling package work.$unit_1
Built simulation snapshot work.top

We should now be in a good state to begin simulation.

Running the simulation

To kick off the simulation, I’ll point xsim at the work.top snapshot.

kwilke@kbawx:~/projects/hello-afu$ xsim -g work.top

****** xsim v2015.4 (64-bit)
  **** SW Build 1412921 on Wed Nov 18 09:44:32 MST 2015
  **** IP Build 1412160 on Tue Nov 17 13:47:24 MST 2015
    ** Copyright 1986-2015 Xilinx, Inc. All Rights Reserved.

start_gui

At this point the Vivado tools will come up and enter simulation for our project.

Vivado simulation window

Hitting the Run All button at the top (it looks like a Play button with a square wave) will start the simulation, which then waits for a connection from PSLSE.

Simulation Console Output
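If you’d rather drive this from the Tcl console than the GUI, the equivalent command should simply be:

# In the xsim Tcl console; same effect as the Run All button
run all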

I can now kick off PSLSE and it’ll connect to xsim and wait for a connection from my application.

Starting PSLSE with Vivado
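For completeness, starting PSLSE amounts to running the pslse binary built earlier. As I recall, it reads its settings from pslse.parms and finds the simulator via the shim_host.dat file written when the AFU driver comes up, so check the PSLSE README if your layout differs:

# Run from the directory holding pslse.parms and shim_host.dat
cd ~/projects/pslse/pslse
./pslse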

At the time of writing, the master branch of PSLSE isn’t building libcxl.so, so I’ve pointed my application at the library from a previous build and it works just fine.
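One way to point the application at that library is the usual dynamic-linker override; the path below is wherever your older build’s libcxl.so lives, so adjust to taste.

# Tell the loader where to find the previously-built libcxl.so
export LD_LIBRARY_PATH=~/old-pslse-build/libcxl:$LD_LIBRARY_PATH
./test_afu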

Hello simulation in Vivado

Hooray! I now have my system set up to simulate CAPI designs in Vivado! I hope this post is useful for others working to get their CAPI designs simulated in Vivado; if you have any questions or comments, please pass them my way!

Peeking into the Alpha-Data KU3

At this point in my digital design adventure, I wanted to get my feet wet learning how to debug on live hardware. My goal was simple: to interactively write to and read from a single register in an implemented design. This post documents the process that can be followed to replicate my end results with Vivado and an Alpha-Data KU3 card.

Starting fresh

I like to start learning new things with a fairly bare-bones approach. So my first step here is to create a new project in Vivado. I named my project ku3-peeking, chose RTL Project for my project type and picked the FPGA part that’s on the KU3, xcku060-ffva1156-2-e.

New Project
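For those who prefer the Tcl console, the equivalent project creation is roughly this (the directory argument is up to you):

# Create an RTL project targeting the KU3's part
create_project ku3-peeking ./ku3-peeking -part xcku060-ffva1156-2-e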



Adding some source

Next I’ll add a couple of SystemVerilog files: one for my top-level source and one for the module that I’ll be peeking and poking at through the debugger.

My test module is pretty simple: just a positive-edge-triggered register.

`timescale 1ns / 1ps

module testmodule (
  input clk,
  input data_in,
  output logic data_out
);

  always_ff @ (posedge clk) begin
    data_out <= data_in;
  end

endmodule

Initially, my top module will just instantiate my testmodule, with a few wires to eventually hook everything up.

`timescale 1ns / 1ps

module top ();

  wire clk;
  wire data_in;
  wire data_out;

  testmodule my_test (
    .clk(clk),
    .data_in(data_in),
    .data_out(data_out)
  );

endmodule

By default Vivado will look for the ‘top’ module when building the project, so this is good for now.
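If your top-level module ends up with a different name, you can also tell Vivado explicitly which module to treat as top; something like this in the Tcl console should do it:

# Explicitly select the top module for the current fileset
set_property top top [current_fileset]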

Adding a debug core

With the basic design in place, you can pull up the IP Catalog and select from a few different debug cores. For my goal, the VIO core is nice as it’ll let me read and drive signals.

VIO in catalog

Double-clicking on the VIO core will bring up a prompt with options that can be set for the debug core.

VIO Customization Wizard

All of these defaults are fine for my case of a 1-bit register, but you can easily add additional input and output probes. After hitting OK here you’ll get a prompt for starting an out-of-context synthesis run, which will synthesize the VIO core while you continue to work in Vivado.

Within the IP Sources tab, you can drill down into the hierarchy and find templates to instantiate this new core in your design files.

VIO Instance Template

So I’ll modify my top file to add a new block that wires the VIO core to the clock and my testmodule instance.

  vio_0 my_test_vio (
    .clk(clk),
    .probe_in0(data_out),
    .probe_out0(data_in)
  );

Clocking in

For me, pulling in the clock for a legit hardware design was the most difficult part to figure out. I spelunked documentation and did much Googling, but eventually I reached out to some more experienced folks who helped lead me in the right direction.

The first step in this process is to check the user manual for the device at hand. In this case I’m looking to use the Fabric Clocks described in section 3.2.2 of Alpha Data’s ADM-PCIE-KU3 User Manual. The relevant bits say there are two available fabric clocks: a 200MHz clock and a 250MHz clock. The manual notes that these pins use the LVDS I/O standard, lists which pins are available for each clock, and says that I must set a constraint for DIFF_TERM_ADV to TERM_100, as these clocks must be terminated within the FPGA. I don’t fully comprehend what that means quite yet, but the doc tells me to do it so I oblige.

To use the LVDS structures built into the FPGA, I need to instantiate a module that can take the LVDS input signals and provide a simple clock output. The module for this is called a Differential Input Buffer; it can be found in Vivado’s Language Templates window.

Differential Input Buffer Template

I’ll copy and clean up the template into my top module file. I’ll also add clk_p and clk_n as inputs to the top module so I can route them into the IBUFDS instance.

module top (
  input clk_p,
  input clk_n
);

  wire clk;
  wire data_in;
  wire data_out;

  IBUFDS #(
    .DQS_BIAS("FALSE")
  ) IBUFDS_inst (
    .O(clk),
    .I(clk_p),
    .IB(clk_n)
   );

Since I’m using this clock to drive multiple blocks, it’s a best practice to use a General Clock Buffer (BUFG) so that Vivado will choose an appropriate buffer for this FPGA and minimize clock skew. This is found in the Templates as well, under Verilog -> Device Primitive Instantiation -> Kintex UltraScale -> CLOCK -> BUFFER -> General Clock Buffer (BUFG).

I’ll add this to my top module as well, routing the output of my IBUFDS buffer to the BUFG. The output of BUFG is the clock signal used by my testmodule and VIO instances, giving the complete top module:

module top (
  input clk_p,
  input clk_n
);

  wire clk_int;
  wire clk;
  wire data_in;
  wire data_out;

  IBUFDS #(
    .DQS_BIAS("FALSE")
  ) IBUFDS_inst (
    .O(clk_int),
    .I(clk_p),
    .IB(clk_n)
  );

  BUFG BUFG_inst (
    .O(clk),
    .I(clk_int)
  );

  // Instances from earlier, now clocked from the BUFG output
  testmodule my_test (
    .clk(clk),
    .data_in(data_in),
    .data_out(data_out)
  );

  vio_0 my_test_vio (
    .clk(clk),
    .probe_in0(data_out),
    .probe_out0(data_in)
  );

endmodule

At this point I can run Synthesis and inspect the design. Here’s the schematic view that shows the big picture.

schematic

Selecting the clock pins

From the schematic view I can click on the clk_p pin and it’ll select that pin from the I/O Ports view below the schematic in the IDE. From there, I can set the Site, I/O Std and DIFF_TERM_ADV as described in the KU3 User Manual. In this case I’m choosing to use the 250MHz Fabric Clock.

IO Port Planning

After making these changes I used CTRL+S to save my settings, and Vivado prompted me to save a .xdc constraints file which I named myconstraints.xdc.

The generated file has these contents; some are from the selections I made in the IDE and some were added by Vivado for the VIO core I instantiated.

set_property PACKAGE_PIN AA24 [get_ports clk_p]
set_property IOSTANDARD LVDS [get_ports clk_p]
set_property DIFF_TERM_ADV TERM_100 [get_ports clk_p]
set_property C_CLK_INPUT_FREQ_HZ 300000000 [get_debug_cores dbg_hub]
set_property C_ENABLE_CLK_DIVIDER false [get_debug_cores dbg_hub]
set_property C_USER_SCAN_CHAIN 1 [get_debug_cores dbg_hub]
connect_debug_port dbg_hub/clk [get_nets clk]
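One thing worth noting: the generated file carries no timing constraint for the clock itself, and for proper timing analysis I’d also expect a create_clock on the input port. A sketch for the 250MHz fabric clock (4ns period; the clock name is my choice):

# Define the 250MHz fabric clock for timing analysis
create_clock -period 4.000 -name fabric_clk [get_ports clk_p]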

With that in place I can re-run synthesis, run implementation and generate a bitstream for my KU3.

Deploying and testing

With the bitstream generated, I can open the Hardware Manager. After connecting to my device I can use the Program Device option, which will auto-populate with my bitfile and debug probes file; then I hit Program and wait for the magic to happen.

Program Device
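This programming flow can also be scripted; from memory, the Hardware Manager Tcl looks roughly like this, with the file paths below being placeholders for your own run outputs:

# Connect to the local hardware server and program the device
open_hw
connect_hw_server
open_hw_target
set_property PROGRAM.FILE {./ku3-peeking.runs/impl_1/top.bit} [current_hw_device]
set_property PROBES.FILE {./ku3-peeking.runs/impl_1/debug_nets.ltx} [current_hw_device]
program_hw_devices [current_hw_device]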

Once programmed, if all works well Vivado will automatically open a VIO dashboard window. In that window you can hit the green + to add the input and output probes.

Adding Probes

With that open I can start poking at the data_in signal and watch the changes reflected in the data_out signal.

Peeking and poking signals

Conclusion

With all this set up I have an easy means of interactive control over designs running in a live FPGA. I hope other Vivado newbies can follow this guide to help in their digital design adventures. If you follow this guide and run into any issues, reach out to me and I’ll try to help you out.

I’d like to thank JT Kellington, Kevin Irick and Mark Paluszkiewicz for offering their help and experience. I ran into many issues trying to hack my way through this and their assistance was extremely helpful in getting this up and running. Thank you!