
ARTY: The $99 Artix-7 FPGA eval kit


I just got the news about the new ARTY $99 FPGA evaluation kit being released and I thought it was worth a mention. At the $99 price point and with the Arduino shield connector, this board will attract a lot of hobbyists who can now hook up one of the many existing Arduino shields to a Series-7 FPGA. Another interesting thing is that it ships with a webserver reference design, so you’ve got a head-start on your IoT applications. Definitely a lot of possibilities with this board and I might just get one for myself.

Check out the hardware:

arty_fpga_block_diagram

If you’ve got one, let us all know what you think of it and what you’re designing.


Using AXI Ethernet Subsystem and GMII-to-RGMII in a Multi-port Ethernet design


Tutorial Overview

In this two-part tutorial, we’re going to create a multi-port Ethernet design in Vivado 2015.4 using both the GMII-to-RGMII and AXI Ethernet Subsystem IP cores. We’ll then test the design on hardware by running an echo server on lwIP. Our target hardware will be the ZedBoard armed with an Ethernet FMC, which adds 4 additional Gigabit Ethernet ports to our platform. Ports 0 to 2 of the Ethernet FMC will connect to separate AXI Ethernet Subsystem IPs which will be configured in DMA mode. Port 3 of the Ethernet FMC will connect to GEM1 of the Zynq PS through the GMII-to-RGMII IP, while the on-board Ethernet port of the ZedBoard will connect to GEM0.

zedboard_eth_fmc_3

Requirements

To go through this tutorial, you’ll need the following:

  • Vivado 2015.4 (see note below)
  • ZedBoard
  • Ethernet FMC (standard or robust model will work)
  • Platform Cable USB II (or equivalent JTAG programmer)

Note: The tutorial text and screenshots are suitable for Vivado 2015.4 however the sources in the Git repository will be regularly updated to the latest version of Vivado.

Change Vivado’s default language

Before creating our project, we need to make sure that Vivado is configured to use VHDL as its default language. We won’t be writing any HDL code, however the constraints that we use depend on the project language being set to VHDL, so it’s important that we set this:

  1. Open Vivado.
  2. From the menu, select Tools->Options.gmii_to_rgmii_and_axi_ethernet_subsystem_66
  3. In the “General” tab, set the target language to VHDL.gmii_to_rgmii_and_axi_ethernet_subsystem_67

Create a new Vivado project

Follow these steps to create a new project in Vivado:

  1. From the welcome screen, click “Create New Project”.gmii_to_rgmii_and_axi_ethernet_subsystem_1
  2. Specify a folder for the project. I’ve created a folder named “zedboard_qgige”. Click “Next”.gmii_to_rgmii_and_axi_ethernet_subsystem_2
  3. For the Project Type window, choose “RTL Project” and tick “Do not specify sources at this time”. Click “Next”.gmii_to_rgmii_and_axi_ethernet_subsystem_3
  4. For the Default Part window, select the “Boards” tab and then select the “ZedBoard Zynq Evaluation and Development Kit” and click “Next”.gmii_to_rgmii_and_axi_ethernet_subsystem_4
  5. Click “Finish” to complete the new project wizard.

Setup the Zynq PS

We start off the design by adding the Zynq PS (Processing System) and making the connections specified by the ZedBoard board definition file which is included with Vivado 2015.4.

  1. From the Vivado Flow Navigator, click “Create Block Design”.gmii_to_rgmii_and_axi_ethernet_subsystem_5
  2. Specify a name for the block design. Let’s go with the default “design_1” and leave it local to the project. Click “OK”.gmii_to_rgmii_and_axi_ethernet_subsystem_6
  3. In the Block Design Diagram, you will see a message that says “This design is empty. Press the (Add IP) button to add IP.”. Click on the “Add IP” icon either in the message, or in the vertical toolbar.gmii_to_rgmii_and_axi_ethernet_subsystem_7
  4. The IP catalog will appear. Go to the end of the list and double click on “ZYNQ7 Processing System” – it should be the second last on the list.gmii_to_rgmii_and_axi_ethernet_subsystem_8
  5. In the Block Design Diagram, you will see a message that says “Designer Assistance available. Run Block Automation”. Click on the “Run Block Automation” link.gmii_to_rgmii_and_axi_ethernet_subsystem_9
  6. Block Automation uses the board definition file for the ZedBoard to make connections and pin assignments to external hardware such as the DDR and the on-board Ethernet port. Just make sure that  “Apply Board Preset” is ticked and click OK.gmii_to_rgmii_and_axi_ethernet_subsystem_10
  7. Now our block diagram has changed and we can see that the DDR and FIXED_IO are connected externally. We can now configure the Zynq PS for our specific needs. Double click on the Zynq PS block to open the Re-customize IP window.gmii_to_rgmii_and_axi_ethernet_subsystem_11
  8. From the Page Navigator, select “Clock Configuration” and open the “PL Fabric Clocks” tree. Notice that “FCLK_CLK0” is enabled by default and set to 100MHz; this will serve as the clock for our AXI interfaces. Now enable “FCLK_CLK1” and “FCLK_CLK2” and set them to 125MHz and 200MHz respectively. FCLK_CLK1 (125MHz) will be needed by the AXI Ethernet Subsystem blocks and will be used to clock the RGMII interfaces. FCLK_CLK2 (200MHz) will be required by both the GMII-to-RGMII and AXI Ethernet Subsystem IPs and is needed to clock the IDELAY_CTRLs.gmii_to_rgmii_and_axi_ethernet_subsystem_12
  9. Now from the Page Navigator, select “PS-PL Configuration”. By default the Master AXI GP0 interface should be enabled as you can see in the image. You must also enable the High Performance Slave AXI HP0 interface as shown. The HP Slave AXI Interface provides a high-bandwidth connection to the DDR3 memory controller – this will be needed by the DMA engines which we will create after we add the AXI Ethernet Subsystem blocks to our design.gmii_to_rgmii_and_axi_ethernet_subsystem_13
  10. The last thing to do is to enable interrupts. From the Page Navigator, select “Interrupts” and tick to enable “Fabric Interrupts” then “IRQ_F2P[15:0]”. Interrupts will be generated by all the Ethernet IPs and by the DMA engine IPs.gmii_to_rgmii_and_axi_ethernet_subsystem_14
  11. Now click “OK” to close the Re-customize IP window.
  12. You will notice that the PS block has gotten a bit bigger and it has more ports. Connect FCLK_CLK0 (100MHz) to the GP Master AXI clock input (M_AXI_GP0_ACLK) by dragging a trace from one pin to the other.gmii_to_rgmii_and_axi_ethernet_subsystem_15This action will draw a wire between the pins and make the connection.gmii_to_rgmii_and_axi_ethernet_subsystem_16
  13. Also connect the FCLK_CLK0 to the HP Slave AXI Interface clock input (S_AXI_HP0_ACLK).gmii_to_rgmii_and_axi_ethernet_subsystem_17
  14. Now open the IP Catalog and add 3 x AXI 1G/2.5G Ethernet Subsystem IPs to the design (you will have to add one at a time). Once you have done this, you should have three AXI Ethernet Subsystem blocks in your design: “axi_ethernet_0”, “axi_ethernet_1” and “axi_ethernet_2”.gmii_to_rgmii_and_axi_ethernet_subsystem_18
  15. To wire the AXI Ethernet Subsystem blocks in DMA mode, we’ll use the block automation feature, however before running this, we want to configure the “shared logic” option of the cores first. The AXI Ethernet Subsystem IP is designed with the option to include “shared logic” in the core. The shared logic includes an IDELAY_CTRL to control the IODELAYs on the RGMII interface, as well as an MMCM to generate a 90 degree skewed clock for generation of the RGMII TX clock. When we use multiple AXI Ethernet Subsystem blocks in the one design, we can save on resources by having only one of those cores include the “shared logic”. The core containing the “shared logic” will naturally share the IDELAY_CTRL with the other cores, and it will have outputs for the clocks generated by the MMCM so that it can share them too. Let’s make “axi_ethernet_0” be the one that contains the shared logic, so double click on it to bring up the Re-customize IP window.gmii_to_rgmii_and_axi_ethernet_subsystem_19
  16. Go to the “Shared Logic” tab (don’t worry about any of the other options for now). Select the “Include Shared Logic in Core” option and click OK.
  17. Now open the Re-customize IP window for “axi_ethernet_1”, go to the “Shared Logic” tab and select the “Include Shared Logic in IP Example Design” option and click OK. Do the same for “axi_ethernet_2”.gmii_to_rgmii_and_axi_ethernet_subsystem_20
  18. Now we can wire up the Ethernet blocks by using the block automation feature. Notice there is a message in your block diagram saying “Designer Assistance available. Run Block Automation”. Click on the “Run Block Automation” link.gmii_to_rgmii_and_axi_ethernet_subsystem_21
  19. In the “Run Block Automation” window, you will have automation options for each of the Ethernet blocks. Tick to enable all of them, then select them one by one and make sure that they are each configured for an “RGMII” physical interface and a “DMA” connection to the AXI Streaming Interfaces. By default they will all be configured for GMII so it is important to set the physical interface correctly here and for each one of them. Then click OK.gmii_to_rgmii_and_axi_ethernet_subsystem_22
  20. After the block automation has run its course, you will notice that it has added a Clocking Wizard block called “axi_ethernet_0_refclk”. This block generates a 125MHz and 200MHz clock to feed the Ethernet blocks, however we will be using the Zynq PS to generate those clocks, so we don’t need this block. Click once on the “axi_ethernet_0_refclk” block and press Delete to remove it from the block diagram.gmii_to_rgmii_and_axi_ethernet_subsystem_24
  21. We can now use the Connection Automation feature to wire up our AXI interfaces. Click “Run Connection Automation” from the block diagram.gmii_to_rgmii_and_axi_ethernet_subsystem_23
  22. Like before, tick to enable ALL of the connections. We then need to select each of the interfaces one-by-one and choose the right settings. Luckily the defaults are good for us in Vivado 2015.4, but check the screenshots below if you are using a different version. Then click OK.gmii_to_rgmii_and_axi_ethernet_subsystem_25gmii_to_rgmii_and_axi_ethernet_subsystem_26gmii_to_rgmii_and_axi_ethernet_subsystem_27gmii_to_rgmii_and_axi_ethernet_subsystem_28gmii_to_rgmii_and_axi_ethernet_subsystem_29gmii_to_rgmii_and_axi_ethernet_subsystem_30
  23. When the automation feature has run its course, you will notice that again you have the option to “Run Connection Automation”. Maybe this will not be the case in future versions of Vivado, but it is the case for 2015.4. So again click “Run Connection Automation”, tick to enable all the interfaces for automation and make sure the settings are correct (defaults are good for 2015.4). They should all be configured to connect to the HP Slave AXI Interface (S_AXI_HP0) and use the “Auto” clock connection.gmii_to_rgmii_and_axi_ethernet_subsystem_31
  24. Now it’s time to add the GMII-to-RGMII block for Port 3 of the Ethernet FMC. Open the IP Catalog and double click on “Gmii to Rgmii”.gmii_to_rgmii_and_axi_ethernet_subsystem_32
  25. Double click on the GMII-to-RGMII block to open the Re-customize IP window.
  26. In the “Core Functionality” tab, tick “Instantiate IDELAYCTRL in design”, set the PHY Address to 8 and select the option “Skew added by PHY”. Notice that the GMII-to-RGMII core has an MDIO input and an MDIO output. Why does the MDIO bus have to pass through the core? That’s because the GMII-to-RGMII core has logic that sits on the MDIO bus to receive commands from the MAC for configuration. The PHY address we specify here allows us to give the core a unique address on the MDIO bus, and it is very important that the address be different to that of the external PHY. On the Ethernet FMC, all the PHYs are configured with address 0, so we can give the GMII-to-RGMII core an address of 8 without creating a bus conflict. As for the “Skew added by the PHY” option, this concerns the RGMII transmit clock. Some PHYs, including the 88E1510 on the Ethernet FMC, have a feature to add a delay to the incoming RGMII TX clock so that it aligns well for sampling the incoming RGMII transmit data. The GMII-to-RGMII core allows us to specify where the skew is added: in the PHY or in the FPGA fabric (MMCM). The skew should be added by one or the other, never both, or the clock will be poorly aligned for sampling the data and the RGMII interface will fail. gmii_to_rgmii_and_axi_ethernet_subsystem_33
  27. Now open the “Shared Logic” tab and select “Include Shared Logic in Core”.gmii_to_rgmii_and_axi_ethernet_subsystem_34
  28. To connect the GMII-to-RGMII core to the PS, we need to enable GEM1 in the PS. Double click on the Zynq PS block and select “MIO Configuration” in the Page Navigator. Tick to enable “ENET 1” and select “EMIO” (Extended Multiplexed Input/Output). Selecting EMIO allows us to route GEM1 through to the FPGA fabric, so that we can then connect it to our GMII-to-RGMII core and then out to the Ethernet FMC.gmii_to_rgmii_and_axi_ethernet_subsystem_35
  29. Now you should see two extra ports on the Zynq PS block: “GMII_ETHERNET_1” and “MDIO_ETHERNET_1”. Make a connection between “MDIO_GEM” of the GMII-to-RGMII block and “MDIO_ETHERNET_1” of the Zynq PS.gmii_to_rgmii_and_axi_ethernet_subsystem_36
  30. Now make a connection between “GMII” of the GMII-to-RGMII block and “GMII_ETHERNET_1” of the Zynq PS.gmii_to_rgmii_and_axi_ethernet_subsystem_37
  31. Now we need to make the “MDIO_PHY” and “RGMII” ports external so that they can connect to the Ethernet FMC. Right click on each of these interfaces and select “Make External”.gmii_to_rgmii_and_axi_ethernet_subsystem_38
  32. Now find the external ports on the right hand side of the block diagram. We’ll need to rename them so that the names fit with the constraints that we will later add to the project. Click first on the “MDIO_PHY” port and rename it “mdio_io_port_3”. Then click on the “RGMII” port and rename it “rgmii_port_3”. The port name can be changed in the “External Interface Properties” window that normally sits just below the “Design” window and to the left of the block diagram (see image below).gmii_to_rgmii_and_axi_ethernet_subsystem_39
  33. Once you have changed the names, your ports should now look like this.gmii_to_rgmii_and_axi_ethernet_subsystem_40
  34. The GMII-to-RGMII block doesn’t provide us with a reset signal for the PHY, so we have to add some logic to provide that signal. Open the IP Catalog and add a “Utility Reduced Logic” IP.gmii_to_rgmii_and_axi_ethernet_subsystem_41
  35. Double click on the “util_reduced_logic_0” block and set the “C Size” to 1 and the “C Operation” to “and”. Then click OK.gmii_to_rgmii_and_axi_ethernet_subsystem_42
  36. Now connect the input of the “util_reduced_logic_0” block to the active-low peripheral reset output of the Processor System Reset block.gmii_to_rgmii_and_axi_ethernet_subsystem_43
  37. Now right click on the output of the “util_reduced_logic_0” block and select “Make External”.gmii_to_rgmii_and_axi_ethernet_subsystem_44
  38. The external port will have been named “Res” by default. Change this name to “reset_port_3” so that it matches the constraints we will later add to the project.
  39. The MDIO, RGMII and reset ports of the 3 x AXI Ethernet Subsystem blocks will have already been externalized during the automation process, however they will have been given odd names so we need to change those names to match the constraints that we will later add to the project. Go through the ports one-by-one and rename them as follows:
    1. axi_ethernet_0 should have its external ports named “mdio_io_port_0”, “rgmii_port_0” and “reset_port_0”.
    2. axi_ethernet_1 should have its external ports named “mdio_io_port_1”, “rgmii_port_1” and “reset_port_1”.
    3. axi_ethernet_2 should have its external ports named “mdio_io_port_2”, “rgmii_port_2” and “reset_port_2”.
  40. Now open the IP Catalog and add a Concat (concatenate) IP to the design. The concatenate IP takes a series of single inputs and concatenates them into a vector output. We will need this IP to be able to connect all the interrupts to the IRQ_F2P[0:0] vector input of the PS.gmii_to_rgmii_and_axi_ethernet_subsystem_45
  41. Double click on the Concat block and set the number of ports to 12. Then click OK.gmii_to_rgmii_and_axi_ethernet_subsystem_46
  42. Now we must connect all the interrupts to the Concat IP. One-by-one, go through and make all the following connections. Note that the order of pin assignment is not important because it will all be transferred to the SDK in the hardware description and be correctly mapped by the BSP.
    1. Connect axi_ethernet_0_dma/mm2s_introut to xlconcat_0/In0
    2. Connect axi_ethernet_0_dma/s2mm_introut to xlconcat_0/In1
    3. Connect axi_ethernet_1_dma/mm2s_introut to xlconcat_0/In2
    4. Connect axi_ethernet_1_dma/s2mm_introut to xlconcat_0/In3
    5. Connect axi_ethernet_2_dma/mm2s_introut to xlconcat_0/In4
    6. Connect axi_ethernet_2_dma/s2mm_introut to xlconcat_0/In5
    7. Connect axi_ethernet_0/mac_irq to xlconcat_0/In6
    8. Connect axi_ethernet_0/interrupt to xlconcat_0/In7
    9. Connect axi_ethernet_1/mac_irq to xlconcat_0/In8
    10. Connect axi_ethernet_1/interrupt to xlconcat_0/In9
    11. Connect axi_ethernet_2/mac_irq to xlconcat_0/In10
    12. Connect axi_ethernet_2/interrupt to xlconcat_0/In11
  43. Now connect the Concat output to the IRQ_F2P input of the Zynq PS.gmii_to_rgmii_and_axi_ethernet_subsystem_47
  44. Now let’s connect FCLK_CLK2, the 200MHz clock, to the Ethernet blocks. First connect FCLK_CLK2 to the “ref_clk” pin of “axi_ethernet_0”.gmii_to_rgmii_and_axi_ethernet_subsystem_48
  45. Then connect FCLK_CLK2 to the “clkin” pin of the GMII-to-RGMII block. For those who are curious, you probably noticed that the GMII-to-RGMII block only has one clock input (clkin). This input must be connected to a 200MHz clock, which is used to clock the IDELAY_CTRL and also to generate three other clocks: 125MHz, 25MHz and 2.5MHz, which are used for link speeds of 1Gbps, 100Mbps and 10Mbps respectively. The actual link speed is determined by the PHY during the autonegotiation process and it is up to the processor to read the link speed from the PHY and then pass this value onto the GMII-to-RGMII core so that it uses the appropriate clock. By default, it is set to use the 2.5MHz clock for a link speed of 10Mbps.gmii_to_rgmii_and_axi_ethernet_subsystem_49
  46. Now let’s connect the “tx_reset” and “rx_reset” ports to the peripheral reset signal. Remember to connect BOTH of them (one at a time).gmii_to_rgmii_and_axi_ethernet_subsystem_50
  47. Now we need to connect the 125MHz clock to the “gtx_clk” port of the AXI Ethernet Subsystem block “axi_ethernet_0” (the one containing the shared logic). We did enable FCLK_CLK1 for this purpose, and you can make that connection if you wish, but for this tutorial we will explore another possibility. The Ethernet FMC has an on-board 125MHz oscillator which can also be used to supply “gtx_clk”. In order to use it, we just need to add a differential buffer to our design. Open the IP Catalog and add a “Utility Buffer” to the design.gmii_to_rgmii_and_axi_ethernet_subsystem_51
  48. Connect the output of the buffer to the “gtx_clk” input of “axi_ethernet_0”.gmii_to_rgmii_and_axi_ethernet_subsystem_52
  49. Now click on the plus (+) symbol on the input of the buffer to show the differential inputs.gmii_to_rgmii_and_axi_ethernet_subsystem_53
  50. Now right-click on each of the individual inputs of the buffer and select “Make External”. You should end up with two external input ports named “IBUF_DS_P[0:0]” and “IBUF_DS_N[0:0]”.gmii_to_rgmii_and_axi_ethernet_subsystem_54
  51. Rename those external input ports to “ref_clk_p” and “ref_clk_n” respectively.gmii_to_rgmii_and_axi_ethernet_subsystem_55
  52. Now there is only one more thing to add. The Ethernet FMC has two inputs that are used to enable the on-board 125MHz oscillator and to select its frequency (it can alternatively be set to 250MHz). We need to add some constants to our design to enable the oscillator and to set its output frequency to 125MHz. Open the IP Catalog and add two Constant IPs.gmii_to_rgmii_and_axi_ethernet_subsystem_56
  53. By default, they will be set to constant outputs of 1, which is exactly what we need. So all we must do is make their outputs external and rename them to “ref_clk_oe” and “ref_clk_fsel”. The result should be as shown in the image below.gmii_to_rgmii_and_axi_ethernet_subsystem_57
  54. Save the block diagram by clicking “File->Save Block Design”.

Create the HDL wrapper

Our Vivado block diagram is complete and we now need to create an HDL wrapper for the design.

  1. Open the “Sources” tab from the Block Design window.gmii_to_rgmii_and_axi_ethernet_subsystem_58
  2. Right-click on “design_1” and select “Create HDL wrapper” from the drop-down menu.gmii_to_rgmii_and_axi_ethernet_subsystem_59
  3. From the “Create HDL wrapper” window, select “Let Vivado manage wrapper and auto-update”. Click “OK”.gmii_to_rgmii_and_axi_ethernet_subsystem_60

Add the constraints file

The last thing we need to add to our project will be the constraints. The constraints file contains:

  • Pin assignments for all the external ports of our block design, which in our case are the pins that are routed to the FMC connector and through to our Ethernet FMC
  • A definition for the 125MHz reference clock that comes in from the Ethernet FMC
  • IODELAY grouping constraints to assign each port to one of two groups corresponding to the I/O bank that it occupies. We don’t want the tools trying to group all the ports to the same IDELAY_CTRL, but rather there should be one instantiated for each I/O bank – in our case, there is one instantiated in the “axi_ethernet_0” and another in “gmii_to_rgmii_0”.

Follow these steps to add the constraints file to your project:

  1. Download the constraints file from this link: Constraints for ZedBoard and Ethernet FMC using GMII-to-RGMII and AXI Ethernet
  2. Save the constraints file somewhere on your hard disk.
  3. From the Project Manager, click “Add Sources”.gmii_to_rgmii_and_axi_ethernet_subsystem_61
  4. Then click “Add or create constraints”.gmii_to_rgmii_and_axi_ethernet_subsystem_62
  5. Then click “Add files” and browse to the constraints file that you downloaded earlier.gmii_to_rgmii_and_axi_ethernet_subsystem_63
  6. Tick “Copy constraints files into project” and click Finish.gmii_to_rgmii_and_axi_ethernet_subsystem_64
  7. You should now see the constraints file in the Sources window.gmii_to_rgmii_and_axi_ethernet_subsystem_65

 

Sources Git repository

Sources for re-generating this project automatically can be found on Github at the links below. There is a version of the project for the ZedBoard and the MicroZed. There is also a version that uses only the AXI Ethernet Subsystem IP.

Instructions for re-generating those projects can be found in this post: Version control for Vivado projects. We will also discuss that in the following tutorial, as well as testing the projects on actual hardware.

Testing the project on hardware

In the second part of this tutorial (yet to come) we will generate the bitstream for this project, export it to the SDK and then test an echo server application on the hardware. The echo server application runs on lwIP (light-weight IP), the open source TCP/IP stack for embedded systems.

If you have any questions about this tutorial, or if you run into problems, please leave me a comment below.

 

FPGA Network tap: Designing the Ethernet pass-through


When designing a network tap on an FPGA, the logical place to start is the pass-through between two Ethernet ports. In this article, I’ll discuss a convenient way to connect two Ethernet ports at the PHY-MAC interface, which will form the basis of a network tap. The pass-through will be designed in Vivado for the ZedBoard combined with an Ethernet FMC. In future articles, I’ll discuss other aspects of an FPGA network tap design, including monitor ports, packet filtering, and opportunities for hardware acceleration in the FPGA.

Pass-through at the MAC interface (GMII, RGMII or SGMII)

The criteria for an ideal pass-through are:

  • it must be completely transparent to all devices communicating over the link,
  • it must preserve the fidelity of the link, and ideally,
  • it should add very little latency to the link.

From those criteria we could suppose that if we could simply tap the wires of the Ethernet cable, we’d have our ideal tap. Unfortunately, due to the complexity of Gigabit Ethernet signals, we can’t do that; instead we have to break the link and connect each end to its own Ethernet PHY. The pass-through is implemented on the other end of the PHYs, at the MAC interface, which is typically one of the following standards: GMII, RGMII or SGMII. In the case of the Ethernet FMC, which uses 4x Marvell 88E1510 Ethernet PHYs, we’re dealing with the RGMII interface.

RGMII signals are double-data-rate (DDR) and so in order to bring the data into our FPGA fabric and send it back out, we need to use the IDDR and ODDR primitives. Fortunately, there is an IP that implements the RGMII interface for us and provides us with a single-data-rate interface which we can use for the pass-through and for “tapping”. The GMII-to-RGMII IP core, included with Vivado, converts an RGMII interface to a GMII interface. To implement our pass-through, all we have to do is instantiate two GMII-to-RGMII converters, route them to two separate Ethernet PHYs and loop together the two GMII interfaces.

fpga_network_tap_4

The block diagram above illustrates the general idea. Port 0 and port 1 of the Ethernet FMC are each connected to a GMII-to-RGMII converter, and the GMII interfaces are passed through to the opposite port.

Use FIFOs to connect the GMII interfaces

When connecting one GMII interface to another, you will notice that the transmit interface has a separate clock to the receive interface. The GMII TX data, TX enable and TX error signals are all synchronous to the TX clock, whereas the GMII RX data, RX valid and RX error signals are all synchronous to the RX clock. So you can’t directly connect the GMII transmit interface to the GMII receive interface – you have to use proper clock domain crossing. The easy way to do that is by using a FIFO with independent read and write clocks – you’ll need two of them, one for each direction of data flow.

fpga_network_tap_2

Wire the FIFOs as elastic buffers

The natural way to connect FIFOs to the transmit and receive interfaces is to use the “rx_dv” (RX valid) output of the GMII interfaces to drive the “write enable” inputs of the FIFOs, and to use the “valid” output of the FIFOs to drive the “tx_en” (TX enable) inputs of the GMII interface. However, in our application, there is a problem with this method. If even momentarily the FIFO is being read slightly faster than it is being written to, you will have occasions where the FIFO is empty for one clock cycle and forced to de-assert the “valid” signal. This is a problem because the GMII interface “enable” and “valid” signals are only supposed to be de-asserted at the end of a packet, so this gap effectively terminates the Ethernet packet that you are feeding to the PHY. The better solution is to feed the “enable” and “valid” signals through the FIFOs, and to design the FIFOs as elastic buffers. Remember that once you decide that a FIFO will be written to and read from constantly, using two independent clocks, it must be designed as an elastic buffer or you risk losing data due to the FIFO reaching the full or empty state. In the elastic buffer solution, we still use our “tx_en” and “rx_dv” signals, but we use them to determine what data the elastic buffer can discard at the write interface (when it’s too full), as well as when the elastic buffer can momentarily halt the read interface (when it’s too empty). An elastic buffer is not perfect and it relies on a certain amount of redundancy being present in the data, but in typical Ethernet applications, there is enough time between packets that the job of designing a reliable elastic buffer is quite simple.

So when you want to wire up a FIFO as a simple elastic buffer, there are two things to setup:

1. Programmable full and empty outputs

These signals will tell us when the FIFOs are too full or too empty and they allow us to keep the FIFO occupancy within a certain range. Typically that “range” is centered at the mid-point of the FIFO, for example, if our FIFO contains 1000 words, then we could set our desired occupancy to be between 400 and 600. In this case, the programmable full output would be set to 600, and the programmable empty output would be set to 400.

2. Write enable and read enable logic

The write and read enable inputs must be connected to logic functions that will throttle the FIFO, filling it up when it gets too empty and emptying it when it gets too full. The functions are:

  • write enable <= NOT prog_full OR rx_valid
  • read enable <= NOT prog_empty OR tx_valid

fpga_network_tap_3

Configuring the GMII-to-RGMII converter

For the GMII-to-RGMII converter to operate properly, we have to let it know the actual link speed that was setup by the PHY during auto-negotiation. But how do we communicate this information to the core?

You may have noticed that the GMII-to-RGMII core contains two MDIO ports, one of which is normally connected to the MAC, and the other which is normally externalized and connected to the PHY. The GMII-to-RGMII core “sits” on the MDIO bus, as though it were another PHY, and it can be configured over that MDIO bus. So we communicate the link speed information to the core over the MDIO bus and the typical sequence is as follows:

  1. Trigger the auto-negotiation sequence in the PHY (optional)
  2. We read the actual link speed from the PHY after auto-negotiation has completed
  3. We write the actual link speed to the GMII-to-RGMII core

The last step involves writing to a specific register within the GMII-to-RGMII core with a value that corresponds to the link speed. To do this we need the address of the register to write to (0x10) and the “PHY” address of the GMII-to-RGMII core (I quote the word PHY because the core is not a PHY). The “PHY” address of the GMII-to-RGMII core is specified in Vivado, and is 8 by default. In order to communicate with two GMII-to-RGMII cores in our design, we have connected one of the MDIO “inputs” to GEM1 of the Zynq PS. We then connected the MDIO “output” to the MDIO “input” of the second GMII-to-RGMII converter (see block diagram above). This way, we can configure both GMII-to-RGMII converters using only the MDIO port of GEM1. In Vivado, we configure the GMII-to-RGMII cores to have different “PHY addresses”, specifically 7 and 8, so that we don’t create a bus conflict.

fpga_network_tap_5

Depending on the established link speed, we need to write the following values to register 0x10 of both of the GMII-to-RGMII converters:

  • For a link speed of 1Gbps, we need to write 0x140.
  • For a link speed of 100Mbps, we need to write 0x2100.
  • For a link speed of 10Mbps, we need to write 0x100.

For reliable operation, the link on Port 0 should be the same speed as that on Port 1, i.e. don’t try to use this pass-through to connect networks of different speeds.
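
To make this more concrete, here is a minimal bare-metal sketch of how the speed could be written to both converters over MDIO using the Zynq GEM driver (XEmacPs). The PHY addresses, register number and values follow the description above, but the surrounding code (how you obtain the negotiated speed, the driver instance name, and so on) is just an assumption for illustration:

#include "xemacps.h"

#define EXT_PHY_ADDR         0    /* Marvell 88E1510 on the Ethernet FMC (assumed address 0) */
#define GMII2RGMII_ADDR_0    7    /* "PHY" address given to the first GMII-to-RGMII core in Vivado */
#define GMII2RGMII_ADDR_1    8    /* "PHY" address given to the second GMII-to-RGMII core in Vivado */
#define GMII2RGMII_SPEED_REG 0x10 /* speed register of the GMII-to-RGMII core */

/* Values for register 0x10 of the GMII-to-RGMII core (from the list above) */
#define SPEED_1000_VAL  0x0140
#define SPEED_100_VAL   0x2100
#define SPEED_10_VAL    0x0100

/* Tell both GMII-to-RGMII converters the link speed negotiated by the PHYs.
 * 'emacps' is an initialized XEmacPs instance for the GEM that owns the MDIO bus. */
void set_gmii2rgmii_speed(XEmacPs *emacps, int speed_mbps)
{
    u16 value;

    switch (speed_mbps) {
        case 1000: value = SPEED_1000_VAL; break;
        case 100:  value = SPEED_100_VAL;  break;
        default:   value = SPEED_10_VAL;   break;
    }

    /* Write the speed value to register 0x10 of each converter over MDIO */
    XEmacPs_PhyWrite(emacps, GMII2RGMII_ADDR_0, GMII2RGMII_SPEED_REG, value);
    XEmacPs_PhyWrite(emacps, GMII2RGMII_ADDR_1, GMII2RGMII_SPEED_REG, value);
}

You would call a helper like this once auto-negotiation has completed and you have read the actual link speed from the external PHY.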

Sources Git repository

The sources for re-generating this project automatically can be found on Github at the link below.

ZedBoard Network Tap Github Source Code

If you want to better understand how the sources are organized, you can read this post: Version control for Vivado projects.

Next on the FPGA network tap

In the next post on the FPGA network tap, we’ll hook up the other two ports of the Ethernet FMC as monitor ports which will enable “listening” by a third device. Port 2 will send a copy of all packets going in one direction, while port 3 will send a copy of all packets going in the other direction, so the result will be a full gigabit network tap. We’ll also hook the ports up to soft TEMAC IPs and look at filtering the packets within the FPGA fabric.

 

PicoZed Unboxing


I recently got myself a PicoZed 7Z030 SoM (system-on-module) so that I could start developing more resource intensive applications for the Ethernet FMC, such as network tapping and network latency measurement. Why would I use a SoM for this? Checkout my comparison of Zynq SoMs to learn more about the benefits of SoMs in product development.

It’s worth mentioning this arrived the day after I ordered it. Here are some photos I took while unboxing:

picozed_unboxing_1

picozed_unboxing_2

picozed_unboxing_3

Packing slip and PicoZed in an anti-static bag.

picozed_unboxing_4

The MicroZed has its own custom product packaging so I was surprised by this plain white box – but I would have thrown it away anyway!

picozed_unboxing_5

picozed_unboxing_6

picozed_unboxing_12

Here’s the slick looking red PCB with the recognizable Zynq 7Z030 in the middle. I particularly wanted the 7Z030 because it has a lot of LUTs and it’s also the only PicoZed with MGTs (multi-gigabit transceivers).

picozed_unboxing_11

The other side of the PCB has 3 expansion connectors providing access to 148 I/Os and 4 GTX (gigabit transceivers).

You can expect to see more PicoZed projects from me over the next few months!

Running a lwIP Echo Server on a Multi-port Ethernet design


Tutorial Overview

This tutorial is the follow-up to Using AXI Ethernet Subsystem and GMII-to-RGMII in a Multi-port Ethernet design. In this part of the tutorial we will generate the bitstream, export the hardware description to the SDK and then test the echo server application on our hardware. The echo server application runs on lwIP (light-weight IP), the open source TCP/IP stack for embedded systems. Our hardware platform is the Avnet ZedBoard combined with the Ethernet FMC.

Regenerate the Vivado project

Firstly, for those of you who did not follow the first part of this tutorial, we will use the scripts in the Git repository for this project to regenerate the Vivado project. If you followed the first part of the tutorial correctly, you should not need to complete this step. Please note that the Git repository is regularly updated for the latest version of Vivado, so you must download the last “commit” for the version of Vivado that you are using.

  1. Download the sources from Github here: https://github.com/fpgadeveloper/zedboard-qgige
  2. Depending on your operating system:
    • If you are using a Windows machine, open Windows Explorer, browse to the “Vivado” folder within the sources you just downloaded. Double-click on the “build.bat” file to run the batch file.
    • If you are using a Linux machine, run Vivado and then select Window->Tcl Console from the welcome screen. In the Tcl console, use the “cd” command to navigate to the “Vivado” folder within the sources you just downloaded. Then type “source build.tcl” to run the build script.
  3. Once the script has finished running, the Vivado project should be regenerated and located in the “Vivado” folder. Run Vivado and open the newly generated project.

If you did not follow the first part of this tutorial, you may want to open the block diagram and get familiar with the design before continuing.

Generate the bitstream

When you are ready to generate the bitstream, click “Generate Bitstream” in the Flow Navigator.fpga_developer_20140731_102843

Once the bitstream is generated, the following window will appear. Select “View Reports” and click “OK”.

zedboard_echo_server_5

Export the hardware to SDK

When the bitstream has been generated, we can export it and the hardware description to the Software Development Kit (SDK). In the SDK we will be able to generate the echo server example design and run it on our hardware.

  1. In Vivado, from the File menu, select “Export->Export hardware”.zedboard_echo_server_10
  2. In the window that appears, tick “Include bitstream”, select Export to “Local to Project”, and click “OK”.zedboard_echo_server_11
  3. From the File menu, select “Launch SDK”.zedboard_echo_server_12
  4. In the window that appears, you need to specify the location of the hardware description and the location of the SDK workspace. We specified earlier to generate the hardware description local to the project (including bitstream), so the Exported location must be “Local to Project”. By preference, we choose to create the SDK workspace local to the project, but you can change this if you wish. Click “OK”.zedboard_echo_server_13

At this point, the SDK loads and a hardware platform specification will be created for your design.

Create the Echo Server application

At this point, your SDK workspace should contain only the hardware description and no applications:

zedboard_echo_server_14

We add the echo server application by selecting New->Application Project from the File menu.

zedboard_echo_server_15

In the New Project wizard, we want to name the application appropriately, so type “echo_server” as the project name then click “Next”.

zedboard_echo_server_16

The next page allows you to create the new application based on a template. Select the “lwIP Echo Server” template and click “Finish”.

zedboard_echo_server_17

The SDK will generate a new application called “echo_server” and a Board Support Package (BSP) called “echo_server_bsp”, both of which you will find in the Project Explorer as shown below.

zedboard_echo_server_18

By default, the SDK is configured to build the application automatically.
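
Before modifying anything, it’s worth knowing roughly what the template does: it brings up lwIP’s raw API and registers a receive callback on TCP port 7 that simply writes the received payload straight back to the sender. The snippet below is a simplified sketch of that idea, not a copy of the template (the names and error handling in the real echo.c of your SDK version will differ):

#include "lwip/tcp.h"
#include "lwip/pbuf.h"

/* Called by lwIP whenever data arrives on the echo connection (port 7).
 * Simplified sketch: echo the payload back to the sender and free the buffer. */
static err_t echo_recv(void *arg, struct tcp_pcb *tpcb, struct pbuf *p, err_t err)
{
    if (p == NULL) {
        /* Remote host closed the connection */
        tcp_close(tpcb);
        return ERR_OK;
    }

    tcp_recved(tpcb, p->tot_len);              /* acknowledge the received data */

    if (tcp_sndbuf(tpcb) >= p->tot_len) {      /* only echo if there is send buffer space */
        err = tcp_write(tpcb, p->payload, p->tot_len, TCP_WRITE_FLAG_COPY);
    }

    pbuf_free(p);
    return err;
}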

Modify the application

The echo server template application is set up to run on the first AXI Ethernet Subsystem block in our design. This corresponds to PORT0 of the Ethernet FMC. We want to add some code to the application to allow us to select a different port if we choose.

  1. Open the “main.c” file from the echo_server source folder.zedboard_echo_server_19
  2. After the last “#include” statement, add the following code:
#include "xlwipconfig.h"

/* Set the following DEFINE to the port number (0,1,2 or 3)
* of the Ethernet FMC that you want to hook up
* to the lwIP echo server. Only one port can be connected
* to it in this version of the code.
*/
#define ETH_FMC_PORT 0

/*
* NOTE: When using ports 0..2 the BSP setting "use_axieth_on_zynq"
* must be set to 1. When using port 3, it must be set to 0.
* To change BSP settings: right click on the BSP and click
* "Board Support Package Settings" from the context menu.
*/
#ifdef XLWIP_CONFIG_INCLUDE_AXIETH_ON_ZYNQ
#if ETH_FMC_PORT == 0
#define EMAC_BASEADDR XPAR_AXIETHERNET_0_BASEADDR  // Eth FMC Port 0
#endif
#if ETH_FMC_PORT == 1
#define EMAC_BASEADDR XPAR_AXIETHERNET_1_BASEADDR  // Eth FMC Port 1
#endif
#if ETH_FMC_PORT == 2
#define EMAC_BASEADDR XPAR_AXIETHERNET_2_BASEADDR  // Eth FMC Port 2
#endif
#else /* XLWIP_CONFIG_INCLUDE_AXIETH_ON_ZYNQ is not defined */
#if ETH_FMC_PORT == 3
#define EMAC_BASEADDR XPAR_XEMACPS_1_BASEADDR  // Eth FMC Port 3
#endif
#endif

  3. Then go down to where the define PLATFORM_EMAC_BASEADDR is used, and replace it with EMAC_BASEADDR.
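
For reference, that define appears in the call that adds the network interface; after the change, the call should look roughly like this (the variable names are those used by the template and may vary slightly between SDK versions):

/* Add the network interface to the netif_list, using our selected MAC base address */
if (!xemac_add(echo_netif, &ipaddr, &netmask, &gw,
               mac_ethernet_address, EMAC_BASEADDR)) {
    xil_printf("Error adding N/W interface\n\r");
    return -1;
}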

When you save the “main.c” file, the SDK should automatically start rebuilding the application.

Modify the Libraries

The BSP for this project will also have to be modified slightly, at least for Vivado 2015.4 and older versions. There are a few reasons for these modifications, but we would be going off-track to discuss those reasons in detail at this point. The modifications that apply to you will be found in the “README.md” file of the sources that you downloaded earlier. If you are using the latest version of Vivado, you can simply refer to the instructions on the front page of the Git repository.

I strongly recommend that you perform these modifications to the sources in the Vivado installation files – not the sources in the BSP of your SDK workspace. The reason is that the BSP sources will be overwritten with the original sources every time you re-build the BSP – so you’re better off modifying them at the true source.

Note: These modifications are specific to using the echo server application on the Ethernet FMC. If you are not using the Ethernet FMC, you may not need to make these modifications and you’re better off leaving the library sources as they are.

Setup the hardware

To set up our hardware, we need to set the ZedBoard to be configured by JTAG, set the VADJ voltage to the appropriate value, and correctly attach the Ethernet FMC. Follow these instructions to ensure that your setup is correct:

  1. On the ZedBoard, set the JP7, JP8, JP9, JP10 and JP11 jumpers all to the SIG-GND position. This sets it for configuration by JTAG.zedboard_echo_server_1
  2. Set the VADJ select jumper (J18) to either 1.8V or 2.5V, depending on the version of Ethernet FMC that you are using. We are using the 2.5V version.zedboard_echo_server_3
  3. Connect the Ethernet FMC to the FMC connector of the ZedBoard. Apply pressure only to the area above and below the connector – you should feel the two connectors “snap” together.zedboard_echo_server_4
  4. Now we need to use two screws to fix the Ethernet FMC to the ZedBoard – you should find two M2.5 x 4mm screws included with the ZedBoard. Turn the ZedBoard upside down and use a Phillips head screwdriver to fix the Ethernet FMC to the ZedBoard. Please do not neglect this step, it is very important and will protect your hardware from being damaged in the event that the Ethernet FMC hinges and becomes loose. The FMC connector is not designed to be the only mechanical fixation between the carrier and mezzanine card, the screws are necessary for mechanical and electrical integrity.zedboard_echo_server_6
  5. Turn the ZedBoard around so that it is sitting the right way up.
  6. Connect the USB-UART (J14) to a USB port of your PC.zedboard_echo_server_2
  7. Connect a Platform Cable USB II programmer (or similar device) to the JTAG connector. Connect the programmer to a USB port of your PC. Alternatively, if you don’t have a programmer, you can connect a USB cable to the J17 connector of the ZedBoard.zedboard_echo_server_7
  8. Connect PORT0 of the Ethernet FMC to a gigabit Ethernet port of your PC.zedboard_echo_server_24
  9. Now plug the ZedBoard power adapter into a wall socket and then into the ZedBoard.zedboard_echo_server_8
  10. Switch ON the power to the board. You should see the “POWER” LED on the ZedBoard turn on.

Test the Echo Server on hardware

To be able to read the output of the echo server application, we need to use a terminal program such as Putty. Use the following settings:

  • COM port – check your device manager to find out which COM port the ZedBoard popped up as. In my case, it was COM16 as shown below.zedboard_echo_server_9
  • Baud rate: 115200bps
  • Data: 8 bits
  • Parity: None
  • Stop bits: 1

With the terminal program open, we can now load our ZedBoard with the bitstream and then run the echo server application.

  1. In the SDK, from the menu, select Xilinx Tools->Program FPGA.zedboard_echo_server_20
  2. In the Program FPGA window, we select the hardware platform to program. We have only one hardware platform, so click “Program”.zedboard_echo_server_21
  3. The bitstream will be loaded onto the Zynq and we are ready to load the software application. Select the “echo_server” folder in the Project Explorer, then from the menu, select Run->Run.zedboard_echo_server_22
  4. In the Run As window, select “Launch on Hardware (GDB)” and click “OK”.zedboard_echo_server_23
  5. The application will be loaded on the Zynq PS and it will be executed. The terminal window should display this output from the echo server:zedboard_echo_server_25

The output indicates that:

  • The PHY auto-negotiation sequence has completed
  • The auto-negotiated link-speed is 1Gbps
  • The DHCP timeout was reached, indicating that the application was not able to get an IP address from a DHCP server
  • The auto-assigned IP address is 192.168.1.10

Now that the application is running successfully, we can test the echo server by sending packets from our PC to the ZedBoard and looking at what gets sent back.

Ping Test

Virtually all IP devices respond to ping (ICMP echo) requests, so this is a very simple and easy test to perform using your computer. Just open a command window and type “ping 192.168.1.10”.

zedboard_echo_server_26

Packet Echoing

To test that the echo server is actually doing its job and echoing received packets, you will have to install software that allows you to send and receive arbitrary packets. The software that I use is called Packet Sender and can be downloaded here. Once the software is installed, follow the instructions below to send and receive packets:

  1. Run Packet Sender.
  2. Create a new packet to send using these parameters and then click “Save”:
    • Name: Test packet
    • ASCII: follow the white rabbit
    • IP Address: 192.168.1.10
    • Port: 7
    • Resend: 0zedboard_echo_server_30
  3. The packet details will be saved in the Packets tab, and we can now click on the “Send” button to send that packet whenever we want. Click “Send” and see what happens.zedboard_echo_server_31
  4. If everything went well, the Traffic Log tab should display two packets: one sent by our computer and one received by our computer. They should both occur almost instantaneously, so if you only see one, you’ve probably got a problem with your setup.zedboard_echo_server_32

If you want to experiment, you can play around with the software by sending more packets, or different kinds of packets.
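
As an alternative to Packet Sender, you can do the same test from a Linux or macOS machine with a few lines of C using BSD sockets. This little client is purely illustrative (it isn’t part of the project sources); it connects to the echo server on port 7, sends a string and prints whatever comes back:

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <arpa/inet.h>

int main(void)
{
    const char *msg = "follow the white rabbit";
    char reply[128] = {0};

    /* Create a TCP socket and connect to the echo server (port 7) on the ZedBoard */
    int sock = socket(AF_INET, SOCK_STREAM, 0);
    if (sock < 0) {
        perror("socket");
        return 1;
    }

    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_port = htons(7);
    inet_pton(AF_INET, "192.168.1.10", &addr.sin_addr);

    if (connect(sock, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        perror("connect");
        return 1;
    }

    /* Send the test string and read back the echoed copy */
    send(sock, msg, strlen(msg), 0);
    ssize_t n = recv(sock, reply, sizeof(reply) - 1, 0);
    printf("Echoed back %zd bytes: %s\n", n, reply);

    close(sock);
    return 0;
}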

Changing ports

So far we’ve been using PORT0 of the Ethernet FMC to test the echo server, but suppose we wanted to use one of the other ports: 1, 2 or 3. You can configure the port on which to run lwIP by setting the ETH_FMC_PORT define that we added earlier to the main.c file of the SDK application. Valid values for ETH_FMC_PORT are 0, 1, 2 or 3.

One other thing to be aware of is the BSP setting called “use_axieth_on_zynq”. This parameter specifies whether the BSP will be used with AXI Ethernet Subsystem or with something else: Zynq GEM, Ethernet lite, etc. Remember that in our Vivado design we connected ports 0, 1 and 2 to an AXI Ethernet Subsystem block, and we connected port 3 to the GEM1 of the Zynq PS. Therefore, when selecting the port on which you wish to run lwIP, remember to correctly set the “use_axieth_on_zynq” parameter:

  • When using ports 0..2 the BSP setting “use_axieth_on_zynq” must be set to 1.
  • When using port 3, the BSP setting “use_axieth_on_zynq” must be set to 0.

The application will not compile if the correct BSP settings have not been set. To change BSP settings: right click on the BSP and click Board Support Package Settings from the context menu.

What now?

The echo server application is actually a very good starting place for developing Ethernet applications on the ZedBoard or other Xilinx FPGAs. Here are some potential ways you could “tweak” the echo server application to be useful for other things:

  • Allow your FPGA designs to be controlled by TCP commands sent from a PC over an Ethernet cable – or over the Internet.
  • Send data over TCP from your PC to your FPGA and leverage the FPGA for hardware acceleration.
  • Connect your FPGA to the Internet and design a high-performance IoT device.

Source code Git repository

Below are the links to the source code Git repositories. There is a version of the project for the ZedBoard and the MicroZed. There is also a version that uses only the AXI Ethernet Subsystem IP.

If you enjoyed this tutorial or if you run into problems using it, please leave me a comment below.

QuickPlay reinvents FPGA design


Since their invention, FPGAs have been burdened by a problem that has held them back from more widespread adoption: they’re too hard to program. Xilinx knows this, which is why they spent hundreds of millions of dollars developing the Vivado Design Suite and more importantly Vivado HLS (high-level synthesis), which enables high-performance hardware designs to be programmed in C/C++. Well, a new company called QuickPlay has created its own solution to this problem. They claim to have created a development platform (including hardware and software) that enables developers to create FPGA based designs with almost no FPGA knowledge or experience. They’ve created a development environment with a high level of abstraction, allowing FPGA designs to be developed in C/C++, while also supporting Xilinx and Altera FPGAs, and multiple board and IP vendors.

quickplay_flow

If you solve the problem of programming FPGAs and allow them to be exploited by the masses of C/C++ coders in the world, I guarantee that FPGAs will swamp the data centers and replace a sizable portion of the x86 processor based servers in the world. If FPGAs were easier to program, most web-based services would be running on a server with an embedded FPGA accelerator, if not running entirely on an FPGA based server. In that kind of a world, we’d have faster web-services and more power efficient data centers.

Unboxing Samsung V-NAND SSD 950 Pro M.2 NVM Express


Very excited to be showing off my new Samsung SSD 950 in the M.2 form factor. This tiny solid-state drive has a PCI Express Gen3 x 4-lane interface for a more direct connection to the CPU which enables a much higher throughput than a SATA interface. According to Samsung:

It outperforms SATA SSDs by over 4.5 times in sequential read and by over 2.5 times in sequential write, delivering the speeds of 2,500 MB/s and 1,500 MB/s respectively.

So what am I doing with this beast?

 

fpga_drive_samsung_ssd_m2_pcie_nvme_2

I’m gonna hook it up to an FPGA of course!

 

fpga_drive_samsung_ssd_m2_pcie_nvme_3

How???!!!

 

fpga_drive_samsung_ssd_m2_pcie_nvme_4

That’s what my next product is about, and I can’t wait to tell you all about it!

I put that AAA battery in the photo just to show you how small this thing is! It’s insane!

 

FPGA accelerators to get a standard software interface


Rick Merritt wrote an interesting article on EETimes titled Red Hat Drives FPGAs, ARM Servers. It seems that Red Hat and the major FPGA vendors are going to get together in March to work out a standard software interface for FPGA accelerator boards. The success of high-level synthesis tools in recent years has re-ignited interest in FPGA-based hardware accelerators, as development times on FPGA hardware have seen massive reductions thanks to OpenCL and Vivado HLS, among others. Typically, these kinds of accelerators are PCI Express boards but the OS usually talks to them through a custom interface, which depends on the application and the algorithms being implemented on the FPGA. This obligates the software designers to know the hardware in detail, in order to code the drivers and applications to exploit the accelerators. So Red Hat, the open-source software company, is basically pushing for some abstraction to make the accelerators easier to code for. The idea is simple: design the accelerators with a standard interface and hide the hardware implementation details behind it.

A standard interface for FPGA hardware accelerators would make them more competitive vs GPUs, which typically come with their own drivers. It also makes sense from a design perspective that the details of the implementation of the accelerator should be hidden from the software designer through a standard interface. I think it’s a step in the right direction and I’m looking forward to seeing what they come up with.


Xilinx reveals Virtex Ultrascale Board for PCI Express applications


Xilinx just released a video presenting the next-generation of All Programmable devices and dev environments. It’s a quick look at where technology is going and particularly where FPGAs are going to make their mark.

Of particular interest to me were the images of a Virtex Ultrascale PCI Express board at 2:45 in the video. This board appears to have both the PCIe gold-finger edge connector and a PCIe saddle-mount socket connector, so it could be used as either the PCIe end-point or the root complex – or maybe both at the same time. Most of Xilinx’s dev boards have the PCIe edge connector but as far as I know, the only FPGA dev board with a PCIe socket is the Mini-ITX from Avnet.

virtex-ultrascale-pcie-eval-board-1

virtex-ultrascale-pcie-eval-board-3

virtex-ultrascale-pcie-eval-board-4

Although I don’t really know why you’d want to use both the edge connector and socket at the same time, I can come up with a few crazy ideas:

  • Stack-up multiple FPGA accelerators (although physically you wouldn’t be able to fit too many of these into a server)
  • Connect a PCIe NVMe SSD to the socket and use the FPGA to offload data processing operations such as hash functions for database applications (although the better way to do it would be to have the FPGA interface directly with NAND flash)
  • Use the FPGA as a bridge between a root-complex and end-point, and to perform analysis of PCIe transactions with the aim of identifying inefficiencies, bottle-necks or opportunities for hardware acceleration in a particular system.

If you’ve got any other ideas for potential applications for this board, feel free to leave a comment below.

ZynqBoard: The World’s Smallest Zynq SoM


Almost a year ago I did a comparison of Zynq SoMs, or System-on-Modules, these handy little Zynq-based devices that speed up your product development by taking the risk out of your PCB design and often handing you a ton of working example code. Well, many more Zynq SoMs have come onto the market since then, so another comparison is due, but today I just wanted to review one of them: ZynqBoard, the smallest Zynq SoM on the market today according to zynqboard.com. This new device measures only 42mm x 22mm! To get it so small its developers have stripped it down to only the essentials: Zynq, DDR3, flash memory, clock oscillator and expansion connectors.

Normally I’d write about all the features, but in this case, I think that the value of this SoM doesn’t come from its “features” but instead from what it doesn’t do:

  • It has no ON-OFF switch
  • It has no RESET switch
  • It has no LEDs
  • It has no programming connector
  • It has no on-board power supplies
  • It has no Ethernet PHY
  • It has no USB PHY
  • It can’t be used as a stand-alone device

That’s right, this board can’t be used as a stand-alone device, in fact, in order to even turn this thing on you have to design your own carrier board for it. Although that may sound like a drawback to some, the total freedom to customize this board to your application is this SoM’s powerful value proposition. It’s so small they didn’t even have space for the company logo! But with such a small footprint, it’s hard to justify not using it – here’s what I mean…

So you need a Zynq in your product, and you decide to put it directly on your custom board. This ups your layer count to at least 8 (maybe more – the ZynqBoard itself has 16), it increases your assembly cost (BGAs, two-sided board), it inflates your bring-up time and it obliges you to do critical DDR3 routing. In general terms, it significantly increases the complexity of your board design and in turn increases your risk of doing re-spins and spending too much time bringing up your board. Why would you go that route when, for practically the same amount of real estate, you can put a tiny Zynq SoM on your board and be done with it? Engineers like the challenge of doing things from scratch, but the guy who’s paying the bills wants to have a product yesterday, and the best way of achieving that is by leveraging these SoMs.

Overall, I like this new Zynq SoM and I admire the boldness it took to develop a product that does less than those of the competitors.

Update 2016-03-12: I got in touch with Servaes Joordens from zynqboard.com and he tells me that the price for low quantities of the ZynqBoard is 200 euros.

A first peek at FPGA Drive


With the first prototypes on the way, it’s time to take a closer look at what exactly FPGA Drive is and how it can help you to develop new disruptive technologies with FPGAs and SSDs. Here’s what you need to know in 3 points:

  1. FPGA Drive enables you to connect a high-speed Solid State Drive (SSD) to an FPGA
  2. FPGA Drive delivers high-capacity, extreme-throughput non-volatile storage to FPGA development boards
  3. FPGA Drive connects a 4-lane PCI Express bus between your FPGA and SSD

The 3D rendered image shows the following key features:

  • 8-lane PCIe socket for connection to the PCIe edge connector of one of the Xilinx development boards
  • M.2 (NGFF) socket for connection to the SSD
  • 12V power connector and switching power supply for generation of 3.3V

For information on price and availability, please contact me.

FPGA Drive Board Bring-up


Bring-up of the first FPGA Drive with the Kintex-7 KC705 Evaluation board went smoothly today. In the photo below you’ll see the KC705 and FPGA Drive adapter which is loaded with a Samsung V-NAND 950 Pro. The solid-state drive is an M.2 form factor, NVM Express, 4-lane PCI Express drive with 256GB of storage.

A little intro to NVM Express. NVM Express, or NVMe, is an interface specification for accessing SSDs over a PCI Express bus. Connecting the SSD over PCIe gives it a direct connection to the CPU, which results in lower latency compared with SATA drives, as well as increased throughput and potential for scaling (just add more lanes). PCIe SSDs can use the older AHCI interfacing standard, but due to the way that standard was designed, it can’t fully exploit the potential of modern SSDs. The NVMe specification was designed from the ground up to solve this problem.

A bit of info on the board. The FPGA Drive adapter is a 6-layer PCB. Although I’m bringing it up on the KC705 board, it has been designed for compatibility with all the Xilinx Series-7 eval boards that have PCIe edge-connectors. Although I’m using the Samsung V-NAND 950 Pro, it has been designed to carry most standard M-key M.2 devices. It contains a switching power supply which converts the incoming 12V supply to a 3.3V supply to power the SSD. It has a clock generator which supplies a 100MHz clock to both the FPGA and the SSD. The board has some logic to generate the PCIe reset signal for both FPGA and SSD.

fpga-drive-bring-up-2

Why not connect to the FMC connector? It would have been a bit simpler to design the FPGA Drive as a standard FMC module, because the FMC connector supplies a range of power supplies and a lot of useful I/Os. However, to make use of the integrated PCIe blocks in the FPGA, we need to use specific gigabit transceivers – and those transceivers are routed to the PCIe edge-connector on the Xilinx eval boards. Besides, by using the PCIe edge-connector, we leave the FMC connectors free for actual I/O – imagine being able to record gigabytes of samples from an ADC module, or Ethernet packets from the Ethernet FMC.

Power cable. Power to the KC705 and the FPGA Drive is supplied by the same 12V adapter that comes with the KC705 (and other Xilinx eval boards for that matter). The single output connector of the adapter is converted to two parallel outputs by the black adapter cable shown below (to be supplied with the FPGA Drive).

 

fpga-drive-bring-up-19

Testing the board. After verifying the 3.3V switching power supply and current draw of the adapter board without SSD, I checked them again with the SSD and then again when connected to the KC705 – all clear. Then I tested the system with a bitstream in the Kintex-7 FPGA. The design running on the Kintex-7 contains an AXI Memory Mapped to PCI Express Bridge IP configured as a Root Port or root complex. To test the design, I ran a stand-alone application on the MicroBlaze which configures and tests the PCIe Bridge IP, then enumerates all PCIe devices found. The example application can be found in the Xilinx SDK folder here:

C:\Xilinx\SDK\2015.4\data\embeddedsw\XilinxProcessorIPLib\drivers\axipcie_v3_0\examples\xaxipcie_rc_enumerate_example.c

Below is a screen shot of the output of the test application. As you can see, the FPGA achieved PCIe “link up” and the application successfully enumerated the SSD end-point. In the next few days I’ll release the sources and write a tutorial showing how to build this design in Vivado and test it on the KC705.

fpga-drive-bring-up-1

My next task is to get the SSD functioning under a Linux operating system with the in-box NVMe driver. Unfortunately, PetaLinux 2015.4 doesn’t seem to have the NVMe driver built-in, so I’m going to have to build the kernel myself from the Xilinx sources.

If you want more info about FPGA Drive, just get in touch.

fpga-drive-bring-up-7

Microblaze PCI Express Root Complex design in Vivado


This is the first part of a three part tutorial series in which we will go through the steps to create a PCI Express Root Complex design in Vivado, with the goal of being able to connect a PCIe end-point to our FPGA. We will test the design on hardware by connecting a PCIe NVMe solid-state drive to our FPGA using the FPGA Drive adapter.

Part 1: Microblaze PCI Express Root Complex design in Vivado (this tutorial)

Part 2: Zynq PCI Express Root Complex design in Vivado

Part 3: Connecting an SSD to an FPGA running PetaLinux

In the first part of this tutorial series we will build a Microblaze based design targeting the KC705 Evaluation Board. In the second part, we will build a Zynq based design targeting the PicoZed 7Z030 and PicoZed FMC Carrier Card V2. In part 3, we will test the design on the target hardware using a stand-alone application that will validate the state of the PCIe link and perform enumeration of the PCIe end-points. We will then run PetaLinux on the FPGA and prepare our SSD for use under the operating system.

 

Requirements

To complete this tutorial you will need the following:

Note: The tutorial text and screenshots are suitable for Vivado 2015.4 however the sources in the Git repository will be regularly updated to the latest version of Vivado.

 

The Components

The image below gives us a high level view of the design showing each component and how it connects to the Microblaze – only the AXI-Lite interfaces are shown.

microblaze_pcie_root_complex_vivado_93

Let’s talk about the role of each peripheral in the design:

  • AXI Interrupt Controller – connects to the interrupts generated by the peripherals and routes them through to the Microblaze. It’s generally a good idea to connect all interrupts to the Microblaze when you plan to run PetaLinux.
  • AXI Central DMA – performs data transfers from one memory mapped space to another. We have the CDMA in this design to be able to make fast data transfers between the PCIe end-point and the DDR3 memory.
  • AXI Memory Mapped to PCI Express – performs address mapping between the AXI address space and the PCIe address space. It contains the integrated PCI Express block and all the logic required to translate PCIe TLPs into AXI memory mapped reads and writes. The AXI-PCIe block has a slave interface (S_AXI) to allow an AXI master (such as the Microblaze) to access the PCIe address space, and it also has a master interface (M_AXI) which allows a PCIe end-point to access the AXI address space.
  • AXI UART16550 – UART for console output, which is needed by our stand-alone software application and by PetaLinux.
  • AXI EthernetLite – provides a 10/100Mbps network connection for PetaLinux.
  • AXI Quad SPI – provides access to a QSPI Flash device which can be used for storing software, the Linux kernel or FPGA configuration files.
  • AXI Timer – provides an accurate timer needed by PetaLinux.

 

The Address Spaces

The image below shows the AXI memory mapped interface connections which is useful for understanding the memory spaces and the devices that have access to them.

microblaze_pcie_root_complex_vivado_94

The important thing to understand is who the bus masters are and what address spaces they can access – the connections could have been made in a number of different ways to achieve the same goal.

The 2 address spaces are:

  • the DDR3 memory accessed through the MIG, and
  • the PCIe address space accessed through the S_AXI interface of the AXI-PCIe bridge

The 3 AXI masters and the address spaces they can access are:

  • the Microblaze can access both the DDR3 memory and the PCIe address space
  • the PCIe end-point with bus mastering capability can access the DDR3 memory only (via M_AXI port of the AXI-PCIe bridge)
  • the CDMA can access both the DDR3 memory and the PCIe address space

About PCIe end-point bus mastering

Most PCIe end-points have bus mastering capability. Basically this means that the PCIe end-point can send memory read/write TLPs to the root complex and read/write to a part of the system memory that was allocated for the end-point. Maybe the most common application of end-point bus mastering is the implementation of Message Signaled Interrupts (or MSI). When a PCIe end-point generates an MSI, it simply writes to part of the system memory that was allocated by the root complex.

 

Create a new Vivado project

We start by creating a new project in Vivado and selecting the KC705 Evaluation board as our target. If you prefer working from the Tcl console, there’s a rough Tcl equivalent just after the steps below.

  1. From the welcome screen, click “Create New Project”.microblaze_pcie_root_complex_vivado_1
  2. Specify a folder for the project. I’ve created a folder named “kc705_aximm_pcie”. Click “Next”.microblaze_pcie_root_complex_vivado_2
  3. For the Project Type window, choose “RTL Project” and tick “Do not specify sources at this time”. Click “Next”.microblaze_pcie_root_complex_vivado_3
  4. For the Default Part window, select the “Boards” tab and then select the “Kintex-7 KC705 Evaluation Platform” and click “Next”.microblaze_pcie_root_complex_vivado_4
  5. Click “Finish” to complete the new project wizard.
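Here’s that sketch – the part number is the KC705’s XC7K325T, and the board_part revision string can differ between Vivado releases, so check yours with get_board_parts before copying it blindly.

# Create the project in a folder named "kc705_aximm_pcie" (adjust the path to suit)
create_project kc705_aximm_pcie ./kc705_aximm_pcie -part xc7k325tffg900-2
# Select the KC705 board (the revision in the board_part string can vary between releases)
set_property board_part xilinx.com:kc705:part0:1.2 [current_project]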

 

Create the block design

Now we need to create and build our block design. We will start by adding the Microblaze and the AXI Memory Mapped PCI Express Bridge. A Tcl sketch of these first steps follows the list.

  1. From the Vivado Flow Navigator, click “Create Block Design”.microblaze_pcie_root_complex_vivado_5
  2. Specify a name for the block design. Let’s go with the default “design_1” and leave it local to the project. Click “OK”.microblaze_pcie_root_complex_vivado_6
  3. In the Block Design Diagram, you will see a message that says “This design is empty. Press the (Add IP) button to add IP.”. Click on the “Add IP” icon either in the message, or in the vertical toolbar.microblaze_pcie_root_complex_vivado_7
  4. The IP catalog will appear. Find and double click on “Microblaze”.microblaze_pcie_root_complex_vivado_8
  5. You will see the Microblaze in the block diagram. Double click on it to open the configuration wizard.microblaze_pcie_root_complex_vivado_9
  6. The Microblaze has several predefined configurations that can be selected on the first page of the Microblaze Configuration Wizard. We eventually want to run PetaLinux on the Microblaze, so we need to select “Linux with MMU” to get the best configuration for that. Then click “OK” to accept that configuration.microblaze_pcie_root_complex_vivado_10
  7. The AXI-PCIe block is going to provide the clock source for most of our design, including the Microblaze. By adding it to our block design at this point, we will then be able to use the Block Automation feature to setup a lot of the required hardware, saving us a lot of time. Find the “AXI Memory Mapped to PCI Express Bridge IP” in the IP Catalog and double click on it to add it to the block diagram.microblaze_pcie_root_complex_vivado_11
  8. Now click on “Run Block Automation” which will help us to setup the Microblaze local memory, the Microblaze MDM, the Processor System Reset and the AXI Interrupt Controller.microblaze_pcie_root_complex_vivado_12
  9. In the Run Block Automation window, apply the settings shown in the image below. Set the Local Memory to 128KB. Set the Cache Configuration to 16KB. Tick the Interrupt Controller checkbox. Set the Clock Connection to “/axi_pcie_0/axi_aclk_out”. Then click OK.microblaze_pcie_root_complex_vivado_13
  10. The block diagram should now look like the image below. Notice that everything so far is driven by the “axi_aclk_out” clock which is driven by the AXI-PCIe block. The reset signals are generated by the Processor System Reset block, which will synchronize the external PCIe reset signal (PERST_N) to the “axi_aclk_out” clock.microblaze_pcie_root_complex_vivado_14
  11. Right click on the “ext_reset_in” pin of the Processor System Reset block, and select “Make External”.microblaze_pcie_root_complex_vivado_15
  12. Click on the port that was just created (called “ext_reset_in”) and change its name to “perst_n” using the “External Port Properties” window.microblaze_pcie_root_complex_vivado_16
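For reference, here’s that sketch. The IP version numbers are the ones shipped with Vivado 2015.4 and may need adjusting for other releases, and the block automation settings are easiest to copy from the journal file (vivado.jou) after running them once in the GUI.

# Create the empty block design
create_bd_design "design_1"
# Add the MicroBlaze and the AXI Memory Mapped to PCI Express bridge
# (IP versions are those in Vivado 2015.4 and may differ in your release)
create_bd_cell -type ip -vlnv xilinx.com:ip:microblaze:9.5 microblaze_0
create_bd_cell -type ip -vlnv xilinx.com:ip:axi_pcie:2.7 axi_pcie_0
# "Run Block Automation" in the GUI then adds the local memory, MDM, AXI Interrupt
# Controller and the Processor System Reset, clocked from axi_pcie_0/axi_aclk_out (step 9)
# Make the external reset port and give it the name "perst_n"
# (rst_axi_pcie_0_62M is the reset block name that block automation created in my session)
create_bd_port -dir I -type rst perst_n
connect_bd_net [get_bd_ports perst_n] [get_bd_pins rst_axi_pcie_0_62M/ext_reset_in]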

 

Add the MIG

  1. Now let’s add the DDR3 memory to the design. Find the “Memory Interface Generator (MIG 7 series)” in the IP Catalog and double click it to add it to the block diagram.microblaze_pcie_root_complex_vivado_17
  2. Click “Run Block Automation” to setup the external connections to the MIG.microblaze_pcie_root_complex_vivado_18
  3. In the Run Block Automation window, click “OK”.microblaze_pcie_root_complex_vivado_19
  4. The connection automation feature can save us a lot of time setting up the MIG, but if we run it now, Vivado will connect it to the Microblaze through the AXI Interconnect that is already in the design (microblaze_0_axi_periph). There’s nothing particularly wrong with that, but in this design we want to have a separate AXI Interconnect for the MIG so that we can more easily control which blocks have access to the DDR3 and which have access to the peripherals. It’s a point to consider in this design because we will have a PCIe end-point with bus mastering capabilities, and we need to limit what the end-point will have access to. Find “AXI Interconnect” in the IP Catalog and double click on it to add one to the design.microblaze_pcie_root_complex_vivado_20
  5. Click on the AXI Interconnect block and rename it to “mem_intercon” using the “Sub-block properties” window.microblaze_pcie_root_complex_vivado_21
  6. Double click on the “mem_intercon” block and configure it for 4 slave interfaces, and 1 master interface.microblaze_pcie_root_complex_vivado_22
  7. Connect the master interface (M00_AXI) of “mem_intercon” to the slave interface (S_AXI) of the MIG.microblaze_pcie_root_complex_vivado_23
  8. Now we can run the connection automation feature. Click “Run Connection Automation”. Select ONLY the “microblaze_0/M_AXI_DC”, “microblaze_0/M_AXI_IC” and “mig_7series_0/sys_rst” connections. Click “OK”.microblaze_pcie_root_complex_vivado_24
  9. Connect the master interface (M_AXI) of “axi_pcie_0” to the slave interface (S02_AXI) of the “mem_intercon”. This provides a data path from the PCIe end-point to the DDR3 memory. Note that the PCIe end-point will not be able to access anything else in our design.microblaze_pcie_root_complex_vivado_25
  10. Connect the “aresetn” input of the MIG to the “peripheral_aresetn” output of the “rst_mig_7series_0_100M” Processor System Reset block. Note that this Processor System Reset was generated when we used the connection automation feature in the steps above.microblaze_pcie_root_complex_vivado_73
  11. As shown in the image below, connect the “S02_ACLK” and “S03_ACLK” clock inputs of the “mem_intercon” to the “axi_aclk_out” output of the AXI-PCIe block. Also connect the “S02_ARESETN” and “S03_ARESETN” inputs to the “peripheral_aresetn” of the “rst_axi_pcie_0_62M” Processor System Reset.microblaze_pcie_root_complex_vivado_80
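Here’s a rough Tcl equivalent of the MIG-related steps above, assuming the instance names that the automation features produced in my session (mig_7series_0, rst_mig_7series_0_100M, rst_axi_pcie_0_62M); yours may differ.

# Add a dedicated AXI Interconnect for the DDR3 path and name it "mem_intercon"
create_bd_cell -type ip -vlnv xilinx.com:ip:axi_interconnect:2.1 mem_intercon
set_property -dict [list CONFIG.NUM_SI {4} CONFIG.NUM_MI {1}] [get_bd_cells mem_intercon]
# Route the interconnect's master port to the MIG slave interface
connect_bd_intf_net [get_bd_intf_pins mem_intercon/M00_AXI] [get_bd_intf_pins mig_7series_0/S_AXI]
# Give the PCIe end-point a path to the DDR3 only (bridge master into mem_intercon)
connect_bd_intf_net [get_bd_intf_pins axi_pcie_0/M_AXI] [get_bd_intf_pins mem_intercon/S02_AXI]
# MIG reset, plus clocks and resets for the S02/S03 slave ports
connect_bd_net [get_bd_pins rst_mig_7series_0_100M/peripheral_aresetn] [get_bd_pins mig_7series_0/aresetn]
connect_bd_net [get_bd_pins axi_pcie_0/axi_aclk_out] [get_bd_pins mem_intercon/S02_ACLK] [get_bd_pins mem_intercon/S03_ACLK]
connect_bd_net [get_bd_pins rst_axi_pcie_0_62M/peripheral_aresetn] [get_bd_pins mem_intercon/S02_ARESETN] [get_bd_pins mem_intercon/S03_ARESETN]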

 

Configure the AXI Memory Mapped to PCI Express Bridge

  1. Double click on the AXI-PCIe block so that we can configure it. On the “PCIE:Basics” tab of the configuration, select “KC705 REVC” as the Xilinx Development Board, and select “Root Port of PCI Express Root Complex” as the port type.microblaze_pcie_root_complex_vivado_26
  2. On the “PCIE:Link Config” tab, select a “Lane Width” of 4x and a “Link speed” of 5 GT/s (Gen2). Note that the KC705 has 8 lanes routed to the PCIe edge-connector, however the PCIe SSD that we want to connect with has only 4 lanes.microblaze_pcie_root_complex_vivado_27
  3. In the “PCIE:ID” tab, enter a “Class Code” of 0x060400. This is important for the last part of this tutorial series, in which we will be running PetaLinux. The class code will ensure that the correct driver is associated with the AXI to PCIe bridge IP.microblaze_pcie_root_complex_vivado_28
  4. In the “PCIE:BARS” tab, tick “Hide RP BAR”, tick “BAR 64-bit Enabled” and set BAR 0 with type “Memory” and a size of 4 Gigabytes. In this configuration, the PCIe end-point is given access to the entire 32-bit address space – remember though that it’s only physically connected to the DDR3 memory.microblaze_pcie_root_complex_vivado_29
  5. In the “PCIE:Misc” tab, use the defaults as shown in the image below.microblaze_pcie_root_complex_vivado_30
  6. In the “AXI:BARS” tab, use the defaults as shown in the image below. We will later be able to configure the size of the AXI BAR 0 in the Address Editor.microblaze_pcie_root_complex_vivado_31
  7. In the “AXI:System” tab, use the defaults as shown in the image below.microblaze_pcie_root_complex_vivado_32
  8. In the “Shared Logic” tab, use the defaults as shown in the image below. Click “OK”.microblaze_pcie_root_complex_vivado_33
  9. Right click on the “pcie_7x_mgt” port of the AXI-PCIe block and select “Make External”. This will connect the gigabit transceivers to the 4 PCIe lanes on the PCIe edge-connector of the KC705.microblaze_pcie_root_complex_vivado_34
  10. Connect the “mmcm_lock” output of the AXI-PCIe block to the “dcm_locked” input of “rst_axi_pcie_0_62M” Processor System Reset block.microblaze_pcie_root_complex_vivado_35
  11. Connect the “axi_aresetn” input of the AXI-PCIe block to the “perst_n” port.microblaze_pcie_root_complex_vivado_36
  12. Add a “Constant” from the IP Catalog and configure it to output 0 (low). We’ll use this to tie low the “INTX_MSI_Request” input of the AXI-PCIe block. Connect the constant’s output to the “INTX_MSI_Request” input of the AXI-PCIe block.microblaze_pcie_root_complex_vivado_38
  13. Add a “Utility Buffer” to the block design. This buffer is going to be connected to a 100MHz clock that will be provided to the KC705 board by the FPGA Drive adapter, via the PCIe edge-connector. A 100MHz reference clock is required by all PCIe devices.microblaze_pcie_root_complex_vivado_39
  14. Double click on the utility buffer and on the “Page 0” tab of the configuration window, select “IBUFDSGTE” as the C Buf Type. Click “OK”.microblaze_pcie_root_complex_vivado_40
  15. Connect the “IBUF_OUT” output of the utility buffer to the “REFCLK” input of the AXI-PCIe block.microblaze_pcie_root_complex_vivado_41
  16. Right click on the “CLK_IN_D” input of the utility buffer and select “Make External”.microblaze_pcie_root_complex_vivado_42
  17. Change the name of the created external port to “ref_clk” using the External Interface Properties window.microblaze_pcie_root_complex_vivado_43
  18. We need to connect the PCIe interrupt to the Microblaze. Connect the “interrupt_out” output of the AXI-PCIe block to the “In0” input of the interrupt concat “microblaze_0_xlconcat”.microblaze_pcie_root_complex_vivado_44
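For the scripted route, the bridge’s own settings (port type, lane width, link speed, class code and BARs) are easiest to set in the re-customize dialog, which writes the matching set_property CONFIG.* commands into vivado.jou. The surrounding plumbing looks roughly like this in Tcl; instance names follow the tutorial and IP versions are those in Vivado 2015.4.

# Tie the INTX_MSI_Request input low with a constant
create_bd_cell -type ip -vlnv xilinx.com:ip:xlconstant:1.1 xlconstant_0
set_property -dict [list CONFIG.CONST_VAL {0} CONFIG.CONST_WIDTH {1}] [get_bd_cells xlconstant_0]
connect_bd_net [get_bd_pins xlconstant_0/dout] [get_bd_pins axi_pcie_0/INTX_MSI_Request]
# Differential input buffer for the 100MHz PCIe reference clock
create_bd_cell -type ip -vlnv xilinx.com:ip:util_ds_buf:2.0 util_ds_buf_0
set_property CONFIG.C_BUF_TYPE {IBUFDSGTE} [get_bd_cells util_ds_buf_0]
connect_bd_net [get_bd_pins util_ds_buf_0/IBUF_OUT] [get_bd_pins axi_pcie_0/REFCLK]
# External ports for the transceiver lanes and the reference clock
# (make_bd_intf_pins_external names the new ports with a _0 suffix, hence the renames)
make_bd_intf_pins_external [get_bd_intf_pins axi_pcie_0/pcie_7x_mgt]
make_bd_intf_pins_external [get_bd_intf_pins util_ds_buf_0/CLK_IN_D]
set_property name pcie_7x_mgt [get_bd_intf_ports pcie_7x_mgt_0]
set_property name ref_clk [get_bd_intf_ports CLK_IN_D_0]
# Reset, lock and interrupt hookup for the bridge
connect_bd_net [get_bd_pins axi_pcie_0/mmcm_lock] [get_bd_pins rst_axi_pcie_0_62M/dcm_locked]
connect_bd_net [get_bd_ports perst_n] [get_bd_pins axi_pcie_0/axi_aresetn]
connect_bd_net [get_bd_pins axi_pcie_0/interrupt_out] [get_bd_pins microblaze_0_xlconcat/In0]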

 

Add the CDMA

Now we’ll add a Central DMA to this design which will allow us to set up data transfers between the PCIe end-point and the DDR3 memory. We won’t actually test the CDMA in this tutorial series, but it’s an important part of any PCIe design that needs to transfer large amounts of data very quickly over the PCIe link. We will add an AXI Interconnect to allow the CDMA to access both the PCIe end-point and the MIG. A Tcl sketch of these steps is shown after the list.

  1. Add a “AXI Central Direct Memory Access” from the IP Catalog to the block design.microblaze_pcie_root_complex_vivado_49
  2. Double click on the CDMA block to open the configuration window. Disable Scatter Gather and set “Write/Read Data Width” to 128 as shown in the image below.microblaze_pcie_root_complex_vivado_50
  3. Connect the “cdma_introut” output of the CDMA to the “In1” input of the interrupt concat “microblaze_0_xlconcat”.microblaze_pcie_root_complex_vivado_51
  4. Add an “AXI Interconnect” from the IP Catalog to the block design. Rename it to “cdma_intercon” using the “Sub-block Properties” window.microblaze_pcie_root_complex_vivado_52
  5. Connect the “M_AXI” interface of the CDMA to the “S00_AXI” interface of the “cdma_intercon”.microblaze_pcie_root_complex_vivado_53
  6. Connect the “M00_AXI” interface of the “cdma_intercon” to the “S03_AXI” interface of the “mem_intercon”. This provides the data path between the CDMA and the DDR3 memory.microblaze_pcie_root_complex_vivado_54
  7. Now connect all the clocks and resets of the “cdma_intercon” as shown in the image below. Connect all the clock inputs to the “axi_aclk_out” output of the AXI-PCIe block. Connect the “ARESETN” input to the “interconnect_aresetn” output of the “rst_axi_pcie_0_62M” Processor System Reset. Connect all other reset inputs to the “peripheral_aresetn” output of the “rst_axi_pcie_0_62M” Processor System Reset.microblaze_pcie_root_complex_vivado_55
  8. Double click on the “microblaze_0_axi_periph” interconnect and configure it for 7 master ports. Leave the number of slave ports as 1.microblaze_pcie_root_complex_vivado_56
  9. Connect the “M01_AXI” interface of the “microblaze_0_axi_periph” interconnect to the “S_AXI_LITE” interface of the CDMA.microblaze_pcie_root_complex_vivado_57
  10. Connect the “m_axi_aclk” input of the CDMA to the “axi_aclk_out” output of the AXI-PCIe block.microblaze_pcie_root_complex_vivado_58
  11. Connect the “s_axi_lite_aclk” input of the CDMA to the “axi_aclk_out” output of the AXI-PCIe block.microblaze_pcie_root_complex_vivado_59
  12. Connect the “s_axi_lite_aresetn” input of the CDMA to the “peripheral_aresetn” output of the “rst_axi_pcie_0_62M” Processor System Reset block.microblaze_pcie_root_complex_vivado_60
  13. Connect the “M01_ACLK” input of the “microblaze_0_axi_periph” to the “axi_aclk_out” output of the AXI-PCIe block.microblaze_pcie_root_complex_vivado_61
  14. Connect the “M01_ARESETN” input of the “microblaze_0_axi_periph” to the “peripheral_aresetn” output of the “rst_axi_pcie_0_62M” Processor System Reset block.microblaze_pcie_root_complex_vivado_62
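Here’s a sketch of the same CDMA steps in Tcl, again assuming the instance names used throughout this tutorial (the cdma_intercon clocks and resets from step 7 are connected the same way and aren’t repeated here).

# Add the CDMA with Scatter Gather disabled and a 128-bit data width
create_bd_cell -type ip -vlnv xilinx.com:ip:axi_cdma:4.1 axi_cdma_0
set_property -dict [list CONFIG.C_INCLUDE_SG {0} CONFIG.C_M_AXI_DATA_WIDTH {128}] [get_bd_cells axi_cdma_0]
# CDMA interrupt to the concat, and the CDMA data path to the DDR3 via its own interconnect
connect_bd_net [get_bd_pins axi_cdma_0/cdma_introut] [get_bd_pins microblaze_0_xlconcat/In1]
create_bd_cell -type ip -vlnv xilinx.com:ip:axi_interconnect:2.1 cdma_intercon
connect_bd_intf_net [get_bd_intf_pins axi_cdma_0/M_AXI] [get_bd_intf_pins cdma_intercon/S00_AXI]
connect_bd_intf_net [get_bd_intf_pins cdma_intercon/M00_AXI] [get_bd_intf_pins mem_intercon/S03_AXI]
# Grow the peripheral interconnect to 7 masters and hook up the CDMA control interface
set_property CONFIG.NUM_MI {7} [get_bd_cells microblaze_0_axi_periph]
connect_bd_intf_net [get_bd_intf_pins microblaze_0_axi_periph/M01_AXI] [get_bd_intf_pins axi_cdma_0/S_AXI_LITE]
# Clocks and resets in the axi_aclk_out domain
connect_bd_net [get_bd_pins axi_pcie_0/axi_aclk_out] [get_bd_pins axi_cdma_0/m_axi_aclk] [get_bd_pins axi_cdma_0/s_axi_lite_aclk] [get_bd_pins microblaze_0_axi_periph/M01_ACLK]
connect_bd_net [get_bd_pins rst_axi_pcie_0_62M/peripheral_aresetn] [get_bd_pins axi_cdma_0/s_axi_lite_aresetn] [get_bd_pins microblaze_0_axi_periph/M01_ARESETN]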

 

Connect the AXI PCIe slave interfaces

The AXI PCIe block has one slave interface for configuration (S_AXI_CTL) and another for accessing the PCIe end-point (S_AXI). The slave interface for configuration must be driven synchronous to the “axi_ctl_aclk_out” clock, so before connecting the slave interfaces, we first need to create a Processor System Reset to generate a reset signal that is synchronous to this clock.

  1. Add a “Processor System Reset” from the IP Catalog.microblaze_pcie_root_complex_vivado_45
  2. Connect the “axi_ctl_aclk_out” clock output of the AXI-PCIe block to the “slowest_sync_clk” input of the Processor System Reset just added.microblaze_pcie_root_complex_vivado_46
  3. Connect the “ext_reset_in” input of the Processor System Reset to the “perst_n” port.microblaze_pcie_root_complex_vivado_47
  4. Connect the “dcm_locked” input of the Processor System Reset to the “mmcm_lock” output of the AXI-PCIe block.microblaze_pcie_root_complex_vivado_48
  5. Now the Processor System Reset is setup and we can connect the AXI-PCIe block slave control interface. We want the control interface to be connected to the Microblaze, just like any other peripheral. Connect the “M02_AXI” interface of the “microblaze_0_axi_periph” interconnect to the “S_AXI_CTL” interface of the AXI-PCIe block.microblaze_pcie_root_complex_vivado_63
  6. Connect the “M02_ACLK” input of the “microblaze_0_axi_periph” interconnect to the “axi_ctl_aclk_out” output of the AXI-PCIe block.microblaze_pcie_root_complex_vivado_64
  7. Connect the “peripheral_aresetn” output of the “proc_sys_reset_0” Processor System Reset to the “M02_ARESETN” input of the “microblaze_0_axi_periph” interconnect.microblaze_pcie_root_complex_vivado_65

The other slave interface of the AXI-PCIe block, S_AXI, provides access to the PCIe end-point address space. We want this port to be accessible to both the Microblaze and the CDMA, so we will add another AXI Interconnect to the design.

  1. Add an “AXI Interconnect” from the IP Catalog to the block design. Rename it “pcie_intercon” and configure it to have 2 slave interfaces and 1 master interface.microblaze_pcie_root_complex_vivado_66
  2. Connect the “M00_AXI” interface of the “pcie_intercon” to the “S_AXI” interface of the AXI-PCIe block.microblaze_pcie_root_complex_vivado_67
  3. Now connect all the clocks and resets of the “pcie_intercon” as shown in the image below. Connect all the clock inputs to the “axi_aclk_out” output of the PCIe block. Connect the “ARESETN” input to the “interconnect_aresetn” output of the “rst_axi_pcie_0_62M” Processor System Reset. Connect all other reset inputs to the “peripheral_aresetn” output of the “rst_axi_pcie_0_62M” Processor System Reset.microblaze_pcie_root_complex_vivado_68
  4. Connect the “M01_AXI” interface of the “cdma_intercon” to the “S00_AXI” interface of the “pcie_intercon”.microblaze_pcie_root_complex_vivado_69
  5. Connect the “M03_AXI” interface of the “microblaze_0_axi_periph” interconnect to the “S01_AXI” interface of the “pcie_intercon”.microblaze_pcie_root_complex_vivado_70
  6. Connect the “M03_ACLK” input of the “microblaze_0_axi_periph” interconnect to the “axi_aclk_out” output of the AXI-PCIe block.microblaze_pcie_root_complex_vivado_71
  7. Connect the “M03_ARESETN” of the “microblaze_0_axi_periph” interconnect to the “peripheral_aresetn” of the “rst_axi_pcie_0_62M” Processor System Reset block.microblaze_pcie_root_complex_vivado_72

 

Add the other peripherals

To make our design “Linux ready”, we need to add four more blocks:

  • UART – for console output
  • AXI Ethernet Lite – for network connection
  • AXI Quad SPI – for retrieval of FPGA configuration files, software and Linux kernel from a QSPI Flash
  • AXI Timer – provides a timer for PetaLinux, since the Microblaze doesn’t have an integrated one

We will add all 4 blocks to the design and then let the connection automation feature handle the connection of these peripherals to the Microblaze. A Tcl outline of this section is shown after the steps below.

  1. Add an “AXI UART16550” from the IP Catalog to the block design.
  2. Add an “AXI EthernetLite” from the IP Catalog to the block design.
  3. Add an “AXI Quad SPI” from the IP Catalog to the block design.
  4. Add an “AXI Timer” from the IP Catalog to the block design.microblaze_pcie_root_complex_vivado_74
  5. Click “Run Connection Automation” and select all of the connections for the 4 added peripherals.microblaze_pcie_root_complex_vivado_75
  6. They will all have been automatically connected to the “microblaze_0_axi_periph” interconnect as shown in the image below.microblaze_pcie_root_complex_vivado_76
  7. Connect the “ext_spi_clk” input of the AXI QSPI to the same clock as its “s_axi_aclk” input.microblaze_pcie_root_complex_vivado_77
  8. Double click on the “microblaze_0_xlconcat” interrupt concat and change the number of input ports to 6 – we need 4 more to connect the interrupts of our new peripherals.microblaze_pcie_root_complex_vivado_78
  9. One-by-one, connect the interrupt outputs of the peripherals to the inputs of the interrupt concat as shown in the image below. The interrupt output for the UART, AXI EthernetLite and AXI QSPI is called “ip2intc_irpt”. The interrupt output for the AXI Timer is called “interrupt”.microblaze_pcie_root_complex_vivado_79
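For scripting, adding the four peripherals and wiring their interrupts looks roughly like this; the bus connections themselves are still easiest to leave to Run Connection Automation as in step 5, and the In2..In5 ordering on the concat is arbitrary as long as it matches the rest of your design.

# Add the four peripherals (IP versions are those in Vivado 2015.4)
create_bd_cell -type ip -vlnv xilinx.com:ip:axi_uart16550:2.0 axi_uart16550_0
create_bd_cell -type ip -vlnv xilinx.com:ip:axi_ethernetlite:3.0 axi_ethernetlite_0
create_bd_cell -type ip -vlnv xilinx.com:ip:axi_quad_spi:3.2 axi_quad_spi_0
create_bd_cell -type ip -vlnv xilinx.com:ip:axi_timer:2.0 axi_timer_0
# After running connection automation, drive ext_spi_clk from the same clock as s_axi_aclk
connect_bd_net [get_bd_pins axi_quad_spi_0/s_axi_aclk] [get_bd_pins axi_quad_spi_0/ext_spi_clk]
# Grow the interrupt concat to 6 inputs and connect the new interrupt sources
set_property CONFIG.NUM_PORTS {6} [get_bd_cells microblaze_0_xlconcat]
connect_bd_net [get_bd_pins axi_uart16550_0/ip2intc_irpt] [get_bd_pins microblaze_0_xlconcat/In2]
connect_bd_net [get_bd_pins axi_ethernetlite_0/ip2intc_irpt] [get_bd_pins microblaze_0_xlconcat/In3]
connect_bd_net [get_bd_pins axi_quad_spi_0/ip2intc_irpt] [get_bd_pins microblaze_0_xlconcat/In4]
connect_bd_net [get_bd_pins axi_timer_0/interrupt] [get_bd_pins microblaze_0_xlconcat/In5]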

 

Add some debug signals

It’s always nice to have an LED light up to tell us that things are working correctly.

  1. Right click on the “mmcm_lock” output of the AXI-PCIe block and select “Make External”.microblaze_pcie_root_complex_vivado_87
  2. Right click on the “init_calib_complete” output of the MIG and select “Make External”.microblaze_pcie_root_complex_vivado_88

We will later add a constraint for each one of these ports to assign it to a specific LED on the KC705 board.

Assign addresses

  1. Open the “Address Editor” tab and click the “Auto Assign Address” button.microblaze_pcie_root_complex_vivado_82
  2. All addresses should be assigned as in the image below.microblaze_pcie_root_complex_vivado_83
  3. By default, the AXI-PCIe control interface (S_AXI_CTL) is allocated 256M, but this will cause a problem for PetaLinux later on, so change it to 64M and then save the block design.microblaze_pcie_root_complex_vivado_95
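The equivalent Tcl is short; note that the segment name for the control interface below is the default one that Vivado generated in my project, so check yours in the Address Editor or with get_bd_addr_segs if the command complains.

# Assign all unmapped address segments automatically
assign_bd_address
# Shrink the AXI-PCIe control interface (S_AXI_CTL) window to 64M for PetaLinux
set_property range 64M [get_bd_addr_segs microblaze_0/Data/SEG_axi_pcie_0_CTL0]
save_bd_design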

 

Create the HDL wrapper

Now the block diagram is complete, so we can save it and create an HDL wrapper for it. The Tcl equivalent is shown after the steps below.

  1. Open the “Sources” tab from the Block Design window.
  2. Right-click on “design_1” and select “Create HDL wrapper” from the drop-down menu.microblaze_pcie_root_complex_vivado_85
  3. From the “Create HDL wrapper” window, select “Let Vivado manage wrapper and auto-update”. Click “OK”.microblaze_pcie_root_complex_vivado_86
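From the Tcl console, the wrapper can be generated and added in one go; this doesn’t toggle the GUI’s “auto-update” option, but the generated file is the same.

# make_wrapper returns the path of the generated wrapper, which we then add to the project
add_files -norecurse [make_wrapper -files [get_files design_1.bd] -top]
set_property top design_1_wrapper [current_fileset]
update_compile_order -fileset sources_1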

 

Add the constraints

We must now add our constraints to the design to assign the locations of the PCIe integrated block, the gigabit transceivers, the reference clocks, the LEDs and a few other signals. A Tcl one-liner for this step is shown after the list.

  1. Download the constraints file from this link: Constraints for Microblaze PCIe Root Complex design
  2. Save the constraints file somewhere on your hard disk.
  3. From the Project Manager, click “Add Sources”.microblaze_pcie_root_complex_vivado_89
  4. Then click “Add or create constraints”.microblaze_pcie_root_complex_vivado_90
  5. Then click “Add files” and browse to the constraints file that you downloaded earlier. Select the constraints file, then click “OK”. Now tick “Copy constraints files into project” and click “Finish”.microblaze_pcie_root_complex_vivado_91
  6. You should now see the constraints file in the Sources window.microblaze_pcie_root_complex_vivado_92
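Or from Tcl, assuming you saved the downloaded file as kc705_pcie.xdc (a placeholder name and path) somewhere on disk:

# Import (copy) the constraints file into the project's constraints fileset
import_files -fileset constrs_1 -norecurse C:/temp/kc705_pcie.xdc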

Finished at last!

 

In the next tutorial: Zynq

In the next part of this tutorial series, we will build another PCIe Root Complex design in Vivado, but this time for the Zynq. The target hardware will be the PicoZed 7Z030 and the PicoZed FMC Carrier Card V2.

 

Testing the project on hardware

In the third and final part of this tutorial series, we will run a stand-alone application on the hardware which will check the state of the PCIe link and enumerate the connected PCIe end-points. Then we will run PetaLinux on our hardware and make an NVMe PCIe SSD accessible under the operating system.

 

Sources Git repository

The sources for re-generating this project automatically can be found on Github here: FPGA Drive PCIe Root Complex design

 

Other useful resources

Here are some other useful resources for creating PCI Express designs:

If you have any questions about this tutorial, or if you run into problems, please leave me a comment below.

Zynq PCI Express Root Complex design in Vivado


This is the second part of a three part tutorial series in which we will create a PCI Express Root Complex design in Vivado with the goal of connecting a PCIe NVMe solid-state drive to our FPGA.

In this second part of the tutorial series, we will build a Zynq based design targeting the PicoZed 7Z030 and PicoZed FMC Carrier Card V2. In part 3, we will then test the design on the target hardware by running a stand-alone application which will validate the state of the PCIe link and perform enumeration of the PCIe end-points. We will then run PetaLinux on the FPGA and prepare our SSD for use under the operating system.

Requirements

To complete this tutorial you will need the following:

Note: The tutorial text and screenshots are suitable for Vivado 2015.4 however the sources in the Git repository will be regularly updated to the latest version of Vivado.

Design Overview

The diagram below shows the block design we are about to build with only the AXI interfaces showing. It shows three main elements: the Zynq PS, the AXI to PCIe bridge and the AXI CDMA. If you went through the previous tutorial where we created the same design for a Microblaze system, you may be wondering why the Zynq design seems so much simpler. The reason is that a lot of the elements required in this design are hidden in the Zynq PS block, including the DDR3 memory controller, UART, Ethernet, Interrupt controller, Timer and QSPI.

zynq_pcie_root_port_design_vivado_48

So again let’s look at who the bus masters are and what address spaces they can access:

  • the Zynq PS can access both the DDR3 memory and the PCIe address space
  • the PCIe end-point with bus mastering capability can access the DDR3 memory only (via M_AXI port of the AXI-PCIe bridge)
  • the CDMA can access both the DDR3 memory and the PCIe address space

 

Install PicoZed board definition files

The first thing we have to do is provide the PicoZed board definition files to our Vivado installation so that the PicoZed will show up in the list of targets when we create a new project. The board definition files contain information about the hardware on the target board and also on how the Zynq PS should be configured in order to properly connect to that hardware.

  1. Download the PicoZed board definition files for Vivado 2015.4 from the PicoZed documentation page.
  2. From inside the ZIP file, copy the folder picozed_7030_fmc2 into the folder C:\Xilinx\Vivado\2015.4\data\boards\board_files of your Vivado installation.

Create a new Vivado project

Let’s kick off the design by creating a new project in Vivado and selecting the PicoZed 7Z030 as our target.

  1. From the welcome screen, click “Create New Project”. Specify a folder for the project. I’ve created a folder named “pz_7z030_aximm_pcie”. Click “Next”.zynq_pcie_root_port_design_vivado_1
  2. For the Project Type window, choose “RTL Project” and tick “Do not specify sources at this time”. Click “Next”.zynq_pcie_root_port_design_vivado_2
  3. For the Default Part window, select the “Boards” tab and then select the “PicoZed 7030 SOM + FMC Carrier V2” and click “Next”.zynq_pcie_root_port_design_vivado_3
  4. Click “Finish” to complete the new project wizard.

 

Create the block design

In the following steps, we’ll create the block design then add the Zynq PS and the AXI Memory Mapped PCI Express Bridge. A Tcl sketch of the Zynq PS configuration is shown after the list.

  1. From the Vivado Flow Navigator, click “Create Block Design”.
  2. Specify a name for the block design. Let’s go with the default “design_1” and leave it local to the project. Click “OK”.
  3. Once the empty block design opens, click on the “Add IP” icon. The IP catalog will appear. Find and double click on “ZYNQ7 Processing System”.zynq_pcie_root_port_design_vivado_4
  4. The Zynq PS block will be added to the block design. Click Run Block Automation to configure the Zynq PS for our target hardware.zynq_pcie_root_port_design_vivado_5
  5. Use the default block automation settings.zynq_pcie_root_port_design_vivado_6
  6. Now double click on the Zynq PS to configure it.
  7. In the PS-PL Configuration tab, enable HP Slave AXI interface (S AXI HP0 interface). HP stands for high-performance and this port allows an AXI master to access the DDR3 memory.zynq_pcie_root_port_design_vivado_7
  8. In the Clock Configuration tab, disable the PL Fabric Clock FCLK_CLK0, because we won’t be needing it. Instead, most of our design will be driven by the clock supplied by the AXI-PCIe bridge, which is derived from the 100MHz PCIe reference clock.zynq_pcie_root_port_design_vivado_8
  9. In the Interrupts tab, enable Fabric Interrupts, IRQ_F2P. This allows us to connect interrupts from our PL (programmable logic) to the Zynq PS.zynq_pcie_root_port_design_vivado_9
  10. The Zynq block should look like in the image below.zynq_pcie_root_port_design_vivado_10
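Here’s that sketch of the PS setup; the PCW_* property names below are the ones the Zynq customization dialog writes to the journal file on my machine, so double-check them against your own vivado.jou before relying on them.

# Add the Zynq PS and apply the board preset for the PicoZed 7030 + FMC Carrier V2
create_bd_cell -type ip -vlnv xilinx.com:ip:processing_system7:5.5 processing_system7_0
apply_bd_automation -rule xilinx.com:bd_rule:processing_system7 -config {make_external "FIXED_IO, DDR" apply_board_preset "1"} [get_bd_cells processing_system7_0]
# Enable S_AXI_HP0, disable FCLK_CLK0 and enable fabric interrupts (IRQ_F2P)
set_property -dict [list CONFIG.PCW_USE_S_AXI_HP0 {1} CONFIG.PCW_EN_CLK0_PORT {0} CONFIG.PCW_USE_FABRIC_INTERRUPT {1} CONFIG.PCW_IRQ_F2P_INTR {1}] [get_bd_cells processing_system7_0]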

 

Add the AXI MM to PCIe bridge

  1. From the IP Catalog, add the “AXI Memory Mapped to PCI Express” block to the design.
  2. When the AXI-PCIe block is in the block design, double click on it to configure it.
  3. On the “PCIE:Basics” tab of the configuration, select “Root Port of PCI Express Root Complex” as the port type.zynq_pcie_root_port_design_vivado_11
  4. On the “PCIE:Link Config” tab, select a “Lane Width” of 1x and a “Link speed” of 5 GT/s (Gen2). We plan to connect to a 4-lane NVMe PCIe SSD in the next part of this tutorial, but the target hardware only has a single-lane PCIe edge connector.zynq_pcie_root_port_design_vivado_12
  5. In the “PCIE:ID” tab, enter a “Class Code” of 0x060400. This is important for the next tutorial, in which we will be running PetaLinux. The class code will ensure that the correct driver is associated with the AXI to PCIe bridge IP.zynq_pcie_root_port_design_vivado_13
  6. In the “PCIE:BARS” tab, set BAR 0 with type “Memory” and a size of 1 Gigabyte.zynq_pcie_root_port_design_vivado_14
  7. In the “PCIE:Misc” tab, use the defaults as shown in the image below.zynq_pcie_root_port_design_vivado_15
  8. In the “AXI:BARS” tab, use the defaults as shown in the image below. We will later be able to configure the size of the AXI BAR 0 in the Address Editor.zynq_pcie_root_port_design_vivado_16
  9. In the “AXI:System” tab, use the defaults as shown in the image below.zynq_pcie_root_port_design_vivado_17
  10. In the “Shared Logic” tab, use the defaults as shown in the image below. Click “OK”.zynq_pcie_root_port_design_vivado_18
  11. Right click on the “pcie_7x_mgt” port of the AXI-PCIe block and select “Make External”. This will connect the gigabit transceiver to the 1-lane PCIe edge-connector of the PicoZed 7030 SOM + FMC Carrier V2.zynq_pcie_root_port_design_vivado_19
  12. Right click on the “axi_aresetn” port of the AXI-PCIe block and select “Make External”. This will be connected to the PERST_N signal that is generated by the FPGA Drive adapter.zynq_pcie_root_port_design_vivado_20
  13. Rename the created port to “perst_n” using the External Port Properties window.zynq_pcie_root_port_design_vivado_21
  14. Add a “Constant” from the IP Catalog and configure it to output 0 (low). We’ll use this to tie low the “INTX_MSI_Request” input of the AXI-PCIe block. Connect the constant’s output to the “INTX_MSI_Request” input of the AXI-PCIe block.zynq_pcie_root_port_design_vivado_22
  15. Add a “Utility Buffer” to the block design. This buffer is going to be connected to a 100MHz clock that will be provided to the PicoZed 7030 SOM + FMC Carrier V2 by the FPGA Drive adapter, via the PCIe edge-connector. A 100MHz reference clock is required by all PCIe devices. Double click on the utility buffer and on the “Page 0” tab of the configuration window, select “IBUFDSGTE” as the C Buf Type. Click “OK”.zynq_pcie_root_port_design_vivado_23
  16. Connect the “IBUF_OUT” output of the utility buffer to the “REFCLK” input of the AXI-PCIe block.zynq_pcie_root_port_design_vivado_24
  17. Right click on the “CLK_IN_D” input of the utility buffer and select “Make External”, then change the name of the created external port to “ref_clk” using the External Interface Properties window.zynq_pcie_root_port_design_vivado_25

 

Add the Processor System Resets

Our design will be using the two clocks supplied by the AXI-PCIe bridge: “axi_aclk_out” and “axi_ctl_aclk_out”. We will need to add a Processor System Reset to generate resets for each of those clocks. A Tcl sketch of these connections follows the steps below.

  1. From the IP Catalog, add a “Processor System Reset” to the design – this one should automatically be called “proc_sys_reset_0”.
  2. Connect the “axi_ctl_aclk_out” output of the AXI-PCIe block to the “slowest_sync_clk” input of the “proc_sys_reset_0” Processor System Reset.zynq_pcie_root_port_design_vivado_26
  3. Connect the “mmcm_lock” output of the AXI-PCIe block to the “dcm_locked” input of the “proc_sys_reset_0” Processor System Reset.zynq_pcie_root_port_design_vivado_27
  4. Connect the “ext_reset_in” input of the “proc_sys_reset_0” Processor System Reset to the “perst_n” port.zynq_pcie_root_port_design_vivado_28
  5. From the IP Catalog, add another “Processor System Reset” to the design – this one should automatically be called “proc_sys_reset_1”.
  6. Connect the “axi_aclk_out” output of the AXI-PCIe block to the “slowest_sync_clk” input of the “proc_sys_reset_1” Processor System Reset.zynq_pcie_root_port_design_vivado_29
  7. Connect the “mmcm_lock” output of the AXI-PCIe block to the “dcm_locked” input of the “proc_sys_reset_1” Processor System Reset.zynq_pcie_root_port_design_vivado_30
  8. Connect the “ext_reset_in” input of the “proc_sys_reset_1” Processor System Reset to the “perst_n” port.zynq_pcie_root_port_design_vivado_31
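In Tcl, the two reset blocks and their hookup look roughly like this (instance names as in the steps above; axi_pcie_0 is assumed to be the bridge’s default name).

# One Processor System Reset per AXI-PCIe output clock
create_bd_cell -type ip -vlnv xilinx.com:ip:proc_sys_reset:5.0 proc_sys_reset_0
create_bd_cell -type ip -vlnv xilinx.com:ip:proc_sys_reset:5.0 proc_sys_reset_1
# proc_sys_reset_0 is synchronous to axi_ctl_aclk_out
connect_bd_net [get_bd_pins axi_pcie_0/axi_ctl_aclk_out] [get_bd_pins proc_sys_reset_0/slowest_sync_clk]
connect_bd_net [get_bd_pins axi_pcie_0/mmcm_lock] [get_bd_pins proc_sys_reset_0/dcm_locked]
connect_bd_net [get_bd_ports perst_n] [get_bd_pins proc_sys_reset_0/ext_reset_in]
# proc_sys_reset_1 is synchronous to axi_aclk_out
connect_bd_net [get_bd_pins axi_pcie_0/axi_aclk_out] [get_bd_pins proc_sys_reset_1/slowest_sync_clk]
connect_bd_net [get_bd_pins axi_pcie_0/mmcm_lock] [get_bd_pins proc_sys_reset_1/dcm_locked]
connect_bd_net [get_bd_ports perst_n] [get_bd_pins proc_sys_reset_1/ext_reset_in]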

 

Add the CDMA

We’re going to add a Central DMA to this design to allow us to make DMA transfers between the PCIe end-point and the DDR3 memory. We won’t actually test it (that will be the subject of another tutorial), but most PCIe designs can benefit from having a Central DMA because it allows for higher throughput over the PCIe link using burst transfers.

  1. Add an “AXI Central Direct Memory Access” from the IP Catalog to the block design.
  2. Double click on the CDMA block to open the configuration window. Disable Scatter Gather and set “Write/Read Data Width” to 128 as shown in the image below.zynq_pcie_root_port_design_vivado_32

 

Add the Interrupt Concat

To connect interrupts to the IRQ_F2P port of the Zynq PS, we need to use a Concat.

  1. From the IP Catalog, add a “Concat” to the block design.
  2. By default, it should have two inputs – that’s perfect for us, as we only have 2 interrupts to connect. Connect the output of the Concat to the “IRQ_F2P” port of the Zynq PS.zynq_pcie_root_port_design_vivado_33
  3. Connect the “interrupt_out” output of the AXI-PCIe block to the “In0” input of the Concat.
  4. Connect the “cdma_introut” output of the CDMA block to the “In1” input of the Concat.zynq_pcie_root_port_design_vivado_34

Add the AXI Interconnects

Now the last thing to do is add the AXI Interconnects and wire up all the AXI interfaces.

axi_interconnect_0:

  1. From the IP Catalog, add an “AXI Interconnect” to the block design – this one should be automatically named “axi_interconnect_0”. We’ll use this to create two ports for accessing the DDR3 memory.
  2. Re-configure it to have 2 slave ports and 1 master port.
  3. Connect the “M00_AXI” port of the “axi_interconnect_0” to the “S_AXI_HP0” port of the Zynq PS.zynq_pcie_root_port_design_vivado_35

axi_interconnect_1:

  1. From the IP Catalog, add another “AXI Interconnect” to the block design – this one should be automatically named “axi_interconnect_1”. We’ll use this to create two ports for accessing the AXI-PCIe control interface, the PCIe end-point and the CDMA control interface.
  2. Re-configure it to have 2 slave ports and 3 master ports.
  3. Connect the “M00_AXI” port of the “axi_interconnect_1” to the “S_AXI” port of the AXI-PCIe block.
  4. Connect the “M01_AXI” port of the “axi_interconnect_1” to the “S_AXI_CTL” port of the AXI-PCIe block.
  5. Connect the “M02_AXI” port of the “axi_interconnect_1” to the “S_AXI_LITE” port of the CDMA block.zynq_pcie_root_port_design_vivado_36

axi_interconnect_2:

  1. From the IP Catalog, add another “AXI Interconnect” to the block design – this one should be automatically named “axi_interconnect_2”. We’ll use this to allow the CDMA to access both the DDR3 memory and the PCIe end-point.
  2. By default, it should already have 1 slave port and 2 master ports, which is exactly what we need.
  3. Connect the “M00_AXI” port of the “axi_interconnect_2” to the “S01_AXI” port of the “axi_interconnect_0” (the first interconnect we created).
  4. Connect the “M01_AXI” port of the “axi_interconnect_2” to the “S01_AXI” port of the “axi_interconnect_1” (the second interconnect we created).zynq_pcie_root_port_design_vivado_37

Now for the rest of the connections:

  1. Connect the “M_AXI” port of the CDMA block to the “S00_AXI” port of the “axi_interconnect_2”.zynq_pcie_root_port_design_vivado_38
  2. Connect the “M_AXI_GP0” port of the Zynq PS to the “S00_AXI” port of the “axi_interconnect_1”.zynq_pcie_root_port_design_vivado_39
  3. Connect the “M_AXI” port of the AXI-PCIe block to the “S00_AXI” port of the “axi_interconnect_0”.zynq_pcie_root_port_design_vivado_40

 

Connect all the clocks

Let’s start by hooking up the main clock axi_aclk_out (a Tcl sketch covering all of the clock connections is shown after these steps):

  1. Connect “axi_aclk_out” clock to the “M_AXI_GP0_ACLK” and “S_AXI_HP0_ACLK” inputs of the Zynq PS.
  2. Connect “axi_aclk_out” clock to the “m_axi_aclk” and “s_axi_lite_aclk” inputs of the CDMA.
  3. Connect “axi_aclk_out” clock to the “ACLK”, “S00_ACLK”, “M00_ACLK” and “S01_ACLK” inputs of the “axi_interconnect_0” (ie. all of the clock inputs).
  4. Connect “axi_aclk_out” clock to the “ACLK”, “S00_ACLK”, “M00_ACLK”, “S01_ACLK” and “M02_ACLK” inputs of the “axi_interconnect_1” (notice that we do not connect “M01_ACLK” yet!).
  5. Connect “axi_aclk_out” clock to the “ACLK”, “S00_ACLK”, “M00_ACLK” and “M01_ACLK” inputs of the “axi_interconnect_2” (ie. all of the clock inputs).

Now the control clock axi_ctl_aclk_out:

  1. Connect “axi_ctl_aclk_out” clock to the “M01_ACLK” input of the “axi_interconnect_1”.
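Since one clock drives nearly every block, the Tcl for this whole section collapses to two connect_bd_net calls; this sketch assumes the default instance names (processing_system7_0, axi_cdma_0, axi_interconnect_0/1/2 and axi_pcie_0).

# axi_aclk_out drives everything except the bridge's control interface path
connect_bd_net [get_bd_pins axi_pcie_0/axi_aclk_out] \
    [get_bd_pins processing_system7_0/M_AXI_GP0_ACLK] [get_bd_pins processing_system7_0/S_AXI_HP0_ACLK] \
    [get_bd_pins axi_cdma_0/m_axi_aclk] [get_bd_pins axi_cdma_0/s_axi_lite_aclk] \
    [get_bd_pins axi_interconnect_0/ACLK] [get_bd_pins axi_interconnect_0/S00_ACLK] [get_bd_pins axi_interconnect_0/M00_ACLK] [get_bd_pins axi_interconnect_0/S01_ACLK] \
    [get_bd_pins axi_interconnect_1/ACLK] [get_bd_pins axi_interconnect_1/S00_ACLK] [get_bd_pins axi_interconnect_1/M00_ACLK] [get_bd_pins axi_interconnect_1/S01_ACLK] [get_bd_pins axi_interconnect_1/M02_ACLK] \
    [get_bd_pins axi_interconnect_2/ACLK] [get_bd_pins axi_interconnect_2/S00_ACLK] [get_bd_pins axi_interconnect_2/M00_ACLK] [get_bd_pins axi_interconnect_2/M01_ACLK]
# axi_ctl_aclk_out drives only M01 of axi_interconnect_1 (the S_AXI_CTL path)
connect_bd_net [get_bd_pins axi_pcie_0/axi_ctl_aclk_out] [get_bd_pins axi_interconnect_1/M01_ACLK]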

 

Connect all the resets

  1. Connect the “interconnect_aresetn” output of the “proc_sys_reset_1” Processor System Reset to the “ARESETN” input of ALL 3 AXI Interconnects.
  2. Connect the “peripheral_aresetn” output of the “proc_sys_reset_1” Processor System Reset to the following inputs:
    1. CDMA input “s_axi_lite_aresetn”
    2. “axi_interconnect_0” inputs “S00_ARESETN”, “M00_ARESETN” and “S01_ARESETN”
    3. “axi_interconnect_1” inputs “S00_ARESETN”, “M00_ARESETN”, “S01_ARESETN” and “M02_ARESETN” (notice that we do not connect “M01_ARESETN” yet!)
    4. “axi_interconnect_2” inputs “S00_ARESETN”, “M00_ARESETN” and “M01_ARESETN”
  3. Connect the “peripheral_aresetn” output of the “proc_sys_reset_0” Processor System Reset to the “M01_ARESETN” of “axi_interconnect_1”.

 

Assign addresses

  1. Open the “Address Editor” tab and click the “Auto Assign Address” button.zynq_pcie_root_port_design_vivado_41
  2. There will be an error generated because Vivado will try to assign 1G to the PCIe BAR0 and 256M to the PCIe control interface (CTL0). Change the size of PCIe BAR0 to 256M and use the “Auto Assign Address” button again. It should succeed this time and you will have the addresses shown below.zynq_pcie_root_port_design_vivado_42
  3. Finally, we’ll need to set the size of the PCIe control interface to 64M, to avoid a memory allocation problem in PetaLinux later.zynq_pcie_root_port_design_vivado_43

 

Create the HDL wrapper

Now the block diagram is complete, so we can save it and create an HDL wrapper for it.

  1. Open the “Sources” tab from the Block Design window.
  2. Right-click on “design_1” and select “Create HDL wrapper” from the drop-down menu.microblaze_pcie_root_complex_vivado_85
  3. From the “Create HDL wrapper” window, select “Let Vivado manage wrapper and auto-update”. Click “OK”.microblaze_pcie_root_complex_vivado_86

 

Add the constraints

We must now add our constraints to the design for assignment of the PCIe integrated block, the gigabit transceivers, the reference clocks and a few other signals.

  1. Download the constraints file from this link: Constraints for Zynq PCIe Root Complex design
  2. Save the constraints file somewhere on your hard disk.
  3. From the Project Manager, click “Add Sources”.zynq_pcie_root_port_design_vivado_44
  4. Then click “Add or create constraints”.zynq_pcie_root_port_design_vivado_45
  5. Then click “Add files” and browse to the constraints file that you downloaded earlier. Select the constraints file, then click “OK”. Now tick “Copy constraints files into project” and click “Finish”.zynq_pcie_root_port_design_vivado_46
  6. You should now see the constraints file in the Sources window.zynq_pcie_root_port_design_vivado_47

You’re all done!

 

Testing the project on hardware

In the next and final part of this tutorial series, we will test our design on hardware by connecting an NVMe PCIe SSD to our FPGA. We’ll start by running a simple stand-alone application that will check the PCIe bus status and enumerate the end-points. Then we’ll generate a PetaLinux build that is customized to our hardware and we’ll bring up the SSD from the command line.

 

Sources Git repository

The sources for re-generating this project automatically can be found on Github here: FPGA Drive PCIe Root Complex design

 

Other useful resources

Here are some other useful resources for creating PCI Express designs:

If you have any questions about this tutorial, or if you run into problems, please leave me a comment below.

Connecting an SSD to an FPGA running PetaLinux


This is the final part of a three part tutorial series on creating a PCI Express Root Complex design in Vivado and connecting a PCIe NVMe solid-state drive to an FPGA.

In this final part of the tutorial series, we’ll start by testing our hardware with a stand-alone application that will verify the status of the PCIe link and perform enumeration of the PCIe end-points. We’ll then run PetaLinux on the FPGA and prepare our SSD for use under the operating system. PetaLinux will be built for our custom hardware using the PetaLinux SDK and the Vivado generated hardware description. Using Linux commands, we will then create a partition, a file system and a file on the solid-state drive.

This part of the tutorial applies to both the Microblaze and Zynq designs developed in the previous tutorials. Where the instructions differ between the designs, they are split into two branches.

 

Requirements

To complete this tutorial you will need the following:

Note: The tutorial text and screenshots are suitable for Vivado 2015.4 however the sources in the Git repository will be regularly updated to the latest version of Vivado.

 

Tool Setup for Windows users

PetaLinux SDK 2015.4 only runs in the Linux operating system, so Windows users (like me) have to have two machines to follow this tutorial. You can either have two physical machines, which is how I work, or you can have one Windows machine and one Linux virtual machine. In this tutorial, I will assume that you have two physical machines, one running Windows and the other running Linux. My personal setup uses Windows 7 and Ubuntu 14.04 LTS on two separate machines.

If you are building your Linux setup for the first time, here are the supported OSes according to the PetaLinux SDK Installation guide:

  • RHEL 5 (32-bit or 64-bit)
  • RHEL 6 (32-bit or 64-bit)
  • SUSE Enterprise 11 (32-bit or 64-bit)

Note: I had problems installing PetaLinux SDK 2015.4 on 32-bit Ubuntu, as did others, so I use 64-bit Ubuntu and I haven’t had any problems with my setup.

 

Setup the hardware: KC705

The KC705 Evaluation Board must be setup as shown in the image below. It is strongly recommended that you make the connections in the precise order described below.

connecting_ssd_to_fpga_running_petalinux_125

  1. Connect the M.2 PCIe SSD to the FPGA Drive adapter, and tighten the fixing screwconnecting_ssd_to_fpga_running_petalinux_123
  2. Connect the FPGA Drive to the KC705 PCI Express edge-connector. Do NOT put pressure on the M.2 SSD while doing this.
  3. Connect the input of the power splitter (comes with FPGA Drive) to the power adapter that was supplied with the KC705
  4. Connect one branch of the power splitter to the KC705 power connector (J49).
  5. Connect the other branch of the power splitter to the FPGA Drive power connector
  6. Connect a USB cable between your PC and the UART port of the KC705
  7. Connect a USB cable between your PC and the JTAG port of the KC705connecting_ssd_to_fpga_running_petalinux_127
  8. Set DIP switches (SW13) to 11101 (this is for configuration by JTAG, see UG810 page 73)connecting_ssd_to_fpga_running_petalinux_126

 

Setup the hardware: PicoZed and PicoZed FMC Carrier Card V2

The PicoZed 7Z030 and PicoZed FMC Carrier Card V2 must be setup as shown in the image below. It is strongly recommended that you make the connections in the precise order described below.

connecting_ssd_to_fpga_running_petalinux_121

  1. Insert the PicoZed into the SoM socket of the PicoZed FMC Carrier Card V2connecting_ssd_to_fpga_running_petalinux_124
  2. Connect the M.2 PCIe SSD to the FPGA Drive adapter, and tighten the fixing screwconnecting_ssd_to_fpga_running_petalinux_123
  3. Connect the FPGA Drive to the PicoZed FMC Carrier Card V2 PCI Express edge-connector. Do NOT put pressure on the M.2 SSD while doing this.
  4. Connect the input of the power splitter (comes with FPGA Drive) to the power adapter that was supplied with the PicoZed FMC Carrier Card V2
  5. Connect one branch of the power splitter to the PicoZed FMC Carrier Card V2 power connector (J2).
  6. Connect the other branch of the power splitter to the FPGA Drive power connector
  7. Connect a USB cable between your PC and the UART port of the PicoZed FMC Carrier Card V2
  8. Connect a JTAG programmer between your PC and the JTAG port (J7) of the PicoZed FMC Carrier Card V2
  9. Set DIP switches (SW1) to 00 (this is for configuration by JTAG, see PicoZed 7015/7030 User guide table 13)connecting_ssd_to_fpga_running_petalinux_122

 

Regenerate the Vivado project

If you did not follow either of the previous tutorials, and you do not have a completed Vivado project, then follow these instructions to regenerate the Vivado project from scripts. Please note that the Git repository is regularly updated for the latest version of Vivado, so you must download the last “commit” for the version of Vivado that you are using.

  1. Download the sources from Github here: https://github.com/fpgadeveloper/fpga-drive-aximm-pcie
  2. Depending on your operating system:
    • If you are using a Windows machine, open Windows Explorer, browse to the “Vivado” folder within the sources you just downloaded. Double-click on the build-<your target platform>.bat file to run the batch file.
    • If you are using a Linux machine, run Vivado and then select Window->Tcl Console from the welcome screen. In the Tcl console, use the “cd” command to navigate to the “Vivado” folder within the sources you just downloaded. Then type source build-<your target platform>.tcl to run the build script.
  3. Once the script has finished running, the Vivado project should be regenerated and located in the “Vivado” folder. Run Vivado and open the newly generated project.

Note: You must replace <your target platform> with kc705 or pz-7z030, depending on the target hardware you are using.

Generate a bitstream

The first thing we’ll need to do is to generate a bitstream from the Vivado project we created in the earlier tutorials. If you prefer scripting, there’s a short Tcl snippet after the steps below.

  1. Open the project in Vivado.
  2. From the Flow Navigator, click “Generate Bitstream”.connecting_ssd_to_fpga_running_petalinux_1
  3. Depending on your machine, it will take several minutes to perform synthesis and implementation. In the end, you should see the following message. Just select “View Reports” and click OK.connecting_ssd_to_fpga_running_petalinux_2
  4. Now we need to use the “Export to SDK” feature to create a hardware description file (.hdf) for the project. From the menu, select File->Export->Export Hardware.
  5. In the Export Hardware window, tick “Include bitstream” and choose “Local to Project” as the export location.connecting_ssd_to_fpga_running_petalinux_3

Launch Xilinx SDK

At this point it’s best to launch the Xilinx SDK from Vivado, because it will automatically setup our SDK workspace with a hardware platform based on our project’s hardware description file.

  1. From the menu, select File->Launch SDK.
  2. Specify both the exported location and workspace as “Local to Project”.connecting_ssd_to_fpga_running_petalinux_5
  3. The SDK should automatically create the hardware platform (design_1_wrapper_hw_platform_0) for you, and you should see it in the Project Explorer as seen in the image below – first image is for the KC705, and the second image is for the PicoZed.connecting_ssd_to_fpga_running_petalinux_13connecting_ssd_to_fpga_running_petalinux_6
  4. Now we want to create a template software application, so that we can simply insert code and run it. From the menu, select File->New->Application Project.connecting_ssd_to_fpga_running_petalinux_7
  5. In the New Project window, type “pcie_test” as the Project name and click Next. The right Processor for your hardware should already be selected. The image below shows ps7_cortexa9_0 for the PicoZed, but for the KC705, it will be microblaze_0.connecting_ssd_to_fpga_running_petalinux_8
  6. Select the “Hello World” template and click Finish.connecting_ssd_to_fpga_running_petalinux_9
  7. You should now see the software application “pcie_test” and the BSP “pcie_test_bsp” added to your workspace in the Project Explorer.connecting_ssd_to_fpga_running_petalinux_10
  8. Now we need to get the code to test our PCIe link. We will use an example from Xilinx which you can find in the Xilinx SDK installation folders at this location: C:\Xilinx\SDK\2015.4\data\embeddedsw\XilinxProcessorIPLib\drivers\axipcie_v3_0\examples\xaxipcie_rc_enumerate_example.c
  9. In Windows Explorer, browse to the above “examples” folder, right click on the source file and select “Copy”.
  10. Now return to Xilinx SDK and open the “pcie_test” tree to reveal the “src” folder. Then right-click on the “src” folder and select “Paste”. This will copy the source file into our application.connecting_ssd_to_fpga_running_petalinux_11
  11. Now select the “helloworld.c” file in the “src” folder and press Del to delete the file.
  12. SDK will automatically rebuild the software application.

Now we are ready to run the stand-alone application on the hardware.

 

Run the stand-alone application

  1. Power up the hardware:
    First switch ON the SSD – Then switch ON your FPGA platform
  2. Open Putty or a similar terminal program to receive the console output from the UART.
  3. Check your Device Manager to find the USB-UART and its COM port. The example below shows COM16. If you don’t find one, then ensure that you have a USB cable between the PC and the UART port of your FPGA board.zedboard_echo_server_9
  4. In Putty, open a new session using the COM port that you just located and the following settings:
    • Baud rate: 9600bps
    • Data: 8 bits
    • Parity: None
    • Stop bits: 1
  5. Now returning to the SDK, from the menu, select Xilinx Tools->Program FPGA.zedboard_echo_server_20
  6. In the Program FPGA window, we select the hardware platform to program. We have only one hardware platform, so click “Program”. The image below, taken for the KC705 design, shows that the MicroBlaze will be loaded with the bootloop program. If you are using the PicoZed, you will not be loading the processor with anything, so this line will be blank.connecting_ssd_to_fpga_running_petalinux_14
  7. The bitstream will be loaded onto the FPGA and we are ready to load the software application. Select the “pcie_test” folder in the Project Explorer, then from the menu, select Run->Run.connecting_ssd_to_fpga_running_petalinux_12
  8. In the Run As window, select “Launch on Hardware (GDB)” and click “OK”.connecting_ssd_to_fpga_running_petalinux_15
  9. The application will be loaded on the processor and it will be executed. The terminal window should display this output: connecting_ssd_to_fpga_running_petalinux_16

The console output shows that the PCIe “Link is up” and that it has enumerated the PCIe bridge and an end-point with Vendor ID 0x144D, which is Samsung’s PCI vendor ID (the manufacturer of the SSD).

 

Build PetaLinux

Now that we have validated our hardware, let’s get started using the PetaLinux SDK on our Linux machine.

  1. On your Linux machine, start a command terminal.
  2. Type source /<your-petalinux-install-dir>/settings.sh into the terminal and press Enter. Obviously you must insert the location of your PetaLinux installation.connecting_ssd_to_fpga_running_petalinux_100
  3. For consistency, let’s work from a directory called projects/fpga-drive-aximm-pcie in your home directory. Create that directory and then “cd” to it.connecting_ssd_to_fpga_running_petalinux_101
  4. Use a USB stick or another method to copy the entire Vivado project directory (should be kc705_aximm_pcie for the KC705, pz_7z030_aximm_pcie for the PicoZed) from your Windows machine onto your Linux machine. Place it into the directory we just created.
  5. Create a PetaLinux project using this command:
    • For KC705: petalinux-create --type project --template microblaze --name petalinux_prjconnecting_ssd_to_fpga_running_petalinux_102
    • For PicoZed: petalinux-create --type project --template zynq --name petalinux_prj
  6. Change to the “petalinux_prj” directory in the command terminal.
    Stay in the PetaLinux project folder from here on. It is important that all the following commands are run from the PetaLinux project folder that we just created.
    connecting_ssd_to_fpga_running_petalinux_103
  7. Import the Vivado generated hardware description into our PetaLinux project with the command:
    • For KC705: petalinux-config --get-hw-description ../kc705_aximm_pcie/kc705_aximm_pcie.sdk/
    • For PicoZed: petalinux-config --get-hw-description ../pz_7z030_aximm_pcie/pz_7z030_aximm_pcie.sdk/
  8. The Linux System Configuration will open, but we don’t have any changes to make here, so simply exit and save the configuration.connecting_ssd_to_fpga_running_petalinux_104connecting_ssd_to_fpga_running_petalinux_105
  9. Configure the Linux kernel with the command: petalinux-config -c kernel
  10. Now we use the kernel configuration menu to enable PCI support and enable the driver for NVM Express devices:
    • For KC705:
      • Enable: Bus options->PCI support
      • Enable: Bus options->PCI support->Message Signaled Interrupts (MSI and MSI-X)
      • Enable: Bus options->PCI support->Enable PCI resource re-allocation detectionconnecting_ssd_to_fpga_running_petalinux_106
      • Enable: Bus options->PCI support->PCI host controller drivers->Xilinx AXI PCIe host bridge supportconnecting_ssd_to_fpga_running_petalinux_107
      • Enable: Device Drivers->Block devices->NVM Express block deviceconnecting_ssd_to_fpga_running_petalinux_108
    • For PicoZed:
      • Check: Bus options->PCI support should already be enabled by default
      • Check: Bus options->PCI support->Message Signaled Interrupts (MSI and MSI-X) should already be enabled by default
      • Check: Bus options->PCI support->Enable PCI resource re-allocation detection should already be enabled by default
      • Check: Bus options->PCI support->PCI host controller drivers->Xilinx AXI PCIe host bridge support should already be enabled by default
      • Enable: Device Drivers->Block devices->NVM Express block deviceconnecting_ssd_to_fpga_running_petalinux_108
  11. To configure the Linux root file system, run the command: petalinux-config -c rootfs
  12. Configure the root file system to include some utilities we will need to setup the NVMe PCIe SSD:
    • Enable PCI utils (for lspci): Filesystem Packages->console/utils->pciutils->pciutilsconnecting_ssd_to_fpga_running_petalinux_109
    • Enable required packages for lsblk, fdisk, mkfs, blkid:
      • Filesystem Packages->base->util-linux->util-linux
      • Filesystem Packages->base->util-linux->util-linux-blkid
      • Filesystem Packages->base->util-linux->util-linux-fdiskconnecting_ssd_to_fpga_running_petalinux_111
      • Filesystem Packages->base->util-linux->util-linux-mkfs
      • Filesystem Packages->base->util-linux->util-linux-mountconnecting_ssd_to_fpga_running_petalinux_112
      • Filesystem Packages->base->e2fsprogs->e2fsprogs
      • Filesystem Packages->base->e2fsprogs->e2fsprogs-mke2fsconnecting_ssd_to_fpga_running_petalinux_110
  13. Build PetaLinux using the command: petalinux-build

PetaLinux will take a few minutes to build depending on your machine.

Boot PetaLinux over JTAG

There are many ways to boot PetaLinux on the hardware, but to avoid going through the details of setting up a flash or SD card boot, we will use the JTAG method for this tutorial.

  1. Power up your hardware:
    First switch ON the SSD – Then switch ON your FPGA platform
  2. Open a new session in Putty again, but this time, use a baud rate of 115200bps:
    • Baud rate: 115200bps
    • Data: 8 bits
    • Parity: None
    • Stop bits: 1
  3. Boot PetaLinux using these commands:
    • For KC705, we load the bitstream then the kernel:
      • petalinux-boot --jtag --fpga --bitstream ../kc705_aximm_pcie/kc705_aximm_pcie.runs/impl_1/design_1_wrapper.bitconnecting_ssd_to_fpga_running_petalinux_113
      • petalinux-boot --jtag --kernelconnecting_ssd_to_fpga_running_petalinux_114
    • For PicoZed, we package everything, then load it all:
      • petalinux-package --boot --fsbl ./images/linux/zynq_fsbl.elf --fpga ../pz_7z030_aximm_pcie/pz_7z030_aximm_pcie.runs/impl_1/design_1_wrapper.bit --uboot --force
      • petalinux-package --prebuilt --fpga ../pz_7z030_aximm_pcie/pz_7z030_aximm_pcie.runs/impl_1/design_1_wrapper.bit
      • petalinux-boot --jtag --prebuilt 3 --fpga --bitstream ../pz_7z030_aximm_pcie/pz_7z030_aximm_pcie.runs/impl_1/design_1_wrapper.bit
  4. It will take several minutes before the kernel has been transferred via JTAG. Wait for the command line to return, then it can take another 10-20 seconds before you see any output on the Putty terminal.
  5. PetaLinux will boot and you should see the boot log on the Putty terminal window.connecting_ssd_to_fpga_running_petalinux_17

If you want to see the complete boot logs, here they are:

 

How to setup the NVMe SSD in PetaLinux

  1. Log into PetaLinux using the username root and the password root.
  2. Check that the SSD has been enumerated using: lspci.connecting_ssd_to_fpga_running_petalinux_18
  3. Check that the SSD has been recognized as a block device using: lsblk.connecting_ssd_to_fpga_running_petalinux_19
  4. Create a partition on the SSD using: fdisk /dev/nvme0n1.connecting_ssd_to_fpga_running_petalinux_20
    • Type “n” to create a new partition
    • Then type “p”, then “1” to create a new primary partition
    • Use the defaults for the sector numbers
    • Then type “w” to write the data to the diskconnecting_ssd_to_fpga_running_petalinux_21
  5. Run lsblk again to get the name of the partition created. As you see in the image below, it is nvme0n1p1.connecting_ssd_to_fpga_running_petalinux_22
  6. Create a file system on the new partition using: mkfs -t ext2 /dev/nvme0n1p1. This will take a few minutes (you can confirm the result with blkid, as sketched after this list).connecting_ssd_to_fpga_running_petalinux_115
  7. Make a directory to mount the file system to using: mkdir /media/nvme.
  8. Mount the SSD to that directory: mount /dev/nvme0n1p1 /media/nvme.
  9. Change to the /media/nvme directory.connecting_ssd_to_fpga_running_petalinux_116
  10. Create a file called test.txt using vi test.txt.
  11. In VI, press the capital letter “I” (as in India) to start adding text to the file.connecting_ssd_to_fpga_running_petalinux_117
  12. Now type The Matrix has you... into the file, press Esc and then type “:x” (colon, then the letter x) to save the file and quit.
  13. Now use ls to see that the file is there.connecting_ssd_to_fpga_running_petalinux_118
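
Since we enabled util-linux-blkid in the root file system earlier, you can also confirm the new file system at this point and read back its UUID. This is just a sanity check, assuming the partition name nvme0n1p1 reported by lsblk above:

  blkid /dev/nvme0n1p1     # should report TYPE="ext2" along with the partition UUID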

 

Reboot

Let’s shut it all down and reboot so that we can check that our file is still there after powering down.

  1. Use poweroff to shutdown Linux.connecting_ssd_to_fpga_running_petalinux_120
  2. Power down the hardware.
  3. Run through the steps to Boot PetaLinux over JTAG, until you have logged in again as root.
  4. Create a directory to mount the SSD to again: mkdir /media/nvme.
  5. Mount the SSD to that directory: mount /dev/nvme0n1p1 /media/nvme.
  6. Change to the /media/nvme directory.
  7. Check that the file is still there using: ls.
  8. Display the file using: cat test.txt.connecting_ssd_to_fpga_running_petalinux_119

 

What now?

Here are some interesting things you can explore which will be topics for future tutorials:

  • Using hdparm to measure the read/write speeds of the SSD (a quick example is sketched below)
  • Creating a PetaLinux root file system on the SSD
  • Booting PetaLinux from the SSD
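
As a quick taste of the first item, here is a minimal sketch of a speed check you could run from the PetaLinux command line. It assumes the SSD enumerated as /dev/nvme0n1 and is still mounted at /media/nvme as in the steps above. Note that hdparm is not enabled in the PetaLinux root file system by default (you would enable it under Filesystem Packages when running petalinux-config -c rootfs), and the options accepted by the BusyBox version of dd can vary between builds:

  # Raw read timing of the SSD (requires hdparm in the rootfs)
  hdparm -t /dev/nvme0n1                # buffered (disk) read speed
  hdparm -T /dev/nvme0n1                # cached read speed, for comparison

  # Rough write test through the file system: write 256MB and flush it to the disk
  dd if=/dev/zero of=/media/nvme/speedtest.bin bs=1M count=256 conv=fsync

  # Rough read test of the same file
  dd if=/media/nvme/speedtest.bin of=/dev/null bs=1M

  # Clean up
  rm /media/nvme/speedtest.bin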

 

Source code Git repository

The sources for re-generating this project automatically can be found on Github here: FPGA Drive PCIe Root Complex design

 

Other useful resources

Here are some other useful resources for creating PCI Express designs:

If you have any questions about this tutorial, or if you run into problems, please leave me a comment below.


Multi-port Ethernet in PetaLinux


Many FPGA-based embedded designs require connections to multiple Ethernet devices such as IP cameras, and control of those devices under an operating system, typically Linux. The development of such applications can be accelerated through the use of development boards such as the ZedBoard and the Ethernet FMC. In this tutorial, we will build a custom version of PetaLinux for the ZedBoard and bring up 4 extra Ethernet ports, made available by the Ethernet FMC. The Vivado hardware design used in this tutorial will be very similar to the one we created in a previous tutorial titled: Using AXI Ethernet Subsystem and GMII-to-RGMII in a Multi-port Ethernet design. You don’t need to have followed that tutorial to do this one, as the Vivado project can be built from the sources on Github.

Requirements

To complete this tutorial you will need the following:

Tool Setup for Windows users

PetaLinux SDK 2015.4 only runs on Linux, so Windows users (like me) need two machines to follow this tutorial. You can either have two physical machines, which is how I work, or you can have one Windows machine and one Linux virtual machine. In this tutorial, I will assume that you have two physical machines, one running Windows and the other running Linux. My personal setup uses Windows 7 and Ubuntu 14.04 LTS on two separate machines.

If you are building your Linux setup for the first time, here are the supported OSes according to the PetaLinux SDK Installation guide:

  • RHEL 5 (32-bit or 64-bit)
  • RHEL 6 (32-bit or 64-bit)
  • SUSE Enterprise 11 (32-bit or 64-bit)

Note: I had problems installing PetaLinux SDK 2015.4 on 32-bit Ubuntu, as did others, so I use 64-bit Ubuntu and I haven’t had any problems with my setup.

Regenerate the Vivado project

The details of the Vivado design will not be covered by this tutorial as they have already been covered in a previous tutorial, except that in this tutorial we will be using the AXI Ethernet Subsystem IP for all 4 ports. Follow these instructions to regenerate the Vivado project from scripts. Please note that the Git repository is regularly updated for the latest version of Vivado, so you must download the last “commit” that corresponds to the version of Vivado that you are using.

  1. Download the sources from Github here: https://github.com/fpgadeveloper/zedboard-qgige-axieth
  2. Depending on your operating system:
    • If you are using a Windows machine, open Windows Explorer, browse to the “Vivado” folder within the sources you just downloaded. Double-click on the build.bat file to run the batch file.
    • If you are using a Linux machine, run Vivado and then select Window->Tcl Console from the welcome screen. In the Tcl console, use the “cd” command to navigate to the “Vivado” folder within the sources you just downloaded. Then type source build.tcl to run the build script.
  3. Once the script has finished running, the Vivado project should be regenerated and located in the “Vivado” folder. Run Vivado and open the newly generated project.

 

Generate the bitstream

The first thing we’ll need to do is to generate the bitstream from the Vivado project.

  1. Open the project in Vivado.
  2. From the Flow Navigator, click “Generate Bitstream”.connecting_ssd_to_fpga_running_petalinux_1
  3. Depending on your machine, it will take several minutes to perform synthesis and implementation. In the end, you should see the following message. Just select “View Reports” and click OK.connecting_ssd_to_fpga_running_petalinux_2
  4. Now we need to use the “Export to SDK” feature to create a hardware description file (.hdf) for the project. From the menu, select File->Export->Export Hardware.
  5. In the Export Hardware window, tick “Include bitstream” and choose “Local to Project” as the export location.connecting_ssd_to_fpga_running_petalinux_3

 

Build PetaLinux for our design

Now it’s time to move to our Linux machine and use the PetaLinux SDK to build PetaLinux for our hardware design.

  1. On your Linux machine, start a command terminal.
  2. Type source /<your-petalinux-install-dir>/settings.sh into the terminal and press Enter. Obviously you must insert the location of your PetaLinux installation.connecting_ssd_to_fpga_running_petalinux_100
  3. For consistency, let’s work from a directory called projects/zedboard-multiport-ethernet in your home directory. Create that directory and then “cd” to it.zedboard-multiport-ethernet-1
  4. Use a USB stick or another method to copy the entire Vivado project directory (should be zedboard_qgige_axieth) from your Windows machine onto your Linux machine. Place it into the directory we just created.
  5. Create a PetaLinux project using this command: petalinux-create --type project --template zynq --name petalinux_prjzedboard-multiport-ethernet-2
  6. Change to the “petalinux_prj” directory in the command terminal.
    Stay in the PetaLinux project folder from here on. It is important that all the following commands are run from the PetaLinux project folder that we just created.
  7. Import the Vivado generated hardware description into our PetaLinux project with the command: petalinux-config --get-hw-description ../zedboard_qgige_axieth/zedboard_qgige_axieth.sdk/
  8. The Linux System Configuration will open, but we don’t have any changes to make here, so simply exit and save the configuration.zedboard-multiport-ethernet-3
  9. Configure the Linux kernel with the command: petalinux-config -c kernelzedboard-multiport-ethernet-4
  10. In the Kernel configuration, we need to disable the Xilinx AXI DMA driver, as it conflicts with the AXI Ethernet driver. Disable: Device Drivers->DMA Engine support->Xilinx DMA Engines->Xilinx AXI DMA Engine, then exit and save the configuration.zedboard-multiport-ethernet-5
  11. We don’t have anything to change in the Linux root file system, but if you want to make your own changes, run the command: petalinux-config -c rootfszedboard-multiport-ethernet-6
  12. The device tree that was generated by PetaLinux SDK will not contain the MAC addresses, nor the addresses of the Ethernet PHYs, so we have to add this information manually. Open the system-top.dts file in the petalinux_prj/subsystems/linux/configs/device-tree directory.zedboard-multiport-ethernet-7zedboard-multiport-ethernet-8
  13. Add the missing MAC address and Ethernet PHY address entries for the four ports to the end of the system-top.dts file and then save it.
  14. Build PetaLinux using the command: petalinux-build

PetaLinux will take a few minutes to build depending on your machine.

 

Boot PetaLinux from SD card

Now we will generate the boot files for an SD card, copy those files to the SD card and then boot PetaLinux on the ZedBoard.

  1. Generate the boot files using these commands:
    • petalinux-package --boot --fsbl ./images/linux/zynq_fsbl.elf --fpga ../zedboard_qgige_axieth/zedboard_qgige_axieth.runs/impl_1/design_1_wrapper.bit --uboot --force
    • petalinux-package --prebuilt --fpga ../zedboard_qgige_axieth/zedboard_qgige_axieth.runs/impl_1/design_1_wrapper.bitzedboard-multiport-ethernet-9
  2. The boot files can now be found in the petalinux_prj/images/linux folder. Copy the BOOT.BIN and image.ub files into the root of your SD card.zedboard-multiport-ethernet-10
  3. Plug the SD card into your ZedBoard.
  4. Make sure that your ZedBoard is configured to boot from the SD card by setting jumpers JP7, JP8, JP9, JP10 and JP11 to 00110 respectively.
  5. Make sure that a USB cable connects the ZedBoard USB-UART to your PC.
  6. Turn ON the ZedBoard.
  7. Find the COM port associated with your ZedBoard USB-UART by going into Device Manager.zedboard-multiport-ethernet-12
  8. Open a new session in Putty using these settings and the COM port you just identified:
    • Baud rate: 115200bps
    • Data: 8 bits
    • Parity: None
    • Stop bits: 1
  9. Watch PetaLinux booting up in the Putty console and wait for the login. If you don’t see anything, you probably missed the boot sequence – just press ENTER and you should see the login prompt.zedboard-multiport-ethernet-11

If you want to see the complete boot log, click here.

 

Configure the Ethernet ports

Our Vivado design has 5 Ethernet ports: the on-board port of the ZedBoard plus the 4 ports of the Ethernet FMC. In PetaLinux, these ports will be assigned to eth0 (on-board port), and eth1-eth4 (Ethernet FMC ports 0-3). Using ifconfig, we will configure the Ethernet FMC ports with fixed IP addresses. We will then connect one of them to a PC and use ping to test it.
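
Once logged in (step 1 below), and before assigning any addresses, it can be worth checking that all five interfaces are present and that the MAC addresses we added to the device tree were picked up. This is only a sanity check, using the standard tools in the default root file system and the standard /sys/class/net layout:

  ifconfig -a                          # should list eth0 through eth4
  cat /sys/class/net/eth1/address      # MAC address assigned to Ethernet FMC port 0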

  1. First login to PetaLinux using the username “root” and the password “root”.zedboard-multiport-ethernet-13
  2. Configure the Ethernet ports using the following commands:
    • ifconfig eth1 192.168.1.11 netmask 255.255.255.0 up
    • ifconfig eth2 192.168.1.12 netmask 255.255.255.0 up
    • ifconfig eth3 192.168.1.13 netmask 255.255.255.0 up
    • ifconfig eth4 192.168.1.14 netmask 255.255.255.0 upzedboard-multiport-ethernet-16
  3. When you enter each of the above commands, you should get an output that looks like this:
net eth1: Promiscuous mode disabled.
net eth1: Promiscuous mode disabled.
xilinx_axienet 41000000.ethernet eth1: Link is Down

 

Test the Ethernet ports

To test the Ethernet ports, we’ll need a PC with its own gigabit Ethernet port. Here I’m using my laptop, which runs Windows 10.

  1. On the test PC, configure the Ethernet port to use a fixed IP address of 192.168.1.10.zedboard-multiport-ethernet-17
  2. Use an Ethernet cable to connect port 0 of the Ethernet FMC to the test PC. You should see the following message in Putty:
xilinx_axienet 41000000.ethernet eth1: Link is Up - 1Gbps/Full - flow control off
  3. First let’s try pinging from the PC to the ZedBoard. Open a command window on the test PC and type the command: ping 192.168.1.11zedboard-multiport-ethernet-19
  4. Now to ping in the reverse direction (ZedBoard to PC), you will probably need to disable the public firewall on your PC.zedboard-multiport-ethernet-18
  5. On the Linux command line in Putty, type the command: ping -I eth1 192.168.1.10zedboard-multiport-ethernet-20

You will have to press Ctrl-C to stop pinging. Notice that we have to use the argument “-I eth1” from the ZedBoard as there are multiple ports we could possibly ping from. We can do the same ping test to verify the other ports.
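
If you prefer to script the bring-up and test of all four Ethernet FMC ports rather than typing each command, something along these lines works at the PetaLinux command line. This is only a sketch: it uses the same 192.168.1.x addressing as above and assumes you move the cable from the test PC (192.168.1.10) to each port as it is pinged:

  # Bring up eth1 to eth4 with the fixed addresses used above
  for i in 1 2 3 4; do
      ifconfig eth$i 192.168.1.1$i netmask 255.255.255.0 up
  done

  # Ping the test PC from one port at a time (move the cable to the port under test)
  ping -c 4 -I eth1 192.168.1.10
  ping -c 4 -I eth2 192.168.1.10
  ping -c 4 -I eth3 192.168.1.10
  ping -c 4 -I eth4 192.168.1.10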

 

Source code Git repository

The sources for re-generating this project automatically can be found on Github here: ZedBoard Multi-port Ethernet design

 

Boot files for the ZedBoard

If you want to try out my boot files for the ZedBoard, download them here:

 

If you run into problems going through these instructions, just write me a comment below.

Avnet releases PicoZed FMC Carrier Card V2


For those of you who were interested in running my recent tutorials about connecting a PCIe SSD to the Zynq (Zynq PCI Express Root Complex design in Vivado and Connecting an SSD to an FPGA running PetaLinux), you’ll be happy to know that Avnet has released the PicoZed FMC Carrier Card V2, which is the platform on which those tutorials were based.

For more information about the new PicoZed carrier, check out their video – and keep an eye out at 1m:32s where the Ethernet FMC gets a mention!

FMC for Connecting an SSD to an FPGA


Here’s a first look at the FMC version of the FPGA Drive product, featured with the Samsung V-NAND 950 Pro SSD. The FMC version can carry M-keyed M.2 modules for PCI Express and is designed to support up to 4 lanes. It has an HPC FMC connector which can be used on an LPC FMC carrier for a single-lane connection to the SSD, or on an HPC FMC carrier to exploit the maximum throughput of a 4-lane connection. The FMC also has a 100MHz clock generator for PCIe applications, which provides a reference clock to the SSD and to the FPGA.

fpga-drive-fmc-3d-4

fpga-drive-fmc-dimensions

I expect the product to be available for purchase in 4-6 weeks’ time. Example designs are already available on Github here.

If you’re interested in the product or you would like more information, please contact me.

Breakout the Zynq Ultrascale+ GEMs with Ethernet FMC


Did you know that the Zynq UltraScale+ has 4 built-in Gigabit Ethernet MACs (GEMs)? That makes it awesome for Ethernet applications, which is why I’ve just developed and shared an example design for the Zynq UltraScale+ ZCU102 Evaluation board, armed with an Ethernet FMC to break out those handy GEMs. The ZCU102 board has two FMC connectors, both high-pin-count (HPC), so I’ve created one basic design with two sets of constraints to choose from, depending on which FMC connector you want to use.

These scripts will build the Vivado project and block diagram for you: Zynq GEM Ethernet FMC example design

The scripts rely on the ZCU102 board definition files which don’t come built into Vivado 2016.1. I’m guessing that they will in the near future, but for now, to be able to build the project you’ll need to request access to the ZCU102 HeadStart Lounge and properly install the board definition files.
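
If you haven’t installed board definition files before, the idea is simply to copy the board folder you receive into Vivado’s board_files directory so that the board appears in the “Boards” tab of the new project wizard. As a rough illustration only, the paths below assume a downloaded zcu102 folder and a default Vivado 2016.1 installation location:

  # Copy the ZCU102 board definition files into the Vivado installation (example paths)
  cp -r ~/Downloads/zcu102 /opt/Xilinx/Vivado/2016.1/data/boards/board_files/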

Want to know more about the Zynq UltraScale+ MPSoC? Check out the video from Xilinx below. By the way, the image above comes from 0:58 of the video.

 

Bye bye Platform Cable USB II, Hello JTAG HS3


Now that I think about it, I’ve been using my Xilinx Platform Cable USB II for 10 years now!!! That’s a terrific run in my opinion. I got it in a kit for the Virtex-5 ML505 board in 2006 and I would have kept using it if I hadn’t started getting these strange error messages recently. So on a recommendation, I got myself a JTAG HS3 from Digilent and it is just ridiculously better. As you can see from the photo, it’s much smaller, although some people might see that as a downside because it’s easier to lose… I don’t know… for me, the real advantage is that it is so much faster than the Platform Cable. I like tools that don’t make me wait, because my time is important and I have no patience for that moment when I’m waiting for the bitstream to download and I need to know whether my design changes are going to work or not. This tool rocks!
