Module common
This document contains technical documentation for the common module.
addr_pkg.vhd
View source code on gitlab.com.
Collection of types/functions for working with address decode/matching.
attribute_pkg.vhd
View source code on gitlab.com.
Commonly used attributes for Xilinx Vivado, with description and their valid values. Information gathered from documents UG901 and UG912.
axi_stream_protocol_checker.vhd
View source code on gitlab.com.

A wrapper around the VUnit AXI-Stream protocol checker. It has a simpler interface, and can hence be instantiated in synthesizable code with a generate guard:
    -- The generate label is illustrative; VHDL requires one on a generate statement.
    axi_stream_protocol_checker_gen : if in_simulation generate
      axi_stream_protocol_checker_inst : entity common.axi_stream_protocol_checker
        generic map (
          ...
        );
    end generate;
Without the generate guard, synthesis will fail. The file is placed in the “sim” folder, so it will not be included in synthesis projects by default when using tsfpga.
clean_packet_dropper.vhd
View source code on gitlab.com.

An incoming packet will be dropped cleanly if drop is asserted for at least one clock cycle during the active packet.
Once drop has been asserted during an active packet, this entity will
- Not pass anything of the current input packet on to the result side, including anything that was consumed before drop was asserted. This means that only whole, non-corrupted, packets will be available on the result side.
- Keep input_ready high until the whole packet has been consumed, so the upstream on the input side is not stalled.
Note
The fifo.vhd instance in this module is in packet mode, meaning that a whole packet has to be written to the FIFO before any data is passed on to the result side. Hence the fifo_depth generic has to be chosen so that it can hold the maximum possible packet length from the input side.
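The drop behavior described above can be illustrated with a small untimed model. The Python sketch below is not the VHDL implementation; the function name and the (data, last, drop) tuple layout are assumptions made for illustration. It buffers each packet in full, mirroring the packet-mode FIFO, and releases it only if drop was never asserted during the packet.

```python
def clean_packet_dropper(beats):
    """Untimed model. Each beat is a (data, last, drop) tuple (hypothetical layout)."""
    result_packets = []
    current_packet = []
    drop_seen = False

    for data, last, drop in beats:
        # Every beat is consumed, so the upstream is never stalled.
        current_packet.append(data)
        drop_seen = drop_seen or drop

        if last:
            # A packet is passed on only when it is complete and was never dropped.
            if not drop_seen:
                result_packets.append(current_packet)
            current_packet = []
            drop_seen = False

    return result_packets


beats = [
    (1, False, False), (2, True, False),  # Clean two-beat packet.
    (3, False, True), (4, True, False),   # Dropped on its first beat.
    (5, True, False),                     # Clean single-beat packet.
]
print(clean_packet_dropper(beats))  # [[1, 2], [5]]
```

In this model the longest list ever held in current_packet corresponds to the packet length that fifo_depth must be able to accommodate.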
clock_counter.vhd
View source code on gitlab.com.

Measure the switching rate of an unknown clock by using a free-running reference clock of known frequency.
Note
This entity instantiates a resync_counter.vhd block. See documentation of that entity for constraining details.
The frequency of target_clock is given by
target_tick_count * reference_clock_frequency / 2 ** resolution_bits
The target_tick_count value is updated every 2 ** resolution_bits cycles. It is invalid for 2 * 2 ** resolution_bits cycles in the beginning as reference_clock starts switching, but after that it is always valid.
For the calculation to work, target_clock must be no more than 2 ** (max_relation_bits - 1) times faster than reference_clock.
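As a worked example of the formula and the speed limit above, the following Python sketch evaluates both for one configuration. The function names and the example numbers (a 100 MHz reference clock and a tick count of 2560) are assumptions chosen for illustration, not values from the module.

```python
def target_clock_frequency_hz(target_tick_count, reference_clock_frequency_hz, resolution_bits):
    """Frequency of target_clock according to the formula above."""
    return target_tick_count * reference_clock_frequency_hz / 2**resolution_bits


def max_target_clock_frequency_hz(reference_clock_frequency_hz, max_relation_bits):
    """Highest target_clock frequency for which the calculation works."""
    return reference_clock_frequency_hz * 2 ** (max_relation_bits - 1)


# With resolution_bits = 10 the tick count is updated every 1024 cycles.
print(target_clock_frequency_hz(2560, 100_000_000, 10))  # 250000000.0
print(max_target_clock_frequency_hz(100_000_000, 4))  # 800000000
```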
Resource utilization
This entity has netlist builds set up with automatic size checkers in module_common.py.
The following table lists the resource utilization for the entity, depending on generic configuration.

Generics | Total LUTs | SRLs | FFs |
---|---|---|---|
resolution_bits = 24, max_relation_bits = 6 | 84 | 5 | 185 |
resolution_bits = 10, max_relation_bits = 4 | 38 | 2 | 86 |
common_pkg.vhd
View source code on gitlab.com.
Package with common features that do not fit in anywhere else, and are not significant enough to warrant their own package.
debounce.vhd
View source code on gitlab.com.

Simple debounce mechanism to be used with e.g. the signal from a button or dip switch. It eliminates noise and metastability by requiring the input to have a stable value for a specified number of clock cycles before propagating the value.
Note
This entity instantiates a resync_level.vhd block (async_reg chain) to make sure the input is not metastable. The resync_level.vhd has a scoped constraint file that must be used.
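The stable-value requirement of the debounce mechanism can be sketched as a cycle-by-cycle model. The Python code below is an assumption-laden illustration, not the VHDL implementation: the function name, the stable_count parameter name and the exact cycle on which the output changes are all hypothetical.

```python
def debounce(samples, stable_count, initial=0):
    """Return the debounced output for each input sample (one sample per clock cycle)."""
    output = initial
    candidate = initial
    run_length = 0
    result = []

    for sample in samples:
        if sample == candidate:
            run_length += 1
        else:
            # Input changed, so start counting stable cycles again.
            candidate = sample
            run_length = 1

        # Propagate the value only once it has been stable long enough.
        if run_length >= stable_count:
            output = candidate
        result.append(output)

    return result


noisy_input = [0, 1, 0, 1, 1, 1, 1, 0, 1, 1]
print(debounce(noisy_input, stable_count=3))  # [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
```

The short glitches at the start and the single-cycle dip near the end never reach the output, since neither is stable for three cycles.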
handshake_mux.vhd
View source code on gitlab.com.

Multiplex between many AXI-Stream-like inputs towards one output bus.
Will lock onto one input and let its data through until a packet has passed, as indicated by the last signal.
The implementation is simple, which comes with a few limitations:
Warning
If there are holes in an input packet stream after valid has been asserted, this multiplexer will be unnecessarily stalled even if another input has data available. It is up to the user to make sure that this does not occur, using e.g. a fifo.vhd in packet mode, or by calculating that the system throughput is still sufficient.
The arbitration is done in the most resource-efficient manner possible, which means that one input can starve out the others if it continuously sends data.
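The packet locking and the starvation risk can be illustrated with an untimed model. The Python sketch below assumes, purely for illustration, that the lowest-numbered input with data available wins the arbitration; the source does not state the actual selection order, and the function name and (data, last) beat layout are hypothetical.

```python
from collections import deque


def handshake_mux(input_streams):
    """Untimed model. Each stream is a list of (data, last) beats that are always valid."""
    queues = [deque(stream) for stream in input_streams]
    output = []

    while any(queues):
        # Assumption for illustration: the lowest-numbered input with data wins.
        selected = next(index for index, queue in enumerate(queues) if queue)

        # Stay locked on the selected input until its last beat has passed.
        while queues[selected]:
            data, last = queues[selected].popleft()
            output.append((selected, data))
            if last:
                break

    return output


streams = [
    [("a0", False), ("a1", True), ("a2", True)],  # Input 0: two packets back to back.
    [("b0", True)],                               # Input 1: one single-beat packet.
]
print(handshake_mux(streams))
```

In this model input 1 is served only after input 0 has run out of data, which is the starvation scenario the warning describes.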
handshake_pipeline.vhd
View source code on gitlab.com.

Handshake pipeline. Is used to ease the timing of a streaming data interface by inserting register stages on the data and/or control signals.
There are many modes available, with different characteristics, that are enabled with different combinations of full_throughput, pipeline_control_signals and pipeline_data_signals. See the descriptions within the code for more details about throughput and fanout.
Resource utilization
This entity has netlist builds set up with automatic size checkers in module_common.py.
The following table lists the resource utilization for the entity, depending on generic configuration.

Generics | Total LUTs | FFs | Maximum logic level |
---|---|---|---|
data_width = 32, full_throughput = True, pipeline_control_signals = True, pipeline_data_signals = True | 41 | 78 | 2 |
data_width = 32, full_throughput = True, pipeline_control_signals = False, pipeline_data_signals = True | 1 | 38 | 2 |
data_width = 32, full_throughput = True, pipeline_control_signals = False, pipeline_data_signals = False | 0 | 0 | 0 |
data_width = 32, full_throughput = False, pipeline_control_signals = True, pipeline_data_signals = True | 1 | 39 | 2 |
data_width = 32, full_throughput = False, pipeline_control_signals = True, pipeline_data_signals = False | 2 | 3 | 2 |
data_width = 32, full_throughput = False, pipeline_control_signals = False, pipeline_data_signals = True | 2 | 38 | 2 |
data_width = 32, full_throughput = False, pipeline_control_signals = False, pipeline_data_signals = False | 0 | 0 | 0 |
handshake_splitter.vhd
View source code on gitlab.com.

Combinatorially split an AXI-Stream-like handshaking interface, for cases where many slaves are to receive the data. Maintains full throughput and is AXI-Stream compliant in its handling of the handshake signals (valid does not wait for ready, valid does not fall unless a transaction has occurred).
This entity has no pipelining of the handshake signals, but instead connects them combinatorially. This increases the logic depth for handshake signals where this entity is used. If timing issues occur (on the input or one of the outputs) a handshake_pipeline.vhd instance can be used.
Resource utilization
This entity has netlist builds set up with automatic size checkers in module_common.py.
The following table lists the resource utilization for the entity, depending on generic configuration.

Generics | Total LUTs | FFs |
---|---|---|
num_interfaces = 2 | 4 | 2 |
num_interfaces = 4 | 9 | 4 |
keep_remover.vhd
View source code on gitlab.com.

This entity removes strobe’d out lanes from the input, resulting in an output stream where all lanes are always strobed (except for the last beat, potentially). The strobe on input can be considered as the TKEEP signal in AXI-Stream terminology, and the output strobe would be TKEEP/TSTRB.
The entity works by continuously filling up a data buffer with data from the input. Only the lanes that are strobed will be saved to the buffer. Note that input words may have all their lanes strobed out (except for the last beat, see below).
When enough lanes are saved to fill a whole word, data is passed to the output by asserting output_valid. When input_last is asserted for an input word, an output word will be sent out, with output_last asserted, even if a whole strobed word is not currently filled in the buffer.
The strobe unit data width is configurable via a generic. Most of the time it would be eight, i.e. a byte strobe. But in some cases the strobe represents a wider quantum, in which case the generic can be increased. Increasing the generic will drastically decrease the resource utilization, since the strobe unit is the “atom” of data that is handled internally.
The handling of input_last presents a corner case. Let's assume that data_width is 16 and strobe_unit_width is 8. Furthermore, there is one atom of data available in the buffer, and the input stream has both lanes strobed. In this case, one input word shall result in two output words. The first output word comes from a whole word being filled in the buffer. The second word comes from a half-filled word in the buffer, but with input_last being asserted.
This is solved by having a small state machine that pads input data with an extra word when this corner case arises. The padding stage makes it possible to have a very simple data buffer stage, with low resource utilization.
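The buffering and the input_last corner case can be sketched with an untimed model. The Python code below is an illustration under assumptions, not the VHDL implementation: the function name and the (atoms, strobe, last) word layout, with lane 0 first, are hypothetical, and padded lanes are shown as None.

```python
def keep_remover(words, data_width, strobe_unit_width):
    """Untimed model. Each word is (atoms, strobe, last), with lane 0 first."""
    lanes_per_word = data_width // strobe_unit_width
    buffer = []
    output = []

    for atoms, strobe, last in words:
        # Only the lanes that are strobed are saved to the buffer.
        buffer += [atom for atom, keep in zip(atoms, strobe) if keep]

        if len(buffer) >= lanes_per_word:
            # A whole word is available, send it with all lanes strobed.
            word, buffer = buffer[:lanes_per_word], buffer[lanes_per_word:]
            output.append((word, [True] * lanes_per_word, last and not buffer))

        if last and buffer:
            # Packet ends on a partially filled word, pad the unused lanes.
            padding = lanes_per_word - len(buffer)
            strobe_out = [True] * len(buffer) + [False] * padding
            output.append((buffer + [None] * padding, strobe_out, True))
            buffer = []

    return output


words = [
    (["a", "x"], [True, False], False),  # Leaves one atom in the buffer.
    (["b", "c"], [True, True], True),    # Corner case: results in two output words.
]
for output_word in keep_remover(words, data_width=16, strobe_unit_width=8):
    print(output_word)
```

The second input word produces both a full output word and a half-filled word with last asserted, which is the situation the padding state machine handles.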
Throughput
The entity achieves full throughput, except for the corner case mentioned above, where it might stall one cycle on the input.
Limitations
- input_last may not be asserted on an input word that has all lanes strobed out.
- There may never be a ‘1’ above a ‘0’ in the input strobe. E.g. “0111” is allowed, but “1100” is not.
Resource utilization
This entity has netlist builds set up with automatic size checkers in module_common.py.
The following table lists the resource utilization for the entity, depending on generic configuration.

Generics | Total LUTs | FFs | Maximum logic level | DSP Blocks |
---|---|---|---|---|
data_width = 32, strobe_unit_width = 16 | 88 | 79 | 3 | 0 |
data_width = 64, strobe_unit_width = 8 | 410 | 175 | 6 | 0 |
data_width = 128, strobe_unit_width = 32 | 414 | 282 | 5 | 0 |
periodic_pulser.vhd
View source code on gitlab.com.

Outputs a one-cycle pulse after a generic number of assertions of count_enable.
Shift registers are used as far as possible to create the pulse. This makes the implementation resource efficient on devices with cheap shift registers (such as SRLs in Xilinx devices). In the worst case a single counter is created.
The period is broken down into factors that are represented using shift registers, with the shift register length being the factor value. By rotating the shift register on each count_enable, a fixed period is created. When possible, multiple shift registers are AND-gated to create a longer period. For example a period of 30 can be achieved by gating two registers of length 10 and 3. This method only works if the lengths are mutual primes (i.e. the greatest common divisor is 1).
If the remaining factor is not 1 after the shift registers have been added, a new instance of this module is added through recursion.
If period cannot be factorized into one or more shift registers, recursion ends with either a simple counter or a longer shift register (depending on the size of the factor).
Example
Let’s say that the maximum shift register length is 16. A period of 510 = 10 * 3 * 17 can then be achieved using two shift registers of length 10 and 3, and then instantiating a new periodic_pulser.vhd instance:
    [0][0][0][0][0][0][0][0][0][1]
                                  \
                                   [AND] -> pulse -> [periodic_pulser with period 17]
                                  /
                         [0][0][1]
The next stage will create a counter, because 17 is a prime larger than the maximum shift register length.
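The AND-gating of rotating shift registers, and the reason the lengths must be mutual primes, can be checked with a short simulation. The Python sketch below is a behavioral illustration with hypothetical function names; it models only the resulting period, not the phase of the first pulse or the recursion in the VHDL implementation.

```python
from math import gcd


def pulse_train(lengths, num_enables):
    """AND-gate the outputs of rotating one-hot shift registers (behavioral model)."""
    # Each register holds a single '1', as in the diagram above.
    registers = [[0] * (length - 1) + [1] for length in lengths]
    pulses = []

    for _ in range(num_enables):
        pulses.append(all(register[-1] for register in registers))
        # Rotate every register one step per count_enable.
        registers = [[register[-1]] + register[:-1] for register in registers]

    return pulses


def measured_period(lengths, num_enables=200):
    """Distance between the first two pulses of the gated output."""
    pulse_indexes = [
        index for index, pulse in enumerate(pulse_train(lengths, num_enables)) if pulse
    ]
    return pulse_indexes[1] - pulse_indexes[0]


print(measured_period([10, 3]), gcd(10, 3))  # 30 1
print(measured_period([10, 4]), gcd(10, 4))  # 20 2
```

Lengths 10 and 3 give the full period of 30, while lengths 10 and 4 share a common factor and only reach a period of 20 rather than 40.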
Resource utilization
This entity has netlist builds set up with automatic size checkers in module_common.py.
The following table lists the resource utilization for the entity, depending on generic configuration.

Generics | Total LUTs | SRLs | FFs |
---|---|---|---|
period = 33, shift_register_length = 33 | 2 | 1 | 1 |
period = 33, shift_register_length = 1 | 6 | 0 | 6 |
period = 37, shift_register_length = 33 | 3 | 2 | 1 |
period = 37, shift_register_length = 1 | 6 | 0 | 6 |
period = 100, shift_register_length = 33 | 3 | 2 | 2 |
period = 100, shift_register_length = 1 | 8 | 0 | 7 |
period = 125, shift_register_length = 33 | 4 | 2 | 2 |
period = 125, shift_register_length = 1 | 7 | 0 | 7 |
period = 127, shift_register_length = 33 | 5 | 4 | 1 |
period = 127, shift_register_length = 1 | 8 | 0 | 7 |
period = 4625, shift_register_length = 33 | 7 | 4 | 3 |
period = 4625, shift_register_length = 1 | 2 | 0 | 13 |
period = 311000000, shift_register_length = 33 | 15 | 4 | 15 |
period = 311000000, shift_register_length = 1 | 2 | 0 | 29 |
strobe_on_last.vhd
View source code on gitlab.com.

The goal of this entity is to process an AXI-Stream so that packets where last is asserted on a word that is completely strobed out are modified so that last is instead asserted on the last word which does have a strobe.
As a consequence of this, all words in the stream that are completely strobed out are dropped by this entity.
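The effect on a packet can be sketched with an untimed model that holds back one word until it is known whether that word is the last one with a strobe. The Python code below is an illustration under assumptions, not the VHDL implementation: the function name and the (data, strobe, last) beat layout are hypothetical, and a packet consisting only of strobed-out words simply produces no output in this model.

```python
def strobe_on_last(beats):
    """Untimed model. Each beat is (data, strobe, last), where strobe is a list of booleans."""
    output = []
    held = None  # Most recent word that has a strobe, not yet sent.

    for data, strobe, last in beats:
        if any(strobe):
            if held is not None:
                output.append(held)
            held = (data, strobe, last)
        elif last and held is not None:
            # Move the last indicator back to the held word. The empty word is dropped.
            held = (held[0], held[1], True)

        if last and held is not None:
            output.append(held)
            held = None

    return output


beats = [
    ("a", [True, True], False),
    ("b", [True, False], False),
    ("x", [False, False], True),  # Completely strobed out, but carries last.
]
for beat in strobe_on_last(beats):
    print(beat)
```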
Resource utilization
This entity has netlist builds set up with automatic size checkers in module_common.py.
The following table lists the resource utilization for the entity, depending on generic configuration.

Generics | Total LUTs | FFs | Maximum logic level |
---|---|---|---|
data_width = 8 | 7 | 12 | 3 |
data_width = 32 | 8 | 39 | 3 |
data_width = 64 | 9 | 75 | 3 |
time_pkg.vhd
View source code on gitlab.com.
Contains a couple of methods for working with the VHDL time type.
The time type can be tricky sometimes because its precision is implementation dependent, just like the integer and universal_integer types:
- integer'high is
  - 2147483647 in GHDL 3.0.0-dev, corresponding to a 32 bit signed integer.
  - 2147483647 in Vivado 2021.2, corresponding to a 32 bit signed integer.
- time'high is
  - 9223372036854775807 fs in GHDL 3.0.0-dev, corresponding to a 64 bit signed integer. Time values greater than this will result in an error.
  - 2147483647 fs in Vivado 2021.2, corresponding to a 32 bit signed integer. However, Vivado 2021.2 can represent time values greater than this since it uses a dynamic secondary unit for time, as outlined in IEEE Std 1076-2008, page 39. Precision is never greater than 32 bits though.
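To put the two time'high values above in perspective, the following Python sketch converts them from femtoseconds to seconds. It is plain arithmetic on the numbers listed above; the dictionary and constant names are chosen for illustration only.

```python
FEMTOSECONDS_PER_SECOND = 10**15

# The time'high values listed above, in femtoseconds.
time_high_fs = {
    "GHDL 3.0.0-dev": 2**63 - 1,
    "Vivado 2021.2": 2**31 - 1,
}

for simulator, high in time_high_fs.items():
    seconds = high / FEMTOSECONDS_PER_SECOND
    print(f"{simulator}: {high} fs = {seconds:.3e} s")
```

A 64 bit femtosecond count covers roughly 9.2e3 seconds, while a 32 bit count covers only about 2.1 microseconds, which is why a coarser secondary unit is needed to represent longer times with 32 bits of precision.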
In the standard library, the following functions are available for working with time values (IEEE Std 1076-2008, page 260):
function "=" (anonymous, anonymous: TIME) return BOOLEAN;
function "/=" (anonymous, anonymous: TIME) return BOOLEAN;
function "<" (anonymous, anonymous: TIME) return BOOLEAN;
function "<=" (anonymous, anonymous: TIME) return BOOLEAN;
function ">" (anonymous, anonymous: TIME) return BOOLEAN;
function ">=" (anonymous, anonymous: TIME) return BOOLEAN;
function "+" (anonymous: TIME) return TIME;
function "-" (anonymous: TIME) return TIME;
function "abs" (anonymous: TIME) return TIME;
function "+" (anonymous, anonymous: TIME) return TIME;
function "-" (anonymous, anonymous: TIME) return TIME;
function "*" (anonymous: TIME; anonymous: INTEGER) return TIME;
function "*" (anonymous: TIME; anonymous: REAL) return TIME;
function "*" (anonymous: INTEGER; anonymous: TIME) return TIME;
function "*" (anonymous: REAL; anonymous: TIME) return TIME;
function "/" (anonymous: TIME; anonymous: INTEGER) return TIME;
function "/" (anonymous: TIME; anonymous: REAL) return TIME;
function "/" (anonymous, anonymous: TIME) return universal_integer;
function "mod" (anonymous, anonymous: TIME) return TIME;
function "rem" (anonymous, anonymous: TIME) return TIME;
function MINIMUM (L, R: TIME) return TIME;
function MAXIMUM (L, R: TIME) return TIME;
Notably missing is a convenient and accurate way of converting a time value to real or integer. That is where most of the complexity in the conversion functions below lies.
types_pkg.vhd
View source code on gitlab.com.
Some basic types that make it easier to work with VHDL. Also some basic functions operating on these types.
width_conversion.vhd
View source code on gitlab.com.

Width conversion of an AXI-Stream-like data bus. Can handle downconversion (wide to thin) or upconversion (thin to wide). The data widths must be a power-of-two multiple of each other. E.g. 10->40 is supported while 8->24 is not.
There is a generic to enable usage of the last signal. The last indicator will be passed along with the data from the input to output side as-is. Note that enabling the support_unaligned_packet_length generic will enable further processing of last, but in barebone configuration the signal is merely passed on.
There is a generic to enable strobing of data. The strobe will be passed on from input to output side as-is. Note that enabling the support_unaligned_packet_length generic will enable further processing of strobe, but in barebone configuration the signal is merely passed on. This means, for example, that there might be output words where all strobe lanes are zero when downconverting.
There are some limitations, and possible remedies, concerning packet length alignment, depending on if we are doing upconversion or downconversion. See below.
Downconversion
When doing downconversion, one input beat will result in two or more output beats, depending on width configuration. This means that the output packet length is always aligned with the input data width. This is not always desirable when working with the strobe and last signals.
Say for example that we are converting a bus from 16 to 8, and input_last is asserted on a beat where the lowest byte is strobed but the highest is not. In this case, we would want output_last to be asserted on the second to last byte, and the last byte (which is strobed out) to be removed. This is achieved by enabling the support_unaligned_packet_length generic. If the generic is not set, output_last will be asserted on the very last byte, which will be strobed out.
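The 16 to 8 example above can be reproduced with an untimed model. The Python sketch below is an illustration under assumptions, not the VHDL implementation: the function name and the (lanes, strobe, last) beat layout are hypothetical, and each input lane is assumed to become exactly one output word with a single strobe bit.

```python
def downconvert(beats, support_unaligned_packet_length):
    """Untimed model. Each input beat is (lanes, strobe, last), lowest lane first."""
    output = []

    for lanes, strobe, last in beats:
        words = list(zip(lanes, strobe))

        if support_unaligned_packet_length and last:
            # Remove the strobed-out lanes at the end of the packet.
            words = [(data, keep) for data, keep in words if keep]

        for index, (data, keep) in enumerate(words):
            # The last indicator lands on the final word produced from this beat.
            output.append((data, keep, last and index == len(words) - 1))

    return output


beats = [
    (["b0", "b1"], [True, True], False),
    (["b2", "b3"], [True, False], True),  # Highest byte is strobed out.
]
print(downconvert(beats, support_unaligned_packet_length=False))
print(downconvert(beats, support_unaligned_packet_length=True))
```

With the generic disabled the packet ends on the strobed-out byte b3. With it enabled, last moves to b2 and b3 is removed.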
Upconversion
When upconverting, two or more input beats result in one output beat, depending on width configuration. This means that the input packet length must be aligned with the output data width, so that each packet fills up a whole number of output words. If this cannot be guaranteed, then the support_unaligned_packet_length mode can be used. When that is enabled, the input stream will be padded upon last indication so that a whole output word is filled.
Consider the example of converting a bus from 8 to 16, where input_last is asserted on the third input beat. If support_unaligned_packet_length is disabled, there will be one output beat sent and half an output beat left in the converter. If the mode is enabled however, the input stream will be padded with another byte so that an output beat can be sent. The padded parts will have strobe set to zero.
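The 8 to 16 example above can likewise be reproduced with an untimed model. The Python sketch below is an illustration under assumptions, not the VHDL implementation: the function name, the ratio parameter and the (data, last) beat layout are hypothetical, every input beat is assumed to be fully strobed, and padded lanes are shown as None.

```python
def upconvert(beats, ratio, support_unaligned_packet_length):
    """Untimed model. Returns (output_beats, lanes_left_in_converter)."""
    output = []
    lanes = []
    strobe = []

    for data, last in beats:
        lanes.append(data)
        strobe.append(True)

        if last and support_unaligned_packet_length:
            # Pad with strobed-out lanes so that a whole output word is filled.
            while len(lanes) < ratio:
                lanes.append(None)
                strobe.append(False)

        if len(lanes) == ratio:
            output.append((lanes, strobe, last))
            lanes, strobe = [], []

    return output, lanes


beats = [("b0", False), ("b1", False), ("b2", True)]
print(upconvert(beats, ratio=2, support_unaligned_packet_length=False))
print(upconvert(beats, ratio=2, support_unaligned_packet_length=True))
```

With the mode disabled, b2 stays in the converter and no output beat carries last. With it enabled, a second output beat is sent with the upper lane strobed out.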
Note that the handling of unaligned packet lengths is highly dependent on the input stream being well behaved. Specifically:
- There may never be input beats where input_strobe is all zeros.
- For all beats except the one where input_last is asserted, input_strobe must be asserted on all lanes.
Resource utilization
This entity has netlist builds set up with automatic size checkers in module_common.py.
The following table lists the resource utilization for the entity, depending on generic configuration.

Generics | Total LUTs | FFs | Maximum logic level |
---|---|---|---|
input_width = 32, output_width = 16, enable_strobe = False, enable_last = False, support_unaligned_packet_length = False | 20 | 51 | 2 |
input_width = 32, output_width = 16, enable_strobe = True, enable_last = True, support_unaligned_packet_length = False | 23 | 59 | 2 |
input_width = 32, output_width = 16, enable_strobe = True, enable_last = True, support_unaligned_packet_length = True | 30 | 63 | 3 |
input_width = 16, output_width = 32, enable_strobe = False, enable_last = False, support_unaligned_packet_length = False | 35 | 51 | 2 |
input_width = 16, output_width = 32, enable_strobe = True, enable_last = True, support_unaligned_packet_length = False | 40 | 60 | 2 |
input_width = 16, output_width = 32, enable_strobe = True, enable_last = True, support_unaligned_packet_length = True | 45 | 62 | 3 |