# Module sine_generator

This document contains technical documentation for the `sine_generator`

module.
To browse the source code, please visit the repository on GitHub.

This module contains a flexible and robust sinusoidal waveform generator written in VHDL. Also known as a direct digital synthesizer (DDS), numerically-controlled oscillator (NCO), or a sine/sinus wave generator. This is a very common component when doing any kind of signal processing, signal analysis or modulation in an FPGA or ASIC.

Key features

SFDR of 192 dB in fractional phase mode.

Theoretically unlimited SFDR in integer phase mode.

Both sine and cosine outputs, for I/Q modulation and other applications.

Synthesizes frequencies all the way up to Nyquist.

Written in pure VHDL. Needs no separate Matlab/Python tools to pre-calculate anything.

Better performance and lower resource footprint compared to other implementations.

The implementation is based around a quarter-wave lookup table in block RAM. It supports both integer and fractional phase modes, and can be parameterized to use dithering and Taylor expansion to increase the performance, when necessary. It is well-tested and well-analyzed. The performance is proven by theory as well as simulation and on-device tests.

## Quick start guide

The theory behind this module is somewhat complex, and it has a number of parameters that must be assigned and understood in order for things to work as intended. While reading this full document is recommended for a full insight, below is a quick step-by-step guide to utilizing the module.

Determine the Frequency resolution requirement of your application.

Decide if you will use Integer or fractional phase mode, given your requirements.

Parameterize the module to reach the desired performance, using either

Determine your phase increment value based on About phase increment.

Instantiate the sine_generator.vhd entity in your design.

Note

The performance of the module is measured in terms of the spurious-free dynamic range (SFDR) of the output signal. See https://en.wikipedia.org/wiki/Spurious-free_dynamic_range if you are not familiar with this.

### Frequency resolution

Frequency resolution is defined as the smallest output frequency step that can be taken by adjusting
`phase_increment`

.
It is given by

Where `clk_frequency_hz`

is the frequency of the system clock that is clocking this module,
and the `memory_address_width`

and `phase_fractional_width`

are generics to this module.
The resolution required depends on the application of the module, and must be determined by
the user.

### Integer or fractional phase mode

If the Frequency resolution requirement of your system can be satisfied with
`phase_fractional_width`

kept at zero, the module can operate in “integer phase” mode which has
many benefits.
Otherwise the module must operate in “fractional phase” mode, which comes with some drawbacks.
Note that the module will always use a memory that is

large, and you must hence choose a maximum memory size that your design can afford.
The `memory_data_width`

is typically 18 and `memory_address_width`

between 9 and 12, since
that maps very nicely to BRAM primitives
But they can be both less or more.

### Parameterize in integer phase mode

In integer phase mode, the performance is limited only by the `memory_data_width`

generic value
(see About integer phase mode for details).
The SFDR of the output signal is at least

Use this equation to determine the `memory_data_width`

generic value you need, given your
SFDR requirement.
There is no need to adjust any other generic value to the generator top level.

### Parameterize in fractional phase mode

If we reorder the Frequency resolution equation above, we get

Use this to calculate the `fractional_phase_width`

generic value needed.

When in fractional phase mode, the performance is limited mainly by the `memory_address_width`

generic value (see About fractional phase mode for details).
It can be improved by enabling Fractional phase with dithering or Fractional phase with Taylor expansion.
See the performance equations below to determine your required `memory_address_width`

generic
value, and whether you want to set `enable_phase_dithering`

or `enable_first_order_taylor`

.

Note that that in all cases using fractional phase mode, the `memory_data_width`

generic must have
a value of at least

in order for the performance to not be limited by quantization noise. A value of 18 is typical, since it maps nicely to a BRAM primitive, but it might have to be increased in extreme performance scenarios.

#### Base fractional performance

If neither dithering nor Taylor expansion is enabled, the SFDR of the output signal is at least

Use this equation to determine the `memory_address_width`

generic value you need, given your
SFDR requirement.

#### Performance with phase dithering

When the Fractional phase with dithering feature is enabled, the SFDR of the output signal is improved to at least

Use this equation to determine the `memory_address_width`

generic value you need, given your
SFDR requirement.

#### Performance with Taylor expansion

When the Fractional phase with Taylor expansion feature is enabled, the SFDR of the output signal is improved to at least

Use this equation to determine the `memory_address_width`

generic value you need, given your
SFDR requirement.

## About integer phase mode

In integer phase mode, the phase, which is an accumulation of the input `phase_increment`

,
will always point exactly to an integer memory address.
Hence there is no truncation of the phase and no phase error.
See Parameterize in integer phase mode for an SFDR performance equation.

This means that the accuracy of the result is limited only by the bit width of the sine
samples in memory (`memory_data_width`

).
And not at all by the number of samples in the memory (`memory_address_width`

).
This leads to very high performance in typical scenarios.

Note

Enabling dithering or Taylor expansion does nothing for the performance in integer phase mode. This is because both of these mechanisms work on the phase error, which is zero in integer phase mode.

## About fractional phase mode

In fractional phase mode, the phase will not always point exactly to a memory address. Hence the phase is truncated, which leads to an error in the result. I.e. worse performance. See Base fractional performance for an SFDR performance equation.

The example simulation plot below has the same configuration as the integer phase example above, except that the target sine frequency is slightly adjusted to require five fractional phase increment bits. The massive drop in performance is clearly visible.

In this mode the input port `phase_increment`

needs fractional bits in order to express the
desired sine frequency.
The generic `phase_fractional_width`

must be set to a non-zero value so the desired frequency
resolution is reached.

### Fractional phase with dithering

Phase dithering can be enabled to increase the performance in fractional phase mode by setting
the `enable_phase_dithering`

generic.
See Performance with phase dithering for an SFDR performance equation.
See also here for implementation details.

The result of simulating the example scenario from About fractional phase mode above, but with dithering enabled, is shown below.

As can be seen when comparing the performance to the non-dithered ditto above, the SFDR is better but the SNDR is worse. One can also note that the noise floor is much more uniformly spread out.

### Fractional phase with Taylor expansion

Taylor expansion can be enabled to increase the performance in fractional phase mode by setting
the `enable_first_order_taylor`

generic.
See Performance with Taylor expansion for an SFDR performance equation.
See also here for a background on the Taylor expansion concept.

The result of simulating the example scenario from About fractional phase mode above, but with first-order Taylor expansion enabled, is shown below.

As can be seen in the plot, both the SNDR and SFDR are massively improved. Compared to the non-Taylor-expanded ditto above, the performance is roughly doubled. Other than that, the noise floor is quite similar with distinct distortion peaks, but they are all suppressed more by the Taylor expansion.

## sine_calculator.vhd

Calculates the sinus value corresponding to the provided phase value. Instantiates a sine lookup table (sine_lookup.vhd) where the integer part of the phase will be used as address.

If fractional phase is enabled, the fractional part of the phase will be truncated when forming the lookup address. First-order Taylor expansion can be enabled to improve the accuracy using the truncated phase.

### Taylor expansion

The Taylor expansion of a function is given by

See https://en.wikipedia.org/wiki/Taylor_series. The accuracy is better if \(a\) is close to \(x\), or if many terms are used. Substituting

and realizing the following properties of the derivative of the sine function

we get

### Taylor expansion implementation

This entity corrects the sine lookup value using first-order Taylor expansion, meaning

In this representation, \(x\) is the full phase value, including fractional bits. The \(a\) is the integer part of the phase value that forms the memory address, and \(e\) is the fractional part of the phase that gets truncated. The \(A \sin(B x + C)\) and \(A \cos(B x + C)\) values are given by sine_lookup.vhd. The \(B\) value is the phase increment of the lookup table:

The calculation is partitioned like this, using DSP48 blocks:

The \(\pi / 2\) is handled as a fixed-pointed value with a number of fractional bits determined to give sufficient performance. Multiplying with the phase error, which is a fractional value, and then the cosine value gives a value that has a very high number of fractional bits. In order for the summation with the sine value to be correct, the sine value must be shifted up until the binal points align.

When the operands are small, the last multiplication and the addition can fit in the same DSP48. This is not the case in general though.

Note

An alternative approach would be to store the error term in a ROM and use the phase error as lookup address. This would save DSP blocks but cost BRAM. We can support that in the future with a generic switch if there is ever a need.

## sine_generator.vhd

This sinus generator top level accumulates the incoming `phase_increment`

to form a
phase value.
The sine_calculator.vhd is instantiated to calculate sinusoid values
based on this phase.

Set the `enable_sine`

and `enable_cosine`

generic parameters to enable sine and/or
cosine output.

If fractional phase is enabled, the fractional part of the phase will be truncated in sine_calculator.vhd when forming the lookup address. Phase dithering can be enabled in this case to improve the SFDR.

### About phase increment

The frequency of the output signal is determined by the `phase_increment`

input port value.
The width of this port is equal to

In VHDL you are recommended to utilize the `get_phase_width`

function in
sine_generator_pkg.vhd.

The phase increment value can be calculated as

Where `sine_frequency_hz`

is the target sinus output frequency, and `clk_frequency_hz`

is
the frequency of the system clock that is clocking this module.
In VHDL you are recommended to utilize the `get_phase_increment`

function in
sine_generator_pkg.vhd.

Note that the Nyquist condition must be honored, meaning that the sine frequency must be less than half the clock frequency.

### Phase dithering

Phase dithering adds a pseudo-random offset to the phase that is sent to sine_calculator.vhd. The phase offset is uniformly distributed over the entire fractional phase width, meaning between 0 and almost 1 LSB of the memory address. The phase offset is added on top of the phase accumulator, and sometimes the addition will result in +1 LSB in the address, sometimes it will not.

This phase offset spreads out the spectrum distortion caused by phase truncation when reading from memory. The result is a lower peak distortion, i.e. a higher SFDR. This comes, of course, at the cost of an increased overall level of noise, i.e. a lower SNDR. Whether this tradeoff is worth it depends on the use case, and the choice is left to the user.

See Fractional phase with dithering for a system-level perspective and some performance graphs.

#### Pseudo-random algorithm

Dithering is implemented using a maximum-length linear feedback shift register (LFSR) from the Module lfsr. This gives a sequence of repeating state outputs that is not correlated with the phase and appears pseudo-random.

The LFSR length is at least equal to the fractional width of the phase increment.

### Resource utilization

This entity has netlist builds set up with automatic size checkers in module_sine_generator.py. The following table lists the resource utilization for the entity, depending on generic configuration.

Generics |
Total LUTs |
FFs |
DSP Blocks |
RAMB18 |
RAMB36 |
Maximum logic level |
---|---|---|---|---|---|---|

memory_data_width = 14 memory_address_width = 8 |
39 |
28 |
0 |
1 |
0 |
7 |

memory_data_width = 18 memory_address_width = 8 |
45 |
32 |
0 |
1 |
0 |
8 |

memory_data_width = 14 memory_address_width = 12 |
47 |
32 |
0 |
0 |
2 |
7 |

memory_data_width = 14 memory_address_width = 8 phase_fractional_width = 5 enable_phase_dithering = True |
54 |
51 |
0 |
1 |
0 |
7 |

memory_data_width = 18 memory_address_width = 12 phase_fractional_width = 24 enable_phase_dithering = True |
114 |
101 |
0 |
0 |
2 |
12 |

memory_data_width = 17 memory_address_width = 8 phase_fractional_width = 5 enable_first_order_taylor = True |
100 |
38 |
2 |
1 |
0 |
8 |

memory_data_width = 25 memory_address_width = 12 phase_fractional_width = 24 enable_first_order_taylor = True |
156 |
69 |
3 |
0 |
3 |
12 |

memory_data_width = 23 memory_address_width = 11 phase_fractional_width = 28 enable_first_order_taylor = True |
151 |
70 |
3 |
1 |
1 |
13 |

memory_data_width = 23 memory_address_width = 11 phase_fractional_width = 28 enable_sine = False enable_cosine = True enable_first_order_taylor = True |
161 |
70 |
3 |
1 |
1 |
13 |

memory_data_width = 23 memory_address_width = 11 phase_fractional_width = 28 enable_sine = True enable_cosine = True enable_first_order_taylor = True |
177 |
94 |
5 |
1 |
1 |
13 |

## sine_generator_pkg.vhd

Package with constants/types/functions for the sine generator ecosystem.

## sine_lookup.vhd

A lookup table for fixed-point sine and cosine values.
The `input_phase`

is in range \([0, 2 \pi[\), but the memory in this entity stores only
samples for \([0, \pi / 2[\).
The phase is furthermore offset by plus half an LSB (see Quadrant handling below).

Each sample value in memory is `memory_data_width`

bits wide.

Use the `enable_*`

generics to specify which signals to calculate.
Enabling many is convenient when you want sinusoids that are perfectly \(\pi / 2\)-offset
from each other.
Enabling further signals will not require any extra memory, but it will add logic.
Also, sine calculation (positive or negative) requires one memory read port, while cosine
calculation (positive or negative) requires another.
So enabling any sine along with any cosine will require two memory read ports.

### Quadrant handling

Consider the unit circle, and the sine and cosine plots in the picture below.

When implementing an angle-discrete sine lookup table, the first approach might be to use the blue points in the plot above, starting at phase zero. However, for the implementation to be efficient, we want to be able to calculate e.g. sine in quadrant one as the sine in quadrant zero, but read out in reverse order. This is desirable since “reverse order” when working with fixed-point numbers simply means a bit-wise inverse of a “normal order” counter.

With this goal of efficient implementation in mind, we offset the phase so that the points are mirrored around \(0\), \(\pi/2\), \(\pi\) and \(3 \pi/2\). The resulting symmetry can be clearly seen in the sine and cosine plots above. For example, the sine points in quadrant one are clearly the same points as in quadrant zero, but in reverse order. Apart from this ocular hint, we can also show it using basic trigonometric identities:

This shows how both sine and cosine for all four quadrant can be calculated using only sine values from the first quadrant (\([0, \pi/2]\)). In the calculations above we have utilized the fact that a phase of \(\pi/2 - \theta\), meaning phase in reverse order, is the same as bit-inversion of the phase.

### Fixed-point representation

When we want to distribute \(2^\text{memory_address_width}\) number of points over the phase range \([0, \pi / 2[\), we use

To achieve the symmetry we aim for in the discussion above, we offset the phase by half an LSB:

This gives a total phase of

We also have an amplitude-quantization given by the `memory_data_width`

generic.
This gives a maximum amplitude of

With this established, we can calculate the memory values as

As can be seen in the trigonometric identities above, the resulting output sine value from this entity is negated in some some quadrants. This gives an output range of \([-A, A]\). Where fixed-point \(A\) is equivalent to floating-point \(1.0\).

### Performance

Samples in memory are stored with `memory_data_width`

bits, and the quadrant handling discussed
above adds one more sign bit.
The only source of noise and distortion is the digital quantization noise when storing sine
values with a fixed-point representation in memory.

Hence the result from this entity has SNDR and SFDR equal to `memory_data_width + 1`

ENOB.

## taylor_expansion_core.vhd

Core that calculates the first-order Taylor expansion of sinusoid function

with fixed-point numbers that fit in a DSP48. This core is to be used in sine_calculator.vhd, and is not really suitable for other purposes.

Warning

This is an internal core. The interface might change without notice.