.. _module_sine_generator:

Module sine_generator
=====================

This document contains technical documentation for the ``sine_generator`` module.
To browse the source code, please visit the `repository on GitHub `__.

This module contains a flexible and robust sinusoidal waveform generator written in VHDL.
It is also known as a direct digital synthesizer (DDS), numerically-controlled oscillator (NCO), or a sine/sinus wave generator.
This is a very common component when doing any kind of signal processing, signal analysis or modulation in an FPGA or ASIC.

Key features

* SFDR of 192 dB in fractional phase mode.
* Theoretically unlimited SFDR in integer phase mode.
* Both sine and cosine outputs, for I/Q modulation and other applications.
* Synthesizes frequencies all the way up to Nyquist.
* Written in pure VHDL. Needs no separate Matlab/Python tools to pre-calculate anything.
* Better performance and lower resource footprint compared to other implementations.

The implementation is based around a quarter-wave lookup table in block RAM.
It supports both integer and fractional phase modes, and can be parameterized to use dithering and Taylor expansion to increase the performance, when necessary.
It is well-tested and well-analyzed.
The performance is proven by theory as well as simulation and on-device tests.


Quick start guide
-----------------

The theory behind this module is somewhat complex, and it has a number of parameters that must be assigned and understood in order for things to work as intended.
While reading the full document is recommended for complete insight, below is a quick step-by-step guide to utilizing the module.

1. Determine the :ref:`sine_frequency_resolution` requirement of your application.
2. Decide if you will use :ref:`sine_phase_mode`, given your requirements.
3. Parameterize the module to reach the desired performance, using either

   a. :ref:`sine_parameterize_integer_phase_mode` or
   b. :ref:`sine_parameterize_fractional_phase_mode`.

4. Determine your phase increment value based on :ref:`sine_calculate_increment`.
5. Instantiate the :ref:`sine_generator.sine_generator` entity in your design.

.. note::
  The performance of the module is measured in terms of the spurious-free dynamic range (SFDR) of the output signal.
  See https://en.wikipedia.org/wiki/Spurious-free_dynamic_range if you are not familiar with this.


.. _sine_frequency_resolution:

Frequency resolution
____________________

Frequency resolution is defined as the smallest output frequency step that can be taken by adjusting ``phase_increment``.
It is given by

.. math::

  \text{frequency_resolution_hz} \equiv \frac{\text{clk_frequency_hz}}{2^\text{phase_width}}
    = \frac{\text{clk_frequency_hz}}{
      2^{\text{memory_address_width} + 2 + \text{phase_fractional_width}}
    },

where ``clk_frequency_hz`` is the frequency of the system clock that is clocking this module, and ``memory_address_width`` and ``phase_fractional_width`` are generics of this module.
The resolution required depends on the application of the module, and must be determined by the user.


.. _sine_phase_mode:

Integer or fractional phase mode
________________________________

If the :ref:`sine_frequency_resolution` requirement of your system can be satisfied with ``phase_fractional_width`` kept at zero, the module can operate in "integer phase" mode, which has many benefits.
Otherwise the module must operate in "fractional phase" mode, which comes with some drawbacks.

Note that the module will always use a memory that is

.. math::

  2^\text{memory_address_width} \times \text{memory_data_width}

large, and you must hence choose a maximum memory size that your design can afford.
The ``memory_data_width`` is typically 18 and ``memory_address_width`` between 9 and 12, since that maps very nicely to BRAM primitives.
Both can, however, be smaller or larger.


.. _sine_parameterize_integer_phase_mode:

Parameterize in integer phase mode
__________________________________

In integer phase mode, the performance is limited only by the ``memory_data_width`` generic value (see :ref:`sine_integer_phase_mode` for details).
The SFDR of the output signal is at least

.. math::

  \text{SFDR} = 6 \times (\text{memory_data_width} + 1) \text{ dB}.

Use this equation to determine the ``memory_data_width`` generic value you need, given your SFDR requirement.
There is no need to adjust any other generic value to the generator top level.


.. _sine_parameterize_fractional_phase_mode:

Parameterize in fractional phase mode
_____________________________________

If we reorder the :ref:`sine_frequency_resolution` equation above, we get

.. math::

  \text{phase_fractional_width} =
    \left\lceil
      \log_2 \left( \frac{\text{clk_frequency_hz}}{\text{frequency_resolution_hz}} \right)
    \right\rceil
    - \text{memory_address_width} - 2.

Use this to calculate the ``phase_fractional_width`` generic value needed.

When in fractional phase mode, the performance is limited mainly by the ``memory_address_width`` generic value (see :ref:`sine_fractional_phase_mode` for details).
It can be improved by enabling :ref:`sine_phase_dithering` or :ref:`sine_phase_taylor`.
See the performance equations below to determine your required ``memory_address_width`` generic value, and whether you want to set ``enable_phase_dithering`` or ``enable_first_order_taylor``.

Note that in all cases using fractional phase mode, the ``memory_data_width`` generic must have a value of at least

.. math::

  \frac{\text{SFDR}}{6} - 1

in order for the performance to not be limited by quantization noise.
A value of 18 is typical, since it maps nicely to a BRAM primitive, but it might have to be increased in extreme performance scenarios.


.. _sine_fractional_performance:

Base fractional performance
~~~~~~~~~~~~~~~~~~~~~~~~~~~

If neither dithering nor Taylor expansion is enabled, the SFDR of the output signal is at least

.. math::

  \text{SFDR} = 6 \times (\text{memory_address_width} + 1) \text{ dB}.

Use this equation to determine the ``memory_address_width`` generic value you need, given your SFDR requirement.


.. _sine_dithering_performance:

Performance with phase dithering
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

When the :ref:`sine_phase_dithering` feature is enabled, the SFDR of the output signal is improved to at least

.. math::

  \text{SFDR} = 6 \times (\text{memory_address_width} + 4) \text{ dB}.

Use this equation to determine the ``memory_address_width`` generic value you need, given your SFDR requirement.


.. _sine_taylor_performance:

Performance with Taylor expansion
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

When the :ref:`sine_phase_taylor` feature is enabled, the SFDR of the output signal is improved to at least

.. math::

  \text{SFDR} = 12 \times (\text{memory_address_width} + 1) \text{ dB}.

Use this equation to determine the ``memory_address_width`` generic value you need, given your SFDR requirement.


.. _sine_integer_phase_mode:

About integer phase mode
------------------------

In integer phase mode, the phase, which is an accumulation of the input ``phase_increment``, will always point exactly to an integer memory address.
Hence there is no truncation of the phase and no phase error.
See :ref:`sine_parameterize_integer_phase_mode` for an SFDR performance equation.

This means that the accuracy of the result is limited only by the bit width of the sine samples in memory (``memory_data_width``), and not at all by the number of samples in the memory (``memory_address_width``).
This leads to very high performance in typical scenarios.

.. figure:: integer_phase.png

  Example simulation with an integer phase increment.

.. note::
  Enabling :ref:`dithering ` or :ref:`Taylor expansion ` does nothing for the performance in integer phase mode.
  This is because both of these mechanisms work on the phase error, which is zero in integer phase mode.


.. _sine_fractional_phase_mode:

About fractional phase mode
---------------------------

In fractional phase mode, the phase will not always point exactly to a memory address.
Hence the phase is truncated, which leads to an error in the result, i.e. worse performance.
See :ref:`sine_fractional_performance` for an SFDR performance equation.

The example simulation plot below has the same configuration as the integer phase example above, except that the target sine frequency is slightly adjusted to require five fractional phase increment bits.
The massive drop in performance is clearly visible.

.. figure:: fractional_phase.png

  Example simulation with a fractional phase increment.

In this mode the input port ``phase_increment`` needs fractional bits in order to express the desired sine frequency.
The generic ``phase_fractional_width`` must be set to a non-zero value so the desired frequency resolution is reached.


.. _sine_phase_dithering:

Fractional phase with dithering
_______________________________

Phase dithering can be enabled to increase the performance in fractional phase mode by setting the ``enable_phase_dithering`` generic.
See :ref:`sine_dithering_performance` for an SFDR performance equation.
See also :ref:`here ` for implementation details.

The result of simulating the example scenario from :ref:`sine_fractional_phase_mode` above, but with dithering enabled, is shown below.

.. figure:: dithering.png

  Example simulation with a fractional phase increment and dithering.

As can be seen when comparing the performance to the non-dithered plot above, the SFDR is better but the SNDR is worse.
One can also note that the noise floor is much more uniformly spread out.


.. _sine_phase_taylor:

Fractional phase with Taylor expansion
______________________________________

Taylor expansion can be enabled to increase the performance in fractional phase mode by setting the ``enable_first_order_taylor`` generic.
See :ref:`sine_taylor_performance` for an SFDR performance equation.
See also :ref:`here ` for a background on the Taylor expansion concept.

The result of simulating the example scenario from :ref:`sine_fractional_phase_mode` above, but with first-order Taylor expansion enabled, is shown below.

.. figure:: taylor.png

  Example simulation with a fractional phase increment and Taylor expansion.

As can be seen in the plot, both the SNDR and SFDR are massively improved.
Compared to the non-Taylor-expanded plot above, the performance is roughly doubled.
Other than that, the noise floor is quite similar with distinct distortion peaks, but they are all suppressed more by the Taylor expansion.


.. _sine_generator.sine_calculator:

sine_calculator.vhd
-------------------

`View source code on GitHub `__.

.. symbolator::

  component sine_calculator is
    generic (
      memory_data_width : positive;
      memory_address_width : positive;
      enable_sine : boolean;
      enable_cosine : boolean;
      phase_fractional_width : natural;
      enable_first_order_taylor : boolean
    );
    port (
      clk : in std_ulogic;
      --# {{}}
      input_valid : in std_ulogic;
      input_phase : in u_unsigned;
      --# {{}}
      result_valid : out std_ulogic;
      result_sine : out u_signed;
      result_cosine : out u_signed
    );
  end component;

Calculates the sine value corresponding to the provided phase value.
Instantiates a sine lookup table (:ref:`sine_generator.sine_lookup`) where the integer part of the phase will be used as address.

If fractional phase is enabled, the fractional part of the phase will be truncated when forming the lookup address.
First-order Taylor expansion can be enabled to improve the accuracy using the truncated phase.


.. _sine_taylor_expansion:

Taylor expansion
________________

The Taylor expansion of a function is given by

.. math::

  f(x) = f(a) + f'(a) \times (x - a) + \frac{f''(a)}{2!} \times (x - a)^2
    + \frac{f'''(a)}{3!} \times (x - a)^3 + ...

See https://en.wikipedia.org/wiki/Taylor_series.
The accuracy is better if :math:`a` is close to :math:`x`, or if many terms are used.
Substituting

.. math::

  e \equiv x - a,

and realizing the following properties of the derivative of the sine function

.. math::

  f(x) & \equiv A \sin(B x + C)
  \\
  \Rightarrow f'(x) & = A B \cos(B x + C)
  \\
  \Rightarrow f''(x) & = -A B^2 \sin(B x + C) = - B^2 f(x)
  \\
  \Rightarrow f'''(x) & = -A B^3 \cos(B x + C) = - B^2 f'(x)
  \\
  \Rightarrow f''''(x) & = A B^4 \sin(B x + C) = B^4 f(x)

we get

.. math::

  A \sin(B x + C) = & \ A \sin(B a + C) \times \left(
      1 - \frac{(Be)^2}{2!} + \frac{(Be)^4}{4!} - \frac{(Be)^6}{6!} + \ldots
    \right)
  \\
  & + A \cos(B a + C) \times \left(
      Be - \frac{(Be)^3}{3!} + \frac{(Be)^5}{5!} - \frac{(Be)^7}{7!} + \ldots
    \right).


Taylor expansion implementation
_______________________________

This entity corrects the sine lookup value using first-order Taylor expansion, meaning

.. math::

  A \sin(B x + C) \approx A \sin(B a + C) + A \cos(B a + C) \times B e.

In this representation, :math:`x` is the full phase value, including fractional bits.
The :math:`a` is the integer part of the phase value that forms the memory address, and :math:`e` is the fractional part of the phase that gets truncated.
The :math:`A \sin(B a + C)` and :math:`A \cos(B a + C)` values are given by :ref:`sine_generator.sine_lookup`.

The :math:`B` value is the phase increment of the lookup table:

.. math::

  B \equiv \frac{\pi / 2}{2^\text{memory_address_width}}.

The calculation is partitioned like this, using DSP48 blocks:

.. digraph:: my_graph

  graph [dpi=300];
  rankdir="LR";

  phase_error [shape="none" label="phase error"];
  pi [shape="none" label="π/2"];

  {
    rank=same;
    phase_error;
    pi;
  }

  first_multiplication [shape="box" label="x"];
  phase_error -> first_multiplication [label="<=25"];
  pi -> first_multiplication [label="<=18"];

  lookup_cosine [shape="none" label="lookup cosine"];

  {
    first_multiplication=same;
    lookup_cosine;
    pi;
  }

  second_multiplication [shape="box" label="x"];
  first_multiplication -> second_multiplication [label="<=25"];
  lookup_cosine -> second_multiplication [label="<=18"];

  lookup_sine [shape="none" label="lookup sine & 0"];

  {
    first_multiplication=same;
    second_multiplication;
    lookup_sine;
  }

  addition [shape="box" label="+"];
  second_multiplication -> addition [label="<=47"];
  lookup_sine -> addition [label="<=47"];

  saturation [shape="box" label="saturation"];
  addition -> saturation [label="<=48"];

  result [shape="none" label="result"];
  saturation -> result;

The :math:`\pi / 2` is handled as a fixed-point value with a number of fractional bits determined to give sufficient performance.
Multiplying with the phase error, which is a fractional value, and then the cosine value gives a value that has a very high number of fractional bits.
In order for the summation with the sine value to be correct, the sine value must be shifted up until the binary points align.

When the operands are small, the last multiplication and the addition can fit in the same DSP48.
This is not the case in general though.

.. note::
  An alternative approach would be to store the error term in a ROM and use the phase error as lookup address.
  This would save DSP blocks but cost BRAM.
  We can support that in the future with a generic switch if there is ever a need.


.. _sine_generator.sine_generator:

sine_generator.vhd
------------------

`View source code on GitHub `__.

.. symbolator::

  component sine_generator is
    generic (
      memory_data_width : positive;
      memory_address_width : positive;
      phase_fractional_width : natural;
      enable_sine : boolean;
      enable_cosine : boolean;
      enable_phase_dithering : boolean;
      enable_first_order_taylor : boolean;
      initial_phase : u_unsigned
    );
    port (
      clk : in std_ulogic;
      --# {{}}
      input_valid : in std_ulogic;
      input_phase_increment : in u_unsigned;
      --# {{}}
      result_valid : out std_ulogic;
      result_sine : out u_signed;
      result_cosine : out u_signed
    );
  end component;

This sine generator top level accumulates the incoming ``phase_increment`` to form a phase value.
The :ref:`sine_generator.sine_calculator` is instantiated to calculate sinusoid values based on this phase.

Set the ``enable_sine`` and ``enable_cosine`` generic parameters to enable sine and/or cosine output.
If fractional phase is enabled, the fractional part of the phase will be truncated in :ref:`sine_generator.sine_calculator` when forming the lookup address.
Phase dithering can be enabled in this case to improve the SFDR.


.. _sine_calculate_increment:

About phase increment
_____________________

The frequency of the output signal is determined by the ``phase_increment`` input port value.
The width of this port is equal to

.. math::

  \text{phase_width} = \text{memory_address_width} + 2 + \text{phase_fractional_width}

In VHDL you are recommended to utilize the ``get_phase_width`` function in :ref:`sine_generator.sine_generator_pkg`.
The phase increment value can be calculated as

.. math::

  \text{phase_increment} = \text{int} \left(
    \frac{\text{sine_frequency_hz}}{\text{clk_frequency_hz}} \times 2^\text{phase_width}
  \right),

where ``sine_frequency_hz`` is the target sine output frequency, and ``clk_frequency_hz`` is the frequency of the system clock that is clocking this module.
In VHDL you are recommended to utilize the ``get_phase_increment`` function in :ref:`sine_generator.sine_generator_pkg`.
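
For quick experiments outside VHDL, the phase width, fractional width and phase increment equations from this document can be sketched in Python.
This is only an illustration of the formulas: the function names ``calc_phase_width``, ``calc_phase_fractional_width`` and ``calc_phase_increment`` are hypothetical and are not part of this module, and the clamp to zero fractional bits and the 100 MHz / 25 MHz / 1 Hz example values are assumptions.

```python
import math


def calc_phase_width(memory_address_width: int, phase_fractional_width: int = 0) -> int:
    """Width of the phase increment port, in bits (hypothetical helper)."""
    return memory_address_width + 2 + phase_fractional_width


def calc_phase_fractional_width(
    clk_frequency_hz: float, frequency_resolution_hz: float, memory_address_width: int
) -> int:
    """Fractional phase bits needed to reach a frequency resolution target.

    Assumption: a non-positive result is clamped to zero, i.e. integer phase mode.
    """
    total_width = math.ceil(math.log2(clk_frequency_hz / frequency_resolution_hz))
    return max(0, total_width - memory_address_width - 2)


def calc_phase_increment(
    clk_frequency_hz: float, sine_frequency_hz: float, phase_width: int
) -> int:
    """Phase increment for a target output frequency, truncated to an integer."""
    if not 0 <= sine_frequency_hz < clk_frequency_hz / 2:
        raise ValueError("Sine frequency must be non-negative and below Nyquist.")
    return int(sine_frequency_hz / clk_frequency_hz * 2**phase_width)


# Assumed example: 100 MHz clock, 12 address bits, 1 Hz resolution, 25 MHz output.
fractional_width = calc_phase_fractional_width(100e6, 1.0, 12)
phase_width = calc_phase_width(12, fractional_width)
increment = calc_phase_increment(100e6, 25e6, phase_width)
print(fractional_width, phase_width, increment)
```

In a real design the ``get_phase_width`` and ``get_phase_increment`` functions of the VHDL package remain the recommended route; a script like this is mainly useful for cross-checking values in a testbench or a register map.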
Note that the Nyquist condition must be honored, meaning that the sine frequency must be less than half the clock frequency.


.. _sine_generator_dithering:

Phase dithering
_______________

Phase dithering adds a pseudo-random offset to the phase that is sent to :ref:`sine_generator.sine_calculator`.
The phase offset is uniformly distributed over the entire fractional phase width, meaning between 0 and almost 1 LSB of the memory address.
The phase offset is added on top of the phase accumulator, and sometimes the addition will result in +1 LSB in the address, sometimes it will not.

.. figure:: dithering_zoom.png

  Zoom in of a low-frequency sine wave, without and with dithering.

This phase offset spreads out the spectrum distortion caused by phase truncation when reading from memory.
The result is a lower peak distortion, i.e. a higher SFDR.
This comes, of course, at the cost of an increased overall level of noise, i.e. a lower SNDR.
Whether this tradeoff is worth it depends on the use case, and the choice is left to the user.

See :ref:`sine_phase_dithering` for a system-level perspective and some performance graphs.


Pseudo-random algorithm
~~~~~~~~~~~~~~~~~~~~~~~

Dithering is implemented using a maximum-length linear feedback shift register (LFSR) from the :ref:`module_lfsr`.
This gives a sequence of repeating state outputs that is not correlated with the phase and appears pseudo-random.
The LFSR length is at least equal to the fractional width of the phase increment.


.. _sine_generator.sine_generator.resource_utilization:

Resource utilization
____________________

This entity has `netlist builds `__ set up with `automatic size checkers `__ in `module_sine_generator.py `__.
The following table lists the resource utilization for the entity, depending on generic configuration.

.. list-table:: Resource utilization for **sine_generator** netlist builds.
  :header-rows: 1

  * - Generics
    - Total LUTs
    - FFs
    - DSP Blocks
    - RAMB18
    - RAMB36
    - Maximum logic level
  * - memory_data_width = 14

      memory_address_width = 8
    - 39
    - 28
    - 0
    - 1
    - 0
    - 7
  * - memory_data_width = 18

      memory_address_width = 8
    - 45
    - 32
    - 0
    - 1
    - 0
    - 8
  * - memory_data_width = 14

      memory_address_width = 12
    - 47
    - 32
    - 0
    - 0
    - 2
    - 7
  * - memory_data_width = 14

      memory_address_width = 8

      phase_fractional_width = 5

      enable_phase_dithering = True
    - 54
    - 51
    - 0
    - 1
    - 0
    - 7
  * - memory_data_width = 18

      memory_address_width = 12

      phase_fractional_width = 24

      enable_phase_dithering = True
    - 114
    - 101
    - 0
    - 0
    - 2
    - 12
  * - memory_data_width = 17

      memory_address_width = 8

      phase_fractional_width = 5

      enable_first_order_taylor = True
    - 100
    - 38
    - 2
    - 1
    - 0
    - 8
  * - memory_data_width = 25

      memory_address_width = 12

      phase_fractional_width = 24

      enable_first_order_taylor = True
    - 156
    - 69
    - 3
    - 0
    - 3
    - 12
  * - memory_data_width = 23

      memory_address_width = 11

      phase_fractional_width = 28

      enable_first_order_taylor = True
    - 151
    - 70
    - 3
    - 1
    - 1
    - 13
  * - memory_data_width = 23

      memory_address_width = 11

      phase_fractional_width = 28

      enable_sine = False

      enable_cosine = True

      enable_first_order_taylor = True
    - 161
    - 70
    - 3
    - 1
    - 1
    - 13
  * - memory_data_width = 23

      memory_address_width = 11

      phase_fractional_width = 28

      enable_sine = True

      enable_cosine = True

      enable_first_order_taylor = True
    - 177
    - 94
    - 5
    - 1
    - 1
    - 13


.. _sine_generator.sine_generator_pkg:

sine_generator_pkg.vhd
----------------------

`View source code on GitHub `__.

Package with constants/types/functions for the sine generator ecosystem.


.. _sine_generator.sine_lookup:

sine_lookup.vhd
---------------

`View source code on GitHub `__.

.. symbolator::

  component sine_lookup is
    generic (
      memory_data_width : positive;
      memory_address_width : positive;
      enable_sine : boolean;
      enable_cosine : boolean;
      enable_minus_sine : boolean;
      enable_minus_cosine : boolean
    );
    port (
      clk : in std_ulogic;
      --# {{}}
      input_valid : in std_ulogic;
      input_phase : in u_unsigned;
      --# {{}}
      result_valid : out std_ulogic;
      result_sine : out u_signed;
      result_cosine : out u_signed;
      result_minus_sine : out u_signed;
      result_minus_cosine : out u_signed
    );
  end component;

A lookup table for fixed-point sine and cosine values.
The ``input_phase`` is in range :math:`[0, 2 \pi[`, but the memory in this entity stores only samples for :math:`[0, \pi / 2[`.
The phase is furthermore offset by plus half an LSB (see :ref:`sine_lookup_quadrant` below).
Each sample value in memory is ``memory_data_width`` bits wide.

Use the ``enable_*`` generics to specify which signals to calculate.
Enabling many is convenient when you want sinusoids that are perfectly :math:`\pi / 2`-offset from each other.
Enabling further signals will not require any extra memory, but it will add logic.
Also, sine calculation (positive or negative) requires one memory read port, while cosine calculation (positive or negative) requires another.
So enabling any sine along with any cosine will require two memory read ports.


.. _sine_lookup_quadrant:

Quadrant handling
_________________

Consider the unit circle, and the sine and cosine plots in the picture below.

.. figure:: quadrants.png

  Overview of the four quadrants.

When implementing an angle-discrete sine lookup table, the first approach might be to use the blue points in the plot above, starting at phase zero.
However, for the implementation to be efficient, we want to be able to calculate e.g. sine in quadrant one as the sine in quadrant zero, but read out in reverse order.
This is desirable since "reverse order" when working with fixed-point numbers simply means a bit-wise inverse of a "normal order" counter.
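
The claim that "reverse order" equals a bit-wise inverse can be checked with a few lines of Python.
This is a minimal sketch and not code from the module: the ``bit_inverted`` helper is hypothetical, and a width of three address bits is assumed only to keep the printout short.

```python
def bit_inverted(address: int, width: int) -> int:
    """Bit-wise inverse of 'address', restricted to 'width' bits (hypothetical helper)."""
    return address ^ ((1 << width) - 1)


width = 3  # Assumed small address width, for illustration only.
forward = list(range(2**width))
backward = [bit_inverted(address, width) for address in forward]

print(forward)
print(backward)
```

The second list is the first one read backwards, so a quadrant that needs the samples in reverse order can reuse the same address counter with all bits inverted.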
With this goal of efficient implementation in mind, we offset the phase so that the points are mirrored around :math:`0`, :math:`\pi/2`, :math:`\pi` and :math:`3 \pi/2`.
The resulting symmetry can be clearly seen in the sine and cosine plots above.
For example, the sine points in quadrant one are clearly the same points as in quadrant zero, but in reverse order.

Apart from this ocular hint, we can also show it using basic trigonometric identities:

.. math::

  \text{Sine quadrant 0: } & \sin(\theta) = \sin(\theta)
  \\
  \text{Sine quadrant 1: } & \sin(\theta + \frac{\pi}{2})
    = \sin(\frac{\pi}{2} - \theta) = \sin( \bar{\theta} )
  \\
  \text{Sine quadrant 2: } & \sin(\theta + \pi) = - \sin(\theta)
  \\
  \text{Sine quadrant 3: } & \sin(\theta + \frac{3 \pi}{2})
    = - \sin(\frac{\pi}{2} - \theta) = - \sin( \bar{\theta} )
  \\
  \\
  \text{Cosine quadrant 0: } & \cos(\theta)
    = \sin(\frac{\pi}{2} - \theta) = \sin( \bar{\theta} )
  \\
  \text{Cosine quadrant 1: } & \cos(\theta + \frac{\pi}{2}) = - \sin(\theta)
  \\
  \text{Cosine quadrant 2: } & \cos(\theta + \pi)
    = - \sin(\frac{\pi}{2} - \theta) = - \sin( \bar{\theta} )
  \\
  \text{Cosine quadrant 3: } & \cos(\theta + \frac{3 \pi}{2}) = \sin(\theta)

This shows how both sine and cosine for all four quadrants can be calculated using only sine values from the first quadrant (:math:`[0, \pi/2]`).
In the calculations above we have utilized the fact that a phase of :math:`\pi/2 - \theta`, meaning phase in reverse order, is the same as bit-inversion of the phase.


Fixed-point representation
__________________________

When we want to distribute :math:`2^\text{memory_address_width}` points over the phase range :math:`[0, \pi / 2[`, we use

.. math::

  \text{phase_increment} \equiv \frac{\pi / 2}{2^\text{memory_address_width}}.

To achieve the symmetry we aim for in the discussion above, we offset the phase by half an LSB:

.. math::

  \phi \equiv \frac{\text{phase_increment}}{2}.

This gives a total phase of

.. math::

  \theta(i) \equiv i \times \text{phase_increment} + \phi.

We also have an amplitude quantization given by the ``memory_data_width`` generic.
This gives a maximum amplitude of

.. math::

  A \equiv 2^\text{memory_data_width} - 1.

With this established, we can calculate the memory values as

.. math::

  \text{mem} (i) = \text{int} \left( A \times \sin(\theta(i)) \right),
    \forall i \in [0, 2^\text{memory_address_width} - 1].

As can be seen in the trigonometric identities above, the resulting output sine value from this entity is negated in some quadrants.
This gives an output range of :math:`[-A, A]`, where fixed-point :math:`A` is equivalent to floating-point :math:`1.0`.


Performance
___________

Samples in memory are stored with ``memory_data_width`` bits, and the quadrant handling discussed above adds one more sign bit.
The only source of noise and distortion is the digital quantization noise when storing sine values with a fixed-point representation in memory.
Hence the result from this entity has SNDR and SFDR equal to ``memory_data_width + 1`` ENOB.


.. _sine_generator.taylor_expansion_core:

taylor_expansion_core.vhd
-------------------------

`View source code on GitHub `__.

.. symbolator::

  component taylor_expansion_core is
    generic (
      sinusoid_width : positive;
      error_factor_width : positive;
      error_factor_fractional_width : positive;
      result_width : positive;
      minus_derivative : boolean
    );
    port (
      clk : in std_ulogic;
      --# {{}}
      input_value : in u_signed;
      input_derivative : in u_signed;
      input_error_factor : in u_signed;
      --# {{}}
      result_value : out u_signed
    );
  end component;

Core that calculates the first-order Taylor expansion of a sinusoid function

.. math::

  f(x) \approx f(a) + e \times f'(a)

with fixed-point numbers that fit in a DSP48.
This core is to be used in :ref:`sine_generator.sine_calculator`, and is not really suitable for other purposes.

.. warning::
  This is an internal core.
  The interface might change without notice.
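
The effect of the first-order correction can be illustrated numerically with a floating-point Python sketch.
This is not a model of the fixed-point DSP48 implementation: it ignores quantization, saturation and the half-LSB phase offset of the lookup table, and the ``first_order_taylor`` name as well as the chosen address and phase error values are assumptions made for this example.

```python
import math


def first_order_taylor(value: float, derivative: float, error: float) -> float:
    """Approximate f(a + error) as f(a) + error * f'(a) (hypothetical helper)."""
    return value + error * derivative


memory_address_width = 8  # Assumed example value.
# Phase step per memory address, in radians.
step = (math.pi / 2) / 2**memory_address_width

address = 37  # Integer part of the phase, used as lookup address (assumed).
phase_error = 0.4  # Truncated fractional part of the phase, in address LSBs (assumed).

exact = math.sin(step * (address + phase_error))
lookup_only = math.sin(step * address)
corrected = first_order_taylor(
    value=math.sin(step * address),
    derivative=step * math.cos(step * address),
    error=phase_error,
)

print(abs(lookup_only - exact))
print(abs(corrected - exact))
```

With these example values, the remaining error after correction is dominated by the second-order term of the expansion, which is why it is far smaller than the error of the plain truncated lookup.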