HDL Implementation of LTE OFDM Modulator and Detector

This example shows how to build an LTE-compliant OFDM Modulator and Detector for implementation with HDL Coder™, and how to use LTE System Toolbox™ to verify the HDL implementation model.

Introduction

This example addresses real-world problems associated with implementing OFDM modulation and detection for HDL code generation. The modulator includes DC carrier insertion, cyclic prefix insertion and windowing while the detector implements frequency recovery, Primary Synchronization Signal (PSS) and Secondary Synchronization Signal (SSS) detection to determine the physical layer cell identity. LTE System Toolbox™ is used to verify the functionality of the HDL models by providing input stimulus and golden reference output waveforms. Diagrams of the modulator and detector are shown below.

LTE Modulator Structure

LTE Detector Structure

The LTE Modulator HDL subsystem takes a pre-generated Long-Term Evolution (LTE) downlink (DL) resource grid that is created using the LTE System Toolbox, and performs OFDM modulation in accordance with the LTE standard. The modulator subsystem is parameterizable, and supports all standard LTE channel bandwidths. To verify the output of the HDL implementation, a golden reference, OFDM modulated waveform is created using the LTE System Toolbox lteTestModel and lteTestModelTool functions. The default golden reference waveform that is used for verification is generated in accordance with E-UTRA test model (E-TM) 1.1, and a bandwidth of . For more information on E-TM 1.1, see Clause 6 of [ 2 ].

Details of the supported bandwidths, as well as associated IFFT lengths and sample rates, are provided in the table below.

A floating-point channel model, Channel, is used to introduce a frequency offset, attenuation, channel noise and time delay in order to demonstrate the operation of the receiver. The LTE_Detector_HDL subsystem implements the initial stages of OFDM receiver functionality to identify the LTE Cell ID.

LTE System Toolbox Functions

Various functions from the LTE System Toolbox are used in this example to generate a golden reference modulated waveform that is used to verify the HDL implementation. Those functions, and their location within the example, are highlighted below. More information on each function can be found by following the link in the function name.

CP_Extension_Windowing: Schedules the CP extension and performs the overlap and add windowing operation

Filtering: Filters the transmit signal to ensure that it meets the spectral mask requirements.

The structure of the Channel subsystem is shown below. The channel model consists of additive white Gaussian noise (AWGN), a time delay, attenuation and a frequency offset. The signal is first converted to double, mimicking the operation of a Digital-to-Analog Converter (DAC). The additive noise, time delay, attenuation and frequency offset are then applied. Finally, the sample rate is reduced to 1.92 Msps using an FIR Decimation filter, and the signal is converted to 16-bit fixed-point data, modeling an Analog-to-Digital Converter (ADC). The Channel subsystem is only a simulation model and does not generate HDL code.

The following diagram shows the detailed structure of the LTE_Detector_HDL subsystem.

The LTE_Detector_HDL subsystem contains the following components which are described in greater detail in the HDL Implementation of LTE Detector section.

Frequency_Estimation: Estimates the frequency offset of the received signal using the cyclic prefix.

Frequency_Correction: Corrects the frequency offset determined by the Frequency_Estimation subsystem.

PSS_Detection: Performs cross-correlation of the received signal with three possible Primary Synchronization Signals (PSSs) to determine the cell identity within the group, and to calculate the received timing offset

Timing_Adjustment: Uses the calculated timing offset value from PSS_Detection to schedule the input of data to the FFT

SSS_Detection: Performs a dot product of the received frequency-domain samples and the 168 possible Secondary Synchronization Signals (SSSs) to determine the cell group.

Determine_Cell_ID: Calculates the LTE cell identity from the detected cell group (from SSS detection), and the detected position within the cell group (from PSS detection).

HDL Implementation of LTE Modulator (LTE_Modulator_HDL)

The HDL Implementation of LTE Modulator contains OFDM_Symbol_Mapping, FFT Shift, IFFT, CP_Extension_Windowing, and Filtering blocks which are described in detail in the following sections.

1 - OFDM_Symbol_Mapping

The IFFT_Subcarrier_Counter generates a counter signal in the range of 0 to IFFT Length, where each value corresponds to one IFFT bin (or OFDM subcarrier). The remainder of the logic in the top level of the OFDM Symbol Mapping block maps the LTE DL resource grid data to the central subcarriers of the IFFT, zeros the DC (zero frequency) subcarrier, and reserves the correct number of samples for the CP extension operation. A valid out signal is also generated to indicate the validity of data further down the signal processing path. Because there is a pipeline delay associated with the symbol mapping stage, the valid signal is delayed to match the pipeline delay of the data path.

IFFT_Subcarrier_Counter: The IFFT_Subcarrier_Counter is shown in the following figure. In an LTE system there are 7 OFDM symbols per slot, the first of which has a longer CP length than the remaining 6 symbols. It is therefore necessary to vary the number of samples which are reserved for the CP extension process. This is implemented using the combination of two counters - one to count the subcarrier number plus current CP length, and a second to count the symbol number. The IFFT subcarrier port outputs the current IFFT subcarrier number (in the range of 0 to IFFT size - 1) for valid data samples, and IFFT size for samples reserved for the CP.
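The two-counter scheme described above can be sketched in Python as follows. This is only an illustration, not the Simulink logic itself: the function name is hypothetical, normal cyclic prefix is assumed (CP lengths of 160 and 144 samples at an IFFT size of 2048, scaled in proportion for smaller sizes), and the ordering of data samples before the reserved CP samples within each symbol is an assumption.

```python
def ifft_subcarrier_counter(ifft_size, symbols_per_slot=7):
    """Yield one counter value per clock cycle for a single slot.

    Values 0..ifft_size-1 mark valid data samples; the value ifft_size
    marks samples reserved for the cyclic prefix (CP).
    """
    # Normal-CP lengths scale with the IFFT size (160/144 at size 2048).
    first_cp = 160 * ifft_size // 2048
    other_cp = 144 * ifft_size // 2048
    for symbol in range(symbols_per_slot):           # second counter: symbol number
        cp_len = first_cp if symbol == 0 else other_cp
        for count in range(ifft_size + cp_len):      # first counter: subcarrier + CP
            # Assumption: data samples come first, reserved CP samples after.
            yield min(count, ifft_size)


slot = list(ifft_subcarrier_counter(128))
```

For an IFFT size of 128 the slot contains 7 x 128 data samples plus 10 + 6 x 9 reserved samples, which matches the 960 samples per slot at 1.92 Msps.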

2 - FFT_Shift

For the OFDM modulated LTE signal to be correctly aligned within the frequency spectrum (centrally aligned with the DC subcarrier at zero frequency), the input to the IFFT operation must be reordered so that the DC subcarrier sample is aligned with the first IFFT bin. This is achieved using an fftshift operation. This subsystem implements a hardware-optimized version of the fftshift operation. The incoming OFDM symbol samples are written to a Dual Port RAM block with an addressable range of 2 x IFFT size. After an initial latency of one IFFT frame, the first FFT shifted samples can be read from RAM. The correct read address for the FFT shifted samples is computed by the Compute_Read_Address subsystem. Because there is a single clock cycle delay associated with reading data from the RAM block, the valid signal is delayed accordingly.

Compute_Read_Address: The Compute_Read_Address subsystem is shown in the figure below. When the rdEnable signal goes high, a counter is triggered which counts one full IFFT frame. As the RAM is simultaneously having new samples written in, and older samples read out, it is important to ensure that the samples being read are not from the same half of the RAM that is currently being written to. The logic at the bottom half of the subsystem ensures that this is the case by toggling or "ping/ponging" between two initial read addresses. The fftshift operation is implemented using a bitwise XOR of the IFFT index counter value and the initial read address. The XOR operation between the IFFT index value and the read address of IFFT size/2 flips the MSB of the index value, while an XOR operation with IFFT Size flips the ping/pong between the two halves of the RAM.
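The XOR-based addressing can be checked with a short Python sketch. The function name is hypothetical, and the choice of which RAM half is read for even and odd frames is an assumption made for illustration. The sketch relies on the IFFT size being a power of two, so that XOR with IFFT size/2 flips the most significant bit of the index.

```python
def fftshift_read_addresses(ifft_size, frame):
    """Read addresses for one frame from a RAM of depth 2 * ifft_size."""
    # Ping/pong: even frames read the lower half, odd frames the upper half
    # (assumption; the writer is taken to be filling the other half).
    base = (frame % 2) * ifft_size
    # The initial read address toggles between ifft_size/2 and 3*ifft_size/2.
    initial = base | (ifft_size // 2)
    # XOR with ifft_size/2 flips the MSB of the index, which swaps the two
    # halves of the frame exactly as fftshift does.
    return [index ^ initial for index in range(ifft_size)]


ram = list(range(16))  # two frames of 8 samples written to addresses 0..15
frame0 = [ram[a] for a in fftshift_read_addresses(8, 0)]
frame1 = [ram[a] for a in fftshift_read_addresses(8, 1)]
```

Reading frame 0 returns the samples written to addresses 4 to 7 followed by those at 0 to 3, which is the fftshift ordering.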

3 - IFFT

The Inverse Fast Fourier Transform (IFFT) is implemented using the IFFT HDL Optimized block which provides hardware speed and area optimization for streaming data applications. See IFFT HDL Optimized for more information on the functionality of the block.

4 - CP_Extension_Windowing

The Compute_CP_Pass_Through_Index subsystem detects the end of the current OFDM symbol, determines the length of the CP that corresponds to that symbol and subtracts it from the IFFT length to create the fftMinusCP output. This value is used to schedule the CP extension. Another output, fftMinusCPandWindow, is created and used to schedule the windowing operations. The Check Window Multiply subsystem schedules the windowing operation by creating control and indexing signals, which are used by the Multiply_by_Window subsystem to multiply the "head" and "tail" of the OFDM symbol by raised-cosine windowing samples.

Compute_CP_Pass_Through_Index: As the length of the CP is longer for the first symbol in each slot, it is necessary to detect the end of each OFDM symbol. The end of the symbol can be determined by the change in level of the validIn signal. The detection logic creates a strobe which is high for a single clock cycle which, in turn, enables the OFDM Symbol Counter to advance.

Multiply_by_Window: To comply with the functionality of the lteOFDMModulate function in the LTE System Toolbox, the windowing process is split into two subprocesses (one for the head of the symbol, and one for the tail) to allow a windowing length that is greater than that of the CP. If the windowing length were constrained to be less than or equal to that of the CP, this process could be optimized to use a single counter and Lookup Table (LUT) combination.

The blocks which multiply the head of the OFDM symbol are situated at the top of the subsystem. Because the symbol is preceded by its cyclic prefix, depending on the window length, the windowing of the head may be applied to only the CP samples, and not those that make up the symbol itself. For the relevant CP data samples (0 to window length), an up-counter addresses an LUT containing the pre-calculated raised-cosine windowing samples. The output of the RC Window Lookup is then multiplied by the incoming CP data sample. All non-windowed samples are multiplied by 1.

The blocks which perform the window multiplication on the tail of the OFDM symbol are situated in the bottom half of the subsystem. In a similar fashion to the windowing of the head samples, for the relevant tail samples (from IFFT length - window length - 1 to IFFT length - 1), a down-counter addresses an LUT containing the windowing samples. The output of the RC Window Lookup is then multiplied by the incoming tail data sample. All non-windowed samples are multiplied by 1.

Overlap_and_Add: The second part of the windowing process is the overlap and add. In this subsystem, the tail-end of the current OFDM symbol is overlapped with the head of the next OFDM symbol, and summed together. A counter counts the number of OFDM and CP samples which have been modulated over an entire LTE radio frame (plus an additional window length to compensate for the capturing of the head of the radio frame), in order to determine the start and end of the frame. This is necessary to schedule the capture of the head from the start of the radio frame in the headRAM, and to overlap and add it with the tail at the end of the radio frame. When the control signal Add Window is high, and the totSampleCount value is less than the total number of samples in the radio frame, the output of the subsystem is the sum of the overlapping head and tail samples from consecutive OFDM symbols. When Add Window is high, and the totSampleCount value is greater than the total number of samples in the radio frame, the output of the subsystem is the sum of the overlapping head samples stored in headRAM and the windowed tail samples of the radio frame. When Add Window is low, the output of the subsystem is un-windowed symbol samples.
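The head and tail windowing followed by overlap and add can be illustrated with a minimal floating-point sketch. The function names and the exact raised-cosine sample positions are assumptions chosen for illustration; they are not taken from lteOFDMModulate or from the HDL model. The wrap-around of the head of the radio frame through headRAM is not modeled.

```python
import math


def raised_cosine_ramp(window_len):
    """Rising raised-cosine ramp; the falling ramp is its mirror image."""
    return [0.5 * (1.0 - math.cos(math.pi * (n + 0.5) / window_len))
            for n in range(window_len)]


def apply_window(block, window_len):
    """Multiply the head by the rising ramp and the tail by the falling ramp."""
    ramp = raised_cosine_ramp(window_len)
    out = list(block)                 # non-windowed samples are multiplied by 1
    for n in range(window_len):
        out[n] *= ramp[n]             # head: index counts up through the table
        out[-1 - n] *= ramp[n]        # tail: same table, indexed from the end
    return out


def overlap_and_add(blocks, window_len):
    """Overlap the tail of each block with the head of the next and sum."""
    out = list(blocks[0])
    for block in blocks[1:]:
        start = len(out) - window_len
        for n in range(window_len):
            out[start + n] += block[n]
        out.extend(block[window_len:])
    return out


windowed = apply_window([1.0] * 10, 4)
result = overlap_and_add([windowed, windowed], 4)
```

Because the rising and falling ramps sum to one at every overlapping position, a constant-amplitude input stays at constant amplitude across the symbol boundary.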

5 - Filtering

The Filtering subsystem implements a lowpass FIR filter to ensure that the modulated LTE OFDM waveform meets the spectral mask requirements outlined by the LTE standard.

HDL Implementation of LTE Detector (LTE_Detector_HDL)

The HDL Implementation of LTE Detector contains frequency recovery, time domain PSS detection and frequency domain SSS detection. The final cell ID is then computed from the PSS and SSS detection results. The following sections describe the subsystems which perform these functions in detail.

1 - Frequency Recovery

Frequency recovery is performed in two stages: frequency estimation and frequency correction.

Frequency_Estimation

The frequency offset is measured by exploiting the cyclic prefix of the OFDM signal. The Cyclic_Prefix_Correlator subsystem generates a complex valued correlation signal which has magnitude peaks at the end of each OFDM symbol. The phase angle of the correlation signal at the peaks is proportional to the frequency offset. The correlation signal is converted to magnitude and angle values by the Rect_to_Polar subsystem. The Angle_at_Maximum subsystem then searches for the maximum correlation magnitude and records the corresponding phase angle every 960 samples. Finally the phase angles are filtered with a simple IIR low pass filter to smooth the result, generating the final frequency estimate.

Cyclic_Prefix_Correlator

The Cyclic_Prefix_Correlator is shown below. It uses a well-known technique for detecting OFDM symbols: the signal is cross-correlated with a delayed version of itself, and a moving average filter with an averaging length set to the cyclic prefix length is then applied. The cyclic prefix length is 10 samples for the first OFDM symbol in a slot and 9 samples for the remaining symbols. This example uses an averaging length of 8 to generate efficient HDL code. The correlation result is then averaged further in the Slot_Average subsystem.

Angle_at_Maximum

The Angle_at_Maximum subsystem measures the maximum correlation magnitude, and the corresponding phase angle, every 960 samples. This means that the phase angle from the strongest correlation peak is registered and used to estimate the frequency offset.
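The cyclic prefix correlation and angle-at-maximum steps can be sketched in floating-point Python. This is a simplified illustration under stated assumptions: the function name is hypothetical, the Slot_Average stage and the IIR smoothing are omitted, and the test signal uses unit-magnitude samples with random phases in place of a true OFDM waveform. The only property the sketch depends on is that each cyclic prefix repeats the last samples of its symbol.

```python
import cmath
import math
import random


def estimate_frequency_offset(rx, fft_size, avg_len, sample_rate):
    """Return the offset (Hz) taken from the strongest CP-correlation peak."""
    # Correlate each sample with the one fft_size samples earlier.
    products = [rx[n] * rx[n - fft_size].conjugate()
                for n in range(fft_size, len(rx))]
    best_mag, best_angle = 0.0, 0.0
    acc = 0j
    for n, p in enumerate(products):
        acc += p
        if n >= avg_len:
            acc -= products[n - avg_len]   # moving sum over avg_len samples
        if abs(acc) > best_mag:            # angle at maximum magnitude
            best_mag, best_angle = abs(acc), cmath.phase(acc)
    # At the peak the phase equals 2*pi*f*fft_size/sample_rate.
    return best_angle * sample_rate / (2.0 * math.pi * fft_size)


# One slot at 1.92 Msps: 7 symbols of 128 samples with CP lengths 10, 9, ..., 9.
rng = random.Random(0)
rx = []
for symbol in range(7):
    body = [cmath.rect(1.0, rng.uniform(-math.pi, math.pi)) for _ in range(128)]
    cp_len = 10 if symbol == 0 else 9
    rx.extend(body[-cp_len:] + body)
# Apply a 1 kHz frequency offset.
rx = [s * cmath.exp(2j * math.pi * 1000.0 * n / 1.92e6) for n, s in enumerate(rx)]
estimate = estimate_frequency_offset(rx, 128, 8, 1.92e6)
```

With a delay of 128 samples at 1.92 Msps, the unambiguous estimation range of this method is plus or minus half the 15 kHz subcarrier spacing.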

Frequency_Correction

The Frequency_Correction subsystem uses a Numerically Controlled Oscillator (NCO) to generate a complex phasor at the estimated frequency. The complex conjugate of this signal is then multiplied by the received signal to correct the frequency offset.
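A floating-point stand-in for the correction step might look like the following. The function name is hypothetical, and the sketch replaces the fixed-point phase accumulator and sine lookup of a hardware NCO with direct trigonometric evaluation.

```python
import cmath
import math


def correct_frequency(rx, freq_estimate, sample_rate):
    """Multiply rx by the conjugate of an NCO phasor at freq_estimate."""
    phase_step = 2.0 * math.pi * freq_estimate / sample_rate
    phase = 0.0
    out = []
    for sample in rx:
        nco = cmath.rect(1.0, phase)                      # complex phasor
        out.append(sample * nco.conjugate())              # remove the rotation
        phase = (phase + phase_step) % (2.0 * math.pi)    # accumulator wraps
    return out


# A 1 kHz tone at 1.92 Msps is rotated back to a constant value.
tone = [cmath.exp(2j * math.pi * 1000.0 * n / 1.92e6) for n in range(100)]
corrected = correct_frequency(tone, 1000.0, 1.92e6)
```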

2 - PSS_Detection

The PSS_Detection subsystem is split into three further subsystems: Cross_Correlation, Peak_Detection, and Determine_Cell_ID and Offset. PSS detection is performed for two main reasons:

To determine the physical layer identity within the LTE cell group; and

To determine the position of the PSS within the received signal for timing adjustment

Cross_Correlation: The Cross_Correlation subsystem is shown below. The subsystem cross-correlates the received data signal with each of the three possible time-domain PSS sequences, and then computes the square magnitude of each of the cross-correlations. The cross-correlation is performed via matched filtering implemented as a fully-serial FIR filter, in order to optimize the required hardware area, and to take advantage of the low sampling rate of 1.92 Msps. A threshold signal is also generated by calculating the average power of the received signal via an averaging filter. The threshold signal is used in the Peak_Detection subsystem.
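The matched-filter search over the three PSS candidates can be sketched as below. The Zadoff-Chu root indices (25, 29, 34) and the sequence definition follow 3GPP TS 36.211; everything else is an illustrative assumption, including the function names, the direct (non-serial) filter evaluation, the plain 128-point inverse DFT used to obtain the time-domain sequences, and the noise-free test signal. The threshold calculation is not modeled here.

```python
import cmath
import math

ZC_ROOTS = (25, 29, 34)  # root index for identity 0, 1, 2 within the group


def pss_frequency_domain(n_id2):
    """62-sample frequency-domain PSS (Zadoff-Chu with the centre element removed)."""
    u = ZC_ROOTS[n_id2]
    seq = []
    for n in range(62):
        m = n if n < 31 else n + 1
        seq.append(cmath.exp(-1j * math.pi * u * m * (m + 1) / 63.0))
    return seq


def pss_time_domain(n_id2, fft_size=128):
    """Map the PSS around a zeroed DC bin and take an inverse DFT."""
    d = pss_frequency_domain(n_id2)
    bins = [0j] * fft_size
    for n in range(31):
        bins[fft_size - 31 + n] = d[n]   # subcarriers below DC
        bins[1 + n] = d[31 + n]          # subcarriers above DC
    return [sum(bins[k] * cmath.exp(2j * math.pi * k * t / fft_size)
                for k in range(fft_size)) / fft_size
            for t in range(fft_size)]


def cross_correlate(rx, reference):
    """Squared magnitude of the matched-filter output at each lag."""
    taps = len(reference)
    return [abs(sum(rx[start + i] * reference[i].conjugate()
                    for i in range(taps))) ** 2
            for start in range(len(rx) - taps + 1)]


# Noise-free demonstration: embed identity 1 at an offset of 20 samples.
rx = [0j] * 20 + pss_time_domain(1) + [0j] * 20
peaks = []
for n_id2 in range(3):
    corr = cross_correlate(rx, pss_time_domain(n_id2))
    lag = max(range(len(corr)), key=corr.__getitem__)
    peaks.append((corr[lag], n_id2, lag))
peak_power, detected_id, timing_offset = max(peaks)
```

The candidate with the largest peak gives the identity within the cell group, and the lag of that peak gives the timing offset.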

Peak_Detection: The Peak_Detection subsystem checks if the cross-correlation output from each of the three matched filters exceeds the threshold value. If the threshold value is exceeded, a local search is performed on the current sample, and the following 9 successive samples. This ensures that the exact sample for the peak of the cross-correlation is identified, and not an adjacent sample of similar power.
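The threshold test and local search can be sketched as follows. The function name and the list-based interface are hypothetical; a hardware implementation would process one sample per clock instead of slicing a stored sequence.

```python
def detect_peak(correlation, threshold, search_len=10):
    """Return (index, value) of the first peak above threshold, or None."""
    for n, value in enumerate(correlation):
        if value > threshold[n]:
            # Local search over the current sample and the following 9 samples.
            window = correlation[n:n + search_len]
            best = max(range(len(window)), key=window.__getitem__)
            return n + best, window[best]
    return None


corr = [0.0, 1.0, 0.0, 5.0, 7.0, 9.0, 8.0, 2.0, 0.0, 0.0, 0.0, 0.0]
peak = detect_peak(corr, [4.0] * len(corr))
```

In this example the threshold is first exceeded at index 3, but the local search moves the reported peak to index 5, where the correlation is largest.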

Determine_Primary_Cell_ID: In order to provide output data based on the PSS detection process, logic is implemented to determine a number of variables:

The peak power of the PSS detection

The physical layer identity within the LTE cell group

The timing offset value

The cross-correlation output over time for the detected PSS, and the corresponding PSS detection strobe are also computed.

3 - Timing_Adjustment

The Timing Adjust subsystem is responsible for ensuring that the input to the FFT operation is correctly aligned in time. This is required so that the correct receive samples are matched with the correct FFT subcarriers, thus ensuring that the demodulated signal contains the proper frequency domain samples. The subsystem uses the PSS Detected input strobe to create a boolean valid signal for enabling downstream blocks. As the PSS strobe is only triggered once during the 128 samples of the first time-domain PSS sequence (mapped to OFDM symbol 6 in slot 0), the strobe indicates the position in time of the received signal. With the exact position of the last PSS sample known, and the knowledge that the first SSS sequence is mapped to the previous OFDM symbol (symbol 5 in slot 0), the first input to the FFT can be adjusted to line up exactly with the 128 samples of the time-domain SSS sequence by applying an appropriate delay. The value of the delay is calculated as:

4 - FFT

The Fast Fourier Transform (FFT) is implemented using the FFT HDL Optimized block which provides hardware speed and area optimization for streaming data applications. See FFT HDL Optimized for more information on the functionality of the block.

The output of the FFT is the demodulated OFDM symbols which correspond to the six central resource blocks of the LTE transmission. The FFT frame output contains the frequency-domain SSS which is detected in software in the post-simulation processing.

5 - SSS_Detection

The SSS_Detection is split into three subsystems: Sample_and_Store, SSS_Sequence_Update, and SSS_Dot_Product.

The SSS search determines the physical layer cell identity group. By using the value provided by the PSS detection in the HDL implementation, the physical layer cell identity within the cell group is known, reducing the number of possible SSS sequences from 504 to 168. To determine the physical layer cell identity group, the received demodulated signal is correlated with the 168 possible SSS sequences, using the dot product described in the SSS_Dot_Product section. The correct cell identity group is given by the SSS sequence which provides the highest correlation output.

Sample_and_Store:

Because of the Timing_Adjustment stage, the first (valid) 128 samples that are emitted from the FFT operation contain the transmitted SSS sequence (located in symbol 5 of slot 0). It is therefore necessary to store those 128 samples so that they can be compared to the 168 possible SSS sequences. The Sample_and_Store subsystem uses the Sample Counter to count the incoming data samples. The first 128 samples that are received are written into RAM. Those samples are also passed to the dataOut output as valid output samples. The output of the Sample Counter is also used to increment the Group Counter, which keeps track of the current cell group (0 to 167). After the first 128 samples have been written to RAM, the output of Sample Counter is used to generate the read address of the RAM, and the 128 stored samples are repeatedly read out of RAM to the dataOut output until all 168 cell groups have been covered.

The transmitted SSS sequence consists of 62 samples, plus one zero-valued sample for the DC subcarrier. The output of the FFT is such that the SSS sequence is split between the first 32, and last 31 samples. A validOut signal is created in accordance with this structure.
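The split between the first 32 and last 31 FFT output samples follows from the standard FFT bin ordering, as this short sketch shows. The function names are hypothetical, and treating the DC bin as flagged valid (63 flagged samples in total) is an assumption about the validOut signal.

```python
def sss_bin_indices(fft_size=128):
    """FFT output bins that carry the 62 SSS samples, in subcarrier order."""
    below_dc = list(range(fft_size - 31, fft_size))  # last 31 bins
    above_dc = list(range(1, 32))                    # bins 1..31
    return below_dc + above_dc


def valid_out(fft_size=128):
    """Per-bin flag: first 32 bins (DC included) and last 31 bins."""
    return [k < 32 or k >= fft_size - 31 for k in range(fft_size)]


bins = sss_bin_indices()
flags = valid_out()
```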

SSS_Sequence_Update:

The SSS_Sequence_Update subsystem is responsible for providing the appropriate SSS sequence samples for a given LTE Cell Group (SSS) and the position within that group (PSS).

All 504 possible SSS sequences are stored in a 2-dimensional LUT of size 62x504. As the imaginary component of all frequency-domain SSS sequences is zero-valued, only the real-valued component is stored in the LUT. Based on the detected PSS sequence (0 to 2) and the current cell group (0 to 167), the correct column of the LUT is addressed. The 62 rows that contain the corresponding SSS samples are addressed by the output of the Sample Counter. After all 62 samples of a given SSS sequence have been emitted from the LUT, a strobe is provided on the endOfGroup output.

SSS_Dot_Product

As the exact position of the transmitted SSS sequence is known, cross-correlation is not required to detect the transmitted SSS sequence. Instead, the dot product between the relevant samples from the received signal, and the corresponding samples from all possible SSS sequences is used.

As frequency-domain SSS samples only take the values of {1, -1}, a multiplier is not required. Instead, the multiplication by -1 can be calculated by inverting the sign of the received sample. This is implemented by splitting the received signal into its real- and imaginary-valued components, and performing a Bitwise NOT operation on each, before recombining into a complex-valued signal. The accumulator output is reset on the reception of a strobe signal from the SSS_Sequence_Update subsystem. As only the final output of the dot product operation is required to determine the transmitted SSS sequence, the same strobe signal is used to generate the validOut output.
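The multiplier-free dot product can be illustrated with the following sketch. The function names and the short four-sample candidate sequences are hypothetical and used only to keep the example small. The sketch uses exact negation on floating-point values; on two's-complement fixed-point data a bitwise NOT yields -x - 1, which differs from exact negation by one least significant bit.

```python
def sss_dot_product(received, sss_signs):
    """Accumulate received samples, negating those where the SSS value is -1."""
    acc = 0j
    for sample, sign in zip(received, sss_signs):
        acc += sample if sign > 0 else -sample   # sign flip, no multiplier
    return acc


def best_group(received, candidates):
    """Index of the candidate with the largest dot-product power, and all powers."""
    powers = [abs(sss_dot_product(received, c)) ** 2 for c in candidates]
    return max(range(len(powers)), key=powers.__getitem__), powers


candidates = [[1, 1, 1, 1], [1, -1, 1, -1], [-1, -1, 1, 1]]
received = [2 + 0j, -2 + 0j, 2 + 0j, -2 + 0j]
group, powers = best_group(received, candidates)
```

Using the power of the accumulated value, and not its real part, keeps the decision independent of any residual phase rotation on the received samples.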

6 - Determine_Cell_ID

The Determine_Cell_ID subsystem uses the output of SSS Detection and PSS Detection to determine the cell identity of the transmitted LTE signal.

The transmitted SSS sequence is chosen as the SSS sequence which produces the maximum power output from the dot product.

With both the physical layer cell identity group (SSS sequence), and the position within the group (PSS sequence) identified, the full physical layer cell identity, $N_{ID}^{cell}$, is calculated as:

$$N_{ID}^{cell} = 3 N_{ID}^{(1)} + N_{ID}^{(2)}$$

$N_{ID}^{(1)}$ is the physical layer cell identity group (0 to 167)

$N_{ID}^{(2)}$ is the identity within the group (0 to 2)
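As a worked example, the LTE cell identity is three times the cell identity group plus the identity within the group, which gives 504 identities in the range 0 to 503. The function name below is hypothetical.

```python
def cell_id(group, identity_in_group):
    """Physical layer cell identity from the SSS group and the PSS identity."""
    if not 0 <= group <= 167:
        raise ValueError("cell identity group must be in 0..167")
    if not 0 <= identity_in_group <= 2:
        raise ValueError("identity within the group must be in 0..2")
    return 3 * group + identity_in_group


example = cell_id(10, 1)
```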

Results & Displays

After running the simulation, the model displays three different figures illustrating the outputs and results. These figures are shown below, along with an explanation of each plot. The first plot illustrates the output of the LTE OFDM Modulator, the second plot displays the output of the PSS detection in the LTE OFDM Detector, while the final plot provides various text-based results from the final stages of the detector.

Transmitted LTE waveform Plot

The following plot illustrates the output of the LTE OFDM Modulator, and is split into two subplots:

Power Spectral Density (PSD) Plot: This subplot shows the Power Spectral Density (PSD) of the output of the LTE OFDM Modulator. The result is plotted on top of the PSD of the golden reference output signal which was generated using the LTE System Toolbox to visually show the equivalence of the two signals. The LTE transmission bandwidth, BW, is also displayed in the figure title. For the figure shown below, a transmission bandwidth of BW = 5 MHz was used.

Time-Domain Plot: This subplot shows a zoomed-in portion of the absolute-valued output of the LTE OFDM Modulator over time. The result is plotted on top of the same portion of the absolute-valued golden reference output signal which was generated using the LTE System Toolbox to visually show the equivalence of the two signals. In order to further compare the output to that of the golden reference signal, a third signal is plotted which shows the absolute-valued difference between output of the HDL Implementation and the golden reference signal (i.e. abs(LSTreference - HDLImplementation)). This illustrates the minimal error between the two signals.

Frequency Estimate Plot

The following plot shows the frequency estimate converging. A new frequency estimate is generated every 960 samples, corresponding to the slot period. A one-pole filter is used to smooth the estimates, which explains the curve shown in the plot.
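The exponential shape of the convergence curve follows directly from the one-pole filter, as the sketch below illustrates. The function name and the coefficient value of 0.125 are assumptions chosen for illustration; the coefficient used in the model is not stated here.

```python
def one_pole_smooth(estimates, alpha=0.125):
    """One-pole IIR lowpass: y[n] = y[n-1] + alpha * (x[n] - y[n-1])."""
    y = 0.0
    out = []
    for x in estimates:
        y += alpha * (x - y)
        out.append(y)
    return out


# A constant raw estimate of 1 kHz, one value per slot.
smoothed = one_pole_smooth([1000.0] * 50)
```

Each new slot estimate moves the output a fixed fraction of the way towards the input, so the smoothed value approaches the true offset along an exponential curve.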

PSS Cross-Correlation Plot

The following plot illustrates the output of the PSS Detection. Shown in the plot is the power output of the cross-correlation of the received signal with the detected PSS sequence over time. The average threshold signal is also plotted to illustrate the identification process. The visible peaks which exceed the threshold indicate that this PSS sequence has been detected. Two peaks are visible as the PSS sequence is transmitted twice per LTE radio frame.

SSS Cross-Correlation Plot

The following plot shows the output of the SSS_Detection. Illustrated in the plot is the power output of the dot product between the received signal and the possible SSS sequences. The values on the X-axis correspond to the possible SSS sequences. For each SSS sequence, only the final value of the dot product is plotted. The SSS sequence with the largest dot product power output is chosen as the transmitted sequence.

Final Simulation Outputs

The following plot provides text-based results for the final stages of the simulation. It is split into three sections:

Detected Cell ID: displays the final results of the LTE cell search.

PSS Cross-Correlation Values: displays the Peak cross-correlation power and Timing offset from the PSS Detection.