sionna-part1
Part 1: Getting Started with Sionna
This part will guide you through Sionna, from its most basic principles to the implementation of an end-to-end link with 5G NR compliant codes and a 3GPP channel model. You will also learn how to write custom trainable layers by implementing a state-of-the-art neural receiver, and how to train and evaluate end-to-end communication systems.
Imports & Basics
# Import TensorFlow and NumPy
import tensorflow as tf
import numpy as np
# Import Sionna
try:
    import sionna as sn
except ImportError as e:
    # Install Sionna if the package is not already installed
    import os
    os.system("pip install sionna")
    import sionna as sn
# For plotting
%matplotlib inline
# also try %matplotlib widget
import matplotlib.pyplot as plt
# For performance measurements
import time
# For the implementation of the Keras models
from tensorflow.keras import Model
Sionna's functions can now be accessed via the sn namespace.
!nvidia-smi
Fri Mar 24 14:58:10 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.78.01 Driver Version: 525.78.01 CUDA Version: 12.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... Off | 00000000:01:00.0 On | N/A |
| 40% 31C P8 18W / 120W | 3504MiB / 6144MiB | 18% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 1100 G /usr/lib/xorg/Xorg 250MiB |
| 0 N/A N/A 1370 G /usr/bin/gnome-shell 92MiB |
| 0 N/A N/A 1918 G /usr/lib/firefox/firefox 374MiB |
| 0 N/A N/A 2957 C ...nvs/tensorflow/bin/python 2738MiB |
| 0 N/A N/A 25254 G ...tePerProcess --no-sandbox 43MiB |
+-----------------------------------------------------------------------------+
Sionna Data Flow and Design Paradigms
Sionna inherently parallelizes simulations via batching, i.e., each element in the batch dimension is simulated independently.
This means the first tensor dimension is always used for inter-frame parallelization, similar to an outer for-loop in MATLAB/NumPy simulations.
To keep the data flow efficient, Sionna follows a few simple design principles:
- Signal-processing components are implemented as individual Keras layers.
- tf.float32 is used as the preferred datatype and tf.complex64 for complex-valued datatypes, respectively. This allows simpler re-use of components (e.g., the same scrambling layer can be used for binary inputs and LLR values).
- Models can be developed in eager mode, which allows simple and fast modification of system parameters.
- Number-crunching simulations can be executed in the faster graph mode, and most components even support XLA acceleration.
- Whenever possible, components are automatically differentiable via auto-grad to simplify the deep-learning design flow.
- The code is structured into sub-packages for different tasks such as channel coding, mapping, etc.
This division into individual blocks simplifies deployment, and all layers and functions come with unit tests to ensure their correct execution.
These paradigms simplify the reliability and re-usability of the components across a wide range of communications-related applications.
Hello, Sionna!
Let's start with the simplest simulation possible: transmitting QAM symbols over an AWGN channel. We will implement the system shown in the figure below.
We will use upper-case names for all simulation parameters.
Every layer needs to be initialized once before it can be used.
Hint: most layers are defined to be complex-valued.
We first create a QAM constellation.
NUM_BITS_PER_SYMBOL = 2 # QPSK
constellation = sn.mapping.Constellation("qam", NUM_BITS_PER_SYMBOL)
constellation.show();
Task: Try changing the modulation order, e.g., to 16-QAM.
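A minimal sketch for this task, reusing the Constellation API from the previous cell (16-QAM uses 4 bits per symbol):
# 16-QAM corresponds to 4 bits per constellation symbol
constellation_16qam = sn.mapping.Constellation("qam", 4)
constellation_16qam.show();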
We then need a mapper that maps bits to constellation points. The mapper takes the constellation as a parameter.
We also need the corresponding demapper to compute log-likelihood ratios (LLRs) from the received noisy samples.
mapper = sn.mapping.Mapper(constellation=constellation)
# The demapper uses the same constellation object as the mapper
demapper = sn.mapping.Demapper("app", constellation=constellation)
Hint: You can access the signature and docstring of any component via the ? command, and print the complete class definition via the ?? operator.
You can also obtain the source code from https://github.com/nvlabs/sionna/
# Print the class definition of the Mapper
sn.mapping.Mapper??
As can be seen, the Mapper class inherits from Layer, i.e., it implements a Keras layer.
This allows you to simply build complex systems by stacking layers via the Keras functional API.
Sionna provides a binary source as a utility to sample uniform i.i.d. bits.
binary_source = sn.utils.BinarySource()
Finally, we need the AWGN channel.
awgn_channel = sn.channel.AWGN()
Sionna provides a utility function to compute the noise power spectral density $N_0$ from the energy-per-bit to noise power spectral density ratio $E_b/N_0$ in dB and various parameters such as the coderate and the number of bits per symbol.
no = sn.utils.ebnodb2no(ebno_db=10.0,
                        num_bits_per_symbol=NUM_BITS_PER_SYMBOL,
                        coderate=1.0) # Coderate set to 1 as we do uncoded transmission here
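For intuition: with a unit-energy constellation, no coding, and no resource-grid overhead, this conversion reduces to $N_0 = \left(10^{(E_b/N_0)_{\mathrm{dB}}/10} \cdot r \cdot m\right)^{-1}$, where $r$ is the coderate and $m$ the number of bits per symbol. The following sanity check sketches this relation (the closed form is an assumption for this simple uncoded setup, not a description of the general API behavior):
# Minimal sanity-check sketch (assumes unit-energy symbols, coderate r=1, m=2 bits/symbol)
ebno_lin = 10**(10.0/10)                                  # Eb/N0 = 10 dB in linear scale
no_manual = 1 / (ebno_lin * 1.0 * NUM_BITS_PER_SYMBOL)    # 1 / (Eb/N0 * r * m)
print(no_manual, float(no))                               # both should be ~0.05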
We now have all the components we need to transmit QAM symbols over the AWGN channel.
Sionna natively supports multi-dimensional tensors.
Most layers operate on the last dimension and can have arbitrary input shapes (which are preserved at the output).
BATCH_SIZE = 64 # How many examples are processed by Sionna in parallel (choose this according to your GPU memory)
bits = binary_source([BATCH_SIZE,
                      1024]) # Blocklength
print("Shape of bits: ", bits.shape)
x = mapper(bits)
print("Shape of x: ", x.shape)
y = awgn_channel([x, no])
print("Shape of y: ", y.shape)
llr = demapper([y, no])
print("Shape of llr: ", llr.shape)
Shape of bits: (64, 1024)
Shape of x: (64, 512)
Shape of y: (64, 512)
Shape of llr: (64, 1024)
In eager mode, we can directly access the value of every tensor. This makes debugging much easier.
num_samples = 8 # how many samples shall be printed
num_symbols = int(num_samples/NUM_BITS_PER_SYMBOL)
print(f"First {num_samples} transmitted bits: {bits[0,:num_samples]}")
print(f"First {num_symbols} transmitted symbols: {np.round(x[0,:num_symbols], 2)}")
print(f"First {num_symbols} received symbols: {np.round(y[0,:num_symbols], 2)}")
print(f"First {num_samples} demapped llrs: {np.round(llr[0,:num_samples], 2)}")
First 8 transmitted bits: [0. 1. 1. 1. 1. 1. 0. 1.]
First 4 transmitted symbols: [ 0.71-0.71j -0.71-0.71j -0.71-0.71j 0.71-0.71j]
First 4 received symbols: [ 0.65-0.71j -0.97-0.59j -0.66-0.57j 0.76-0.88j]
First 8 demapped llrs: [-36.62 40.06 54.82 33.54 37.23 32.19 -42.93 49.81]
Let's visualize the received noisy samples.
plt.figure(figsize=(8,8))
plt.axes().set_aspect(1)
plt.grid(True)
plt.title('Channel output')
plt.xlabel('Real Part')
plt.ylabel('Imaginary Part')
plt.scatter(tf.math.real(y), tf.math.imag(y))
plt.tight_layout()
Task: Vary the SNR to visualize its impact on the received samples.
Advanced task: Compare the LLRs obtained with "app" demapping and "maxlog" demapping. The Bit-Interleaved Coded Modulation notebook (https://nvlabs.github.io/sionna/examples/Bit_Interleaved_Coded_Modulation.html) may help with this task.
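A minimal sketch for the first task, reusing the components defined above (the Eb/N0 value of 4 dB is an arbitrary choice):
# Re-run the link at a lower Eb/N0 and plot the noisier channel output
no_low = sn.utils.ebnodb2no(ebno_db=4.0, num_bits_per_symbol=NUM_BITS_PER_SYMBOL, coderate=1.0)
y_low = awgn_channel([x, no_low])
plt.figure(figsize=(8,8))
plt.axes().set_aspect(1)
plt.grid(True)
plt.title('Channel output at Eb/N0 = 4 dB')
plt.xlabel('Real Part')
plt.ylabel('Imaginary Part')
plt.scatter(tf.math.real(y_low), tf.math.imag(y_low))
plt.tight_layout()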
Communication Systems as Keras Models
It is typically more convenient to wrap a Sionna-based communication system into a Keras model.
These models can be simply built by stacking layers using the Keras functional API.
The following cell implements the previous system as a Keras model.
The key functions to define are __init__(), which instantiates the required components, and __call__(), which performs the forward pass through the end-to-end system.
class UncodedSystemAWGN(Model): # Inherits from Keras Model
    def __init__(self, num_bits_per_symbol, block_length):
        """
        A Keras model of an uncoded transmission over the AWGN channel.

        Parameters
        ----------
        num_bits_per_symbol: int
            The number of bits per constellation symbol, e.g., 4 for QAM16.

        block_length: int
            The number of bits per transmitted message block (will be the codeword length later).

        Input
        -----
        batch_size: int
            The batch_size of the Monte-Carlo simulation.

        ebno_db: float
            The `Eb/No` value (=rate-adjusted SNR) in dB.

        Output
        ------
        (bits, llr):
            Tuple:

        bits: tf.float32
            A tensor of shape `[batch_size, block_length]` of 0s and 1s
            containing the transmitted information bits.

        llr: tf.float32
            A tensor of shape `[batch_size, block_length]` containing the
            received log-likelihood-ratio (LLR) values.
        """
        super().__init__() # Must call the Keras model initializer

        self.num_bits_per_symbol = num_bits_per_symbol
        self.block_length = block_length
        self.constellation = sn.mapping.Constellation("qam", self.num_bits_per_symbol)
        self.mapper = sn.mapping.Mapper(constellation=self.constellation)
        self.demapper = sn.mapping.Demapper("app", constellation=self.constellation)
        self.binary_source = sn.utils.BinarySource()
        self.awgn_channel = sn.channel.AWGN()

    # @tf.function # Enable graph execution to speed things up
    def __call__(self, batch_size, ebno_db):

        # no channel coding used; we set coderate=1.0
        no = sn.utils.ebnodb2no(ebno_db,
                                num_bits_per_symbol=self.num_bits_per_symbol,
                                coderate=1.0)

        bits = self.binary_source([batch_size, self.block_length]) # Blocklength set to 1024 bits
        x = self.mapper(bits)
        y = self.awgn_channel([x, no])
        llr = self.demapper([y, no])
        return bits, llr
We first instantiate the model.
model_uncoded_awgn = UncodedSystemAWGN(num_bits_per_symbol=NUM_BITS_PER_SYMBOL, block_length=1024)
Sionna provides a utility to simplify the computation and plotting of the bit error rate (BER).
EBN0_DB_MIN = -3.0 # Minimum value of Eb/N0 [dB] for simulations
EBN0_DB_MAX = 5.0 # Maximum value of Eb/N0 [dB] for simulations
BATCH_SIZE = 2000 # How many examples are processed by Sionna in parallel
ber_plots = sn.utils.PlotBER("AWGN")
ber_plots.simulate(model_uncoded_awgn,
                   ebno_dbs=np.linspace(EBN0_DB_MIN, EBN0_DB_MAX, 20),
                   batch_size=BATCH_SIZE,
                   num_target_block_errors=100, # simulate until 100 block errors occurred
                   legend="Uncoded",
                   soft_estimates=True,
                   max_mc_iter=100, # run at most 100 Monte-Carlo iterations (each with batch_size samples)
                   show_fig=True);
EbNo [dB] | BER | BLER | bit errors | num bits | block errors | num blocks | runtime [s] | status
---------------------------------------------------------------------------------------------------------------------------------------
-3.0 | 1.5803e-01 | 1.0000e+00 | 323644 | 2048000 | 2000 | 2000 | 0.0 |reached target block errors
-2.579 | 1.4707e-01 | 1.0000e+00 | 301200 | 2048000 | 2000 | 2000 | 0.0 |reached target block errors
-2.158 | 1.3537e-01 | 1.0000e+00 | 277231 | 2048000 | 2000 | 2000 | 0.0 |reached target block errors
-1.737 | 1.2307e-01 | 1.0000e+00 | 252046 | 2048000 | 2000 | 2000 | 0.0 |reached target block errors
-1.316 | 1.1200e-01 | 1.0000e+00 | 229379 | 2048000 | 2000 | 2000 | 0.0 |reached target block errors
-0.895 | 1.0077e-01 | 1.0000e+00 | 206387 | 2048000 | 2000 | 2000 | 0.0 |reached target block errors
-0.474 | 9.0603e-02 | 1.0000e+00 | 185555 | 2048000 | 2000 | 2000 | 0.0 |reached target block errors
-0.053 | 7.9782e-02 | 1.0000e+00 | 163394 | 2048000 | 2000 | 2000 | 0.0 |reached target block errors
0.368 | 7.0127e-02 | 1.0000e+00 | 143620 | 2048000 | 2000 | 2000 | 0.0 |reached target block errors
0.789 | 6.0864e-02 | 1.0000e+00 | 124650 | 2048000 | 2000 | 2000 | 0.0 |reached target block errors
1.211 | 5.2066e-02 | 1.0000e+00 | 106632 | 2048000 | 2000 | 2000 | 0.0 |reached target block errors
1.632 | 4.3844e-02 | 1.0000e+00 | 89792 | 2048000 | 2000 | 2000 | 0.0 |reached target block errors
2.053 | 3.6577e-02 | 1.0000e+00 | 74910 | 2048000 | 2000 | 2000 | 0.0 |reached target block errors
2.474 | 2.9657e-02 | 1.0000e+00 | 60737 | 2048000 | 2000 | 2000 | 0.0 |reached target block errors
2.895 | 2.4323e-02 | 1.0000e+00 | 49813 | 2048000 | 2000 | 2000 | 0.0 |reached target block errors
3.316 | 1.9339e-02 | 1.0000e+00 | 39606 | 2048000 | 2000 | 2000 | 0.0 |reached target block errors
3.737 | 1.4732e-02 | 1.0000e+00 | 30172 | 2048000 | 2000 | 2000 | 0.0 |reached target block errors
4.158 | 1.1352e-02 | 1.0000e+00 | 23249 | 2048000 | 2000 | 2000 | 0.0 |reached target block errors
4.579 | 8.1646e-03 | 1.0000e+00 | 16721 | 2048000 | 2000 | 2000 | 0.0 |reached target block errors
5.0 | 5.9590e-03 | 9.9750e-01 | 12204 | 2048000 | 1995 | 2000 | 0.0 |reached target block errors
The sn.utils.PlotBER object stores the results and allows adding additional simulations to the previous curves.
Remark: In Sionna, a block error occurs whenever the two tensors differ in at least one position in the last dimension, i.e., whenever at least one bit of a codeword is received incorrectly. The bit error rate is the total number of erroneous positions divided by the total number of transmitted bits.
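For illustration, both metrics can also be computed by hand from the tensors returned by the model; the following is a small sketch in plain TensorFlow (the PlotBER utility above already computes these quantities for you):
# Sketch: compute BER and BLER manually for the uncoded system at Eb/N0 = 4 dB
bits, llr = model_uncoded_awgn(BATCH_SIZE, 4.0)
bits_hat = tf.cast(llr > 0, tf.float32)                                          # hard decisions from the LLRs
bit_errors = tf.not_equal(bits, bits_hat)
ber = tf.reduce_mean(tf.cast(bit_errors, tf.float32))                            # errors / total bits
bler = tf.reduce_mean(tf.cast(tf.reduce_any(bit_errors, axis=-1), tf.float32))   # blocks with >= 1 error
print(f"BER: {float(ber):.4f}, BLER: {float(bler):.4f}")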
Forward Error Correction (FEC)
We now add channel coding to the transmitter to make it more robust against errors. For this, we use 5G-compliant low-density parity-check (LDPC) codes and Polar codes. You can find more information in the notebooks Bit-Interleaved Coded Modulation (BICM) and 5G Channel Coding and Rate-Matching: Polar vs. LDPC Codes: https://nvlabs.github.io/sionna/examples/Bit_Interleaved_Coded_Modulation.html and https://nvlabs.github.io/sionna/examples/5G_Channel_Coding_Polar_vs_LDPC_Codes.html
k = 12
n = 20
encoder = sn.fec.ldpc.LDPC5GEncoder(k, n)
decoder = sn.fec.ldpc.LDPC5GDecoder(encoder, hard_out=True)
Let's encode some random input bits.
BATCH_SIZE = 1 # one codeword in parallel
u = binary_source([BATCH_SIZE, k])
print("Input bits are: \n", u.numpy())
c = encoder(u)
print("Encoded bits are: \n", c.numpy())
Input bits are:
[[1. 0. 1. 0. 1. 1. 0. 0. 1. 0. 0. 0.]]
Encoded bits are:
[[1. 1. 0. 0. 1. 0. 0. 0. 1. 0. 0. 1. 0. 1. 0. 1. 0. 1. 1. 0.]]
One of Sionna's fundamental paradigms is batch processing. Thus, the example above can be executed with an arbitrary batch size to simulate batch_size codewords in parallel.
However, Sionna can do more: it supports N-dimensional input tensors and, thereby, allows processing multiple samples for multiple users and multiple antennas in a single command. Assume we want to encode batch_size codewords of length n for each of the num_users connected to each of the num_basestations. This means that in total we transmit batch_size * n * num_users * num_basestations bits.
BATCH_SIZE = 10 # samples per scenario
num_basestations = 4
num_users = 5 # users per basestation
n = 1000 # codeword length per transmitted codeword
coderate = 0.5 # coderate
k = int(coderate * n) # number of info bits per codeword
# instantiate a new encoder for codewords of length n
encoder = sn.fec.ldpc.LDPC5GEncoder(k, n)
# the decoder must be linked to the encoder (to know the exact code parameters used for encoding)
decoder = sn.fec.ldpc.LDPC5GDecoder(encoder,
                                    hard_out=True, # binary output or provide soft-estimates
                                    return_infobits=True, # or also return (decoded) parity bits
                                    num_iter=20, # number of decoding iterations
                                    cn_type="boxplus-phi") # also try "minsum" decoding
# draw random bits to encode
u = binary_source([BATCH_SIZE, num_basestations, num_users, k])
print("Shape of u: ", u.shape)
# We can immediately encode u for all users, basestations, and samples
# This all happens with a single line of code
c = encoder(u)
print("Shape of c: ", c.shape)
print("Total number of processed bits: ", np.prod(c.shape))
Shape of u: (10, 4, 5, 500)
Shape of c: (10, 4, 5, 1000)
Total number of processed bits: 200000
This works for arbitrary dimensions and allows a simple extension of the designed system to multi-user or multi-antenna scenarios.
Let's now replace the LDPC code by a Polar code.
k = 64
n = 128
encoder = sn.fec.polar.Polar5GEncoder(k, n)
decoder = sn.fec.polar.Polar5GDecoder(encoder,
                                      dec_type="SCL") # successive cancellation list (SCL) decoding
Advanced remark: The 5G Polar encoder/decoder classes directly apply rate-matching and the additional CRC concatenation. This is all done internally and is transparent to the user.
If you want to access the low-level features of the Polar codes, you can use sionna.fec.polar.PolarEncoder together with the desired decoder: sionna.fec.polar.PolarSCDecoder, sionna.fec.polar.PolarSCLDecoder, or sionna.fec.polar.PolarBPDecoder.
Further details can be found in the notebook 5G Channel Coding and Rate-Matching: Polar vs. LDPC Codes (https://nvlabs.github.io/sionna/examples/5G_Channel_Coding_Polar_vs_LDPC_Codes.html).
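To illustrate, here is a rough, hypothetical sketch of how the low-level classes might be used; the helper generate_5g_ranking and the exact constructor arguments are assumptions on our side, so please verify them against the API documentation. The low-level encoder/decoder operate on an explicit set of frozen bit positions instead of the full 5G rate-matching chain.
# Hypothetical sketch of the low-level Polar API (signatures assumed, not verified here)
k, n = 64, 128
frozen_pos, info_pos = sn.fec.polar.utils.generate_5g_ranking(k, n)  # frozen/information bit positions
encoder_ll = sn.fec.polar.PolarEncoder(frozen_pos, n)                # no rate-matching, no CRC
decoder_ll = sn.fec.polar.PolarSCLDecoder(frozen_pos, n)             # plain SCL decoding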
class CodedSystemAWGN(Model): # Inherits from Keras Model
    def __init__(self, num_bits_per_symbol, n, coderate):
        super().__init__() # Must call the Keras model initializer

        self.num_bits_per_symbol = num_bits_per_symbol
        self.n = n
        self.k = int(n*coderate)
        self.coderate = coderate
        self.constellation = sn.mapping.Constellation("qam", self.num_bits_per_symbol)

        self.mapper = sn.mapping.Mapper(constellation=self.constellation)
        self.demapper = sn.mapping.Demapper("app", constellation=self.constellation)

        self.binary_source = sn.utils.BinarySource()
        self.awgn_channel = sn.channel.AWGN()

        self.encoder = sn.fec.ldpc.LDPC5GEncoder(self.k, self.n)
        self.decoder = sn.fec.ldpc.LDPC5GDecoder(self.encoder, hard_out=True)

    #@tf.function # activate graph execution to speed things up
    def __call__(self, batch_size, ebno_db):
        no = sn.utils.ebnodb2no(ebno_db, num_bits_per_symbol=self.num_bits_per_symbol, coderate=self.coderate)

        bits = self.binary_source([batch_size, self.k])
        codewords = self.encoder(bits)
        x = self.mapper(codewords)
        y = self.awgn_channel([x, no])
        llr = self.demapper([y, no])
        bits_hat = self.decoder(llr)
        return bits, bits_hat
CODERATE = 0.5
BATCH_SIZE = 2000
model_coded_awgn = CodedSystemAWGN(num_bits_per_symbol=NUM_BITS_PER_SYMBOL,
n=2048,
coderate=CODERATE)
ber_plots.simulate(model_coded_awgn,
ebno_dbs=np.linspace(EBN0_DB_MIN, EBN0_DB_MAX, 15),
batch_size=BATCH_SIZE,
num_target_block_errors=500,
legend="Coded",
soft_estimates=False,
max_mc_iter=15,
show_fig=True,
forward_keyboard_interrupt=False);
EbNo [dB] | BER | BLER | bit errors | num bits | block errors | num blocks | runtime [s] | status
---------------------------------------------------------------------------------------------------------------------------------------
-3.0 | 2.7979e-01 | 1.0000e+00 | 573002 | 2048000 | 2000 | 2000 | 2.1 |reached target block errors
-2.429 | 2.6433e-01 | 1.0000e+00 | 541342 | 2048000 | 2000 | 2000 | 2.0 |reached target block errors
-1.857 | 2.4681e-01 | 1.0000e+00 | 505477 | 2048000 | 2000 | 2000 | 1.9 |reached target block errors
-1.286 | 2.2729e-01 | 1.0000e+00 | 465494 | 2048000 | 2000 | 2000 | 2.0 |reached target block errors
-0.714 | 2.0353e-01 | 1.0000e+00 | 416825 | 2048000 | 2000 | 2000 | 2.1 |reached target block errors
-0.143 | 1.7175e-01 | 1.0000e+00 | 351752 | 2048000 | 2000 | 2000 | 2.2 |reached target block errors
0.429 | 1.1172e-01 | 9.8900e-01 | 228794 | 2048000 | 1978 | 2000 | 2.2 |reached target block errors
1.0 | 2.1537e-02 | 4.7300e-01 | 44107 | 2048000 | 946 | 2000 | 2.0 |reached target block errors
1.571 | 1.5762e-04 | 1.0800e-02 | 4842 | 30720000 | 324 | 30000 | 29.6 |reached max iter
2.143 | 0.0000e+00 | 0.0000e+00 | 0 | 30720000 | 0 | 30000 | 28.0 |reached max iter
Simulation stopped as no error occurred @ EbNo = 2.1 dB.
As can be seen, the PlotBER class uses multiple stopping conditions and stops the simulation after no error occurred at a specific SNR point.
Task: Replace the coding scheme by a Polar encoder/decoder or by a convolutional code with Viterbi decoding.
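One possible starting point for this task, sketched by reusing the Polar5GEncoder/Polar5GDecoder constructors shown earlier (the parameter choices are purely illustrative):
# Sketch: a Polar-coded variant of the system above (only the FEC layers are swapped)
class PolarCodedSystemAWGN(CodedSystemAWGN):
    def __init__(self, num_bits_per_symbol, n, coderate):
        super().__init__(num_bits_per_symbol, n, coderate)
        # replace the LDPC encoder/decoder by their 5G Polar counterparts
        self.encoder = sn.fec.polar.Polar5GEncoder(self.k, self.n)
        self.decoder = sn.fec.polar.Polar5GDecoder(self.encoder, dec_type="SCL")

model_polar_awgn = PolarCodedSystemAWGN(NUM_BITS_PER_SYMBOL, n=256, coderate=0.5)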
Eager vs Graph Mode
So far, we have executed the examples in eager mode. This allows running TensorFlow operations as if they were written in NumPy and simplifies development and debugging.
To unleash Sionna's full performance, we need to use graph mode, which can be enabled with the function decorator @tf.function().
@tf.function() # enables graph mode of the following function
def run_graph(batch_size, ebno_db):
    # all code inside this function will be executed in graph mode, also calls of other functions
    print(f"Tracing run_graph for values batch_size={batch_size} and ebno_db={ebno_db}.") # print whenever this function is traced
    return model_coded_awgn(batch_size, ebno_db)

batch_size = 10 # try also different batch sizes
ebno_db = 1.5

# run twice - how does the output change?
run_graph(batch_size, ebno_db)
(<tf.Tensor: shape=(10, 1024), dtype=float32, numpy=
array([[0., 0., 1., ..., 1., 0., 1.],
[0., 1., 1., ..., 0., 0., 0.],
[0., 1., 1., ..., 1., 1., 0.],
...,
[0., 1., 1., ..., 0., 0., 1.],
[1., 0., 1., ..., 0., 1., 0.],
[0., 0., 0., ..., 1., 1., 0.]], dtype=float32)>,
<tf.Tensor: shape=(10, 1024), dtype=float32, numpy=
array([[0., 0., 1., ..., 1., 0., 1.],
[0., 1., 1., ..., 0., 0., 0.],
[0., 1., 1., ..., 1., 1., 0.],
...,
[0., 1., 1., ..., 0., 0., 1.],
[1., 0., 1., ..., 0., 1., 0.],
[0., 0., 0., ..., 1., 1., 0.]], dtype=float32)>)
In graph mode, Python code (i.e., non-TensorFlow code) is only executed whenever the function is traced. This happens whenever the input signature changes.
As can be seen above, the print statement was executed, i.e., the function was traced again.
To avoid this re-tracing for different inputs, we now feed tensors as inputs. You can see that the function is then only traced once for input tensors of the same dtype.
Task: Change the code above to use tensors as input, execute it with different input values, and understand when re-tracing happens.
Remark: If the input to a function is a tensor, its signature must change (not just its value) to trigger re-tracing. For example, the input could have a different size or datatype. For efficient code execution, we usually want to avoid re-tracing when it is not required.
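A minimal sketch for this task (the same pattern is used for the throughput measurement further below): when the arguments are wrapped in tf.constant, only their dtype and shape determine the trace, not their values.
# With tensor inputs, changing only the values does not trigger re-tracing
run_graph(tf.constant(10, tf.int32), tf.constant(1.5, tf.float32))  # traced once for this signature
run_graph(tf.constant(20, tf.int32), tf.constant(2.5, tf.float32))  # reuses the existing trace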
# You can print the cached signatures with
print(run_graph.pretty_printed_concrete_signatures())
run_graph(batch_size=10, ebno_db=1.5)
Returns:
(<1>, <2>)
<1>: float32 Tensor, shape=(10, 1024)
<2>: float32 Tensor, shape=(10, 1024)
run_graph(batch_size=15, ebno_db=1.5)
Returns:
(<1>, <2>)
<1>: float32 Tensor, shape=(15, 1024)
<2>: float32 Tensor, shape=(15, 1024)
Let's now compare the throughput of the different execution modes.
repetitions = 4 # average over multiple runs
batch_size = BATCH_SIZE # try also different batch sizes
ebno_db = 1.5
# --- eager mode ---
t_start = time.perf_counter()
for _ in range(repetitions):
    bits, bits_hat = model_coded_awgn(tf.constant(batch_size, tf.int32),
                                      tf.constant(ebno_db, tf.float32))
t_stop = time.perf_counter()
# throughput in Mbit/s
throughput_eager = np.size(bits.numpy())*repetitions / (t_stop - t_start) / 1e6
print(f"Throughput in Eager mode: {throughput_eager :.3f} Mbit/s")
# --- graph mode ---
# run once to trace graph (ignored for throughput)
run_graph(tf.constant(batch_size, tf.int32),
          tf.constant(ebno_db, tf.float32))
t_start = time.perf_counter()
for _ in range(repetitions):
    bits, bits_hat = run_graph(tf.constant(batch_size, tf.int32),
                               tf.constant(ebno_db, tf.float32))
t_stop = time.perf_counter()
# throughput in Mbit/s
throughput_graph = np.size(bits.numpy())*repetitions / (t_stop - t_start) / 1e6
print(f"Throughput in graph mode: {throughput_graph :.3f} Mbit/s")
Throughput in Eager mode: 0.999 Mbit/s
Tracing run_graph for values batch_size=Tensor("batch_size:0", shape=(), dtype=int32) and ebno_db=Tensor("ebno_db:0", shape=(), dtype=float32).
Throughput in graph mode: 3.723 Mbit/s
Let's run the same simulation as above in graph mode.
ber_plots.simulate(run_graph,
ebno_dbs=np.linspace(EBN0_DB_MIN, EBN0_DB_MAX, 12),
batch_size=BATCH_SIZE,
num_target_block_errors=500,
legend="Coded (Graph mode)",
soft_estimates=True,
max_mc_iter=100,
show_fig=True,
forward_keyboard_interrupt=False);
EbNo [dB] | BER | BLER | bit errors | num bits | block errors | num blocks | runtime [s] | status
---------------------------------------------------------------------------------------------------------------------------------------
-3.0 | 2.7921e-01 | 1.0000e+00 | 571830 | 2048000 | 2000 | 2000 | 0.5 |reached target block errors
-2.273 | 2.5901e-01 | 1.0000e+00 | 530447 | 2048000 | 2000 | 2000 | 0.6 |reached target block errors
-1.545 | 2.3567e-01 | 1.0000e+00 | 482655 | 2048000 | 2000 | 2000 | 0.6 |reached target block errors
-0.818 | 2.0751e-01 | 1.0000e+00 | 424979 | 2048000 | 2000 | 2000 | 0.6 |reached target block errors
-0.091 | 1.6814e-01 | 1.0000e+00 | 344346 | 2048000 | 2000 | 2000 | 0.6 |reached target block errors
0.636 | 7.6388e-02 | 9.1100e-01 | 156443 | 2048000 | 1822 | 2000 | 0.6 |reached target block errors
1.364 | 1.6884e-03 | 7.5000e-02 | 13831 | 8192000 | 600 | 8000 | 2.4 |reached target block errors
2.091 | 1.0596e-06 | 6.0000e-05 | 217 | 204800000 | 12 | 200000 | 58.8 |reached max iter
2.818 | 0.0000e+00 | 0.0000e+00 | 0 | 204800000 | 0 | 200000 | 58.3 |reached max iter
Simulation stopped as no error occurred @ EbNo = 2.8 dB.
Task: TensorFlow allows the compilation of graphs with XLA. Try to further accelerate the code with XLA (@tf.function(jit_compile=True)).
Remark: XLA is still an experimental feature, and not all TensorFlow (and, thus, Sionna) functions support it.
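A possible sketch for the XLA task (whether it runs depends on XLA support of the involved layers; if a layer does not support XLA, TensorFlow raises an error at trace time):
@tf.function(jit_compile=True)  # enable XLA compilation for this function
def run_graph_xla(batch_size, ebno_db):
    return model_coded_awgn(batch_size, ebno_db)

# warm-up call to trigger tracing/compilation before any timing
run_graph_xla(tf.constant(BATCH_SIZE, tf.int32), tf.constant(1.5, tf.float32))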
Task 2: Check the GPU load with !nvidia-smi. Find the best trade-off between batch size and throughput for your specific GPU architecture.
!nvidia-smi
Fri Mar 24 20:32:30 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.78.01 Driver Version: 525.78.01 CUDA Version: 12.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... Off | 00000000:01:00.0 On | N/A |
| 47% 50C P2 45W / 120W | 4927MiB / 6144MiB | 72% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 1100 G /usr/lib/xorg/Xorg 242MiB |
| 0 N/A N/A 1370 G /usr/bin/gnome-shell 46MiB |
| 0 N/A N/A 1918 G /usr/lib/firefox/firefox 405MiB |
| 0 N/A N/A 2957 C ...nvs/tensorflow/bin/python 2738MiB |
| 0 N/A N/A 20986 C ...nvs/tensorflow/bin/python 1438MiB |
| 0 N/A N/A 25254 G ...tePerProcess --no-sandbox 51MiB |
+-----------------------------------------------------------------------------+
Exercise
Simulate the coded bit error rate (BER) for Polar-coded 64-QAM modulation. Assume a codeword length of n = 200 and a coderate of 0.5.
Hint: For Polar codes, successive cancellation list (SCL) decoding gives the best BER performance. However, successive cancellation (SC) decoding (without a list) is less complex.
n = 200
coderate = 0.5
# *You can implement your code here*
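Purely as a possible starting point, here is a sketch that reuses the 5G Polar classes and the link structure from CodedSystemAWGN above (parameter choices follow the exercise statement; note that n = 200 is not a multiple of the 6 bits per 64-QAM symbol, which the sketch handles with simple zero-padding, which is one option among several):
# Rough sketch of one possible solution outline (not a verified reference implementation)
num_bits_per_symbol = 6          # 64-QAM
k = int(coderate * n)            # 100 information bits per codeword

class PolarSystemAWGN(Model):
    def __init__(self):
        super().__init__()
        self.encoder = sn.fec.polar.Polar5GEncoder(k, n)
        self.decoder = sn.fec.polar.Polar5GDecoder(self.encoder, dec_type="SCL")
        self.constellation = sn.mapping.Constellation("qam", num_bits_per_symbol)
        self.mapper = sn.mapping.Mapper(constellation=self.constellation)
        self.demapper = sn.mapping.Demapper("app", constellation=self.constellation)
        self.binary_source = sn.utils.BinarySource()
        self.awgn_channel = sn.channel.AWGN()

    def __call__(self, batch_size, ebno_db):
        no = sn.utils.ebnodb2no(ebno_db, num_bits_per_symbol=num_bits_per_symbol, coderate=coderate)
        bits = self.binary_source([batch_size, k])
        c = self.encoder(bits)                       # [batch_size, 200]
        c_pad = tf.pad(c, [[0, 0], [0, 4]])          # zero-pad to 204 bits (multiple of 6)
        y = self.awgn_channel([self.mapper(c_pad), no])
        llr = self.demapper([y, no])[:, :n]          # discard the LLRs of the padding bits
        return bits, self.decoder(llr)

model_polar_64qam = PolarSystemAWGN()
ber_plots.simulate(model_polar_64qam,
                   ebno_dbs=np.linspace(EBN0_DB_MIN, EBN0_DB_MAX, 15),
                   batch_size=2000,
                   num_target_block_errors=500,
                   legend="Polar + 64-QAM (sketch)",
                   soft_estimates=False,
                   max_mc_iter=15,
                   show_fig=True);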