SEAL Source Code Analysis 1.0

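My graduation project uses SEAL for homomorphic encryption, so reading the source code is essential. My C++ is honestly weak, though, so this analysis may be fairly shallow; anyone interested is welcome to get in touch and discuss.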

plaintext.h

This header mainly defines the Plaintext class together with the functions that operate on plaintexts. I only note the core parts here.

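A plaintext is a polynomial, which is the basic data form in R-LWE. When the scheme is scheme_type::bfv (BFV is chosen as the encryption scheme), each coefficient of the plaintext is stored as a 64-bit word. When the scheme is scheme_type::ckks, the plaintext is by default stored in NTT-transformed form with respect to each prime in the coefficient modulus, so the required allocation is the number of primes in the coefficient modulus multiplied by the degree of the polynomial modulus. In addition, a valid CKKS plaintext stores the parms_id of the encryption parameters it corresponds to.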

using pt_coeff_type = std::uint64_t;
// A plaintext coefficient is one 64-bit word (the BFV coefficient form by default)
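To make the sizing rule above concrete, here is a small sketch that does not use SEAL itself. The enum `Scheme` and the function `plaintext_word_count` are hypothetical names made up for illustration; only the arithmetic comes from the description above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for seal::scheme_type, reduced to the two cases discussed.
enum class Scheme { bfv, ckks };

// Same alias as in plaintext.h: every stored coefficient is one 64-bit word.
using pt_coeff_type = std::uint64_t;

// Number of 64-bit words a full-size plaintext occupies.
//   poly_modulus_degree:  degree N of the polynomial modulus
//   coeff_modulus_primes: number of primes K in coeff_modulus (only matters for CKKS)
inline std::size_t plaintext_word_count(
    Scheme scheme, std::size_t poly_modulus_degree, std::size_t coeff_modulus_primes)
{
    assert(poly_modulus_degree > 0 && coeff_modulus_primes > 0);
    if (scheme == Scheme::ckks)
    {
        // One residue polynomial (N words) per prime of the coefficient modulus.
        return poly_modulus_degree * coeff_modulus_primes;
    }
    // BFV: at most N coefficients, one word each.
    return poly_modulus_degree;
}
```

For example, with N = 8192 and K = 3 a CKKS plaintext needs 24576 words, while a BFV plaintext never needs more than 8192.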

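A Plaintext takes its memory from a memory pool (through a MemoryPoolHandle), which reduces the overhead of repeatedly allocating and freeing memory.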

Plaintext(MemoryPoolHandle pool = MemoryManager::GetPool()) : data_(std::move(pool)){}

On top of that, a second constructor is marked explicit, so that a std::size_t is never implicitly converted into a Plaintext:

explicit Plaintext(std::size_t coeff_count, MemoryPoolHandle pool = MemoryManager::GetPool()): coeff_count_(coeff_count), data_(coeff_count_, std::move(pool)){}

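For convenience, a plaintext can also be constructed from a string that writes out the polynomial with hexadecimal coefficients: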

Plaintext(const std::string &hex_poly, MemoryPoolHandle pool = MemoryManager::GetPool()): data_(std::move(pool))
{
 operator=(hex_poly);
}
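As a rough illustration of what such a string carries, the sketch below parses a polynomial string into a coefficient vector. It assumes the format "1x^3 + 2x^1 + 3" (hexadecimal coefficients, decimal exponents, terms separated by " + "); `parse_hex_poly` is a hypothetical helper, not SEAL's parser, and it skips most of the validation a real one would need.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Parses a string such as "1x^3 + 2x^1 + 3" into coefficients,
// where result[i] is the coefficient of x^i.
inline std::vector<std::uint64_t> parse_hex_poly(const std::string &hex_poly)
{
    std::vector<std::uint64_t> coeffs;
    std::istringstream in(hex_poly);
    std::string term;
    while (in >> term) // whitespace-separated tokens: terms and "+" signs
    {
        if (term == "+")
        {
            continue;
        }
        std::string hex = term;   // coefficient part, hexadecimal
        std::size_t exponent = 0; // a term without "x^" is the constant term
        std::size_t x_pos = term.find("x^");
        if (x_pos != std::string::npos)
        {
            hex = term.substr(0, x_pos);
            exponent = std::stoull(term.substr(x_pos + 2)); // decimal exponent
        }
        if (hex.empty())
        {
            throw std::invalid_argument("term has no coefficient");
        }
        std::uint64_t value = std::stoull(hex, nullptr, 16);
        if (exponent >= coeffs.size())
        {
            coeffs.resize(exponent + 1, 0); // missing powers get coefficient 0
        }
        coeffs[exponent] = value;
    }
    return coeffs;
}
```

Parsing "1x^3 + 2x^1 + 3" this way gives the coefficients 3, 2, 0, 1 for the powers 0 through 3.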

Next come the member functions of the class, which are basically operations on the plaintext.

1. Copying and moving a plaintext:

Plaintext(const Plaintext &copy) = default;//copy
Plaintext(Plaintext &&source) = default;//move

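2. Creating a new plaintext by copying an existing one. The new Plaintext is first constructed on the given memory pool, and the contents are then copy-assigned into it: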

Plaintext(const Plaintext &copy, MemoryPoolHandle pool) : Plaintext(std::move(pool))
{
    *this = copy;
}

3. Reserving enough memory for a plaintext with the given capacity (number of coefficients):

void reserve(std::size_t capacity)
{
    if (is_ntt_form())
    {
        throw std::logic_error("cannot reserve for an NTT transformed Plaintext");
    }
    data_.reserve(capacity);
    coeff_count_ = data_.size();
}

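4. Shrinking the plaintext's memory use as far as possible, to cut overhead: a backing array just large enough for the current plaintext is allocated and the data is copied to the new location.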

inline void shrink_to_fit()
{
   data_.shrink_to_fit();
}

5. Resetting the plaintext. All memory previously allocated for it is returned to the memory pool.

inline void release() noexcept
{
    parms_id_ = parms_id_zero;
    coeff_count_ = 0;
    scale_ = 1.0;
    data_.release();
}

6. Resizing the plaintext to a given number of coefficients.

inline void resize(std::size_t coeff_count)
{
    if (is_ntt_form())
    {
        throw std::logic_error("cannot reserve for an NTT transformed Plaintext");
    }
    data_.resize(coeff_count);
    coeff_count_ = coeff_count;
}
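The guard shared by reserve and resize, and the reset done by release, can be tried out with a stand-alone model. This is only a sketch under assumptions: `MiniPlaintext` and `mark_ntt_form` are hypothetical, a std::vector stands in for the pooled backing array, and the model assumes that "NTT form" is signalled by a non-zero parms_id, which is how I read the header.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

using parms_id_type = std::array<std::uint64_t, 4>;
const parms_id_type parms_id_zero = { { 0, 0, 0, 0 } };

// Hypothetical stand-in for seal::Plaintext, keeping only the size-related state.
class MiniPlaintext
{
public:
    // Assumption: a plaintext counts as NTT-transformed once it carries a parms_id.
    bool is_ntt_form() const noexcept { return parms_id_ != parms_id_zero; }

    // Hypothetical helper that simulates transforming the plaintext to NTT form.
    void mark_ntt_form(const parms_id_type &id) { parms_id_ = id; }

    void resize(std::size_t coeff_count)
    {
        if (is_ntt_form())
        {
            throw std::logic_error("cannot resize an NTT transformed plaintext");
        }
        data_.resize(coeff_count); // new coefficients are zero
    }

    // Back to the empty state; swapping with a temporary really frees the buffer.
    void release() noexcept
    {
        parms_id_ = parms_id_zero;
        scale_ = 1.0;
        std::vector<std::uint64_t>().swap(data_);
    }

    std::size_t coeff_count() const noexcept { return data_.size(); }
    double scale() const noexcept { return scale_; }

private:
    parms_id_type parms_id_ = parms_id_zero;
    double scale_ = 1.0;
    std::vector<std::uint64_t> data_;
};
```

Resizing works while the model is in coefficient form, throws std::logic_error once it is marked as NTT-transformed, and release brings it back to an empty, non-NTT state.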

The remaining functions are mostly for input and output, so I won't go through them here.

dynarray.h

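DynArray is a resizable array whose memory is allocated from Microsoft SEAL's memory pool. The class is mainly meant for internal use and provides the underlying data structure for the Plaintext and Ciphertext classes; put simply, it is what plaintexts and ciphertexts are built on. Its advantage is that memory can be allocated in advance when the required size is already known. Take the constructor below: it requires capacity to be at least size, where size is the number of elements actually in use and capacity is the number of elements memory has been allocated for.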

explicit DynArray(std::size_t capacity, std::size_t size, MemoryPoolHandle pool = MemoryManager::GetPool())
    : pool_(std::move(pool))
{
    if (!pool_)
    {
       throw std::invalid_argument("pool is uninitialized");
    }
    if (capacity < size)
    {
        throw std::invalid_argument("capacity cannot be smaller than size");
    }

     // Reserve memory, resize, and set to zero
     reserve(capacity);
     resize(size);
}

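reserve allocates a new backing array of exactly capacity elements and copies min(capacity, size) elements into it. If the requested capacity is smaller than the current size, the array is truncated and size ends up equal to capacity; otherwise size stays as it was and only capacity changes.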

inline void reserve(std::size_t capacity)
{
    std::size_t copy_size = std::min<>(capacity, size_);

    // Create new allocation and copy over value
    auto new_data(util::allocate<T>(capacity, pool_));
    std::copy_n(cbegin(), copy_size, new_data.get());
    std::swap(data_, new_data);

    // Set the coeff_count and capacity
    capacity_ = capacity;
    size_ = copy_size;
}

The shrink_to_fit that Plaintext forwards to simply calls reserve with the current size, so capacity becomes equal to size.

inline void shrink_to_fit()
{
   reserve(size_);
}

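resize changes the size of the array. If the array grows, the new elements are set to zero (when fill_zero is true); if it shrinks, the trailing elements are dropped. If the new size exceeds the current capacity, the whole array is reallocated with capacity equal to the new size.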

inline void resize(std::size_t size, bool fill_zero = true)
{
    if (size <= capacity_)
    {
        // Are we changing size to bigger within current capacity?
        // If so, need to set top terms to zero
        if (size > size_ && fill_zero)
        {
            std::fill(end(), begin() + size, T(0));
        }

        // Set the size
        size_ = size;

        return;
    }

    // At this point we know for sure that size_ <= capacity_ < size so need
    // to reallocate to bigger
    auto new_data(util::allocate<T>(size, pool_));
    std::copy(cbegin(), cend(), new_data.get());
    if (fill_zero)
    {
        std::fill(new_data.get() + size_, new_data.get() + size, T(0));
    }
    std::swap(data_, new_data);

    // Set the coeff_count and capacity
    capacity_ = size;
    size_ = size;
}
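To check my reading of how size and capacity move, I find it helpful to play with a small stand-alone model. `MiniDynArray` below is hypothetical: it uses std::unique_ptr instead of SEAL's memory pool, stores only std::uint64_t, and always zero-fills, but reserve, shrink_to_fit and resize follow the same rules as the excerpts above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <stdexcept>

// Hypothetical model of DynArray<std::uint64_t> without the memory pool.
class MiniDynArray
{
public:
    MiniDynArray(std::size_t capacity, std::size_t size)
    {
        if (capacity < size)
        {
            throw std::invalid_argument("capacity cannot be smaller than size");
        }
        reserve(capacity);
        resize(size);
    }

    // Reallocate to exactly `capacity`, keeping min(capacity, size_) elements.
    void reserve(std::size_t capacity)
    {
        std::size_t copy_size = std::min(capacity, size_);
        // The trailing () value-initializes the new buffer, so it starts zeroed.
        std::unique_ptr<std::uint64_t[]> new_data(new std::uint64_t[capacity]());
        std::copy_n(data_.get(), copy_size, new_data.get());
        data_.swap(new_data);
        capacity_ = capacity;
        size_ = copy_size;
    }

    void shrink_to_fit() { reserve(size_); }

    void resize(std::size_t size)
    {
        if (size > capacity_)
        {
            // Not enough room: reallocate to exactly `size`. reserve() keeps the
            // old elements and the fresh buffer is already zero-filled.
            reserve(size);
        }
        else if (size > size_)
        {
            // Growing inside the current capacity: zero the newly exposed elements.
            std::fill(data_.get() + size_, data_.get() + size, std::uint64_t{ 0 });
        }
        size_ = size;
    }

    std::uint64_t &operator[](std::size_t i) { return data_[i]; }
    std::size_t size() const noexcept { return size_; }
    std::size_t capacity() const noexcept { return capacity_; }

private:
    std::unique_ptr<std::uint64_t[]> data_;
    std::size_t capacity_ = 0;
    std::size_t size_ = 0;
};
```

Starting from MiniDynArray(8, 4): resize(6) keeps capacity at 8, resize(10) reallocates to capacity 10, and reserve(3) truncates the array to size 3 and capacity 3.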

ciphertext.h

The memory a ciphertext occupies is closely tied to the choice of encryption parameters; see the comment in the code:

    Class to store a ciphertext element. The data for a ciphertext consists
    of two or more polynomials, which are in Microsoft SEAL stored in a CRT
    form with respect to the factors of the coefficient modulus. This data
    itself is not meant to be modified directly by the user, but is instead
    operated on by functions in the Evaluator class. The size of the backing
    array of a ciphertext depends on the encryption parameters and the size
    of the ciphertext (at least 2). If the size of the ciphertext is T,
    the poly_modulus_degree encryption parameter is N, and the number of
    primes in the coeff_modulus encryption parameter is K, then the
    ciphertext backing array requires precisely 8*N*K*T bytes of memory.
    A ciphertext also carries with it the parms_id of its associated
    encryption parameters, which is used to check the validity of the
    ciphertext for homomorphic operations and decryption.
    @par Memory Management
    The size of a ciphertext refers to the number of polynomials it contains,
    whereas its capacity refers to the number of polynomials that fit in the
    current memory allocation. In high-performance applications unnecessary
    re-allocations should be avoided by reserving enough memory for the
    ciphertext to begin with either by providing the desired capacity to the
    constructor as an extra argument, or by calling the reserve function at
    any time.
    @par Thread Safety
    In general, reading from ciphertext is thread-safe as long as no other
    thread is concurrently mutating it. This is due to the underlying data
    structure storing the ciphertext not being thread-safe.
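The 8*N*K*T figure from the comment is easy to turn into a helper. `ciphertext_backing_bytes` is a hypothetical function, not part of SEAL; it only restates the formula, with the 8 coming from the 64-bit word used for each coefficient.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Bytes needed by the backing array of a ciphertext, per the comment in ciphertext.h.
//   poly_modulus_degree:  N
//   coeff_modulus_primes: K, the number of primes in coeff_modulus
//   ciphertext_size:      T, the number of polynomials in the ciphertext (at least 2)
inline std::size_t ciphertext_backing_bytes(
    std::size_t poly_modulus_degree, std::size_t coeff_modulus_primes, std::size_t ciphertext_size)
{
    assert(ciphertext_size >= 2);
    return sizeof(std::uint64_t) * poly_modulus_degree * coeff_modulus_primes * ciphertext_size;
}
```

With N = 8192, K = 4 and a fresh ciphertext of size T = 2 this gives 524288 bytes, i.e. 512 KiB.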

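Ciphertext's reserve differs from Plaintext's: it also needs a parms_id, because the capacity of a ciphertext depends closely on the encryption parameters. The overload below takes only a SEALContext and uses the context's first parms_id.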

inline void reserve(const SEALContext &context, std::size_t size_capacity)
{
   auto parms_id = context.first_parms_id();
   reserve(context, parms_id, size_capacity);
}

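resize works the same way. The source also points out that a user normally has no reason to resize a ciphertext by hand; the function is called by functions such as Evaluator::multiply to adjust the ciphertext size automatically.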

inline void resize(const SEALContext &context, std::size_t size)
{
   auto parms_id = context.first_parms_id();
   resize(context, parms_id, size);
}

context.h

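A very important header. Essentially every object involved in encryption and decryption uses the classes defined in it.

class EncryptionParameterQualifiers: holds a set of attributes (qualifiers) derived from the encryption parameters, for example whether the parameters are valid and whether batching can be used. The qualifiers are created automatically by the SEALContext class and passed on to classes such as Encryptor, Evaluator and Decryptor. Users do not create or edit them by hand; the only way to change them is to change the encryption parameters themselves.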

    Stores a set of attributes (qualifiers) of a set of encryption parameters.
    These parameters are mainly used internally in various parts of the library,
    e.g., to determine which algorithmic optimizations the current support. The
    qualifiers are automatically created by the SEALContext class, silently passed
    on to classes such as Encryptor, Evaluator, and Decryptor, and the only way to
    change them is by changing the encryption parameters themselves. In other
    words, a user will never have to create their own instance of this class, and
    in most cases never have to worry about it at all.

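class SEALContext: constructed from an EncryptionParameters instance once the required parameters have been set. Its constructor validates the parameters, performs the costly pre-computations, and creates the EncryptionParameterQualifiers described above.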

    Performs sanity checks (validation) and pre-computations for a given set of encryption
    parameters. While the EncryptionParameters class is intended to be a light-weight class
    to store the encryption parameters, the SEALContext class is a heavy-weight class that
    is constructed from a given set of encryption parameters. It validates the parameters
    for correctness, evaluates their properties, and performs and stores the results of
    several costly pre-computations.
    After the user has set at least the poly_modulus, coeff_modulus, and plain_modulus
    parameters in a given EncryptionParameters instance, the parameters can be validated
    for correctness and functionality by constructing an instance of SEALContext. The
    constructor of SEALContext does all of its work automatically, and concludes by
    constructing and storing an instance of the EncryptionParameterQualifiers class, with
    its flags set according to the properties of the given parameters. If the created
    instance of EncryptionParameterQualifiers has the parameters_set flag set to true, the
    given parameter set has been deemed valid and is ready to be used. If the parameters
    were for some reason not appropriately set, the parameters_set flag will be false,
    and a new SEALContext will have to be created after the parameters are corrected.
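The flow described in the comment (check the parameters, then record the outcome in a qualifiers object) can be sketched without SEAL. Everything here is a simplified assumption: `MiniParameters`, `MiniQualifiers` and `validate` are hypothetical, the bit-count table holds the 128-bit-security limits as I remember them from SEAL's examples, and the real constructor performs many more checks than these.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical, heavily reduced versions of EncryptionParameters and
// EncryptionParameterQualifiers.
struct MiniParameters
{
    std::size_t poly_modulus_degree;
    std::vector<int> coeff_modulus_bits; // bit length of each prime in coeff_modulus
    std::uint64_t plain_modulus;
};

struct MiniQualifiers
{
    bool parameters_set = false;
    bool using_batching = false;
};

// Assumed upper bounds on the total coeff_modulus bit length (128-bit security).
inline int max_coeff_modulus_bits(std::size_t poly_modulus_degree)
{
    switch (poly_modulus_degree)
    {
    case 1024: return 27;
    case 2048: return 54;
    case 4096: return 109;
    case 8192: return 218;
    case 16384: return 438;
    case 32768: return 881;
    default: return 0; // unsupported degree
    }
}

// Trial division; fine for the small moduli used in this sketch.
inline bool is_prime(std::uint64_t v)
{
    if (v < 2) return false;
    for (std::uint64_t d = 2; d * d <= v; ++d)
    {
        if (v % d == 0) return false;
    }
    return true;
}

inline MiniQualifiers validate(const MiniParameters &p)
{
    MiniQualifiers q; // parameters_set stays false unless every check passes
    int max_bits = max_coeff_modulus_bits(p.poly_modulus_degree);
    if (max_bits == 0 || p.coeff_modulus_bits.empty() || p.plain_modulus < 2)
    {
        return q;
    }
    int total_bits = std::accumulate(p.coeff_modulus_bits.begin(), p.coeff_modulus_bits.end(), 0);
    if (total_bits > max_bits)
    {
        return q;
    }
    q.parameters_set = true;
    // Batching needs a prime plain_modulus congruent to 1 modulo 2 * poly_modulus_degree.
    q.using_batching = is_prime(p.plain_modulus) && p.plain_modulus % (2 * p.poly_modulus_degree) == 1;
    return q;
}
```

With a degree of 4096 and primes of 36, 36 and 37 bits, the parameters pass; a plain_modulus of 65537 additionally enables batching in this model, while 1024 does not.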

encryptionparams.h

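This header defines the encryption parameters. Choosing them well matters a great deal: a good set of parameters (poly_modulus_degree, coeff_modulus, plain_modulus and so on) is a trade-off between security and efficiency. parms_id is a unique identifier for a set of encryption parameters. Because it is a hash computed from the parameters it practically never collides, much like a primary key.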
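The "hash as primary key" idea can be shown with a toy identifier. SEAL uses a cryptographic hash for parms_id; the sketch below swaps in 64-bit FNV-1a, a simple non-cryptographic hash, purely for illustration, and `mini_parms_id` is a hypothetical function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// 64-bit FNV-1a over the bytes of each word, least significant byte first.
inline std::uint64_t fnv1a_words(const std::vector<std::uint64_t> &words)
{
    std::uint64_t hash = 14695981039346656037ULL; // FNV-1a offset basis
    for (std::uint64_t word : words)
    {
        for (int byte = 0; byte < 8; ++byte)
        {
            hash ^= (word >> (8 * byte)) & 0xFF;
            hash *= 1099511628211ULL; // FNV prime
        }
    }
    return hash;
}

// Toy identifier: hash every parameter that defines the parameter set.
inline std::uint64_t mini_parms_id(
    std::size_t poly_modulus_degree, const std::vector<std::uint64_t> &coeff_modulus,
    std::uint64_t plain_modulus)
{
    std::vector<std::uint64_t> words;
    words.push_back(poly_modulus_degree);
    words.push_back(coeff_modulus.size());
    words.insert(words.end(), coeff_modulus.begin(), coeff_modulus.end());
    words.push_back(plain_modulus);
    return fnv1a_words(words);
}
```

Two identical parameter sets map to the same identifier, and changing any single parameter (for instance plain_modulus from 1024 to 1025) gives a different one, which is what lets the library check cheaply whether a ciphertext belongs to a given parameter set.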
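set_poly_modulus_degree sets the degree of the polynomial modulus to the given value. The degree directly affects the number of coefficients in a plaintext polynomial, the size of ciphertexts, computational performance (larger is slower) and the security level (larger is more secure). In Microsoft SEAL the degree must be a power of two (for example 1024, 2048, 4096, 8192, 16384 or 32768).
set_coeff_modulus sets the coeff_modulus parameter, a large integer that is the product of several distinct primes, each up to 60 bits in size. It is represented as a vector of primes, each of them an instance of the Modulus class, and the bit length of coeff_modulus is the sum of the bit lengths of its primes. A larger coeff_modulus means a larger noise budget and therefore more room for computation. It also lowers the security level, so for a given poly_modulus_degree SEAL puts an upper bound on the total bit length of coeff_modulus.
set_plain_modulus sets the plaintext modulus plain_modulus. It is a positive integer that may be prime or not (the example here uses a power of two). The plaintext modulus determines the size of the plaintext data type and how much noise budget a homomorphic multiplication consumes, so for better performance it should be kept as small as possible. The noise budget of a freshly encrypted ciphertext is roughly log_2(coeff_modulus/plain_modulus) bits, and a homomorphic multiplication consumes roughly log_2(plain_modulus) bits plus some other terms. Note that plain_modulus is needed for BFV; CKKS does not use it.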
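The rule of thumb for the fresh noise budget is simple enough to compute directly. `fresh_noise_budget_estimate` is a hypothetical helper that only evaluates log_2(coeff_modulus/plain_modulus); the budget SEAL actually reports for a ciphertext (via Decryptor::invariant_noise_budget) can be lower than this rough figure.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Rough estimate, in bits, of the noise budget of a fresh BFV ciphertext:
// log2(q_1 * ... * q_k / t) = sum(log2(q_i)) - log2(t).
inline double fresh_noise_budget_estimate(
    const std::vector<std::uint64_t> &coeff_modulus_primes, std::uint64_t plain_modulus)
{
    assert(plain_modulus >= 2);
    double bits = 0.0;
    for (std::uint64_t q : coeff_modulus_primes)
    {
        // Summing logarithms avoids overflowing on the huge product itself.
        bits += std::log2(static_cast<double>(q));
    }
    return bits - std::log2(static_cast<double>(plain_modulus));
}
```

The estimate shrinks as plain_modulus grows, which is the reason for keeping the plaintext modulus as small as the application allows.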

evaluator.h

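This header provides the class that computes on ciphertexts: addition, subtraction, multiplication, exponentiation and so on, plus operations with no arithmetic meaning of their own, such as relinearization. Operations involving unencrypted values matter too. For example, to multiply a ciphertext c by 2, where the 2 is a plaintext, you need a mixed ciphertext-plaintext operation, and the class provides functions for that as well. The comment in the code explains what relinearization is for: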

One of the most important non-arithmetic operations is relinearization, which takes as input a ciphertext of size K+1 and relinearization keys (at least K-1 keys are needed), and changes the size of the ciphertext down to 2 (minimum size). For most use-cases only one relinearization key suffices, in which case relinearization should be performed after every multiplication. Homomorphic multiplication of ciphertexts of size K+1 and L+1 outputs a ciphertext of size K+L+1, and the computational cost of multiplication is proportional to K*L. Plain multiplication and addition operations of any type do not change the size. Relinearization requires relinearization keys to have been generated.
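The size bookkeeping in the comment above can be written out as two tiny functions. Both names are hypothetical; the arithmetic is taken directly from the comment (sizes K+1 and L+1 multiply to size K+L+1, and a ciphertext of size K+1 needs at least K-1 relinearization keys).

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Size of the product of two ciphertexts of sizes size_a = K+1 and size_b = L+1.
inline std::size_t size_after_multiply(std::size_t size_a, std::size_t size_b)
{
    if (size_a < 2 || size_b < 2)
    {
        throw std::invalid_argument("a ciphertext has at least 2 polynomials");
    }
    return size_a + size_b - 1; // (K+1) + (L+1) - 1 = K+L+1
}

// Minimum number of relinearization keys needed to bring `size` back down to 2.
inline std::size_t relin_keys_needed(std::size_t size)
{
    if (size < 2)
    {
        throw std::invalid_argument("a ciphertext has at least 2 polynomials");
    }
    return size - 2; // size = K+1 needs K-1 keys
}
```

Multiplying two fresh ciphertexts (size 2 each) gives size 3, which needs one key. Skipping relinearization and multiplying again by a fresh ciphertext gives size 4, which needs two.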

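In short, relinearization brings the size of a ciphertext back down to 2, the minimum, which keeps later operations cheap because the cost of a multiplication grows with the sizes of its inputs. Relinearization is normally performed after every ciphertext-ciphertext multiplication; plain multiplication and addition do not need it.
The class also supports NTT-related operations, such as converting between NTT form and ordinary polynomial form and computing on data in NTT form. BFV does not need its data to be in NTT form, whereas CKKS always works in NTT form (how BGV is handled may depend on the SEAL version). Of the member functions, I only cover a few of the more involved ones.
multiply_many multiplies several ciphertexts stored in a vector. It computes the product of the given ciphertexts and stores the result in the destination parameter. The multiplications are carried out in a depth-optimal order, and relinearization with the given relinearization keys is performed automatically after every multiplication. Dynamic memory allocations in the process are made from the memory pool pointed to by the given MemoryPoolHandle.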

 void multiply_many(
            const std::vector<Ciphertext> &encrypteds, const RelinKeys &relin_keys, Ciphertext &destination,
            MemoryPoolHandle pool = MemoryManager::GetPool()) const;
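What a depth-optimal order means can be shown on plain integers. The sketch below is not SEAL's code: `multiply_many_sketch` and `Product` are hypothetical, ordinary modular multiplication stands in for ciphertext multiplication plus relinearization, and the modulus is assumed to be at most 2^32 so that products fit in 64 bits. Products are appended to the back of the list and paired up from the front, which gives a balanced multiplication tree.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

struct Product
{
    std::uint64_t value; // running product modulo `modulus`
    int depth;           // multiplicative depth used to reach this value
};

inline Product multiply_many_sketch(const std::vector<std::uint64_t> &inputs, std::uint64_t modulus)
{
    if (inputs.empty())
    {
        throw std::invalid_argument("inputs cannot be empty");
    }
    if (modulus == 0 || modulus > (1ULL << 32))
    {
        throw std::invalid_argument("modulus must be in [1, 2^32]");
    }

    std::vector<Product> products;
    for (std::uint64_t v : inputs)
    {
        products.push_back({ v % modulus, 0 });
    }

    // Multiply neighbours and append the result to the back until one value is left.
    // Low-depth values are always consumed first, so the tree stays balanced.
    for (std::size_t i = 0; i + 1 < products.size(); i += 2)
    {
        Product next;
        next.value = (products[i].value * products[i + 1].value) % modulus;
        next.depth = std::max(products[i].depth, products[i + 1].depth) + 1;
        products.push_back(next);
    }
    return products.back();
}
```

Eight inputs are multiplied at depth 3 this way, whereas multiplying them one after another would use depth 7. In the homomorphic setting the depth matters because each level of multiplication consumes noise budget.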

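exponentiate_inplace raises a ciphertext to a power. Relinearization with the given relinearization keys is performed automatically after every multiplication in the exponentiation. Dynamic memory allocations in the process are made from the memory pool pointed to by the given MemoryPoolHandle.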

void exponentiate_inplace(
            Ciphertext &encrypted, std::uint64_t exponent, const RelinKeys &relin_keys,
            MemoryPoolHandle pool = MemoryManager::GetPool()) const;

add_plain_inplace adds a plaintext to a ciphertext:

void add_plain_inplace(
            Ciphertext &encrypted, const Plaintext &plain, MemoryPoolHandle pool = MemoryManager::GetPool()) const;

multiply_plain_inplace multiplies a ciphertext by a plaintext; the plaintext must not be zero:

void multiply_plain_inplace(
            Ciphertext &encrypted, const Plaintext &plain, MemoryPoolHandle pool = MemoryManager::GetPool()) const;

batchencoder.h

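First you need to understand how batching works in SEAL. It is based on the CRT: a plaintext polynomial is viewed as a 2-by-(N/2) matrix of slots, where N is poly_modulus_degree, and automorphisms act on that matrix by rotating it.
BatchEncoder takes a SEALContext as its argument, and the parameters in that context must support batching. encode creates a new plaintext from a given matrix, and decode does the reverse, recovering the matrix from a plaintext. The class has a clear and compact purpose.
Note: slots_ = context_data.parms().poly_modulus_degree(); in other words, the number of slots equals poly_modulus_degree.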
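To picture the matrix view, the sketch below works on decoded slot values only, with no encryption involved. It assumes the N slots are laid out as 2 rows of N/2 values stored row after row. `rotate_rows_left` and `swap_rows` are hypothetical helpers that mimic, on plain vectors, what I understand Evaluator::rotate_rows and Evaluator::rotate_columns to do to the encrypted matrix.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Rotates both rows of the 2-by-(N/2) slot matrix cyclically to the left.
inline std::vector<std::uint64_t> rotate_rows_left(
    const std::vector<std::uint64_t> &slots, std::size_t steps)
{
    if (slots.empty() || slots.size() % 2 != 0)
    {
        throw std::invalid_argument("slot count must be even and non-zero");
    }
    std::size_t row_size = slots.size() / 2;
    std::vector<std::uint64_t> out(slots.size());
    for (std::size_t row = 0; row < 2; ++row)
    {
        for (std::size_t col = 0; col < row_size; ++col)
        {
            // The value `steps` positions to the right moves into this column.
            out[row * row_size + col] = slots[row * row_size + (col + steps) % row_size];
        }
    }
    return out;
}

// Swaps the two rows of the matrix (a cyclic rotation of each 2-element column).
inline std::vector<std::uint64_t> swap_rows(const std::vector<std::uint64_t> &slots)
{
    if (slots.empty() || slots.size() % 2 != 0)
    {
        throw std::invalid_argument("slot count must be even and non-zero");
    }
    std::size_t row_size = slots.size() / 2;
    std::vector<std::uint64_t> out(slots.size());
    for (std::size_t col = 0; col < row_size; ++col)
    {
        out[col] = slots[row_size + col];
        out[row_size + col] = slots[col];
    }
    return out;
}
```

With 8 slots holding 0 to 7, rotating the rows left by one step gives 1 2 3 0 in the first row and 5 6 7 4 in the second, and swapping the rows of the original gives 4 5 6 7 followed by 0 1 2 3.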

keygenerator.h

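class KeyGenerator generates a matching secret key and public key. An existing KeyGenerator can also be used at any time to generate relinearization keys and Galois keys. Constructing a KeyGenerator only requires a SEALContext.
create_public_key generates a public key and stores the result in the destination; every call generates a new public key. create_relin_keys generates relinearization keys and stores the result in the destination. create_galois_keys generates Galois keys, which are used for rotations on batched ciphertexts.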

encryptor.h

class Encryptor is constructed from a context and encrypts plaintexts. It supports both symmetric-key and asymmetric-key encryption:

Encrypts Plaintext objects into Ciphertext objects. Constructing an Encryptor requires a SEALContext with valid encryption parameters, the public key and/or
the secret key. If an Encrytor is given a secret key, it supports symmetric-key encryption. If an Encryptor is given a public key, it supports asymmetric-key
encryption.

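set_public_key sets a new public key, and set_secret_key does the same for the secret key. encrypt encrypts with the public key (asymmetric encryption), while encrypt_symmetric encrypts with the secret key (symmetric encryption).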

decryptor.h

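The counterpart of Encryptor, used for decryption. It is constructed from a SEALContext with valid encryption parameters together with the secret key, and it requires ciphertexts to be in the default NTT form for their scheme. invariant_noise_budget is a very important function: it computes the remaining noise budget of a ciphertext, and it only applies to BFV and BGV.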

posted @ 2023-03-08 11:04  ZimaB1ue