TACO Compiler: Tensors and Scientific Computing with SpMV

Defining Tensors

Declaring Tensors

taco::Tensor objects, which correspond to mathematical tensors, form the core of the taco C++ API. You can declare a new tensor by specifying its name, a vector containing the size of each dimension of the tensor, and the storage format that will be used to store the tensor:

// Declare a new tensor "A" of double-precision floats with dimensions

// 512 x 64 x 2048, stored as a dense-sparse-sparse tensor

Tensor<double> A("A", {512,64,2048}, Format({Dense,Sparse,Sparse}));

The name of a tensor can be omitted, in which case TACO will assign it an arbitrary name:

// Declare another tensor with the same dimensions and storage format as before

Tensor<double> A({512,64,2048}, Format({Dense,Sparse,Sparse}));

Scalars, which are treated as order-0 tensors, can be declared and initialized with an arbitrary value as follows:

Tensor<double> alpha(42.0);  // Declare a scalar tensor initialized to 42.0

Defining Tensor Formats

Conceptually, you can think of a tensor as a tree, where each level (excluding the root) corresponds to one dimension of the tensor. Each path from the root to a leaf node represents a tensor coordinate and its corresponding value. Which dimension each level of the tree corresponds to is determined by the storage order of the tensor's dimensions.
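To make this concrete, consider a hypothetical 2×3 matrix with non-zeros A(0,1) = 5.0, A(1,0) = 2.0, and A(1,2) = 7.0, stored with rows as the first tree level:

root
├── row 0
│   └── col 1 → 5.0
└── row 1
    ├── col 0 → 2.0
    └── col 2 → 7.0

Each root-to-leaf path, such as (row 0, col 1) → 5.0, spells out the coordinate and value of one stored component.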

TACO uses a novel scheme that can describe different storage formats for any tensor by specifying the order in which the tensor's dimensions are stored and whether each dimension is sparse or dense. A sparse dimension stores only the subset of the dimension that contains non-zero values, and is conceptually similar to the index arrays used in the compressed sparse row (CSR) matrix format, while a dense dimension stores both zeros and non-zeros. As shown below, this scheme is flexible enough to express many commonly used matrix storage formats.

You can define a new tensor storage format by creating a taco::Format object. The taco::Format constructor takes as arguments a vector that specifies the type of each dimension and (optionally) a vector that specifies the storage order of the dimensions, following the scheme described above:

Format   dm({Dense,Dense});           // (Row-major) dense matrix

Format  csr({Dense,Sparse});          // Compressed sparse row matrix

Format  csc({Dense,Sparse}, {1,0});   // Compressed sparse column matrix

Format dcsc({Sparse,Sparse}, {1,0});  // Doubly compressed sparse column matrix

Alternatively, you can define a tensor format that contains only sparse or only dense dimensions, as follows:

Format csf(Sparse);  // Compressed sparse fiber tensor
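A format declared from a single dimension type applies that type to every dimension of whatever tensor uses it, so an order-3 tensor could be declared with the csf format above (a minimal sketch; the dimensions here are arbitrary):

Tensor<double> T("T", {1024,1024,1024}, csf);  // Order-3 CSF tensor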

Initializing Tensors

A taco::Tensor can be initialized by calling the insert method to add non-zero components to the tensor. The insert method takes two arguments: a vector specifying the coordinates of the non-zero component to be added, and the value to insert at that coordinate:

A.insert({128,32,1024}, 42.0);  // A(128,32,1024) = 42.0

The insert method adds the inserted non-zeros to a temporary buffer. Before a tensor can actually be used in a computation, the pack method must be called to compress the tensor into the storage format that was specified when the tensor was first declared:

A.pack();  // Construct dense-sparse-sparse tensor containing inserted non-zeros
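Putting declaration, insertion, and packing together, a small end-to-end sketch (with hypothetical values) looks like this:

// Declare a 3x3 CSR matrix, insert two non-zeros, and pack it
Tensor<double> B("B", {3,3}, Format({Dense,Sparse}));
B.insert({0,1}, 1.0);  // B(0,1) = 1.0
B.insert({2,2}, 2.0);  // B(2,2) = 2.0
B.pack();              // compress into the declared CSR format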

Loading Tensors from Files

Rather than manually calling insert and pack to initialize a tensor, you can load a tensor directly from a file by calling taco::read, as shown below:

// Load a dense-sparse-sparse tensor from file A.tns

A = read("A.tns", Format({Dense, Sparse, Sparse}));

By default, taco::read returns a packed tensor. You can optionally pass a Boolean flag as an argument to indicate whether the returned tensor should be packed:

// Load an unpacked tensor from file A.tns

A = read("A.tns", Format({Dense, Sparse, Sparse}), false);
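Loading a tensor unpacked is useful when you want to add more non-zeros before compressing it. A sketch, reusing only the calls shown above (the inserted coordinate is arbitrary):

// Insert an additional non-zero into the unpacked tensor, then pack it
A.insert({0,0,0}, 1.0);
A.pack();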

TACO currently supports loading from the following matrix and tensor file formats:

Matrix Market (Coordinate) Format (.mtx)

Rutherford-Boeing Format (.rb)

FROSTT Format (.tns)

Writing Tensors to Files

A (packed) tensor can also be written directly to a file by calling taco::write, as shown below:

write("A.tns", A);  // Write tensor A to file A.tns

taco::write supports the same set of matrix and tensor file formats as taco::read.
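Together, taco::read and taco::write can be used to convert between the supported file formats; for example, a minimal sketch that converts a Matrix Market file to the Rutherford-Boeing format (the file names are hypothetical):

// Read a CSR matrix from a Matrix Market file and write it back out
// in the Rutherford-Boeing format
Tensor<double> B = read("B.mtx", Format({Dense,Sparse}));
write("B.rb", B);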

Scientific Computing: SpMV

Sparse matrix-vector multiplication (SpMV) is a bottleneck computation in many scientific and engineering applications. Mathematically, SpMV can be expressed as y = Ax + z, where A is a sparse matrix and x, y, and z are dense vectors. The computation can also be expressed in index notation as

y(i) = Σ_j A(i,j) · x(j) + z(i)

SpMV can be computed easily and efficiently with the TACO C++ API as follows:

// On Linux and macOS, you can compile and run this program like so:

//   g++ -std=c++11 -O3 -DNDEBUG -DTACO -I ../../include -L../../build/lib spmv.cpp -o spmv -ltaco

//   LD_LIBRARY_PATH=../../build/lib ./spmv

#include <random>

#include "taco.h"

using namespace taco;

int main(int argc, char* argv[]) {

  std::default_random_engine gen(0);

  std::uniform_real_distribution<double> unif(0.0, 1.0);

  // Predeclare the storage formats that the inputs and output will be stored as.

  // To define a format, you must specify whether each dimension is dense or sparse

  // and (optionally) the order in which dimensions should be stored. The formats

  // declared below correspond to compressed sparse row (csr) and dense vector (dv).

  Format csr({Dense,Sparse});

  Format  dv({Dense});

  // Load a sparse matrix from file (stored in the Matrix Market format) and

  // store it as a compressed sparse row matrix. Matrices correspond to order-2

  // tensors in taco. The matrix in this example can be downloaded from:

  // https://www.cise.ufl.edu/research/sparse/MM/Boeing/pwtk.tar.gz

  Tensor<double> A = read("pwtk.mtx", csr);

  // Generate a random dense vector and store it in the dense vector format.

  // Vectors correspond to order-1 tensors in taco.

  Tensor<double> x({A.getDimension(1)}, dv);

  for (int i = 0; i < x.getDimension(0); ++i) {

    x.insert({i}, unif(gen));

  }

  x.pack();

  // Generate another random dense vector and store it in the dense vector format.

  Tensor<double> z({A.getDimension(0)}, dv);

  for (int i = 0; i < z.getDimension(0); ++i) {

    z.insert({i}, unif(gen));

  }

  z.pack();

 

  // Declare and initialize the scaling factors in the SpMV computation.

  // Scalars correspond to order-0 tensors in taco.

  Tensor<double> alpha(42.0);

  Tensor<double> beta(33.0);

  // Declare the output to be a dense vector with the same number of rows as

  // the input matrix A, stored in the dense vector format.

  Tensor<double> y({A.getDimension(0)}, dv);

  // Define the SpMV computation using index notation.

  IndexVar i, j;

  y(i) = alpha() * (A(i,j) * x(j)) + beta() * z(i);

  // At this point, we have defined how entries in the output vector should be

  // computed from entries in the input matrix and vectors, but have not actually

  // performed the computation yet. To do so, we must first tell taco to generate

  // code that can be executed to compute the SpMV operation.

  y.compile();

  // We can now call the functions taco generated to assemble the indices of the

  // output vector and then actually compute the SpMV.

  y.assemble();

  y.compute();

  // Write the output of the computation to file (stored in the FROSTT format).

  write("y.tns", y);

}

You can also use the TACO Python API to perform the same computation, as demonstrated here:

import pytaco as pt

from pytaco import compressed, dense

import numpy as np

# Define formats for storing the sparse matrix and dense vectors

csr = pt.format([dense, compressed])

dv  = pt.format([dense])

# Load a sparse matrix stored in the Matrix Market format and store it

# as a CSR matrix.  The matrix in this example can be downloaded from:

# https://www.cise.ufl.edu/research/sparse/MM/Boeing/pwtk.tar.gz

A = pt.read("pwtk.mtx", csr)

# Generate two random vectors using NumPy and pass them into TACO

x = pt.from_array(np.random.uniform(size=A.shape[1]))

z = pt.from_array(np.random.uniform(size=A.shape[0]))

# Declare the result to be a dense vector

y = pt.tensor([A.shape[0]], dv)

# Declare index vars

i, j = pt.get_index_vars(2)

# Define the SpMV computation

y[i] = A[i, j] * x[j] + z[i]

# Perform the SpMV computation and write the result to file

pt.write("y.tns", y)

When the above Python program is run, TACO generates code under the hood that performs the whole computation efficiently in a single pass. This lets TACO avoid materializing the intermediate matrix-vector product, which reduces the number of memory accesses and speeds up the computation.
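To make the fusion concrete, the kernel TACO emits for this expression has roughly the following shape (a simplified sketch, assuming hypothetical CSR-style arrays named A_pos, A_crd, and A_vals; the actual generated code differs in its details):

// A simplified sketch of a fused CSR SpMV kernel: y(i) = sum_j A(i,j)*x(j) + z(i)
void spmv_fused(int m, const int* A_pos, const int* A_crd, const double* A_vals,
                const double* x, const double* z, double* y) {
  for (int i = 0; i < m; i++) {
    double sum = 0.0;
    // Iterate over the non-zeros of row i via the CSR position/coordinate arrays
    for (int p = A_pos[i]; p < A_pos[i + 1]; p++) {
      sum += A_vals[p] * x[A_crd[p]];
    }
    y[i] = sum + z[i];  // add z(i) in the same pass; no temporary vector is formed
  }
}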

 

References

http://tensor-compiler.org/docs/tensors.html
