CSAPP Cache Lab Part B: a general solution for all test cases in under 30 lines

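/* Round x up to a multiple of align (align must be a power of two).
   Not used by transpose_submit below. */
#define ROUND_UP(x, align) (((int)(x) + ((align) - 1)) & ~((align) - 1))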
#define MIN(a, b) (((a) < (b)) ? (a) : (b))
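/* 8 ints = 32 bytes, i.e. one cache line under the lab's reference
   cache parameters (s = 5, E = 1, b = 5). */
#define BLOCK_SZ 8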

void transpose_submit(int M, int N, int A[N][M], int B[M][N])
{
    int i, j;
    for (i = 0; i < N; i += BLOCK_SZ)
    {
        for (j = 0; j < M; j += BLOCK_SZ)
        {
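            /* Clip the block at the matrix edges so that sizes that are
               not multiples of BLOCK_SZ (e.g. 61x67) also work. */
            int R = MIN(N - i, BLOCK_SZ);
            int C = MIN(M - j, BLOCK_SZ);
            /* Local staging buffer holding one block of A. */
            int tmp[BLOCK_SZ][BLOCK_SZ];
            int r, c;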
            /* Pass 1: copy the block from A into tmp, row by row. */
            for (r = 0; r < R; r++)
            {
                for (c = 0; c < C; c++)
                {
                    tmp[r][c] = A[i + r][j + c];
                }
            }
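            /* Pass 2: walk tmp column by column; column c of the block
               becomes row j + c of B, so B is written row by row. */
            for (c = 0; c < C; c++)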
            {
                for (r = 0; r < R; r++)
                {
                    B[j + c][i + r] = tmp[r][c];
                }
            }
        }
    }
}

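Split matrix A into 8x8 blocks. The key step is this: copy each block into an 8x8 local array, then traverse that array column by column and write the block into matrix B. Each column of the local array becomes one row of B, so the writes to B stay sequential within a row. Because R and C are clipped with MIN at the matrix edges, the same code handles sizes that are not multiples of 8.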
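To convince yourself that the edge clipping is correct, a small standalone check can help. The sketch below is only an assumption about how one might test it outside the lab's driver: `transpose_blocked` is a made-up name for a copy of the routine above, and `check_transpose` is a hypothetical helper. It verifies correctness only and does not count cache misses.

```c
#include <assert.h>
#include <stdlib.h>

#define MIN(a, b) (((a) < (b)) ? (a) : (b))
#define BLOCK_SZ 8

/* Copy of the staging-buffer transpose described above.
   The name transpose_blocked is made up for this sketch. */
void transpose_blocked(int M, int N, int A[N][M], int B[M][N])
{
    int tmp[BLOCK_SZ][BLOCK_SZ];
    for (int i = 0; i < N; i += BLOCK_SZ) {
        for (int j = 0; j < M; j += BLOCK_SZ) {
            int R = MIN(N - i, BLOCK_SZ);   /* rows in this block    */
            int C = MIN(M - j, BLOCK_SZ);   /* columns in this block */
            for (int r = 0; r < R; r++)
                for (int c = 0; c < C; c++)
                    tmp[r][c] = A[i + r][j + c];
            for (int c = 0; c < C; c++)
                for (int r = 0; r < R; r++)
                    B[j + c][i + r] = tmp[r][c];
        }
    }
}

/* Hypothetical helper: builds an N x M matrix A, transposes it into
   an M x N matrix B, and returns 1 if B[j][i] == A[i][j] everywhere. */
int check_transpose(int M, int N)
{
    assert(M > 0 && N > 0);
    int (*A)[M] = malloc(sizeof(int[N][M]));
    int (*B)[N] = malloc(sizeof(int[M][N]));
    if (A == NULL || B == NULL) {
        free(A);
        free(B);
        return 0;
    }
    /* Fill A with distinct values and B with a sentinel. */
    for (int i = 0; i < N; i++)
        for (int j = 0; j < M; j++) {
            A[i][j] = i * M + j;
            B[j][i] = -1;
        }
    transpose_blocked(M, N, A, B);
    int ok = 1;
    for (int i = 0; i < N; i++)
        for (int j = 0; j < M; j++)
            if (B[j][i] != A[i][j])
                ok = 0;
    free(A);
    free(B);
    return ok;
}
```

Calling `check_transpose(61, 67)` exercises the clipped blocks on both the right and bottom edges, while `check_transpose(32, 32)` and `check_transpose(64, 64)` cover the sizes that divide evenly by 8.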

posted @ 2022-09-25 15:24  james_ling