[LeetCode] 924. Minimize Malware Spread 最大程度上减少恶意软件的传播

In a network of nodes, each node `i` is directly connected to another node `j` if and only if `graph[i][j] = 1`.

Some nodes initial are initially infected by malware. Whenever two nodes are directly connected and at least one of those two nodes is infected by malware, both nodes will be infected by malware. This spread of malware will continue until no more nodes can be infected in this manner.

Suppose M(initial) is the final number of nodes infected with malware in the entire network, after the spread of malware stops.

We will remove one node from the initial list. Return the node that if removed, would minimize M(initial). If multiple nodes could be removed to minimize M(initial), return such a node with the smallest index.

Note that if a node was removed from the initial list of infected nodes, it may still be infected later as a result of the malware spread.

Example 1:

Input: graph = [[1,1,0],[1,1,0],[0,0,1]], initial = [0,1]
Output: 0

Example 2:

Input: graph = [[1,0,0],[0,1,0],[0,0,1]], initial = [0,2]
Output: 0

Example 3:

Input: graph = [[1,1,1],[1,1,1],[1,1,1]], initial = [1,2]
Output: 1

Note:

1 < graph.length = graph[0].length <= 300
0 <= graph[i][j] == graph[j][i] <= 1
graph[i][i] = 1
1 <= initial.length < graph.length
0 <= initial[i] < graph.length

这道题说是让我们最大程度的减少恶意软件的传播，谈到这个话题，博主最先想到的就是红衣教主周鸿祎，当年的 3Q 大战的场景还历历在目，将杀毒软件完全免费确实是造福了用户。回到本题吧，这里给了一个二维数组 graph，其中 graph[i][j] 的值表示结点i和结点j是否相连，1为相连，0为不连，这就是邻接矩阵啊，已经帮我们建立好了，遍历的时候就可以直接使用了。还给了一个数组 initial，里面是病毒源，所有跟病毒源相连接的结点都会被感染，现在问若从病毒源中去掉哪个结点，会使得感染结点的数量最大程度的减少，若出现平局，则返回结点序号较小的那个。那么实际上这道题的本质还是遍历这个无向图，遍历的方法就有 DFS 和 BFS 两种。这里先来看 BFS 的解法，既然要在病毒源中去掉一个结点，由于不知道该去掉哪个结点，就遍历所有的情况，每次去掉一个不同的结点，然后剩下的病毒源结点就是起点，都排入队列中开始遍历，一般来说迭代的解法不需要用子函数，但这里为了使程序的结构更加清晰，还是使用了子函数，这里的 BFS 遍历有向图的写法就不多解释，差不多都是一样的写法，得到了所有可以被感染的结点个数 cnt 之后，跟全局最小值 mn 比较，假如 cnt 小于 mn，或者二者相等但是当前去掉的结点 num 序号小于 res 时，需要更新 mn 和 res，参见代码如下：
解法一：

class Solution {
public:
    int minMalwareSpread(vector<vector<int>>& graph, vector<int>& initial) {
        int mn = INT_MAX, res = 0;
        unordered_set<int> infected(initial.begin(), initial.end());
        for (int num : initial) {
            infected.erase(num);
            int cnt = helper(graph, infected);
            if (cnt < mn || (cnt == mn && num < res)) {
                mn = cnt;
                res = num;
            }
            infected.insert(num);
        }
        return res;
    }
    int helper(vector<vector<int>>& graph, unordered_set<int> infected) {
        queue<int> q;
        for (int num : infected) q.push(num);
        while (!q.empty()) {
            auto t = q.front(); q.pop();
            for (int i = 0; i < graph[t].size(); ++i) {
                if (graph[t][i] != 1 || infected.count(i)) continue;
                infected.insert(i);
                q.push(i);
            }
        }
        return infected.size();
    }
};

当然也可以使用 DFS 的写法，但是为了避免写两个子函数，需要在递归的子函数中带上当前结点这个参数，那么在调用的时候就不能像 BFS 那么简单了，而是要定义一些变量，比如 cnt，还有 HashSet，而且要对每一个剩余的病毒源结点调用递归函数，其他部分跟上的解法没啥区别，参见代码如下：
解法二：

class Solution {
public:
    int minMalwareSpread(vector<vector<int>>& graph, vector<int>& initial) {
        int mn = INT_MAX, res = 0;
        unordered_set<int> infected(initial.begin(), initial.end());
        for (int num : initial) {
            infected.erase(num);
            int cnt = 0;
            unordered_set<int> visited;
            for (int cur : infected) {
                helper(graph, cur, visited, cnt);
            }
            if (cnt < mn || (cnt == mn && num < res)) {
                mn = cnt;
                res = num;
            }
            infected.insert(num);
        }
        return res;
    }
    void helper(vector<vector<int>>& graph, int cur, unordered_set<int>& visited, int& cnt) {
        if (visited.count(cur)) return;
        visited.insert(cur);
        ++cnt;
        for (int i = 0; i < graph[cur].size(); ++i) {
            if (graph[cur][i] != 1) continue;
            helper(graph, i, visited, cnt);
        }
    }
};

这道题也可以使用联合查找 Union Find 来做，因为 UF 算法可以将属于同一个群组的结点归类，使用一个 root 数组和一个 findRoot 函数，对于同一个群组的结点，调用 findRoot 函数会得到相同的祖先结点。这里还需要使用两个数组 area 和 malware，其中 area[i] 就表示祖先结点是i的群组中的结点个数，malware[i] 表示祖先结点是i的某个感染群组中的结点个数。首先初始化 root 数组，然后遍历 graph，对于每个 graph[i][j] 为1的位置，建立 root 数组的映射。注意这里跟之前有些不同的是，不能直接 root[i] = j，而是要分别对i和j调用 findRoot 函数，一定要找到i和j所属群组的祖先结点，然后对其建立映射。之后再遍历所有结点，找到每个结点的祖先结点，并在 area 数组中进行累加，同理，遍历所有的感染源结点，找到每个感染源结点的祖先结点，并在 malware 数组中进行累加。新建结果 res 为一个 pair 对儿，分别为去掉某个感染源结点后还会感染的结点总个数和那个去掉的结点序号，初始化为1和0，然后此时遍历所有的感染源结点，需要找到只有一个感染源结点的感染结点群组，想想为什么，因为多个感染源结点相连的话，不管去掉哪个，最终传染的结点个数都是相同的，所以若某个感染群组只有一个感染源的话，去掉这个感染源结点，就会拯救很多结点不被感染，而该群组的结点个数越多，表示去掉该感染源结点的效果越好，这里将结点个数取相反数，然后比较取最小值，得到的就是群结点数最多的情况，假如不存在只有一个感染源结点的群组，那么就返回序号最小的那个感染源结点即可，参见代码如下：
解法三：

class Solution {
public:
    int minMalwareSpread(vector<vector<int>>& graph, vector<int>& initial) {
        int n = graph.size();
        vector<int> root(n), area(n), malware(n), res{1, 0};
        for (int i = 0; i < n; ++i) root[i] = i;
        for (int i = 0; i < n; ++i) {
            for (int j = i + 1; j < n; ++j) {
                if (graph[i][j] == 1) root[findRoot(root, i)] = findRoot(root, j);
            }
        }
        for (int i = 0; i < n; ++i) ++area[findRoot(root, i)];
        for (int i : initial) ++malware[findRoot(root, i)];
        for (int i : initial) {
            res = min(res, {(malware[findRoot(root, i)] == 1 ) * (-area[findRoot(root, i)]), i});
        }
        return res[1];
    }
    int findRoot(vector<int>& root, int i) {
        return i == root[i] ? i : findRoot(root, root[i]);
    }
};

Github 同步地址:

https://github.com/grandyang/leetcode/issues/924

参考资料：

https://leetcode.com/problems/minimize-malware-spread/

https://leetcode.com/problems/minimize-malware-spread/discuss/181116/Java-BFS

https://leetcode.com/problems/minimize-malware-spread/discuss/181129/C%2B%2BPython-Union-Found

https://leetcode.com/problems/minimize-malware-spread/discuss/194464/Java-easy-to-understand-DFS-solution-with-detailed-explanation

[LeetCode All in One 题目讲解汇总(持续更新中...)](https://www.cnblogs.com/grandyang/p/4606334.html)