[Big Data] Week 3 (Advanced)
Question 1
        C -- D -- E
      / |    |    | \
    A   |    |    |   B
      \ |    |    | /
        F -- G -- H

Write the adjacency matrix A, the degree matrix D, and the Laplacian matrix L. For each, find the sum of all entries and the number of nonzero entries. Then identify the true statement from the list below.
Your Answer | Score | Explanation | |
---|---|---|---|
L has 64 nonzero entries. | |||
A has 11 nonzero entries. | |||
L has 8 nonzero entries. | |||
L has 30 nonzero entries. | Correct | 1.00 | |
Total | 1.00 / 1.00 |
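The counts can be checked with a short script. This is a minimal sketch, assuming the 11 edges below are the ones shown in the diagram (A-C, A-F, B-E, B-H, plus the two horizontal rows and three vertical links); the helper names `graph_matrices` and `summarize` are made up for illustration.

```python
# Edge list read off the diagram (an assumption about the drawing).
EDGES = [
    ("A", "C"), ("A", "F"),              # A's two links
    ("C", "D"), ("D", "E"),              # top row
    ("F", "G"), ("G", "H"),              # bottom row
    ("C", "F"), ("D", "G"), ("E", "H"),  # vertical links
    ("E", "B"), ("H", "B"),              # B's two links
]


def graph_matrices(edges):
    """Return (adjacency, degree, Laplacian) as lists of lists."""
    nodes = sorted({v for edge in edges for v in edge})
    idx = {v: i for i, v in enumerate(nodes)}
    n = len(nodes)
    adj = [[0] * n for _ in range(n)]
    for u, v in edges:
        adj[idx[u]][idx[v]] = 1
        adj[idx[v]][idx[u]] = 1
    deg = [[sum(adj[i]) if i == j else 0 for j in range(n)] for i in range(n)]
    lap = [[deg[i][j] - adj[i][j] for j in range(n)] for i in range(n)]
    return adj, deg, lap


def summarize(matrix):
    """Return (sum of all entries, number of nonzero entries)."""
    flat = [x for row in matrix for x in row]
    return sum(flat), sum(1 for x in flat if x != 0)


adj, deg, lap = graph_matrices(EDGES)
summary = {name: summarize(m) for name, m in (("A", adj), ("D", deg), ("L", lap))}
print(summary)
```

Under that reading of the graph, A has 22 nonzero entries (two per edge), D has 8 (one per node), and L has 22 + 8 = 30, which agrees with the answer marked correct.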
Question 2
        2 ---- 6
       / \     |
      1   4    |
       \ / \   |
        3   5
The goal is to find two clusters in this graph using Spectral Clustering on the Laplacian matrix. Compute the Laplacian of this graph. Then compute the second eigenvector of the Laplacian (the one corresponding to the second smallest eigenvalue).
To cluster the points, we decide to split at the mean value. We say that a node is a tie if its value in the eigenvector is exactly equal to the mean value. Let's assume that if a point is a tie, we choose its cluster at random. Identify the true statement from the list below.
Your Answer | Score | Explanation | |
---|---|---|---|
4 and 6 can either be in the same cluster or in different clusters (depending on randomness) | Correct | 1.00 | |
5 and 6 can either be in the same cluster or in different clusters (depending on randomness) | |||
3 and 5 can either be in the same cluster or in different clusters (depending on randomness) | |||
3 is a tie | |||
Total | 1.00 / 1.00 |
Read more: the eigenvalues and eigenvectors of the Laplacian can be computed with WolframAlpha: http://www.wolframalpha.com/input/?i=eigenvectors%7B%7B2%2C-1%2C-1%2C0%2C0%2C0%7D%2C%7B-1%2C3%2C0%2C-1%2C0%2C-1%7D%2C%7B-1%2C0%2C2%2C-1%2C0%2C0%7D%2C%7B0%2C-1%2C-1%2C3%2C-1%2C0%7D%2C%7B0%2C0%2C0%2C-1%2C2%2C-1%7D%2C%7B0%2C-1%2C0%2C0%2C-1%2C2%7D%7D
The eigenvector for the second smallest eigenvalue is v5: [-1, 0, -1, 0, 1, 1]. The mean of its entries is 0.
Therefore 1 and 3 are in one cluster and 5 and 6 are in the other, while 2 and 4 are ties (their entries equal the mean), so which cluster each of them joins depends on the random choice.
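The reasoning above can be checked without an eigen-solver. The sketch below assumes the edge list read off the diagram (1-2, 1-3, 2-4, 2-6, 3-4, 4-5, 5-6) and takes the quoted vector as given. It only verifies that L v = λ v with λ = 1 and then splits at the mean; it does not prove that 1 is the second smallest eigenvalue.

```python
# Edges read off the Question 2 diagram (an assumption about the drawing).
EDGES = [(1, 2), (1, 3), (2, 4), (2, 6), (3, 4), (4, 5), (5, 6)]
N = 6


def laplacian(edges, n):
    """Build L = D - A for an undirected graph with nodes 1..n."""
    lap = [[0] * n for _ in range(n)]
    for u, v in edges:
        i, j = u - 1, v - 1
        lap[i][j] -= 1
        lap[j][i] -= 1
        lap[i][i] += 1
        lap[j][j] += 1
    return lap


L = laplacian(EDGES, N)
v = [-1, 0, -1, 0, 1, 1]   # vector quoted from WolframAlpha
lam = 1                    # eigenvalue it is paired with (checked below)

# Check the eigenpair: L v should equal lam * v entry by entry.
Lv = [sum(L[i][j] * v[j] for j in range(N)) for i in range(N)]
is_eigenpair = Lv == [lam * x for x in v]

# Split at the mean; nodes exactly at the mean are ties.
mean = sum(v) / N
below = [i + 1 for i, x in enumerate(v) if x < mean]
above = [i + 1 for i, x in enumerate(v) if x > mean]
ties = [i + 1 for i, x in enumerate(v) if x == mean]
print(is_eigenpair, below, above, ties)
```

With this input the split gives {1, 3} below the mean, {5, 6} above it, and nodes 2 and 4 as ties, so any statement pairing node 2 or node 4 with another node depends on the random choice.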
Question 3
For our estimate of the surprise number, we shall choose three timestamps at random, and estimate the surprise number from each, using the AMS approach (length of the stream times 2m-1, where m is the number of occurrences of the element of the stream at that timestamp, considering all times from that timestamp on, to the current time). Then, our estimate will be the median of the three resulting values.
You should discover the simple rules that determine the estimate derived from any given timestamp and from any set of three timestamps. Then, identify from the list below the set of three "random" timestamps that give the closest estimate.
Your Answer | Score | Explanation | |
---|---|---|---|
{22, 42, 62} | Correct | 1.00 | |
{31, 32, 44} | |||
{25, 34, 47} | |||
{14, 35, 42} | |||
Total | 1.00 / 1.00 |
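The AMS rule quoted in the question (stream length times 2m-1) is easy to sketch. The stream itself is not described in this post, so the example below assumes a hypothetical stream of length 75 that cycles through the values 1 to 10 (the element at timestamp t is ((t-1) mod 10) + 1). That setup is an assumption made only to have something to run, and the function names are illustrative.

```python
from collections import Counter


def ams_estimate(stream, t):
    """AMS estimate from 1-based timestamp t: n * (2m - 1), where m counts
    occurrences of stream[t] from time t up to the current time."""
    n = len(stream)
    element = stream[t - 1]
    m = stream[t - 1:].count(element)
    return n * (2 * m - 1)


def median_of_three(stream, timestamps):
    """Median of the three per-timestamp estimates."""
    return sorted(ams_estimate(stream, t) for t in timestamps)[1]


def surprise_number(stream):
    """Exact second moment: sum of squared occurrence counts."""
    return sum(c * c for c in Counter(stream).values())


# Hypothetical stream (assumption): values 1..10 cycling, current time 75.
stream = [(t - 1) % 10 + 1 for t in range(1, 76)]
exact = surprise_number(stream)

candidates = [(22, 42, 62), (31, 32, 44), (25, 34, 47), (14, 35, 42)]
estimates = {c: median_of_three(stream, c) for c in candidates}
closest = min(candidates, key=lambda c: abs(estimates[c] - exact))
print(exact, estimates, closest)
```

Under that assumed stream the exact surprise number is 565, the set {22, 42, 62} yields a median of 525, and the other three sets yield 675, so {22, 42, 62} is the closest.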
Question 4
A set of four of the elements 1 through 10 could give an estimate that is exact (if the estimate is 4), or too high, or too low. You should figure out under what circumstances a set of four elements falls into each of those categories. Then, identify in the list below the set of four elements that gives the exactly correct estimate.
Your Answer | Score | Explanation | |
---|---|---|---|
{ 1, 6, 7, 10} | Correct | 1.00 | |
{4, 5, 6, 7} | |||
{3, 4, 8, 10} | |||
{1, 3, 6, 8} | |||
Total | 1.00 / 1.00 |
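The question text here omits the hashing setup, so the following is only an illustration under stated assumptions: a Flajolet-Martin style estimate of 2^R, where R is the longest run of trailing zeros among the 4-bit hash values, with a hypothetical hash h(x) = (3x + 7) mod 11. The hash function, the 4-bit width, and the convention that an all-zero value counts as 4 trailing zeros are all assumptions, not taken from this post.

```python
BITS = 4


def h(x):
    """Hypothetical hash function (assumption): (3x + 7) mod 11."""
    return (3 * x + 7) % 11


def tail_length(value, bits=BITS):
    """Number of trailing zero bits; an all-zero value counts as `bits`."""
    if value == 0:
        return bits
    count = 0
    while value % 2 == 0:
        value //= 2
        count += 1
    return count


def fm_estimate(elements):
    """Flajolet-Martin style estimate: 2 ** (max tail length seen)."""
    return 2 ** max(tail_length(h(x)) for x in elements)


candidates = [(1, 6, 7, 10), (4, 5, 6, 7), (3, 4, 8, 10), (1, 3, 6, 8)]
estimates = {c: fm_estimate(c) for c in candidates}
print(estimates)
```

With these assumptions only element 10 hashes to a value with exactly two trailing zeros, while 4 and 5 hash to longer runs. A set therefore gives the exact estimate 4 when it contains 10 but neither 4 nor 5, which matches {1, 6, 7, 10}.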
Question 5
End Time | 100 | 98 | 95 | 92 | 87 | 80 | 65 |
---|---|---|---|---|---|---|---|
Size | 1 | 1 | 2 | 2 | 4 | 8 | 8 |
Note: we are showing timestamps as absolute values, rather than modulo the window size, as DGIM would do.
Suppose that at times 101 through 105, 1's appear in the stream. Compute the set of buckets that would exist in the system at time 105. Then identify one such bucket from the list below. Buckets are represented by pairs (end-time, size).
Your Answer | Score | Explanation | |
---|---|---|---|
(103,1) | |||
(95,4) | |||
(104,2) | Correct | 1.00 | |
(87,8) | |||
Total | 1.00 / 1.00 |
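The bucket updates can be traced with a small simulation of the DGIM merge rule: whenever three buckets of the same size exist, the two oldest are merged into one bucket of double the size that keeps the more recent end time. This is a minimal sketch; the function name `add_one` is made up, and bucket expiry is ignored because the window size is not given here.

```python
def add_one(buckets, t):
    """Return the bucket list after a 1 arrives at time t.

    `buckets` is a list of (end_time, size) pairs, newest first.
    """
    buckets = [(t, 1)] + list(buckets)
    size = 1
    while True:
        same = [i for i, (_, s) in enumerate(buckets) if s == size]
        if len(same) < 3:
            break
        # Merge the two oldest buckets of this size (the last two in `same`).
        i, j = same[-2], same[-1]
        merged = (buckets[i][0], 2 * size)  # keep the newer end time
        buckets[i:j + 1] = [merged]
        size *= 2                            # the merge may cascade upward
    return buckets


# Buckets from the table in the question, newest first.
buckets = [(100, 1), (98, 1), (95, 2), (92, 2), (87, 4), (80, 8), (65, 8)]
for t in range(101, 106):
    buckets = add_one(buckets, t)
print(buckets)
```

At time 105 the merges cascade through every size, leaving (105,1), (104,2), (102,4), (95,8), and (80,16). Of the listed options only (104,2) is among them.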