PyTorch - DataLoader
The DataLoader works with a Dataset object, so to use the DataLoader you first need to wrap your data in a Dataset. For a map-style dataset you only need to implement two magic methods: __getitem__ and __len__. __getitem__ takes an index and returns the (x, y) pair stored at that index. __len__ returns the number of samples in the data. And that's that. [1]
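As a minimal sketch of the two methods described above, the class below wraps a pair of tensors. The name PairDataset and its constructor arguments are hypothetical and chosen only for illustration; they do not come from the original post.

```python
import torch
from torch.utils.data import Dataset


class PairDataset(Dataset):
    """Hypothetical map-style dataset wrapping two tensors of equal length."""

    def __init__(self, inputs, labels):
        assert len(inputs) == len(labels), "inputs and labels must align"
        self.inputs = inputs
        self.labels = labels

    def __getitem__(self, index):
        # Return the (x, y) pair stored at this index.
        return self.inputs[index], self.labels[index]

    def __len__(self):
        # Number of samples in the dataset.
        return len(self.inputs)


pair_dataset = PairDataset(torch.randn(5, 3), torch.randn(5, 3))
x0, y0 = pair_dataset[0]
print(len(pair_dataset), x0.shape, y0.shape)  # 5 torch.Size([3]) torch.Size([3])
```

Any object built this way can be passed directly to a DataLoader, which calls __len__ to know how many indices exist and __getitem__ to fetch each sample.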
How the DataLoader reads data
    import torch

    # Define some sample data
    X = torch.randn(5, 3)  # input
    y = torch.randn(5, 3)  # label
    print(X, y)
Our data looks like this:
    tensor([[-0.5138, -1.7766, -0.6183],
            [ 0.2235,  0.1974,  0.2892],
            [ 1.6249, -0.5768, -1.5081],
            [ 0.5972, -0.1788,  0.7579],
            [ 1.3844, -0.5480, -1.5612]])
    tensor([[-0.5818,  0.1668,  0.5073],
            [-1.7707, -0.2907,  1.4918],
            [ 1.2157, -2.8250, -0.0247],
            [ 0.2748,  0.1086,  1.6052],
            [-0.7613, -1.3326, -0.5267]])
Then we read it back through the DataLoader.
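    from torch.utils.data import DataLoader, TensorDataset

    # `dataset` is not defined in the original snippet; wrapping X and y in a TensorDataset is assumed here.
    dataset = TensorDataset(X, y)
    # batch_size=1: each batch contains exactly one sample
    # shuffle=True: the data is randomly reshuffled at the start of every epoch
    # (the original code passed shuffle=False, but the output shown was produced with shuffling)
    dataloader = DataLoader(dataset, batch_size=1, shuffle=True)
    for i, (batch_x, batch_y) in enumerate(dataloader):
        print(f"Batch {i}: input shape {batch_x}, \n label shape {batch_y}")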
We get the following (note that the f-string prints the tensors themselves, not their shapes):
    Batch 0: input shape tensor([[-0.5138, -1.7766, -0.6183]]),
     label shape tensor([[-0.5818, 0.1668, 0.5073]])
    Batch 1: input shape tensor([[ 0.5972, -0.1788, 0.7579]]),
     label shape tensor([[0.2748, 0.1086, 1.6052]])
    Batch 2: input shape tensor([[ 1.6249, -0.5768, -1.5081]]),
     label shape tensor([[ 1.2157, -2.8250, -0.0247]])
    Batch 3: input shape tensor([[ 1.3844, -0.5480, -1.5612]]),
     label shape tensor([[-0.7613, -1.3326, -0.5267]])
    Batch 4: input shape tensor([[0.2235, 0.1974, 0.2892]]),
     label shape tensor([[-1.7707, -0.2907, 1.4918]])
Do the inputs come out of the batches in the same order they were put in?
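Answer: it depends on shuffle. With shuffle=True the order differs, because the data is randomly reshuffled. With shuffle=False the samples come out in the same order they went in. In both cases each input stays paired with its own label, as the output above shows.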
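The ordering behaviour can be checked with a small sketch. It assumes a TensorDataset whose inputs are simply the sample indices, so the read order is visible directly. The names order_dataset and read_order are hypothetical helpers, not part of the original post.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Use each sample's index as its input so the read order is easy to see.
indices = torch.arange(5)
labels = indices * 10
order_dataset = TensorDataset(indices, labels)


def read_order(loader):
    """Return the sample indices in the order the loader yields them."""
    return [int(i) for batch_x, _ in loader for i in batch_x]


sequential = DataLoader(order_dataset, batch_size=1, shuffle=False)
# A seeded generator is passed only to make the shuffling repeatable across runs.
shuffled = DataLoader(order_dataset, batch_size=1, shuffle=True,
                      generator=torch.Generator().manual_seed(0))

print(read_order(sequential))  # [0, 1, 2, 3, 4]
first_epoch = read_order(shuffled)
second_epoch = read_order(shuffled)
# Each epoch is a permutation of 0..4; the two epochs are drawn independently.
print(first_epoch, second_epoch)
```

Iterating the shuffled loader twice corresponds to two epochs, and a new permutation is drawn for each one.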
Author: Kane. If you repost this article, please cite the original link: https://www.cnblogs.com/hackerk/p/18109127