Reading Images with PIL's Image
```python
from PIL import Image

img_path = 'data/1.jpg'
img = Image.open(img_path)
img.show()
```
A PNG image has four channels: RGB plus alpha, where alpha is the transparency (alpha=0 is fully transparent, alpha=255 is fully opaque). Therefore, call image = image.convert('RGB') to convert the image to RGB and keep only its color channels.
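A minimal sketch of this conversion; the path 'data/2.png' is an assumed example, any PNG with an alpha channel works:

```python
from PIL import Image

# 'data/2.png' is an assumed example path for a PNG with an alpha channel
image = Image.open('data/2.png')
print(image.mode)             # 'RGBA' for a four-channel PNG
image = image.convert('RGB')  # drop the alpha channel, keep the color channels
print(image.mode)             # 'RGB'
```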
Reading Files from a Directory
```python
import os
from PIL import Image

dir_path = 'data/train'
path_list = os.listdir(dir_path)
path = os.path.join(dir_path, path_list[0])
img = Image.open(path)
img.show()
```
Using TensorBoard
```python
from torch.utils.tensorboard import SummaryWriter
import numpy as np

writer = SummaryWriter('logs')
writer.add_image('example', np.array(img), dataformats='HWC')  # add_image needs a tensor or ndarray, not a PIL image
writer.add_scalar('test', 0.5, 1)   # tag, value, global step
writer.add_graph(net, input)        # net and input are the model and a sample input defined elsewhere
writer.close()
```
After creating the writer object, run the following command in a terminal:
```bash
tensorboard --logdir=logs --port=6008
```
This prints a local URL. The default port is 6006; to avoid port conflicts, you can specify a different one with --port.
writer.add_image() accepts the image as a torch.Tensor, numpy.ndarray, or string (file path).
```python
from PIL import Image
from torch.utils.tensorboard import SummaryWriter
from torchvision import transforms
import numpy as np

img = Image.open('data/1.jpg')
print(type(img))            # <class 'PIL.JpegImagePlugin.JpegImageFile'>

img_array = np.array(img)   # ndarray of shape (H, W, C)

# ToTensor: PIL image / ndarray -> tensor of shape (C, H, W), values scaled to [0, 1]
tensor_trans = transforms.ToTensor()
img_tensor = tensor_trans(img)

# Normalize: output = (input - mean) / std, per channel
trans_norm = transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
img_norm = trans_norm(img_tensor)

# Resize to a fixed size
trans_resize = transforms.Resize((256, 256))
img_resize = trans_resize(img)

# Compose: chain several transforms
trans_compose = transforms.Compose([transforms.Resize((256, 256)), transforms.ToTensor()])
img_compose = trans_compose(img)

# RandomCrop: randomly crop a 256x256 patch
trans_crop = transforms.RandomCrop(256)
img_crop = trans_crop(img)

writer = SummaryWriter('logs')
writer.add_image('example', img_array, 1, dataformats='HWC')
writer.close()
```
Loading Data
Reading data: Dataset and DataLoader
Dataset: provides a way to access the data and its labels (a minimal custom Dataset sketch follows this list)
how to get each sample and its label
how many samples there are in total
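As referenced above, a minimal custom Dataset sketch; the directory layout (data/train/&lt;label&gt;/*.jpg, with the folder name used as the label) is an assumption for illustration:

```python
import os
from PIL import Image
from torch.utils.data import Dataset

class MyData(Dataset):
    """Minimal Dataset sketch: one folder per class, folder name used as the label."""

    def __init__(self, root_dir, label_dir):
        self.root_dir = root_dir                        # e.g. 'data/train' (assumed layout)
        self.label_dir = label_dir                      # e.g. 'ants' (assumed class folder)
        self.path = os.path.join(root_dir, label_dir)
        self.img_names = os.listdir(self.path)

    def __getitem__(self, idx):
        # how to get one sample and its label
        img_path = os.path.join(self.path, self.img_names[idx])
        img = Image.open(img_path).convert('RGB')
        label = self.label_dir
        return img, label

    def __len__(self):
        # how many samples there are in total
        return len(self.img_names)
```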
```python
import torchvision

train_data = torchvision.datasets.CIFAR10(root='data', train=True, download=True)
```
DataLoader: provides the data to the network in different forms (batched, shuffled, etc.)
```python
import torchvision
from torch.utils.data import DataLoader

# ToTensor is needed so that the default collate function can batch the samples
test_data = torchvision.datasets.CIFAR10(root='data', train=False, download=True,
                                         transform=torchvision.transforms.ToTensor())
test_loader = DataLoader(test_data, batch_size=64, shuffle=True, num_workers=2, drop_last=True)
img, label = next(iter(test_loader))
```
Building a Neural Network
nn.Module: the base class for all neural network modules
```python
import torch

class MyModel(torch.nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        self.conv1 = torch.nn.Conv2d(3, 32, 3, 1, 1)
        self.conv2 = torch.nn.Conv2d(32, 64, 3, 1, 1)
        self.fc1 = torch.nn.Linear(64 * 8 * 8, 128)   # a 32x32 input is pooled twice: 32 -> 16 -> 8
        self.fc2 = torch.nn.Linear(128, 10)

    def forward(self, x):
        x = torch.nn.functional.relu(self.conv1(x))
        x = torch.nn.functional.max_pool2d(x, 2, 2)
        x = torch.nn.functional.relu(self.conv2(x))
        x = torch.nn.functional.max_pool2d(x, 2, 2)
        x = x.view(-1, 64 * 8 * 8)                    # flatten before the fully connected layers
        x = torch.nn.functional.relu(self.fc1(x))
        x = self.fc2(x)
        return x

net = MyModel()
x = torch.randn(1, 3, 32, 32)
out = net(x)
```
Convolutional Layer
```python
import torch

x = torch.randn(1, 3, 32, 32)
conv1 = torch.nn.Conv2d(3, 32, 3, 1, 1)
out = conv1(x)
print(out.shape)   # torch.Size([1, 32, 32, 32])
```

```python
Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1,
       groups=1, bias=True, padding_mode='zeros')
```
Common parameters
in_channels: number of channels of the input
out_channels: number of channels of the output
kernel_size: size of the convolution kernel
stride: stride of the convolution
padding: padding added to each side of the input (a sketch after the parameter lists shows how stride and padding change the output size)
Other parameters
dilation: spacing between kernel elements (dilated convolution)
groups: number of groups for grouped convolution
bias: whether to add a learnable bias
padding_mode: how the padding values are filled (default 'zeros')
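A small sketch, with assumed shapes, of how stride and padding change the output size, following output size = floor((input + 2*padding - kernel_size) / stride) + 1:

```python
import torch

x = torch.randn(1, 3, 32, 32)

# padding=1 with a 3x3 kernel and stride=1 keeps the spatial size: 32 -> 32
same = torch.nn.Conv2d(3, 16, kernel_size=3, stride=1, padding=1)
print(same(x).shape)    # torch.Size([1, 16, 32, 32])

# stride=2 halves the spatial size: 32 -> 16
halved = torch.nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1)
print(halved(x).shape)  # torch.Size([1, 16, 16, 16])

# no padding shrinks the output: 32 -> 30
valid = torch.nn.Conv2d(3, 16, kernel_size=3, stride=1, padding=0)
print(valid(x).shape)   # torch.Size([1, 16, 30, 30])
```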
Pooling Layer: Max Pooling
```python
import torch

x = torch.randn(1, 3, 32, 32)
pool1 = torch.nn.MaxPool2d(2, 2)
out = pool1(x)
print(out.shape)   # torch.Size([1, 3, 16, 16])
```

```python
MaxPool2d(kernel_size, stride=None, padding=0, dilation=1, return_indices=False, ceil_mode=False)
```
Common parameters
kernel_size: size of the pooling window
stride: stride (defaults to kernel_size)
padding: padding
Other parameters
dilation: spacing between window elements
return_indices: whether to also return the indices of the max values
ceil_mode: whether to round the output size up instead of down (see the sketch after this list)
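A small sketch, with an assumed 5x5 input, showing how ceil_mode changes the output size when the pooling window does not divide the input evenly:

```python
import torch

x = torch.randn(1, 1, 5, 5)

floor_pool = torch.nn.MaxPool2d(kernel_size=3, ceil_mode=False)  # default: drop the incomplete window
ceil_pool = torch.nn.MaxPool2d(kernel_size=3, ceil_mode=True)    # keep the incomplete window

print(floor_pool(x).shape)  # torch.Size([1, 1, 1, 1])
print(ceil_pool(x).shape)   # torch.Size([1, 1, 2, 2])
```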
Non-linear Activation Layer
```python
import torch

x = torch.randn(1, 3, 32, 32)
relu = torch.nn.ReLU()
out = relu(x)
print(out.shape)
```
Linear / Fully Connected Layer
```python
import torch

x = torch.randn(1, 3, 32, 32)
x = torch.flatten(x, 1)                # flatten to (1, 3*32*32) before the linear layer
fc = torch.nn.Linear(3 * 32 * 32, 10)
out = fc(x)
print(out.shape)                       # torch.Size([1, 10])
```
Dropout Layer
```python
import torch

x = torch.randn(1, 3, 32, 32)
drop = torch.nn.Dropout(0.5)
out = drop(x)
print(out.shape)
```
nn.Sequential
```python
import torch

x = torch.randn(1, 3, 32, 32)

model = torch.nn.Sequential(
    torch.nn.Conv2d(3, 32, 3, 1, 1),
    torch.nn.ReLU(),
    torch.nn.MaxPool2d(2, 2),           # (1, 32, 16, 16)
    torch.nn.Conv2d(32, 64, 3, 1, 1),   # the second conv needs 32 input channels, so it is a separate layer
    torch.nn.ReLU(),
    torch.nn.MaxPool2d(2, 2),           # (1, 64, 8, 8)
    torch.nn.Flatten(),                 # flatten before the linear layer
    torch.nn.Linear(64 * 8 * 8, 10),
)

out = model(x)
print(out.shape)                        # torch.Size([1, 10])
```
Loss Functions
L1Loss
L1Loss computes the mean absolute error (or the sum of absolute errors) between the prediction and the target.
MSELoss
MSELoss computes the mean squared error (or the sum of squared errors) between the prediction and the target.
CrossEntropyLoss
CrossEntropyLoss computes the cross-entropy between the prediction and the target and is commonly used for classification; a small sketch of all three losses follows.
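A minimal sketch, with made-up numbers, comparing the three losses:

```python
import torch

pred = torch.tensor([1.0, 2.0, 3.0])
target = torch.tensor([1.0, 2.0, 5.0])

print(torch.nn.L1Loss()(pred, target))    # mean absolute error: (0 + 0 + 2) / 3
print(torch.nn.MSELoss()(pred, target))   # mean squared error: (0 + 0 + 4) / 3

# CrossEntropyLoss takes raw logits of shape (N, C) and class indices of shape (N,)
logits = torch.tensor([[0.1, 0.2, 0.3]])
label = torch.tensor([1])
print(torch.nn.CrossEntropyLoss()(logits, label))
```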
Backpropagation
```python
loss = torch.nn.CrossEntropyLoss()
result_loss = loss(input, label)   # input: model output, label: ground truth
result_loss.backward()             # compute gradients for all parameters
```
Optimizer
```python
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
loss = torch.nn.CrossEntropyLoss()

result_loss = loss(input, label)
print(result_loss.item())

optimizer.zero_grad()    # clear the gradients from the previous step
result_loss.backward()   # backpropagate to compute new gradients
optimizer.step()         # update the parameters
```
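Putting the pieces together, a minimal training-loop sketch; it assumes the MyModel class defined in the nn.Module section and the CIFAR10 dataset used earlier:

```python
import torch
import torchvision
from torch.utils.data import DataLoader

# assumed: MyModel is the network defined in the nn.Module section above
model = MyModel()
train_data = torchvision.datasets.CIFAR10(root='data', train=True, download=True,
                                          transform=torchvision.transforms.ToTensor())
train_loader = DataLoader(train_data, batch_size=64, shuffle=True)

loss_fn = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

for epoch in range(2):                  # small number of epochs for illustration
    for imgs, labels in train_loader:
        outputs = model(imgs)
        loss = loss_fn(outputs, labels)

        optimizer.zero_grad()           # clear old gradients
        loss.backward()                 # backpropagate
        optimizer.step()                # update parameters
    print(f'epoch {epoch}, last batch loss: {loss.item():.4f}')
```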
Parameters of Existing (Pretrained) Models
```python
import torch
import torchvision

# load VGG16 with pretrained weights
vgg16 = torchvision.models.vgg16(pretrained=True)

# three ways to adapt the model to 10 classes:
vgg16.classifier.add_module('fc', torch.nn.Linear(1000, 10))   # append a layer to the classifier
vgg16.add_module('fc', torch.nn.Linear(1000, 10))              # append a layer to the model itself
vgg16.classifier[6] = torch.nn.Linear(4096, 10)                # replace an existing layer

# Method 1: save / load the whole model (structure + parameters)
torch.save(vgg16, 'vgg16.pth')
model = torch.load('vgg16.pth')

# Method 2: save / load only the parameters (state_dict), the recommended way
torch.save(vgg16.state_dict(), 'vgg16.pth')
vgg16 = torchvision.models.vgg16(pretrained=False)
vgg16.load_state_dict(torch.load('vgg16.pth'))   # the architecture must match the saved parameters
```
Computing Accuracy for Classification
```python
import torch

output = torch.tensor([[0.1, 0.2],
                       [0.3, 0.4]])
print(output.argmax(dim=1))            # tensor([1, 1]): predicted class for each sample
pred = output.argmax(dim=1)
target = torch.tensor([0, 1])
print((pred == target).sum().item())   # number of correct predictions: 1
```
train and eval
```python
model.train()   # training mode: Dropout is active, BatchNorm updates its running statistics
model.eval()    # evaluation mode: Dropout is disabled, BatchNorm uses the stored statistics
```
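A short sketch of where the two modes are typically switched; the model and the test_loader from the earlier snippets are assumed:

```python
model.train()           # training phase
# ... training loop here ...

model.eval()            # evaluation phase
total_correct = 0
with torch.no_grad():   # no gradients are needed during evaluation
    for imgs, labels in test_loader:
        outputs = model(imgs)
        total_correct += (outputs.argmax(dim=1) == labels).sum().item()
print('accuracy:', total_correct / len(test_loader.dataset))
```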
Measuring Training Time with the time Module
```python
import time

start = time.time()
# ... training code ...
end = time.time()
print('Training time:', end - start)
```
Using the GPU
Find the network model, the data, and the loss function, then call .cuda() on each of them.
Find the network model, the data, and the loss function, then call .to(device) on each of them (the more common approach).
```python
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
loss.to(device)            # modules are moved in place
model.to(device)
input = input.to(device)   # tensors are not moved in place, so reassign the result
```
autograd
Inspecting Gradients After Backpropagation
```python
import torch

# a minimal setup (assumed for illustration): one linear layer with learnable w and b
x = torch.ones(5)
y = torch.zeros(3)
w = torch.randn(5, 3, requires_grad=True)
b = torch.randn(3, requires_grad=True)
z = torch.matmul(x, w) + b
loss = torch.nn.functional.binary_cross_entropy_with_logits(z, y)

loss.backward()
print(w.grad)
print(b.grad)

z = torch.matmul(x, w) + b
print(z.requires_grad)        # True: z depends on tensors that require gradients

# disable gradient tracking inside torch.no_grad()
with torch.no_grad():
    z = torch.matmul(x, w) + b
print(z.requires_grad)        # False

# or detach a tensor from the computation graph
z = torch.matmul(x, w) + b
z_det = z.detach()
print(z_det.requires_grad)    # False
```
For backward(): if the output is a scalar, no argument is needed; if the output is a tensor (vector), you must pass a gradient argument with the same shape as the output.
```python
import torch

x = torch.tensor([1.0, 2.0, 3.0, 4.0], requires_grad=True)
y = 2 * x
z = y.view(2, 2)
v = torch.tensor([[1.0, 0.1], [0.01, 0.001]], dtype=torch.float)   # same shape as z
z.backward(v)
print(x.grad)   # tensor([2.0000, 0.2000, 0.0200, 0.0020])
```