Built entirely from standard ConvNet modules, ConvNeXts compete with Transformers in accuracy and scalability, reaching 87.8% ImageNet top-1 accuracy and outperforming Swin Transformers on COCO detection and ADE20K segmentation, while retaining the simplicity and efficiency of standard ConvNets.
Paper: https://arxiv.org/pdf/2201.03545.pdf
Code: https://github.com/facebookresearch/ConvNeXt
If GitHub is unreachable, the following mirror can be used:
https://gitcode.net/hhhhhhhhhhwwwwwwwwww/ConvNeXt
Key features of ConvNeXt:
- 7×7 convolution kernels. Classic CNNs such as VGG and ResNet rely on small kernels, but ConvNeXt demonstrates that large kernels are effective. The authors tried kernel sizes of 3, 5, 7, 9 and 11: accuracy rose from 79.9% (3×3) to 80.6% (7×7) while the network's FLOPs stayed roughly constant, and the benefit saturated at 7×7.
- The GELU (Gaussian Error Linear Unit) activation. GELU can be viewed as a blend of dropout, zoneout and ReLU: the input is multiplied by a 0/1 mask whose values are sampled stochastically with a probability that depends on the input itself. The original paper reports better results than both ReLU and ELU; a quick numerical check of the formula follows this list.
- LayerNorm instead of BatchNorm.
- An inverted bottleneck. Figures 3(a) to (b) in the paper illustrate the configurations. Although the depthwise convolution layer's FLOPs increase, the shortcut 1×1 convolutions in the downsampling residual blocks shrink markedly, so the change reduces the whole network to 4.6G FLOPs and accuracy edges up from 80.5% to 80.6%. In the ResNet-200/Swin-B regime this step brings a larger gain (81.9% to 82.6%) while also reducing FLOPs.
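As a quick check of the GELU bullet above: the exact form is GELU(x) = x · Φ(x), where Φ is the standard normal CDF, and this is what PyTorch implements by default. A minimal sketch:

import torch
import torch.nn.functional as F

x = torch.linspace(-3, 3, 7)
gelu_exact = x * 0.5 * (1.0 + torch.erf(x / 2 ** 0.5))  # x * Φ(x)
print(torch.allclose(gelu_exact, F.gelu(x), atol=1e-6))  # expected: True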
The ConvNeXt residual block
The residual block is the core of the whole model, shown in the figure below:
Implementation:
import torch
import torch.nn as nn
from timm.models.layers import DropPath

class Block(nn.Module):
    r""" ConvNeXt Block. There are two equivalent implementations:
    (1) DwConv -> LayerNorm (channels_first) -> 1x1 Conv -> GELU -> 1x1 Conv; all in (N, C, H, W)
    (2) DwConv -> Permute to (N, H, W, C); LayerNorm (channels_last) -> Linear -> GELU -> Linear; Permute back
    We use (2) as we find it slightly faster in PyTorch

    Args:
        dim (int): Number of input channels.
        drop_path (float): Stochastic depth rate. Default: 0.0
        layer_scale_init_value (float): Init value for Layer Scale. Default: 1e-6.
    """
    def __init__(self, dim, drop_path=0., layer_scale_init_value=1e-6):
        super().__init__()
        self.dwconv = nn.Conv2d(dim, dim, kernel_size=7, padding=3, groups=dim)  # depthwise conv
        self.norm = LayerNorm(dim, eps=1e-6)
        self.pwconv1 = nn.Linear(dim, 4 * dim)  # pointwise/1x1 convs, implemented with linear layers
        self.act = nn.GELU()
        self.pwconv2 = nn.Linear(4 * dim, dim)
        self.gamma = nn.Parameter(layer_scale_init_value * torch.ones((dim)),
                                  requires_grad=True) if layer_scale_init_value > 0 else None
        self.drop_path = DropPath(drop_path) if drop_path > 0. else nn.Identity()

    def forward(self, x):
        input = x
        x = self.dwconv(x)
        x = x.permute(0, 2, 3, 1)  # (N, C, H, W) -> (N, H, W, C)
        x = self.norm(x)
        x = self.pwconv1(x)
        x = self.act(x)
        x = self.pwconv2(x)
        if self.gamma is not None:
            x = self.gamma * x
        x = x.permute(0, 3, 1, 2)  # (N, H, W, C) -> (N, C, H, W)
        x = input + self.drop_path(x)
        return x
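Note that the Block above uses a LayerNorm class that is not nn.LayerNorm: the official repository defines a variant that can also normalize channels_first (N, C, H, W) tensors. A minimal sketch along the lines of the official implementation:

import torch
import torch.nn as nn
import torch.nn.functional as F

class LayerNorm(nn.Module):
    """LayerNorm over the channel dim, for channels_last (N, H, W, C) or channels_first (N, C, H, W)."""
    def __init__(self, normalized_shape, eps=1e-6, data_format="channels_last"):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(normalized_shape))
        self.bias = nn.Parameter(torch.zeros(normalized_shape))
        self.eps = eps
        self.data_format = data_format
        self.normalized_shape = (normalized_shape,)

    def forward(self, x):
        if self.data_format == "channels_last":
            return F.layer_norm(x, self.normalized_shape, self.weight, self.bias, self.eps)
        # channels_first: normalize over dim 1 manually
        u = x.mean(1, keepdim=True)
        s = (x - u).pow(2).mean(1, keepdim=True)
        x = (x - u) / torch.sqrt(s + self.eps)
        return self.weight[:, None, None] * x + self.bias[:, None, None]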
Data augmentation: Cutout and Mixup
ConvNeXt's training recipe uses Cutout and Mixup, so I add both augmentations to my code to improve results. The official implementation uses timm for this; I use torchtoolbox instead. Install it with:
pip install torchtoolbox
Cutout is applied in the transforms:
from torchtoolbox.transform import Cutout
import torchvision.transforms as transforms

# preprocessing for training
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    Cutout(),
    transforms.ToTensor(),
    transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
])
Mixup is applied inside the train method. Required import: from torchtoolbox.tools import mixup_data, mixup_criterion
for batch_idx, (data, target) in enumerate(train_loader):
    data, target = data.to(device, non_blocking=True), target.to(device, non_blocking=True)
    data, labels_a, labels_b, lam = mixup_data(data, target, alpha)
    optimizer.zero_grad()
    output = model(data)
    loss = mixup_criterion(criterion, output, labels_a, labels_b, lam)
    loss.backward()
    optimizer.step()
    print_loss = loss.data.item()
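For intuition: mixup_data blends pairs of samples with a weight lam drawn from Beta(alpha, alpha), and mixup_criterion computes the standard mixup loss. Assuming torchtoolbox follows the standard recipe, it is equivalent to:

def mixup_loss(criterion, pred, y_a, y_b, lam):
    # standard mixup objective: convex combination of the losses for both labels
    return lam * criterion(pred, y_a) + (1 - lam) * criterion(pred, y_b)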
Project structure
The project structure, printed with the tree command:
ConvNext_demo
├─data
│  ├─test
│  └─train
│     ├─Black-grass
│     ├─Charlock
│     ├─Cleavers
│     ├─Common Chickweed
│     ├─Common wheat
│     ├─Fat Hen
│     ├─Loose Silky-bent
│     ├─Maize
│     ├─Scentless Mayweed
│     ├─Shepherds Purse
│     ├─Small-flowered Cranesbill
│     └─Sugar beet
├─dataset
│  ├─__init__.py
│  └─dataset.py
├─Model
│  └─convnext.py
├─test1.py
├─test2.py
└─train_connext.py
Dataset
The dataset is the Plant Seedlings classification set with 12 classes in total. Download link:
Link: https://pan.baidu.com/s/1TOLSNj9JE4-MFhU0Yv8TNQ
Extraction code: syng
Create a data folder in the project root, then extract train and test from the downloaded dataset into it, as shown below:
Import the model file
Copy convnext.py from the official repository into the Model folder, as shown:
Install and import the required libraries
The model depends on the timm library; if it is missing, install it with:

pip install timm

Create train_connext.py and import the required packages:
import torch.optim as optim
import torch
import torch.nn as nn
import torch.nn.parallel
import torch.utils.data
import torch.utils.data.distributed
import torchvision.transforms as transforms
from dataset.dataset import SeedlingData
from torch.autograd import Variable
from Model.convnext import convnext_tiny
from torchtoolbox.tools import mixup_data, mixup_criterion
from torchtoolbox.transform import Cutout
Set global parameters
Select the GPU if one is available, and set the learning rate, batch size, number of epochs, and so on.
# global parameters
modellr = 1e-4
BATCH_SIZE = 8
EPOCHS = 300
DEVICE = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
Data preprocessing
The preprocessing here is kept simple; I did not run elaborate experiments, so feel free to add more.
# preprocessing for training
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    Cutout(),
    transforms.ToTensor(),
    transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
])
transform_test = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
])
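A hedged aside, not part of the original setup: since convnext_tiny is loaded with pretrained ImageNet weights below, normalizing with the ImageNet statistics instead of 0.5 may match the pretrained features slightly better:

transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])  # ImageNet mean/std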
Loading the data
Next, create __init__.py and dataset.py under the dataset folder, and write the following code into dataset.py.
The core logic of the code:
Step 1: build a dictionary mapping each class name to an ID, so classes are represented by numbers.
Step 2: in __init__, collect the image paths. The test set is a single flat directory, so it is read directly; the training images live in per-class folders under train, so we first get the class folders and then the individual image paths. The images are then split 7:3 into training and validation sets with sklearn's train_test_split.
Step 3: in __getitem__, read a single image and its label. Some images have a 32-bit depth, so they are converted to RGB on load.
The code:
# coding:utf8
import os
from PIL import Image
from torch.utils import data
from torchvision import transforms as T
from sklearn.model_selection import train_test_split

Labels = {'Black-grass': 0, 'Charlock': 1, 'Cleavers': 2, 'Common Chickweed': 3,
          'Common wheat': 4, 'Fat Hen': 5, 'Loose Silky-bent': 6, 'Maize': 7, 'Scentless Mayweed': 8,
          'Shepherds Purse': 9, 'Small-flowered Cranesbill': 10, 'Sugar beet': 11}

class SeedlingData(data.Dataset):
    def __init__(self, root, transforms=None, train=True, test=False):
        """
        Collect all image paths and split them into train/val/test.
        """
        self.test = test
        self.transforms = transforms
        if self.test:
            imgs = [os.path.join(root, img) for img in os.listdir(root)]
            self.imgs = imgs
        else:
            imgs_labels = [os.path.join(root, img) for img in os.listdir(root)]
            imgs = []
            for imglabel in imgs_labels:
                for imgname in os.listdir(imglabel):
                    imgpath = os.path.join(imglabel, imgname)
                    imgs.append(imgpath)
            trainval_files, val_files = train_test_split(imgs, test_size=0.3, random_state=42)
            if train:
                self.imgs = trainval_files
            else:
                self.imgs = val_files

    def __getitem__(self, index):
        """
        Return one image and its label.
        """
        img_path = self.imgs[index]
        img_path = img_path.replace('\\', '/')  # normalize Windows path separators
        if self.test:
            label = -1
        else:
            labelname = img_path.split('/')[-2]
            label = Labels[labelname]
        data = Image.open(img_path).convert('RGB')  # some images are 32-bit, so convert
        data = self.transforms(data)
        return data, label

    def __len__(self):
        return len(self.imgs)
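A quick sanity check of the class (assuming the data layout and the transform defined above):

dataset_check = SeedlingData('data/train', transforms=transform, train=True)
print(len(dataset_check))   # 70% of the training images
img, label = dataset_check[0]
print(img.shape, label)     # torch.Size([3, 224, 224]) and an int in 0..11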
Then call SeedlingData in train_connext.py to load the data; remember to import it from the dataset.py we just wrote (from dataset.dataset import SeedlingData).
# build the datasets
dataset_train = SeedlingData('data/train', transforms=transform, train=True)
dataset_test = SeedlingData("data/train", transforms=transform_test, train=False)
# wrap them in DataLoaders
train_loader = torch.utils.data.DataLoader(dataset_train, batch_size=BATCH_SIZE, shuffle=True)
test_loader = torch.utils.data.DataLoader(dataset_test, batch_size=BATCH_SIZE, shuffle=False)
Configure the model
- Set the loss function to nn.CrossEntropyLoss().
- Set the model to convnext_tiny and replace the final fully connected layer (the head) with a 12-way output (the number of classes).
- Use the Adam optimizer.
- Use cosine annealing as the learning-rate schedule.
# instantiate the model and move it to the GPU
criterion = nn.CrossEntropyLoss()
# criterion = SoftTargetCrossEntropy()
model_ft = convnext_tiny(pretrained=True)
num_ftrs = model_ft.head.in_features
model_ft.head = nn.Linear(num_ftrs, 12)  # replace the classifier head with a 12-way output
model_ft.to(DEVICE)
# plain Adam with a low learning rate
optimizer = optim.Adam(model_ft.parameters(), lr=modellr)
cosine_schedule = optim.lr_scheduler.CosineAnnealingLR(optimizer=optimizer, T_max=20, eta_min=1e-9)
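To confirm the head replacement worked, a quick shape check (a sketch reusing the objects defined above):

with torch.no_grad():
    dummy = torch.randn(1, 3, 224, 224).to(DEVICE)
    print(model_ft(dummy).shape)  # expect torch.Size([1, 12])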
Define the training and validation functions
alpha = 0.2 is the parameter Mixup needs.
# training, with alpha=0.2 for Mixup
alpha = 0.2

def train(model, device, train_loader, optimizer, epoch):
    model.train()
    sum_loss = 0
    total_num = len(train_loader.dataset)
    print(total_num, len(train_loader))
    for batch_idx, (data, target) in enumerate(train_loader):
        data, target = data.to(device, non_blocking=True), target.to(device, non_blocking=True)
        data, labels_a, labels_b, lam = mixup_data(data, target, alpha)
        optimizer.zero_grad()
        output = model(data)
        loss = mixup_criterion(criterion, output, labels_a, labels_b, lam)
        loss.backward()
        optimizer.step()
        print_loss = loss.data.item()
        sum_loss += print_loss
        if (batch_idx + 1) % 10 == 0:
            print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                epoch, (batch_idx + 1) * len(data), len(train_loader.dataset),
                100. * (batch_idx + 1) / len(train_loader), loss.item()))
    ave_loss = sum_loss / len(train_loader)
    print('epoch:{},loss:{}'.format(epoch, ave_loss))

ACC = 0

# validation
def val(model, device, test_loader):
    global ACC
    model.eval()
    test_loss = 0
    correct = 0
    total_num = len(test_loader.dataset)
    print(total_num, len(test_loader))
    with torch.no_grad():
        for data, target in test_loader:
            data, target = Variable(data).to(device), Variable(target).to(device)
            output = model(data)
            loss = criterion(output, target)
            _, pred = torch.max(output.data, 1)
            correct += torch.sum(pred == target)
            print_loss = loss.data.item()
            test_loss += print_loss
        correct = correct.data.item()
        acc = correct / total_num
        avgloss = test_loss / len(test_loader)
        print('\nVal set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'.format(
            avgloss, correct, len(test_loader.dataset), 100 * acc))
        if acc > ACC:
            # `epoch` is read from the training loop below
            torch.save(model_ft, 'model_' + str(epoch) + '_' + str(round(acc, 3)) + '.pth')
            ACC = acc

# training loop
for epoch in range(1, EPOCHS + 1):
    train(model_ft, DEVICE, train_loader, optimizer, epoch)
    cosine_schedule.step()
    val(model_ft, DEVICE, test_loader)
Then training can begin.
About 10 epochs are enough to reach decent accuracy:
Testing
Method 1
The test set directory is laid out as shown below:
Step 1: define the classes. The order must match the class order used during training; do not change it!
classes = ('Black-grass', 'Charlock', 'Cleavers', 'Common Chickweed',
           'Common wheat', 'Fat Hen', 'Loose Silky-bent',
           'Maize', 'Scentless Mayweed', 'Shepherds Purse', 'Small-flowered Cranesbill', 'Sugar beet')
Step 2: define the transforms, identical to the validation transforms; do not apply augmentation here.
transform_test = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
])
Step 3: load the model and move it to DEVICE.
DEVICE = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model = torch.load("model_8_0.971.pth")
model.eval()
model.to(DEVICE)
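A hedged aside: torch.load on a whole pickled model only works if the original class definitions (Model/convnext.py here) are importable at load time. Saving just the state_dict is often more robust; a sketch, where 'weights.pth' is a hypothetical file saved with torch.save(model_ft.state_dict(), 'weights.pth') during training:

import torch
from Model.convnext import convnext_tiny  # assumes the official convnext.py sits in Model/

model = convnext_tiny(pretrained=False)
model.head = torch.nn.Linear(model.head.in_features, 12)  # rebuild the 12-way head
model.load_state_dict(torch.load('weights.pth', map_location=DEVICE))
model.to(DEVICE).eval()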
Step 4: read each image and predict its class. Note: load images with PIL's Image, not cv2; the torchvision transforms do not accept cv2 arrays.
path = 'data/test/'
testList = os.listdir(path)
for file in testList:
    img = Image.open(path + file).convert('RGB')  # convert 32-bit images to RGB
    img = transform_test(img)
    img.unsqueeze_(0)
    img = Variable(img).to(DEVICE)
    out = model(img)
    # Predict
    _, pred = torch.max(out.data, 1)
    print('Image Name:{},predict:{}'.format(file, classes[pred.data.item()]))
Full test code:
import os
import torch.utils.data.distributed
import torchvision.transforms as transforms
from PIL import Image
from torch.autograd import Variable

classes = ('Black-grass', 'Charlock', 'Cleavers', 'Common Chickweed',
           'Common wheat', 'Fat Hen', 'Loose Silky-bent',
           'Maize', 'Scentless Mayweed', 'Shepherds Purse', 'Small-flowered Cranesbill', 'Sugar beet')

transform_test = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
])

DEVICE = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model = torch.load("model_8_0.971.pth")
model.eval()
model.to(DEVICE)

path = 'data/test/'
testList = os.listdir(path)
for file in testList:
    img = Image.open(path + file).convert('RGB')  # convert 32-bit images to RGB
    img = transform_test(img)
    img.unsqueeze_(0)
    img = Variable(img).to(DEVICE)
    out = model(img)
    # Predict
    _, pred = torch.max(out.data, 1)
    print('Image Name:{},predict:{}'.format(file, classes[pred.data.item()]))
Output:
Method 2
The second method reads the images through the custom Dataset. The first three steps are the same as above; the difference is in step 4, where SeedlingData is used to load the data.
dataset_test = SeedlingData('data/test/', transform_test, test=True)
print(len(dataset_test))
# labels correspond to the folder names

for index in range(len(dataset_test)):
    img, label = dataset_test[index]
    img.unsqueeze_(0)
    data = Variable(img).to(DEVICE)
    output = model(data)
    _, pred = torch.max(output.data, 1)
    print('Image Name:{},predict:{}'.format(dataset_test.imgs[index], classes[pred.data.item()]))
Output:
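The per-image loop is simple but slow. A batched alternative with a DataLoader (a sketch reusing the objects above; BATCH_SIZE is an assumed value):

BATCH_SIZE = 8
test_loader = torch.utils.data.DataLoader(dataset_test, batch_size=BATCH_SIZE, shuffle=False)
with torch.no_grad():
    for i, (data, _) in enumerate(test_loader):
        output = model(data.to(DEVICE))
        _, preds = torch.max(output, 1)
        for j, p in enumerate(preds):  # shuffle=False keeps order aligned with dataset_test.imgs
            print('Image Name:{},predict:{}'.format(
                dataset_test.imgs[i * BATCH_SIZE + j], classes[p.item()]))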
Full code:
https://download.caogenba.net/download/hhhhhhhhhhwwwwwwwwww/75920884