目标检测算法再升级！YOLOv8保姆级教程一键体验-电子发烧友网

YOLO作为一种基于图像全局信息进行预测的目标检测系统，始终保持着极高的迭代更新率，从YOLOv5到YOLOv8，本次升级主要包括结构算法、命令行界面、Python API等。具体到YOLOv8，它可以在大型数据集上进行训练，并且能够在各种硬件平台上运行；YOLOv8还有一个关键特性是它的可扩展性，由于其被设计成一个框架，支持所有以前YOLO的版本，使得在不同版本之间切换和比较它们的性能变得容易。

本次内容《目标检测算法再升级！YOLOv8保姆级教程一键体验》，地平线开发者社区优秀开发者林松将会一步步引导大家在地平线旭日®X3派（下文简称旭日X3派）成功部署YOLOv8目标检测模型，并附上精度速度初探！相关问题欢迎大家注册加入地平线开发者社区交流讨论，配置文件及代码详见地平线开发者社区。

环境配置

本文所使用的脚本和代码目录结构和说明如下：

├── project # X3 工作目录 │ ├── calib_f32 # 量化校准数据集 │ ├── coco128 # 量化校准和待检测图片 │ ├── config.yaml # onnx 转 bin 模型配置 │ ├── modules.py -> ../ultralytics/ultralytics/nn/modules.py # 软链接 YOLOv8 后处理文件 │ ├── onnxruntime-infer.py # pc 端读取 onnx 并检测 │ ├── requirements.txt # python 依赖包 │ ├── step1_export_onnx.py # YOLOv8 ONNX 导出 │ ├── step2_make_calib.py # 制作量化校准数据集 │ ├── step3_convert_bin.sh # onnx 转 bin 脚本 │ ├── step4_inference.py # X3 推理代码 │ ├── yolo-comparison-plots.png # YOLO 模型对比图 │ ├── yolov8n.onnx # 转换好的 onnx │ ├── yolov8n.pt # YOLOv8 pytorch 权重 │ └── yolov8n_horizon.bin # 转换好的 bin 模型 ├── ultralytics # YOLOv8 仓库 │ ├── CITATION.cff │ ├── CONTRIBUTING.md │ ├── LICENSE │ ├── MANIFEST.in │ ├── README.md │ ├── README.zh-CN.md │ ├── docker │ ├── docs │ ├── examples │ ├── mkdocs.yml │ ├── requirements.txt │ ├── setup.cfg │ ├── setup.py │ ├── tests │ └── ultralytics

YOLOv8 PyTorch环境配置

请在开发机中导出ONNX模型，安装PyTorch ONNX等依赖，再安装YOLOv8：

cd project python3 -m pip install -r requirements.txt cd ../ultralytics python3 setup.py install cd ../project

模型导出

修改YOLOv8后处理代码

将YOLOv8中ultralytics/ultralytics/nn/modules.py软链接到 project/modules.py，方便定位到修改的代码位置，其中中有两个trick：

# *************************************************************************************************************** # # *************************************************************************************************************** # # 原仓库的版本带后处理注释掉！！！！ # def forward(self, x): # shape = x[0].shape # BCHW # for i in range(self.nl): # x[i] = torch.cat((self.cv2[i](x[i]), self.cv3[i](x[i])), 1) # if self.training: # return x # elif self.dynamic or self.shape != shape: # self.anchors, self.strides = (x.transpose(0, 1) for x in make_anchors(x, self.stride, 0.5)) # self.shape = shape # # box, cls = torch.cat([xi.view(shape[0], self.no, -1) for xi in x], 2).split((self.reg_max * 4, self.nc), 1) # dbox = dist2bbox(self.dfl(box), self.anchors.unsqueeze(0), xywh=True, dim=1) * self.strides # y = torch.cat((dbox, cls.sigmoid()), 1) # return y if self.export else (y, x) # *************************************************************************************************************** # # *************************************************************************************************************** # # X3 部署使用的版本！！！！ def forward(self, x): res = [] for i in range(self.nl): bboxes = self.cv2[i](x[i]).permute(0, 2, 3, 1) scores = self.cv3[i](x[i]).permute(0, 2, 3, 1) res.append(bboxes) res.append(scores) # 返回 tuple 不会导出报错 return tuple(res) # *************************************************************************************************************** # # *************************************************************************************************************** #

导出Transpose(permute)节点

bboxes = self.cv2[i](x[i]).permute(0, 2, 3, 1) scores = self.cv3[i](x[i]).permute(0, 2, 3, 1)

由于旭日X3派支持的模型格式为NHWC，但是PyTorch训练的模型是NCHW，因此我们导出的ONNX模型在转换bin时会在网络头和尾插入Transpose结点，而这个 Transpose节点的顺序是[0, 3, 1, 2]，可以发现与我们插入的[0, 2, 3, 1]节点正好可以抵消，相当与少了个Transpose节点，这样是可以提升模型推理速度，避免不必要的计算的。

将输出处理成 tuple

这步主要是为了让YOLOv8能够顺利导出不报错，如果使用list则会报tulpe的错误。

使用YOLOv8导出的ONNX

执行 step1_export_onnx.py，可以下载官方的权重并导出 ONNX。

# 导入 YOLOv8 from ultralytics import YOLO # 载入预训练权重 model = YOLO("yolov8n.pt") # 指定 opset=11 并且使用 onnx-sim 简化 ONNX success = model.export(format="onnx", opset=11, simplify=True)

python3 step1_export_onnx.py

注意：旭日X3派支持ONNX opset = 10/11，其他版本会无法通过模型工具链编译。

使用ONNXRuntime推理导出ONNX

为了避免导出的ONNX出错，最好使用ONNXRuntime来验证一下模型的正确性。

def letterbox(im, new_shape=(640, 640), color=114): # Resize and pad image while meeting stride-multiple constraints shape = im.shape[:2]

# current shape [height, width] if isinstance(new_shape, int): new_shape = (new_shape, new_shape) # Scale ratio (new / old) r = min(new_shape[0] / shape[0], new_shape[1] / shape[1]) # Compute padding new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r)) dw, dh = new_shape[1] - new_unpad[0], new_shape[0] - new_unpad[1]

# wh padding dw /= 2 # divide padding into 2 sides dh /= 2 if shape[::-1] != new_unpad: # resize im = cv2.resize(im, new_unpad, interpolation=cv2.INTER_LINEAR) top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1)) left, right = int(round(dw - 0.1)), int(round(dw + 0.1)) im = cv2.copyMakeBorder(im, top, bottom, left, right, cv2.BORDER_CONSTANT, value=(color, color, color)) # add border return im, 1 / r, (dw, dh) def ratioresize(im, new_shape=(640, 640), color=114): shape = im.shape[:2]

# current shape [height, width] if isinstance(new_shape, int): new_shape = (new_shape, new_shape) new_h, new_w = new_shape padded_img = np.ones((new_h, new_w, 3), dtype=np.uint8) * color # Scale ratio (new / old) r = min(new_h / shape[0], new_w / shape[1])

# Compute padding new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r)) if shape[::-1] != new_unpad: im = cv2.resize(im, new_unpad, interpolation=cv2.INTER_LINEAR) padded_img[: new_unpad[1], : new_unpad[0]] = im padded_img = np.ascontiguousarray(padded_img) return padded_img, 1 / r, (0, 0)

本文使用的两种图像缩放方法，letterbox是YOLOv8中训练时启用的方法，由于需要四周padding并且后处理要根据padding的数值还原，较为麻烦。使用 ratioresize方法，在保持图像的长宽比例的同时，使用右下角padding避免了后处理计算偏移量。

if __name__ == '__main__': images_path = Path('./coco128') model_path = Path('./yolov8n.onnx') score_thres = 0.4 iou_thres = 0.65 num_classes = 80 try: session = onnxruntime.InferenceSession(str(model_path), providers=['CPUExecutionProvider']) model_h, model_w = session.get_inputs()[0].shape[2:] except Exception as e: print(f'Load model error.\n{e}') exit() else: try: # 预热10次推理 for _ in range(10): session.run(None, {'images': np.random.randn(1, 3, model_h, model_w).astype(np.float32)}) except Exception as e: print(f'Warm up model error.\n{e}') cv2.namedWindow("results", cv2.WINDOW_AUTOSIZE) for img_path in images_path.iterdir(): image = cv2.imread(str(img_path)) t0 = time.perf_counter()

## yolov8 training letterbox # resized, ratio, (dw, dh) = letterbox(image, (model_h, model_w)) resized, ratio, (dw, dh) = ratioresize(image, (model_h, model_w)) buffer = blob(resized) t1 = time.perf_counter() outputs = session.run(None, {'images': buffer}) outputs = [o[0] for o in outputs] t2 = time.perf_counter() results = postprocess( outputs, score_thres, iou_thres, image.shape[0], image.shape[1], dh, dw, ratio, ratio, 16, num_classes) results = nms(*results) t3 = time.perf_counter() for (x0, y0, x1, y1, score, label) in results: x0, y0, x1, y1 = map(int, [x0, y0, x1, y1]) cls_id = int(label) cls = CLASSES[cls_id] color = COLORS[cls] cv2.rectangle(image, [x0, y0], [x1, y1], color, 1) cv2.putText(image, f'{cls}:{score:.3f}', (x0, y0 - 2), cv2.FONT_HERSHEY_SIMPLEX, 0.325, [0, 0, 225], thickness=1) t4 = time.perf_counter() cv2.imshow('results', image)

上述是推理主函数，为了保证模型打印的耗时稳定，前期启动了10次推理预热，建议端侧部署时一定记得预热一下。可以看到结果正确，后处理逻辑也是对的。

生成量化校准数据集

执行step2_make_calib.py，可以读取coco128目录下随机50张图片，制作校准数据集。

img = cv2.imread(str(i)) img = letterbox(img)[0] img = blob(img[:, :, ::-1]) # bgr -> rgb print(img.shape) img.astype(np.float32).tofile(str(save / (i.stem + '.rgbchw')))

制作校准数据集主要是读图-> resize -> uint8转float -> numpy.tofile。在calib_f32目录下会生成50个rgbchw结尾的文件：

python3 step2_make_calib.py

使用地平线提供的Docker编译bin模型

将docker_openexplorer_centos_7_xj3_v2.4.2.tar.gz下载到本地开发机，并使用以下命令开启docker：

cd ../ wget -c ftp://vrftp.horizon.ai/Open_Explorer_gcc_9.3.0/2.4.2/docker_openexplorer_centos_7_xj3_v2.4.2.tar.gz docker load -i docker_openexplorer_centos_7_xj3_v2.4.2.tar.gz docker run -it --name horizonX3 -v ${PWD}/project:/open_explorer/project openexplorer/ai_toolchain_centos_7_xj3:v2.4.2 docker exec -it horizonX3 /bin/bash

进入容器后，执行：

cd project bash step3_convert_bin.sh

编译成功后会打印如下日志：

/model.22/cv3.2/cv3.2.2/Conv BPU id(0) HzSQuantizedConv 0.998216 67.505043 2023-01-31 21:17:24,261 INFO [Tue Jan 31 21:17:24 2023] End to Horizon NN Model Convert. 2023-01-31 21:17:24,315 INFO start convert to *.bin file.... 2023-01-31 21:17:24,345 INFO ONNX model output num : 6 2023-01-31 21:17:24,346 INFO

############# model deps info ############# 2023-01-31 21:17:24,346 INFO hb_mapper version : 1.9.9 2023-01-31 21:17:24,346 INFO hbdk version : 3.37.2 2023-01-31 21:17:24,346 INFO hbdk runtime version: 3.14.14 2023-01-31 21:17:24,346 INFO horizon_nn version : 0.14.0 2023-01-31 21:17:24,346 INFO

############# model_parameters info ############# 2023-01-31 21:17:24,346 INFO onnx_model : /open_explorer/workspace/yolov8/yolov8n.onnx 2023-01-31 21:17:24,346 INFO BPU march : bernoulli2 2023-01-31 21:17:24,346 INFO layer_out_dump : False 2023-01-31 21:17:24,346 INFO log_level : DEBUG 2023-01-31 21:17:24,346 INFO working dir : /open_explorer/workspace/yolov8/model_output 2023-01-31 21:17:24,346 INFO output_model_file_prefix: yolov8n_horizon 2023-01-31 21:17:24,347 INFO

############# input_parameters info ############# 2023-01-31 21:17:24,347 INFO ------------------------------------------

2023-01-31 21:17:24,347 INFO ---------input info : images --------- 2023-01-31 21:17:24,347 INFO input_name : images 2023-01-31 21:17:24,347 INFO input_type_rt : nv12 2023-01-31 21:17:24,347 INFO input_space&range : regular 2023-01-31 21:17:24,347 INFO input_layout_rt : NHWC 2023-01-31 21:17:24,347 INFO input_type_train : rgb 2023-01-31 21:17:24,347 INFO input_layout_train : NCHW 2023-01-31 21:17:24,347 INFO norm_type : data_scale 2023-01-31 21:17:24,347 INFO input_shape : 1x3x640x640 2023-01-31 21:17:24,347 INFO input_batch : 1 2023-01-31 21:17:24,347 INFO scale_value : 0.003921568627451, 2023-01-31 21:17:24,347 INFO cal_data_dir : /open_explorer/calib_f32 2023-01-31 21:17:24,347 INFO ---------input info : images end ------- 2023-01-31 21:17:24,347 INFO ------------------------------------------

2023-01-31 21:17:24,347 INFO ############# calibration_parameters info #############

2023-01-31 21:17:24,348 INFO preprocess_on : False 2023-01-31 21:17:24,348 INFO calibration_type: : max 2023-01-31 21:17:24,348 INFO cal_data_type : float32 2023-01-31 21:17:24,348 INFO max_percentile : 0.99999 2023-01-31 21:17:24,348 INFO per_channel : True 2023-01-31 21:17:24,348 INFO ############# compiler_parameters info ############# 2023-01-31 21:17:24,348 INFO hbdk_pass_through_params: --core-num 2 --fast --O3 2023-01-31 21:17:24,348 INFO input-source : {'images': 'pyramid', '_default_value': 'ddr'} 2023-01-31 21:17:24,354 INFO Convert to runtime bin file sucessfully! 2023-01-31 21:17:24,354 INFO End Model Convert /model.22/cv3.2/cv3.2.2/Conv BPU id(0) HzSQuantizedConv 0.998216 67.505043

上文的0.998216表示量化后的模型最后一层输出的余弦相似度，越接近1代表模型精度保持的越高（PS.model_output/yolov8n_horizon.bin是转换完的bin模型）。

推理测试

上板测试

将project文件夹打包上传到旭日X3派中，可以使用ssh或者U盘复制到旭日X3派工作目录中。假设保存到入/home/sunrise/project，推理前处理需要将输入转换到 nv12：

def bgr2nv12_opencv(image): height, width = image.shape[:2] area = height * width yuv420p = cv2.cvtColor(image, cv2.COLOR_BGR2YUV_I420).reshape((area * 3 // 2,)) y = yuv420p[:area] uv_planar = yuv420p[area:].reshape((2, area // 4)) uv_packed = uv_planar.transpose((1, 0)).reshape((area // 2,)) nv12 = np.zeros_like(yuv420p) nv12[:area] = y nv12[area:] = uv_packed return nv12

使用终端执行：

cd /home/sunrise/project sudo python3 -m pip install opencv-python # 安装 X3 推理依赖 mv model_output/yolov8n_horizon.bin ./ sudo python3 step4_inference.py

会看到图片检测并绘制的结果，还会打印推理的耗时情况：

得出结果：
前处理：30.4ms
推理：168.5ms
后处理：66ms
画图：0.8ms
全程耗时：265.9ms

本文转自地平线开发者社区
原作者：tripleMu

声明：本文内容及配图由入驻作者撰写或者入驻合作网站授权转载。文章观点仅代表作者本人，不代表电子发烧友网立场。文章及其配图仅供工程师学习之用，如有内容侵权或者其他违规问题，请联系本站处理。举报投诉

嵌入式

嵌入式

+关注

关注
4984

文章
18297

浏览量
288627
检测

检测

+关注

关注
5

文章
4086

浏览量
90752
目标检测

目标检测

+关注

关注
0

文章
186

浏览量
15458

YOLOv8实现旋转对象检测

YOLOv8框架在在支持分类、对象检测、实例分割、姿态评估的基础上更近一步，现已经支持旋转对象检测（OBB），基于DOTA数据集，支持航拍图像的15个类别对象检测，包括车辆、船只、典型

发表于 01-11 10:43 •579次阅读

<b class='flag-5'>YOLOv8</b>实现旋转对象<b class='flag-5'>检测</b>

基于YOLOv8的自定义医学图像分割

YOLOv8是一种令人惊叹的分割模型；它易于训练、测试和部署。在本教程中，我们将学习如何在自定义数据集上使用YOLOv8。但在此之前，我想告诉你为什么在存在其他优秀的分割模型时应该使用YOLOv8呢？

发表于 12-20 10:51 •411次阅读

基于<b class='flag-5'>YOLOv8</b>的自定义医学图像分割

【爱芯派 Pro 开发板试用体验】yolov8模型转换

通过开发板上预制的ax_run_model 工具，测试模型速度和精度。示例都是yolov5的，没有yolov8，模型怎么运行还需进一步研究。

发表于 11-20 12:19

如何修改YOLOv8的源码

很多人也想跟修改YOLOv5源码一样的方式去修改YOLOv8的源码，但是在github上面却发现找到的YOLOv8项目下面TAG分支是空的，然后就直接从master/main下面把源码克隆出来一通

发表于 09-04 10:02 •1223次阅读

基于Transformer的目标检测算法

掌握基于Transformer的目标检测算法的思路和创新点，一些Transformer论文涉及的新概念比较多，话术没有那么通俗易懂，读完论文仍然不理解算法的细节部分。

发表于 08-16 10:51 •433次阅读

基于Transformer的<b class='flag-5'>目标</b><b class='flag-5'>检测算法</b>

解锁YOLOv8修改+注意力模块训练与部署流程

很多人也想跟修改YOLOv5源码一样的方式去修改YOLOv8的源码，但是在github上面却发现找到的YOLOv8项目下面TAG分支是空的

发表于 08-11 14:14 •2693次阅读

解锁<b class='flag-5'>YOLOv8</b>修改+注意力模块训练与部署流程

三种主流模型部署框架YOLOv8推理演示

部署。这里以YOLOv8为例，演示了YOLOv8对象检测模型在OpenVINO、ONNXRUNTIME、TensorRT三个主流框架上C++推理演示效果。

发表于 08-06 11:39 •1848次阅读

基于强化学习的目标检测算法案例

摘要：基于强化学习的目标检测算法在检测过程中通常采用预定义搜索行为，其产生的候选区域形状和尺寸变化单一，导致目标检测精确度较低。为此，在基于

发表于 07-19 14:35 •0次下载

无Anchor的目标检测算法边框回归策略

导读本文主要讲述：1.无Anchor的目标检测算法：YOLOv1，CenterNet，CornerNet的边框回归策略；2.有Anchor的目标检

发表于 07-17 11:17 •616次阅读

教你如何用两行代码搞定YOLOv8各种模型推理

大家好，YOLOv8 框架本身提供的API函数是可以两行代码实现 YOLOv8 模型推理，这次我把这段代码封装成了一个类，只有40行代码左右，可以同时支持YOLOv8对象检测、实例分割

发表于 06-18 11:50 •2090次阅读

一文彻底搞懂YOLOv8【网络结构+代码+实操】

从上面可以看出，YOLOv8 主要参考了最近提出的诸如 YOLOX、YOLOv6、YOLOv7 和 PPYOLOE 等算法的相关设计，本身的创新点不多，偏向工程实践，主推的还是 ult

发表于 06-15 17:15 •6764次阅读

AI爱克斯开发板上使用OpenVINO加速YOLOv8目标检测模型

《在AI爱克斯开发板上用OpenVINO加速YOLOv8分类模型》介绍了在AI爱克斯开发板上使用OpenVINO 开发套件部署并测评YOLOv8的分类模型，本文将介绍在AI爱克斯开发板上使用OpenVINO加速YOLOv8

发表于 05-26 11:03 •723次阅读

YOLOv8版本升级支持小目标检测与高分辨率图像输入

YOLOv8版本最近版本又更新了，除了支持姿态评估以外，通过模型结构的修改还支持了小目标检测与高分辨率图像检测。原始的YOLOv8模型结构如

发表于 05-16 11:14 •9091次阅读

在AI爱克斯开发板上用OpenVINO™加速YOLOv8目标检测模型

《在 AI 爱克斯开发板上用 OpenVINO 加速 YOLOv8 分类模型》介绍了在 AI 爱克斯开发板上使用 OpenVINO 开发套件部署并测评 YOLOv8 的分类模型，本文将介绍在 AI 爱克斯开发板上使用 OpenVINO 加速

发表于 05-12 09:08 •852次阅读

在AI爱克斯开发板上用OpenVINO™加速YOLOv8分类模型

本系列文章将在 AI 爱克斯开发板上使用 OpenVINO 开发套件依次部署并测评 YOLOv8 的分类模型、目标检测模型、实例分割模型和人体姿态估计模型。

发表于 05-05 11:47 •600次阅读