Electronic Design Contest: OpenMV and Embedded Vision

2024-07-27

Embedded

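Environment Setup
I. OpenMV Basic Commands
1. Basic Camera Configuration
2. Statistics
3. Debugging
II. Recognition with Traditional CV Algorithms
1. Detecting Red and Green Lasers
2. ROI Tracking
III. Neural Network Recognition with TensorFlow Lite
1. Data Collection
2. Model Training
3. Model Deployment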

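I have two days to prepare for the Electronic Design Contest, so this is a crash review of OpenMV basics, with notes taken along the way.

Reference: 【API】OpenMV常用功能 (API: commonly used OpenMV functions)

Environment Setup

The latest OpenMV IDE seems to have quite a few problems, so I recommend installing version 4.0.14 instead. In my experience this version has relatively few bugs and can re-flash the firmware directly after the board is bricked, which is very convenient.

I. OpenMV Basic Commands

1. Basic Camera Configuration

Reference documentation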

# Cam - By: wlate - Fri Jul 26 2024

import sensor, image, time

sensor.reset()
sensor.set_pixformat(sensor.RGB565)
sensor.set_framesize(sensor.QVGA)
sensor.set_auto_gain(False)         # disable auto gain
sensor.set_auto_whitebal(False)     # disable auto white balance
sensor.set_auto_exposure(True)      # keep auto exposure enabled
sensor.skip_frames(time = 2000)

clock = time.clock()

while(True):
    clock.tick()
    img = sensor.snapshot()
    print(clock.fps())

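Individual pixels of an image can be accessed, for example:

img.get_pixel(10, 10)

This returns a value such as (247, 255, 255), a three-channel color. Here the format is RGB, so the three channels are red, green, and blue.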
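
A red value of 247 rather than 255 is consistent with the frame being stored as RGB565 (5 bits red, 6 bits green, 5 bits blue) and expanded to 8 bits per channel when read. The following is a rough sketch in plain Python of that expansion, assuming the common bit-replication scheme. The function name `rgb565_to_rgb888` is hypothetical and this is not OpenMV's own code.

```python
def rgb565_to_rgb888(pixel):
    """Expand a 16-bit RGB565 value to an (r, g, b) tuple of 8-bit values."""
    r5 = (pixel >> 11) & 0x1F   # top 5 bits: red
    g6 = (pixel >> 5) & 0x3F    # middle 6 bits: green
    b5 = pixel & 0x1F           # low 5 bits: blue
    # Bit replication (an assumption): copy the high bits into the low bits
    # so that the largest 5- or 6-bit value maps to exactly 255.
    r8 = (r5 << 3) | (r5 >> 2)
    g8 = (g6 << 2) | (g6 >> 4)
    b8 = (b5 << 3) | (b5 >> 2)
    return (r8, g8, b8)


# Red one step below maximum (30 of 31), green and blue at maximum.
pixel = (30 << 11) | (63 << 5) | 31
print(rgb565_to_rgb888(pixel))  # (247, 255, 255)
```

Because red and blue have only 32 levels each, adjacent 8-bit values differ by about 8, which is worth remembering when choosing color thresholds.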

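In addition, if the lens has noticeable distortion, it can be corrected with:

img = sensor.snapshot().lens_corr(strength = 1.8, zoom = 1.0)

The before/after results are shown below; the frame rate does not seem to change: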

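2. Statistics

In digital image processing we often need statistics about an image, such as the grayscale histogram of some region. This is where an ROI (region of interest) comes in.

In OpenMV an ROI is given as four values $(c,r,w,h)$: the column and row coordinates of the top-left corner, followed by the width in columns and the height in rows.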
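
To make the $(c,r,w,h)$ convention concrete, here is a small plain-Python sketch that builds a centered ROI and clips an arbitrary ROI to the image bounds. Both helper names (`centered_roi`, `clamp_roi`) are hypothetical and are not part of the OpenMV API.

```python
def centered_roi(img_w, img_h, frac=0.5):
    """Return an ROI (x, y, w, h) covering `frac` of each dimension, centered."""
    w = int(img_w * frac)
    h = int(img_h * frac)
    x = (img_w - w) // 2
    y = (img_h - h) // 2
    return (x, y, w, h)


def clamp_roi(roi, img_w, img_h):
    """Clip an ROI so that it lies fully inside an img_w x img_h image."""
    x, y, w, h = roi
    x0 = max(0, x)
    y0 = max(0, y)
    x1 = min(img_w, x + w)
    y1 = min(img_h, y + h)
    # Width and height collapse to 0 if the ROI misses the image entirely.
    return (x0, y0, max(0, x1 - x0), max(0, y1 - y0))


print(centered_roi(320, 240))                 # QVGA -> (80, 60, 160, 120)
print(clamp_roi((-3, -3, 10, 10), 160, 120))  # -> (0, 0, 7, 7)
```

Clipping is mostly useful when an ROI is derived from a detection near the image border, where the raw coordinates can become negative.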

The code below draws a region of interest and computes color statistics over it.

# ROI - By: wlate - Fri Jul 26 2024

import sensor, image, time

sensor.reset()
sensor.set_pixformat(sensor.RGB565)
sensor.set_framesize(sensor.QVGA)
sensor.skip_frames(10)
sensor.set_auto_whitebal(False)

img = sensor.snapshot()
IMG_COLS = img.width()
IMG_ROWS = img.height()
ROI=(int(IMG_COLS/4),int(IMG_ROWS/4),int(IMG_COLS/2),int(IMG_ROWS/2))

while(True):
    img = sensor.snapshot()
    statistics=img.get_statistics(roi=ROI)
    img.draw_rectangle(ROI)
    
    color_l=statistics.l_mode()
    color_a=statistics.a_mode()
    color_b=statistics.b_mode()
    print(color_l,color_a,color_b)

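The result:

The box covers the central region of the image, and the printed values appear in the serial terminal.

The statistics object provides the following: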

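Statistic	Method	Description	Index access
Mean	`statistics.mean()`	Grayscale mean (0-255)	`statistics[0]`
Median	`statistics.median()`	Grayscale median (0-255)	`statistics[1]`
Mode	`statistics.mode()`	Grayscale mode (0-255)	`statistics[2]`
Standard deviation	`statistics.stdev()`	Grayscale standard deviation (0-255)	`statistics[3]`
Minimum	`statistics.min()`	Grayscale minimum (0-255)	`statistics[4]`
Maximum	`statistics.max()`	Grayscale maximum (0-255)	`statistics[5]`
Lower quartile	`statistics.lq()`	Grayscale first quartile (0-255)	`statistics[6]`
Upper quartile	`statistics.uq()`	Grayscale third quartile (0-255)	`statistics[7]`

Channel	Statistic	Method	Description
L	Mean, median, etc.	`l_mean`, etc.	The same statistics (mean, standard deviation, etc.) for the L channel
A	Mean, median, etc.	`a_mean`, etc.	Same as above
B	Mean, median, etc.	`b_mean`, etc.	Same as above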
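
For readers who want to check what these numbers mean off the board, the sketch below computes the same kinds of statistics for a tiny grayscale patch with the Python standard library. It only illustrates the definitions: the helper name `patch_statistics` is hypothetical, and OpenMV's own rounding and histogram binning may differ.

```python
import statistics as stats  # Python standard library, not the OpenMV object


def patch_statistics(pixels):
    """Return basic statistics for a flat list of 0-255 grayscale values."""
    return {
        "mean": int(stats.mean(pixels)),
        "median": int(stats.median(pixels)),
        "mode": stats.mode(pixels),
        "stdev": int(stats.pstdev(pixels)),  # population standard deviation
        "min": min(pixels),
        "max": max(pixels),
    }


patch = [10, 10, 20, 30, 30, 30, 40]
print(patch_statistics(patch))
```

The mode is often the most useful value for thresholding a uniform region, since a few outlier pixels do not move it, which is presumably why the ROI example prints `l_mode()`, `a_mode()` and `b_mode()`.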

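These statistics use the LAB color space.

L: lightness, i.e. how light or dark the color is.

A: the color component running from green to red.

B: the color component running from blue to yellow.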
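
To see why the A channel separates red from green, the sketch below converts 8-bit sRGB values to CIELAB with the standard textbook formulas (D65 white point). This is an assumption-laden illustration: OpenMV computes LAB with its own integer approximation, so the exact numbers on the board will differ, but the sign of A behaves the same way. The function name `srgb_to_lab` is hypothetical.

```python
def srgb_to_lab(r, g, b):
    """Convert 8-bit sRGB to CIELAB (L, a, b) using the D65 white point."""
    def linearize(c):
        # Undo the sRGB gamma curve.
        c = c / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    rl, gl, bl = linearize(r), linearize(g), linearize(b)

    # Linear sRGB -> CIE XYZ
    x = 0.4124 * rl + 0.3576 * gl + 0.1805 * bl
    y = 0.2126 * rl + 0.7152 * gl + 0.0722 * bl
    z = 0.0193 * rl + 0.1192 * gl + 0.9505 * bl

    def f(t):
        delta = 6 / 29
        return t ** (1 / 3) if t > delta ** 3 else t / (3 * delta ** 2) + 4 / 29

    xn, yn, zn = 0.95047, 1.0, 1.08883  # D65 reference white
    fx, fy, fz = f(x / xn), f(y / yn), f(z / zn)
    return (116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz))


for name, rgb in (("red", (255, 0, 0)), ("green", (0, 255, 0))):
    L, a, b = srgb_to_lab(*rgb)
    print(name, round(L), round(a), round(b))
```

Pure red gives a strongly positive A and pure green a strongly negative A, so a single threshold on A is enough to tell the two apart once the candidate regions are known.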

To look at the individual RGB channel values inside the ROI instead:

# ROI - By: wlate - Fri Jul 26 2024

import sensor, image, time

sensor.reset()
sensor.set_pixformat(sensor.RGB565)
sensor.set_framesize(sensor.QVGA)
sensor.skip_frames(10)
sensor.set_auto_whitebal(False)

img = sensor.snapshot()
IMG_COLS = img.width()
IMG_ROWS = img.height()
ROI=(int(IMG_COLS/4),int(IMG_ROWS/4),int(IMG_COLS/2),int(IMG_ROWS/2))

while(True):
    img = sensor.snapshot().lens_corr(strength = 1.8, zoom = 1.0)   # correct lens distortion
    img = img.to_grayscale(copy=True, rgb_channel=0)    # keep only the red channel
    
    statistics=img.get_statistics(roi=ROI)
    img.draw_rectangle(ROI)

    print("Red Channel Mean:", statistics.mean())

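Here img.to_grayscale(copy=True, rgb_channel=0) builds the grayscale image from the red channel only, which needs a little digital image processing background. In a grayscale image each pixel has a value from 0 to 255, where 0 is the darkest (black) and 255 is the brightest (white). In a three-channel 8-bit image each channel also ranges from 0 to 255. For the R (red) channel, 0 means almost no red and 255 means the most red; the other channels work the same way. Here we extract only the red channel, which means the green and blue channels are discarded.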
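
The difference between a single-channel extraction and an ordinary grayscale conversion can be sketched in plain Python. The luma weights below are the BT.601 ones, chosen here as an assumption for illustration; the weights OpenMV actually uses may differ, and both function names are hypothetical.

```python
def red_channel(pixels):
    """Keep only the red component of each (r, g, b) pixel."""
    return [r for (r, g, b) in pixels]


def luma(pixels):
    """Ordinary grayscale conversion with BT.601 weights (an assumption)."""
    return [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in pixels]


pixels = [(255, 0, 0), (0, 255, 0), (255, 255, 255)]  # red, green, white
print(red_channel(pixels))  # [255, 0, 255]
print(luma(pixels))         # [76, 150, 255]
```

In the red channel a pure green pixel drops to 0, but a white pixel is just as bright as a red one. Channel extraction alone therefore cannot distinguish a red spot from a white or overexposed area, so a shape or ROI constraint is still needed.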

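3. Debugging

When running OpenMV, watching a continuous stream of serial output from the main loop is not always convenient. Sometimes a single frame needs to be analyzed, and in that case the test image can be loaded separately and analyzed on its own.

First save the image to the OpenMV disk, say as 1.jpg.

II. Recognition with Traditional CV Algorithms

1. Detecting Red and Green Lasers

Red and green laser spots are probably the most classic recognition task in the contest, and they can be detected directly with the built-in CV functions. The idea rests on the features of a laser spot, which are easy to name: ① it is round, ② it has a distinctive color.

Start with a single-color laser spot: filter the image, then look for the roundest shape.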

import sensor, image, time

sensor.reset()
sensor.set_pixformat(sensor.RGB565)
sensor.set_framesize(sensor.QQVGA)
sensor.skip_frames(time = 2000)
sensor.set_gainceiling(8)           # gain ceiling suited to color detection
sensor.set_auto_whitebal(False)
clock = time.clock()

circle_thres = 500

while(True):
    img = sensor.snapshot().lens_corr(strength = 1.8, zoom = 1.0)  # correct lens distortion
    gray_red = img.to_grayscale(copy=False, rgb_channel=0)

    gray_red = gray_red.mean(1)

    circles_red = gray_red.find_circles(threshold = circle_thres,  # detection threshold, controls sensitivity
                                       x_margin = 10,      # X margin for merging nearby circles
                                       y_margin = 10,      # Y margin for merging nearby circles
                                       r_margin = 10,      # radius margin for merging nearby circles
                                       r_min = 1,          # minimum radius
                                       r_max = 10,         # maximum radius
                                       r_step = 1)         # radius step

    if circles_red:
        for c in circles_red:
            img.draw_cross(c.x(), c.y(), color = (255, 0, 0))
            print("Circle: x = %d, y = %d, r = %d, strength = %.2f" % (c.x(), c.y(), c.r(), c.magnitude()))  # print center, radius and magnitude
            print("Detected circle - Radius: %d" % c.r())

    else:
        print("None")

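The background can cause false detections. When the camera is fixed, as in Problem E of the 2023 contest, the outer rectangle can be detected once at startup and used to define an ROI, so the background is no longer picked up by mistake.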
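
One simple way to apply such an ROI is to discard any detected center that falls outside it. The plain-Python sketch below shows the check. The helper names are hypothetical, and the ROI value is made up for illustration. On the board, `find_circles` also accepts an `roi` argument, which avoids searching the background in the first place.

```python
def inside_roi(x, y, roi):
    """Return True if point (x, y) lies inside roi = (x, y, w, h)."""
    rx, ry, rw, rh = roi
    return rx <= x < rx + rw and ry <= y < ry + rh


def filter_by_roi(centers, roi):
    """Keep only the (x, y) centers that fall inside the ROI."""
    return [(x, y) for (x, y) in centers if inside_roi(x, y, roi)]


# Hypothetical ROI found at startup, on a 160x120 (QQVGA) frame.
screen_roi = (30, 20, 100, 80)
centers = [(80, 60), (5, 5), (129, 99), (130, 100)]
print(filter_by_roi(centers, screen_roi))  # [(80, 60), (129, 99)]
```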

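Next is telling the red and green lasers apart. My idea: both lasers are very bright, so first find the circles in the frame that meet the criteria, keep the two brightest, and discard the rest. Then compare the A channel of those two against a threshold. This should suppress noise to some extent. (On the hardware side, adjusting the focus also helps: a slightly defocused image makes the laser spots easier to detect.)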

import sensor, image, time

sensor.reset()
sensor.set_pixformat(sensor.RGB565)
sensor.set_framesize(sensor.QQVGA)
sensor.skip_frames(time = 2000)
sensor.set_gainceiling(8)           # gain ceiling suited to color detection
sensor.set_auto_whitebal(False)
clock = time.clock()

circle_thres = 500
offs = 2
color_thres = 0

while(True):
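    img = sensor.snapshot().lens_corr(strength = 1.8, zoom = 1.0)  # correct lens distortion
    gray = img.to_grayscale(copy=True)
    gray = gray.mean(1)

    circles_red = gray.find_circles(threshold = circle_thres,  # detection threshold, controls sensitivity
                                       x_margin = 10,          # X margin for merging nearby circles
                                       y_margin = 10,          # Y margin for merging nearby circles
                                       r_margin = 10,          # radius margin for merging nearby circles
                                       r_min = 1,              # minimum radius
                                       r_max = 10,             # maximum radius
                                       r_step = 1)             # radius step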

    if circles_red:
        circle_data = []
        for i, c in enumerate(circles_red):
            roi = (c.x() - c.r() - offs , c.y() - c.r() -offs , 2 * ( c.r() + offs ), 2 * ( c.r() + offs ))
            statis = img.get_statistics(roi=roi)
            brightness = statis.l_mode()
            color = statis.a_mode()
            circle_data.append((c, brightness, color))
        circle_data.sort(key=lambda x: (-x[1]))

        if len(circle_data) >= 2:
            circle_data = circle_data[:2]
            color_thres = ( circle_data[0][2] + circle_data[1][2] ) / 2

        for idx, (circ, _, color) in enumerate(circle_data[:2]):
            img.draw_rectangle(circ.x() - circ.r(), circ.y() - circ.r(), 2 * circ.r(), 2 * circ.r(), color=(255, 255, 0))

            if color > color_thres:
                img.draw_string(circ.x() - circ.r(), circ.y() - circ.r() - 10, "RED", color=(255, 0, 0))
            #    print("RED", color)
            else:
                img.draw_string(circ.x() - circ.r(), circ.y() - circ.r() - 10, "GREEN", color=(0, 255, 0))
            #    print("Green", color)

        print("Thres", color_thres)

    else:
        print("None")
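
The selection logic in the loop above can be pulled out into a pure function and tested off the board. The sketch below is one possible formulation rather than the exact code above: candidates are plain tuples of a hypothetical name, the L-channel mode and the A-channel mode, and the function returns nothing when fewer than two candidates exist instead of reusing an old threshold.

```python
def classify_two_brightest(candidates):
    """Label the two brightest candidates as RED or GREEN.

    candidates: list of (name, brightness, a_value) tuples.
    Returns a list of (name, label) pairs, or [] if fewer than two candidates.
    """
    # Sort by brightness, highest first, and keep the top two.
    top = sorted(candidates, key=lambda c: -c[1])[:2]
    if len(top) < 2:
        return []
    # Midpoint of the two A values: the redder spot lies above it.
    thres = (top[0][2] + top[1][2]) / 2
    return [(name, "RED" if a > thres else "GREEN") for name, _, a in top]


# Made-up measurements: one dim background blob and two bright laser spots.
candidates = [("blob", 40, 5), ("spot1", 95, 30), ("spot2", 90, -25)]
print(classify_two_brightest(candidates))
# [('spot1', 'RED'), ('spot2', 'GREEN')]
```

Because the threshold is relative, this always labels one spot red and the other green when their A values differ. That only holds if exactly one laser of each color is in view.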

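2. ROI Tracking

To be written when I have time.

III. Neural Network Recognition with TensorFlow Lite

Where a traditional algorithm can do the job, it is usually faster and more accurate. Some recognition tasks are very hard for traditional algorithms, though, and still need a neural network.

Neural networks that run on OpenMV are generally trained on Edge Impulse. Since I happen to be working on an IoT project, I use it as the example here: training a gesture recognition project, Gesture Recognition, on Edge Impulse. It is an object detection model.

1. Data Collection

The OpenMV IDE should be able to capture a dataset directly, but it looks cumbersome, and in my IDE version the capture produces garbled text. So I took a simpler route: record gesture videos from the IDE stream, then use Python to keep one frame out of every ten or so.

Note: do not rescale the video.

The Python code that samples every 10th frame of a video: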

'''
Author: wlaten
Date: 2024-07-27 02:52:06
LastEditTime: 2024-07-27 03:03:42
Description: file content
'''
import cv2
import os
import glob

def export_frames(video_path, k, label):
    """
    Export every k-th frame of the video to the output/label folder
    """
    output_folder = os.path.join('output', label)
    if not os.path.exists(output_folder):
        os.makedirs(output_folder)
    existing_files = glob.glob(os.path.join(output_folder, '*.jpg'))
    if existing_files:
        last_num = max([int(os.path.basename(f).split('.')[0]) for f in existing_files])
    else:
        last_num = 0
    cap = cv2.VideoCapture(video_path)
    frame_count = 0
    
    cnt = 0
    l = last_num + 1
    while True:
        ret, frame = cap.read()
        if not ret:
            break
        if frame_count % k == 0:
            last_num += 1
            output_path = os.path.join(output_folder, f'{last_num}.jpg')
            cv2.imwrite(output_path, frame)
            cnt += 1

        frame_count += 1
    cap.release()
    print(f'Exported {cnt} frames to {output_folder}')
    if cnt > 0:
        print(f'Numbering: {l} -> {last_num}')

def delete_frames(start, end, label):
    """
    Delete frames numbered start to end in the output/label folder (in case frames were put under the wrong label)
    """
    output_folder = os.path.join('output', label)
    for i in range(start, end+1):
        output_path = os.path.join(output_folder, f'{i}.jpg')
        if os.path.exists(output_path):
            os.remove(output_path)
            print(f'Deleted {output_path}')
        else:
            print(f'{output_path} does not exist')

if __name__ == '__main__':
    export_frames('4.mp4', 10, 'scissor')
    delete_frames(197,332, 'scissor')
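
The sampling and numbering rules used by `export_frames` can be checked without a video file. The sketch below reproduces just that arithmetic in plain Python; the two helper names are hypothetical and the frame counts are made up.

```python
def kept_frame_indices(total_frames, k):
    """Indices of the frames that are written when every k-th frame is kept."""
    return list(range(0, total_frames, k))


def output_numbers(last_num, total_frames, k):
    """File numbers assigned to the exported frames, continuing after last_num."""
    count = len(range(0, total_frames, k))
    return list(range(last_num + 1, last_num + 1 + count))


# A 25-frame clip sampled every 10 frames, appended after 196 existing images.
print(kept_frame_indices(25, 10))   # [0, 10, 20]
print(output_numbers(196, 25, 10))  # [197, 198, 199]
```

Since numbering continues from the highest existing file, the range printed by `export_frames` is exactly what to pass to `delete_frames` if a clip was exported under the wrong label.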

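2. Model Training

On Edge Impulse, create a new project. Under Project Info, set Labeling method to Bounding Boxes and Target Device to OpenMV Cam H7 Plus. After that it is the usual data upload and labeling. Finally, export the tflite model for OpenMV directly.

See: OpenMV 从入手到跑TensorFlow Lite神经网络进行垃圾分类 (OpenMV, from unboxing to running a TensorFlow Lite network for garbage classification)

Edge Impulse is quite slow, though. When I get the chance I want to look into writing the training pipeline myself and stop using this platform.

Then comes training. I used a dataset of 170 images of the scissors gesture, and the platform reported about 98% accuracy on the validation set.

Note that this is the theoretical accuracy. Anyone who has deployed neural networks on embedded hardware knows that it does not carry over to the real deployment, where it can only get worse.

3. Model Deployment

Copy the Python script, the model trained.tflite, and the label file labels.txt to the OpenMV drive. The result is shown in the figure: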

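The image format is set like this:

sensor.set_pixformat(sensor.RGB565)
sensor.set_framesize(sensor.QQQVGA)

With this setting the image is very small, only 80×60 pixels, but detection is not affected. Small images also give a decent frame rate for the network: in my test the OpenMV Cam H7 Plus reached about 40 FPS.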
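
A quick back-of-the-envelope calculation shows how much smaller the input becomes. The sketch below computes the raw RGB565 frame size (2 bytes per pixel) for the three frame sizes used in this post. The table and function name are illustrative, not an OpenMV API.

```python
# Width and height of the frame sizes used in this post.
FRAME_SIZES = {
    "QVGA": (320, 240),
    "QQVGA": (160, 120),
    "QQQVGA": (80, 60),
}


def frame_bytes(name, bytes_per_pixel=2):
    """Raw size in bytes of one frame; RGB565 uses 2 bytes per pixel."""
    w, h = FRAME_SIZES[name]
    return w * h * bytes_per_pixel


for name in FRAME_SIZES:
    print(name, frame_bytes(name))
# QVGA 153600
# QQVGA 38400
# QQQVGA 9600
```

A QQQVGA frame has one sixteenth of the pixels of a QVGA frame, which fits the observation that small frames keep the frame rate high.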

The code that draws a cross at the gesture position:

# Edge Impulse - OpenMV Object Detection Example

import sensor, image, time, os, tf, math, uos, gc

sensor.reset()
sensor.set_pixformat(sensor.RGB565)
sensor.set_framesize(sensor.QQQVGA)
sensor.set_auto_gain(True)
sensor.set_auto_whitebal(True)
sensor.skip_frames(time=2000)

net = None
labels = None
min_confidence = 0.98

try:
    # load the model, alloc the model file on the heap if we have at least 64K free after loading
    net = tf.load("trained.tflite", load_to_fb=uos.stat('trained.tflite')[6] > (gc.mem_free() - (64*1024)))
except Exception as e:
    raise Exception('Failed to load "trained.tflite", did you copy the .tflite and labels.txt file onto the mass-storage device? (' + str(e) + ')')

try:
    labels = [line.rstrip('\n') for line in open("labels.txt")]
except Exception as e:
    raise Exception('Failed to load "labels.txt", did you copy the .tflite and labels.txt file onto the mass-storage device? (' + str(e) + ')')

colors = [ # Add more colors if you are detecting more than 7 types of classes at once.
    (255,   0,   0),
    (  0, 255,   0),
    (255, 255,   0),
    (  0,   0, 255),
    (255,   0, 255),
    (  0, 255, 255),
    (255, 255, 255),
]

while(True):
    img = sensor.snapshot()

    for i, detection_list in enumerate(net.detect(img, thresholds=[(math.ceil(min_confidence * 255), 255)])):
        if (i == 0): continue  # skip the background class
        if (len(detection_list) == 0): continue  # no detections for this class

        print("********** %s **********" % labels[i])
        for d in detection_list:
            [x, y, w, h] = d.rect()  # bounding box position and size
            confidence = d.output()  # confidence score
            center_x = math.floor(x + (w / 2))
            center_y = math.floor(y + (h / 2))

            # print coordinates and confidence
            print('x %d\ty %d\tw %d\th %d\tconfidence %.2f' % (x, y, w, h, confidence))

            # draw a cross at the center of the bounding box
            img.draw_cross(center_x, center_y, color=colors[i], size=10, thickness=2)  # size sets the cross size
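
Two small pieces of arithmetic in the script above are easy to verify in plain Python: the mapping from `min_confidence` to the 0-255 threshold pair passed to `net.detect`, and the bounding-box center. The helper names below are hypothetical. They mirror the expressions used in the script.

```python
import math


def confidence_to_threshold(min_confidence):
    """Map a 0.0-1.0 confidence to the (low, high) pair on a 0-255 scale."""
    return (math.ceil(min_confidence * 255), 255)


def box_center(x, y, w, h):
    """Center of a bounding box, rounded down to whole pixels."""
    return (math.floor(x + w / 2), math.floor(y + h / 2))


print(confidence_to_threshold(0.98))  # (250, 255)
print(box_center(10, 20, 5, 8))       # (12, 24)
```

With `min_confidence = 0.98` only outputs of 250 or more on the 0-255 scale pass. That is a strict setting, and lowering it may help if the deployed model misses gestures that the validation set suggested it should find.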
