
Long Video Action Recognition Method Based on Multimodal Feature Fusion
DOI:
CSTR:
Authors:
Affiliation:

Xi'an University of Architecture and Technology

Author biography:

Corresponding author:

CLC number:

Fund projects:

General Program of the Shaanxi Provincial Natural Science Foundation (2020JM-473, 2020JM-472); Basic Research Fund of Xi'an University of Architecture and Technology (JC1703); Natural Science Fund of Xi'an University of Architecture and Technology (ZR19046)


Abstract:

Action recognition has important application value in video retrieval. To address the shortcomings of convolutional-neural-network-based action recognition methods, namely limited ability to recognize actions over long temporal sequences, difficulty in extracting multi-scale features, and sensitivity to illumination changes and complex background interference, a long-video action recognition method based on multimodal feature fusion is proposed. First, since the differences between frames of a long action sequence are small and easily lead to redundant video frames, a uniform sparse sampling strategy is used to model the temporal dimension over the whole video, fully retaining long-range temporal information while reducing frame redundancy. Second, multi-column convolution is used to extract multi-scale spatio-temporal features, which weakens the interference caused by viewpoint changes in the video frames. Optical flow data are then introduced, and deep features of the flow are extracted by a feature extraction network guided by a spatial attention mechanism, so that the complementary strengths of the different data modalities improve the accuracy and robustness of the network in different scenarios. Finally, the multi-scale spatio-temporal features and the optical flow features are fused at the fully connected layer of the network, realizing end-to-end long-video action recognition. Experimental results show that the proposed method achieves average accuracies of 97.2% on UCF101 and 72.8% on HMDB51, outperforming the compared methods and demonstrating its effectiveness.
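As a rough illustration of the pipeline the abstract describes (uniform sparse frame sampling over the whole video, multi-column convolution for multi-scale features, a spatial-attention-guided optical-flow stream, and late fusion at the fully connected layer), here is a minimal PyTorch-style sketch. All module names, channel widths, kernel sizes, and the overall layer layout are illustrative assumptions rather than the architecture used in the paper.

```python
# Minimal sketch of the ideas named in the abstract; the concrete modules,
# channel widths and kernel sizes below are assumptions, not the authors'
# implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


def uniform_sparse_sample(num_frames, num_segments):
    """Split the whole video into equal segments and keep the centre frame
    of each, preserving long-range temporal structure while dropping
    redundant neighbouring frames."""
    seg_len = num_frames / num_segments
    return [int(seg_len * i + seg_len / 2) for i in range(num_segments)]


class MultiColumnBlock(nn.Module):
    """Parallel convolution columns with different kernel sizes whose
    outputs are concatenated into a multi-scale feature map."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.columns = nn.ModuleList([
            nn.Conv2d(in_ch, out_ch // 3, kernel_size=k, padding=k // 2)
            for k in (3, 5, 7)])

    def forward(self, x):
        return torch.cat([F.relu(col(x)) for col in self.columns], dim=1)


class SpatialAttention(nn.Module):
    """Single-channel attention map that re-weights flow features."""
    def __init__(self, in_ch):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, 1, kernel_size=7, padding=3)

    def forward(self, x):
        return x * torch.sigmoid(self.conv(x))  # broadcast over channels


class TwoStreamFusionNet(nn.Module):
    """RGB multi-scale stream plus attention-guided optical-flow stream,
    fused at the fully connected layer (late fusion)."""
    def __init__(self, num_classes):
        super().__init__()
        self.rgb_stream = nn.Sequential(
            MultiColumnBlock(3, 48), nn.MaxPool2d(2),
            MultiColumnBlock(48, 96),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.flow_stream = nn.Sequential(
            nn.Conv2d(2, 48, 3, padding=1), nn.ReLU(),
            SpatialAttention(48),
            nn.Conv2d(48, 96, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.fc = nn.Linear(96 + 96, num_classes)

    def forward(self, rgb, flow):
        # rgb: (B, T, 3, H, W) sparsely sampled frames
        # flow: (B, T, 2, H, W) optical flow for the same positions
        b, t = rgb.shape[:2]
        rgb_feat = self.rgb_stream(rgb.flatten(0, 1)).view(b, t, -1).mean(1)
        flow_feat = self.flow_stream(flow.flatten(0, 1)).view(b, t, -1).mean(1)
        return self.fc(torch.cat([rgb_feat, flow_feat], dim=1))


# Example: 8 frames sampled from a long clip, UCF101-sized output.
model = TwoStreamFusionNet(num_classes=101)
rgb = torch.randn(2, 8, 3, 112, 112)
flow = torch.randn(2, 8, 2, 112, 112)
logits = model(rgb, flow)  # shape (2, 101)
```

Keeping the two streams separate until the classifier is the simplest form of the late fusion the abstract mentions; averaging segment features over T mirrors the segment-level consensus idea behind uniform sparse sampling.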

Cite this article

WANG Ting, LIU Guanghui, ZHANG Yumin, MENG Yuebo, XU Shengjun. Long Video Action Recognition Method Based on Multimodal Feature Fusion[J]. Computer Measurement & Control, 2021, 29(11): 165-170.

History
  • Received: 2021-04-08
  • Revised: 2021-05-11
  • Accepted: 2021-05-12
  • Published online: 2021-11-22
  • Publication date: