Text mining method for power transformer defect records based on a character-word hybrid ensemble model

CLC number: TM8

Fund project: National Natural Science Foundation of China (52107165)

    Abstract:

    Transformer operation and maintenance has accumulated a large volume of unstructured defect data recorded as free text, but for lack of effective mining methods its utilization rate remains extremely low. This paper proposes a text mining method for transformer defect records based on a character-word hybrid ensemble model. The defect texts are first preprocessed by word segmentation, stop-word removal, text augmentation, and feature representation, which converts them into numerical vectors for input. Multiple word-level and character-level classification models are then integrated, and a meta-learner exploits the complementary strengths of the individual base learners to identify and classify transformer defect types accurately. Compared with a single text classification algorithm, the method captures the semantic features of the text more comprehensively, achieving a classification precision of 91% and an F1 score, which combines precision and recall, of 0.9. Applying natural language processing to power equipment defect records enables accurate, efficient classification and fault identification, puts dormant data resources to use, and significantly raises the level of intelligent management of power transformers.
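The idea of combining word-level and character-level classifiers can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the base learners are small naive Bayes classifiers, the combiner is accuracy-weighted soft voting standing in for a trained meta-learner, the toy records and labels are hypothetical, and the input is assumed to be already segmented into space-separated words (Chinese records would need a word segmenter upstream). The helper `f1_score` shows the standard harmonic-mean definition of F1.

```python
import math
from collections import Counter, defaultdict


def word_tokens(text):
    # Assumption: the record is already segmented into space-separated words.
    return text.split()


def char_tokens(text):
    # Character-level view: every non-space character is one token.
    return [ch for ch in text if not ch.isspace()]


class NaiveBayes:
    """Multinomial naive Bayes with add-one smoothing (stand-in base learner)."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.token_counts = defaultdict(Counter)
        for tokens, label in zip(docs, labels):
            self.token_counts[label].update(tokens)
        self.vocab = {t for counts in self.token_counts.values() for t in counts}
        return self

    def predict_proba(self, tokens):
        total_docs = sum(self.class_counts.values())
        log_scores = {}
        for label, n_docs in self.class_counts.items():
            counts = self.token_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)
            for t in tokens:
                if t in self.vocab:  # tokens never seen in training are skipped
                    score += math.log((counts[t] + 1) / denom)
            log_scores[label] = score
        peak = max(log_scores.values())
        unnorm = {k: math.exp(v - peak) for k, v in log_scores.items()}
        z = sum(unnorm.values())
        return {k: v / z for k, v in unnorm.items()}


class CharWordEnsemble:
    """Accuracy-weighted soft voting over a word view and a character view."""

    views = {"word": word_tokens, "char": char_tokens}

    def fit(self, texts, labels):
        self.weights, self.models = {}, {}
        for name, tokenize in self.views.items():
            docs = [tokenize(t) for t in texts]
            hits = 0
            for i in range(len(docs)):  # leave-one-out estimate of accuracy
                fold = NaiveBayes().fit(docs[:i] + docs[i + 1:],
                                        labels[:i] + labels[i + 1:])
                proba = fold.predict_proba(docs[i])
                hits += max(proba, key=proba.get) == labels[i]
            # Smoothed accuracy keeps every weight strictly between 0 and 1.
            self.weights[name] = (hits + 1) / (len(docs) + 2)
            self.models[name] = NaiveBayes().fit(docs, labels)
        return self

    def predict(self, text):
        combined = Counter()
        for name, tokenize in self.views.items():
            proba = self.models[name].predict_proba(tokenize(text))
            for label, p in proba.items():
                combined[label] += self.weights[name] * p
        return max(combined, key=combined.get)


def f1_score(precision, recall):
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy defect records, for illustration only.
texts = [
    "oil leakage at tank flange",
    "oil leakage at radiator valve",
    "oil seepage near conservator pipe",
    "cooling fan motor failure",
    "cooling fan bearing noise",
    "cooler fan does not start",
]
labels = ["leakage"] * 3 + ["cooling"] * 3

model = CharWordEnsemble().fit(texts, labels)
print(model.predict("oil leakage at valve flange"))
print(model.predict("cooling fan motor noise"))
```

A full stacking setup would instead train the meta-learner on out-of-fold predictions of the base learners; the weighted vote above only approximates that step while keeping the sketch dependency-free.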

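Cite this article

Li Yuan, Li Rui, Lin Jinshan, Jin Lingfeng, Shao Xianjun, Zhang Guanjun. Text mining method for power transformer defect records based on a character-word hybrid ensemble model[J]. Electric Power Engineering Technology (电力工程技术), 2024, 43(6): 153-162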

History
  • Received: 2024-04-07
  • Last revised: 2024-06-29
  • Accepted: 2023-12-30
  • Published online: 2024-11-26
  • Publication date: 2024-11-28