Design of a Water Conservancy Information Classification Scheme Based on a Topic Model
Affiliation: School of Computer Science, Wuhan University


    Abstract:

    Classifying water conservancy information is the most important task in standardizing the sharing of water conservancy scientific data, so the large volume of data in this field needs to be classified. To handle the unstructured nature of water conservancy text data, this paper designs a topic-model-based scheme for classifying water conservancy text. A new topic model is proposed that combines the strengths of the LDA topic model and GloVe word vectors. The AdaBoost algorithm is used to improve the KNN classifier: each iteration adapts to the errors made by the current classifier, and the final result is an ensemble of classifiers. Experimental results show that the AdaBoost-boosted KNN performs well on water conservancy text classification, and far better than the commonly used naive Bayes and decision tree classifiers. Compared with the original KNN classifier, micro-averaged accuracy improves by 1.1 percentage points and macro-averaged accuracy by 4.1 percentage points, which indicates that boosting the KNN classifier with AdaBoost is effective for water conservancy text classification.
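The abstract does not say how AdaBoost is attached to KNN. A plain KNN vote does not take per-sample weights, so one common workaround is boosting by resampling: each round draws a training set according to the current weights, and misclassified points are drawn more often in later rounds. The sketch below illustrates that idea with the multi-class SAMME form of AdaBoost. It is an assumption for illustration, not the authors' implementation. The function names `knn_predict`, `boost_knn` and `boosted_predict`, the two-dimensional vectors (stand-ins for topic features) and the two labels are all hypothetical.

```python
import math
import random
from collections import Counter, defaultdict


def knn_predict(train, x, k=3):
    """Majority vote among the k training points closest to x."""
    def sq_dist(point):
        return sum((a - b) ** 2 for a, b in zip(point[0], x))
    nearest = sorted(train, key=sq_dist)[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def boost_knn(data, rounds=10, k=3, seed=0):
    """AdaBoost (SAMME) by weighted resampling, with KNN as base learner.

    Assumes data is a list of (vector, label) pairs with >= 2 labels.
    """
    rng = random.Random(seed)
    n = len(data)
    n_classes = len({label for _, label in data})
    weights = [1.0 / n] * n
    ensemble = []  # (alpha, resampled training set) pairs
    for _ in range(rounds):
        # Heavily weighted points are drawn more often into this round's set.
        sample = rng.choices(data, weights=weights, k=n)
        wrong = [knn_predict(sample, x, k) != y for x, y in data]
        err = sum(w for w, bad in zip(weights, wrong) if bad)
        if err >= 1 - 1 / n_classes:
            continue  # no better than random guessing: discard this round
        err = max(err, 1e-10)  # avoid log(0) when the round makes no errors
        alpha = math.log((1 - err) / err) + math.log(n_classes - 1)
        ensemble.append((alpha, sample))
        # Raise the weight of misclassified points, then renormalize.
        weights = [w * math.exp(alpha) if bad else w
                   for w, bad in zip(weights, wrong)]
        total = sum(weights)
        weights = [w / total for w in weights]
    if not ensemble:
        ensemble.append((1.0, list(data)))  # fall back to plain KNN
    return ensemble


def boosted_predict(ensemble, x, k=3):
    """Each stored KNN votes for a label with weight alpha."""
    votes = defaultdict(float)
    for alpha, sample in ensemble:
        votes[knn_predict(sample, x, k)] += alpha
    return max(votes, key=votes.get)


# Hypothetical toy data: two well-separated groups of feature vectors.
toy = [((0, 0), "flood"), ((0, 1), "flood"), ((1, 0), "flood"),
       ((1, 1), "flood"), ((0.5, 0.5), "flood"),
       ((5, 5), "drought"), ((5, 6), "drought"), ((6, 5), "drought"),
       ((6, 6), "drought"), ((5.5, 5.5), "drought")]
ensemble = boost_knn(toy)
print(boosted_predict(ensemble, (0.2, 0.3)))
```

On clean data like this toy set, most rounds already fit the training points and the weights barely move. The reweighting step matters on noisier data, where points that keep being misclassified come to dominate later samples.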

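Cite this article

Zhuge Qingzi (诸葛庆子), Zhang Shenwen (张审问), Cai Zhaohui (蔡朝晖), et al. Design of a water conservancy information classification scheme based on a topic model [J]. Water Resources Informatization (水利信息化), 2018(6).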

History
  • Received: 2018-09-29
  • Last revised: 2018-11-20
  • Accepted: 2018-11-20
  • Published online: 2018-12-24