Design of an Association Rule Mining System Based on the Apriori Algorithm


Design of an Association Rule Mining System Based on the Apriori Algorithm (task statement, proposal report, thesis of 15,000 characters)
Abstract
Data mining is the process of extracting implicit, useful information and knowledge from large, incomplete, noisy, fuzzy, and random data sets. It is a data analysis approach with very broad application prospects, particularly for the large-scale data analysis problems that have accompanied the development of information technology. Among the many data mining algorithms, association rule mining is an important research direction. Because association rule mining can capture important hidden relationships in data fairly effectively, and because the mined rules are concise and easy to understand, it has been applied successfully in a growing number of fields in recent years. This thesis implements an association rule visualization system based on the Apriori algorithm, in order to show clearly how association rules are discovered from a data set. It also uses real data sets to examine the effectiveness of the Apriori algorithm and the influence of each parameter setting on its running behavior and accuracy.
For the implementation, we built a Web-based platform for visualizing association rules. Through a web page, the front end accepts the minimum support and minimum confidence thresholds and uploads the training data set; these are sent to the server over HTTP (HyperText Transfer Protocol). The server performs the association rule mining computation on the received data and returns the resulting rule set to the browser, where it is displayed visually. The system thus lets us mine the patterns hidden in real data and check the plausibility and accuracy of the resulting association rules.
To test the algorithm, we ran cross-validation on several real data sets. We varied the minimum support and minimum confidence thresholds to change how the algorithm runs, applied the resulting association rules to the data in the test set, and recorded two quantities: the probability that a rule matches, and the probability that a rule predicts correctly given that it matches. Analysis of the experimental results shows that the support and confidence thresholds affect the algorithm's performance and its running time differently. We also analyzed the discovered rules and found that some of them have a corresponding theoretical basis, which to some extent reflects the soundness and effectiveness of the Apriori algorithm.
Keywords: data mining; association rule mining; Apriori algorithm
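The mining and evaluation steps summarized in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation, which is not reproduced on this page: the function names (`frequent_itemsets`, `association_rules`, `evaluate`), the toy transactions, and the exact definitions of match rate and prediction accuracy are assumptions made for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise Apriori search. Returns {itemset: support} for every
    itemset whose support (fraction of transactions) is >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    # Level 1 candidates: every item that occurs in the data.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Count the support of each k-item candidate.
        level = {}
        for cand in candidates:
            support = sum(1 for t in transactions if cand <= t) / n
            if support >= min_support:
                level[cand] = support
        frequent.update(level)
        # Join step: merge frequent k-itemsets into (k+1)-item candidates.
        # Prune step: keep a candidate only if all its k-subsets are frequent.
        k += 1
        candidates = set()
        for a, b in combinations(level, 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in level for sub in combinations(union, k - 1)
            ):
                candidates.add(union)
    return frequent


def association_rules(frequent, min_confidence):
    """Return (antecedent, consequent, support, confidence) tuples."""
    rules = []
    for itemset, support in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, r)):
                # Every subset of a frequent itemset is itself frequent,
                # so its support is already in the dictionary.
                confidence = support / frequent[antecedent]
                if confidence >= min_confidence:
                    rules.append(
                        (antecedent, itemset - antecedent, support, confidence)
                    )
    return rules


def evaluate(rules, test_transactions):
    """Assumed metrics: match rate is the share of test transactions that
    contain at least one rule antecedent; accuracy is the share of
    (rule, transaction) matches whose consequent is also present."""
    matched = firings = correct = 0
    for t in map(frozenset, test_transactions):
        fired = [rule for rule in rules if rule[0] <= t]
        matched += bool(fired)
        firings += len(fired)
        correct += sum(1 for rule in fired if rule[1] <= t)
    match_rate = matched / len(test_transactions) if test_transactions else 0.0
    accuracy = correct / firings if firings else 0.0
    return match_rate, accuracy


# Hypothetical toy data, used only to exercise the functions.
train = [
    {"bread", "milk"},
    {"bread", "diaper", "beer", "eggs"},
    {"milk", "diaper", "beer", "cola"},
    {"bread", "milk", "diaper", "beer"},
    {"bread", "milk", "diaper", "cola"},
]
test = [{"beer", "diaper", "milk"}, {"beer", "bread"}, {"milk", "cola"}]

rules = association_rules(frequent_itemsets(train, 0.6), 0.8)
print(rules)
print(evaluate(rules, test))
```

Raising `min_support` shrinks every candidate level and therefore the running time, whereas `min_confidence` only filters rules after the frequent itemsets are known. This is one plausible reason for the two thresholds affecting running time differently, though the thesis's own measurements are not shown here.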

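Contents
Chapter 1 Introduction
1.1 Overview of the Research
1.2 Technologies Used in the System Implementation
1.3 State of Research in China and Abroad
1.3.1 Research Abroad
1.3.2 Research in China
1.4 Organization of the Thesis
Chapter 2 Concepts of Association Rules
2.1 Introduction
2.2 Basic Concepts of Association Rules
2.2.1 Definitions Related to Association Rules
2.2.2 Properties of Association Rules
2.3 Classification of Association Rules
2.4 Steps of Association Rule Mining
Chapter 3 Implementation of the Apriori Visualization Algorithm
3.1 Overview of the Apriori Algorithm
3.2 Idea of the Apriori Algorithm
3.3 Implementation Process of the Apriori Algorithm
3.4 Visualization of the Apriori Algorithm
3.5 Introduction to the Apriori Visualization System
3.6 Rule Matching and Prediction
Chapter 4 Data Testing and Result Analysis
4.1 Description of the Test Data Sets
4.2 Result Analysis
4.2.1 Running Time of the Apriori Algorithm
4.2.2 Validating the Association Rules
4.2.3 Practical Significance of the Association Rules
Chapter 5 Summary and Outlook
5.1 Summary
5.2 Outlook
References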