
Frequent Symptom Sets Identification from Uncertain Medical Data in Differentially Private Way
Posted: 2017-09-04 21:40   Source: unknown   Author: admin
       Abstract: Data mining techniques are applied to identify hidden patterns in large amounts of patient data. These patterns can assist physicians in making more accurate diagnoses. Because patients differ in physical condition, the same physiological index corresponds to a different symptom association probability for each patient, so patient data are uncertain. Data mining techniques designed for certain (deterministic) data therefore cannot be applied directly to such data. Patient data are also sensitive: an adversary with sufficient background information can use the patterns mined from uncertain medical data to obtain sensitive information about patients. In this paper, a new algorithm is presented to determine the top-k most frequent itemsets from uncertain medical data while protecting data privacy. Building on traditional algorithms for mining frequent itemsets from uncertain data, our algorithm applies the sparse vector algorithm and the Laplace mechanism to ensure differential privacy for both the top-k most frequent itemsets and their expected supports. We prove theoretically that our algorithm guarantees differential privacy. Moreover, we carry out experiments with four real-world scenario datasets and two synthetic datasets. The experimental results demonstrate the performance of our algorithm.
1. Introduction
       The Internet of Things (IoT) builds on many underlying technologies, such as wireless sensors, data management, and cloud computing [1]. Today, IoT technology is successfully applied in the field of eHealth [2–4]. Medical personnel can use IoT technology to collect large amounts of patient data, which helps them provide better medical services to patients [5, 6].
       Frequent itemset mining is applied in fields such as eHealth and bioinformatics. Traditional algorithms for mining frequent itemsets from medical data assume certain (deterministic) data [7] and can be applied to discover hidden symptom patterns in very large collections of patient symptom data. Health managers can use these patterns to provide better healthcare for users [8]. For example, in [9, 10], the Apriori algorithm was applied to identify prevalent diseases and to analyze medical billing. However, the Apriori algorithm mines frequent itemsets from certain data. In medicine, because patients differ in physical condition, the same physiological index corresponds to a different symptom association probability for each patient. As a result, patient data are uncertain, and traditional algorithms for mining frequent itemsets from certain data cannot be applied to them directly.
       Another important factor is that medical records contain sensitive patient information. An adversary with sufficient background information can make use of frequent patterns mined from patient data to obtain the sensitive information of patients. Hence, it is very important to protect patient privacy when mining frequent itemsets from medical data [11].
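The abstract names the Laplace mechanism as one of the tools used to protect released statistics. The sketch below is only a generic illustration of that mechanism applied to a single support value, not the algorithm of this paper. The function names, the privacy budget `epsilon`, and the sensitivity value of 1 are assumptions made for the example. The sensitivity of 1 reflects that one patient record contributes at most 1 to any support value.

```python
import random

def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponential variables with mean `scale`
    # follows a Laplace distribution centered at 0 with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def noisy_support(true_support, epsilon, sensitivity=1.0, rng=None):
    """Return the support value perturbed with Laplace noise.

    Assumption: adding or removing one record changes the support by at
    most `sensitivity`, so the noise scale is sensitivity / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if rng is None:
        rng = random.Random()
    return true_support + laplace_noise(sensitivity / epsilon, rng)

if __name__ == "__main__":
    # Illustrative values only: a support of 10.0 released with epsilon = 0.5.
    print(noisy_support(10.0, epsilon=0.5, rng=random.Random(42)))
```

A smaller `epsilon` gives a larger noise scale, which means stronger privacy but a less accurate released value.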
       The set of symptoms that a patient suffers from constitutes the patient’s data. Because each symptom carries an association probability, the patient’s data are uncertain, and a large collection of such records forms an uncertain dataset. In medicine, there is a substantial body of research on symptom association probability. For example, one study monitored oesophageal pH over a 24 h period to obtain a symptom association probability, which was then used to evaluate the association between a patient’s symptoms and gastroesophageal reflux [12]. By analyzing large amounts of patient data, Beglinger et al. determined the probability that a patient suffering from Huntington’s disease also had obsessive and compulsive symptoms [13]. By analyzing data from patients suffering from irritable bowel syndrome, Arsiè et al. determined the probability of association between meal ingestion and abdominal pain symptoms in these patients [14]. In this paper, based on symptom association probabilities obtained by medical technology, we focus on how to mine frequent itemsets from uncertain medical data while also protecting data privacy. In the uncertain medical data, each item corresponds to a patient symptom.
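To make the notion of uncertain data concrete, the following minimal sketch computes the expected support of a symptom set. It uses the common expected-support model and assumes that symptoms within one record are independent; that assumption may differ from the exact formulation in this paper. The records, symptom names, and probabilities are invented for illustration.

```python
from math import prod

# Hypothetical uncertain dataset: each patient record maps a symptom to its
# symptom association probability (all values are illustrative).
records = [
    {"heartburn": 0.9, "cough": 0.5},
    {"heartburn": 0.4, "abdominal_pain": 0.8},
    {"heartburn": 0.5, "cough": 0.5, "abdominal_pain": 0.2},
]

def expected_support(itemset, records):
    """Sum over records of the probability that every item in `itemset`
    occurs, assuming items within a record are independent.
    A symptom absent from a record has probability 0."""
    return sum(
        prod(record.get(item, 0.0) for item in itemset)
        for record in records
    )

if __name__ == "__main__":
    print(expected_support({"heartburn"}, records))
    print(expected_support({"heartburn", "cough"}, records))
```

With certain data every probability would be 0 or 1, and the expected support would reduce to the ordinary occurrence count used by algorithms such as Apriori.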

