Frequent Symptom Sets Identification from Uncertain Medical Data in Differentially Private Way
Posted: 2017-09-04 21:40
Abstract: Data mining techniques are applied to identify hidden patterns in large amounts of patient data. These patterns can assist physicians in making more accurate diagnoses. Because patients differ in physical condition, the same physiological index corresponds to a different symptom association probability for each patient, so patient data are uncertain. Data mining techniques designed for certain (deterministic) data cannot be applied directly to such data. Patient data are also sensitive: an adversary with sufficient background information can use the patterns mined from uncertain medical data to obtain sensitive information about patients. In this paper, a new algorithm is presented to determine the top-k most frequent itemsets from uncertain medical data while protecting data privacy. Building on traditional algorithms for mining frequent itemsets from uncertain data, our algorithm applies the sparse vector algorithm and the Laplace mechanism to ensure differential privacy for the top-k most frequent itemsets and for their expected supports. We prove theoretically that our algorithm guarantees differential privacy. Moreover, we carry out experiments with four real-world scenario datasets and two synthetic datasets. The experimental results demonstrate the performance of our algorithm.
1. Introduction
The Internet of Things (IoT) builds on many underlying technologies, such as wireless sensors, data management, and cloud computing [1]. Today, IoT technology is successfully applied in the field of eHealth [2–4]. Medical personnel can use IoT technology to collect large amounts of patient data, which helps them provide better medical services to patients [5, 6].
Frequent itemset mining is applied in fields such as eHealth and bioinformatics. Traditional algorithms for mining frequent itemsets from medical data assume certain (deterministic) data [7] and can discover hidden symptom patterns in large volumes of patient symptom data. Health managers can use these patterns to provide better healthcare for users [8]. For example, in [9, 10], the Apriori algorithm was applied to identify prevalent diseases and analyze medical billing. However, the Apriori algorithm mines frequent itemsets from certain data. In medicine, because patients differ in physical condition, the same physiological index corresponds to a different symptom association probability for each patient. As a result, patient data are uncertain, and traditional algorithms for mining frequent itemsets from certain data cannot be applied to them directly.
Another important factor is that medical records contain sensitive patient information. An adversary with sufficient background information can use frequent patterns mined from patient data to obtain sensitive information about patients. Hence, it is very important to protect patient privacy when mining frequent itemsets from medical data [11].
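As a rough illustration of how a released pattern statistic can be protected, the following sketch adds Laplace noise to a support value, which is the basic building block of the Laplace mechanism named in the abstract. The function names, the sensitivity of 1, and the choice of epsilon are illustrative assumptions; this is not the algorithm proposed in this paper.

```python
import random

def laplace_noise(scale, rng=random):
    # The difference of two independent Exp(1) draws is Laplace(0, 1);
    # multiplying by `scale` gives Laplace noise with that scale.
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))

def noisy_support(true_support, epsilon, sensitivity=1.0, rng=random):
    """Perturb a support value with Laplace(sensitivity / epsilon) noise.

    `sensitivity` is assumed to bound how much one patient's record can
    change the support (hypothetical default of 1).
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return true_support + laplace_noise(sensitivity / epsilon, rng)
```

A smaller epsilon gives a larger noise scale and therefore stronger privacy at the cost of accuracy; for example, `noisy_support(42.0, epsilon=0.5)` draws noise with scale 2.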
The set of symptoms that a patient suffers from constitutes the patient’s data. Because each symptom has an associated probability, the patient’s data are uncertain, and a large collection of such records forms an uncertain dataset. In the field of medicine, there is a substantial body of research on symptom association probability. For example, one study monitored oesophageal pH over a 24 h period to obtain the symptom association probability, which was then used to evaluate the association between a patient’s symptoms and gastroesophageal reflux [12]. By analyzing large amounts of patient data, Beglinger et al. determined the probability that a patient suffering from Huntington’s disease also had obsessive and compulsive symptoms [13]. By analyzing data from patients suffering from irritable bowel syndrome, Arsiè et al. determined the probability of an association between meal ingestion and abdominal pain symptoms in these patients [14]. In this paper, based on symptom association probabilities obtained by medical techniques, we focus on how to mine frequent itemsets from uncertain medical data while also protecting data privacy. In the uncertain medical data, each item corresponds to a patient symptom.
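The uncertain data model described here can be sketched as follows. The sketch assumes, as is common in the uncertain-data mining literature but not stated in this section, that item probabilities within one patient record are independent, so that the expected support of an itemset is the sum over records of the product of its item probabilities. The symptom names and probabilities are made up for illustration.

```python
from math import prod

# Hypothetical uncertain dataset: each record maps a symptom (item) to the
# probability that the patient exhibits it. All values are invented.
transactions = [
    {"fever": 0.9, "cough": 0.5},
    {"fever": 0.4, "cough": 0.8, "fatigue": 0.5},
    {"cough": 0.6, "fatigue": 0.3},
]

def expected_support(itemset, transactions):
    """Sum, over records, of the product of the item probabilities.

    Assumes independent items within a record; a symptom absent from a
    record is treated as having probability 0.
    """
    return sum(
        prod(t.get(item, 0.0) for item in itemset)
        for t in transactions
    )

esup_pair = expected_support({"fever", "cough"}, transactions)
esup_cough = expected_support({"cough"}, transactions)
```

With these invented values, the itemset {fever, cough} has an expected support of about 0.77 (0.45 + 0.32 + 0) and {cough} about 1.9. An itemset would be considered frequent when its expected support reaches a chosen threshold.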