EPAM Systems Inc.
Title: Data Analysis Approaches for Structured and Unstructured Health Data Mining
Abstract: Health-related data are either collected by various tools or generated by a wide range of medical and IoT devices. These data are highly heterogeneous: they can be stored in relational databases, in non-relational (NoSQL) databases, or in specific file formats defined by different software vendors. Meaningful analysis of such data poses a set of challenging problems, including but not limited to building a unified data pre-processing and cleaning infrastructure that yields a smaller but more efficient search space on which the various data analysis tasks become more feasible. This talk will compare and summarize industry-standard big data analysis platforms such as Hadoop and Elasticsearch, and will then introduce efficient, non-trivial information retrieval algorithms that are well suited to health data, based on success stories achieved by global software vendors in the field of Health Data Management and Analysis.