UMBC High Performance Computing Facility
Big Data Analytics for High-Dimensional and Heterogeneous Datasets
Vandana Janeja, Information Systems
Akshay Grover
Jay Gholap
As data grow in diversity and volume, there is an increasing need to evaluate big data
frameworks and how well they adapt to analytics techniques. In this project we will evaluate
the performance of big data solutions across multiple analytic approaches. Publicly available
healthcare data will serve as a test bed on which analytics techniques, particularly
ensemble-based learning, will be evaluated. Key parameters to be measured include algorithmic
outcomes (such as diversity and size of training samples), usability, adaptability and
modularity, robustness, and efficiency.
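
To make the measured parameters concrete, the sketch below shows one way diversity, training sample size, and efficiency might be recorded for an ensemble learner. It is an illustration under stated assumptions, not the project's method. The data are synthetic rather than healthcare data, the ensemble is a bag of decision stumps, diversity is taken to be mean pairwise disagreement between ensemble members, and efficiency is taken to be training wall-clock time. All function names are hypothetical.

```python
import random
import time
from itertools import combinations


def make_data(n, seed=0):
    """Synthetic stand-in for a real dataset: two features, one binary label."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = (rng.gauss(0, 1), rng.gauss(0, 1))
        y = 1 if x[0] + 0.5 * x[1] + rng.gauss(0, 0.5) > 0 else 0
        rows.append((x, y))
    return rows


def fit_stump(rows):
    """Fit a single-feature threshold classifier by exhaustive search."""
    best = None
    for f in range(2):
        for (x, _) in rows:
            t = x[f]
            for sign in (1, -1):
                errors = sum(
                    1 for (xi, yi) in rows
                    if (1 if sign * (xi[f] - t) > 0 else 0) != yi
                )
                if best is None or errors < best[0]:
                    best = (errors, f, t, sign)
    _, f, t, sign = best
    return lambda x: 1 if sign * (x[f] - t) > 0 else 0


def bagged_ensemble(rows, n_models, sample_size, seed=0):
    """Train each stump on a bootstrap sample of the given size."""
    rng = random.Random(seed)
    return [
        fit_stump([rng.choice(rows) for _ in range(sample_size)])
        for _ in range(n_models)
    ]


def predict(models, x):
    """Majority vote over the ensemble members."""
    votes = sum(m(x) for m in models)
    return 1 if 2 * votes > len(models) else 0


def disagreement(models, rows):
    """Mean fraction of points on which a pair of models differs."""
    pairs = list(combinations(models, 2))
    total = sum(
        sum(1 for (x, _) in rows if a(x) != b(x)) / len(rows)
        for a, b in pairs
    )
    return total / len(pairs)


def evaluate(sample_size, n_models=5, seed=0):
    """Record accuracy, diversity, and training time for one sample size."""
    train, test = make_data(300, seed), make_data(200, seed + 1)
    start = time.perf_counter()
    models = bagged_ensemble(train, n_models, sample_size, seed)
    elapsed = time.perf_counter() - start
    accuracy = sum(1 for (x, y) in test if predict(models, x) == y) / len(test)
    return {
        "sample_size": sample_size,
        "accuracy": accuracy,
        "diversity": disagreement(models, test),
        "train_seconds": elapsed,
    }


if __name__ == "__main__":
    for size in (20, 50, 100):
        print(evaluate(size))
```

Running the script prints one record per training sample size, so the effect of sample size on diversity, accuracy, and training time can be compared side by side. Usability, adaptability, modularity, and robustness are not covered by this sketch.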