Author

Publication Date

Date of Final Oral Examination (Defense)

Type of Culminating Activity

Degree Title

Department

Computer Science

Major Advisor

Steven M. Cutchin, Ph.D.

Advisor

Jerry Alan Fails, Ph.D.

Advisor

Maria Soledad Pera, Ph.D.

Advisor

Catherine Olschanowsky, Ph.D.

Abstract

Data visualization has proven effective for detecting patterns and drawing inferences from raw data by transforming it into visual representations. As data grows large, visualizing it faces two major challenges: 1) limited resolution, i.e., a screen is limited to a few million pixels while the data can contain a billion data points, and 2) computational load, i.e., processing the data becomes too demanding for a single-node system. This work addresses both issues to enable efficient big data visualization. The developed system uses a high-pixel-density, large-format display, which allows fine details to be shown on screen when visualizing data. Apache Spark and Hadoop are used in the system so that the computation can be distributed across a cluster.

The system is demonstrated using a global wind flow simulation. The Global Surface Summary of the Day dataset is processed and visualized in web browsers using D3.js (Data-Driven Documents) code. We conducted both a performance evaluation and a user study to measure the performance and effectiveness of the system. The system was most efficient when visualizing data as streamed bitmap images rather than as streamed raw data. However, it rendered images at only 6-10 frames per second (FPS) and did not meet our target of 30 FPS. The user study indicated that the system is effective and easy to use for data visualization. Our results suggest that Google Chrome, in its current state, may not be powerful enough for heavy 2D data visualization on the web and needs further development before it can visualize data of large magnitude.