CUIDS Distinguished Speaker Series: Big OLAP Data Cube Compression Algorithms in Column-Oriented Cloud/Edge Data Infrastructures

Categories: Lectures and Seminars

Tuesday, October 15, 2024

10:00 AM - 11:00 AM

5345 Herzberg Laboratories

1125 Colonel By Dr, Ottawa, ON

Contact Information

CUIDS, 613-520-2600 ext. 8751, CUIDS@carleton.ca

Cost

$0

About this Event

Host Organization: Carleton University Institute for Data Science

Big data is gaining momentum in the research community because of the many challenges its management poses, and it is relevant not only in academia but also in industry, where it plays a major role. Several kinds of applications now exploit big data, such as Web advertising, social network intelligence, e-science applications, and smart city applications. Big multidimensional data are a special case of big data that fully exhibit the well-known 3Vs (volume, velocity, variety) and are of significant current interest. Within this research context, this Distinguished Speaker Series talk by Prof. Alfredo Cuzzocrea, titled “Big OLAP Data Cube Compression Algorithms in Column-Oriented Cloud/Edge Data Infrastructures”, focuses on the problem of compressing so-called big OLAP data cubes over column-oriented cloud/edge data infrastructures.

More specifically, this talk proposes a specialized representation of massive OLAP data cubes over cloud/edge environments via column-oriented paradigms, which have traditionally been used in successful in-memory database query engines. Under this decomposition mechanism, each “column” is then compressed via a state-of-the-art synopsis data structure, called D-Syn, which has already proved its effectiveness and efficiency in multidimensional data compression, thanks to an innovative analytical interpretation of multidimensional data cubes. The talk also discusses several alternatives for distributing the resulting synopsis chunks effectively and efficiently across cloud and/or edge nodes, and how to support approximate query answering over such big data structures.
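
For readers unfamiliar with the general idea, the pipeline described above (column-oriented decomposition, per-column synopsis, approximate query answering) can be sketched roughly as follows. This is a minimal illustration only: it does not implement D-Syn, whose details are not given here, and substitutes a simple fixed-size bucket synopsis instead. The function names, the sample values, and the bucket size are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

Synopsis = List[Tuple[int, float]]  # one (count, sum) pair per bucket


def to_columns(rows: List[Dict[str, float]]) -> Dict[str, List[float]]:
    """Decompose row records into one list per attribute (column layout)."""
    columns: Dict[str, List[float]] = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def build_synopsis(values: List[float], bucket_size: int) -> Synopsis:
    """Summarize a numeric column as (count, sum) per fixed-size bucket.

    Stand-in for a real synopsis structure; NOT the D-Syn method.
    """
    synopsis: Synopsis = []
    for start in range(0, len(values), bucket_size):
        chunk = values[start:start + bucket_size]
        synopsis.append((len(chunk), float(sum(chunk))))
    return synopsis


def approx_range_sum(synopsis: Synopsis, bucket_size: int,
                     lo: int, hi: int) -> float:
    """Estimate sum(values[lo:hi]) from the synopsis alone.

    Assumes values are spread uniformly inside each bucket, so a
    partially covered bucket contributes a proportional share of its sum.
    """
    total = 0.0
    for i, (count, bucket_sum) in enumerate(synopsis):
        start = i * bucket_size
        end = start + count
        overlap = max(0, min(hi, end) - max(lo, start))
        if overlap:
            total += bucket_sum * overlap / count
    return total


# Hypothetical cube cells, flattened into rows for the example.
rows = [{"month": float(m), "sales": 10.0 * m} for m in range(1, 9)]
columns = to_columns(rows)
sales_synopsis = build_synopsis(columns["sales"], bucket_size=4)
estimate = approx_range_sum(sales_synopsis, 4, lo=2, hi=6)
```

Here eight values in the `sales` column shrink to two bucket summaries, and the range query is answered from those summaries without touching the original column. In a cloud/edge setting each bucket (or group of buckets) would be a candidate chunk to place on a node; how to place them is one of the questions the talk addresses.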