CUIDS Distinguished Speaker Series: Big OLAP Data Cube Compression Algorithms in Column-Oriented Cloud/Edge Data Infrastructures
Categories: Lectures and Seminars | Intended for Alumni, Carleton Community, Current Students, Staff/Faculty
5345 Herzberg Laboratories
1125 Colonel By Dr, Ottawa, ON
Contact Information
CUIDS, 613-520-2600 ext. 8751, CUIDS@carleton.ca
Registration
Cost
$0
About this Event
Host Organization: Carleton University Institute for Data Science
More Information: Please click here for additional details.
Big data is gaining momentum in the research community because of the many challenges involved in managing it. It is relevant not only in academia but also in industry, where it plays a major role. Many kinds of applications now exploit big data, including Web advertising, social network intelligence, e-science applications, and smart city applications. Big multidimensional data are a special case of big data that fully exhibit the well-known 3Vs (volume, velocity, variety) and are currently of particular interest. Within this research context, this talk by Prof. Alfredo Cuzzocrea, titled “Big OLAP Data Cube Compression Algorithms in Column-Oriented Cloud/Edge Data Infrastructures”, focuses on the problem of compressing so-called big OLAP data cubes over column-oriented cloud/edge data infrastructures.
More specifically, this talk proposes a specialized representation of massive OLAP data cubes over cloud/edge environments via column-oriented paradigms, which have traditionally been used in successful in-memory database query engines. Under this decomposition mechanism, each “column” is then compressed via a state-of-the-art synopsis data structure called D-Syn, which has already proved its effectiveness and efficiency in multidimensional data compression thanks to an innovative analytical interpretation of multidimensional data cubes. The talk also discusses several alternatives for distributing the resulting synopsis chunks effectively and efficiently across cloud and/or edge nodes, and how to support approximate query answering over such big data structures.
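For readers unfamiliar with the general idea, the following is a minimal sketch of column-wise compression with approximate query answering. It does not implement D-Syn, whose details are not given in this announcement; a simple bucketed-sum histogram is swapped in as a stand-in synopsis. The names `build_synopsis` and `approx_range_sum`, the sample column names, and the sample values are all hypothetical and chosen only for illustration.

```python
# Illustrative only: a bucketed-sum histogram stands in for the synopsis.
# This is NOT D-Syn; all names and data here are assumptions.

def build_synopsis(column, num_buckets):
    """Compress one column into buckets of (start, end, sum)."""
    n = len(column)
    width = -(-n // num_buckets)  # ceiling division
    buckets = []
    for start in range(0, n, width):
        chunk = column[start:start + width]
        buckets.append((start, start + len(chunk), sum(chunk)))
    return buckets


def approx_range_sum(buckets, lo, hi):
    """Estimate sum(column[lo:hi]) from the synopsis alone.

    Assumes values are spread uniformly inside each bucket, so a
    partially covered bucket contributes a proportional share of its sum.
    """
    total = 0.0
    for start, end, bucket_sum in buckets:
        overlap = max(0, min(hi, end) - max(lo, start))
        total += bucket_sum * overlap / (end - start)
    return total


# A tiny "cube" stored column by column (hypothetical measures).
columns = {
    "sales": [10, 12, 11, 9, 30, 32, 31, 29],
    "units": [1, 2, 1, 1, 3, 4, 3, 3],
}

# Each column is compressed independently; in a cloud/edge setting the
# resulting synopses could then be placed on different nodes.
synopses = {name: build_synopsis(values, 2) for name, values in columns.items()}

exact = sum(columns["sales"][0:1])                     # 10
estimate = approx_range_sum(synopses["sales"], 0, 1)   # 10.5
```

Here the eight `sales` values shrink to two buckets, and a range-sum query is answered from the buckets alone, trading a small error for a smaller footprint. Queries aligned with bucket boundaries are answered exactly under this scheme.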