Technical Program

Paper Detail

Paper: SS-L7.6
Session: Signal and Information Processing for 'Big Data'
Location: Room E
Session Time: Thursday, March 29, 10:30 - 12:30
Presentation Time: Thursday, March 29, 12:10 - 12:30
Presentation: Lecture
Paper Title: DYNAMIC DISTRIBUTED DIMENSIONAL DATA MODEL (D4M) DATABASE AND COMPUTATION SYSTEM
Authors: Jeremy Kepner, William Arcand, William Bergeron, Nadya T. Bliss, Robert Bond, Chansup Byun, Gary Condon, Kenneth Gregson, Matthew Hubbell, Jonathan Kurz, Andrew McCabe, Peter Michaleas, Andrew Prout, Albert Reuther, Antonio Rosa, Charles Yee, MIT Lincoln Laboratory, United States
Abstract: A crucial element of large web companies is their ability to collect and analyze massive amounts of data. Tuple store databases are a key enabling technology employed by many of these companies (e.g., Google Bigtable and Amazon Dynamo). Tuple stores are highly scalable and run on commodity clusters, but they lack interfaces that support efficient development of mathematically based analytics. D4M (Dynamic Distributed Dimensional Data Model) has been developed to provide a mathematically rich interface to tuple stores (and to Structured Query Language (SQL) databases). D4M allows linear algebra to be readily applied to databases. Using D4M, it is possible to create composable analytics with significantly less effort than with traditional approaches. This work describes the D4M technology, its application, and its performance.
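
To illustrate the idea of applying linear algebra to tuple-store data, the following is a minimal Python sketch. It is not D4M's own interface, which the abstract does not show. It assumes that records arrive as (row, column, value) triples and represents them as a sparse matrix held in nested dictionaries. The function names (to_assoc, transpose, matmul) and the sample document/word keys are hypothetical and chosen only for this illustration.

```python
from collections import defaultdict


def to_assoc(triples):
    """Build a sparse matrix (nested dicts) from (row, col, value) triples.

    Hypothetical helper for illustration; values with the same key are summed.
    """
    assoc = defaultdict(dict)
    for row, col, val in triples:
        assoc[row][col] = assoc[row].get(col, 0) + val
    return dict(assoc)


def transpose(a):
    """Swap row and column keys."""
    out = defaultdict(dict)
    for row, cols in a.items():
        for col, val in cols.items():
            out[col][row] = val
    return dict(out)


def matmul(a, b):
    """Sparse matrix product: only keys present in both operands contribute."""
    out = defaultdict(dict)
    for i, row in a.items():
        for k, a_ik in row.items():
            for j, b_kj in b.get(k, {}).items():
                out[i][j] = out[i].get(j, 0) + a_ik * b_kj
    return dict(out)


# Made-up sample data: which words occur in which documents.
triples = [
    ("doc1", "word|data", 1),
    ("doc1", "word|cluster", 1),
    ("doc2", "word|data", 1),
    ("doc3", "word|cluster", 1),
    ("doc3", "word|matrix", 1),
]

A = to_assoc(triples)

# A * A^T counts, for each pair of documents, the words they share.
co = matmul(A, transpose(A))
print(co["doc1"])
```

In this sketch a single matrix product replaces a hand-written join over the tuples, and the result is itself a sparse matrix that could feed another product. That is one way to read "composable" in the abstract, though the paper's actual constructs may differ.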