The Alliance vs. The Horde: A Struggle for the Future of Data Analytics

03/24/2010, 12:00pm to 1:00pm
Distinguished Lecturer Series

Michael J. Franklin
University of California, Berkeley
Computer Science

Computer Science Building, Rooms 150 & 151

Faculty Host: Yanlei Diao

Data analytics are increasingly at the heart of many aspects of our lives: commerce, entertainment, science, health care, and even dating. Data analysis in enterprises has traditionally been the domain of SQL-based data warehouses and associated business intelligence systems. These technologies represent a large and increasingly vibrant segment of the computer industry. Recently, however, a new technology stack inspired by MapReduce and related systems developed at large web companies has been gaining significant attention across a wide range of industries. These emerging technologies have demonstrated impressive scalability in cloud computing environments, running over petabytes of data on thousands of machines. Such results have led some to advocate a major rethinking of large-scale data management and analysis, including what has been dubbed (perhaps unproductively) the "NoSQL Movement". Not surprisingly, many in the database world see problems with this approach, and have not been shy about pointing them out.

In this talk I will try to frame the debate between these two schools of thought. I will then address the potential for a "via media" that combines advantages of both and describe some early research that is motivated by this goal. In the spirit of even-handedness, I will leave unspecified which camp is represented by the Alliance and which is represented by the Horde, but you are invited to draw your own conclusions.


Michael Franklin is a Professor of Computer Science at UC Berkeley focusing on new approaches for data management and data analysis. His recent projects have spanned systems ranging from dynamic networks of tiny wireless sensor devices to large-scale scientific grid computing and cloud computing infrastructures. He is a co-founder and CTO of Truviso, Inc., a real-time data analytics company that enables customers to quickly make sense of diverse, high-speed, continuous streams of information. His background also includes industrial experience developing text retrieval systems and one of the earlier massively parallel database systems. He has consulted for major computer companies, start-ups, and technology investors.

He is a Fellow of the ACM and a recipient of the National Science Foundation CAREER award and the ACM SIGMOD "Test of Time" award. He received his Ph.D. from the University of Wisconsin-Madison (1993) and his M.S.E. from the Wang Institute of Graduate Studies (1986). He was one of three siblings to receive a B.S. in Computer and Information Science from the University of Massachusetts, Amherst (1983), where in 2009 he was honored in the inaugural group of recipients of the CS Department Outstanding Alumni Achievement award.

A reception will be held at 3:40 PM in the atrium, outside the presentation room.