XLDB - Extremely Large Databases

XLDB-2015 Invited Speakers

Keith Adams
Engineer, Facebook AI Research

Keith Adams is an engineer in Facebook AI Research, where he is helping to build infrastructure for training intelligent systems. Before that he was a founding member of the HipHop Virtual Machine team, which created the PHP engine that powers Facebook, Wikipedia, Etsy, and many other sites. Before Facebook, he spent 9 years in VMware's virtual machine monitor team. Keith holds an ScB in computer science from Brown University.

Joy Bonaguro
Chief Data Officer, City and County of San Francisco

Joy Bonaguro is the first Chief Data Officer for the City and County of San Francisco, where she manages the City’s open data program. Joy has spent more than a decade working at the nexus of public policy, data, and technology. She has worked in the open data and open government field since its birth, spending seven years designing and managing the development of information systems to support planning and decision-making at the Greater New Orleans Community Data Center. Prior to joining the City, Joy worked at Lawrence Berkeley National Laboratory to help develop technology, cyber, and privacy policy, working closely with both the National Lab CIO Council and the Department of Energy Information Management Advisory Group. Joy earned her Master’s from UC Berkeley’s Goldman School of Public Policy, where she focused on IT policy.

Eric Brewer
Vice President of Infrastructure at Google

Eric Brewer is a vice president of infrastructure at Google. He pioneered the use of clusters of commodity servers for Internet services, based on his research at Berkeley. His “CAP Theorem” covers basic tradeoffs required in the design of distributed systems and followed from his work on a wide variety of systems, from live services, to caching and distribution services, to sensor networks. He is a member of the National Academy of Engineering, and winner of the ACM Foundation award for his work on large-scale services. Eric was named a “Global Leader for Tomorrow” by the World Economic Forum and “most influential person on the architecture of the Internet” by InfoWorld.

Stephen Brobst
Chief Technology Officer, Teradata Corporation

Stephen Brobst is the Chief Technology Officer for Teradata Corporation. Stephen performed his graduate work in Computer Science at the Massachusetts Institute of Technology, where his Master's and PhD research focused on high-performance parallel processing. He also completed an MBA with joint course and thesis work at the Harvard Business School and the MIT Sloan School of Management.
Stephen has been on the faculty of The Data Warehousing Institute since 1996 in the areas of Big Data Analytics, High Performance Data Warehouse Design, Advanced Data Visualization, Enterprise Information Management, Capacity Planning, Social Network Analysis, and Real-Time Analytics.
During Barack Obama's first term he was also appointed to the President's Council of Advisors on Science and Technology (PCAST) in the working group on Networking and Information Technology Research and Development (NITRD). He was recently ranked by ExecRank as the #4 CTO in the United States (behind the CTOs from Amazon.com, Tesla Motors, and Intel) out of a pool of 10,000+ CTOs.

Kurt Brown

Kurt leads the Data Platform team at Netflix. His group architects and manages the technical infrastructure underpinning the company's analytics. The Netflix data infrastructure includes various Big Data technologies (e.g. Hadoop, Hive, and Pig), Netflix open sourced applications and services (e.g. Lipstick and Genie), and traditional BI tools (e.g. Teradata, MicroStrategy, and Tableau).


Rene Brun

Rene Brun obtained his PhD from the Blaise Pascal University in Clermont-Ferrand in 1973. Since 1973 he has been working at CERN, supervising the design and implementation of several large software systems such as
-GEANT: a detector simulation package
-PAW: a Physics Analysis workstation system
-ROOT: An Object-Oriented Data Storage and Analysis framework.
These systems are used by several thousand scientists in many universities and laboratories worldwide. Since his retirement in 2013, he has been working on a physics model describing elementary particles and their interactions.

Michael Carey
UC Irvine

Michael J. Carey is a Bren Professor of Information and Computer Sciences at UC Irvine. Before joining UCI in 2008, Carey worked at BEA Systems for seven years and led the development of BEA's AquaLogic Data Services Platform product for virtual data integration. He also spent a dozen years teaching at the University of Wisconsin-Madison, five years at the IBM Almaden Research Center working on object-relational databases, and a year and a half at e-commerce platform startup Propel Software during the infamous 2000-2001 Internet bubble. Carey is an ACM Fellow, a member of the National Academy of Engineering, and a recipient of the ACM SIGMOD E.F. Codd Innovations Award. His current interests all center around data-intensive computing and scalable data management (a.k.a. Big Data).

John M. Chambers
Stanford University, Consulting Professor

John Chambers was a member of Bell Labs research from 1966 until his retirement in 2005. In 1997, he became the first statistician to be named a Bell Labs Fellow, cited for "pioneering contributions to the field of statistical computing". His research has touched on nearly all aspects of computing with data but he is best known for the design (and continuing re-design) of the S language and its successor, R. Since 2008, he has been Consulting Professor, Department of Statistics, Stanford University.

In May of 1999, the Association for Computing Machinery presented him its Software System Award for the design of the S system. The ACM citation stated that "S has forever altered the way people analyze, visualize, and manipulate data". This is the only time the award has been made for statistical software (previous citations included Unix, TeX, and the World-Wide Web). The money from the award was donated to the American Statistical Association to establish an annual student prize for software.

The University of Waterloo awarded John Chambers an honorary Doctor of Mathematics degree in 2004. The citation for the award began: "John M. Chambers is a name synonymous, worldwide, with statistical computing; his leadership and impact are unequalled."

Since retiring from Bell Labs, Dr. Chambers has visited and taught at several universities, including University of Auckland, UCLA, and Stanford.

He is the author or co-author of eight books; his most recent book is "Software for Data Analysis: Programming with R" (Springer, 2008). His first book, "Computational Methods for Data Analysis" (Wiley, 1977), was the first general treatment of statistical computing. He was co-author of "Graphical Methods for Data Analysis", the first exposition of data visualization. Other books introduced versions of the S language and related topics, including "Statistical Models in S".

Dr. Chambers's professional activities have included being president of the International Association for Statistical Computing and various offices in the ISI, ASA and AAAS. He is a fellow of the ASA, the IMS, and the AAAS, and an elected member of the International Statistical Institute. He is a member of the board of the R Foundation.

At Bell Labs, he served as head of the advanced software department (1981-1983) and the statistics and data analysis research department (1983-1989), before returning to full-time research in 1989. He continues active research on computing with data and has many plans for its future.

Dr. Chambers obtained his Ph.D. in statistics in 1966 from Harvard University, after receiving a B.Sc. degree in 1963 from the University of Toronto.

Karim Chine
Founder at Cloud Era Limited

Karim Chine is a London-based software architect and entrepreneur with a background in theoretical physics. After graduating from the French Ecole Polytechnique and Telecom ParisTech, he held positions within academic research laboratories and industrial R&D departments including Imperial College London, EBI, IBM and Schlumberger. Karim's interests include large-scale distributed software design, applications of cloud computing in research and education, open-source software ecosystems and open science. Since 2009, he has been collaborating with the European Commission as an independent expert for the research e-infrastructure program and for the future and emerging technologies program. He has been an evaluator and a reviewer of many of the EU’s flagship projects related to grids, desktop grids, scientific clouds and science gateways. Karim is the author and designer of ElasticR, a Virtual Research Environments factory enabling cloud federation, advanced data analysis, rapid prototyping of data science applications and services, reproducible research and real-time collaboration.

Deep Dhillon
CTO of Socrata

Deep Dhillon serves as the CTO at Socrata, a leader in cloud solutions for open data and data-driven governments. Deep is an accomplished technologist with extensive experience conceptualizing, architecting, and deploying multiple machine learning, distributed computing, and natural language processing (NLP) based systems. Before joining Socrata, Deep served as a technology executive for companies such as Alliance Health Network, Evri, Insightful Corporation (now Tibco), and Cantametrix (now Sony). Deep holds several patents in the areas of NLP, search and music analysis.

Timotej Gavrilovic

Timotej Gavrilovic is a supervisor on the Demand Side Analytics (DSA) team at PG&E. His team focuses on providing analytical and strategic support to decision making for various demand-side programs at PG&E. In addition to working with Energy Efficiency, Demand Response, Pricing Products, Distributed Generation and Electric Vehicles, the team also focuses on interactive effects these programs have on Transmission and Distribution, as well as Renewables Integration. Timotej has been with PG&E for four years, focusing on demand-side analytics.



Chris Holcombe

Chris Holcombe is an engineer at Canonical leading the cloud storage team. He is helping to scale Canonical's storage offering. Before Canonical, Chris was at Nebula working towards building an automated Ceph deployment system. Before that he was a storage engineer working at Facebook on distributed POSIX storage systems. Chris holds a BSBA in computer information systems from Rider University.


Colin Kerrigan

Colin Kerrigan is a Senior Analyst on the Demand Side Analytics (DSA) team at PG&E. The DSA team focuses on providing analytical and strategic support to decision making for various demand-side programs at PG&E. Colin has worked extensively on forecasting distributed generation and understanding its impact on PG&E’s service. In addition, he has led efforts to increase the utilization of SmartMeter data in the management of PG&E’s customer-side programs, such as demand response and energy efficiency. Recently, this work has focused on leveraging data to integrate these programs with PG&E’s other lines of business.

Hannes Muehleisen
Researcher / CWI

Hannes Muehleisen is a researcher at the CWI Database Architectures group in Amsterdam. His main research interest is the interplay between statistical analysis and data management.

Oliver Ratzesberger
Senior Vice President, Software, Teradata Labs

Oliver Ratzesberger is Senior Vice President, Software at Teradata, where he leads software teams for Teradata Labs, including the Teradata Database, Teradata Aster, Client tools and Hadoop integration. Oliver joined Teradata in July 2013. Previously, he worked for Sears Holdings, where he drove a large analytics effort to consolidate disparate systems into a newly redesigned Unified Data Architecture. Before that, Oliver spent seven years at eBay, where he was responsible for its data warehouse and big data platforms. During his tenure at eBay, he led eBay's expansion of analytics and led the Hadoop platform engineering teams, driving the initial integration for Teradata and Hadoop.

Nachum Shacham
Principal Data Scientist at PayPal

Nachum Shacham is a Principal Data Scientist at PayPal, where he is working on modeling and extracting business value from large transactional, behavioral, and system performance datasets. Before that, he was with eBay, analyzing the performance of large data platforms. Prior to that, he was with SRI International, leading research in internet technologies, wireless internet, and real-time voice and video communications over mobile networks. As co-founder and CTO of Metreo, he developed models for B2B pricing, and subsequently created revenue models for online display and search advertising. Nachum holds a BScEE and an MScEE from the Technion, Israel Institute of Technology, and a PhD in EECS from UC Berkeley. Dr. Shacham is a Fellow of the IEEE.

CEO, Planet OS

Manasi Vartak
MIT Database Group

Manasi is a PhD candidate in the MIT Database Group. Her research interests lie at the intersection of data management and machine learning, with a particular interest in health data management. Her current research focuses on automating data analysis tasks, data visualization, and building predictive models linking nutrition and GI disease. Manasi holds a Bachelor's degree in Computer Science and Mathematics from Worcester Polytechnic Institute, and a Master's degree in Computer Science from the Massachusetts Institute of Technology.

William Vambenepe
Lead Product Manager for Big Data on Google Cloud Platform

William Vambenepe leads the product management team responsible for Big Data services on Google Cloud Platform. Those services include Cloud-native fully-managed services (e.g. BigQuery for SQL analytics, Cloud Dataflow for stream and batch pipelines) as well as open source offerings for Hadoop and Spark. William was previously an Architect at Oracle and before that a Distinguished Technologist at HP. He holds an engineer degree from Ecole Centrale Paris, a Diploma in Computer Science from Cambridge University and a Master of Science in Engineering Management from Stanford University. He is on Twitter as @vambenepe.

Theo Vassilakis
Founder & CEO of Metanautix

Prior to founding Metanautix, Theo spent nearly eight years at Google, most recently as a Principal Engineer and Engineering Director of a 75-engineer team in data warehousing, visualization, and analysis. He led the development of Dremel, a large-scale, interactive ad hoc query engine for big data processing that powers Google's BigQuery, as well as Tenzing, a SQL implementation on MapReduce.
Theo also worked on developing large-scale machine learning systems for personalized search ads, audience analysis systems for display ads, and early prototypes of page preview and results-as-you-type for search. Before Google, Theo was a software engineer at Microsoft and Microsoft Research, building data cleaning features for SQL Server and speech recognition models for Windows and Office. He holds a PhD from Brown University and a BS from Stanford University, both in Mathematics.

Huy T. Vo
Research Scientist, NYU-CUSP

Dr. Vo is a Research Scientist at the Center for Urban Science and Progress (CUSP), New York University. His research focuses on large-scale data analysis and visualization, big data systems, and scalable displays. He has also been a Research Assistant Professor of Computer Science and Engineering at NYU’s Polytechnic School of Engineering since 2011. He is one of the co-creators of VisTrails, an open-source scientific workflow and provenance management system, where he led the design of the VisTrails Provenance SDK. He received his B.S. in Computer Science (2005) and PhD in Computing (2011) from the University of Utah and was a two-time recipient of the NVIDIA Fellowship award (2009-2010 and 2010-2011).

Stephen Wolfram
CEO, Wolfram Research

Stephen Wolfram is a distinguished scientist, technologist and entrepreneur. He has devoted his career to the development and application of computational thinking.

His Mathematica software system, launched in 1988, has been central to technical research and education for more than a generation. His work on basic science, summarized in his bestselling book A New Kind of Science, has defined a major new intellectual direction, with applications across the sciences, technology and the arts. In 2009 Wolfram built on his earlier work to launch Wolfram|Alpha to make as much of the world's knowledge as possible computable, and accessible on the web and in intelligent assistants like Apple's Siri.

In 2014, as a culmination of more than 30 years of work, Wolfram began to roll out the Wolfram Language, which dramatically raises the level of automation and built-in knowledge available in a programming language, and makes possible a new generation of readily deployed computational applications.

Stephen Wolfram has been the CEO of Wolfram Research since its founding in 1987. He was educated at Eton, Oxford and Caltech, receiving his PhD in theoretical physics at the age of 20.

