XLDB - Extremely Large Databases

XLDB-2012 Conference Program

Tuesday, September 11, 2012
8:00 AM Continental Breakfast (registration starts 7:30 AM, Knight Management Center)
9:00 AM 10 Welcome James Williams
pdf video
9:10 AM 15 Conference Introduction, Logistics
Update on XLDB activities, XLDB-2012 logistics, agenda
Jacek Becla
(SLAC, XLDB2012 chair)
pdf video
Biology Bytes Back Moderator: Carol J. McCall
9:25 AM 10 Session Introductions Carol J. McCall
(GNS Healthcare)
9:35 AM 35 Streaming and Compression Approaches for Terascale Biological Sequence Data Analysis C. Titus Brown
(Michigan State Univ)
pdf video
10:10 AM 35 Extracting Medical Knowledge from High Throughput Genomics
Ted Goldstein
(U.C. Santa Cruz)
pdf video
10:45 AM 25 Coffee Break
11:10 AM 35 Bionimbus: Lessons from a Petabyte-Scale Science Cloud Service Provider Robert Grossman
(Univ of Chicago)
pdf video
11:45 AM 45 Discussion panel
12:30 PM 60 Lunch
Community & Hybrid Cloud Computing Moderator: Kian-Tat Lim
1:30 PM 45 Big Data; It's Rocket Science Joshua McKenty
(Piston Cloud)
pdf video
2:15 PM 30 Crunching Big Data with Google BigQuery Ryan Boyd
pdf video
Lightning Talks Moderator: Jacek Becla
2:45 PM 45 Lightning Talks (9 x 5 min)
1. Big-Data Analytics Usability, Magdalena Balazinska / University of Washington pdf video
2. Multi-temperature Data Management for XLDB Deployment, Stephen Brobst / Teradata Corporation pdf video
3. Astronomical Data Processing Using SciQL, an SQL Based Query Language for Array Data, Ying Zhang / CWI, Bart Scheers / CWI and UvA, Martin Kersten / CWI pdf video
4. EarthServer: Half a Petabyte Flocking Around the Rasdaman Array Analytics Engine, Peter Baumann / Jacobs University pdf video
5. Cataloging Scientific Data with araXne, Timothy M. Shead, David H. Rogers and Patricia J. Crossno / SNL pdf video
6. Agrios: A Hybrid Approach to Scalable Data Analysis Systems, Patrick Leyshock / Portland State University pdf video
7. Introducing the Big Data Benchmark Community, Chaitan Baru / SDSC, Milind Bhandarkar / Greenplum, Raghunath Nambiar / Cisco, Meikel Poess / Oracle, and Tilmann Rabl / University of Toronto pdf video
8. NCBI Genotype Archive, Douglas J. Slotta / NCBI pdf video
9. How We Add Over One Billion Data Points Per Day To The Half a Trillion We Store in OpenTSDB, Benoit Sigoure / StumbleUpon pdf video
3:30 PM 40 Poster Session + Ice Creams
Reference Cases from Industry Moderator: Jacek Becla
4:10 PM 40 10 Billion Here, 10 Billion there, Pretty Soon You Have to Manage the Data Tom Brown
4:50 PM 30 Petascale Naturalistic Driving Study Clark Gaylord
pdf video
Logistics and Dinner
5:20 PM 10 Dinner Discussions Planning Jacek Becla
5:30 PM Adjourn
6:00 PM Reception and non-seated dinner, ad-hoc group discussions

Wednesday, September 12, 2012
8:00 AM Continental Breakfast (registration starts 7:30 AM, Knight Management Center)
9:00 AM 5 Announcements XLDB Organizers  
Scientific Data Management and Analytics Moderator: Jacek Becla
9:05 AM 45 Finding the "Higgs" in the Haystack Stephen Gowdy
pdf video
9:50 AM 30 A self-organizing Repository for Fusion Science Tim Frazier
pdf video
What can be done to make Hive a better database system? Moderator: David DeWitt
10:20 AM 20 Principles and Patterns in the Extreme Data & Analytics Ecosystem Michael McIntire
-- video
10:40 AM 25 Coffee Break
11:05 AM 20 A Survey of Petabyte Scale Databases and Storage Systems Deployed at Facebook Dhruba Borthakur
pdf video
11:25 AM 20 Analytics at Zynga Daniel McCaffrey
pdf video
11:45 AM 20 Big Data and the Era of One Size (doesn't) Fit All Omer Trajman
pdf video
12:05 PM 25 Discussion panel video
12:30 PM Lunch
XLDB-related Announcements
1:30 PM 30 Announcement: ArrayQL Working Group Kian-Tat Lim pdf video
1:30 PM 30 Announcement: use cases, referrals, partnership Jacek Becla pdf video
Hardware and Infrastructure Trends (CPUs, memory, storage, networking) Moderator: Daniel Wang
2:00 PM 30 Heterogeneous Computing Architectures
Michael Houston
pdf video
2:30 PM 30 Memory's Role in Improving the Efficiency of Extremely Large Databases Kenny Han
(Samsung Semiconductor Inc.)
pdf video
Lightning Talks Moderator: Jacek Becla
3:00 PM 45 Lightning Talks (9 x 5 min)
1. Developing an Analytic Cloud, Justin Simonds and TC Janes / HP Master Technologists pdf video
2. Maverick: A Fast, Energy Efficient Next-Generation NVM-based SSD Architecture for Bioinformatics Applications, Arup De / UCSD and LLNL, Maya Gokhale / LLNL, Steve Swanson / UCSD, and Rajesh Gupta / UCSD pdf video
3. G-CODE Enables Actionable Clinical Knowledge from BIG DATA, Subha Madhavan, Michael Harris, Krithika Bhuvaneshwar, Andrew Shinohara, Kevin Rosso, and Yuriy Gusev / Georgetown University pdf video
4. Hive Vs Pig: Similarities and Differences, Ashutosh Chauhan / Apache Software Foundation pdf video
5. Handling the Genome Data Tsunami: Storage and Transfer of Cancer Genomes at Petabyte Scale, Dan Maltbie / Annai Systems and Chris Wilks / UCSC pdf video
6. MongoDB and Big Data at CERN, Paul Pedersen / 10gen pdf video
7. Parallel Data Storage, Analysis, and Visualization of a Trillion Particles, Surendra Byna / Lawrence Berkeley National Laboratory pdf video
8. Real-time Analytics Streaming System at Zynga, Michael Fan and Rushan Chen / Zynga pdf video
9. Jubatus: Distributed Online Machine Learning Framework for Realtime Analysis of Big Data, Hiroyuki Makino / NTT Software Innovation Center pdf video
3:45 PM 40 Poster Session + Extended Coffee Break
Platinum Sponsor
4:25 PM 10 Tradeoffs between Massively Parallel Analytical Systems Andrew Lamb
pdf video
Conclusions and Closeout
4:35 PM 40 Big Data is (at least) Three Different Problems Michael Stonebraker
pdf video
5:15 PM 15 Closing Remarks
Next conference planning, final conclusions and closeout
Jacek Becla pdf video
5:30 PM Adjourn