| Tuesday, October 18, 2011 |
| 8:00 AM |
Continental Breakfast |
|
|
|
| 9:00 AM |
Welcome |
Amber Boehnlein (SLAC, head of scientific computing) |
 |
|
| 9:15 AM |
Conference Introduction, Logistics and Announcements
Main objectives, logistics, agenda, announcements |
Jacek Becla (SLAC, XLDB2011 chair) |
 |
 |
| Reference Cases from Industry |
| 9:30 AM |
Industrial Perspective on Tools for Big Data - Introduction
Introduction to the morning session |
Kian-Tat Lim (SLAC) |
 |
 |
| 9:40 AM |
Real-time Analytics at Facebook
Data management & analytics at Facebook, main focus: HBase |
Zheng Shao (Facebook) |
 |
 |
| 10:05 AM |
Data Infrastructure at LinkedIn |
Shirshanka Das (LinkedIn) |
 |
 |
| 10:30 AM |
Coffee Break |
|
|
|
| 10:55 AM |
Extreme Analytics at eBay
Data analytics at eBay, main focus: Teradata/Singularity |
Thomas Fastner (eBay) |
 |
 |
| 11:20 AM |
YouTube Data Warehouse
Latest technologies behind
YouTube, including Dremel and Tenzing |
Biswapesh Chattopadhyay (Google) |
 |
 |
| 12:00 PM |
Industrial Perspective on Tools for Big Data - Discussion Panel
Discussion panel (Facebook,
LinkedIn, eBay, Google) |
Moderator: Kian-Tat Lim (SLAC) |
|
 |
| 12:30 PM |
Lunch |
|
|
|
| Statistics at Scale |
| 1:30 PM |
Big Data System Metrics - Managing Systems of Extreme Scale
Optimizing extreme scale systems: efficiency metrics,
workload management, SLAs vs SLGs, queuing systems, job shops
and more |
Nachum Shacham (eBay) |
 |
 |
| 2:00 PM |
Extremely Large Data Challenges - What R Can and Can't Do
R strengths and weaknesses in peta-scale context, roadmap |
Susan Holmes (Stanford Statistics Dept) |
 |
 |
| Lightning Talks |
| 2:30 PM |
Lightning Talks (9 x 5 min)
| 1. |
Techniques for Discovering Relationships in Massive-Scale Data, Peter J Haas / IBM |
 |
| 2. |
Lightning Queries,
Miguel Branco / EPFL |
 |
| 3. |
The 1000 Genomes Project, User Accessibility, Laura Clarke / EBI |
 |
| 4. |
Scalable Analytics on SciDB, a scientific data
management & analytics platform, Paul Brown / SciDB |
 |
| 5. |
bigBed/bigWig file format and usage case
for efficient remote access to large data sets, Hiram Clawson / UCSC |
 |
| 6. |
InfiniteGraph - A Scalable, Distributed
Graph Database, Leon Guzenda / Objectivity |
 |
| 7. |
MCDB--The Monte Carlo Database System, Chris Jermaine / Rice Univ
|
 |
| 8. |
[moved to Wed] |
|
| 9. |
Hadoop jobs require one-disk-per-core,
Myth or Fact?, Min Xu / Seamicro |
 |
|
|
| 3:20 PM |
Poster Session + Ice Cream |
|
|
|
| Visualization |
| 4:00 PM |
Visualizing Large, Complex Data
Challenges related to visualizing large data sets |
Kwan-Liu Ma (UC Davis) |
 |
|
| 4:20 PM |
Integrated Analysis and Visualization for Data Intensive Science:
Challenges and Opportunities
Challenges related to visualizing large data sets |
Attila Gyulassy (UC Davis) |
 |
 |
| 4:40 PM |
Database Requirements for Visualizing Large Multiscale Simulation Data
Challenges related to visualizing large data sets |
Ralf Kaehler (SLAC/Kavli) |
 |
|
| 5:00 PM |
Adjourn |
|
|
|
| 5:15 PM |
SciDB Community Meeting
The SciDB team will hold a 45-minute Community Meeting
immediately after the end of the official XLDB program.
Registration is not required. Everybody is welcome to attend, including
those not attending the XLDB conference. Location: ROB
building |
|
 |
|
| 6:30 PM |
Reception and dinner
|
|
|
|
| Wednesday, October 19, 2011 |
| 8:00 AM |
Continental Breakfast |
|
|
|
| 9:00 AM |
Announcements |
|
|
|
| Reference Case from Science |
| 9:05 AM |
Sequence Read Archive: Validation, Archival, and Distribution of Raw Sequencing Data
Growing pains, unstable formats, privacy issues, and solutions
|
Eugene Yaschenko (NCBI/NIH) |
 |
 |
| 9:35 AM |
Managing the Data Bonanza: Generating, Analysing and Sharing Data for Megasequencing Projects |
Narayan Desai (ANL) |
 |
 |
| 10:05 AM |
Functional Annotation of the Protein Sequence Universe |
Eugene Kolker (Seattle Children's Hospital) |
 |
 |
| 10:30 AM |
Coffee Break |
|
|
|
| Growing to Large Scale Stories |
| 11:00 AM |
Drug Discovery in the Era of Big Data |
Gregory McAllister (Novartis) |
 |
 |
| 11:30 AM |
Growing to Large Scale at Netflix |
Eric Colson (Netflix) |
|
|
| Short Surveys |
| 12:00 PM |
Value of Train Scheduling
Is it needed and why? |
Daniel Wang (SLAC/LSST) |
 |
 |
| 12:15 PM |
Shared-nothing vs Shared-disk?
Can we get away with shared disks at peta-scale? |
Michael Stonebraker (MIT) |
 |
 |
| 12:30 PM |
Lunch |
|
|
|
| Data Intensive Simulation |
| 1:30 PM |
In-situ Scientific Data Processing for Extreme Scale Computing
Challenges related to data intensive simulation |
Scott Klasky (ORNL) |
 |
 |
| Platinum Sponsor Talk |
| 2:15 PM |
Building Blocks for Large Analytic Systems
Architectural Building Blocks
for Extremely Large Analytic Systems |
Andrew Lamb (Vertica Systems) |
 |
 |
| Lightning Talks |
| 2:30 PM |
Lightning Talks (9 x 5 min)
|
|
|
| 3:20 PM |
Poster Session + Extended Coffee Break |
|
|
|
|
Cloud Computing at Scale |
| 4:00 PM |
Scaling Up Quickly on the Cloud
|
Edmond Lau (Quora) |
 |
 |
| 4:10 PM |
One Billion Rows a Second: Fast, Scalable OLAP in the Cloud
|
Michael Driscoll (Metamarkets) |
 |
 |
| 4:20 PM |
Cloud Computing at Scale
|
Roger Barga (Microsoft) |
 |
 |
| 4:30 PM |
Questions and Discussion |
Moderator: Jacek Becla |
|
|
| Closeout |
| 4:45 PM |
Closeout
Next conference planning, final conclusions and closeout |
Jacek Becla |
 |
|
| 5:00 PM |
Adjourn |
|
|
|