Venue
Monday, 5th Jun
Room 2002
Room 2004
Room 2006
Room 2008
Room 2010
Tuesday, 6th Jun
Room 2002
- David Ojika (Doctoral Student University of Florida)
- Session of 30 minutes
Doctoral Student University of Florida
David Ojika is an Intel-fellowship recipient and a 4th-year doctoral student of computer engineering at the University of Florida. He completed several internships at Intel, working on near-memory accelerators and on heterogeneous platforms (Xeon+FPGA). Working with Dr. Darin Acosta and Dr. Ann Gordon-Ross, his research focuses on the intersection of computing and physics by investigating machine learning systems that enhance the study of high-energy particles (such as muons) at CERN. In the summer of 2017, David will join Microsoft’s AI & Research group to embark on an internship with the group’s Project Catapult.
Room 2002
Tuesday, 6th Jun, 05:00 - 05:30
- Research
- Jonathan Bloom (Co-Founder, Hail Team Broad Institute of MIT and Harvard), Timothy Poterba (Engineer and Computational Biologist Broad Institute of MIT and Harvard)
- Session of 30 minutes
Co-Founder, Hail Team Broad Institute of MIT and Harvard
Jonathan Bloom is a mathematician, engineer, and co-founder of the Hail team at the Broad Institute of MIT and Harvard. Prior to joining the Broad, he did research in geometry and algebraic topology as a Moore Instructor and NSF Fellow in Mathematics at the Massachusetts Institute of Technology. While there, he re-architected the department’s introductory course on probability and statistics, now available on MIT OpenCourseWare. He received his B.A. from Harvard University and Ph.D. from Columbia University in Mathematics.
Engineer and Computational Biologist Broad Institute of MIT and Harvard
Tim Poterba is an engineer and computational biologist on the Hail team at the Broad Institute of MIT and Harvard. Prior to joining the Broad, he studied protein folding dynamics at the Max Planck Institute for Biochemistry on a Fulbright Scholarship. He received his B.A. in Biophysics from Amherst College in 2013.
Room 2002
Tuesday, 6th Jun, 11:00 - 11:30
- Research
- Matteo Interlandi (Scientist Microsoft CISL)
- Session of 30 minutes
Scientist Microsoft CISL
Matteo Interlandi recently joined Microsoft CISL as a Research Scientist. Prior to joining Microsoft, Matteo was a Postdoctoral Scholar at the University of California, Los Angeles. His research lies at the intersection of databases, distributed systems, and declarative languages. In particular, he loves to build systems and tools that make it easier to design and implement data-driven distributed applications.
Room 2002
Tuesday, 6th Jun, 11:40 - 12:10
- Research
- Patrick Stuedi (Research Staff Member IBM)
- Session of 30 minutes
Research Staff Member IBM
I’m a member of the research staff at IBM Research Zurich. My research interests are in distributed systems, networking, and operating systems. I graduated with a PhD from ETH Zurich in 2008 and spent two years (2008-2010) as a Postdoc at Microsoft Research Silicon Valley. My current work is about exploiting fast network and storage hardware in data processing systems.
Room 2002
Tuesday, 6th Jun, 12:20 - 12:50
- Research
Room 2003
- Nan Zhu (Software Engineer Microsoft)
- Session of 30 minutes
Software Engineer Microsoft
Nan Zhu is a Software Engineer at Microsoft, where he works on serving Spark Streaming/Structured Streaming on Azure HDInsight. He is a contributor to Apache Spark (known as CodingCat) and also serves as a committee member of the Distributed Machine Learning Community (DMLC) and Apache MXNet (incubating).
Room 2003
Tuesday, 6th Jun, 12:20 - 12:50
- Spark Ecosystem
- Ali Zaidi (Data Scientist Microsoft)
- Session of 30 minutes
Data Scientist Microsoft
Ali is a data scientist in the Algorithms and Data Science team at Microsoft. He spends his day trying to make distributed computing in the cloud easier, more efficient, and more enjoyable for data scientists and developers alike. He focuses on R, Spark, and Bayesian learning.
Room 2003
Tuesday, 6th Jun, 14:40 - 15:10
- Spark Ecosystem
- Ryan Williams (Software Developer Mount Sinai School of Medicine)
- Session of 30 minutes
Software Developer Mount Sinai School of Medicine
Ryan writes tools for analyzing genomic data using Spark at Hammer Lab.
Room 2003
Tuesday, 6th Jun, 16:20 - 16:50
- Spark Ecosystem
- Shay Nativ (Software Developer Redis Labs)
- Session of 30 minutes
Software Developer Redis Labs
Shay is an experienced software developer, architect, and entrepreneur. He was the founder and VP R&D of Peak-Dynamics, an energy-saving solution for water utilities, and CTO at Utab, a web platform for musicians. Shay loves solving complex problems and writing performant code.
Room 2003
Tuesday, 6th Jun, 17:40 - 18:10
- Spark Ecosystem
Room 2006
- Yin Huai (Software Engineer Databricks)
- Session of 30 minutes
Software Engineer Databricks
Yin Huai is a Software Engineer at Databricks and mainly works on Spark SQL. Before joining Databricks, he was a PhD student at The Ohio State University and was advised by Xiaodong Zhang. His interests include storage systems, database systems, and query optimization. He is also an Apache Hive committer.
Room 2006
Tuesday, 6th Jun, 11:00 - 11:30
- Developer
Software Engineer Facebook
Tejas is a software engineer at Facebook. For the past 3 years, he has been part of the Data Infrastructure group at Facebook, where he primarily works on building large-scale distributed data processing systems responsible for handling batch workloads. He is currently a PMC member and committer of Apache Nutch and has contributed to several open source projects. Tejas obtained a Master’s Degree in Computer Science from the University of California, Irvine.
Room 2006
Tuesday, 6th Jun, 12:20 - 12:50
- Developer
- Joseph Bradley (Software Engineer Databricks)
- Session of 30 minutes
Software Engineer Databricks
Joseph Bradley is a Spark Committer working on MLlib at Databricks. Previously, he was a postdoc at UC Berkeley after receiving his Ph.D. in Machine Learning from Carnegie Mellon University in 2013. His research included probabilistic graphical models, parallel sparse regression, and aggregation mechanisms for peer grading in MOOCs.
Room 2006
Tuesday, 6th Jun, 14:00 - 14:30
- Developer
- Ted Malaska (Technical Group Architect Blizzard Inc.)
- Session of 30 minutes
Technical Group Architect Blizzard Inc.
Ted works on the Battle.net team at Blizzard, helping support great titles like World of Warcraft, Overwatch, Hearthstone, and many more. Previously, he was a Principal Solutions Architect at Cloudera, helping clients be successful with Hadoop and the Hadoop ecosystem, and before that a Lead Architect at the Financial Industry Regulatory Authority (FINRA). He has also contributed code to Apache Flume, Apache Avro, Apache YARN, Apache HDFS, Apache Spark, Apache Sqoop, and many more. Ted is also a co-author of O’Reilly’s “Hadoop Application Architectures”, a frequent speaker at conferences, and a frequent blogger on data architectures.
Room 2006
Tuesday, 6th Jun, 15:20 - 15:50
- Developer
Room 2016
- Prabhu Kasinathan (Chief Data Engineer PayPal)
- Session of 30 minutes
Chief Data Engineer PayPal
Prabhu Kasinathan is the chief data engineer in Big Data Platform at PayPal with 5+ years of big data experience. He is creating APIs, tools, and services for the Spark platform to support multi-tenancy and large-scale computation-intensive applications. He is an expert in building data warehousing solutions on Hadoop and Teradata platforms with 11+ years of data experience.
Room 2016
Tuesday, 6th Jun, 11:00 - 11:30
- Enterprise
- Derek Bennett (Platform Infrastructure Team Lead Stitch Fix)
- Session of 30 minutes
Platform Infrastructure Team Lead Stitch Fix
Derek Bennett is the lead for the Platform Infrastructure team in the Algorithms group at Stitch Fix. He and his team develop and support our Spark capabilities and event logging infrastructure using Amazon Kinesis and Apache Kafka, along with associated tools and applications that help make data available and usable. Derek holds a Ph.D. in Operations Research from UC Berkeley.
Room 2016
Tuesday, 6th Jun, 14:40 - 15:10
- Enterprise
- Songtao Guo (Principal Data Scientist LinkedIn), Wei Di (Business Analytic Data mining team LinkedIn)
- Session of 30 minutes
Principal Data Scientist LinkedIn
Songtao Guo is a Principal Data Scientist and tech lead of the Data Mining team at LinkedIn, where he leads many data-driven products and analytics systems. His work involves building a large-scale knowledge base, inventing data mining platforms to scale business analytics, and partnering with product, sales, and marketing to deliver impactful solutions. Before joining LinkedIn, Songtao was a senior researcher at AT&T Interactive, focusing on improving data quality and search relevancy for local business search. He holds a PhD in computer science from the University of North Carolina at Charlotte.
Business Analytic Data mining team LinkedIn
Wei Di is currently a staff member on the Business Analytic Data Mining team. She is passionate about creating smart and scalable solutions that can impact millions of individuals and empower successful businesses. She has wide interests covering artificial intelligence, machine learning, and computer vision. She was previously associated with eBay Human Language Technology and eBay Research Labs, with a focus on large-scale image understanding and joint learning from visual and text information. Prior to that, she was with Ancestry.com, working in the areas of record linkage and search relevance. She received her PhD from Purdue University in 2011.
Room 2016
Tuesday, 6th Jun, 15:20 - 15:50
- Enterprise
Room 2020
- Felix Cheung (PMC/Committer Microsoft)
- Session of 30 minutes
PMC/Committer Microsoft
Felix Cheung is a Committer of Apache Spark and a PMC/Committer of Apache Zeppelin. He has been active in the big data space for 3+ years, is a co-organizer of the Seattle Spark Meetup, has presented several times, and was a teaching assistant for the very popular edX MOOCs Introduction to Big Data with Apache Spark and Scalable Machine Learning in the summer of 2015.
Room 2020
Tuesday, 6th Jun, 11:00 - 11:30
- Streaming
- Jim Dowling (Associate Professor KTH Royal Institute of Technology)
- Session of 30 minutes
Associate Professor KTH Royal Institute of Technology
Jim Dowling is an Associate Professor at the School of Information and Communications Technology in the Department of Software and Computer Systems at KTH Royal Institute of Technology as well as a Senior Researcher at SICS – Swedish ICT. He received his Ph.D. in Distributed Systems from Trinity College Dublin (2005) and worked at MySQL AB (2005-2007). He is a distributed systems researcher and his research interests are in the area of large-scale distributed computer systems. He is lead architect of Hadoop Open Platform-as-a-Service (www.hops.io), a next generation distribution of Hadoop for Humans.
Room 2020
Tuesday, 6th Jun, 11:40 - 12:10
- Streaming
- J White Bear (IBM)
- Session of 30 minutes
IBM
University of Michigan: Computer Science, Databases, Machine Learning/Computational Biology, Cryptography. University of California San Francisco: Computational Biology/Bioinformatics, Machine Learning/Multi-Objective Optimization/Statistical Mechanics for Protein-Protein Interactions. McGill University: Machine Learning/Multi-Objective Optimization for Path Planning/Cryptography.
Room 2020
Tuesday, 6th Jun, 15:20 - 15:50
- Streaming
Room 2022
- Xiangrui Meng (Software Engineer Databricks)
- Session of 30 minutes
Software Engineer Databricks
Xiangrui Meng is an Apache Spark PMC member and a software engineer at Databricks. His main interests center around developing and implementing scalable algorithms for scientific applications. He has been actively involved in the development and maintenance of Spark MLlib since he joined Databricks. Before Databricks, he worked as an applied research engineer at LinkedIn, where he was the main developer of an offline machine learning framework in Hadoop MapReduce. His Ph.D. work at Stanford is on randomized algorithms for large-scale linear regression problems.
Room 2022
Tuesday, 6th Jun, 11:00 - 11:30
- Machine Learning
- Min Shen (Engineer LinkedIn)
- Session of 30 minutes
Engineer LinkedIn
Min Shen is an engineer on LinkedIn’s Hadoop infrastructure development team, where he builds services and tools to tackle the scaling challenges of operating large-scale multi-tenant Hadoop deployments. Recently, he has been helping create tools that support operating Spark at scale and make it easy to develop and run Spark jobs at LinkedIn.
Room 2022
Tuesday, 6th Jun, 12:20 - 12:50
- Machine Learning
Founder 8 Path Solutions
Room 2022
Tuesday, 6th Jun, 14:40 - 15:10
- Machine Learning
Wednesday, 7th Jun
Room 2002
Oracle
Michael Malak is the lead author of Spark GraphX In Action and has been developing Spark solutions at two Fortune 200 companies since early 2013. He has been programming computers since before they could be bought pre-assembled in stores.
Room 2002
Wednesday, 7th Jun, 12:20 - 12:50
- Research
- Ali Zaidi (Data Scientist Microsoft)
- Session of 30 minutes
Data Scientist Microsoft
Ali is a data scientist in the Algorithms and Data Science team at Microsoft. He spends his day trying to make distributed computing in the cloud easier, more efficient, and more enjoyable for data scientists and developers alike. He focuses on R, Spark, and Bayesian learning.
Room 2002
Wednesday, 7th Jun, 14:00 - 14:30
- Research
- Hossein Falaki (Software Engineer Databricks)
- Session of 30 minutes
Software Engineer Databricks
Hossein Falaki is a software engineer and data scientist at Databricks, working on the next big thing. Prior to that he was a data scientist at Apple’s personal assistant, Siri. He graduated with a Ph.D. in Computer Science from UCLA, where he was a member of the Center for Embedded Networked Sensing (CENS).
Room 2002
Wednesday, 7th Jun, 16:20 - 16:50
- Research
- Ross Gardler (VP Apache Software Foundation)
- Session of 30 minutes
VP Apache Software Foundation
Ross Gardler has been involved with open source in one form or another since the mid-’90s. He is a member of the Apache Software Foundation, where he currently serves as the foundation’s President. He works at Microsoft on the Linux Compute team in Azure, where he is responsible for the Azure Container Service.
Room 2002
Wednesday, 7th Jun, 17:00 - 17:30
- Research
Room 2003
Pepperdata
Kimoon joined Pepperdata in 2013. Previously, he worked for the Google Search and Yahoo Search teams for many years. Kimoon has hands-on experience with large distributed systems processing massive data sets.
Room 2003
Wednesday, 7th Jun, 11:00 - 11:30
- Spark Ecosystem
- Alex Dadgar (Project Lead HashiCorp)
- Session of 30 minutes
Project Lead HashiCorp
Alex is the project lead for Nomad, a distributed, highly available cluster scheduler by HashiCorp. Prior to joining HashiCorp, Alex worked at Google, where he architected a stream-processing system to handle terabytes of YouTube data a day. Having seen the dream of infrastructure at Google, he joined HashiCorp to build it for the rest of the world!
Room 2003
Wednesday, 7th Jun, 11:40 - 12:10
- Spark Ecosystem
Room 2006
Mozilla
Sam Penrose loves how working with data at scale for Mozilla brings out the power and beauty of mathematics. Previously he helped Industrial Light and Magic bring the power and beauty of giant robots out to movie screens everywhere.
Room 2006
Wednesday, 7th Jun, 11:40 - 12:10
- Developer
Netflix
Ryan Blue works on open source projects, including Spark, Avro, and Parquet, at Netflix.
Room 2006
Wednesday, 7th Jun, 14:00 - 14:30
- Developer
Room 2016
- Jordan Volz (Systems Engineer Cloudera)
- Session of 30 minutes
Systems Engineer Cloudera
Jordan Volz is a Systems Engineer at Cloudera. He helps clients design and implement big data solutions using Cloudera’s Distribution of Hadoop across a variety of industry verticals. Previously, he worked as a consultant for HP Autonomy, delivering compliance archiving, e-Discovery, and electronic surveillance solutions to regulated financial services companies, and as a developer at Epic Systems, building HIPAA-compliant EMR software.
Room 2016
Wednesday, 7th Jun, 11:30 - 11:30
- Enterprise
- Gwen Shapira (Product Manager Confluent)
- Session of 30 minutes
Product Manager Confluent
Gwen is a product manager at Confluent. She has 15 years of experience working with code and customers to build scalable data architectures, integrating relational and big data technologies. Gwen is the author of “Kafka – The Definitive Guide” and “Hadoop Application Architectures”, and a frequent presenter at industry conferences. Gwen is a PMC member on the Apache Kafka project and committer on Apache Sqoop. When Gwen isn’t building data pipelines or thinking up new is-features, you can find her pedaling on her bike exploring the roads and trails of California, and beyond.
Room 2016
Wednesday, 7th Jun, 17:00 - 17:30
- Enterprise
Room 2022
- Leah McGuire (Technical Staff Salesforce.com)
- Session of 30 minutes
Technical Staff Salesforce.com
Leah McGuire is a Lead Member of Technical Staff at Salesforce, building platforms to enable the integration of machine learning into Salesforce products. Before joining Salesforce, Leah was a Senior Data Scientist on the data products team at LinkedIn, working on personalization, entity resolution, and relevance for a variety of LinkedIn data products. She completed a PhD and a Postdoctoral Fellowship in Computational Neuroscience at the University of California, San Francisco, and at the University of California, Berkeley, where she studied the neural encoding and integration of sensory signals.
Room 2022
Wednesday, 7th Jun, 11:00 - 11:30
- Machine Learning
CTO MemSQL
Nikita Shamgunov co-founded MemSQL and has served as CTO since inception. Prior to co-founding the company, Nikita worked on core infrastructure systems at Facebook. He served as a senior database engineer at Microsoft SQL Server for more than half a decade. Nikita holds a bachelor’s, master’s and doctorate in computer science, has been awarded several patents and was a world medalist in ACM programming contests.
Room 2022
Wednesday, 7th Jun, 14:40 - 15:10
- Machine Learning