PyCon 2017

Tracks

Wednesday, 17th May

Tutorials

09:00 - 12:20

Fantastic Data and Where To Find Them: An introduction to APIs, RSS, and Scraping

Nicole Donnelly, Tony Ojeda, Will Voorhees

Whether you’re building a custom web application, getting started in machine learning, or just want to try something new, everyone needs data. And while the web offers a seemingly boundless source for custom data sets, the collection of that data can present a whole host of obstacles. From ever-changing APIs to rate-limiting woes, from nightmarishly nested XML to convoluted DOM trees, working with APIs and web scraping are challenging but critically useful skills for application developers and data scientists alike. In this tutorial, we’ll introduce RESTful APIs, RSS feeds, and web scraping in order to see how different ingestion techniques impact application development. We’ll explore how and when to use Python libraries such as feedparser, requests, beautifulsoup, and urllib. And finally we will present common data collection problems and how to overcome them. We’ll take a hands-on, directed exercise approach combined with short presentations to engage a range of different APIs (with and without authentication), explore examples of how and why you might web scrape, and learn the ethical and legal considerations for both. To prepare attendees to create their own data ingestion scripts, the tutorial will walk through a set of examples for robust and responsible data collection and ingestion. This tutorial will conclude with a case study of Baleen, an automated RSS ingestion service designed to construct a production-grade text corpus for NLP research and machine learning applications. Exercises will be presented both as Jupyter Notebooks and Python scripts.

Nicole Donnelly

Nicole Donnelly is a data management IT specialist with the Office of the Chief Technology Officer, District of Columbia. She believes a city that consumes and understands its own data is acting in the true spirit of public service by improving the lives of its residents. She has a professional certificate in data science from Georgetown, where she continued as TA and instructor, and has completed the Data Science Immersive program at General Assembly. She is a faculty member with DC data science collaborative District Data Labs. She has Bachelor's degrees from Rutgers University in Computer Science and Art History.

Tony Ojeda

[Tony Ojeda](https://www.linkedin.com/in/tonyojeda) is a data scientist, author, and entrepreneur with expertise in streamlining business processes and over a decade of experience creating innovative data products. He is the Founder of District Data Labs and a Co-founder and former President of Data Community DC. Tony has an MS in Finance from Florida International University and an MBA in Strategy and Entrepreneurship from DePaul University. He co-authored the Practical Data Science Cookbook, published by Packt, and is also a co-author of the forthcoming O'Reilly book Applied Text Analytics with Python.

Will Voorhees

Will is a software developer who designs and builds enterprise-scale security products for distributed systems. He's worked on high performance C++ cryptography tools, distributed firewalls, and key management systems for hundreds of thousands of servers. His free time is occupied with YouTube, video games, and cooking. He lives in Seattle with his wife and dog.

Room 8

Wednesday, 17th May, 09:00 - 12:20

Tutorials

09:00 - 12:20

An introduction to secure web development with Django and Python

James Bennett

You can't afford to have security be an optional or 'nice-to-have' feature in your applications. Luckily, Django has your back: this workshop will introduce you to thinking about security, cover a broad range of security concerns from the mundane to the arcane, and walk you through, in detail, how Django and the broader Django and Python ecosystems can help protect you and your users from them.

James Bennett

Philosopher turned web geek. Django committer, chronic documenter, now on the engineering team at Clover Health, trying to make health insurance better.

Room 3

Wednesday, 17th May, 09:00 - 12:20

Tutorials

09:00 - 12:20

Using Functional Programming for efficient Data Processing and Analysis

Reuben Cummings

As a multi-paradigm language, Python has great support for functional programming. For better or for worse, leading data libraries such as Pandas eschew this style for object-oriented programming. This tutorial will explain how to take advantage of Python's excellent functional programming capabilities to efficiently obtain, clean, transform, and store data from disparate sources.

Reuben Cummings

Reuben Cummings is a data scientist and software developer skilled in business development, entrepreneurship, and programming. As Managing Director of Nerevu Development, he has worked with clients including the UN Humanitarian Data Exchange, Moringa School, and Africa’s Talking. Reuben specializes in data analysis, visualization, API development, and workflow automation. Reuben previously served in IT and business roles at social enterprise Global Cycle Solutions in Arusha, Tanzania; and as an analyst at MIDIOR Consulting in Cambridge, Massachusetts. He holds a degree in Chemical Engineering from the Massachusetts Institute of Technology, and is Lead Organizer of the Arusha Coders meetup in Tanzania.

Room 6

Wednesday, 17th May, 09:00 - 12:20

Tutorials

09:00 - 12:20

Network Analysis Made Simple

Eric J. Ma

Have you ever wondered about how those data scientists at Facebook and LinkedIn make friend recommendations? Or how epidemiologists track down patient zero in an outbreak? If so, then this tutorial is for you. In this tutorial, we will use a variety of datasets to help you understand the fundamentals of network thinking, with a particular focus on constructing, summarizing, and visualizing complex networks.

Eric J. Ma

I am a 5th year PhD Candidate in the Runstadler Lab in the Biological Engineering department at MIT. I study the influenza virus, which is like a self-replicating deck of 8 poker cards. I am using Python to solve infectious disease data science problems.

Room 9

Wednesday, 17th May, 09:00 - 12:20

Tutorials

09:00 - 12:20

Beginning Python Bootcamp

Matt Harrison

Are you new to Python? Or do you feel like you grok the syntax, but would like to understand new idioms and where to use them? Want to watch an experienced Python developer create code from nothing? Instead of just covering the syntax, we will introduce most of Python as we build code together. Bring your laptop and we will program a predictive text engine from scratch together. Follow along as we start with IDLE (or your favorite editor) and a blank file, and end with a tested idiomatic Python module. It will learn from any text we pass into it, and predict characters or words for us. Just like your phone!
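As a rough illustration of the kind of module the abstract describes (not the tutorial's actual code), a character-level predictor can be sketched with a counter of which character follows which. The class name `Predictor` and its methods are hypothetical.

```python
from collections import Counter, defaultdict


class Predictor:
    """Hypothetical sketch: predict the character most likely to follow another."""

    def __init__(self):
        # Maps each character to a Counter of the characters seen right after it.
        self.following = defaultdict(Counter)

    def train(self, text):
        # Walk over adjacent character pairs and count each transition.
        for current, nxt in zip(text, text[1:]):
            self.following[current][nxt] += 1

    def predict(self, char):
        # Return the most frequent follower, or None for an unseen character.
        counts = self.following.get(char)
        if not counts:
            return None
        return counts.most_common(1)[0][0]
```

After `train("the theme then")`, calling `predict("t")` returns `"h"`, because "h" follows "t" in every occurrence. The same idea extends to words by splitting the text before counting.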

Matt Harrison

Matt is a Python user, presenter, author, and user group organizer. He helps run the Utah Python user group. He authored Treading on Python Vol 1 & 2. His work experience covers search, business intelligence, & data science.

Room 1

Wednesday, 17th May, 09:00 - 12:20

Tutorials

09:00 - 12:20

Contract-First API Development Using The OpenAPI Specification (Swagger)

Dave Forgac, Ian Zelikman

Often developers will implement APIs and then only after they’re released think about things like specifications and documentation. Instead we can make the design of the API contract an explicit part of our development process using The OpenAPI Specification (Swagger) and open source tools. In this workshop we will: discuss the contract-first approach; build and validate a simple OpenAPI Specification; generate reference documentation and show how you can incorporate it with other docs; run a mock server so clients can test using the API; generate stub code based on the specification; implement a basic working API using Flask; show how you can iteratively add features and make changes; and discuss generating specifications for existing APIs. Participants will leave with: an understanding of how to incorporate a contract-first process into their API development workflow; an example specification that can be used as reference for their own API design; and working code for a basic API that can be used as a basis for their own development. Participants are expected to have a basic familiarity with HTTP / RESTful APIs, understanding of simple git operations, and some development experience.

Dave Forgac

Dave Forgac has been a FOSS enthusiast ever since installing Linux for the first time in the late 90's. He got a taste of Python in the early 00's and was hooked. He currently works as a Sr. Software Engineer at American Greetings in Cleveland, OH where he is responsible for API development, application deployment, and developer happiness engineering. He loves building communities and is an organizer of PyOhio, ClePy, and the Cleveland API Meetup.

Ian Zelikman

Ian has worked with web technologies for over 10 years. He started his career working in the J2EE world with a passion for developing applications for weather forecasting and analysis. After a short experience working on a white label platform with the Grails framework, Ian discovered the Python community.

Room 4

Wednesday, 17th May, 09:00 - 12:20

Tutorials

09:00 - 12:20

Cross-platform Native GUI development with BeeWare

Russell Keith-Magee

Russell Keith-Magee has been a member of the Django core team for 10 years, and was the President of the Django Software Foundation from 2010-15. He's also the founder of the Python BeeWare project, a set of tools and libraries for developing native user interfaces in Python. He lives in Perth, Western Australia with his wife, son, daughter, and two cats.

Room 5

Wednesday, 17th May, 09:00 - 12:20

Tutorials

09:00 - 12:20

Exploratory data analysis in python

Chloe Mawer, Jonathan Whitmore

With the recent advancements in machine learning algorithms and statistical techniques, and the increasing ease of implementing them in Python, it is tempting to ignore the power and necessity of exploratory data analysis (EDA), the crucial step before diving into machine learning or statistical modeling. Simply applying machine learning algorithms without a proper orientation of the dataset can lead to wasted time and spurious conclusions. EDA allows practitioners to gain intuition for the pattern of the data, identify anomalies, narrow down a set of alternative modeling approaches, devise strategies to handle missing data, and ensure correct interpretation of the results. Further, EDA can rapidly generate insights and answer many questions without requiring complex modeling. Python is a fantastic language not only for machine learning, but also for EDA. In this tutorial, we will walk through two hands-on examples of how to perform EDA using Python and discuss various EDA techniques for cross-section data, time-series data, and panel data. One example will demonstrate how to use EDA to answer questions, test business assumptions, and generate hypotheses for further analysis. The other example will focus on performing EDA to prepare for modeling. Between these two examples, we will cover: data profiling and quality assessment; basic describing of the data; visualizing the data, including interactive visualizations; identifying patterns in the data (including patterns of correlated missing data); dealing with many attributes (columns); dealing with large datasets using sampling techniques; informing the engineering of features for future modeling; identifying challenges of using the data (e.g. skewness, outliers); and developing an intuition for interpreting the results of future modeling. The intended audience for this tutorial is aspiring and practicing data scientists and analysts, or anyone who wants to be able to get insights out of data.
Students must have at least an intermediate-level knowledge of Python; some familiarity with analyzing data would be beneficial. Installation of Jupyter Notebook will be required (and potentially, we will also demonstrate analysis in JupyterLab, if its development in the next few months allows). Instructions will be sent on what packages to install beforehand.

Chloe Mawer

Chloe Mawer is a Senior Data Scientist at Silicon Valley Data Science, a small consulting company located in Mountain View, California that focuses on transforming businesses through data strategy, science, and engineering. At SVDS, Chloe has worked on problems for pharmaceutical and retail companies, which heavily rely on using Python for data analysis and modeling. Prior to SVDS, she obtained her PhD in Environmental Engineering at Stanford, where she focused on developing methods for monitoring water’s movement in the subsurface using electrical measurements.

Jonathan Whitmore

Jonathan Whitmore, PhD, is a Senior Data Scientist at Silicon Valley Data Science. He is the author of the O'Reilly screencast: Jupyter Notebook for Data Science Teams. Before moving into the tech industry, Dr. Whitmore worked as an astrophysicist in Melbourne, Australia, researching whether the fundamental physical constants have changed over the age of the universe. Dr. Whitmore received his PhD in physics from the University of California, San Diego.

Room 7

Wednesday, 17th May, 09:00 - 12:20

Tutorials

09:00 - 12:20

Web programming from the beginning

Thomas Ballinger

Modern web frameworks such as Django and Flask are immensely powerful. However, these useful tools obscure the foundations of network programming upon which they are based, which can be very helpful to understand. So instead of building useful applications with these libraries, let's experiment with sockets! At this tutorial, a Python-flavored history of the web will be presented and attendees will write or modify a TCP chat client, a static site web server, an HTTP client, a CGI script, and a WSGI-compliant server and web application. We will learn what all those things are and how they fit together, bringing the architecture of modern web apps into better focus. The material will be accessible to participants with no web development experience; however, they must be able to write and run Python scripts at the command line. This tutorial might appeal to someone also attending an introductory web development tutorial, but it covers separate, complementary material. Web development experience is not required but a little exposure would be helpful; for instance, installing Flask and running the minimal application on the quickstart page. Similarly, prior exposure to HTML would be useful but is not necessary.
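To make "WSGI-compliant web application" concrete, here is a minimal sketch of the callable that the WSGI interface expects. It is an illustration only, not material from the tutorial; the name `application` is a common convention rather than a requirement.

```python
def application(environ, start_response):
    """Minimal WSGI app: echo the requested path back as plain text."""
    # The server passes request data in the environ dict (CGI-style keys).
    path = environ.get("PATH_INFO", "/")
    body = f"You requested {path}\n".encode("utf-8")

    # Status line and headers go through the start_response callable.
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)

    # The response body is an iterable of bytes objects.
    return [body]
```

Any WSGI server can host this callable; for local experiments, the standard library's `wsgiref.simple_server.make_server("localhost", 8000, application)` returns a server whose `serve_forever()` method handles requests.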

Thomas Ballinger

Tom is a Python enthusiast employed by Dropbox. Previously he worked as a facilitator at the Recurse Center in NYC where he explored dark corners of Python with participants and helped bring more BitTorrent clients into the world.

Room 2

Wednesday, 17th May, 09:00 - 12:20

Tutorials

13:20 - 16:20

Decorators and descriptors decoded

Luciano Ramalho

Python developers use decorators and descriptors on a daily basis, but many don't understand them well enough to create (or debug) them. Decorators are widely deployed in popular Python Web frameworks. Descriptors are the key to the database mappers used with those frameworks, but under the covers they play an even more crucial role in Python as the device that turns plain functions into bound methods, setting the value of the self argument. This tutorial is a gentle introduction to these important language features, using a test-driven presentation and exercises, and covering enhancements in Python 3.6 that make class metaprogramming easier to get right. Decorators without closures are presented first, highlighting the difference between run time and import time that is crucial when meta-programming. We then get a firm grounding on closures and how they are implemented in Python, before moving to higher order function decorators and class decorators. Coverage of descriptors starts with a close look at Python's property built-in function and dynamic attribute lookup. We then implement some ORM-like field validation descriptors, encounter a usability problem, and leverage PEP 487 -- Simpler customisation of class creation -- to solve it. Alternative implementations using a class decorator and a metaclass will be contrasted to the PEP 487 solution.
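A minimal sketch of the kind of field validation descriptor mentioned above, assuming Python 3.6 or later. The `Positive` and `LineItem` names are hypothetical and are not taken from the tutorial; the point is the `__set_name__` hook from PEP 487, which tells the descriptor its attribute name without repeating it by hand.

```python
class Positive:
    """Hypothetical descriptor that rejects values that are not positive."""

    def __set_name__(self, owner, name):
        # PEP 487 hook: called once when the owning class is created.
        self.public_name = name
        self.storage_name = "_" + name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # accessed on the class, not on an instance
        return getattr(instance, self.storage_name)

    def __set__(self, instance, value):
        if value <= 0:
            raise ValueError(f"{self.public_name} must be positive")
        setattr(instance, self.storage_name, value)


class LineItem:
    weight = Positive()
    price = Positive()

    def __init__(self, weight, price):
        self.weight = weight  # goes through Positive.__set__
        self.price = price

    def subtotal(self):
        return self.weight * self.price
```

Plain functions are descriptors too: `LineItem.subtotal.__get__(item)` returns the same kind of bound method that `item.subtotal` does, which is the mechanism the abstract refers to when it says descriptors set the value of `self`.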

Luciano Ramalho

Luciano Ramalho is a Technical Principal at ThoughtWorks and the author of the bestselling book Fluent Python (O'Reilly, 2015). Since 1998 he has deployed Python on some of the largest Web sites in Brazil. His speaking record includes PyCon US, OSCON, OSCON-EU, PythonBrasil, RuPy and an ACM Webinar that was attended by more than 900 people. Ramalho is a fellow of the PSF and co-founder of the Brazilian Python Association and of Garoa Hacker Clube, the first hackerspace in Brazil.

Room 1

Wednesday, 17th May, 13:20 - 16:20

Tutorials

13:20 - 16:40

ContainerOrchestration.py: The tutorial session

Mike Bright, Haïkel Guémar, Mario Loriedo

Container Orchestration is the new hot topic in the design of scalable system architectures. In this tutorial we look at the main choices for container orchestrators: Docker Swarm, Kubernetes and Apache Mesos. We will look at the use of the respective Python APIs for interacting with those engines. This 3-hr session will provide hands-on use of those orchestrators with real use-cases.

Mike Bright

Solution Architect at Hewlett-Packard Enterprise working in the EMEA OpenNFV lab (Cloud Computing for Telecom), based in Grenoble, France. Passionate about Containers, Orchestration and Programming Languages. Runs the Grenoble Python User Group. Likes to travel and dance (Argentinian Tango, Salsa, Rock).

Haïkel Guémar

FedoraProject.org Developer & Ambassador. Senior Software Engineer in a small HPC/SaaS Startup. Lyon Coding Dojo founder & leader.

Mario Loriedo

Principal Software Engineer at Red Hat, developer and container enthusiast

Room 5

Wednesday, 17th May, 13:20 - 16:40

Tutorials

13:20 - 16:40

How to Write and Debug C Extension Modules

Joe Jevnik

The CPython interpreter allows us to implement modules in C for performance critical code or to interface with external libraries while presenting users with a high level Python API. This tutorial will teach you how to leverage the power of C in your Python projects. We will start by explaining the C representation of Python objects and how to manipulate them from within C. We will then move on to implementing functions in C for use in Python. We will discuss reference counting and correct exception handling. We will also talk about how to package and build your new extension module so that it may be shared on PyPI. (We will only be covering building extension modules on GNU/Linux and OSX, not Windows). After the break, we will show how to implement a new type in C. This will cover how to hook into various protocols and properly support cyclic garbage collection. We will also discuss techniques for debugging C extension modules with gdb using the CPython gdb extension.

Joe Jevnik

Joe Jevnik works at Quantopian where he works on integrating data from various sources into the platform. Joe works on Zipline, Quantopian's open source backtester. He also works on the Blaze ecosystem, mainly on blaze core, odo, and datashape.

Room 3

Wednesday, 17th May, 13:20 - 16:40

Tutorials

13:20 - 16:40

Introduction to Statistical Modeling with Python

Christopher Fonnesbeck

This intermediate-level tutorial will provide students with hands-on experience applying practical statistical modeling methods on real data. Unlike many introductory statistics courses, we will not be applying 'cookbook' methods that are easy to teach, but often inapplicable; instead, we will learn some foundational statistical methods that can be applied generally to a wide variety of problems: maximum likelihood, bootstrapping, linear regression, and other modern techniques. The tutorial will start with a short introduction on data manipulation and cleaning using pandas, before proceeding on to simple concepts like fitting data to statistical distributions, and how to use Monte Carlo simulation for data analysis. Slightly more advanced topics include bootstrapping (for estimating uncertainty around estimates) and flexible linear regression methods using Bayesian methods. By using and modifying hand-coded implementations of these techniques, students will gain an understanding of how each method works. Students will come away with knowledge of how to deal with very practical statistical problems, such as how to deal with missing data, how to check a statistical model for appropriateness, and how to properly express the uncertainty in the quantities estimated by statistical methods.

Christopher Fonnesbeck

Chris Fonnesbeck is an Assistant Professor in the Department of Biostatistics at the Vanderbilt University School of Medicine. He specializes in computational statistics, Bayesian methods, meta-analysis, and applied decision analysis. He originally hails from Vancouver, BC and received his Ph.D. from the University of Georgia.

Room 7

Wednesday, 17th May, 13:20 - 16:40

Tutorials

13:20 - 16:40

Microservices with Python and Flask

Miguel Grinberg

Microservices are receiving the buzzword treatment these days, and as such, they have a cloud of hype surrounding them that makes it hard to separate substance from fluff. In this tutorial, Miguel Grinberg starts with an introduction to this architecture, including what's great and not so great about it, and then teaches you how a traditional monolithic application written in Flask can be refactored into a modern distributed system based on microservices.

Miguel Grinberg

Miguel Grinberg is a Software Developer at Rackspace, where he works on cloud applications. He blogs at http://blog.miguelgrinberg.com about a variety of topics including web development, Python, robotics, photography and the occasional movie review. Miguel is the author of the O'Reilly book Flask Web Development. He lives in beautiful Portland, Oregon. Follow @miguelgrinberg on Twitter.

Room 2

Wednesday, 17th May, 13:20 - 16:40

Tutorials

13:20 - 16:40

Complexity Science

Allen Downey, Jason Woodard

Complexity Science is an approach to modeling systems using tools from discrete mathematics and computer science, including networks, cellular automata, and agent-based models. It has applications in many areas of natural and social science. Python is a particularly good language for exploring and implementing models of complex systems. In this tutorial, we present material from the draft second edition of Think Complexity, and from a class we teach at Olin College. We will work with random networks using NetworkX, with cellular automata using NumPy, and we will implement simple agent-based models.

Allen Downey

Allen Downey is a professor of computer science at Olin College, a new engineering college near Boston with the mission to fix engineering education. He is the author of Think Python, Think Stats, Think Bayes, Think Complexity, and several other books all available under free licenses.

Jason Woodard

Jason Woodard is an associate professor of engineering and entrepreneurship at Olin College. He studied complex systems and computational modeling at the Santa Fe Institute, and uses complexity science to model the evolution of technology and markets.

Room 9

Wednesday, 17th May, 13:20 - 16:40

Tutorials

13:20 - 16:40

Readable Regular Expressions

Trey Hunner

What are regular expressions, what are they useful for, and why are they so hard to read? We'll learn what regular expressions are good for, how to make our own regular expressions, and how to make our regular expressions friendly and readable (yes it's possible, sort of).
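One common way to make a pattern more readable is Python's `re.VERBOSE` flag, which allows whitespace and comments inside the pattern. The sketch below is an illustration of that flag only, not necessarily the approach taught in the tutorial; the pattern and the `parse_date` helper are hypothetical.

```python
import re

# Hypothetical pattern: a simple ISO-style date such as 2017-05-17.
# With re.VERBOSE, whitespace in the pattern is ignored and "#" starts a comment.
DATE_RE = re.compile(r"""
    (?P<year>\d{4})    # four-digit year
    -
    (?P<month>\d{2})   # two-digit month
    -
    (?P<day>\d{2})     # two-digit day
""", re.VERBOSE)


def parse_date(text):
    """Return (year, month, day) as ints, or None if no date is found."""
    match = DATE_RE.search(text)
    if match is None:
        return None
    return tuple(int(match.group(name)) for name in ("year", "month", "day"))
```

The compact equivalent, `r"(\d{4})-(\d{2})-(\d{2})"`, matches the same text; the verbose form trades brevity for named groups and inline documentation.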

Trey Hunner

Trey Hunner is a Python & Django mentor specializing in on-site corporate training. Trey holds weekly online Python chats, is a director at the PSF, and is heavily involved with his local Python meetup group in San Diego.

Room 4

Wednesday, 17th May, 13:20 - 16:40

Tutorials

13:20 - 16:40

Time Series Analysis

Aileen Nielsen

Time series analysis is more relevant than ever with the rise of big data, the internet of things, and the general availability of data that follows events through time. This tutorial will introduce participants to the many versatile tools Python offers for exploring, analyzing, and predicting time series data. The tutorial will be a mix of lecture and practice, and it will be broken down into four components: (1) Handling timestamped data in Python (2) Commonly encountered problems with time series (3) Time series prediction exercises (4) Time series classification exercises

Aileen Nielsen

Since completing degrees in anthropology, law, and physics from Princeton, Yale, and Columbia respectively, Aileen Nielsen has worked in corporate law, physics research laboratories, and, most recently, NYC startups oriented towards improving daily life for under-served populations - particularly groups who have yet to fully enjoy the benefits of mobile technology. She has interests ranging from defensive software engineering to UX designs for reducing cognitive load to the interplay between law and technology. Coming off a recent stint as a data scientist in Hillary Clinton's presidential campaign, Aileen now engineers One Drop's diabetes-management products.

Room 6

Wednesday, 17th May, 13:20 - 16:40

Tutorials

13:20 - 16:40

bokeh: Data Visualization in Python

Chalmer Lowe

Bokeh is a powerful data visualization library that creates fully interactive plots and integrates well with the data analysis tools you already know and love: pandas, matplotlib, seaborn, ggplot. Bokeh can produce stand-alone browser-based plots and much more sophisticated server-hosted visualizations. Learn to use bokeh to create everything from basic graphs to advanced interactive plots, dashboards, and data applications. Incorporate bokeh within your Jupyter/IPython notebooks; partner bokeh with other libraries such as matplotlib, seaborn, pandas, and ggplot; learn about bokeh server to serve up even more impressive realtime visualizations; explore configurations and settings; and recognize and overcome common problems.

Chalmer Lowe

Chalmer Lowe founded PyHawaii, the largest programming meetup in Hawaii and served as the Chairman of the Python Education Summit (2015, 2016). He co-founded Dark Art of Coding, a programming school to enable him and his son to fulfill their passion of teaching Python and Data Analysis to geeks at home, at work and in the community. He performs data analysis for his employer: Booz Allen Hamilton and teaches Python to his colleagues, clients and anyone who will stand still long enough.

Room 8

Wednesday, 17th May, 13:20 - 16:40

Tutorials

Sponsor tutorials

11:00 - 12:30

Red Hat: Deploying Python web applications to OpenShift/Kubernetes

Graham Dumpleton

This will be a hands-on workshop where you will get to experience for yourself how easy it is to deploy a Python web application to OpenShift. The latest version of OpenShift is implemented on top of Kubernetes for container orchestration and Docker for the container runtime. On top of these tools OpenShift adds its own special magic sauce to even further simplify the deployment of applications. In the workshop you will learn how to deploy a Python web application direct from a Git repository holding the application source code, with the build process being handled by the Source-to-Image (S2I) tool. Next you will deploy a database from a pre-existing Docker-formatted container image and learn how to hook your Python web application up to it. Finally you will configure a Git repository webhook to automate the deployment process so that every time you commit and push up changes your application will be automatically rebuilt and deployed. During the workshop we will be throwing in various other tidbits to help explain what OpenShift is, how it works and how it can help you to host not only your Python web site, but also more complex applications, be they legacy systems, or new microservice architecture applications, in any language. For the workshop, you will be provided access to an online instance of OpenShift Origin with everything you need. The only piece of software you will need to install locally on your own computer will be a single program binary for our command line client for OpenShift.

Graham Dumpleton

Graham is the author of mod_wsgi, the Apache module for hosting of Python web applications using the WSGI interface. He also has a keen interest in Docker and Platform as a Service (PaaS) technologies. He is currently a developer advocate for OpenShift at Red Hat.

Room B110-111

Wednesday, 17th May, 11:00 - 12:30

Sponsor tutorials

11:00 - 12:30

Twist Bioscience: Using Python in Synthetic DNA Manufacturing

Kieran Hervold

A quick introduction to how Twist Bioscience is enabling world-changing innovation, such as DNA computing and turning sugar into vaccines. Rust Bindings with Python: We will demonstrate how we combine the best of Rust and Python by creating a Rust library with Python bindings. Hardware Scripting with Embedded IronPython: We’ll demonstrate how we use the flexibility of the IronPython runtime to make our hardware scriptable and enable debugging within the same process. This will require a Windows machine or VM image with Visual Studio installed (free community install). Polymorphic Marshmallowed JSON with SQLAlchemy: This exercise will guide you through creating polymorphic objects that are transparently serialized in and out of Postgres JSON fields. We will also work on creating graph objects with mutability tracking.

Kieran Hervold

Kieran Hervold is a Senior Engineer on the bioinformatics team at Twist Bioscience. He is an expert at building complex pipelines with Python. Lately, he started using Rust through Python to provide massive speed-ups to our DNA design algorithms while retaining the flexibility and interactivity of Python.

Room B118-119

Wednesday, 17th May, 11:00 - 12:30

Sponsor tutorials

13:30 - 15:00

Clover Health: Transforming and Analyzing Healthcare Data with Python

Bijan Vakili

This workshop will give you an introduction to how we use Python for testing, analysis, and processing at Clover. This includes a walkthrough of our tech stack along with a dive into two use cases. The first use case is from a Data Science perspective which will go over how we test SQL queries in our data pipeline. This will get into an example of statistical modeling in a particular insurance operations context. The second use case is from an Engineering perspective which will show how we transform nested JSON structures into consumable flat table structures. This will also touch on techniques for processing large amounts of data. Clover uses lots of Python tools and libraries which we're happy to discuss. We rely heavily on Postgres as our primary database solution. However, this talk will highlight SQLAlchemy, Jupyter Notebook, pytest, generators, partial functions, and LRU caching.
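As a rough sketch of what turning nested JSON into a flat table row can look like, the generator below walks a decoded JSON value and yields one dotted column name per leaf. This is an assumption-laden illustration, not Clover's implementation; the `flatten` name, the dotted naming scheme, and the sample record are all hypothetical.

```python
def flatten(record, prefix=""):
    """Yield (column, value) pairs for every leaf of a nested dict/list."""
    if isinstance(record, dict):
        for key, value in record.items():
            # Descend into the mapping, extending the dotted column name.
            yield from flatten(value, f"{prefix}{key}.")
    elif isinstance(record, list):
        for index, value in enumerate(record):
            # List positions become part of the column name.
            yield from flatten(value, f"{prefix}{index}.")
    else:
        # Leaf value: drop the trailing dot from the accumulated prefix.
        yield prefix[:-1], record


# Hypothetical sample record, made up for this example.
sample = {"member": {"id": 7, "plans": [{"code": "A"}]}, "active": True}
row = dict(flatten(sample))
```

Because `flatten` is a generator, leaves are produced lazily, which fits the abstract's mention of generators for processing large amounts of data.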

Bijan Vakili

Bijan Vakili is a Senior Software Engineer at Clover Health where he is building applications, improving infrastructure and mentoring developers. Prior to joining Clover, Bijan worked in currency and derivative trading, gaming, network applications, and disaster recovery. He has worked in multiple roles including software developer, team lead, and project manager. Bijan holds a Bachelor’s degree in Software Engineering & Human Biology from the University of Toronto and an MBA from the University of Toronto - Rotman School of Management.

Room B118-119

Wednesday, 17th May, 13:30 - 15:00

Sponsor tutorials

13:30 - 15:00

Intel: Bring deep learning to the fingertips of data scientists with Python & BigDL on Apache Spark

Jiao Wang

The data science and big data communities are beginning to engage further with artificial intelligence and deep learning technologies, and efforts to bridge the gap between the deep learning communities and the data science / big data communities are beginning to emerge. However, developing deep neural nets is an intricate procedure, and scaling that to big data scale is an even more challenging process. Therefore, deep learning tools and frameworks, especially visualization support, that can run smoothly on top of big data platforms are essential for scientists to understand, inspect and manipulate their big models and big data. In this talk, we will share how we bring deep learning to the fingertips of big data users and data scientists, by providing visualizations (through widely used frameworks such as Jupyter Notebooks and/or TensorBoard) as well as Python toolkits (e.g., NumPy, SciPy, Scikit-learn, NLTK, Keras, etc.) on top of BigDL, an open source distributed deep learning library for Apache Spark. In addition, we will also share how real-world big data users and data scientists use these tools to build AI-powered big data analytics applications.

Jiao Wang

Jiao Wang is a software engineer on the Big Data Technology team at Intel who works in the area of big data analytics. She is engaged in developing and optimizing distributed deep learning frameworks on Apache Spark.

Room B110-111

Wednesday, 17th May, 13:30 - 15:00

Sponsor tutorials

15:30 - 17:00

Intel: Scalable, distributed deep learning with Python and Pachyderm

Daniel Whitenack

Daniel Whitenack (@dwhitena) is a Ph.D. trained data scientist working with Pachyderm (@pachydermIO). Daniel develops innovative, distributed data pipelines which include predictive models, data visualizations, statistical analyses, and more. He has spoken at conferences around the world (Datapalooza, DevFest Siberia, GopherCon, and more), teaches data science/engineering with Ardan Labs (@ardanlabs), maintains the Go kernel for Jupyter, and is actively helping to organize contributions to various open source data science projects.

Room B110-111

Wednesday, 17th May, 15:30 - 17:00

Sponsor tutorials

Keynote

11:00 - 12:30

Twist Bioscience: Using Python in Synthetic DNA Manufacturing

Kelsey Hightower (Google), Katy Huff (University of Illinois), Jake Vanderplas (University of Washington), Lisa Guo (Instagram), Hui Ding (Instagram)

Kelsey Hightower

Google

Kelsey Hightower is an open source advocate and recovering sysadmin who is currently serving the application container and distributed systems community as an educator and toolsmith. He is currently employed by Google.

Katy Huff

University of Illinois

Dr. Kathryn D. Huff is an unapologetic advocate for open reproducible scientific computing and for emissions-free base-load nuclear energy. She is currently an Assistant Professor in the Department of Nuclear, Plasma, and Radiological Engineering at the University of Illinois at Urbana-Champaign where she leads the Advanced Reactors and Fuel Cycles Research Group. She holds an affiliate faculty position with the National Center for Supercomputing Applications and is one of the University of Illinois' most recent Blue Waters Professors.

Jake Vanderplas

University of Washington

Jake VanderPlas is an astronomer by training, and a long-time user and developer of the scientific Python stack. He currently works as an interdisciplinary research director at the University of Washington, where he writes, teaches, collaborates on research, and spends time consulting with local scientists from a wide range of fields.

Lisa Guo

Instagram

Lisa Guo is a networking, platform, and scalability software engineer with over 20 years of experience. She has been working with the Instagram Infrastructure team since 2014, where she led efforts to expand from a single to multiple data centers and improve efficiency and cost-effectiveness.

Hui Ding

Instagram

Hui Ding is Head of the Infrastructure org at Instagram, where he oversees the scaling of the Instagram backend platform that supports hundreds of millions of concurrent users on a daily basis. Hui has been with Instagram since 2012, and has led the development of many Instagram product launches as well as all infrastructure efforts.

Keynotes venue

Wednesday, 17th May, 11:00 - 12:30

Keynote

Summit

10:00 - 16:00

Python Language Summit

Larry Hastings, Barry Warsaw

6-hour session

The Python Language Summit is an invitation-only event for the developers of Python implementations (CPython, IronPython, Jython, Parrot, PyPy, etc.) to share information, discuss our shared problems, and — hopefully — solve them. These issues might be related to the language itself, the standard library, the development process, Python 2 and 3, the documentation, package index, web site, et cetera. The Summit will focus on discussion more than on presentations.

Larry Hastings

Larry has been using Python since the lateish 90s. He is a CPython core contributor, and is Release Manager for Python 3.4.

Barry Warsaw

Barry is a long time Pythonista, having worked with Guido and the rest of the Pythonlabs crew since 1995. He is the project leader for GNU Mailman, working mainly on Mailman 3 these days. He's been the release manager for several previous versions of Python, and still comes up with terrible ideas for Python.

Keynotes venue

Wednesday, 17th May, 10:00 - 16:00

Summit

Thursday, 18th May

Tutorials

09:00 - 12:20

Build a data pipeline with Luigi

Aaron Knight

Luigi is a Python library for building pipelines of batch processes. It 'handles dependency resolution, workflow management, visualization, handling failures, command line integration, and much more'. In this tutorial, we will use Luigi to build a data pipeline that runs a series of interdependent jobs. We will also discuss some real-world use cases for Luigi, and show how it can make running a data pipeline much more robust and reliable.

Aaron Knight

Aaron Knight is a full-stack engineer at Voxy, a company that creates dynamic, personalized educational courses for English learners.

Room 9

Thursday, 18th May, 09:00 - 12:20

Tutorials

09:00 - 12:20

Creating And Consuming Modern Web Services with Twisted

Moshe Zadka, Michele Pratusevich

This tutorial will show students how to write applications and services which efficiently publish and consume services and APIs. To do so, we will combine 4 Python-based technologies: Jupyter is a real-time development environment. Twisted is a powerful platform for network programming that supports many protocols, including HTTP. Klein is a Twisted-based web application framework. Treq is a Requests-style HTTP client based on Twisted. By combining all of these we will guide students through interactively prototyping a production quality web application that publishes both service APIs and web resources such as HTML, and that can efficiently consume many back-end services such as 3rd-party APIs.

Moshe Zadka

Moshe has been involved with Python since 1998, when he helped obsolete math-related modules. Since then, he has made some contributions to core Python that were not just deleting code, and has been a contributor on the Twisted project since its inception.

Michele Pratusevich

I graduated from MIT with my BS in computer science in 2013 followed by an MEng in 2015. I currently work at Amazon on large-scale computer vision problems and systems, with Python and C++ being the languages of choice. I like hiking, climbing, and fun programming problems.

Room 5

Thursday, 18th May, 09:00 - 12:20

Tutorials

09:00 - 12:20

Deploy and scale containers with Docker native, open source orchestration

Jerome Petazzoni, AJ Bowen

Deploy your own cluster! Use it to 'build, ship, and run' containerized applications! Learn how to implement logging, metrics, stateful services, and more! Learn the True Way of DevOps! Alright, we can't promise anything about the True Way of DevOps, but everything else will definitely be in this tutorial. We will run a demo app featuring Python components and see some best practices to 'Dockerize' Python code and Flask in particular; but the tutorial also includes other languages and frameworks. Come with your laptop! You don't need to install anything before the workshop, as long as you have a web browser and an SSH client. Each student will have their own private cluster during the tutorial, to get immediately applicable first-hand experience.

Jerome Petazzoni

Jerome is a senior engineer at Docker, where he helps others to containerize all the things. In another life he built and operated Xen clouds when EC2 was just the name of a plane, developed a GIS to deploy fiber interconnects through the French subway, managed commando deployments of large-scale video streaming systems in bandwidth-constrained environments such as conference centers, operated and scaled the dotCloud PAAS, and various other feats of technical wizardry. When annoyed, he threatens to replace things with a very small shell script.

AJ Bowen

As a Solutions Engineer at Convox, AJ Bowen is on a mission to containerize all the things and help others to do the same. Her other experience includes developer advocacy at Gandi.net as well as heavy involvement in the Occupy movement in Kansas City, where she discovered the delightful and complicated world of horizontal organization in groups of passionate people with wildly diverse political perspectives, and humanitarian work in Haiti. AJ doesn’t like being told what to do—or telling others what to do—and is on a constant mission to eliminate unnecessary hierarchy.

Room 4

Thursday, 18th May, 09:00 - 12:20

Tutorials

09:00 - 12:20

Faster Python Programs - Measure, don't Guess

Mike Muller

Optimization can often help to make Python programs faster or use less memory. Developing a strategy, establishing solid measuring and visualization techniques, as well as knowing about algorithmic basics and data structures are the foundation for a successful optimization. The tutorial will cover these topics. Examples will give you hands-on experience of how to approach optimization efficiently. Python is a great language. But it can be slow compared to other languages for certain types of tasks. If applied appropriately, optimization may reduce program runtime or memory consumption considerably. But this often comes at a price. Optimization can be time consuming and the optimized program may be more complicated. This, in turn, means more maintenance effort. How do you find out if it is worthwhile to optimize your program? Where should you start? This tutorial will help you to answer these questions. You will learn how to find an optimization strategy based on quantitative and objective criteria. You will experience that one's gut feeling about what to optimize is often wrong. The solution to this problem is: "Measure, Measure, and Measure!". You will learn how to measure program run times as well as profile CPU and memory. There are great tools available. You will learn how to use some of them. Measuring is not easy because, by definition, as soon as you start to measure, you influence your system. Keeping this impact as small as possible is important. Therefore, we will cover different measuring techniques. Furthermore, we will look at algorithmic improvements. You will see that the right data structure for the job can make a big difference. Finally, you will learn about different caching techniques.
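To make the "measure, don't guess" idea concrete, here is a minimal sketch (not taken from the tutorial materials) that uses the standard library's `timeit` module to compare membership tests on a list and on a set. The helper name `time_membership` and the sizes are assumptions chosen for illustration; absolute timings will vary by machine.

```python
import timeit

N = 10_000
as_list = list(range(N))
as_set = set(as_list)
needle = N - 1  # worst case for the list: the last element


def time_membership(container, repeats=5, number=200):
    """Return the best-of-`repeats` time for `number` membership tests."""
    timer = timeit.Timer(lambda: needle in container)
    # Taking the minimum reduces the influence of other system activity.
    return min(timer.repeat(repeat=repeats, number=number))


list_time = time_membership(as_list)
set_time = time_membership(as_set)
print(f"list: {list_time:.6f}s  set: {set_time:.6f}s")
```

The list scan is linear in the container size while the set lookup is, on average, constant time, so the gap widens as `N` grows. Measuring both is what turns that expectation into evidence.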

Mike Muller

Mike Muller has been using Python as his primary programming language since 1999. He is a Python trainer and the CEO at Python Academy (www.python-academy.com). He teaches a wide variety of Python topics including Introduction to Python, Python for Scientists and Engineers, Advanced Python as well as Optimization and Extensions of Python Programs. He is the chairman of the Python Software Verband e.V., a PSF member, a PSF community service award holder, and a user group co-founder. He chaired EuroSciPy 2008 and 2009, PyCon DE 2011 and 2012 as well as EuroPython 2014 in Berlin, Germany.

Room 3

Thursday, 18th May, 09:00 - 12:20

Tutorials

09:00 - 12:20

Hands-On Intro to Python for New Programmers

Trey Hunner

Brand new to programming and want to get some hands-on Python experience? Let's learn some Python together! During this tutorial we will work through a number of programming exercises together. We'll be doing a lot of asking questions, taking guesses, trying things out, and seeking out help from others. In this tutorial we'll cover: types of things in Python (strings, numbers, lists); conditionally executing code; repeating code with loops; and getting user input.

Trey Hunner

Room 1

Thursday, 18th May, 09:00 - 12:20

Tutorials

09:00 - 12:20

Intro to Bayesian Machine Learning with PyMC3 and Edward

Torsten Scholak, Diego Maniloff

There has been an upsurge of probabilistic programming and Bayesian statistics. These techniques are tremendously useful, because they help us to understand, to explain, and to predict data through building a model that accounts for the data and is capable of synthesizing it. This is called the generative approach to statistical pattern recognition. Estimating the parameters of Bayesian models has always been hard, impossibly hard actually in many cases for anyone but experts. However, recent advances in probabilistic programming have endowed us with tools to estimate models with a lot of parameters and for a lot of data. In this tutorial, we will discuss two of these tools, PyMC3 and Edward. These are black box tools, Swiss Army knives for Bayesian modeling that do not require knowledge of calculus or numerical integration. This puts the power of Bayesian statistics into the hands of everyone, not only experts of the field. And, it's great that these are implemented in Python with its rich, beginner-friendly ecosystem. It means we can immediately start playing with them. We have planned three awesome parts, spread over three awesome hours: First hour: Introduction to Bayesian machine learning. Second hour: Baby steps in PyMC3 and Edward. Third hour: Solve a real-world problem with PyMC3 or Edward (model, fit, criticize).

Torsten Scholak

In a former life, Torsten was a theoretical physicist and spent his life hacking together simulations of complex quantum systems, mostly in Python and C. More recently, he has been dedicating his time to data science and, in particular, Bayesian probabilistic modeling. Torsten lives in Toronto with his wife, and enjoys contributing to the local Meetup scene.

Diego Maniloff

Diego Maniloff is VP of Engineering at Unata, where he uses Python everyday to build recommendation systems. Diego was a research fellow at MIT, where he collaborated with GE to characterize the predictability of human diseases from millions of medical records, and with Audi to design an onboard recommender. Diego was a Fulbright scholar at UIC's AI Lab, where he focused on algorithms for offline/online planning under uncertainty. He received his BS+MEng degree in Telecommunications Engineering from UBP in beautiful Córdoba, Argentina. Back home, he learned his most valuable skill: how to prepare a great asado.

Room 6

Thursday, 18th May, 09:00 - 12:20

Tutorials

09:00 - 12:20

IPython and Jupyter in Depth: High productivity, interactive Python

Matthias Bussonnier, Mike Bright, Min Ragan-Kelley

IPython and Jupyter provide tools for interactive computing that are widely used in scientific computing, education, and data science, but can benefit any Python developer. You will learn how to use IPython in different ways, as: an interactive shell, a graphical console, a network-aware VM (Virtual machine) in GUIs, a web-based notebook combining code, graphics and rich HTML. We will demonstrate how to deploy a custom environment with Docker that not only contains multiple Python kernels but also a couple of other languages.

Matthias Bussonnier

Matthias is a postdoc at the UC Berkeley Institute for Data Science, and has been a core developer of IPython and Jupyter for a couple of years. With a background in physics, Matthias spends most of his time developing tools for the scientific community and for education as well as promoting Python 3.

Mike Bright

Min Ragan-Kelley

Min has been a core developer of IPython (and now Jupyter) since 2006. He holds a PhD from UC Berkeley in Applied Science & Technology, with an emphasis in computational plasma physics. He now works as a postdoctoral researcher at Simula Research Laboratory in Oslo, Norway, on the Jupyter and OpenDreamKit projects, focusing on JupyterHub and the Jupyter protocols for interactive computing.

Room 7

Thursday, 18th May, 09:00 - 12:20

Tutorials

09:00 - 12:20

Let's build a web framework!

Jacob Kaplan-Moss

If you're building a web app, you probably reach for your favorite framework -- Django, Flask, Pyramid, etc. But we rarely stop to think about what these tools are doing under the hood. In this hands-on tutorial, you'll gain a deeper understanding of what frameworks are and how they work by implementing your own framework from scratch. We'll build a complete (if minimal) web framework that handles the WSGI request/response cycle, routing, controllers, templating, and a data layer. Along the way you'll gain a deeper understanding of the decisions web frameworks make, their relative merits, and inner workings.
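For a sense of scale, the WSGI request/response cycle and routing mentioned above can be sketched in a few lines of standard-library Python. This is an illustrative toy rather than the framework built in the tutorial; `ROUTES`, `route`, `index`, and `app` are hypothetical names, and only exact-match paths returning plain text are supported.

```python
ROUTES = {}


def route(path):
    """Register a view function for an exact URL path."""
    def decorator(view):
        ROUTES[path] = view
        return view
    return decorator


@route("/")
def index(environ):
    return "Hello, PyCon!"


def app(environ, start_response):
    """WSGI entry point: look up the view and build the response."""
    view = ROUTES.get(environ.get("PATH_INFO", "/"))
    if view is None:
        status, body = "404 Not Found", "Not found"
    else:
        status, body = "200 OK", view(environ)
    payload = body.encode("utf-8")
    start_response(status, [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(payload))),
    ])
    # WSGI expects an iterable of bytes objects.
    return [payload]
```

To try it in a browser, the callable can be passed to `wsgiref.simple_server.make_server("", 8000, app)` and served with `serve_forever()`. Templating and a data layer are left out of this sketch.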

Jacob Kaplan-Moss

Jacob is a core contributor to Django, co-owner of Revolution Systems, and Engineering Security Lead at 18F.

Room 2

Thursday, 18th May, 09:00 - 12:20

Tutorials

09:00 - 12:20

Mastering scipy.spatial

Tyler Reddy

The heavily-used scipy library is so large that each of the major modules could fill its own tutorial syllabus. It is also production-quality software with a 1.0 release imminent. In this tutorial, my focus is to cover the scipy.spatial component of the library in great detail, from the perspective of a heavy user and active developer of the computational geometry components of scipy. From distance matrices to Voronoi diagrams and Hausdorff distances, we will explore the corners of scipy.spatial code--both long-established features and even proposed features that haven't yet made it into a stable release.

Tyler Reddy

I have a PhD in biochemistry and molecular biology and am a post-doctoral fellow in computational virology. I build computational models of viruses (like influenza A and dengue) to better understand their biophysical properties. This requires extensive use of the Python programming language to parse shapes, volumes and areas. This is accomplished by leveraging numpy and scipy to perform computational geometry calculations. I am working with the scientific Python community to improve our computational geometry capabilities--my most recent presentations were at PyData London 2015 and PyCon 2016, both of which focus on computational geometry in Python.

Room 8

Thursday, 18th May, 09:00 - 12:20

Tutorials

13:20 - 16:40

Intermediate Python Bootcamp

Matt Harrison

Are you new to Python and want to learn to step it up to the next level? Have you heard about closures, decorators, context managers, generators, list comprehensions, or generator expressions? What are these and why do advanced Pythonistas keep mentioning them? Don't be intimidated; learn to take advantage of these to make your own code more idiomatic. This hands-on tutorial will cover these intermediate subjects in detail. We will modify existing Python code to take advantage of them. We will start with a basic file, and then introduce these features into it using the REPL, command line, and tests. The audience will get to follow along using their own computer and editor of choice (or can use IDLE as the instructor). We will teach the 'code smells' to look for. You will know when you should apply these new techniques to your code.
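For readers who have not met these features yet, the following sketch shows one small example each of a closure, a decorator, a context manager, and a generator expression. It is not drawn from the tutorial's materials; every name in it (`make_counter`, `logged`, `tag`, `squares`) is made up for illustration.

```python
import functools
from contextlib import contextmanager


def make_counter():
    """Closure: the inner function remembers `count` between calls."""
    count = 0

    def increment():
        nonlocal count
        count += 1
        return count

    return increment


def logged(func):
    """Decorator: wrap a function and record the name of each call."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        logged.calls.append(func.__name__)
        return func(*args, **kwargs)
    return wrapper


logged.calls = []


@contextmanager
def tag(name, out):
    """Context manager: run setup, yield to the block, always run teardown."""
    out.append(f"<{name}>")
    try:
        yield
    finally:
        out.append(f"</{name}>")


@logged
def squares(limit):
    """Return a generator expression: values are produced lazily."""
    return (n * n for n in range(limit))


counter = make_counter()
counter()
second = counter()

out = []
with tag("b", out):
    out.append("hi")

first_squares = list(squares(4))
```

Each construct replaces a common piece of boilerplate: module-level state (closure), repeated wrapper code (decorator), paired setup and cleanup calls (context manager), and building full lists that are only iterated once (generator expression).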

Matt Harrison

Room 3

Thursday, 18th May, 13:20 - 16:40

Tutorials

13:20 - 16:40

Introduction to Digital Signal Processing

Allen Downey

Spectral analysis is an important and useful technique in many areas of science and engineering, and the Fast Fourier Transform is one of the most important algorithms, but the fundamental ideas of signal processing are not as widely known as they should be. Fortunately, Python provides an accessible and enjoyable way to get started. In this tutorial, I present material from my book, Think DSP, and from a class I teach at Olin College. We will work with audio signals, including music and other recorded sounds, and visualize their spectrums and spectrograms. We will synthesize simple sounds and learn about harmonic structure, chirps, filtering, and convolution.

Allen Downey

Room 5

Thursday, 18th May, 13:20 - 16:40

Tutorials

13:20 - 16:40

Effectively running Python applications in Kubernetes/OpenShift

Maciej Szulik

Google, Red Hat, Intel, Huawei, Mirantis, Deis and many, many others are investing a lot of time and effort into improving Kubernetes. I bet you have encountered that name at least once in the past twelve months, either on Hacker News, Reddit, or somewhere else. Do you want to learn more about the best container orchestration in the universe, but were afraid of the setup complexity? Do you want to see how easy it is to run any application using containers? Do you want to experience the joy of scaling an application with a single click? This, and a lot more, will be discussed in detail. In this tutorial, every attendee will be provided with an environment, and the step-by-step instructions necessary to set up the environment and to build and deploy a microservices-based sample application. Alternatively, a sample application of your own choosing can be used throughout the entire tutorial. All that will be performed on OpenShift, which is a Red Hat distribution of Kubernetes with some add-ons that will be described in detail at the beginning of the tutorial. To whet your appetite even more, here are some of the topics we are going to cover: git integration, image registry integration, scaling applications, container security, batch tasks and much more. After the session, every person will be able to play around with the accompanying code repository that was used in the tutorial, which includes detailed instructions on how to run it on your own from scratch.

Maciej Szulik

Maciej is a passionate developer with over 10 years of experience in many languages. Currently, he is hacking on bugs.python.org and CPython's IMAP library by night, while by day he works on OpenShift and Kubernetes for Red Hat. In his spare time he organizes PyCon PL, helps review talks for PyCon, and speaks at various events and meetups around Europe.

Room 4

Thursday, 18th May, 13:20 - 16:40

Tutorials

13:20 - 16:40

IoT Lab with Micropython and Friends

Sev Leonard

Come learn about the Internet of Things and Micropython in this hands-on hardware tutorial. No soldering or hardware experience is required! We will be building a Wi-Fi-enabled temperature sensor as a vehicle for learning IoT concepts including data capture, building security into data transmission, and messaging between IoT clients and servers. Attendees will have an opportunity to take their sensors out into the conference venue to take measurements, reconvening to discuss analysis and visualization of IoT data. All the hardware needed will be provided, and attendees will be able to program the devices via a locally-hosted web interface. This tutorial will be a great introduction for folks interested in the Internet of Things, Micropython, or hardware hacking. You do not need prior experience in any of these topics to attend. We will be using the ESP8266 microcontroller and the MQTT protocol for messaging. Attendees should download the mosquitto MQTT broker in addition to collateral that will be sent out to attendees ahead of the tutorial. We will be using the WebREPL interface for programming the ESP8266. If you are wondering what the heck all of this means, do not despair! These topics will be covered in the tutorial.

Sev Leonard

Sev likes crisp fall days, ice cream, and aggregating data. He is a Python developer & sciencer of data living in Portland, OR who enjoys writing and teaching technical topics.

Room 6

Thursday, 18th May, 13:20 - 16:40

Tutorials

13:20 - 16:40

Applied Modern Cryptography in Python

Amirali Sanatinia

Today we use cryptography almost everywhere, from surfing the web over https to working remotely over ssh. Although most developers don't need to implement cryptographic primitives, the knowledge and understanding of these building blocks allows them to better deploy them in their applications. In modern crypto we have all the building blocks to develop secure applications. However, we see instances of insecure code everywhere. Most of these vulnerabilities are not because of theoretic shortcomings, but due to bad implementation or a flawed protocol design. Cryptography is a delicate art where nuances matter, and failure to comprehend the subtleties of these building blocks leads to critical vulnerabilities. To add insult to injury, most of the resources available are either outdated or wrong, and inarguably, using bad crypto is more dangerous than not using it. In this tutorial we look at the basic building blocks of modern cryptography. We will cover the encryption techniques, hashing mechanisms, and key derivation algorithms. Furthermore, we review two of the most widely used protocol suites, SSL and PGP. We conclude by implementing a simplified version of Pretty Good Privacy (PGP), which is used for encryption of texts, e-mails, files, directories, and whole disk partitions.
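Two of the building blocks named above, hashing and key derivation, plus keyed message authentication, are available in Python's standard library. The sketch below is illustrative only and is not the tutorial's code: the password, message, and iteration count are placeholder assumptions, and encryption is omitted because the standard library ships no cipher (a vetted third-party library would be needed for that).

```python
import hashlib
import hmac
import secrets

password = b"correct horse battery staple"  # example input only
salt = secrets.token_bytes(16)              # random per-password salt

# Key derivation: stretch a low-entropy password into a 32-byte key.
# The iteration count here is an illustrative assumption, not a recommendation.
key = hashlib.pbkdf2_hmac("sha256", password, salt, 200_000)

# Hashing: a fixed-length fingerprint of arbitrary data.
message = b"meet at noon"
digest = hashlib.sha256(message).hexdigest()

# Message authentication: a keyed hash that detects tampering.
mac = hmac.new(key, message, hashlib.sha256).digest()


def verify(key, message, mac):
    """Recompute the MAC and compare in constant time."""
    expected = hmac.new(key, message, hashlib.sha256).digest()
    return hmac.compare_digest(expected, mac)
```

Using `hmac.compare_digest` instead of `==` is an example of the nuance the abstract refers to: an ordinary comparison can leak timing information about how many leading bytes matched.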

Amirali Sanatinia

Amirali Sanatinia is a Computer Science PhD candidate at Northeastern and holds a Bachelor's degree in CS from St Andrews University. His research focuses on cyber security and privacy, and has been covered by venues such as MIT Technology Review, ACM Tech News, Ars Technica, Threatpost, etc. He is an RSAC Security Scholar. He has spoken at security conferences such as DEF CON, Crypto Village, Virus Bulletin, and BSides Boston.

Room 9

Thursday, 18th May, 13:20 - 16:40

Tutorials

13:20 - 16:40

Django Admin: Basics and Beyond

Kenneth Love

Django's admin is a great tool but it isn't always the easiest or friendliest to set up and customize. The ModelAdmin class has a lot of attributes and methods to understand and come to grips with. On top of these attributes, the admin's inlines, custom actions, custom media, and more mean that, really, you can do anything you need with the admin...if you can figure out how. The docs are good but leave a lot to experimentation and the code is notoriously dense. In this tutorial, you'll learn the basics of setting up the admin so you can get your job done. Then we'll dive deeper and see how advanced features like autocomplete, Markdown editors, image editors, and others would be added to make the admin really shine.

Kenneth Love

Besides teaching Python at Treehouse, Kenneth Love is a husband & father. He created a few Python libraries (django-braces being the most popular), worked at most levels of the web stack, & was the Creative Director of a newspaper. He likes board games, activism, & dry humor.

Room 2

Thursday, 18th May, 13:20 - 16:40

Tutorials

13:20 - 16:40

Best Testing Practices for Data Science

Eric J. Ma

So you're a data scientist wrangling with data that's continually avalanching in, and there are always errors cropping up: NaNs, strings where there are supposed to be integers, and more. Moreover, your team is writing code that is getting reused, but that code is failing in mysterious places. How do you solve this? Testing is the answer! In this tutorial, you will gain practical hands-on experience writing tests in a data science setting so that you can continually ensure the integrity of your code and data. You will learn how to use py.test, coverage.py, and hypothesis to write better tests for your code.

Eric J. Ma

Room 8

Thursday, 18th May, 13:20 - 16:40

Tutorials

13:20 - 16:40

Parallel Data Analysis

Ben Zaitlen, Matthew Rocklin, Min Ragan-Kelley

An overview of parallel computing techniques available from Python and hands-on experience with a variety of frameworks. This course has two primary goals: 1. Teach students how to reason about parallel computing 2. Provide hands-on experience with a variety of different parallel computing frameworks. Students will walk away with both a high-level understanding of parallel problems and how to select and use an appropriate parallel computing framework for their problem. They will get hands-on experience using tools both on their personal laptop, and on a cluster environment that will be provided for them at the tutorial. For the first half we cover programming patterns for parallelism found across many tools, notably map, futures, and big-data collections. We investigate these common APIs by diving into a sequence of examples that require increasingly complex tools. We learn the benefits and costs of each API and the sorts of problems where each is appropriate. For the second half, we focus on the performance aspects of frameworks and give intuition on how to pick the right tool for the job. This includes common challenges in parallel analysis, such as communication costs, debugging parallel code, as well as deployment and setup strategies.
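Two of the patterns named above, map and futures, can be tried with nothing but the standard library. The sketch below is not from the course materials; `word_count` and `docs` are stand-ins, and a thread pool is used only to keep the example self-contained (for CPU-bound work in CPython, `ProcessPoolExecutor` offers the same interface).

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def word_count(text):
    """Stand-in for real per-item work."""
    return len(text.split())


docs = ["a b c", "d e", "f g h i"]

with ThreadPoolExecutor(max_workers=3) as pool:
    # map: apply one function to many inputs; results keep input order.
    mapped = list(pool.map(word_count, docs))

    # futures: submit tasks individually and collect them as they finish.
    futures = {pool.submit(word_count, d): d for d in docs}
    completed = {futures[f]: f.result() for f in as_completed(futures)}

print(mapped)
print(completed)
```

`map` is the simpler API when every task is the same shape, while futures allow results to be handled in completion order and tasks to be submitted dynamically, at the cost of more bookkeeping.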

Ben Zaitlen

Ben is a data scientist and developer at Continuum Analytics. He has several years of experience with Python and is passionate about any and all forms of data. Part of his duties at Continuum includes exploring a vast array of data (social networks, climate, astronomy, biology, finance, etc.).

Matthew Rocklin

Matthew is a full-time open source developer at Continuum Analytics where he builds Python tools for parallel data analysis.

Min Ragan-Kelley

Room 7

Thursday, 18th May, 13:20 - 16:40

Tutorials

13:20 - 16:40

Python Epiphanies

Stuart Williams

This tutorial is for developers who've been using Python for a while and would consider themselves at an intermediate level, but are looking for a deeper understanding of the language. It focuses on how Python differs from other languages in subtle but important ways that are often confusing, and it demystifies a number of language features that are sometimes misunderstood.

Stuart Williams

Stuart Williams lives in Winnipeg. He has taught Python at University, PyCon, and various companies; and to user groups, high school students, and elementary schoolchildren. His loves include biking in sun or snow, sailing, his spouse and three children, and teaching Python.

Room 1

Thursday, 18th May, 13:20 - 16:40

Tutorials

Sponsor tutorials

09:00 - 10:30

Intel: Accelerating Python across the range of applications: the right tools for the job

David Liu

Python's popularity has led to its use in many areas--from web frameworks all the way to machine learning and scientific computing. However, getting the best performance from Python requires an intimate knowledge of the right tools and techniques that are available today. In this tutorial, participants will learn how to measure, tune and accelerate Python workflows across various domains. This tutorial will cover the following topics:
- Performance speedups for scientific computing using Intel® Distribution for Python, multithreading with Intel® Threading Building Blocks library, Numba, and Intel® VTune Amplifier
- Data Analytics and machine learning acceleration with pyDAAL
- Web framework, scripting, and infrastructure acceleration using the PyPy JIT

David Liu

David is a Technical Consultant Engineer at Intel Corporation in Austin, TX, where he represents Intel's Python products and projects. He is focused on solving customer problems in Python while simultaneously developing and shaping Intel's software products to match customer needs. In the past, he worked as a software engineer utilizing Python in machine learning, network infrastructure, and web work. David holds an MS in Software Engineering from the University of Texas at Austin.

Room B110-111

Thursday, 18th May, 09:00 - 10:30

Sponsor tutorials

11:00 - 12:30

Anaconda: Data Science Apps with Anaconda

Ian Stokes-Rees

Anaconda provides a rich foundation of Python and R packages for data science. This tutorial will demonstrate how Anaconda can be used to turn simple models, scripts, or Jupyter notebooks into deployable applications. Participants should have Anaconda installed and have basic Python programming experience. We'll make use of machine learning and AI libraries such as Pandas, Scikit-learn, Tensorflow, and Keras. The tutorial will also demonstrate the app deployment capabilities of Anaconda Cloud.

Ian Stokes-Rees

Ian is a computational scientist and engineer at Continuum Analytics. He loves Python, and finding great ways to use it to solve big hairy problems in scientific computing, data analysis, and visualization. Ian helped develop a Python-based computational infrastructure for the CERN LHCb experiment during his PhD at Oxford, and followed that with work on distributed MC option pricing algorithms while a postdoctoral researcher at INRIA (France). Prior to joining Continuum, Ian spent 5 years at Harvard, first developing a science gateway for computational biology (in Python, of course), and then as lecturer in the School of Engineering.

Room B118-119

Thursday, 18th May, 11:00 - 12:30

Sponsor tutorials

11:00 - 12:30

Datadog: Distributed Tracing for Python

Aaditya Talwai

Tracing is a specialized form of logging that is designed to work effectively in large, distributed environments. When done right, tracing follows the path of a request across process and service boundaries. This provides a big step-up in application observability, and can help inform a developer why certain requests are slow, or why they might have behaved unexpectedly. This tutorial will familiarize users with the benefits of tracing, and describe a general toolkit for emitting traces from Python applications in a minimally intrusive way. We will walk through a simple example app, which receives an HTTP request, and gradually instrument it to be observable via traces. We will discuss language constructs that can generate traces - namely decorators, monkey-patching and context managers - and give users hints on how they might add tracing to their own applications and libraries. In the process users will become familiar with the existing standards for modelling traces, and some of the challenges involved in adhering to this model in a distributed, asynchronous environment.
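To make the constructs named in the abstract concrete, here is a minimal sketch of emitting traces with a context manager and a decorator. It is not Datadog's library: the names `span`, `traced`, and `FINISHED_SPANS` are hypothetical, the example is single-threaded, and finished spans are collected in a list instead of being sent to a tracing backend.

```python
import contextlib
import functools
import time

# Hypothetical in-memory sink; a real tracer would ship spans to a backend.
FINISHED_SPANS = []
_open_spans = []  # names of the spans currently open (single-threaded sketch)


@contextlib.contextmanager
def span(name):
    """Time a block of code and record it together with its parent span."""
    parent = _open_spans[-1] if _open_spans else None
    _open_spans.append(name)
    start = time.perf_counter()
    try:
        yield
    finally:
        duration = time.perf_counter() - start
        _open_spans.pop()
        FINISHED_SPANS.append(
            {"name": name, "parent": parent, "duration": duration}
        )


def traced(func):
    """Decorator that wraps every call to func in a span named after it."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        with span(func.__name__):
            return func(*args, **kwargs)
    return wrapper


@traced
def query_database():
    time.sleep(0.01)  # stand-in for real work
    return ["row"]


@traced
def handle_request():
    return query_database()


handle_request()
for finished in FINISHED_SPANS:
    print(finished["name"], "parent:", finished["parent"])
```

The inner span finishes first, so it is recorded before the request span that contains it. Carrying the parent relationship across process boundaries, which the abstract describes as the hard part, is outside the scope of this sketch.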

Aaditya Talwai

Aaditya Talwai works on large-scale monitoring systems at Datadog. He's enthusiastic about server and application observability, and curious about tools that can give a fresh view into dynamic infrastructure. Over the past year he's been working on open-source libraries and agents to absorb application transaction traces into Datadog's backend. He has also given talks on monitoring and observability in the context of the OpenStack cloud framework.

Room B110-111

Thursday, 18th May, 11:00 - 12:30

Sponsor tutorials

13:30 - 15:00

Anaconda: Accelerating your Python Data Science code with Dask and Numba

Ian Stokes-Rees

Anyone doing numerical computing with Python will have run into performance barriers. Using Anaconda is a great start to get a suite of extension packages where the underlying data structures and algorithms are written in C or Fortran. We'll briefly review the state of numerical computing in Python, look at some examples to help you remember why you should use NumPy based packages whenever possible, and focus on two options for acceleration: faster serial computing or parallelization. Continuum Analytics has developed two popular open source packages to address these issues: Numba, which provides an LLVM-based JIT that can be easily accessed just through a decorator; and Dask, which provides a distributed computing framework and some high quality data structures that are similar to a Pandas DataFrame or a NumPy ndarray. Participants should have the latest release of Anaconda installed and have some familiarity with Python in order to follow along interactively with the tutorial where we'll learn how to efficiently leverage Dask and Numba.

Ian Stokes-Rees

Room B118-119

Thursday, 18th May, 13:30 - 15:00

Sponsor tutorials

15:30 - 17:00

Elasticsearch and Python: Doing more with open source tools

Honza Kral

Elasticsearch, a distributed, RESTful search and analytics engine, has a wide variety of capabilities that can be used from Python. In this workshop we will explore several different use cases and showcase how the associated Python libraries can be used to help you. It is intended for intermediate users who have basic familiarity with Elasticsearch and want to further their understanding. Some of the topics that will be covered are:
* bulk loading data into Elasticsearch
* how to efficiently build queries and aggregations
* using Elasticsearch for persistence in your application
* syncing data in Elasticsearch and other data stores

Attendees will leave the workshop with the skills to integrate Elasticsearch into their apps and additional tools for analyzing data with Python.

Honza Kral

Honza is a Python programmer and Django core developer – since he is scared of the bright and shiny world of browsers, designers, and users he prefers to stay buried deep in the infrastructure code and just provides others with tools to do the actual site-building. Since 2008 Honza has been building content web sites for fun and profit. During this time he discovered Elasticsearch, which led to him joining the company behind it in 2013 to work on the Python drivers.

Room B110-111

Thursday, 18th May, 15:30 - 17:00

Sponsor tutorials

15:30 - 17:00

Google: How I Learned to Stop Worrying and Love Python at Google | Tech Talk Series

Jasmine Hsu

Jasmine Hsu is a software engineer with the Google Brain team, focusing on applying deep learning research to robotics. Her recent work was in imitation learning, and she is now working on robotic grasping and manipulation. Prior to Google, she worked for several years in the defense industry on natural language related projects. Feel free to ping her at @hellojas or find her camping out at random coffee shops.

Room B110-111

Thursday, 18th May, 15:30 - 17:00

Sponsor tutorials

15:30 - 17:00

Metis: Natural Language Processing and Text Mining in Python

Michael Galvin

This session will cover NLP and text mining using Python and offer several examples of real world applications. Participants will be introduced to various text processing techniques and learn about text classification, clustering, and topic modeling. By the end of the workshop, participants will be able to use Python to explore and build their own models on text data.

Michael Galvin

Michael Galvin is the Executive Director of Data Science at Metis. He came to Metis from General Electric where he worked to establish their data science strategy and capabilities for field services and to build solutions supporting Global operations, risk, engineering, sales, and marketing. Prior to GE, Michael spent several years as a data scientist working on problems in credit modeling at Kabbage and corporate travel and procurement at TRX. Michael holds a Bachelor's degree in Mathematics and a Master's degree in Computational Science and Engineering from the Georgia Institute of Technology where he also spent 3 years working on machine learning research problems related to computational biology and bioinformatics. Additionally, Michael spent 12 years in the United States Marine Corps where he held various leadership roles within aviation, logistics, and training units.

Room B118-119

Thursday, 18th May, 15:30 - 17:00

Sponsor tutorials

Summit

09:00 - 17:00

Python Education Summit

Al Sweigart, Luciano Ramalho

8 hour session

In 2017, PyCon will be holding its fifth annual Python Education Summit. The Summit is a gathering of teachers and educators focused on bringing coding literacy, through Python, to as broad a group of audiences as possible. We invite educators from all venues to consider joining the discussion, share insights, learn new techniques and tools and generally share their passion for education. We are looking for educators from many venues: authors; schools, colleges, universities; community-based workshops; online programs; and government. Not only will we have a wide array of full-blown talks, we will also have a round of lightning talks!

Al Sweigart

Al Sweigart is a software developer and the author of _Automate the Boring Stuff with Python_, _Invent Your Own Computer Games with Python_, _Making Games with Python & Pygame_, and _Hacking Secret Ciphers with Python_. These books are freely available under a Creative Commons license at http://inventwithpython.com. Al enjoys haunting coffee shops, writing educational materials, cat whispering, and making useful software. He lives in San Francisco.

Luciano Ramalho

Room A105-106

Thursday, 18th May, 09:00 - 17:00

Summit

09:00 - 17:00

Python Education Summit

2 hour session

Join us for a 5k run along the beautiful Willamette River in Portland, OR.

Room A105-106

Thursday, 18th May, 09:00 - 17:00

Summit

Friday, 19th May

Talks

10:50 - 11:20

In-Memory Event Resequencing: Realistic Testing For Impossible Bugs

Michele Pratusevich

As we all know, we should write testable code, and automated tests. But as we also know, no test plan survives contact with the real world. Complex, distributed systems fail in complex, distributed ways, and even the simplest web app today is a complex distributed system. So, as our code accrues little fixes to bugs that only show up in production, our test suites eventually either become slow integration testing monstrosities that are 'realistic' but flaky and unreliable, or useless piles of mocks which are fast and deterministic but don't give you confidence. In this talk, we'll explore how to leverage event-driven programming, or 'async I/O', to structure code in such a way that its tests are fast, realistic, and reliable, even in the face of horrible race-conditions you only discover in production.

Michele Pratusevich

Portland Ballroom 252-253

Friday, 19th May, 10:50 - 11:20

Talks

10:50 - 11:20

Requests Under The Hood

Cory Benfield

Requests is widely acknowledged as a library that saves users an enormous amount of time, effort, and pain through its intuitive and clear API. For this reason, most people who have never looked at the code assume that its code is as intuitive, well-structured, and clear as the API. Of course, the truth is more complex than that. Real software that deals with real problems is rarely ideal: there are edge cases, terrible hacks, and awkward workarounds for problems. Often in the software industry we pretend that these imperfections in our software don’t exist, or we try to hide them. These imperfections frequently cause people to reinvent wheels in order to simplify the code, which has benefits for understandability but frequently has downsides for resilience. When people talk about “battle-tested” code, they mean code that has been dirtied up over time from its original Platonic ideal implementation to something that is just as complex and warty as real life. In this talk, one of the Requests and urllib3 core maintainers lays bare all of the worst and hackiest corners of the codebases of these two libraries. The goal is to help expose all of the invisible work done in mature codebases to tolerate edge cases and misbehaviour, as well as to try to remind us all that the perfect is the enemy of the good.

Cory Benfield

Cory is an open source Python developer heavily involved in the Python HTTP community. He's a Requests core contributor, a urllib3 core contributor, and the lead maintainer of the Hyper Project, a collection of HTTP and HTTP/2 tools for Python. For his sins, he also helps out with the Python Cryptographic Authority on PyOpenSSL. When he's not feeling perplexed about his life choices, Cory loves computer networks, film and TV, and travel.

Oregon Ballroom 201-202

Friday, 19th May, 10:50 - 11:20

Talks

10:50 - 11:20

Optimizations which made Python 3.6 faster than Python 3.5

Victor Stinner

Various optimizations made Python 3.6 faster than Python 3.5. Let's see in detail what was done and how. Python 3.6 is faster than any other Python version on many benchmarks. We will see results of the Python benchmark suite on Python 2.7, 3.5 and 3.6. The bytecode format and instructions to call functions were redesigned to run bytecode faster. A new C calling convention, called 'fast call', was introduced to avoid temporary tuples and dicts. The way Python parses arguments was also optimized using a new internal cache. Operations on bytes and encodings like UTF-8 were optimized a lot thanks to a new API to create bytes objects. The API allows very efficient optimizations and reduces memory reallocations. Some parts of asyncio were rewritten in C to speed up code by up to 25%. The PyMem_Malloc() function now also uses the fast pymalloc allocator, giving a tiny speedup for free. Finally, we will see optimization projects for Python 3.7: use fast calls in more cases, speed up method calls, a cache on opcodes, a cache on global variables.

Victor Stinner

Python core developer since 2010, I'm the author of various Python applications and libraries. See my profile on Bitbucket (https://bitbucket.org/haypo/) and GitHub (http://github.com/haypo/). I'm working on OpenStack for Red Hat from France.

Portland Ballroom 251 & 258

Friday, 19th May, 10:50 - 11:20

Talks

10:50 - 11:20

Experiment Assignment on the Web

Jessica Stringham

A popular way of improving websites is to run experiments on them. We split users into groups, show two or more variations of the site, measure how well each one does, and then show the best version to everyone. In this talk, I'll walk through a toy Python program that does the first step: splits users into groups. A few interesting problems arise: grouping users, whitelists, and scaling. I'll share different ways to address them. I'll also give examples of things that can go terribly wrong when designing experiment assignment code.
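As a rough illustration of the first step described above, the sketch below splits users into groups by hashing their ID. It is not the speaker's program: the function `assign_group`, the `WHITELIST` mapping, and the experiment and group names are all hypothetical, and hashing is only one of several possible assignment strategies.

```python
import hashlib

# Hypothetical whitelist: user IDs forced into a specific group (e.g. for QA).
WHITELIST = {"qa-user-1": "treatment"}


def assign_group(user_id, experiment, groups=("control", "treatment")):
    """Deterministically map a user to a group for a given experiment."""
    if user_id in WHITELIST:
        return WHITELIST[user_id]
    # Hash the experiment name together with the user ID so that different
    # experiments split the same set of users independently of each other.
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    return groups[int(digest, 16) % len(groups)]


print(assign_group("user-42", "new-homepage"))
print(assign_group("qa-user-1", "new-homepage"))
```

Because the assignment depends only on the inputs, a user sees the same variation on every visit without any stored state, which is one way to keep the scheme simple as traffic grows.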

Jessica Stringham

Jessica is a software engineer at Yelp. At Yelp, she works on the experimentation infrastructure, and previously worked on mobile app analytics and the activity feed. She sometimes blogs at jessicastringham.com.

Oregon Ballroom 203–204

Friday, 19th May, 10:50 - 11:20

Talks

10:50 - 11:20

Title Available On Request: An Introduction to Lazy Evaluation

Joe Jevnik

Lazy evaluation, also known as 'call by need', is an evaluation strategy where values are produced only when needed. Lazy evaluation is the opposite of eager evaluation, Python's normal evaluation model, where functions are executed as seen and values are produced immediately. In this talk we will define lazy evaluation and contrast it with eager evaluation. We will discuss tools that exist in Python for using lazy evaluation and show how we can build on the primitives to better represent computations. We will introduce common vocabulary for discussing evaluation models, and compare different systems for implementing lazy evaluation. Finally, we will discuss optimizations that can be applied to lazily evaluated expressions.
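The contrast between the two models can be sketched with generators, one of the lazy tools built into Python. The function names here are hypothetical, and the sketch does not cover caching of already computed values, which a full call-by-need system would also provide.

```python
import itertools


def squares_eager(n):
    """Eager: build the whole list before returning anything."""
    return [i * i for i in range(n)]


def squares_lazy():
    """Lazy: a generator computes each value only when it is requested."""
    for i in itertools.count():
        yield i * i


# The lazy version can describe an infinite sequence; only the five
# values actually requested are ever computed.
first_five = list(itertools.islice(squares_lazy(), 5))
print(first_five)  # [0, 1, 4, 9, 16]
```

The eager function must be told how many values to produce up front, while the lazy one leaves that decision to the consumer.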

Joe Jevnik

Portland Ballroom 254-255

Friday, 19th May, 10:50 - 11:20

Talks

11:30 - 12:00

Big picture software testing: unit testing, Lean Startup, and everything in-between

Itamar Turner-Trauring

There are many ways you can test your software: unit testing, manual testing, end-to-end testing, and so forth. Take a step back and you'll discover even more forms of testing, many of them very different in their goals: A/B testing, say, where you see which of two versions of your website results in more signups or ad clicks. How do these forms of testing differ, how do they relate to each other? How do you choose which kind of testing to pursue, given limited time and resources? How do you deal with strongly held yet opposite views arguing either that a particular kind of testing is essential or that it's a waste of time? This talk will provide you with a model, a way to organize all forms of testing and understand what exactly they provide, and why. Once you understand the model you will be able to choose the right form of testing for your situation and goals.

Itamar Turner-Trauring

Itamar Turner-Trauring has been attending Python conferences since 2000. Currently he works at Datawire, building infrastructure for microservices. You can read his blog at https://codewithoutrules.com.

Portland Ballroom 252–253

Friday, 19th May, 11:30 - 12:00

Talks

11:30 - 12:00

Type uWSGI; press enter; what happens?

Asheesh Laroia, Philip James

You're a pretty knowledgeable Python web application developer, but how does that web application get served to the world? For many of us, uWSGI is the magic that makes our application available, and in this talk we'll look at how uWSGI works with the OS and the networking stack to make the magic happen.

Asheesh Laroia

By day, Asheesh Laroia is a member of the technical staff at Sandstorm in Palo Alto. By night, he is volunteer President at OpenHatch, helping create workshops that teach students how to get involved in open source. His technical background touches machine learning, security, and linguistics. He helped start the Boston Python Workshop for women and their friends, has been teaching Python to newcomers since 2004, including at Noisebridge and the EFF, and has advised user groups on how to make their events more newcomer-friendly and gender-diverse.

Philip James

Philip James is a Software Engineer at Stripe, a Civic Technologist and a Director for the City of Alameda Democratic Club and the East Bay Young Democrats. In his free time he builds ContactOtter, a smart address book you can share.

Oregon Ballroom 203-204

Friday, 19th May, 11:30 - 12:00

Talks

11:30 - 12:00

Passing Exceptions 101: Paradigms in Error Handling

Amandine Lee

Exception handling in Python can sometimes feel like a Wild West. If you have a send_email function, and the caller inputs an invalid email address, should it: A) Return None or some other special return value, B) Let the underlying exception it might cause bubble up, C) Check via a regex and type checking and raise a ValueError immediately, or D) Make a custom EmailException subclass and raise that? What if there is a network error while the email was sending? Or what if the function calls a helper _format_email that returns an integer (clearly wrong!), or raises a TypeError itself? Should it crash the program or prompt a retry? This talk will introduce the concept of an exception, explain the built-in Python exception hierarchy and the utility of custom subclasses, demonstrate try/except/finally/else syntax, and then explore different design patterns for exception control flow and their tradeoffs using examples. It will also make comparisons to error handling philosophy in other languages, like Eiffel and Go.
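Options C and D from the list above, together with the try/except/else/finally syntax, can be sketched as follows. This is an illustration, not the speaker's code: the class names `EmailError` and `InvalidAddressError`, the `notify` helper, and the simplistic `"@"` check are assumptions made for the example.

```python
class EmailError(Exception):
    """Base class for errors raised by this (hypothetical) email module."""


class InvalidAddressError(EmailError, ValueError):
    """The address failed validation before any network call was made."""


def send_email(address, body):
    # Deliberately simplistic check, standing in for real validation.
    if "@" not in address:
        raise InvalidAddressError(f"not an email address: {address!r}")
    return f"sent {len(body)} bytes to {address}"


def notify(address):
    try:
        result = send_email(address, "hello")
    except InvalidAddressError as exc:
        # Runs only if the try block raised this exception type.
        return f"skipped: {exc}"
    else:
        # Runs only if the try block raised nothing.
        return result
    finally:
        # Always runs, even though both branches above return.
        print("notify() finished for", address)


print(notify("user@example.com"))
print(notify("not-an-address"))
```

Deriving the custom class from both the module's base class and `ValueError` lets callers catch it either way, which is one possible compromise between options C and D.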

Amandine Lee

I was trained in physics and statistics, and began programming by learning MATLAB and R for modeling in research. A couple of years and a few computer science classes later, I attended the Recurse Center, where I decided to learn Python to have a versatile and concise home programming language with which to learn about CS theory and software engineering. I now work at Dropbox, on the Desktop Release Engineering team, where Python is a part of my daily life.

Oregon Ballroom 201–202

Friday, 19th May, 11:30 - 12:00

Talks

11:30 - 12:00

Immutable Programming - Writing Functional Python

Calen Pennington

The world of Haskell and functional programming may seem like a distant place to many working Python developers, but some of the techniques used there are remarkably useful when developing in Python. In this talk, I will cover some of the pitfalls of mutability that you may run into while writing Python programs, and some tools and techniques that Python has built in that will let you avoid them. You'll see namedtuples, enums and properties, and also some patterns for structuring immutable programs that will make them easier to build, extend, and test.
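As a small illustration of one of the tools mentioned, the sketch below uses a namedtuple as an immutable record. The `Point` type and the `move_right` function are hypothetical examples, not taken from the talk.

```python
from collections import namedtuple

Point = namedtuple("Point", ["x", "y"])


def move_right(point, dx):
    """Return a new Point; the original is left untouched."""
    return point._replace(x=point.x + dx)


start = Point(x=1, y=2)
moved = move_right(start, 3)
print(start)  # Point(x=1, y=2)
print(moved)  # Point(x=4, y=2)

try:
    start.x = 10  # namedtuple fields cannot be reassigned
except AttributeError as exc:
    print("mutation rejected:", exc)
```

Since `move_right` cannot change its argument, any code still holding `start` keeps seeing the value it was given, which makes such functions easy to test in isolation.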

Calen Pennington

Cale is a lead architect at edX.org. Python engineer by day, Haskell hacker by night, he loves to use just the right amount of magic to clarify his designs. When not programming, he plays boardgames and parents a very active toddler.

Portland Ballroom 254–255

Friday, 19th May, 11:30 - 12:00

Talks

11:30 - 12:00

Prehistoric Patterns in Python

Lennart Regebro

Why does some code use dictionaries that have None for all values? Is it true that you shouldn't concatenate strings with +? Will Python optimize constant calculations? This talk will go through some patterns that used to be common in Python, but which now are regarded as outdated, and see whether they really are outdated and why. The results surprised me; maybe they'll surprise you.

Lennart Regebro

Lennart Regebro has been using Python so long that he has forgotten all other programming languages. He was born in Sweden, works at Boston company Shoobx and lives in Poland, with his wife, daughter, two cats and a quince tree. His book Supporting Python 3 is available for free online.

Portland Ballroom 251 & 258

Friday, 19th May, 11:30 - 12:00

Talks

12:10 - 12:55

Introduction to Threat Modeling

Ying Li

Are you a website or application developer? Are you worried about security? Don’t know what you need to know, and what you can safely leave to the experts? Come learn about how to analyze your application’s design for potential security flaws, how to think like a security engineer, and see some of the most common pitfalls that programs fall victim to. In this talk we will work through the process of threat modeling - understanding how your system might get attacked, what its weak points are, and how to defend it.

Ying Li

Software developer, mainly in Python and Go. Varied interests, including security.

Portland Ballroom 254–255

Friday, 19th May, 12:10 - 12:55

Talks

12:10 - 12:55

Grok the GIL: Write Fast And Thread-Safe Python

A. Jesse Jiryu Davis

I wrote Python for years while holding mistaken notions about the Global Interpreter Lock, and I've met others in the same boat. The GIL's effect is simply this: only one thread can execute Python code at a time, while N other threads sleep or await network I/O. Let's read CPython interpreter source and try some examples to grok the GIL, and learn to write fast and thread-safe Python.
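One consequence worth illustrating is that the GIL does not make compound operations atomic, so shared state still needs a lock. The sketch below is an assumption-laden example rather than material from the talk; the names `counter`, `lock`, and `add_many` are hypothetical.

```python
import threading

counter = 0
lock = threading.Lock()


def add_many(n):
    global counter
    for _ in range(n):
        # `counter += 1` is a read, an add, and a write; the interpreter may
        # switch threads between those steps, so the update is guarded.
        with lock:
            counter += 1


threads = [threading.Thread(target=add_many, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 40000
```

Without the lock the final count is not guaranteed, because two threads can read the same old value before either writes its result back.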

A. Jesse Jiryu Davis

Staff Engineer at MongoDB in New York City specializing in C, Python, and async. Lead developer of the MongoDB C Driver libraries libbson and libmongoc. Author of Motor, an async MongoDB driver for Tornado and asyncio. Contributor to Python, PyMongo, MongoDB, Tornado, and asyncio. Co-author with Guido van Rossum of “A Web Crawler With asyncio Coroutines”, a chapter in the “500 Lines or Less” book in the Architecture of Open Source Applications series. Blogs at emptysquare.net and for the PSF at pyfound.blogspot.com.

Oregon Ballroom 201-202

Friday, 19th May, 12:10 - 12:55

Talks

12:10 - 12:40

Web identity: OAuth2 and OpenIDConnect

Brendan McCollam

Interested in adding single sign-on to your application, but confused about the variety of different web authentication methods out there? OAuth, OAuth2, OpenID, OpenIDConnect, SAML, Facebook Connect? This talk will clarify the different protocols, examining OAuth2 and OpenIDConnect in greater detail. It will demonstrate a basic client implementation using FLOSS libraries, and briefly touch on some of the issues involved in server implementation.

Brendan McCollam

Brendan McCollam is a developer for the University of Chicago, working on the Globus project (globus.org): a data management platform for large-scale scientific data. Formerly he worked at Leapfrog Online, where he released the 'rstr' package. He is a graduate of Pomona College, where he studied neuroscience and mathematics; he was the recipient of a 2008 Watson Fellowship. Brendan lives outside London.

Oregon Ballroom 203-204

Friday, 19th May, 12:10 - 12:40

Talks

12:10 - 12:55

The Gilectomy: How's It Going?

Larry Hastings

One of the most interesting projects in Python today is Larry Hastings' 'Gilectomy' project: the removal of Python's Global Interpreter Lock, or 'GIL'. Come for an up-to-the-minute status report: what's been tried, what has and hasn't worked, and what performance is like now.

Larry Hastings

Larry has been using Python since the lateish 90s. He is a CPython core contributor, and is Release Manager for Python 3.4.

Portland Ballroom 251 & 258

Friday, 19th May, 12:10 - 12:55

Talks

12:10 - 12:40

Next Level Testing

James Saryerwinnie

Unit, functional, and integration tests are great first steps towards improving the quality of your Python project. Ever wonder if there’s even more you can do? Maybe you've heard of property-based testing, fuzzing, and mutation testing but you're unsure exactly how they can help you. In this talk we’ll cover additional types of tests that can help improve the quality and robustness of your Python projects: property-based testing, fuzz testing, stress testing, long term reliability testing, and mutation testing. We’ll also go beyond just covering what these tests are. For each of the test types above, I’ll give you real world examples from open source software that I maintain that show you the types of bugs each test type can find. I’ll also show you how you can integrate these tests into your Travis CI and/or Jenkins environment.

James Saryerwinnie

James Saryerwinnie is a Software Development Engineer at Amazon Web Services where he works on Boto3, the AWS SDK for Python, and the AWS CLI, the unified command line interface for AWS. He also maintains several open source projects including jmespath, fakeredis, and python-keepassx.

Portland Ballroom 251 & 258

Friday, 19th May, 12:10 - 12:40

Talks

13:40 - 14:25

Solid Snakes or: How to Take 5 Weeks of Vacation

Hynek Schlawack

No matter whether you run a web app, search for gravitational waves, or maintain a backup script: being responsible for a piece of software or infrastructure means that you either get a pager right away, or that you get angry calls from people affected by outages. Being paged at 4am in everyday life is bad enough. Having to fix problems from hotel rooms while your travel buddies go for brunch is even worse. And while incidents can’t be prevented completely, there are ways to make your systems more reliable and minimize the need for (your!) manual intervention. This talk will help you to get calm nights and relaxing vacations by teaching you some of them.

Hynek Schlawack

Hynek Schlawack is a lead infrastructure and software engineer from Berlin, Germany, PSF fellow, and contributor to a wide variety of open source projects including high-profile ones like Twisted and CPython. His main areas of interest are security, networks, and solid software engineering.

Portland Ballroom 252-253

Friday, 19th May, 13:40 - 14:25

Talks

13:40 - 14:25

Implementing Concurrency and Parallelism From The Ground Up

Amber Brown

When writing an application, it is common to want to do many things at once. For web servers, this means serving multiple web requests; for GUI applications, it's doing a background task whilst keeping the UI responsive. But how do we actually do that? This talk will go into how concurrency and parallelism work from the CPU, OS, and threads up, how state (data) is shared between them, and how this interacts with the functions that you, the programmer, write, and how you can write properly behaving concurrent or parallel software.

Amber Brown

Amber Brown (HawkOwl) is the Twisted Release Manager.

Oregon Ballroom 203-204

Friday, 19th May, 13:40 - 14:25

Talks

13:55 - 14:25

Packaging Let’s Encrypt: Lessons learned shipping Python code to hundreds of thousands of users

Noah Swartz

Let's Encrypt launched on April 12th 2016, for the first time allowing anyone access to free SSL certificates that could be automatically fetched and renewed. The demand was massive, and so was the need for a client to fetch these certificates for all of those users. This client is called Certbot, and it's written entirely in Python. Unfortunately for the sanity of Certbot developers, these users of Let's Encrypt can't decide on a single operating system to use! This requires us to ship our software, and all of its dependencies, to a variety of systems all with different web servers, Python versions, package managers, and underlying packages. Learn how we got through this mess!

Noah Swartz

Noah is a Staff Technologist on the Tech Projects team. He works on the various software the EFF produces and maintains, including but not limited to Privacy Badger and Certbot. Before joining EFF Noah was a researcher at the MIT Media Lab as well as a technomancer and free software/culture advocate. An avid game enthusiast, Noah has ascended in nethack four times. He lives in the Mission District of San Francisco with his family of twitterbots.

Portland Ballroom 251 & 258

Friday, 19th May, 13:55 - 14:25

Talks

13:55 - 14:25

Dial M For Mentor

Mariatta Wijaya

One of the nicest things about the Python community is the availability of mentors willing to help you. Various mentors have helped me navigate the open source community and helped advance my skills. I realized finding a mentor is not as easy as it seems, and it takes a lot of courage to reach out in the first place. And then, there is impostor syndrome, where one may feel like they don’t deserve the help. In this talk, I will provide advice about working with a mentor. Asking for help is not a failure.

Mariatta Wijaya

Mariatta is a new Python core developer. She works as a Software Engineer in Vancouver, where she also helps organize Vancouver PyLadies. She's credited in three movies: Storks, The Angry Birds Movie, and Hotel Transylvania 2.

Portland Ballroom 254-255

Friday, 19th May, 13:55 - 14:25

Talks

13:55 - 14:25

Community powered packaging: conda-forge

Filipe Pires Alvarenga Fernandes

The Python scientific community always wanted a package manager that is cross platform, does not require sudo, and lets Python be awesome! The conda package manager solved that problem, but created new ones... This talk is a tour disguised as a beginner tutorial to conda-forge packaging. We will try to discuss some myths and misconceptions about conda and conda-forge, as well as a quick comparison with pip and wheels.

Filipe Pires Alvarenga Fernandes

I am a physical oceanographer turned developer due to a failed PhD. It all started when I decided to ditch Matlab in favor of Python for my daily work, and soon I started re-writing all the oceanographic Matlab tools in Python. Thanks to that effort I got a job working for NOAA teaching/developing/promoting Python as a tool for Earth scientists.

Oregon Ballroom 201-202

Friday, 19th May, 13:55 - 14:25

Talks

14:35 - 15:05

Instagram Filters in 15 Lines of Python

Michele Pratusevich

Images tell stories, and we love Instagram filters because they give emotion to our images. Do you want to explore what makes up Instagram filters? In this talk, we will talk about the basic elements of Instagram filters and implement them in Python. The staple libraries we will use are scikit-image and numpy, with matplotlib and Jupyter notebooks for plotting and interactivity. In the end, we will implement the (now-defunct) Gotham Instagram filter in 15 lines of Python (not including imports). Throughout the process, there will be many pretty pictures.

Michele Pratusevich

Oregon Ballroom 201-202

Friday, 19th May, 14:35 - 15:05

Talks

14:35 - 15:05

Factory Automation with Python - Stories about Robots, Serial Ports, and Barcode Readers

Jonas Neubert

In industrial automation tried and tested always beats latest and greatest: The machines that make smartphones have a serial port and are configured with .csv files. But when your factory automates complex non-linear workflows and is jam-packed with sensors and robots, you are quickly faced with software engineering challenges that call for modern tools. Python turns out to be surprisingly versatile in this setting, whether for prototyping a single conveyor belt or taming a building full of robots. This talk explains how to use Python for interfacing with two common industrial automation devices: a barcode scanner and a Programmable Logic Controller (PLC). After a simple demo, you’ll hear about lessons learned using Python packages that weren’t written with robots in mind, like Celery and pandas, with robots.

Jonas Neubert

Jonas is a robotics engineer and software developer interested in automating complex processes. Among the many places where Jonas has caused downtime are: a steel processing plant, an automated genomics lab, and a cinema projection room (during a Harry Potter screening!). I'm currently making electronics prototyping faster with Tempo Automation in San Francisco. Previously, I've been building big robots at Counsyl and tiny robots at Cornell University. I've also created smartphone apps that make physical things, and a web app that 3D-prints topographical maps.

Oregon Ballroom 203-204

Friday, 19th May, 14:35 - 15:05

Talks

14:35 - 15:05

Dr. Microservices, Or How I Learned to Stop Worrying and Love the API

Ryan Anguiano

Assuming that you already know how to build a monolithic app, you must be wondering how you can use all this 'microservice' stuff that you keep hearing about. Well, a good word of advice is that you probably don't need it. If designed properly, a monolithic app should be able to scale and fit the needs of most businesses. Even so, you should keep your development as simple as possible until you have proven and solidified your business concepts. But if you do need to grow to Internet scale, then you have a long road ahead of you. Moving from a monolithic application to microservices is a natural evolution that is often born of necessity. There are several competing schools of thought that are still being battle-tested in these early days of microservice architecture. Most of the requirements are agreed upon across the competing paradigms; they differ mostly in the tools used to fulfill those requirements. This talk will cover setting up the required infrastructure, and demonstrate how to migrate a sample monolithic Django application into a microservices platform. The demo application will use the following technologies: Django, Flask, Fabric, Terraform, Ansible, CentOS, Docker, Mesos, Consul, Nginx, Pgbouncer, Kafka.

Ryan Anguiano

Ryan has been designing and implementing technological solutions in various industries for over 13 years. Over the past decade, he has been a part of helping a number of companies venture into the digital realm and implement policies that allow for vast creativity and innovation to occur. He has built many platforms that utilize state-of-the-art technology in order to gain a competitive edge and created tools that help businesses achieve their goals. Ryan currently works at RevPoint Media developing www.jangl.com

Portland Ballroom 252-253

Friday, 19th May, 14:35 - 15:05

Talks

14:35 - 15:05

Rants and Ruminations From A Job Applicant After 💯 CS Job Interviews in Silicon Valley

Susan Tan

What is it like to interview at 1 technology company? Stressful and tiring. What is it like to interview at 100 technology companies? I have done that. In late August 2016, I quit an uninspiring full-time software job and talked to 100 employers in the San Francisco Bay Area to find the best fit. The hiring process reflects the company culture and its values. Listen to my rants and ruminations of interviewing at tiny seed-stage startups to large technology companies in Silicon Valley. Learn how to reform your own hiring process to be more considerate and thoughtful. Learn how to prepare for interviews efficiently.

Susan Tan

Susan is a software engineer in San Francisco. She was a web applications engineer at Cisco via the acquisition of Piston, a cloud computing startup located in San Francisco. Prior to Piston, she was a web applications engineer at Rotten Tomatoes, the movie review website. She likes to use Python-based web frameworks. She's a core committer of a Django-based web application project at www.openhatch.org. Susan loves to drink warm cups of oolong tea while coding.

Portland Ballroom 254-255

Friday, 19th May, 14:35 - 15:05

Talks

14:35 - 15:05

Snek in the Browser

Katie McLaughlin

Python is a decades-strong language with a large community, and it has a solid foundation on the server, but it doesn't have a good user story in the browser... until now. The BeeWare project aims to bring Python natively, everywhere. Using a combination of the Batavia and Toga projects, we can develop an entirely native web experience in Python, no JavaScript required. During this talk, you will learn how the BeeWare project has built Batavia, a Python virtual machine in JavaScript, and Toga, a multi-platform native API wrapper, which together can be used to build an entire web platform in Python only.

Katie McLaughlin

Katie has worn many different hats over the years. She has previously been a software developer for many languages, systems administrator for multiple operating systems, and speaker on many different topics. She's currently an operations engineer, core contributor for the BeeWare project, DjangoCon AU organiser, and former council member of Linux Australia. When she's not changing the world, she enjoys making tapestries, cooking, and seeing just how well various application stacks handle emoji.

Room 3

Friday, 19th May, 14:35 - 15:05

Talks

15:15 - 15:45

5 ways to deploy your Python web app in 2017

Andrew T. Baker

You’ve built a fine Python web application and now you’re ready to share it with the world. But what’s the best way to deploy your app in 2017? This talk will demonstrate popular techniques for deploying Python web applications. We’ll start with a simple Flask application and expose it to the world five times over as we learn to use different tools and services available to the modern Python developer. Specific topics covered include: exposing your local dev environment with ngrok, using a Platform-as-a-Service (PaaS) like Heroku, going “serverless” with AWS Lambda, configuring your own VM with Google Compute Engine, and thinking inside the box using Docker. We’ll also briefly touch on the pros and cons of each technique to help you figure out which one is right for your app. At the end of this talk you will have a basic understanding of how each of these techniques works and you’ll be ready to try them out yourself.

Andrew T. Baker

Andrew Baker is a Developer Educator at Twilio where he’s on a mission to change the way developers learn how to use new APIs. Equal parts Python developer and educator, he loves writing well-tested code and running hands-on workshops. Andrew found his calling focusing on Twilio's documentation, where he is equally upset spotting errant semicolons in Python code or tutorial prose.

Portland Ballroom 252-253

Friday, 19th May, 15:15 - 15:45

Talks

15:15 - 16:00

I Installed Python 3.6 on Windows and I Liked It

Steve Dower

Python has a great reputation as a cross-platform language, which for many people means different varieties of Linux. But a huge number of Python users are running on Windows - a fundamentally different operating system where things do not always work the same. However, Python has always worked incredibly well across different platforms including Windows, going to great lengths to support and expose the platform without making development more complex. In this session, CPython core developer and Microsoft engineer Steve Dower will discuss some of the reasons why cross-platform support is not an accident, and how Python 3.6 makes it even easier to support both Windows and Linux.

Steve Dower

Steve is an engineer who tells people about Python and then gives them excuses to use it and great tools to use it with. He works on Python support for Visual Studio, contributes to many of Microsoft's Python libraries, and is a core contributor and Windows expert for CPython.

Portland Ballroom 251 & 258

Friday, 19th May, 15:15 - 16:00

Talks

15:15 - 14:00

Python from Space: Analyzing Open Satellite Imagery Using the Python Ecosystem

Katherine Scott

Earth imaging satellites, just like our computers, are shrinking and becoming more ubiquitous than ever before. It is now possible to obtain open satellite data on a weekly if not daily basis and for this data to be put to work, helping us better understand our planet and quickly respond to disaster situations. In this talk we will work through a Jupyter notebook that covers the satellite data ecosystem and the Python tools that can be used to sift through and analyze that data. Topics include Python tools for using OpenStreetMap data, the Geospatial Data Abstraction Library (GDAL), and OpenCV and NumPy for image processing. This talk is intended for novice and intermediate Python developers who are interested in using data science and satellite imagery for social good and fundamental scientific research.

Katherine Scott

Katherine Scott is a senior software engineer for Planet where she works on the data pipeline team making sense of millions of satellite images. Katherine was the lead developer of the SimpleCV computer vision library and co-author of the SimpleCV O'Reilly book. In her spare time she hacks together robots and teaches kids about robotics and programming.

Oregon Ballroom 201–202

Friday, 19th May, 15:15 - 14:00

Talks

15:15 - 16:00

Constructive Code Review

Erik Rose

“Your code is bad and you are bad. Have a bad day.” Too many code reviews feel like this, and it saps the enthusiasm that drives open source. Instead, let’s explore how to give reviews that are truthful but encouraging, boosting the skill level of contributors and the quality of the project. We’ll look at “tact hacks” that nudge communication in a friendly direction, antipatterns to avoid, the pesky human emotions that can tempt us into reviewing poorly, and techniques for leveling up newcomers without losing all your coding time.

Erik Rose

Erik Rose coordinates the impact of 108 spring-loaded buttons at Mozilla, venting a byproduct of static analysis, search, and pattern-finding software. By day, he leads the development of DXR, a comprehension tool for large codebases; by night, a bevy of eclectic Python libraries. Skeletons in his closet include the self-bootstrapping mechanism for Let's Encrypt, the hash-verification functionality in pip, and a book about Plone. He is a frequent PyCon speaker and has keynoted PyCon Argentina and Bangalore's The Fifth Elephant.

Portland Ballroom 254-255

Friday, 19th May, 15:15 - 16:00

Talks

15:15 - 15:45

What's in your pip toolbox?

Jon Banafato

pip is a great tool, but dependency management doesn't stop there. I'll explore several tools that work with pip to make managing your dependencies easier, faster, and safer. I'll cover generating dependencies a better way, maintaining your requirements.txt for the long-term, and exploring existing Python environments. Afterward, you'll never want to pip freeze > requirements.txt again.

Jon Banafato

Jon Banafato is a Python developer and event organizer living in NYC. He would love to see you attend PyGotham.

Oregon Ballroom 203-204

Friday, 19th May, 15:15 - 15:45

Talks

15:15 - 16:00

Ending Py2/Py3 compatibility in a user friendly manner

Matthias Bussonnier, Min Ragan-Kelley

'Four shalt thou not count, neither count thou two, excepting that thou then proceed to three.' Monty Python and the Holy Grail, Scene 33. Python 3 has been around for more than eight years, and much of the Python ecosystem is now available both on Python 2 and Python 3, often using a single code base. Nonetheless, this compatibility comes at a development cost, and some library authors are considering ending support for Python 2. These once-Python-2-compatible libraries are at risk of being upgraded on incompatible systems, causing user (and developer) frustration. While it may seem simple to cease support for Python 2, the challenge is not in ending support, but in doing so in a way that does not wreak havoc for users who stay on Python 2. And that is not only a communications problem, but a technical one: up until recently, it was impossible to tag a release as Python 3 only; today it is possible. Like any maintainer of a widely used library, we want to ensure that users who continue to use Python 2 continue to have functioning libraries, even after development proceeds in a way that does not support Python 2. One approach is to ensure easy installation of older versions and, if possible, avoid incompatible versions altogether. Users should not need to manually pin maximal version dependencies across their development environments and projects if all they want is to use the latest versions of libraries that are compatible with their system. Even if we did expect that of users, consider what would happen when a package they rely on converts to be only Python 3 compatible. If they were not tracking the complete dependency tree, they might discover, on upgrade, that their projects no longer work. To avert this they would need to pin those packages at the last version compatible with Python 2. Users who want to use older Python versions should not have to go through so much anguish to do so.
In order to solve this problem, and thereby make both users' and maintainers' lives easier, we ventured into the rabbit-hole called Packaging. Though we set off with a singular quest, our tale roves through many lands. We'll narrate the story of our amending PEPs, our efforts in building the ramparts of the pypa/Warehouse Castle, battles with the dragons of Pip, and errands in the 'land of no unit tests', otherwise known as PyPI legacy. By the end of the above tale, audience members will know that the road to Python-3-only libraries once had hazards that are now easily avoidable, so long as users upgrade their package management tools.

Matthias Bussonnier

Min Ragan-Kelley

Portland Ballroom 251 & 258

Friday, 19th May, 15:15 - 16:00

Talks

16:15 - 19:00

A gentle introduction to deep learning with TensorFlow

Michelle Fullwood

Deep learning's explosion of spectacular results over the past few years may make it appear esoteric and daunting, but in reality, if you are familiar with traditional machine learning, you're more than ready to start exploring deep learning. This talk aims to gently bridge the divide by demonstrating how deep learning operates on core machine learning concepts and getting attendees started coding deep neural networks using Google's TensorFlow library.

Michelle Fullwood

I dissect words for science at MIT.

Oregon Ballroom 203-204

Friday, 19th May, 16:15 - 19:00

Talks

16:15 - 17:00

Dask: A Pythonic Distributed Data Science Framework

Matthew Rocklin

Dask is a general purpose parallel computing system capable of Celery-like task scheduling, Spark-like big data computing, and Numpy/Pandas/Scikit-learn level complex algorithms, written in Pure Python. Dask has been adopted by the PyData community as a Big Data solution. This talk focuses on the distributed task scheduler that powers Dask when running on a cluster. We'll focus on how we built a Big Data computing system using the Python networking stack (Tornado/AsyncIO) in service of its data science stack (NumPy/Pandas/Scikit Learn). Additionally we'll talk about the challenges of effective task scheduling in a data science context (data locality, resilience, load balancing) and how we manage this dynamically with aggressive measurement and dynamic scheduling heuristics.

Matthew Rocklin

Matthew is a full time open source developer at Continuum Analytics where he builds Python tools for parallel data analysis.

Portland Ballroom 252–253

Friday, 19th May, 16:15 - 17:00

Talks

16:30 - 17:00

Re-Programming the Human Genome with Python

Riley Doyle

Modern genome editing techniques such as CRISPR-Cas9 are revolutionizing the way we discover and treat the root genetic causes of disease. Many of the most popular tools and libraries in this cutting-edge application are written in Python. This talk will provide a general, software-centric introduction to the exciting new area of genome editing, describe the central string search, machine learning, and data management problems involved, and review how Python frameworks and libraries are used in production today to solve these problems and benefit human health. This talk assumes no prior lab experience: only a proficiency with Python and curiosity!

Riley Doyle

Riley is the principal architect of the DESKGEN genome editing platform, its underlying sequence storage and search engine, and inventor of the patent-pending AutoClone search algorithm. Currently his focus is applying CRISPR, machine learning, data engineering, and automated control systems to cell and gene therapy. Prior to Desktop Genetics, Riley was a BioEngineer at Genentech, Inc. Riley is interested in the business impact of synthetic biology, biotech information, and automation systems on the world of science and business. He’s currently working on finding biotech’s next “killer app.” Mostly he just sends emails.

Portland Ballroom 254-255

Friday, 19th May, 16:30 - 17:00

Talks

16:30 - 17:00

The Glory of pdb's set_trace

Nicole Zuckerman

Everyone needs to debug code, and it can take up a non-trivial portion of our time to wait for code to complete execution and write print messages to stdout. There’s one function in particular in the Python debugger (pdb) library that can give you a much clearer understanding of what’s going on in your code, much more quickly: pdb.set_trace(). In this talk, we’ll identify the most useful things you can do when you use set_trace that can make debugging exponentially more efficient and enjoyable.
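
As a minimal illustration of the function this abstract centers on (not taken from the talk itself), the sketch below pauses a hypothetical `average` helper with `pdb.set_trace()`. The helper name and the `debug` flag are assumptions made for this example only.

```python
import pdb


def average(values, debug=False):
    """Return the arithmetic mean of a non-empty list of numbers."""
    total = sum(values)
    if debug:
        # Execution pauses here and drops into the interactive debugger.
        # Common commands at the (Pdb) prompt: `p total` prints a value,
        # `n` runs the next line, `s` steps into a call, `c` continues.
        pdb.set_trace()
    return total / len(values)
```

Calling `average([2, 4, 6], debug=True)` from a terminal would stop just before the division, where `total` and `values` can be inspected instead of adding print statements.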

Nicole Zuckerman

Nicole Zuckerman is a software engineer at Clover Health, where she writes the endpoints and data pipelines to help surface better health care for members. She's also deeply invested in effectively on-boarding entry-level engineers, and improving diversity and inclusion in tech. In her free time, Nicole is an avid dancer and teacher, sci-fi book fanatic, soul and jazz aficionado, and cheese lover. She holds an MA in English Literature and Women's Studies from the University of Liverpool.

Oregon Ballroom 201-202

Friday, 19th May, 16:30 - 17:00

Talks

16:30 - 17:00

Building A Gigaword Corpus: Lessons on Data Ingestion, Management, and Processing for NLP

Rebecca Bilbro

As the applications we build are increasingly driven by text, doing data ingestion, management, loading, and preprocessing in a robust, organized, parallel, and memory-safe way can get tricky. This talk walks through the highs (a custom billion-word corpus!), the lows (segfaults, 400 errors, pesky mp3s), and the new Python libraries we built to ingest and preprocess text for machine learning. While applications like Siri, Cortana, and Alexa may still seem like novelties, language-aware applications are rapidly becoming the new norm. Under the hood, these applications take in text data as input, parse it into composite parts, compute upon those composites, and then recombine them to deliver a meaningful and tailored end result. The best applications use language models trained on domain-specific corpora (collections of related documents containing natural language) that reduce ambiguity and prediction space to make results more intelligible. Here's the catch: these corpora are huge, generally consisting of at least hundreds of gigabytes of data inside of thousands of documents, and often more! In this talk, we'll see how working with text data is substantially different from working with numeric data, and show that ingesting a raw text corpus in a form that will support the construction of a data product is no trivial task. For instance, when dealing with a text corpus, you have to consider not only how the data comes in (e.g. respecting rate limits, terms of use, etc.), but also where to store the data and how to keep it organized. Because the data comes from the web, it's often unpredictable, containing not only text but audio files, ads, videos, and other kinds of web detritus. Since the datasets are large, you need to anticipate potential performance problems and ensure memory safety through streaming data loading and multiprocessing. 
Finally, in anticipation of the machine learning components, you have to establish a standardized method of transforming your raw ingested text into a corpus that's ready for computation and modeling. In this talk, we'll explore many of the challenges we experienced along the way and introduce two Python packages that make this work a bit easier: Baleen and Minke. Baleen is a package for ingesting formal natural language data from the discourse of professional and amateur writers, like bloggers and news outlets, in a categorized fashion. Minke extends Baleen with a library that performs parallel data loading, preprocessing, normalization, and keyphrase extraction to support machine learning on a large-scale custom corpus.

Rebecca Bilbro

Dr. Rebecca Bilbro is lead data scientist at Bytecubed, a small tech company in Washington, DC, where she and her team use supervised and unsupervised machine learning, programming in Python and R, and distributed computing with Spark and Hadoop. In her free time, Rebecca works with the open source collaborative District Data Labs (check out the Yellowbrick Project), writes for the DDL blog, practices ukulele, and rides her bicycle around DC. Rebecca earned her doctorate from the University of Illinois Urbana-Champaign, conducting research on communication practices in Engineering, and holds a Bachelor's degree in Mathematics and English from Skidmore College.

Portland Ballroom 251 & 258

Friday, 19th May, 16:30 - 17:00

Talks

17:10 - 17:40

Snakes on a Hyperplane: Python Machine Learning in Production

Jessica Lundin

Companies with an artificial-intelligence plan have a differentiating strategy in the intelligence economy; however, implementing robust machine learning in production is nontrivial, often requiring a close collaboration between data scientists and developers, and retooling the production stack and workflows to develop and maintain accurate models. Machine learning in production involves model application, handling missing data, data artifacts, and data outside of the training calibration. A rigorous evaluation framework draws upon logging to determine characteristics of model coverage, model performance, auditing, and run-time performance. Model coverage is the number of times the model produced sensible output relative to the number of times it is called. Model coverage is reduced if the model does not converge or model criteria are not met. Model performance is evaluated with a suite of metrics (accuracy, AUC, FPR, TPR, RMSE, MAPE, etc.), which assist in determining the most appropriate model to use in the production scenario and the validity of the model training. Regularly performing manual audits for spot checks is important for debugging and ensuring the model passes sanity checks. Run-time performance includes run times and profiling model pieces, ensuring performance is within specified requirements and refactoring otherwise. In the AI renaissance, where ML is a critical piece of intelligent products, seamlessly integrating model evaluation into workflows is an important component of making robust products and building a satisfying customer experience. Python is a great language to build intelligent products with its abundance of ML libraries and wrappers contributed as open-source software in addition to rich full-stack capabilities.
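
The coverage ratio and one of the listed metrics (RMSE) can be sketched in a few lines. This is an illustrative example only: the function names are hypothetical, and treating `None` or NaN as "not sensible" output is an assumption, since the abstract does not define the criterion.

```python
import math


def model_coverage(outputs):
    """Share of model calls that produced sensible output.

    Assumption: an output counts as sensible unless it is None or NaN.
    """
    if not outputs:
        return 0.0
    sensible = [
        y for y in outputs
        if y is not None and not (isinstance(y, float) and math.isnan(y))
    ]
    return len(sensible) / len(outputs)


def rmse(predicted, actual):
    """Root-mean-square error over paired predictions and observations."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two equal-length, non-empty sequences")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))
```

In a production setting these values would typically be computed from logged calls, as the abstract suggests, rather than from in-memory lists.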

Jessica Lundin

Jessica’s research journey began in astro-particle physics doing fieldwork on a neutrino-detector project at South Pole Station. She completed a Masters and PhD in applied mathematics and geophysics at the University of Washington, studying ice physics and climatology, i.e. applied machine learning. After a postdoc in Japan she joined two later-acquired start-ups. She began at Microsoft by way of VoloMetrix, an enterprise productivity company acquired by O365, and is currently compelled to address the technology-underserved field of medicine, working in the health and IoT domain in Microsoft Research.

Oregon Ballroom 203–204

Friday, 19th May, 17:10 - 17:40

Talks

17:10 - 17:40

Looping Like a Pro in Python

David DB Baumgold

The humble loop: it's hard to write a program without it. Whether it's processing numbers in a sequence, lines in a text file, users in a database, or any other list of things, you use loops all the time. But did you know that Python has a lot of different ways to write loops? Reaching for the right looping tool can make your code cleaner, more readable, easier to test, and it can even make it run faster! By the end of this talk, you'll be looping like a pro, and your code will be better for it.
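
The abstract does not list which looping tools the talk covers. As a hedged sketch of what "different ways to write loops" can look like, the example below uses three standard built-in idioms (`enumerate`, `zip`, and a generator expression); the sample data is made up for illustration.

```python
names = ["ada", "grace", "guido"]
scores = [90, 85, 77]

# enumerate() yields (index, item) pairs, replacing a manual counter.
ranked = [f"{position}. {name}" for position, name in enumerate(names, start=1)]

# zip() walks several sequences in lockstep.
report = {name: score for name, score in zip(names, scores)}

# A generator expression feeds sum() without building a temporary list.
total = sum(score for score in scores)
```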

David DB Baumgold

DB is a freelance web developer living in the Boston area. He loves Python for its simplicity, its clarity, and its community -- especially in welcoming beginners. In addition to Python, he also enjoys Javascript, databases, design, teaching, and learning. He genuinely believes that everything will work out alright in the end, somehow.

Oregon Ballroom 201-202

Friday, 19th May, 17:10 - 17:40

Talks

17:10 - 17:40

The Wild West of Data Wrangling

Sarah Guido

Data science introductory courses might give you the impression that dealing with data is neat, tidy, and simple. They present you with a simplistic dataset and the scikit-learn or Pandas documentation, and a day or so later, you're done! Piece of cake, right? The real world of data isn't that easy! As a data scientist who has worked in the industry for several years, I have had a lot of experience dealing with messy, inaccurate, incomplete data, and I want to share those experiences with you. I'll talk my way through three real-world situations where I've had to analyze and build models on untidy and complex data, going through how I've preprocessed the data and prepared it for modeling. You'll leave with an understanding of how a data scientist thinks about data and what she does when the data is complicated.

Sarah Guido

Sarah is a Senior Data Scientist at Mashable where she studies user behavior through data. She is the chair of the Machine Learning/Artificial Intelligence track at the 2017 SciPy Conference and is an accomplished conference speaker. She is also an O'Reilly Media author, having co-authored Introduction to Machine Learning with Python. Community involvement is very important to Sarah, and she is a co-organizer of the NYC Python Meetup, the largest Python meetup in the world. Sarah attended graduate school at the University of Michigan's School of Information.

Portland Ballroom 254-255

Friday, 19th May, 17:10 - 17:40

Talks

17:10 - 17:40

Algorithmic Music Generation

Padmaja V Bhagwat

Music is mainly an artistic act of inspired creation and is unlike traditional math problems: it cannot be solved by a simple set of formulae. The most interesting and challenging part is producing unique music without infringing copyright. The generated music has to sound good, and what sounds good is very subjective and varies from culture to culture. Artificial neural networks and deep learning have a wide range of applications, such as image processing, natural language processing, and time series prediction. But what about their usage in art? Could we use deep learning to create music? This talk is about how deep learning models were used to produce music, catering particularly to Bollywood. It will show how an exquisite piece of art, i.e. music, can be generated using a deep learning model that automates feature extraction. In order to automate music generation, the model must be able to remember the learned features over a longer period of time; this is achieved by a special type of Recurrent Neural Network (RNN) called an LSTM (Long Short-Term Memory) network. Implementing such a complex model can be made much easier using Python libraries such as Keras with Theano as the backend, which allows for easy and fast prototyping. Packages like numpy and scipy are used for easier mathematical computation on input vectors and for reading/writing WAV files, respectively. The neural network architecture uses a large number of music samples to train the model. After an adequate number of iterations and training time, this model generates music that is unique and original. In this talk, the steps involved in preprocessing the data, training the model, testing the model, and generating music from the trained model will be discussed. This talk will also cover some of the challenges and tradeoffs made for algorithmic music generation.

Padmaja V Bhagwat

Padmaja V Bhagwat is a junior pursuing a Bachelor's in Information Technology at the National Institute of Technology, Karnataka - India. Her interest lies in the field of Machine Learning and Artificial Intelligence. She is motivated to make the world a better place by using machine learning. When she isn’t working on machine learning, she dances.

Portland Ballroom 251 & 258

Friday, 19th May, 17:10 - 17:40

Talks

17:40 - 17:40

Leveraging Serverless Architecture for Powerful Data Pipelines

Jason Myers

Serverless architectures that allow us to run Python functions in the cloud in an event-driven, parallel fashion can be used to create extremely dynamic and powerful data pipelines for use in ETL and data science. Join me for an exploration of how to build data pipelines on Amazon Web Services Lambda with Python. We'll cover a simple introduction to event-driven programming. Then, we'll walk through building an example pipeline while discussing some of the frameworks and tools that can make building your pipeline easier. Finally, we'll discuss how to maintain observability on your pipeline to ensure proper performance and troubleshooting information.

Jason Myers

Jason works at Juice Analytics as a Software Engineer after spending 15 years as a systems architect and building datacenters and cloud architectures for several of the largest tech companies, hospitals, stadiums, and telecom providers. He's a passionate developer who regularly speaks at local and national events about technology. He's also the chair of the PyTennessee conference and the author of Essential SQLAlchemy 2nd via O'Reilly Media, Introduction to Relational Databases via Data Camp, and the upcoming Serverless Data Pipelines with Python via Manning Press.

Portland Ballroom 252-253

Friday, 19th May, 17:40 - 17:40

Talks

Saturday, 20th May

Talks

10:50 - 11:20

Automate AWS With Python

Moshe Zadka

AWS is one of the best-known cloud vendors. Using the Web UI is fine when starting out, but automating cloud operations is important. Boto3 provides a great Pythonic API to AWS, but using it correctly can be subtle. The talk will cover how to automate AMI builds, build CloudFormation templates, and automate S3 bucket management.

Moshe Zadka

Portland Ballroom 252-253

Saturday, 20th May, 10:50 - 11:20

Talks

10:50 - 11:20

Debugging in Python 3.6: Better, Faster, Stronger

Elizaveta Shashkova

Python 3.6 was released in December of 2016 and it has a lot of cool new features. Some of them are quite easy to use: a developer can read, for example, about f-strings and start using them in their programs right away. But sometimes features are not so evident, and the new frame evaluation API is one of them. The new frame evaluation API was introduced to CPython in PEP 523, and it allows specifying a per-interpreter function pointer to handle the evaluation of frames. It might not be evident how to use this new feature in everyday life, but it’s quite easy to understand how to build a fast debugger based on it. In this talk we are going to explain how the standard way of debugging in Python works and how the new frame evaluation API may be useful for creating a fast debugger. We will also consider why such fast debugging was not possible in previous versions of Python. If someone hasn’t made a final decision to move to Python 3.6, this talk will provide some new reasons to do it.
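
For background on the "standard way of debugging": the standard library debugger (pdb, via bdb) is built on `sys.settrace`, which makes the interpreter call a Python hook for every traced event. The sketch below is not from the talk; the helper names are hypothetical. It counts line events to show how often that hook fires, which hints at why trace-based debugging is slow.

```python
import sys


def count_line_events(func, *args):
    """Run func(*args) under a trace function and count 'line' events."""
    counter = {"lines": 0}

    def tracer(frame, event, arg):
        # The interpreter calls this hook for 'call', 'line', 'return',
        # and 'exception' events; returning it keeps tracing the frame.
        if event == "line":
            counter["lines"] += 1
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(previous)  # always restore the earlier hook
    return result, counter["lines"]


def add_three_times(x):
    x += 1
    x += 1
    x += 1
    return x
```

Calling `count_line_events(add_three_times, 0)` returns the function's result together with the number of line events observed; every one of those events costs a Python-level function call, which is the overhead a frame-evaluation-based debugger tries to avoid.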

Elizaveta Shashkova

Elizaveta Shashkova is a software developer of PyCharm IDE at JetBrains. She works on the Python debugger that is part of PyCharm and PyDev. She is also the author of PyCharm's Concurrency Visualiser.

Oregon Ballroom 201-202

Saturday, 20th May, 10:50 - 11:20

Talks

10:50 - 11:20

The trends in choosing licenses in Python ecosystem

Anwesha Das

Software licenses are permissions over copyrighted software. The permission or grant includes the right to use, to redistribute, to prepare derivative works, and so on. Software licenses also set out the limitations on these rights. In essence, a license marks the boundary for the usage of the code. It is therefore very important for developers to choose the license for their code wisely and correctly. PyPI, the Python Package Index, is a repository of software for the Python programming language. There are currently 80000+ packages there. This talk will go through the licenses of the top 2500 packages. We will see the trends in choosing a license for these top Python projects. We will discuss the licenses individually and compare them with each other, covering the advantages and disadvantages of each. We will further explain why particular licenses are favored by developers.

Anwesha Das

Anwesha Das is an advocate who has been practicing law for 6 years. She is an active user of FOSS technologies. She maintains her blog at https://anweshadas.in/. One of her major goals is to help FOSS developers with the legal aspects of the software industry. She currently blogs for the PSF.

Portland Ballroom 254-255

Saturday, 20th May, 10:50 - 11:20

Talks

10:50 - 11:20

The Memory Chronicles: A Tale of Two Pythons

Kavya Joshi

MicroPython is the leanest, meanest full Python implementation. Designed for microcontrollers, this variant of Python runs in less than 300KB of memory, and retains support for all your favorite Python features. So what does it take to make the smallest Python? Put differently, why does CPython have a large memory footprint? This talk will explore the internals of MicroPython and contrast it with CPython, focusing on the aspects that relate to memory use. We will delve into the Python object models in each and the machinery for managing them. We will touch upon how the designs of the bytecode compiler and interpreter of each differ and why that matters.

Kavya Joshi

Kavya writes code for a living at a start-up in San Francisco. She’s primarily a backend/systems developer, but has of late been dabbling in firmware too. Her favorite aspects of being a programmer are reasoning about systems at scale, and delving into the inner workings of sophisticated software. When not programming, she tends to spend her time on rock walls and mountain tops. Before moving to San Francisco to be an Adult, Kavya was at MIT where she got a Bachelor's and Master's in Computer Science.

Portland Ballroom 251 & 258

Saturday, 20th May, 10:50 - 11:20

Talks

10:50 - 11:20

It's time for datetime

Mario Corchero

Working with time is not a trivial challenge. Python includes a native module in the standard library to work with it, but datetime, together with unicode, remains a common source of errors. This has often led to the spread of many other libraries that attempt to ease working with datetime. Datetime is one of those APIs that looks easy to use, but given the many concepts around time, it can easily backfire if the developer does not have solid knowledge of them. In this talk we will give an overview of the main concepts about timestamps represented through datetime objects, the limitations of the standard library, and some simple steps to avoid the common mistakes that anyone can fall into. Naive datetimes (which the datetime API works with by default) are a great tool to represent calendar times, but when talking about timestamps (the focus of this talk) time zones are an essential part of the picture, and the datetime module can be tricky to use for those use cases. We will also speak about different standards of time, time zones, Daylight Saving Time, leap seconds, serialization and datetime arithmetic. The talk will focus on giving the foundations that everyone needs in order to understand and work efficiently, and without making painful mistakes, when dealing with time-related algorithms.
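As a rough illustration of the naive-versus-aware distinction this abstract mentions (a sketch, not material from the talk; the fixed UTC-7 offset and the helper name `is_aware` are assumptions chosen for the example), the standard library behaves roughly like this:

```python
from datetime import datetime, timedelta, timezone

# A naive datetime carries no time zone information at all.
naive = datetime(2017, 5, 20, 10, 50)

# An aware datetime pins the same wall-clock reading to a UTC offset.
# The fixed UTC-7 offset here is an illustrative assumption.
utc_minus_7 = timezone(timedelta(hours=-7))
aware = datetime(2017, 5, 20, 10, 50, tzinfo=utc_minus_7)


def is_aware(dt):
    """Hypothetical helper: True if dt has a usable UTC offset."""
    return dt.tzinfo is not None and dt.utcoffset() is not None


# Converting to UTC is unambiguous for the aware value.
as_utc = aware.astimezone(timezone.utc)

# Ordering comparisons that mix naive and aware values raise TypeError.
try:
    naive < aware
    mixed_comparison_ok = True
except TypeError:
    mixed_comparison_ok = False
```

A fixed offset keeps the sketch self-contained; real time zones with Daylight Saving Time transitions need a time zone database rather than a constant offset.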

Mario Corchero

I'm a Senior Software Developer at Bloomberg LP where I work automating the generation of news with the vast available data in our system using Python and C++ to write small reusable services. But that is the boring part! I am a young Spanish (from Spain) coder wishing to learn as much as possible and take advantage of any chance of networking.

Oregon Ballroom 203-204

Saturday, 20th May, 10:50 - 11:20

Talks

11:30 - 12:00

Readability Counts

Trey Hunner

Have you found unreadable PEP8-compliant code and wondered how to fix it? Have you ever seen code that was simply a pleasure to read? If you've ever wondered what makes code easy to read, this talk is for you. During this talk we'll learn a number of techniques for refactoring code to improve readability and maintainability. We'll discuss: whitespace, self-documenting code, modularity, expectation management. We'll end with a checklist for improving the readability of your own code.

Trey Hunner

Oregon Ballroom 201-202

Saturday, 20th May, 11:30 - 12:00

Talks

11:30 - 12:00

Awesome Command Line Tools

Amjith Ramanujam

Designing a good command line tool is challenging. Command line tools look archaic compared to modern graphical interfaces, discoverability is a big issue, and proactive help is hard to implement. This talk will cover a set of specific techniques to help alleviate those challenges. How did pgcli and mycli overcome these problems? We will cover specific examples where command line apps shine. The examples will be drawn from pgcli, mycli and bpython. This talk will suggest libraries and show how they can help you implement a wonderful command line interface. The libraries covered in this talk include python-prompt-toolkit, pygments, click, fuzzyfind. The goal of the talk is to distil the ideas that exist in successful command line applications as guidelines for building powerful command line applications.

Amjith Ramanujam

I am the author of pgcli and mycli, command line clients for Postgres and MySQL with auto-completion.

Portland Ballroom 252-253

Saturday, 20th May, 11:30 - 12:00

Talks

11:30 - 12:00

When the abyss gazes back: staring down Python's surprising internals

David Wolever

Python's fantastic until it isn't. This talk dives into some of the surprising implementation details of CPython, then explains exactly how they could be discovered from first principles. Attendees will leave with some dangerous Python trivia, and the tools they'll need to uncover their own trivia when surprises strike.

David Wolever

David is a Pythonista from Toronto, Canada. He's co-founder of both PyCon Canada – Canada's regional Python conference – and Akindi.com – a small company that's making multiple-choice bubble sheet tests a little bit less terrible. He's also the author of nose-parameterized, parameterized testing for every Python test runner, and pprint++, a Python pretty-printer that's actually pretty. Say hi on Twitter: @wolever!

Portland Ballroom 251 & 258

Saturday, 20th May, 11:30 - 12:00

Talks

11:30 - 12:00

How documentation works, and how to make it work for your project

Daniele Procida

Nearly everyone (especially in the Python community) agrees that good documentation is important to the success of software projects, and yet very few projects actually have good documentation. Often, it's not for want of effort - the project's developers have worked hard on it - nor for lack of documentation - the authors have produced a lot of it. It simply turns out to be not very good - not helpful enough for the users who should be able to rely on it, and a depressing chore for the authors who have to maintain it. The good news is that both these problems can be solved by understanding how documentation works, and what its different functions are. Structuring documentation according to those distinct functions helps ensure that each of them is adequately served. It also makes it far easier to write and maintain. Using real-life examples I'll draw out the key functions of documentation, and how they map onto different ways of writing it. Putting this into practice is simple when armed with some basic guidelines. The benefits are huge, and available with a minimum of effort. I won't be discussing documentation tools or software or other topics that have been covered amply elsewhere, but some neglected aspects of software documentation that will make your software projects more successful.

Daniele Procida

I am an accidental programmer. As part of a pilot project I took a five-day introductory Python/Django course in April 2009 to enable me to communicate better with our developers. This was my first experience in programming since encountering Commodore 64 BASIC as a teenager. By day three of the course I realised that building the system I had envisaged was actually something I could do myself - so I did. I work at Divio (Zürich, Switzerland) and I'm a member of the Django Project core development team.

Portland Ballroom 254-255

Saturday, 20th May, 11:30 - 12:00

Talks

11:30 - 12:00

Temporal Data Structures with SQLAlchemy and Postgres

Joseph Leingang

SQLAlchemy (http://www.sqlalchemy.org) and Postgres (https://www.postgresql.org) provide several useful tools that allow us to build and query records through time: temporal models. Combining a need to have robust auditing, as well as feature development on per-property history, we can turn “regulatory overhead” into an exciting technical challenge. At Clover Health we have built a small library to automate the task of decorating a model and making it “temporal.” This talk aims to demonstrate the underlying data model and interface for building this system.

Joseph Leingang

Joey Leingang is an Engineering Manager at Clover Health where he focuses on engineering team leadership, scalable development, and systems management. Joey has 14 years of development experience, including 5 years of engineering management, and has held technical roles at companies including Sirono, Arizona Public Media, and the University of Arizona.

Oregon Ballroom 203-204

Saturday, 20th May, 11:30 - 12:00

Talks

12:10 - 12:55

Static Types for Python

Jukka Lehtosalo

Over the past year and a half, Dropbox has been investing in the development of mypy, a static type checker for Python, as a way to make our multimillion-line Python codebase easier to understand, navigate, and maintain. In this talk, we will discuss the benefits of type annotations, explain how to use them, and give a peek into how mypy works behind the scenes. Mypy is an open-source type checker for Python which supports the PEP 484 standard for gradual typing. Originally created by Jukka Lehtosalo as part of his PhD thesis in 2013, it is now under active development by a small team at Dropbox which includes David Fisher, Greg Price, and Guido van Rossum. It supports Python 3.2 and higher, as well as Python 2.7 (via type comments).

Jukka Lehtosalo

Jukka Lehtosalo is the creator of mypy, a static type checker for Python. He is also one of the authors of the type hinting notation introduced in Python 3.5 (PEP 484). He enjoys working at Dropbox and helping engineers be productive when working on multi-million-line Python codebases. Hailing from Finland, Jukka currently lives in the UK.

Oregon Ballroom 201-202

Saturday, 20th May, 12:10 - 12:55

Talks

12:10 - 12:40

Exploring Network Programmability with Python and YANG

Lisa N Roach

Recently, networking vendors and Silicon Valley giants have been putting forth a concerted effort to build standardized models for networking devices. These models allow for building reusable and versatile scripts with predictable, standardized data. Without such models, the wide variety of inputs and outputs required by different devices and vendors made scripting a tedious and challenging endeavor. The modeling language in use is called YANG, and a variety of standards have emerged. A vendor-agnostic standard called OpenConfig has lately become stable enough to begin programming devices with it. Using Python, YANG is surprisingly easy to work with, and extremely powerful applications can be written with basic knowledge of JSON or XML and RPCs. The talk will start with use-cases for programming networking devices, and will detail a specific, trivial use case that will be used in the talk. Next, we will discuss the ‘legacy’ way of programming devices (SSH and screenscraping), and highlight the challenges, such as complex regular expressions, slow responses, and lack of reusability between devices. From there we will dive into YANG, focusing on OpenConfig models. A YANG model is essentially a template, and JSON or XML can be mapped to the YANG template. This makes it perfect for Pythonic manipulation. In the use case there will be a GET RPC returning a YANG representation of the box’s state in JSON, which we will search for the relevant health indicator by drilling down in the JSON dictionary. A simple change to the dictionary will remediate the problem, and a PATCH RPC merges the new configuration onto the box. Since open, standard models are in use, this script could be run on many devices across a network to achieve the same effect with no changes needed. We will finish up with the pros and cons of YANG before opening the talk for Q&A.

Lisa N Roach

Lisa Roach is currently a Production Engineer at Facebook focusing on security, but previously was a software systems engineer at Cisco Systems. Her work at Cisco focused on building Python applications that can manipulate and monitor networking devices, from off-box or on-box. The open source projects she contributes to include exaBGP (the BGP swiss army knife), Solenoid (a Cisco-RIB injection application), and CPython. Examples of her work can be found on GitHub under the username lisroach.

Portland Ballroom 252-253

Saturday, 20th May, 12:10 - 12:40

Talks

12:10 - 12:55

Level up! Rethinking the Web API framework.

Tom Christie

Think there's nothing left to explore in how we design Web API frameworks? Think again. The author of Django REST framework walks through how we might approach designing a new Python-based API framework from scratch, and looks at how we can start building smarter, more productive API tooling as a result. You should come away from this talk with a better appreciation of: How best to provide API client libraries and API documentation to your users. How to build APIs that support both realtime and request/response interfaces. How to build APIs that are web-browsable. Why you might want to consider taking a schema-first approach to your API design.

Tom Christie

Web & API developer. Author of Django REST framework.

Portland Ballroom 254-255

Saturday, 20th May, 12:10 - 12:55

Talks

12:10 - 12:40

What's new in Python 3.6

Brett Cannon

Python 3.6 has turned out to be quite the release! With 16 Python Enhancement Proposals incorporated into the version, Python 3.6 is surpassed only by Python 3.0 in the number of PEPs included in a single release. This talk will be an overview of those 16 PEPs and other changes outlined in the What's New document for Python 3.6.

Brett Cannon

Brett has been a core contributor to Python since 2003. Currently he is a member of the Azure Data Science Tools team at Microsoft working on a whole bunch of Python-related stuff. Brett lives in Vancouver, Canada with his wife, Andrea, and his cat Gidget who is too cute and smart for her own good (but that doesn't make him a cat person; his wife thinks otherwise).

Oregon Ballroom 203-204

Saturday, 20th May, 12:10 - 12:40

Talks

12:10 - 12:55

Modern Python Dictionaries -- A confluence of a dozen great ideas

Raymond Hettinger

Python's dictionaries are stunningly good. Over the years, many great ideas have combined to produce the modern implementation in Python 3.6. This fun talk uses pictures and little bits of pure Python code to explain all of the key ideas and how they evolved over time. Includes newer features such as key-sharing, compaction, and versioning.

Raymond Hettinger

Python core developer for 15 years. Creator of the collections and itertools modules. Designer of Python's compact-and-ordered dict. Former multi-year PSF board member. Recipient of the PSF Distinguished Service Award. Popular trainer having taught Python to over 4,000 engineers. Frequent presenter and occasional keynote speaker at Python conferences around the world for the last ten years.

Portland Ballroom 251 & 258

Saturday, 20th May, 12:10 - 12:55

Talks

13:40 - 14:25

An Introduction to Reinforcement Learning

Jessica Forde

Reinforcement learning (RL) is a subfield of machine learning focused on building agents: software that can robustly achieve a desired objective under varying states of the world. This introduction will provide you with an overview of RL and tools to build your own agents. In this talk, we will provide an overview of terminology in reinforcement learning and a Jupyter Notebook outlining basic algorithms to learn 'policies', strategies for an agent, and visualize them with numpy, pandas, and seaborn. Newer developments in reinforcement learning apply deep learning to improve performance. We will further discuss deep reinforcement learning and how to use deep learning libraries, such as TensorFlow or Theano, with the latest RL libraries: OpenAI Gym, OpenAI Universe, and DeepMind Lab.

Jessica Forde

Jessica is a Visiting Fellow in Computer Science at Harvard, where she researches reinforcement learning. Her focus is applications that allow for human interaction, primarily in healthcare. Jessica is also a data scientist at Careful, a sports medicine startup. Her interest in industrial applications of reinforcement learning began at Columbia, where she developed energy-saving software for skyscrapers in Manhattan, which she demoed at NIPS. Previously, Jessica developed the DARPA-funded open source machine learning library, datamicroscopes, at Qadium. She has worked with consultants at McKinsey to provide data-driven human resources recommendations to clients. She holds an MA in Statistics from Columbia.

Portland Ballroom 252-253

Saturday, 20th May, 13:40 - 14:25

Talks

13:40 - 14:25

Text is More Complicated Than You Think: Comparing and Sorting Unicode

Morgan Wahl

Few people realize just how complicated text can be. Did you know sorting and even case-folding can depend on a user's locale? That different strings of characters can be semantically completely equivalent? That there are over a thousand Latin letters? Legacy text encodings like ASCII made a lot of simplifying assumptions about how written languages work, and we all put up with them because it was cool to even have computers in the first place. Unicode removes many of those assumptions and provides the tools we need to write software that can just do the right thing regardless of what text users throw at it. Even if you don't translate your UI, getting the details of string comparison, sorting, and searching right can eliminate annoying surprises for you and your users.
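To make the point about semantically equivalent strings concrete, here is a small sketch using only the standard library (an illustration, not material from the talk; the sample strings and the helper name `nfc` are chosen for the example):

```python
import unicodedata

# Two different code point sequences that render as the same "é".
composed = "\u00e9"      # LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT

# A plain comparison looks at code points, so these are unequal.
equal_raw = composed == decomposed


def nfc(text):
    """Normalize text to Unicode Normalization Form C (composed)."""
    return unicodedata.normalize("NFC", text)


# After normalization the two spellings compare equal.
equal_normalized = nfc(composed) == nfc(decomposed)

# Case-insensitive matching: casefold() is more aggressive than lower().
lowered = "Straße".lower()
folded = "Straße".casefold()
```

Locale-dependent sorting is deliberately left out of this sketch, since its results depend on the locales installed on the machine running it.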

Morgan Wahl

Workaday web developer for eight years. Linguistics enthusiast. Bachelor of Arts in Computer Science from UMass Boston.

Oregon Ballroom 203-204

Saturday, 20th May, 13:40 - 14:25

Talks

13:55 - 14:25

How to write a Python transpiler

Russell Keith-Magee

We all know Python is a powerful and expressive programming language. What you may not know is how much of the internals of Python itself is exposed for you to use and manipulate. In this talk, you'll be introduced to the tools and libraries Python provides to manipulate the compilation and execution of Python code. You will also see how you can use those tools to target execution environments other than the CPython virtual machine.

Russell Keith-Magee

Portland Ballroom 254-255

Saturday, 20th May, 13:55 - 14:25

Talks

13:55 - 14:25

Know thy self: Methods and method binding

Thomas Ballinger

Methods are like functions, but different. How? Why? And what's with having to type 'self' all the time? We'll explore partial application of functions and review why it might be nice to start using classes. Then to clarify how method objects work we'll examine the result of accessing the method attributes of an object without calling them. Understanding the behavior we uncover here will require more attribute lookup experiments, which will lead us to discover the power of descriptors. Along the way we'll peek in at other languages' approaches to method binding, hopefully coming to appreciate the way Python does things enough to type 'self' a few thousand more times.
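The experiment the abstract describes, accessing a method attribute without calling it, can be sketched as follows (a minimal Python 3 illustration, not code from the talk; the `Greeter` class is hypothetical):

```python
class Greeter:
    """Hypothetical class used only to illustrate method binding."""

    def __init__(self, name):
        self.name = name

    def greet(self, greeting):
        return f"{greeting}, {self.name}!"


g = Greeter("PyCon")

# Looking the attribute up on the instance yields a bound method:
# the function with `self` already filled in (partial application).
bound = g.greet

# Looking it up on the class yields the plain function in Python 3,
# so `self` has to be passed explicitly.
plain = Greeter.greet

# Functions are descriptors: attribute lookup on an instance calls
# their __get__ method, which is what builds the bound method.
via_descriptor = Greeter.__dict__["greet"].__get__(g, Greeter)
```

Calling `bound("Hello")`, `plain(g, "Hello")` and `via_descriptor("Hello")` all produce the same string, which is one way to see that binding is attribute lookup plus the descriptor protocol.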

Thomas Ballinger

Oregon Ballroom 201-202

Saturday, 20th May, 13:55 - 14:25

Talks

13:55 - 14:25

Magic Method, on the wall, who, now, is the `__fairest__` one of all?

Sep Dehpour

Magic methods are a very powerful feature of Python and can open a whole new door for you. However, with great power comes great responsibility. In this talk we explore magic methods' capabilities by first designing new interfaces in a series of fun experiments. Secondly, we play with creating undeletable objects and learn about the mighty Garbage Collector in CPython and how a single magic method can overturn the fate of the object. Lastly, we create a lazy Redis client to illustrate a practical application of magic methods and learn about lazy loading. Once you see what magic methods can bring to the table, the limit is only your imagination!
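As a rough sketch of lazy loading with magic methods (not the Redis client from the talk; the `LazyValue` class and the `fetch` function are hypothetical stand-ins for a remote store), `__getattr__` can defer an expensive load until the value is first used:

```python
class LazyValue:
    """Hypothetical wrapper that loads its value on first use."""

    def __init__(self, loader):
        self._loader = loader   # callable that performs the real load
        self._loaded = False
        self._value = None

    def _load(self):
        # Run the loader at most once and cache the result.
        if not self._loaded:
            self._value = self._loader()
            self._loaded = True
        return self._value

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails, so the
        # wrapper's own attributes never trigger a load.
        return getattr(self._load(), name)

    def __len__(self):
        # Special methods are looked up on the type, so len() needs
        # its own forwarding method.
        return len(self._load())


calls = []


def fetch():
    """Stand-in for an expensive remote read."""
    calls.append("fetched")
    return "hello"


lazy = LazyValue(fetch)
calls_before_use = len(calls)   # nothing has been loaded yet
upper = lazy.upper()            # first use triggers the load
length = len(lazy)              # cached value, no second load
```

The explicit `__len__` shows a design constraint of this approach: `__getattr__` does not intercept implicit special-method lookups, so each operator or built-in that should work on the wrapper needs its own forwarding method.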

Sep Dehpour

Sep is based in Los Angeles, California where he is a principal engineer at [ChowNow](https://www.chownow.com/). Sep is the author of some open source Python libraries such as [DeepDiff](https://github.com/seperman/deepdiff) and [RedisWorks](https://github.com/seperman/redisworks). He loves Python because it is really fun to code in Python and the community rocks. In his free time Sep enjoys skiing, rock climbing and surfing. From time to time Sep posts in his blog: [zepworks.com](http://zepworks.com). He can be reached through sep_at_zepworks_dot_com

Portland Ballroom 251 & 258

Saturday, 20th May, 13:55 - 14:25

Talks

14:35 - 15:05

The Fastest FizzBuzz in the West: Make Your Own Language with RPLY and RPython

Dustin Ingram

In this talk, you'll learn how I built DIVSPL (Dustin Ingram's Very Special Programming Language), a tongue-in-cheek domain-specific language, which is particularly good for implementing FizzBuzz -- as quickly as possible. We'll build DIVSPL with RPLY, an implementation of David Beazley's PLY (but with a 'cooler' API) and make it compatible with RPython, a restricted subset of the Python programming language. Along the way, you'll learn about lexers, parsers, and grammars, and in the end, you'll know how to build your own language.

Dustin Ingram

Dustin Ingram is an engineer at PromptWorks, a Philadelphia-based software development company. He's also a member of the Python Packaging Authority and a maintainer of Warehouse, the next-generation Python Package Index (PyPI).

Portland Ballroom 254-255

Saturday, 20th May, 14:35 - 15:05

Talks

14:35 - 15:05

Decorators, unwrapped: How do they work?

Katie Silverio

Decorators are a syntactically-pleasing way of modifying the behavior of functions in Python. However, they can be highly opaque to Python beginners. It took me a while to learn how to write one, and even after I was confident writing my own decorators, they still felt magical. The goal of this talk is to demystify decorators by methodically stepping through how and why they work. Along the way we'll touch on closures, scopes, and how Python is compiled.
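A minimal sketch of the mechanics the abstract refers to (an illustration, not code from the talk; the `count_calls` decorator and the sample functions are hypothetical) shows that the `@` syntax is ordinary function application plus a closure:

```python
import functools


def count_calls(func):
    """Hypothetical decorator: count how many times func is called."""
    calls = 0  # lives in the enclosing scope, captured by the closure

    @functools.wraps(func)  # copy __name__, __doc__, etc. to wrapper
    def wrapper(*args, **kwargs):
        nonlocal calls
        calls += 1
        wrapper.calls = calls  # expose the count for inspection
        return func(*args, **kwargs)

    wrapper.calls = 0
    return wrapper


@count_calls
def add(a, b):
    return a + b


def multiply(a, b):
    return a * b


# The @ syntax above is shorthand for this explicit rebinding.
multiply = count_calls(multiply)
```

After decoration, the name `add` refers to `wrapper`; the original function survives only as a variable in the closure, which is why `functools.wraps` is used to carry its name and docstring over.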

Katie Silverio

Software engineer at Venmo. Recurse Center Fall 2013. Ask me about text adventures, Markov chains, bytecode, dead code, comics, astrophysics, and Harry Potter.

Oregon Ballroom 201-202

Saturday, 20th May, 14:35 - 15:05

Talks

14:35 - 15:05

Designing secure APIs with state machines

Ashwini Oruganti

Did you ever need to create an application whose behavior varies with its state, while still presenting a consistent interface to its callers? A good, layered design using state machines can help avoid the tedious 'if' checks for flags, and ensure that if your code runs at all, it will run with all the required values initialized. I will demonstrate this with examples, and talk about some available tools and libraries to build state machines in Python. I will also discuss how to effectively use the process of threat modeling to build secure web applications. Threat modeling is a computer security technique that helps you better understand the systems you create, identify attacks, and build defenses. I will talk about things that we, as software developers, can do to assess the security of our applications in the real world through this process.

Ashwini Oruganti

Ashwini is a software engineer in San Francisco and an open source developer. She is the author of pyca/tls, a pure-python TLS 1.2 implementation that is designed with various functionally independent layers that plug together in the form of a state machine, presenting users with opinionated and secure APIs. In the past, she has worked on Twisted - an asynchronous event-driven networking framework, and Hippy - a PHP implementation in RPython.

Portland Ballroom 251 & 258

Saturday, 20th May, 14:35 - 15:05

Talks

14:35 - 15:05

Human-Machine Collaboration for Improved Analytical Processes

Tony Ojeda

Over the last several years, Python developers interested in data science and analytics have acquired a variety of tools and libraries that aim to facilitate analytical processes. Libraries such as Pandas, Statsmodels, Scikit-learn, Matplotlib, Seaborn, and Yellowbrick have made tasks such as data wrangling, statistical modeling, machine learning, and data visualization much quicker and easier. They have accomplished this by automating and abstracting away some of the more tedious, repetitive processes involved with analyzing and modeling data. Over the next few years, we are sure to witness the introduction of new tools that are increasingly intelligent and have the ability to automate more complex analytical processes. However, as we begin using these tools (and developing new ones), we should strongly consider the level of automation that is most appropriate for each case. Some analytical processes are technically difficult to automate, and therefore require large degrees of human steering. Others are relatively easy to automate but perhaps should not be due to the unpredictability of results or outputs requiring a level of compassionate decision-making that machines simply don’t possess. Such processes would benefit greatly from the collaboration between automated machine tasks and uniquely human ones. After all, it is often systems that utilize a combination of both human and machine intelligence that achieve better results than either could on their own. In this talk, we will discuss human-machine collaboration as it applies to analyzing data with Python. We will review a framework for exploratory data analysis with the goal of identifying which tasks should be automated, which tasks should not, and which tasks would benefit from a more interactive, symbiotic, and collaborative process between the human and the machine. We will explore Python libraries that we can use to build tools that allow us to perform different types of analysis. 
We’ll also introduce the Cultivar project, an example of a hybrid analytics tool that combines a Django framework with JavaScript visualizations and Celery for task management to facilitate more efficient and effective human-machine systems for data analysis.

Tony Ojeda

Portland Ballroom 252-253

Saturday, 20th May, 14:35 - 15:05

Talks

14:35 - 15:05

How to make a good library API

Flavio Juvenal

It's not easy to write libraries with great APIs. We're aware of that. However, it's not always clear how we can follow abstract ideals like elegance, simplicity, and extensibility to improve our APIs. That's why in this talk we'll discuss good and bad APIs with real-world examples. For each thing learned, we'll come up with a checklist to help us with practical advice for writing good APIs.

Flavio Juvenal

Flavio is a software engineer from Recife, Brazil. He is a partner at Vinta, a full stack software consultancy from Brazil focused on building high-quality web products. At Vinta, Flávio worked on a variety of large web applications, including developing from scratch a ticketing system for an IT security company and helping a Series-B startup reduce technical debt by solving dozens of issues. While solving clients' problems, Flávio contributed to a number of open source projects and helped to maintain well-known ones like tapioca-wrapper and django-templated-email.

Oregon Ballroom 203-204

Saturday, 20th May, 14:35 - 15:05

Talks

15:15 - 15:45

Library UX: Using abstraction towards friendlier APIs

Mali Akmanalp

Complicated libraries can be a pain in the butt to use. It's not surprising that there are a lot of 'X for humans' libraries out there, some of which are mostly wrappers around more frustrating interfaces. This is not a theoretical talk. I'll touch upon theory to give you context, but will then talk about what that means for you in practice so that you can write better libraries. I'll talk about why library UX matters, about abstraction as a general concept, about what happens when you over/under abstract, and about some useful tips to help build friendly APIs. Meanwhile, I'll show some positive examples from libraries we know and love (flask, SQLAlchemy, Requests, etc). Once you recognize these effects in play, you'll be able to apply them to your own code and make life better for everyone!

Mali Akmanalp

I'm a software engineer who accidentally got into the social sciences. I work at the Growth Lab at the Harvard Center for International Development, where we come up with and test theories as to how and why certain countries develop while other ones don't. I spend most of my time munging data and building data visualization tools. I've worked at a variety of places from my own tiny startup to large software companies to academia. I think writing code is a craft that gets better with curiosity and practice. I love teaching, giving talks, and nerdy things in general.

Oregon Ballroom 203-204

Saturday, 20th May, 15:15 - 15:45

Talks

15:15 - 15:45

Lights, camera, action! Scraping a great dataset to predict Oscar winners

Deborah Hanus

Using Jupyter notebooks and scikit-learn, you’ll predict whether a movie is likely to win an Oscar or be a box office hit. Together, we’ll step through the creation of an effective dataset: asking a question your data can answer, writing a web scraper, and answering those questions using nothing but Python libraries and data from the Internet.

Deborah Hanus

Deborah Hanus graduated from MIT with a Master's in Electrical Engineering & Computer Science. As a Fulbright Scholar in Cambodia, she learned about how education translates into job creation in technology. After working as an early engineer at a software company, she decided to take a hiatus to work on exciting data-related projects as a machine learning researcher at Harvard University.

Portland Ballroom 252-253

Saturday, 20th May, 15:15 - 15:45

Talks

15:15 - 16:00

Executing python functions in the linux kernel by transpiling to bpf

Alex Gartrell

eBPF is a Linux kernel bytecode which can be used for functionality ranging from tracing system calls with kprobe to routing packets with tc. This talk is about a pure-Python front-end for eBPF that allows users to write simple Python functions to be executed in the kernel. I'll first explain how this was made to work and then I'll show off some of the features/capabilities of this approach with working examples.

Alex Gartrell

I'm a Software Engineer in Infrastructure at Facebook.

Portland Ballroom 254-255

Saturday, 20th May, 15:15 - 16:00

Talks

15:15 - 16:00

The Dictionary Even Mightier

Brandon Rhodes

Since my “Mighty Dictionary” talk at PyCon 2010, the Python dictionary has evolved dramatically. Come learn about all of the improvements, up to and including the re-architecture that has just landed with Python 3.6! The talk will discuss iterable views, the dictionary’s dedicated comprehension syntax, random key ordering, the special key-sharing dictionary designed to underlie object collections, and, most famously of all, the new “compact dictionary” that cuts dictionary storage substantially, and carries a fascinating side-effect. Each new feature that the talk discusses will be motivated by considering the trade-offs inherent in hash table data structure design, and followed up with hints about how you can now use the dictionary even more effectively in your own code!

Brandon Rhodes

Brandon Rhodes works at Dropbox where he strives to deliver a world-class Python development environment to his fellow Dropboxers. His most popular open source projects are astronomy libraries for Python that he has been maintaining since the late 1990s. He speaks at several Python conferences a year, and in 2016–2017 is the Chair of the main worldwide PyCon conference as it is held in Portland.

Oregon Ballroom 201-202

Saturday, 20th May, 15:15 - 16:00

Talks

16:30 - 17:00

No More Sad Pandas: Optimizing Pandas Code for Speed and Efficiency

Sofia Heisler

When I first began working with the Python Pandas library, I was told by an experienced Python engineer: 'Pandas is fine for prototyping a bit of calculations, but it's too slow for any time-sensitive applications.' Over multiple years of working with the Pandas library, I have realized that this is only true if not enough care is put into identifying proper ways to optimize the code's performance. This talk will review some of the most common beginner pitfalls that can cause otherwise perfectly good Pandas code to grind to a screeching halt, and walk through a set of tips and tricks to avoid them. Using a series of examples, we will review the process for identifying the elements of the code that may be causing a slowdown, and discuss a series of optimizations, ranging from good practices of input data storage and reading, to the best methods for avoiding inefficient iterations, to using the power of vectorization to optimize functions for Pandas dataframes.

Sofia Heisler

Sofia Heisler is the Lead Data Scientist at Upside Travel, where she develops pricing and product selection algorithms for the travel industry. Previously, she headed up data analytics for a D.C. startup dedicated to connecting small businesses to vendors, as well as performed data analysis on behalf of some of the largest Fortune 500 companies as a Senior Consultant at an economic consulting company. She holds a Master’s degree in Predictive Analytics from Northwestern University, as well as a B.A. and a B.S. in Economics with a concentration in Statistics from the University of Pennsylvania.

Oregon Ballroom 201-202

Saturday, 20th May, 16:30 - 17:00

Talks

16:30 - 17:00

The Python Visualization Landscape

Jake VanderPlas

So you want to visualize some data in Python: which library do you choose? From Matplotlib to Seaborn to Bokeh to Plotly, Python has a range of mature tools to create beautiful visualizations, each with their own strengths and weaknesses. In this talk I’ll give an overview of the landscape of dataviz tools in Python, as well as some deeper dives into a few, so that you can intelligently choose which library to turn to for any given visualization task.

Jake VanderPlas

Jake VanderPlas is a data science fellow at the University of Washington's eScience Institute, where his work focuses on data-intensive physical science research in an interdisciplinary setting. In the Python world, Jake is the author of the Python Data Science Handbook, and is active in maintaining and/or contributing to several well-known Python scientific computing packages, including Scikit-learn, Scipy, Matplotlib, Astropy, Altair, and others. He occasionally blogs on python-related topics at http://jakevdp.github.io/

Portland Ballroom 252-253

Saturday, 20th May, 16:30 - 17:00

Talks

16:30 - 17:00

Slot or not: higher performance custom objects in pure Python

Aaron Hall

__slots__ are versatile for certain kinds of uses and users, if you know how they work. At first glance, they seem like a free lunch, with improvements in both time and space. At second glance, they seem to have so many caveats as to make them not worth using. This talk is a deep dive into how __slots__ work, how to wring every benefit out, as well as the actual caveats and alternatives, with recommendations for writers of core libraries as well as end users.

Aaron Hall

Aaron Hall has developed reports, tools, portals, and user interfaces for both technologists and business users - focused on building an optimally maintainable code base using recognized best practices, and is in the top 1% on StackOverflow focusing on Python. He has experience in Python software development lifecycle and risk architecture.

Portland Ballroom 254-255

Saturday, 20th May, 16:30 - 17:00

Talks

16:30 - 17:00

Writing a C Python extension in 2017

Jean-Baptiste Aviat

This talk describes the build of a C Python extension, with prebuilt binaries, in 2017, where modern packaging standards, as well as Docker, have been a game changer in the Python extensions world. Most examples come from our experience building PyMiniRacer, an embedded Python / JavaScript bridge used in production across hundreds of companies. We will describe the different aspects of building a binary extension, including: using the modern manylinux wheel type in order to ship a built binary, usable in most Linux distributions; the choices offered to developers when building an extension: the Python public C API, cffi, ...; testing of a binary module across various platforms; troubleshooting & debugging an extension: the basics you need to tackle most common issues.

Jean-Baptiste Aviat

Jean-Baptiste Aviat is CTO at Sqreen. He spent half a decade hunting vulnerabilities at Apple, helping developers solve them, and developing security software.

Portland Ballroom 251 & 258

Saturday, 20th May, 16:30 - 17:00

Talks

16:30 - 17:00

Yes, It's Time to Learn Regular Expressions

Al Sweigart

Regular expressions have a reputation as opaque and inscrutable. However, the basic concepts behind 'regex' and text pattern recognition are simple to grasp. This talk is for any programmer who isn't familiar with Python's re module and its best practices. Stop putting it off, it's time to learn regular expressions!

Al Sweigart

Oregon Ballroom 203-204

Saturday, 20th May, 16:30 - 17:00

Talks

17:10 - 17:40

Fuzzy Search Algorithms: How and When to Use Them

Eleanor Stribling

Fuzzy searching, or approximate string matching, is powerful because text data is often messy. For example, shorthand and abbreviated text are common in various data sets. In addition, outputs from OCR or voice to text conversions tend to be messy or imperfect. Thus, we want to be able to make the most of our data by extrapolating as much information as possible. In this talk, we will explore the various approaches used in fuzzy string matching and demonstrate how they can be used as a feature in a model or a component in your python code. We will dive deep into the approaches of different algorithms such as Soundex, Trigram/n-gram search, and Levenshtein distances and what the best use cases are. We will also discuss situations where it’s important to take into account the meaning or intent of a word and demonstrate approaches for measuring semantic similarity using nltk and word2vec. Furthermore, we will demonstrate via live coding how to implement some of these fuzzy search algorithms using python and/or built-in fuzzy search functions within PostgreSQL.

Eleanor Stribling

Eleanor Stribling is a product manager, developer and team builder for tech startups. Since 2015, she has been the VP of Product at Kevala, an energy analytics start-up and was previously an early employee and VP of Product Management and consumer insights at TubeMogul, an ad tech company (NASDAQ:TUBE). Outside of work, she volunteers for healthcare, gun safety and political causes. Eleanor earned her MBA at the Massachusetts Institute of Technology and a BA at the University of Toronto. She lives in San Francisco with her family.

Oregon Ballroom 203-204

Saturday, 20th May, 17:10 - 17:40

Talks

17:10 - 17:40

Tracing, Fast and Slow: Digging into and improving your web service’s performance

Lynn Root

Do you maintain a Rube Goldberg-like service? Perhaps it’s highly distributed? Or you recently walked onto a team with an unfamiliar codebase? Have you noticed your service responds slower than molasses? This talk will walk you through how to pinpoint bottlenecks, approaches and tools to make improvements, and make you seem like the hero! All in a day’s work. The talk will describe various types of tracing a web service, including black & white box tracing, tracing distributed systems, as well as various tools and external services available to measure performance. I’ll also present a few different rabbit holes to dive into when trying to improve your service’s performance.

Lynn Root

Lynn Root is a Site Reliability Engineer at Spotify in NYC. She is also a global leader of PyLadies, an international mentorship group for women and friends in the Python community, and the founder & former leader of the San Francisco PyLadies. When her hands are not on a keyboard, they are usually holding a pair of knitting needles.

Oregon Ballroom 201-202

Saturday, 20th May, 17:10 - 17:40

Talks

17:10 - 17:40

Unicode: what is the big deal?

Lukasz Langa

Ever wondered why people complain that text processing is a hard problem? Or why Python 3 would introduce such a big backward incompatibility with switching to Unicode? Wonder no more, this talk is for you. In 30 minutes I'm going to demonstrate real world text processing problems and how Python 3 helps solve them. The talk is going to explain how you should split your text from binary data in your application, what the sensible defaults are and what the possible gotchas are. All this sprinkled with a healthy dose of frustration by a guy whose first name starts with Ł.

Lukasz Langa

Python committer since 2010. Chronic perfectionist. Pianist. Dad. In his free time, he uses Python 3 to help Facebook maintain cache consistency.

Portland Ballroom 254-255

Saturday, 20th May, 17:10 - 17:40

Talks

17:10 - 17:40

Python for mathematical visualization: a four-dimensional case study

David Dumas

This is a talk about creating pictures of a mathematical object, specifically a 4-dimensional fractal 'dust' that has been the subject of mathematical research in hyperbolic geometry since the 1980s. In the end this is accomplished using a little algebra, a little geometry, and a healthy dose of Python. That is, I will present a case study of using Python in several aspects of a mathematical visualization project, from the computation itself, to transforming and converting data, and finally for scripting the process of generating the images. Along the way I'll explain how Python's convenient idioms and containers (e.g. sets and set comprehensions) are a good fit for some of the algebraic and geometric questions that come up, how Scipy and Numpy enable fast numerical calculations, and how Python's strength as a language for scripting and automation allows easy orchestration of rendering of still images and frames of animations. The mathematical visualization project we describe is a collaboration with François Guéritaud (Université de Lille).

David Dumas

David Dumas is a professor of mathematics at the University of Illinois at Chicago. His research in low-dimensional geometry and topology frequently incorporates results of computer experiments and visualizations, most of which he develops in Python.

Portland Ballroom 252-253

Saturday, 20th May, 17:10 - 17:40

Talks

17:10 - 17:40

Cython as a Game Changer for Efficiency

Alex Orlov

Are you running a Web application? Do you suffer from CPU bottlenecks that slow down your growth? There's a tool that can easily fix all that, and then some. C++ knowledge not required. Come learn how Instagram, the world's largest Django deployment with more than 600M active users, saved ~30% of global CPU by rewriting a handful of modules on the critical path in Cython. Learn to apply those techniques to your own projects with little effort and stop worrying about switching to other programming languages or rewriting stable components in C++.

Alex Orlov

Software Engineer at Instagram

Portland Ballroom 251 & 258

Saturday, 20th May, 17:10 - 17:40

Talks

Sunday, 21st May

Talks

13:10 - 13:40

Hacking Classic Nintendo Games with Python

Sam Agnew

Do you feel like using your superpowers as a developer to bring the games of your childhood into the future with the power of the Internet? In this live coded journey, we'll build an SMS powered 'Game Genie' allowing the audience to send text messages to manipulate the Nintendo games being played in real time. This will involve working with Flask, the Twilio API, the FCEUX NES emulator and how to bridge them with quick Lua scripts.

Sam Agnew

Sam Agnew is a developer evangelist at Twilio and loves inspiring and equipping developers around the world. He particularly enjoys being a member of the New York and Boston Python communities. The only thing he finds more satisfying than writing Python is playing fast guitar solos and figuring out how old video games work.

Portland Ballroom 252-253

Sunday, 21st May, 13:10 - 13:40

Talks

13:10 - 13:40

Share Your Code! Python Packaging Without Complication

Dave Forgac

If you want people to use your code you should package it! You may have heard that packaging is hard but the Python packaging ecosystem has evolved a lot over the years. Taking your beautiful code and sharing it with the world is complex but it doesn't have to be complicated. In this talk you will learn how to take advantage of modern tooling and practices so you can get boring stuff out of the way, publish quickly and frequently, and focus on your code. This talk will cover: A (brief) history of Python packaging, Python Packaging User Guide recommendations, Distribution formats, Anatomy of a package, Automating package creation, Adding: Testing, CI, Documentation, Testing package installation, Releasing to PyPI. This talk is for you if you're new to Python packaging and would like to learn how to share your code or if you've worked with Python for a while and just aren't up-to-date with the latest packaging practices.

Dave Forgac

Oregon Ballroom 201-202

Sunday, 21st May, 13:10 - 13:40

Talks

13:10 - 13:40

Probabilistic Programming with PyMC3

Christopher Fonnesbeck

Bayesian statistics offers robust and flexible methods for data analysis that, because they are based on probability models, have the added benefit of being readily interpretable by non-statisticians. Until recently, however, the implementation of Bayesian models has been prohibitively complex for use by most analysts. But the advent of probabilistic programming has served to abstract the complexity of Bayesian statistics, making such methods more broadly available. PyMC3 is an open-source Python module for probabilistic programming that implements several modern, computationally-intensive statistical algorithms for fitting Bayesian models, including Hamiltonian Monte Carlo (HMC) and variational inference. PyMC3’s intuitive syntax is helpful for new users, and the reliance on Theano for much of the computational work has allowed developers to keep the code base simple, making it easy to extend the software to meet analytic needs. PyMC3 itself extends Python's powerful 'scientific stack' of development tools, which provide fast and efficient data structures, parallel processing, and interfaces for describing statistical models.

Christopher Fonnesbeck

Portland Ballroom 251 & 258

Sunday, 21st May, 13:10 - 13:40

Talks

13:10 - 13:40

Piecing it Together: A beginner's guide to application configuration

Mary Nagle

Assembling all the necessary setup for an application you’re building can often be more frustrating than writing the app itself. Learning to do this well is difficult, especially for those who are new to Python and might not know where to begin or what questions to ask. While there is no “right way” to set up a development environment or application, understanding the components involved and how they interact can empower you to customize your setup to best suit your needs. This talk will dive into what happens when setting up a database, the purpose and configuration of an isolated environment, how Python packages are installed, and finally, how each of these components interacts with the others and the application itself; in particular, how an application's structure facilitates said interactions.

Mary Nagle

Engineer at Warby Parker

Portland Ballroom 254-255

Sunday, 21st May, 13:10 - 13:40

Talks

13:10 - 13:40

async/await and asyncio in Python 3.6 and beyond

Yury Selivanov

The talk gives an overview of async/await, asynchronous generators and comprehensions in Python 3.6 and the asyncio module. We'll discuss when and how asyncio should be used in modern applications and services, what uvloop is, and what asyncio frameworks and libraries one should use. I'll share our ideas about where asyncio is headed and what to expect in Python 3.7.

Yury Selivanov

Yury is a co-founder of MagicStack (magic.io) and a core Python developer since 2014. Yury authored and implemented PEPs 492, 525, and 530 (async/await support in Python), maintains the asyncio module with Guido, and created the popular asyncio packages uvloop and asyncpg.

Oregon Ballroom 203-204

Sunday, 21st May, 13:10 - 13:40

Talks

13:50 - 14:20

Asynchronous Python for the Complete Beginner

Miguel Grinberg

With the introduction of the asyncio package in Python 3.4, you can hear lots of people talking about asynchronous programming, most in a favorable way, some not so much. In this talk, I will tell you what this async fever is about and what it can do for you that regular Python can't, not only with asyncio, but also with other frameworks that existed long before it.

Miguel Grinberg

Oregon Ballroom 203-204

Sunday, 21st May, 13:50 - 14:20

Talks

13:50 - 14:20

The Next Step: Finding Model Parameters With Random Walks

Christine Waigl

The statistician John Tukey -- who designed the box plot and coined the term 'bit' -- wrote: 'An approximate answer to the right problem is worth a good deal more than an exact answer to an approximate problem'. Python has become one of the major languages for statistical data analysis, not least because of the expressiveness of the language itself and the availability of tools like Jupyter Notebooks, which enable iterative reasoning about a problem and its solutions. This talk takes one step beyond an introduction to statistics with Python and aims to familiarize the audience with two concepts: a class of problems (so-called inverse problems), and a powerful statistical tool (the random walk, or more formally Markov-Chain Monte Carlo (MCMC) sampling with the Metropolis algorithm). In inverse problems, model parameters are estimated from observational data. Both model and data are expected to be affected by error. The objective is not only to find parameters that best describe the observations, but also to figure out how good, or how possibly bad, a solution might be. Inverse problems are extremely common in many fields and crop up each time we attempt to reconstruct a reality from sensor, radar, scattering or imaging data. The Metropolis-Hastings algorithm offers a solution via random sampling of a Bayesian posterior distribution. Even though listed as one of the 20th century's top 10 algorithms by the journal Computing in Science & Engineering, the Metropolis algorithm is easy to understand and implement, and a fun and instructive way to explore even complicated multi-variate probability distributions.

Christine Waigl

Chris is a scientist working on wildfires in the forests of northern North America. She uses Python every day to handle satellite imagery, when she isn't out in the field, and has authored operational tools for research stations. Before turning her back on living in large European cities and running away to Alaska, she worked 10 years in the commercial software field in roles ranging from tech support to engineer and project manager on projects both internal and client-facing. She's also taught in the French school system and worked in the archives of a museum.

Portland Ballroom 251 & 258

Sunday, 21st May, 13:50 - 14:20

Talks

13:50 - 14:20

Python in The Serverless Era

Benny Bauer

Serverless is the latest phase in the evolution of cloud development. Its building blocks are functions, a bunch of stateless “nano-services”, that can scale automatically and are charged only when used. It enables teams to focus more on development while having fully managed servers. In this talk I'll cover Serverless Architecture practices, use cases, tooling and the role Python plays in it.

Benny Bauer

Benny Bauer is a cloud architecture expert and a Serverless enthusiast. As an ex-Autodesk software architect, in recent years he has been busy rebuilding the AutoCAD backend services, shifting the development from monolith to microservices, porting development to Python, establishing a DevOps culture and seeding best practices in the team.

Portland Ballroom 254-255

Sunday, 21st May, 13:50 - 14:20

Talks

13:50 - 14:20

Look mum no hands! From blinking LEDs to a bike speedometer with MicroPython

Kelsey Hightower (Google)

In this talk I will show you how to use a micro-controller to build a wifi enabled speedometer for your bike, using MicroPython. And some hardware. And a bike (maybe). I will introduce you to the world of MicroPython: a python distribution that runs on micro-controllers. Micro-controllers are small computers that are all around us: in cars, TVs, and your internet connected fridge. We will start with making LEDs blink, then serve webpages, build an interrupt handler and finally put it all together to make a wifi enabled speedometer for a bike.

Kelsey Hightower

Google

Portland Ballroom 252-253

Sunday, 21st May, 13:50 - 14:20

Talks

13:50 - 14:20

One Data Pipeline to Rule Them All

Sam Kitajima-Kimbrel

There are myriad data storage systems available for every use case imaginable, but letting application teams choose storage engines independently can lead to duplicated efforts and wheel reinvention. This talk will explore how to build a reusable data pipeline based on Kafka to support multiple applications, datasets, and use cases including archival, warehousing and analytics, stream and batch processing, and low-latency 'hot' storage.

Sam Kitajima-Kimbrel

Sam Kitajima-Kimbrel is a software engineer with many feels about distributed systems, data routing and storage, and usable APIs. He currently leads Twilio's Data Platform team, building scalable and reusable data infrastructure to support a 400-person R&D organization. Sam has a different hair color every month, enjoys cycling and cooking, and resides in the San Francisco Bay Area with his husband Kameron and their dog Basil.

Oregon Ballroom 201-202

Sunday, 21st May, 13:50 - 14:20

Talks

14:30 - 15:00

Hacking Cars with Python

Eric Evenchick

Modern cars are networks of computers, and a high end vehicle could have nearly 100 different computers inside. These devices control everything from the engine to the airbags. By understanding how these systems work, we can interface with vehicles to read data, perform diagnostics, and even modify operation. In this talk, we'll discuss pyvit, the Python Vehicle Interface Toolkit. This library, combined with some open source hardware, allows developers to talk to automotive controllers from Python. We will begin with an introduction to automotive networks, to provide a basis for understanding the tools. Next, we will look at the tools and show the basics of using them. Finally, we'll discuss real world applications of these tools, and how they're being used in the automotive world today.

Eric Evenchick

Eric Evenchick runs Linklayer Labs, a Canadian company focused on building open source hardware tools for security analysis of embedded systems. Eric's experience with cars began at the University of Waterloo, where he was part of a team developing alternative fuel vehicles. He was also an intern at Tesla Motors, and was one of the first people to find vulnerabilities at the company. His work on automotive security and open source hardware has been featured by several publications, including Wired and Forbes. He has spoken at a variety of conferences, including Blackhat, Toorcon, PyCon Canada and SecTor.

Portland Ballroom 252-253

Sunday, 21st May, 14:30 - 15:00

Talks

14:30 - 15:00

Gothic Colors: Using Python to understand color in nineteenth-century literature

Eleanor Stribling

Do you love literature and programming? Have you ever been curious about what the heck “Digital Humanities” are? Join us for a quick survey of what’s going on in this growing field and learn about a specific project, “Gothic Colors” where we set out to enumerate and analyze color references and mood in 19th century Gothic novels, using Python and a couple of popular libraries.

Eleanor Stribling

Portland Ballroom 254–255

Sunday, 21st May, 14:30 - 15:00

Talks

14:30 - 15:00

Building Stream Processing Applications

Amit Ramesh

Do you have a stream of data that you would like to process in real time? There are many components with Python APIs that you can put together to build a stream processing application. We will go through some common design patterns, tradeoffs and available components / frameworks for designing such systems. We will solve an example problem during the presentation to make these points concrete. Much of what will be presented is based on experience gained from building production pipelines for the real-time processing of ad streams at Yelp. This talk will cover topics such as consistency, availability, idempotency, scalability, etc.

Amit Ramesh

Amit Ramesh is a Software Engineer within the Ads group at Yelp. He revels in designing and building data processing pipelines. Outside of work he likes to dabble in abstract math, armchair physics and hand-wavy philosophy.

Oregon Ballroom 201-202

Sunday, 21st May, 14:30 - 15:00

Talks

14:30 - 15:00

Bayesian Statistical Analysis with Python

Eric J. Ma

You've got some data, and now you want to analyze it with Python. You're on your way to greatness! Now the problem comes: do I do the t-test? Chi-squared test? How do I decide? In this talk, inspired by many Pythonista Bayesians (@jakevdp, @allendowney, @twiecki, @fonnesbeck) before, I will show you how you can take common statistical decision problems, formulate them as a Bayesian analysis problem, and use PyMC3 as your workhorse tool for gaining insights. This talk will be math-light and code-heavy, and if you download the slides, you'll have a simple template for more complex Bayesian analysis down the road!

Eric J. Ma

Portland Ballroom 251 & 258

Sunday, 21st May, 14:30 - 15:00

Talks

14:30 - 15:00

aiosmtpd - A better asyncio based SMTP server

Barry Warsaw

smtpd.py has been in the standard library for many years. It's been a common tool for deploying SMTP and LMTP servers that handle email-based communication in Python, providing both basic protocol implementations and a fundamental module for higher level tools, such as lazr.smtptest for testing email clients. Based on asyncore and asynchat, smtpd.py is showing its age, and its API is unwieldy. Fortunately, there's a new alternative available. aiosmtpd is a modern reinvention based on asyncio, with all the improvements that come along with such a new implementation. It provides servers for both the SMTP and LMTP protocols, as well as a higher level 'controller' API for testing SMTP and LMTP clients. It exposes a much better API for customization, allowing the user to associate a simple 'handler' to process incoming messages without having to worry about the details of the protocols, and it provides some useful hooks for subclassing. This talk will describe the purpose and history of smtpd.py and aiosmtpd, show how users can extend the servers and implement specialized handlers, and show how applications can use the testing API for ensuring that their email sending applications do the right things. Examples will be taken from GNU Mailman 3, which uses aiosmtpd extensively.

Barry Warsaw

Oregon Ballroom 203-204

Sunday, 21st May, 14:30 - 15:00

Talks

Tracks List

Keynote

Sponsor tutorials

Summit

Talks

Tutorials

Rooms List

Keynotes venue

Oregon Ballroom 201-202

Oregon Ballroom 201–202

Oregon Ballroom 203-204

Oregon Ballroom 203–204

Portland Ballroom 251 & 258

Portland Ballroom 252-253

Portland Ballroom 252–253

Portland Ballroom 254-255

Portland Ballroom 254–255

Room 1

Room 2

Room 3

Room 4

Room 5

Room 6

Room 7

Room 8

Room 9

Room A105-106

Room B110-111

Room B118-119