Schedule

Wednesday, 17th May

09:00 - 12:20

An introduction to secure web development with Django and Python

You can't afford to have security be an optional or 'nice-to-have' feature in your applications. Luckily, Django has your back: this workshop will introduce you to thinking about security, cover a broad range of security concerns from the mundane to the arcane, and walk you through, in detail, how Django and the broader Django and Python ecosystems can help protect you and your users from them.

Beginning Python Bootcamp

Are you new to Python? Or do you feel like you grok the syntax, but would like to understand new idioms and where to use them? Want to watch an experienced Python developer create code from nothing? Instead of just covering the syntax, we will introduce most of Python as we build code together. Bring your laptop and we will program a predictive text engine from scratch together. Follow along as we start with IDLE (or your favorite editor) and a blank file, and end with a tested idiomatic Python module. It will learn from any text we pass into it, and predict characters or words for us. Just like your phone!
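
As a rough idea of what a predictive text engine can look like, here is a minimal sketch that counts which item most often follows another. The class name `Predictor` and its methods are illustrative assumptions, not the module built in the bootcamp.

```python
from collections import Counter, defaultdict


class Predictor:
    """Learn which item tends to follow which, then predict the next one."""

    def __init__(self):
        # Maps each item to a Counter of the items seen right after it.
        self.followers = defaultdict(Counter)

    def train(self, items):
        """Learn from a sequence: a string (characters) or a list of words."""
        items = list(items)
        for current, following in zip(items, items[1:]):
            self.followers[current][following] += 1

    def predict(self, item):
        """Return the most frequent follower of item, or None if unseen."""
        counts = self.followers.get(item)
        if not counts:
            return None
        return counts.most_common(1)[0][0]


words = Predictor()
words.train("the cat saw the cat run".split())
print(words.predict("the"))
```

Passing a plain string to `train` makes the same class predict characters instead of words.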

Contract-First API Development Using The OpenAPI Specification (Swagger)

Often developers will implement APIs and then only after they’re released think about things like specifications and documentation. Instead we can make the design of the API contract an explicit part of our development process using The OpenAPI Specification (Swagger) and open source tools. In this workshop we will: discuss the contract-first approach; build and validate a simple OpenAPI Specification; generate reference documentation and show how you can incorporate it with other docs; run a mock server so clients can test using the API; generate stub code based on the specification; implement a basic working API using Flask; show how you can iteratively add features and make changes; and discuss generating specifications for existing APIs. Participants will leave with: an understanding of how to incorporate a contract-first process into their API development workflow; an example specification that can be used as reference for their own API design; and working code for a basic API that can be used as a basis for their own development. Participants are expected to have a basic familiarity with HTTP / RESTful APIs, understanding of simple git operations, and some development experience.

Cross-platform Native GUI development with BeeWare

Exploratory data analysis in python

With the recent advancements in machine learning algorithms and statistical techniques, and the increasing ease of implementing them in Python, it is tempting to ignore the power and necessity of exploratory data analysis (EDA), the crucial step before diving into machine learning or statistical modeling. Simply applying machine learning algorithms without a proper orientation of the dataset can lead to wasted time and spurious conclusions. EDA allows practitioners to gain intuition for the pattern of the data, identify anomalies, narrow down a set of alternative modeling approaches, devise strategies to handle missing data, and ensure correct interpretation of the results. Further, EDA can rapidly generate insights and answer many questions without requiring complex modeling. Python is a fantastic language not only for machine learning, but also for EDA. In this tutorial, we will walk through two hands-on examples of how to perform EDA using Python and discuss various EDA techniques for cross-section data, time-series data, and panel data. One example will demonstrate how to use EDA to answer questions, test business assumptions, and generate hypotheses for further analysis. The other example will focus on performing EDA to prepare for modeling. Between these two examples, we will cover: data profiling and quality assessment; basic describing of the data; visualizing the data, including interactive visualizations; identifying patterns in the data (including patterns of correlated missing data); dealing with many attributes (columns); dealing with large datasets using sampling techniques; informing the engineering of features for future modeling; identifying challenges of using the data (e.g. skewness, outliers); and developing an intuition for interpreting the results of future modeling. The intended audience for this tutorial is aspiring and practicing data scientists and analysts, or anyone who wants to be able to get insights out of data.
Students must have at least an intermediate-level knowledge of Python; some familiarity with analyzing data would be beneficial. Installation of Jupyter Notebook will be required (and potentially, we will also demonstrate analysis in JupyterLab, if its development in the next few months allows). Instructions will be sent on what packages to install beforehand.

Fantastic Data and Where To Find Them: An introduction to APIs, RSS, and Scraping

Whether you’re building a custom web application, getting started in machine learning, or just want to try something new, everyone needs data. And while the web offers a seemingly boundless source for custom data sets, the collection of that data can present a whole host of obstacles. From ever-changing APIs to rate-limiting woes, from nightmarishly nested XML to convoluted DOM trees, working with APIs and web scraping are challenging but critically useful skills for application developers and data scientists alike. In this tutorial, we’ll introduce RESTful APIs, RSS feeds, and web scraping in order to see how different ingestion techniques impact application development. We’ll explore how and when to use Python libraries such as feedparser, requests, beautifulsoup, and urllib. And finally we will present common data collection problems and how to overcome them. We’ll take a hands-on, directed exercise approach combined with short presentations to engage a range of different APIs (with and without authentication), explore examples of how and why you might web scrape, and learn the ethical and legal considerations for both. To prepare attendees to create their own data ingestion scripts, the tutorial will walk through a set of examples for robust and responsible data collection and ingestion. This tutorial will conclude with a case study of Baleen, an automated RSS ingestion service designed to construct a production-grade text corpus for NLP research and machine learning applications. Exercises will be presented both as Jupyter Notebooks and Python scripts.

Network Analysis Made Simple

Have you ever wondered about how those data scientists at Facebook and LinkedIn make friend recommendations? Or how epidemiologists track down patient zero in an outbreak? If so, then this tutorial is for you. In this tutorial, we will use a variety of datasets to help you understand the fundamentals of network thinking, with a particular focus on constructing, summarizing, and visualizing complex networks.

Using Functional Programming for efficient Data Processing and Analysis

As a multi-paradigm language, Python has great support for functional programming. For better or for worse, leading data libraries such as Pandas eschew this style for object-oriented programming. This tutorial will explain how to take advantage of Python's excellent functional programming capabilities to efficiently obtain, clean, transform, and store data from disparate sources.

Web programming from the beginning

Modern web frameworks such as Django and Flask are immensely powerful. However, these useful tools obscure the foundations of network programming upon which they are based, which can be very helpful to understand. So instead of building useful applications with these libraries, let's experiment with sockets! At this tutorial, a Python-flavored history of the web will be presented and attendees will write or modify a TCP chat client, a static site web server, an HTTP client, a CGI script, and a WSGI-compliant server and web application. We will learn what all those things are and how they fit together, bringing the architecture of modern web apps into better focus. The material will be accessible to participants with no web development experience; however, they must be able to write and run Python scripts at the command line. This tutorial might appeal to someone also attending an introductory web development tutorial, but it covers separate, complementary material. Web development experience is not required but a little exposure would be helpful; for instance, installing Flask and running the minimal application on the quickstart page. Similarly, prior exposure to HTML would be useful but is not necessary.
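
To give a flavor of the WSGI layer mentioned in the abstract, here is a minimal sketch of a WSGI application that is called by hand, the way a server would call it. The names `hello_app` and `call_app` are illustrative assumptions and are not taken from the tutorial materials.

```python
from wsgiref.util import setup_testing_defaults


def hello_app(environ, start_response):
    """A minimal WSGI application: a callable taking environ and start_response."""
    body = f"Hello from {environ['PATH_INFO']}\n".encode("utf-8")
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]  # WSGI applications return an iterable of bytes


def call_app(app, path="/"):
    """Play the role of a WSGI server for one request (hypothetical helper)."""
    environ = {}
    setup_testing_defaults(environ)  # fills in a plausible default environ
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        # A real server would start writing the HTTP response here.
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], dict(captured["headers"]), body


status, headers, body = call_app(hello_app, "/chat")
print(status, body)
```

The same callable can be served over HTTP with the standard library, for example `wsgiref.simple_server.make_server("", 8000, hello_app).serve_forever()`.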

10:00 - 16:00

Python Language Summit

The Python Language Summit is an invitation-only event for the developers of Python implementations (CPython, IronPython, Jython, Parrot, PyPy, etc.) to share information, discuss our shared problems, and — hopefully — solve them. These issues might be related to the language itself, the standard library, the development process, Python 2 and 3, the documentation, package index, web site, et cetera. The Summit will focus on discussion more than on presentations.

11:00 - 12:30

Red Hat: Deploying Python web applications to OpenShift/Kubernetes

This will be a hands-on workshop where you will get to experience for yourself how easy it is to deploy a Python web application to OpenShift. The latest version of OpenShift is implemented on top of Kubernetes for container orchestration and Docker for the container runtime. On top of these tools OpenShift adds its own special magic sauce to even further simplify the deployment of applications. In the workshop you will learn how to deploy a Python web application directly from a Git repository holding the application source code, with the build process being handled by the Source-to-Image (S2I) tool. Next you will deploy a database from a pre-existing Docker-formatted container image and learn how to hook your Python web application up to it. Finally you will configure a Git repository webhook to automate the deployment process so that every time you commit and push up changes your application will be automatically rebuilt and deployed. During the workshop we will be throwing in various other tidbits to help explain what OpenShift is, how it works and how it can help you to host not only your Python web site, but also more complex applications, be they legacy systems, or new microservice architecture applications, in any language. For the workshop, you will be provided access to an online instance of OpenShift Origin with everything you need. The only piece of software you will need to install locally on your own computer will be a single program binary for our command line client for OpenShift.

Twist Bioscience: Using Python in Synthetic DNA Manufacturing

A quick introduction to how Twist Bioscience is enabling world-changing innovation such as DNA computing and turning sugar into vaccines. Rust Bindings with Python: we will demonstrate how we combine the best of Rust and Python by creating a Rust library with Python bindings. Hardware Scripting with Embedded IronPython: we’ll demonstrate how we use the flexibility of the IronPython runtime to make our hardware scriptable and enable debugging within the same process. This will require a Windows machine or VM image with Visual Studio installed (free community install). Polymorphic Marshmallowed JSON with SQLAlchemy: this exercise will guide you through creating polymorphic objects that are transparently serialized in and out of Postgres JSON fields. We will also work on creating graph objects with mutability tracking.

13:20 - 16:20

Decorators and descriptors decoded

Python developers use decorators and descriptors on a daily basis, but many don't understand them well enough to create (or debug) them. Decorators are widely deployed in popular Python Web frameworks. Descriptors are the key to the database mappers used with those frameworks, but under the covers they play an even more crucial role in Python as the device that turns plain functions into bound methods, setting the value of the self argument. This tutorial is a gentle introduction to these important language features, using a test-driven presentation and exercises, and covering enhancements in Python 3.6 that make class metaprogramming easier to get right. Decorators without closures are presented first, highlighting the difference between run time and import time that is crucial when metaprogramming. We then get a firm grounding on closures and how they are implemented in Python, before moving to higher-order function decorators and class decorators. Coverage of descriptors starts with a close look at Python's property built-in function and dynamic attribute lookup. We then implement some ORM-like field validation descriptors, encounter a usability problem, and leverage PEP 487 -- Simpler customisation of class creation -- to solve it. Alternative implementations using a class decorator and a metaclass will be contrasted to the PEP 487 solution.
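
A field validation descriptor that relies on PEP 487 might look roughly like the sketch below. The `__set_name__` hook (Python 3.6+) tells the descriptor its own attribute name, so it does not have to be repeated by hand. The names `Positive` and `LineItem` are illustrative assumptions, not the tutorial's own code.

```python
class Positive:
    """ORM-like field descriptor that rejects non-positive values."""

    def __set_name__(self, owner, name):
        # PEP 487: called once at class creation with the attribute name.
        self.public_name = name
        self.storage_name = "_" + name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # accessed on the class, not an instance
        return getattr(instance, self.storage_name)

    def __set__(self, instance, value):
        if value <= 0:
            raise ValueError(f"{self.public_name} must be positive")
        setattr(instance, self.storage_name, value)


class LineItem:
    weight = Positive()
    price = Positive()

    def __init__(self, weight, price):
        self.weight = weight  # goes through Positive.__set__
        self.price = price


item = LineItem(weight=2, price=9.5)
print(item.weight, item.price)
```

Before Python 3.6, getting the attribute name into the descriptor required passing it explicitly, or using a class decorator or metaclass, which are the alternatives the abstract mentions.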

13:20 - 16:40

bokeh: Data Visualization in Python

Bokeh is a powerful data visualization library that creates fully interactive plots and integrates well with the data analysis tools you already know and love: pandas, matplotlib, seaborn, ggplot. Bokeh can produce stand-alone browser-based plots and much more sophisticated server-hosted visualizations. Learn to use bokeh to create everything from basic graphs to advanced interactive plots, dashboards, and data applications. You will: incorporate bokeh within your Jupyter/IPython notebooks; partner bokeh with other libraries such as matplotlib, seaborn, pandas, and ggplot; learn about bokeh server, which serves up even more impressive real-time visualizations; explore configurations and settings; and recognize and overcome common problems.

Complexity Science

Complexity Science is an approach to modeling systems using tools from discrete mathematics and computer science, including networks, cellular automata, and agent-based models. It has applications in many areas of natural and social science. Python is a particularly good language for exploring and implementing models of complex systems. In this tutorial, we present material from the draft second edition of Think Complexity, and from a class we teach at Olin College. We will work with random networks using NetworkX, with cellular automata using NumPy, and we will implement simple agent-based models.

ContainerOrchestration.py: The tutorial session

Container Orchestration is the new hot topic in the design of scalable system architecture. In this tutorial we look at the main choices for container orchestrators: Docker Swarm, Kubernetes and Apache Mesos. We will look at the use of the respective Python APIs for interacting with those engines. This 3-hr session will provide hands-on use of those orchestrators with real use-cases.

How to Write and Debug C Extension Modules

The CPython interpreter allows us to implement modules in C for performance critical code or to interface with external libraries while presenting users with a high level Python API. This tutorial will teach you how to leverage the power of C in your Python projects. We will start by explaining the C representation of Python objects and how to manipulate them from within C. We will then move on to implementing functions in C for use in Python. We will discuss reference counting and correct exception handling. We will also talk about how to package and build your new extension module so that it may be shared on PyPI. (We will only be covering building extension modules on GNU/Linux and OSX, not Windows). After the break, we will show how to implement a new type in C. This will cover how to hook into various protocols and properly support cyclic garbage collection. We will also discuss techniques for debugging C extension modules with gdb using the CPython gdb extension.

Introduction to Statistical Modeling with Python

This intermediate-level tutorial will provide students with hands-on experience applying practical statistical modeling methods on real data. Unlike many introductory statistics courses, we will not be applying 'cookbook' methods that are easy to teach, but often inapplicable; instead, we will learn some foundational statistical methods that can be applied generally to a wide variety of problems: maximum likelihood, bootstrapping, linear regression, and other modern techniques. The tutorial will start with a short introduction on data manipulation and cleaning using pandas, before proceeding on to simple concepts like fitting data to statistical distributions, and how to use Monte Carlo simulation for data analysis. Slightly more advanced topics include bootstrapping (for estimating uncertainty around estimates) and flexible linear regression methods using Bayesian methods. By using and modifying hand-coded implementations of these techniques, students will gain an understanding of how each method works. Students will come away with knowledge of how to deal with very practical statistical problems, such as how to deal with missing data, how to check a statistical model for appropriateness, and how to properly express the uncertainty in the quantities estimated by statistical methods.

Microservices with Python and Flask

Microservices are receiving the buzzword treatment these days, and as such, they have a cloud of hype surrounding them that makes it hard to separate substance from fluff. In this tutorial, Miguel Grinberg starts with an introduction to this architecture, including what's great and not so great about it, and then teaches you how a traditional monolithic application written in Flask can be refactored into a modern distributed system based on microservices.

Readable Regular Expressions

What are regular expressions, what are they useful for, and why are they so hard to read? We'll learn what regular expressions are good for, how to make our own regular expressions, and how to make our regular expressions friendly and readable (yes it's possible, sort of).
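
One common way to make a pattern more readable is the standard library's `re.VERBOSE` flag together with named groups. The date pattern below is only an illustrative assumption, not an example taken from the tutorial.

```python
import re

# Compact form: correct, but hard to read at a glance.
TERSE = re.compile(r"^(\d{4})-(\d{2})-(\d{2})$")

# Same pattern with re.VERBOSE: whitespace is ignored and comments are allowed.
READABLE = re.compile(r"""
    ^
    (?P<year>  \d{4} )  # four-digit year
    -
    (?P<month> \d{2} )  # two-digit month
    -
    (?P<day>   \d{2} )  # two-digit day
    $
""", re.VERBOSE)

match = READABLE.match("2017-05-17")
print(match.group("year"), match.group("month"), match.group("day"))
```

Both patterns accept the same strings; the verbose one documents itself and lets callers use `match.group("year")` instead of counting parentheses.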

Time Series Analysis

Time series analysis is more relevant than ever with the rise of big data, the internet of things, and the general availability of data that follows events through time. This tutorial will introduce participants to the many versatile tools Python offers for exploring, analyzing, and predicting time series data. The tutorial will be a mix of lecture and practice, and it will be broken down into four components: (1) handling timestamped data in Python; (2) commonly encountered problems with time series; (3) time series prediction exercises; and (4) time series classification exercises.

13:30 - 15:00

Clover Health: Transforming and Analyzing Healthcare Data with Python

This workshop will give you an introduction to how we use Python for testing, analysis, and processing at Clover. This includes a walkthrough of our tech stack along with a dive into two use cases. The first use case is from a Data Science perspective which will go over how we test SQL queries in our data pipeline. This will get into an example of statistical modeling in a particular insurance operations context. The second use case is from an Engineering perspective which will show how we transform nested JSON structures into consumable flat table structures. This will also touch on techniques for processing large amounts of data. Clover uses lots of Python tools and libraries which we're happy to discuss. We rely heavily on Postgres as our primary database solution. However, this talk will highlight SQLAlchemy, Jupyter Notebook, pytest, generators, partial functions, and LRU caching.
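
Turning nested JSON into a flat, table-like row can be sketched with a recursive generator, as below. This is only an assumption about the general technique; the function name `flatten`, the dotted column naming, and the sample record are hypothetical and not Clover's actual code or data.

```python
def flatten(record, parent_key="", sep="."):
    """Yield (column, value) pairs for every leaf in a nested dict or list."""
    if isinstance(record, dict):
        items = record.items()
    elif isinstance(record, list):
        items = enumerate(record)  # list positions become part of the column name
    else:
        yield parent_key, record  # a leaf value: emit one column
        return
    for key, value in items:
        column = f"{parent_key}{sep}{key}" if parent_key else str(key)
        yield from flatten(value, column, sep)


# Hypothetical nested record, for illustration only.
claim = {"id": 7, "member": {"name": "Ada", "plans": ["A", "B"]}}
row = dict(flatten(claim))
print(row)
```

Because `flatten` is a generator, columns are produced lazily, which helps when records are large or streamed one at a time.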

Intel: Bring deep learning to the fingertips of data scientists with Python & BigDL on Apache Spark

We have seen the data science and big data communities begin to engage further with artificial intelligence and deep learning technologies, and efforts to bridge the gap between the deep learning community and the data science / big data communities are beginning to emerge. However, developing deep neural nets is an intricate procedure, and scaling that to big data scale is an even more challenging process. Therefore, deep learning tools and frameworks, especially visualization support, that can run smoothly on top of big data platforms are essential for scientists to understand, inspect and manipulate their big models and big data. In this talk, we will share how we bring deep learning to the fingertips of big data users and data scientists, by providing visualizations (through widely used frameworks such as Jupyter Notebooks and/or TensorBoard) as well as Python toolkits (e.g., NumPy, SciPy, scikit-learn, NLTK, Keras, etc.) on top of BigDL, an open source distributed deep learning library for Apache Spark. In addition, we will also share how real-world big data users and data scientists use these tools to build AI-powered big data analytics applications.

15:30 - 17:00

Intel: Scalable, distributed deep learning with Python and Pachyderm

Thursday, 18th May

09:00 - 10:30

Intel: Accelerating Python across the range of applications: the right tools for the job

Python's popularity has led to its use in many areas, from web frameworks all the way to machine learning and scientific computing. However, getting the best performance from Python requires an intimate knowledge of the right tools and techniques that are available today. In this tutorial, participants will learn how to measure, tune and accelerate Python workflows across various domains. This tutorial will cover the following topics: performance speedups for scientific computing using Intel® Distribution for Python, multithreading with the Intel® Threading Building Blocks library, Numba, and Intel® VTune Amplifier; data analytics and machine learning acceleration with pyDAAL; and web framework, scripting, and infrastructure acceleration using the PyPy JIT.

09:00 - 12:20

Build a data pipeline with Luigi

Luigi is a Python library for building pipelines of batch processes. It 'handles dependency resolution, workflow management, visualization, handling failures, command line integration, and much more'. In this tutorial, we will use Luigi to build a data pipeline that runs a series of interdependent jobs. We will also discuss some real-world use cases for Luigi, and show how it can make running a data pipeline much more robust and reliable.

Creating And Consuming Modern Web Services with Twisted

This tutorial will show students how to write applications and services which efficiently publish and consume services and APIs. To do so, we will combine 4 Python-based technologies: Jupyter is a real-time development environment. Twisted is a powerful platform for network programming that supports many protocols, including HTTP. Klein is a Twisted-based web application framework. Treq is a Requests-style HTTP client based on Twisted. By combining all of these we will guide students through interactively prototyping a production quality web application that publishes both service APIs and web resources such as HTML, and that can efficiently consume many back-end services such as 3rd-party APIs.

Deploy and scale containers with Docker native, open source orchestration

Deploy your own cluster! Use it to 'build, ship, and run' containerized applications! Learn how to implement logging, metrics, stateful services, and more! Learn the True Way of DevOps! Alright, we can't promise anything about the True Way of DevOps, but everything else will definitely be in this tutorial. We will run a demo app featuring Python components and see some best practices to 'Dockerize' Python code and Flask in particular; but the tutorial also includes other languages and frameworks. Come with your laptop! You don't need to install anything before the workshop, as long as you have a web browser and an SSH client. Each student will have their own private cluster during the tutorial, to get immediately applicable first-hand experience.

Faster Python Programs - Measure, don't Guess

Optimization can often help to make Python programs faster or use less memory. Developing a strategy, establishing solid measuring and visualization techniques, and knowing about algorithmic basics and data structures are the foundation for a successful optimization. The tutorial will cover these topics. Examples will give you hands-on experience of how to approach optimization efficiently. Python is a great language. But it can be slow compared to other languages for certain types of tasks. If applied appropriately, optimization may reduce program runtime or memory consumption considerably. But this often comes at a price. Optimization can be time consuming and the optimized program may be more complicated. This, in turn, means more maintenance effort. How do you find out if it is worthwhile to optimize your program? Where should you start? This tutorial will help you to answer these questions. You will learn how to find an optimization strategy based on quantitative and objective criteria. You will experience that one's gut feeling about what to optimize is often wrong. The solution to this problem is: "Measure, Measure, and Measure!". You will learn how to measure program run times as well as profile CPU and memory. There are great tools available. You will learn how to use some of them. Measuring is not easy because, by definition, as soon as you start to measure, you influence your system. Keeping this impact as small as possible is important. Therefore, we will cover different measuring techniques. Furthermore, we will look at algorithmic improvements. You will see that the right data structure for the job can make a big difference. Finally, you will learn about different caching techniques.
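
As a small illustration of measuring rather than guessing, the sketch below uses the standard library's `timeit` to compare membership tests on a list and on a set. The sizes and repeat counts are arbitrary assumptions, and the timings will differ from machine to machine.

```python
import timeit

N = 10_000
as_list = list(range(N))
as_set = set(as_list)
needle = N - 1  # worst case for the list: the last element

# timeit calls each lambda `number` times and returns the total seconds.
list_time = timeit.timeit(lambda: needle in as_list, number=200)
set_time = timeit.timeit(lambda: needle in as_set, number=200)

print(f"list: {list_time:.6f}s  set: {set_time:.6f}s")
```

Both containers give the same answer; only the measurement shows how much the choice of data structure costs for this access pattern.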

Hands-On Intro to Python for New Programmers

Brand new to programming and want to get some hands-on Python experience? Let's learn some Python together! During this tutorial we will work through a number of programming exercises together. We'll be doing a lot of asking questions, taking guesses, trying things out, and seeking out help from others. In this tutorial we'll cover: types of things in Python (strings, numbers, lists); conditionally executing code; repeating code with loops; and getting user input.

Intro to Bayesian Machine Learning with PyMC3 and Edward

Probabilistic programming and Bayesian statistics have been on the rise. These techniques are tremendously useful, because they help us to understand, to explain, and to predict data through building a model that accounts for the data and is capable of synthesizing it. This is called the generative approach to statistical pattern recognition. Estimating the parameters of Bayesian models has always been hard, impossibly hard actually in many cases for anyone but experts. However, recent advances in probabilistic programming have endowed us with tools to estimate models with a lot of parameters and for a lot of data. In this tutorial, we will discuss two of these tools, PyMC3 and Edward. These are black box tools, Swiss Army knives for Bayesian modeling that do not require knowledge of calculus or numerical integration. This puts the power of Bayesian statistics into the hands of everyone, not only experts of the field. And, it's great that these are implemented in Python with its rich, beginner-friendly ecosystem. It means we can immediately start playing with it. We have planned three awesome parts, spread over three awesome hours: First hour: Introduction to Bayesian machine learning. Second hour: Baby steps in PyMC3 and Edward. Third hour: Solve a real-world problem with PyMC3 or Edward (model, fit, criticize).

IPython and Jupyter in Depth: High productivity, interactive Python

IPython and Jupyter provide tools for interactive computing that are widely used in scientific computing, education, and data science, but can benefit any Python developer. You will learn how to use IPython in different ways, as: an interactive shell, a graphical console, a network-aware VM (Virtual machine) in GUIs, a web-based notebook combining code, graphics and rich HTML. We will demonstrate how to deploy a custom environment with Docker that not only contains multiple Python kernels but also a couple of other languages.

Let's build a web framework!

If you're building a web app, you probably reach for your favorite framework -- Django, Flask, Pyramid, etc. But we rarely stop to think about what these tools are doing under the hood. In this hands-on tutorial, you'll gain a deeper understanding of what frameworks are and how they work by implementing your own framework from scratch. We'll build a complete (if minimal) web framework that handles the WSGI request/response cycle, routing, controllers, templating, and a data layer. Along the way you'll gain a deeper understanding of the decisions web frameworks make, their relative merits, and inner workings.

Mastering scipy.spatial

The heavily-used scipy library is so large that each of the major modules could fill its own tutorial syllabus. It is also production-quality software with a 1.0 release imminent. In this tutorial, my focus is to cover the scipy.spatial component of the library in great detail, from the perspective of a heavy user and active developer of the computational geometry components of scipy. From distance matrices to Voronoi diagrams and Hausdorff distances, we will explore the corners of scipy.spatial code--both long-established features and even proposed features that haven't yet made it into a stable release.

09:00 - 17:00

Python Education Summit

In 2017, PyCon will be holding its fifth annual Python Education Summit. The Summit is a gathering of teachers and educators focused on bringing coding literacy, through Python, to as broad a group of audiences as possible. We invite educators from all venues to consider joining the discussion, share insights, learn new techniques and tools and generally share their passion for education. We are looking for educators from many venues: authors; schools, colleges, universities; community-based workshops; online programs; and government. Not only will we have a wide array of full-blown talks, we will also have a round of lightning talks!

Python Education Summit

Join us for a 5k run along the beautiful Willamette River in Portland, OR.

11:00 - 12:30

Anaconda: Data Science Apps with Anaconda

Anaconda provides a rich foundation of Python and R packages for data science. This tutorial will demonstrate how Anaconda can be used to turn simple models, scripts, or Jupyter notebooks into deployable applications. Participants should have Anaconda installed and have basic Python programming experience. We'll make use of machine learning and AI libraries such as Pandas, Scikit-learn, Tensorflow, and Keras. The tutorial will also demonstrate the app deployment capabilities of Anaconda Cloud.

Datadog: Distributed Tracing for Python

Tracing is a specialized form of logging that is designed to work effectively in large, distributed environments. When done right, tracing follows the path of a request across process and service boundaries. This provides a big step-up in application observability, and can help inform a developer why certain requests are slow, or why they might have behaved unexpectedly. This tutorial will familiarize users with the benefits of tracing, and describe a general toolkit for emitting traces from Python applications in a minimally intrusive way. We will walk through a simple example app, which receives an HTTP request, and gradually instrument it to be observable via traces. We will discuss language constructs that can generate traces - namely decorators, monkey-patching and context managers - and give users hints on how they might add tracing to their own applications and libraries. In the process users will become familiar with the existing standards for modelling traces, and some of the challenges involved in adhering to this model in a distributed, asynchronous environment.
13:20 - 16:40

Applied Modern Cryptography in Python

Today we use cryptography almost everywhere, from surfing the web over HTTPS to working remotely over SSH. Although most developers don't need to implement cryptographic primitives, knowledge and understanding of these building blocks allows them to deploy them better in their applications. Modern crypto gives us all the building blocks needed to develop secure applications. However, we see instances of insecure code everywhere. Most of these vulnerabilities are not because of theoretical shortcomings, but due to bad implementation or flawed protocol design. Cryptography is a delicate art where nuances matter, and failure to comprehend the subtleties of these building blocks leads to critical vulnerabilities. To add insult to injury, most of the resources available are either outdated or wrong, and, inarguably, using bad crypto is more dangerous than not using it. In this tutorial we look at the basic building blocks of modern cryptography. We will cover encryption techniques, hashing mechanisms, and key derivation algorithms. Furthermore, we review two of the most widely used protocol suites, SSL and PGP. We conclude by implementing a simplified version of Pretty Good Privacy (PGP), which is used for encryption of texts, e-mails, files, directories, and whole disk partitions.

Best Testing Practices for Data Science

So you're a data scientist wrangling with data that's continually avalanching in, and there's always errors cropping up! NaNs, strings where there are supposed to be integers, and more. Moreover, your team is writing code that is getting reused, but that code is failing in mysterious places. How do you solve this? Testing is the answer! In this tutorial, you will gain practical hands-on experience writing tests in a data science setting so that you can continually ensure the integrity of your code and data. You will learn how to use py.test, coverage.py, and hypothesis to write better tests for your code.

Django Admin: Basics and Beyond

Django's admin is a great tool but it isn't always the easiest or friendliest to set up and customize. The ModelAdmin class has a lot of attributes and methods to understand and come to grips with. On top of these attributes, the admin's inlines, custom actions, custom media, and more mean that, really, you can do anything you need with the admin...if you can figure out how. The docs are good but leave a lot to experimentation and the code is notoriously dense. In this tutorial, you'll learn the basics of setting up the admin so you can get your job done. Then we'll dive deeper and see how advanced features like autocomplete, Markdown editors, image editors, and others would be added to make the admin really shine.

Effectively running python applications in Kubernetes/OpenShift

Google, Red Hat, Intel, Huawei, Mirantis, Deis and many, many others are investing a lot of time and effort into improving Kubernetes. I bet you have encountered that name at least once in the past twelve months, either on Hacker News, Reddit, or somewhere else. Do you want to learn more about the best container orchestration in the universe, but were afraid of the setup complexity? Do you want to see how easy it is to run any application using containers? Do you want to experience the joy of scaling an application with a single click? This, and a lot more, will be discussed in detail. In this tutorial, every attendee will be provided with an environment and the step-by-step instructions necessary to set up the environment and build and deploy a microservices-based sample application. Alternatively, a sample application of any choosing can be used throughout the entire tutorial. All of that will be performed on OpenShift, which is a Red Hat distribution of Kubernetes with some add-ons that will be described in detail at the beginning of the tutorial. To whet your appetite even more, here are some of the topics we are going to cover: git integration, image registry integration, application scaling, container security, batch tasks and much more. After the session, every person will be able to play around with the accompanying code repository that was used in the tutorial, which includes detailed instructions on how to run it on your own from scratch.

Intermediate Python Bootcamp

Are you new to Python and want to learn to step it up to the next level? Have you heard about closures, decorators, context managers, generators, list comprehensions, or generator expressions? What are these and why do advanced Pythonistas keep mentioning them? Don't be intimidated: learn to take advantage of these to make your own code more idiomatic. This hands-on tutorial will cover these intermediate subjects in detail. We will modify existing Python code to take advantage of them. We will start with a basic file, and then introduce these features into it using the REPL, command line, and tests. The audience will get to follow along using their own computer and editor of choice (or can use IDLE, as the instructor does). We will teach the 'code smells' to look for. You will know when you should apply these new techniques to your code.

Introduction to Digital Signal Processing

Spectral analysis is an important and useful technique in many areas of science and engineering, and the Fast Fourier Transform is one of the most important algorithms, but the fundamental ideas of signal processing are not as widely known as they should be. Fortunately, Python provides an accessible and enjoyable way to get started. In this tutorial, I present material from my book, Think DSP, and from a class I teach at Olin College. We will work with audio signals, including music and other recorded sounds, and visualize their spectrums and spectrograms. We will synthesize simple sounds and learn about harmonic structure, chirps, filtering, and convolution.

IoT Lab with Micropython and Friends

Come learn about the Internet of Things and Micropython in this hands-on hardware tutorial, no soldering or hardware experience required! We will be building a wifi-enabled temperature sensor as a vehicle for learning IoT concepts including data capture, building security into data transmission, and messaging between IoT clients and servers. Attendees will have an opportunity to take their sensors out into the conference venue to take measurements, reconvening to discuss analysis and visualization of IoT data. All the hardware needed will be provided, and attendees will be able to program the devices via a locally-hosted web interface. This tutorial will be a great introduction for folks interested in Internet of Things, Micropython, or hardware hacking. You do not need prior experience in any of these topics to attend. We will be using the ESP8266 micro controller and the MQTT protocol for messaging. Attendees should download the mosquitto MQTT broker in addition to collateral that will be sent out to attendees ahead of the tutorial. We will be using the WebREPL interface for programming the ESP8266. If you are wondering what the heck all of this means do not despair! These topics will be covered in the tutorial.

Parallel Data Analysis

An overview of parallel computing techniques available from Python and hands-on experience with a variety of frameworks. This course has two primary goals: 1. Teach students how to reason about parallel computing 2. Provide hands-on experience with a variety of different parallel computing frameworks. Students will walk away with both a high-level understanding of parallel problems and how to select and use an appropriate parallel computing framework for their problem. They will get hands-on experience using tools both on their personal laptop, and on a cluster environment that will be provided for them at the tutorial. For the first half we cover programming patterns for parallelism found across many tools, notably map, futures, and big-data collections. We investigate these common APIs by diving into a sequence of examples that require increasingly complex tools. We learn the benefits and costs of each API and the sorts of problems where each is appropriate. For the second half, we focus on the performance aspects of frameworks and give intuition on how to pick the right tool for the job. This includes common challenges in parallel analysis, such as communication costs, debugging parallel code, as well as deployment and setup strategies.
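
To make the "map" and "futures" patterns mentioned above concrete, here is a small sketch using the standard library's `concurrent.futures`. It is only an illustration, not material from the tutorial. The function `square` is a hypothetical stand-in for real work, and a thread pool is used for simplicity even though CPU-bound work in CPython would normally call for processes or one of the frameworks the tutorial covers.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def square(x):
    """Hypothetical stand-in for an expensive computation."""
    return x * x


with ThreadPoolExecutor(max_workers=4) as pool:
    # map pattern: apply one function to many inputs; results keep input order.
    mapped = list(pool.map(square, range(5)))

    # futures pattern: submit tasks individually and collect each as it finishes.
    futures = {pool.submit(square, n): n for n in range(5)}
    completed = {futures[f]: f.result() for f in as_completed(futures)}
```

The map form is the simplest to reason about. The futures form trades that simplicity for control, since results can be handled in completion order and each task can be tracked separately.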

Python Epiphanies

This tutorial is for developers who've been using Python for a while and would consider themselves at an intermediate level, but are looking for a deeper understanding of the language. It focuses on how Python differs from other languages in subtle but important ways that are often confusing, and it demystifies a number of language features that are sometimes misunderstood.
13:30 - 15:00

Anaconda: Accelerating your Python Data Science code with Dask and Numba

Anyone doing numerical computing with Python will have run into performance barriers. Using Anaconda is a great start to get a suite of extension packages where the underlying data structures and algorithms are written in C or Fortran. We'll briefly review the state of numerical computing in Python, look at some examples to help you remember why you should use NumPy-based packages whenever possible, and focus on two options for acceleration: faster serial computing or parallelization. Continuum Analytics has developed two popular open source packages to address these issues: Numba, which provides an LLVM-based JIT that can be easily accessed just through a decorator; and Dask, which provides a distributed computing framework and some high quality data structures that are similar to a Pandas DataFrame or a NumPy ndarray. Participants should have the latest release of Anaconda installed and have some familiarity with Python in order to follow along interactively with the tutorial, where we'll learn how to efficiently leverage Dask and Numba.
15:30 - 17:00

Elasticsearch and Python: Doing more with open source tools

Elasticsearch, a distributed, RESTful search and analytics engine, has a wide variety of capabilities that can be used from Python. In this workshop we will explore several different use cases and showcase how the associated Python libraries can be used to help you. It is intended for intermediate users who have basic familiarity with Elasticsearch and want to further their understanding. Some of the topics that will be covered are: bulk loading data into Elasticsearch; efficiently building queries and aggregations; using Elasticsearch for persistence in your application; and syncing data in Elasticsearch and other data stores. Attendees will leave the workshop with the skills to integrate Elasticsearch into their apps and additional tools for analyzing data with Python.

Google: How I Learned to Stop Worrying and Love Python at Google | Tech Talk Series

Metis: Natural Language Processing and Text Mining in Python

This session will cover NLP and text mining using Python and offer several examples of real world applications. Participants will be introduced to various text processing techniques and learn about text classification, clustering, and topic modeling. By the end of the workshop, participants will be able to use Python to explore and build their own models on text data. Michael Galvin is the Executive Director of Data Science at Metis. He came to Metis from General Electric where he worked to establish their data science strategy and capabilities for field services and to build solutions supporting Global operations, risk, engineering, sales, and marketing. Prior to GE, Michael spent several years as a data scientist working on problems in credit modeling at Kabbage and corporate travel and procurement at TRX. Michael holds a Bachelor's degree in Mathematics and a Master's degree in Computational Science and Engineering from the Georgia Institute of Technology where he also spent 3 years working on machine learning research problems related to computational biology and bioinformatics. Additionally, Michael spent 12 years in the United States Marine Corps where he held various leadership roles within aviation, logistics, and training units.

Friday, 19th May

10:50 - 11:20

Experiment Assignment on the Web

A popular way of improving websites is to run experiments on them. We split users into groups, show two or more variations of the site, measure how well each one does, and then show the best version to everyone. In this talk, I'll walk through a toy Python program that does the first step: splits users into groups. A few interesting problems arise: grouping users, whitelists, and scaling. I'll share different ways to address them. I'll also give examples of things that can go terribly wrong when designing experiment assignment code.
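
One common way to do that first step is to hash the user ID together with the experiment name, so that assignment is deterministic and needs no stored state. The sketch below assumes that approach. It is not the speaker's program, and `assign_group`, its parameters, and the group names are all hypothetical.

```python
import hashlib


def assign_group(user_id, experiment, groups=("control", "treatment"), whitelist=None):
    """Deterministically place a user in one group of an experiment.

    whitelist is an optional dict mapping user IDs to a forced group,
    for example so that testers can always see a specific variation.
    """
    whitelist = whitelist or {}
    if user_id in whitelist:
        return whitelist[user_id]
    # Salting with the experiment name keeps assignments independent
    # across experiments for the same user.
    key = "{}:{}".format(experiment, user_id).encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    return groups[int(digest, 16) % len(groups)]
```

Because the result depends only on the inputs, any server can compute the same assignment for the same user without coordination, which is one way to approach the scaling concern.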

In-Memory Event Resequencing: Realistic Testing For Impossible Bugs

As we all know, we should write testable code, and automated tests. But as we also know, no test plan survives contact with the real world. Complex, distributed systems fail in complex, distributed ways, and even the simplest web app today is a complex distributed system. So, as our code accrues little fixes to bugs that only show up in production, our test suites eventually either become slow integration testing monstrosities that are 'realistic' but flaky and unreliable, or useless piles of mocks which are fast and deterministic but don't give you confidence. In this talk, we'll explore how to leverage event-driven programming, or 'async I/O', to structure code in such a way that its tests are fast, realistic, and reliable, even in the face of horrible race conditions you only discover in production.

Optimizations which made Python 3.6 faster than Python 3.5

Various optimizations made Python 3.6 faster than Python 3.5. Let's see in detail what was done and how. Python 3.6 is faster than any other Python version on many benchmarks. We will see results of the Python benchmark suite on Python 2.7, 3.5 and 3.6. The bytecode format and the instructions to call functions were redesigned to run bytecode faster. A new C calling convention, called 'fast call', was introduced to avoid temporary tuples and dicts. The way Python parses arguments was also optimized using a new internal cache. Operations on bytes and encodings like UTF-8 were optimized a lot thanks to a new API to create bytes objects. The API allows very efficient optimizations and reduces memory reallocations. Some parts of asyncio were rewritten in C to speed up code by up to 25%. The PyMem_Malloc() function now also uses the fast pymalloc allocator, giving a tiny speedup for free. Finally, we will see optimization projects for Python 3.7: use fast calls in more cases, speed up method calls, a cache on opcodes, a cache on global variables.

Requests Under The Hood

Requests is widely acknowledged as a library that saves users an enormous amount of time, effort, and pain through its intuitive and clear API. For this reason, most people who have never looked at the code assume that its code is as intuitive, well-structured, and clear as the API. Of course, the truth is more complex than that. Real software that deals with real problems is rarely ideal: there are edge cases, terrible hacks, and awkward workarounds for problems. Often in the software industry we pretend that these imperfections in our software don’t exist, or we try to hide them. These imperfections frequently cause people to reinvent wheels in order to simplify the code, which has benefits for understandability but frequently has downsides for resilience. When people talk about “battle-tested” code, they mean code that has been dirtied up over time from its original Platonic ideal implementation to something that is just as complex and warty as real life. In this talk, one of the Requests and urllib3 core maintainers lays bare all of the worst and hackiest corners of the codebases of these two libraries. The goal is to help expose all of the invisible work done in mature codebases to tolerate edge cases and misbehaviour, as well as to try to remind us all that the perfect is the enemy of the good.

Title Available On Request: An Introduction to Lazy Evaluation

Lazy evaluation, also known as 'call by need', is an evaluation strategy where values are produced only when needed. Lazy evaluation is the opposite of eager evaluation, Python's normal evaluation model, where functions are executed as seen and values are produced immediately. In this talk we will define lazy evaluation and contrast it with eager evaluation. We will discuss tools that exist in Python for using lazy evaluation and show how we can build on the primitives to better represent computations. We will introduce common vocabulary for discussing evaluation models, and compare different systems for implementing lazy evaluation. Finally, we will discuss optimizations that can be applied to lazily evaluated expressions.
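
Generators are one of the lazy tools built into Python, and a short sketch shows the contrast with eager evaluation. This is only an illustration and not code from the talk. The `evaluated` list is a hypothetical probe added to show which values were actually computed.

```python
import itertools

# Hypothetical probe: records which inputs have actually been evaluated.
evaluated = []


def squares():
    """Lazily yield 1, 4, 9, ... from an infinite sequence."""
    for n in itertools.count(1):
        evaluated.append(n)
        yield n * n


lazy = squares()  # creating the generator runs none of its body
evaluated_before = list(evaluated)

# Only the values that are asked for get computed.
first_three = list(itertools.islice(lazy, 3))
```

An eager version would have to build the whole sequence before returning, which is impossible for an infinite one. The lazy version computes exactly three values here.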
11:30 - 12:00

Big picture software testing: unit testing, Lean Startup, and everything in-between

There are many ways you can test your software: unit testing, manual testing, end-to-end testing, and so forth. Take a step back and you'll discover even more forms of testing, many of them very different in their goals: A/B testing, say, where you see which of two versions of your website results in more signups or ad clicks. How do these forms of testing differ, and how do they relate to each other? How do you choose which kind of testing to pursue, given limited time and resources? How do you deal with strongly held yet opposite views arguing either that a particular kind of testing is essential or that it's a waste of time? This talk will provide you with a model, a way to organize all forms of testing and understand what exactly they provide, and why. Once you understand the model you will be able to choose the right form of testing for your situation and goals.

Immutable Programming - Writing Functional Python

The world of Haskell and functional programming may seem like a distant place to many working Python developers, but some of the techniques used there are remarkably useful when developing in Python. In this talk, I will cover some of the pitfalls of mutability that you may run into while writing Python programs, and some tools and techniques that Python has built in that will let you avoid them. You'll see namedtuples, enums and properties, and also some patterns for structuring immutable programs that will make them easier to build, extend, and test.
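
As a small illustration of the built-in tools named above, the sketch below combines a namedtuple with an enum and "updates" a record by building a new one. The `Ticket` and `Status` types and the `close` function are hypothetical examples, not code from the talk.

```python
from collections import namedtuple
from enum import Enum


class Status(Enum):
    OPEN = "open"
    CLOSED = "closed"


# namedtuple instances cannot be modified after creation.
Ticket = namedtuple("Ticket", ["title", "status"])


def close(ticket):
    """Return a new, closed Ticket; the argument is left untouched."""
    return ticket._replace(status=Status.CLOSED)


original = Ticket("Fix login", Status.OPEN)
closed = close(original)
```

Since `close` cannot alter its argument, any code still holding `original` sees the same value as before, which is what makes functions like this easy to test in isolation.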

Passing Exceptions 101: Paradigms in Error Handling

Exception handling in Python can sometimes feel like a Wild West. If you have a send_email function, and the caller inputs an invalid email address, should it: A) Return None or some other special return value, B) Let the underlying exception it might cause bubble up, C) Check via a regex and type checking and raise a ValueError immediately, or D) Make a custom EmailException subclass and raise that? What if there is a network error while the email is sending? Or what if the function calls a helper _format_email that returns an integer (clearly wrong!), or raises a TypeError itself? Should it crash the program or prompt a retry? This talk will introduce the concept of an exception, explain the built-in Python exception hierarchy and the utility of custom subclasses, demonstrate try/except/finally/else syntax, and then explore different design patterns for exception control flow and their tradeoffs using examples. It will also make comparisons to error handling philosophy in other languages, like Eiffel and Go.
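
For reference, option D combined with the try/except/else/finally syntax might look like the sketch below. It is one possible design, not the speaker's recommendation. The `InvalidAddress` class, the validation rule, and the `try_send` helper are hypothetical, and no email is actually sent.

```python
class EmailException(Exception):
    """Base class for errors raised by the hypothetical send_email below."""


class InvalidAddress(EmailException, ValueError):
    """Also a ValueError, so callers using option C's convention still catch it."""


def send_email(address, body):
    # Deliberately naive validation, for illustration only.
    if not isinstance(address, str) or "@" not in address:
        raise InvalidAddress("not an email address: {!r}".format(address))
    return "sent to " + address  # stand-in for real delivery


def try_send(address):
    log = []
    try:
        result = send_email(address, "hi")
    except EmailException as exc:
        log.append("failed: " + str(exc))  # runs only if send_email raised
    else:
        log.append(result)                 # runs only if nothing was raised
    finally:
        log.append("done")                 # always runs
    return log
```

Catching the base class `EmailException` lets callers handle every failure of this function in one place, while the more specific subclass remains available to callers that care about the cause.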

Prehistoric Patterns in Python

Why does some code use dictionaries that have None for all values? Is it true that you shouldn't concatenate strings with +? Will Python optimize constant calculations? This talk will go through some patterns that used to be common in Python, but which are now regarded as outdated, and see if they really are outdated and why. The results surprised me; maybe they'll surprise you.

Type uWSGI; press enter; what happens?

You're a pretty knowledgeable Python web application developer, but how does that web application get served to the world? For many of us, uWSGI is the magic that makes our application available, and in this talk we'll look at how uWSGI works with the OS and the networking stack to make the magic happen.
12:10 - 12:40

Next Level Testing

Unit, functional, and integration tests are great first steps towards improving the quality of your python project. Ever wonder if there’s even more you can do? Maybe you've heard of property-based testing, fuzzing, and mutation testing but you're unsure exactly how they can help you. In this talk we’ll cover additional types of tests that can help improve the quality and robustness of your python projects: property-based testing, fuzz testing, stress testing, long term reliability testing, and mutation testing. We’ll also go beyond just covering what these tests are. For each of the test types above, I’ll give you real world examples from open source software that I maintain that shows you the types of bugs each test type can find. I’ll also show you how you can integrate these tests into your Travis CI and/or Jenkins environment.

Web identity: OAuth2 and OpenIDConnect

Interested in adding single sign-on to your application, but confused about the variety of different web authentication methods out there? OAuth, OAuth2, OpenID, OpenIDConnect, SAML, Facebook Connect? This talk will clarify the different protocols, examining OAuth2 and OpenIDConnect in greater detail. It will demonstrate a basic client implementation using FLOSS libraries, and briefly touch on some of the issues involved in server implementation.
12:10 - 12:55

Grok the GIL: Write Fast And Thread-Safe Python

I wrote Python for years while holding mistaken notions about the Global Interpreter Lock, and I've met others in the same boat. The GIL's effect is simply this: only one thread can execute Python code at a time, while N other threads sleep or await network I/O. Let's read CPython interpreter source and try some examples to grok the GIL, and learn to write fast and thread-safe Python.

Introduction to Threat Modeling

Are you a website or application developer? Are you worried about security? Don’t know what you need to know, and what you can safely leave to the experts? Come learn about how to analyze your application’s design for potential security flaws, how to think like a security engineer, and see some of the most common pitfalls that programs fall victim to. In this talk we will work through the process of threat modeling - understanding how your system might get attacked, what its weak points are, and how to defend it.

The Gilectomy: How's It Going?

One of the most interesting projects in Python today is Larry Hastings' 'Gilectomy' project: the removal of Python's Global Interpreter Lock, or 'GIL'. Come for an up-to-the-minute status report: what's been tried, what has and hasn't worked, and what performance is like now.
13:40 - 14:25

Implementing Concurrency and Parallelism From The Ground Up

When writing an application, it is common to want to do many things at once. For web servers, this means serving multiple web requests; for GUI applications, it means doing a background task whilst keeping the UI responsive. But how do we actually do that? This talk will go into how concurrency and parallelism work from the CPU, OS, and threads up, how state (data) is shared between them, how this interacts with the functions that you, the programmer, write, and how you can write properly behaving concurrent or parallel software.

Solid Snakes or: How to Take 5 Weeks of Vacation

No matter whether you run a web app, search for gravitational waves, or maintain a backup script: being responsible for a piece of software or infrastructure means that you either get a pager right away, or that you get angry calls from people affected by outages. Being paged at 4am in everyday life is bad enough. Having to fix problems from hotel rooms while your travel buddies go for brunch is even worse. And while incidents can’t be prevented completely, there are ways to make your systems more reliable and minimize the need for (your!) manual intervention. This talk will help you to get calm nights and relaxing vacations by teaching you some of them.
13:55 - 14:25

Community powered packaging: conda-forge

The Python scientific community always wanted a package manager that is cross-platform, does not require sudo, and lets Python be awesome! The conda package manager solved that problem, but created new ones... This talk is a tour disguised as a beginner tutorial to conda-forge packaging. We will try to discuss some myths and misconceptions about conda and conda-forge, as well as a quick comparison with pip and wheels.

Dial M For Mentor

One of the nicest things about the Python community is the availability of mentors willing to help you. Various mentors have helped me navigate the open source community and helped advance my skills. I realized finding a mentor is not as easy as it seems, and it takes a lot of courage to reach out in the first place. And then, there is impostor syndrome, where one may feel like they don’t deserve the help. In this talk, I will provide advice about working with a mentor. Asking for help is not a failure.

Packaging Let’s Encrypt: Lessons learned shipping Python code to hundreds of thousands of users

Let's Encrypt launched on April 12th 2016, for the first time allowing anyone access to free SSL certificates that could be automatically fetched and renewed. The demand was massive, and so was the need for a client to fetch these certificates for all of those users. This client is called Certbot, and it's written entirely in Python. Unfortunately for the sanity of Certbot developers, these users of Let's Encrypt can't decide on a single operating system to use! This requires us to ship our software, and all of its dependencies, to a variety of systems all with different web servers, Python versions, package managers, and underlying packages. Learn how we got through this mess!
14:35 - 15:05

Dr. Microservices, Or How I Learned to Stop Worrying and Love the API

Assuming that you already know how to build a monolithic app, you must be wondering how you can use all this 'microservice' stuff that you keep hearing about. Well, a good word of advice is that you probably don't need it. If designed properly, a monolithic app should be able to scale and fit the needs of most businesses. Even so, you should keep your development as simple as possible until you have proven and solidified your business concepts. But if you do need to grow to Internet scale, then you have a long road ahead of you. Moving from a monolithic application to microservices is a natural evolution that is often of necessity. There are several competing schools of thought that are still being battle-tested in these early days of microservice architecture. Among all the competing paradigms, most of the requirements can be agreed upon, but are mostly differentiated by the tools used to fulfill the requirements. This talk will cover setting up the required infrastructure, and demonstrate how to migrate a sample monolithic Django application into a microservices platform. The demo application will use the following technologies: Django, Flask, Fabric, Terraform, Ansible, CentOS, Docker, Mesos, Consul, Nginx, Pgbouncer, Kafka

Factory Automation with Python - Stories about Robots, Serial Ports, and Barcode Readers

In industrial automation tried and tested always beats latest and greatest: The machines that make smartphones have a serial port and are configured with .csv files. But when your factory automates complex non-linear workflows and is jam-packed with sensors and robots, you are quickly faced with software engineering challenges that call for modern tools. Python turns out to be surprisingly versatile in this setting, whether for prototyping a single conveyor belt or taming a building full of robots. This talk explains how to use Python for interfacing with two common industrial automation devices: a barcode scanner and a Programmable Logic Controller (PLC). After a simple demo, you’ll hear about lessons learned using Python packages that weren’t written with robots in mind, like Celery and pandas, with robots.

Instagram Filters in 15 Lines of Python

Images tell stories, and we love Instagram filters because they give emotion to our images. Do you want to explore what makes up Instagram filters? In this talk, we will talk about the basic elements of Instagram filters and implement them in Python. The staple libraries we will use are scikit-image and numpy - matplotlib and jupyter notebooks for plotting and interactivity. In the end, we will implement the (now-defunct) Gotham Instagram filter in 15 lines of Python (not including imports). Throughout the process, there will be many pretty pictures.

Rants and Ruminations From A Job Applicant After 💯 CS Job Interviews in Silicon Valley

What is it like to interview at 1 technology company? Stressful and tiring. What is it like to interview at 100 technology companies? I have done that. In late August 2016, I quit an uninspiring full-time software job and talked to 100 employers in the San Francisco Bay Area to find the best fit. The hiring process reflects the company culture and its values. Listen to my rants and ruminations of interviewing at tiny seed-stage startups to large technology companies in Silicon Valley. Learn how to reform your own hiring process to be more considerate and thoughtful. Learn how to prepare for interviews efficiently.

Snek in the Browser

Python is a decades-strong language with a large community, and it has a solid foundation on the server, but it doesn't have a good user story in the browser... until now. The BeeWare project aims to bring Python natively, everywhere. Using a combination of the Batavia and Toga projects, we can develop an entirely native web experience in Python, no JavaScript required. During this talk, you will learn about how the BeeWare project has built Batavia, a Python virtual machine in JavaScript, and Toga, a multi-platform native API wrapper, a combination of which can be used to build an entire web platform in Python only.
15:15 - 14:00

Python from Space: Analyzing Open Satellite Imagery Using the Python Ecosystem

Earth imaging satellites, just like our computers, are shrinking and becoming more ubiquitous than ever before. It is now possible to obtain open satellite data on a daily if not weekly basis and for this data to be put to work; helping us better understand our planet and quickly respond to disaster situations. In this talk we will work through a jupyter notebook that covers the satellite data ecosystem and the python tools that can be used to sift through and analyze that data. Topics include python tools for using Open Street Maps data, the Geospatial Data Abstraction Library (GDAL), and OpenCV and NumPy for image processing. This talk is intended for novice and intermediate python developers who are interested in using data science and satellite imagery for social good and fundamental scientific research.
15:15 - 15:45

5 ways to deploy your Python web app in 2017

You’ve built a fine Python web application and now you’re ready to share it with the world. But what’s the best way to deploy your app in 2017? This talk will demonstrate popular techniques for deploying Python web applications. We’ll start with a simple Flask application and expose it to the world five times over as we learn to use different tools and services available to the modern Python developer. Specific topics covered include: exposing your local dev environment with ngrok, using a Platform-as-a-Service (PaaS) like Heroku, going “serverless” with AWS Lambda, configuring your own VM with Google Compute Engine, and thinking inside the box using Docker. We’ll also briefly touch on the pros and cons of each technique to help you figure out which one is right for your app. At the end of this talk you will have a basic understanding of how each of these techniques works and you’ll be ready to try them out yourself.

What's in your pip toolbox?

pip is a great tool, but dependency management doesn't stop there. I'll explore several tools that work with pip to make managing your dependencies easier, faster, and safer. I'll cover generating dependencies a better way, maintaining your requirements.txt for the long-term, and exploring existing Python environments. Afterward, you'll never want to pip freeze > requirements.txt again.
15:15 - 16:00

Constructive Code Review

“Your code is bad and you are bad. Have a bad day.” Too many code reviews feel like this, and it saps the enthusiasm that drives open source. Instead, let’s explore how to give reviews that are truthful but encouraging, boosting the skill level of contributors and the quality of the project. We’ll look at “tact hacks” that nudge communication in a friendly direction, antipatterns to avoid, the pesky human emotions that can tempt us into reviewing poorly, and techniques for leveling up newcomers without losing all your coding time.

Ending Py2/Py3 compatibility in a user friendly manner

'Four shalt thou not count, neither count thou two, excepting that thou then proceed to three.' Monty Python and the Holy Grail; Scene 33. Python 3 has been around for more than eight years, and much of the Python ecosystem is now available both on Python 2 and Python 3, often using a single code base. Nonetheless, this compatibility comes at a development cost and some library authors are considering ending support for Python 2. These once-Python-2-compatible libraries are at risk of being upgraded on incompatible systems, causing user (and developer) frustration. While it may seem simple to cease support for Python 2, the challenge is not in ending support, but doing so in a way that does not wreak havoc for users who stay on Python 2. And that is not only a communications problem, but a technical one: up until recently, it was impossible to tag a release as Python 3 only; today it is possible. Like any maintainers of a widely used library, we want to ensure that users who continue to use Python 2 continue to have functioning libraries, even after development proceeds in a way that does not support Python 2. One approach is to ensure easy installation of older versions and, if possible, avoid incompatible versions altogether. Users should not need to manually pin maximal version dependencies across their development environments and projects if all they want is to use the latest versions of libraries that are compatible with their system. Even if we did expect that of users, consider what would happen when a package they rely on converts to be only Python 3 compatible. If they were not tracking the complete dependency tree, they might discover, on upgrade, that their projects no longer work. To avert this they would need to pin those packages at the last version compatible with Python 2. Users that want to use older Python versions should not have to go through so much anguish to do so.
In order to solve this problem, and thereby make both users' and maintainers' lives easier, we ventured into the rabbit-hole called Packaging. Though we set off with a singular quest, our tale roves through many lands. We'll narrate the story of our amending PEPs, our efforts in building the ramparts of the pypa/Warehouse Castle, battles with the dragons of Pip, and errands in the 'land of no unit tests', otherwise known as PyPI legacy. By the end of the above tale, audience members will know that the road to Python 3 only libraries once had hazards that are now easily avoidable, so long as users upgrade their package management tools.

I Installed Python 3.6 on Windows and I Liked It

Python has a great reputation as a cross-platform language, which for many people means different varieties of Linux. But a huge number of Python users are running on Windows - a fundamentally different operating system where things do not always work the same. However, Python has always worked incredibly well across different platforms including Windows, going to great lengths to support and expose the platform without making development more complex. In this session, CPython core developer and Microsoft engineer Steve Dower will discuss some of the reasons why cross-platform support is not an accident, and how Python 3.6 makes it even easier to support both Windows and Linux.
16:15 - 17:00

Dask: A Pythonic Distributed Data Science Framework

Dask is a general purpose parallel computing system capable of Celery-like task scheduling, Spark-like big data computing, and NumPy/Pandas/Scikit-learn level complex algorithms, written in pure Python. Dask has been adopted by the PyData community as a Big Data solution. This talk focuses on the distributed task scheduler that powers Dask when running on a cluster. We'll focus on how we built a Big Data computing system using the Python networking stack (Tornado/AsyncIO) in service of its data science stack (NumPy/Pandas/Scikit-learn). Additionally we'll talk about the challenges of effective task scheduling in a data science context (data locality, resilience, load balancing) and how we manage this dynamically with aggressive measurement and dynamic scheduling heuristics.
16:15 - 19:00

A gentle introduction to deep learning with TensorFlow

Deep learning's explosion of spectacular results over the past few years may make it appear esoteric and daunting, but in reality, if you are familiar with traditional machine learning, you're more than ready to start exploring deep learning. This talk aims to gently bridge the divide by demonstrating how deep learning operates on core machine learning concepts and getting attendees started coding deep neural networks using Google's TensorFlow library.
16:30 - 17:00

Building A Gigaword Corpus: Lessons on Data Ingestion, Management, and Processing for NLP

As the applications we build are increasingly driven by text, doing data ingestion, management, loading, and preprocessing in a robust, organized, parallel, and memory-safe way can get tricky. This talk walks through the highs (a custom billion-word corpus!), the lows (segfaults, 400 errors, pesky mp3s), and the new Python libraries we built to ingest and preprocess text for machine learning. While applications like Siri, Cortana, and Alexa may still seem like novelties, language-aware applications are rapidly becoming the new norm. Under the hood, these applications take in text data as input, parse it into composite parts, compute upon those composites, and then recombine them to deliver a meaningful and tailored end result. The best applications use language models trained on domain-specific corpora (collections of related documents containing natural language) that reduce ambiguity and prediction space to make results more intelligible. Here's the catch: these corpora are huge, generally consisting of at least hundreds of gigabytes of data inside of thousands of documents, and often more! In this talk, we'll see how working with text data is substantially different from working with numeric data, and show that ingesting a raw text corpus in a form that will support the construction of a data product is no trivial task. For instance, when dealing with a text corpus, you have to consider not only how the data comes in (e.g. respecting rate limits, terms of use, etc.), but also where to store the data and how to keep it organized. Because the data comes from the web, it's often unpredictable, containing not only text but audio files, ads, videos, and other kinds of web detritus. Since the datasets are large, you need to anticipate potential performance problems and ensure memory safety through streaming data loading and multiprocessing. 
Finally, in anticipation of the machine learning components, you have to establish a standardized method of transforming your raw ingested text into a corpus that's ready for computation and modeling. In this talk, we'll explore many of the challenges we experienced along the way and introduce two Python packages that make this work a bit easier: Baleen and Minke. Baleen is a package for ingesting formal natural language data from the discourse of professional and amateur writers, like bloggers and news outlets, in a categorized fashion. Minke extends Baleen with a library that performs parallel data loading, preprocessing, normalization, and keyphrase extraction to support machine learning on a large-scale custom corpus.

Re-Programming the Human Genome with Python

Modern genome editing techniques such as CRISPR-Cas9 are revolutionizing the way we discover and treat the root genetic causes of disease. Many of the most popular tools and libraries in this cutting edge application are written in Python. This talk will provide a general, software-centric introduction to the exciting new area of genome editing, describe the central string search, machine learning, and data management problems involved, and review how Python frameworks and libraries are used today to solve these problems in Production in order to benefit human health. This talk assumes no prior lab experience: only a proficiency with Python and curiosity!

The Glory of pdb's set_trace

Everyone needs to debug code, and it can take up a non-trivial portion of our time to wait for code to complete execution and write print messages to stdout. There’s one function in particular in the Python debugger (pdb) library that can give you a much clearer understanding of what’s going on in your code, much more quickly: pdb.set_trace(). In this talk, we’ll identify the most useful things you can do when you use set_trace that can make debugging exponentially more efficient and enjoyable.
17:10 - 17:40

Algorithmic Music Generation

Music is mainly an artistic act of inspired creation and is unlike some of the traditional math problems. Music cannot be solved by a simple set of formulae. The most interesting and challenging part is producing unique music without infringing copyright. The generated music has to sound good, and what sounds good is very subjective and varies from culture to culture. Artificial neural networks and deep learning have a wide range of applications, such as image processing, natural language processing, time series prediction, etc. But what about their usage in art? Could we use deep learning to create music? This talk is about how deep learning models were used to produce music, catering particularly to Bollywood. This talk will show how an exquisite piece of art, i.e. music, can be generated using a deep learning model, which helps in automated feature extraction. In order to automate music generation, the model must be able to remember the learned features over a longer period of time; this is achieved by a special type of Recurrent Neural Network (RNN) called an LSTM (Long Short Term Memory) network. Implementation of such a complex model can be made much easier using Python libraries such as Keras with Theano as the backend, which allow for easy and fast prototyping. Packages like numpy and scipy are used for easier mathematical computation of input vectors and for reading/writing the WAV files, respectively. The neural network architecture makes use of a large number of music samples to train the model. After an adequate number of iterations and training time, this model generates music that is unique and original. In this talk, the steps involved in preprocessing the data, training the model, testing the model, and generating music from the trained model will be discussed. This talk will also cover some of the challenges and tradeoffs made for algorithmic music generation.

Looping Like a Pro in Python

The humble loop: it's hard to write a program without it. Whether it's processing numbers in a sequence, lines in a text file, users in a database, or any other list of things, you use loops all the time. But did you know that Python has a lot of different ways to write loops? Reaching for the right looping tool can make your code cleaner, more readable, easier to test, and it can even make it run faster! By the end of this talk, you'll be looping like a pro, and your code will be better for it.
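
The abstract does not name specific looping tools, so the following is only a minimal sketch of the kind of alternatives Python offers. The sample data and variable names are hypothetical; `enumerate`, `zip`, and generator expressions are standard built-in features chosen here for illustration.

```python
# Hypothetical sample data, used only for illustration.
names = ["ada", "grace", "guido"]
scores = [90, 85, 77]

# Index-based loop: works, but the index bookkeeping obscures the intent.
indexed = []
for i in range(len(names)):
    indexed.append((i, names[i]))

# enumerate() yields (index, item) pairs directly.
enumerated = list(enumerate(names))

# zip() walks several sequences in lockstep.
paired = [f"{name}: {score}" for name, score in zip(names, scores)]

# A generator expression feeds sum() without building an intermediate list.
total = sum(score for score in scores)
```

Both `indexed` and `enumerated` hold the same pairs; the second form simply states the intent more directly.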

Snakes on a Hyperplane: Python Machine Learning in Production

Companies with an artificial-intelligence plan have a differentiating strategy in the intelligence economy; however, implementing robust machine-learning in production is nontrivial, often requiring a close collaboration between data scientists and developers, and retooling the production stack and workflows to develop and maintain accurate models. Machine learning in production involves model application, handling missing data, data artifacts, and data outside of the training calibration. A rigorous evaluation framework draws upon logging to determine characteristics of model coverage, model performance, auditing, and run-time performance. Model coverage includes the number of times the model produced sensible output relative to number of times it is called. Model coverage is reduced if the model does not converge or model criteria are not met. Model performance is evaluated with a suite of metrics (accuracy, AUC, FPR, TPR, RMSE, MAPE, etc.), which assist in determining the most appropriate model to use in the production scenario and the validity of the model training. Regularly performing manual audits for spot checks is important for debugging and ensuring the model passes sanity checks. Run-time performance includes run times and profiling model pieces, ensuring performance is within specified requirements and refactoring otherwise. In the AI renaissance, where ML is a critical piece of intelligent products, seamlessly integrating model evaluation into workflows is an important component of making robust products and building a satisfying customer experience. Python is a great language to build intelligent products with its abundance of ML libraries and wrappers contributed as open-source software in addition to rich full-stack capabilities.

The Wild West of Data Wrangling

Data science introductory courses might give you the impression that dealing with data is neat, tidy, and simple. They present you with a simplistic dataset and the scikit-learn or Pandas documentation, and a day or so later, you're done! Piece of cake, right? The real world of data isn't that easy! As a data scientist who has worked in the industry for several years, I have had a lot of experience dealing with messy, inaccurate, incomplete data, and I want to share those experiences with you. I'll talk my way through three real-world situations where I've had to analyze and build models on untidy and complex data, going through how I've preprocessed the data and prepared it for modeling. You'll leave with an understanding of how a data scientist thinks about data and what she does when the data is complicated.
17:40 - 17:40

Leveraging Serverless Architecture for Powerful Data Pipelines

Serverless architectures that allow us to run Python functions in the cloud in an event-driven parallel fashion can be used to create extremely dynamic and powerful data pipelines for use in ETL and data science. Join me for an exploration of how to build data pipelines on Amazon Web Services Lambda with Python. We'll cover a simple introduction to event-driven programming. Then, we'll walk through building an example pipeline while discussing some of the frameworks and tools that can make building your pipeline easier. Finally, we'll discuss how to maintain observability on your pipeline to ensure proper performance and troubleshooting information.

Saturday, 20th May

10:50 - 11:20

Automate AWS With Python

AWS is one of the best-known cloud vendors. Using the Web UI is fine when starting out, but automating cloud operations is important. Boto3 provides a great Pythonic API to AWS, but using it correctly can be subtle. The talk will cover how to automate AMI builds, building CloudFormation templates, and automating S3 bucket management.

Debugging in Python 3.6: Better, Faster, Stronger

Python 3.6 was released in December of 2016 and it has a lot of cool new features. Some of them are quite easy to use: a developer can read, for example, about f-strings and start using them right away. But sometimes features are not so evident, and the new frame evaluation API is one of them. The new frame evaluation API was introduced to CPython in PEP 523, and it allows specifying a per-interpreter function pointer to handle the evaluation of frames. It might not be evident how to use this new feature in everyday life, but it’s quite easy to understand how to build a fast debugger based on it. In this talk we are going to explain how the standard way of debugging in Python works and how the new frame evaluation API may be useful for creating a fast debugger. We will also consider why such fast debugging was not possible in previous versions of Python. If someone hasn’t made a final decision to move to Python 3.6, this talk will provide some new reasons to do it.

It's time for datetime

Working with time is not a trivial challenge. Python includes a native module in the standard library to work with it, but datetime, together with Unicode, remains a common source of errors. This has often led to a proliferation of other libraries that attempt to ease working with datetime. Datetime is one of those APIs that looks easy to use, but given the many concepts around time, it is easy to get burned if the developer does not have solid knowledge of them. In this talk we will give an overview of the main concepts behind timestamps represented through datetime objects, the limitations of the standard library, and some simple steps to avoid the common mistakes that anyone can fall into. Naive datetimes (which the datetime API works with by default) are a great tool to represent calendar times, but when talking about timestamps (the focus of this talk) time zones are an essential part, and the datetime module can be tricky to use for those use cases. We will also speak about different standards of time, time zones, Daylight Saving Time, leap seconds, serialization, and datetime arithmetic. The talk will focus on giving the foundations that everyone needs to know to understand and work efficiently, without making painful mistakes, when dealing with time-related algorithms.
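
The distinction between naive and aware datetimes can be sketched with the standard library alone. The date, the fixed -07:00 offset, and the variable names below are illustrative assumptions, not material from the talk; a fixed offset also sidesteps daylight saving rules, which a real time zone database would handle.

```python
from datetime import datetime, timedelta, timezone

# A naive datetime carries no time zone information.
naive = datetime(2017, 5, 20, 10, 50)

# An aware datetime pins the same wall-clock reading to a UTC offset.
# The fixed -07:00 offset is an illustrative assumption.
pdt = timezone(timedelta(hours=-7))
aware = datetime(2017, 5, 20, 10, 50, tzinfo=pdt)

# Converting to UTC is unambiguous for the aware value.
as_utc = aware.astimezone(timezone.utc)

# Ordering a naive value against an aware one raises TypeError.
try:
    naive < aware
    comparison_failed = False
except TypeError:
    comparison_failed = True
```

The refusal to order naive against aware values is deliberate: without an offset, the naive reading does not identify a single instant.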

The Memory Chronicles: A Tale of Two Pythons

MicroPython is the leanest, meanest full Python implementation. Designed for microcontrollers, this variant of Python runs in less than 300KB of memory, and retains support for all your favorite Python features. So what does it take to make the smallest Python? Put differently, why does CPython have a large memory footprint? This talk will explore the internals of MicroPython and contrast it with CPython, focusing on the aspects that relate to memory use. We will delve into the Python object models in each and the machinery for managing them. We will touch upon how the designs of the bytecode compiler and interpreter of each differ and why that matters.

The trends in choosing licenses in Python ecosystem

Software licenses are the permissions over copyrighted software. The permission and/or grant includes the grant to use, to redistribute, to prepare derivative works, etc. These software licenses also set forth the limitations on these rights. Software licenses basically mark the boundary for the usage of the code. It is therefore very important for developers to choose the license for their code wisely and correctly. PyPI, the Python Package Index, is a repository of software for the Python programming language. There are currently 80000+ packages there. This talk will go through the licenses of the top 2500 packages. We will see the trend in choosing a license for these top Python projects. We will discuss the licenses individually and compare them with each other, covering the advantages and disadvantages of each. We will further explain why a license or licenses are being favored by developers.
11:30 - 12:00

Awesome Command Line Tools

Designing a good command line tool is challenging. Command line tools look archaic compared to modern graphical interfaces, discoverability is a big issue for command line tools, proactive help is hard to implement. This talk will cover a set of specific techniques to help alleviate those challenges. How did pgcli and mycli overcome these problems? We will cover specific examples where command line apps shine. The examples will be drawn from pgcli, mycli and bpython. This talk will suggest libraries and show how they can help you implement a wonderful command line interface. The libraries covered in this talk include python-prompt-toolkit, pygments, click, fuzzyfind. The goal of the talk is to distil the ideas that exist in successful command line applications as guidelines for building powerful command line applications.

How documentation works, and how to make it work for your project

Nearly everyone (especially in the Python community) agrees that good documentation is important to the success of software projects, and yet very few projects actually have good documentation. Often, it's not for want of effort - the project's developers have worked hard on it - nor for lack of documentation - the authors have produced a lot of it. It simply turns out to be not very good - not helpful enough for the users who should be able to rely on it, and a depressing chore for the authors who have to maintain it. The good news is that both these problems can be solved by understanding how documentation works, and what its different functions are. Structuring documentation according to those distinct functions helps ensure that each of them is adequately served. It also makes it far easier to write and maintain. Using real-life examples I'll draw out the key functions of documentation, and how they map onto different ways of writing it. Putting this into practice is simple when armed with some basic guidelines. The benefits are huge, and available with a minimum of effort. I won't be discussing documentation tools or software or other topics that have been covered amply elsewhere, but some neglected aspects of software documentation that will make your software projects more successful.

Readability Counts

Have you found unreadable PEP8-compliant code and wondered how to fix it? Have you ever seen code that was simply a pleasure to read? If you've ever wondered what makes code easy to read, this talk is for you. During this talk we'll learn a number of techniques for refactoring code to improve readability and maintainability. We'll discuss: whitespace, self-documenting code, modularity, expectation management. We'll end with a checklist for improving the readability of your own code.

Temporal Data Structures with SQLAlchemy and Postgres

SQLAlchemy (http://www.sqlalchemy.org) and Postgres (https://www.postgresql.org) provide several useful tools that allow us to build and query records through time: temporal models. Combining a need to have robust auditing, as well as feature development on per-property history, we can turn “regulatory overhead” into an exciting technical challenge. At Clover Health we have built a small library to automate the task of decorating a model and making it “temporal.” This talk aims to demonstrate the underlying data model and interface for building this system.

When the abyss gazes back: staring down Python's surprising internals

Python's fantastic until it isn't. This talk dives into some of the surprising implementation details of CPython, then explains exactly how they could be discovered from first principles. Attendees will leave with some dangerous Python trivia, and the tools they'll need to uncover their own trivia when surprises strike.
12:10 - 12:40

Exploring Network Programmability with Python and YANG

Recently, networking vendors and Silicon Valley giants have been putting forth a concerted effort to build standardized models for networking devices. These models allow for building reusable and versatile scripts with predictable, standardized data. Without such models, the wide variety of inputs and outputs required by different devices and vendors made scripting a tedious and challenging endeavor. The modeling language in use is called YANG, and a variety of standards have emerged. A vendor agnostic standard called OpenConfig has lately become stable enough to begin programming devices with it. Using Python, YANG is surprisingly easy to work with, and extremely powerful applications can be written with basic knowledge of JSON or XML and RPCs. The talk will start with use-cases for programming networking devices, and will detail a specific, trivial use case that will be used in the talk. Next, we will discuss the ‘legacy’ way of programming devices (SSH and screenscraping), and highlight the challenges, such as complex regular expressions, slow responses, and lack of reusability between devices. From there we will dive into YANG, focusing on OpenConfig models. A YANG model is essentially a template, and JSON or XML can be mapped to the YANG template. This makes it perfect for Pythonic manipulation. In the use case there will be a GET RPC returning a YANG representation of the box’s state in JSON, which we will search for the relevant health indicator by drilling down in the JSON dictionary. A simple change to the dictionary will remediate the problem, and a PATCH RPC merges the new configuration onto the box. Since open, standard models are in use, this script could be run on many devices across a network to achieve the same effect with no changes needed. We will finish up with the pros and cons of YANG before opening the talk for Q&A.

What's new in Python 3.6

Python 3.6 has turned out to be quite the release! With 16 Python Enhancement Proposals incorporated into the version, Python 3.6 is only surpassed by Python 3.0 for having more PEPs included in a single release. This talk will be an overview of those 16 PEPs and other changes outlined in the What's New document for Python 3.6.
12:10 - 12:55

Level up! Rethinking the Web API framework.

Think there's nothing left to explore in how we design Web API frameworks? Think again. The author of Django REST framework walks through how we might approach designing a new Python-based API framework from scratch, and looks at how we can start building smarter, more productive API tooling as a result. You should come away from this talk with a better appreciation of: How best to provide API client libraries and API documentation to your users. How to build APIs that support both realtime and request/response interfaces. How to build APIs that are web-browsable. Why you might want to consider taking a schema-first approach to your API design.

Modern Python Dictionaries -- A confluence of a dozen great ideas

Python's dictionaries are stunningly good. Over the years, many great ideas have combined together to produce the modern implementation in Python 3.6. This fun talk uses pictures and little bits of pure python code to explain all of the key ideas and how they evolved over time. Includes newer features such as key-sharing, compaction, and versioning.

Static Types for Python

Over the past year and a half, Dropbox has been investing in the development of mypy, a static type checker for Python, as a way to make our multimillion-line Python codebase easier to understand, navigate, and maintain. In this talk, we will discuss the benefits of type annotations, explain how to use them, and give a peek into how mypy works behind the scenes. Mypy is an open-source type-checker for Python which supports the PEP 484 standard for gradual typing. Originally created by Jukka Lehtosalo as part of his PhD thesis in 2013, it is now under active development by a small team at Dropbox which includes David Fisher, Greg Price, and Guido van Rossum. It supports Python 3.2 and higher, as well as Python 2.7 (via type comments).
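
For readers who have not seen PEP 484 annotations, here is a small sketch of what annotated code looks like. The function names and data are hypothetical; the annotations do not change how the code runs, and a checker such as mypy is what would flag mismatched calls.

```python
from typing import Dict, List, Optional


def word_lengths(words: List[str]) -> Dict[str, int]:
    """Map each word to its length."""
    return {word: len(word) for word in words}


def first_long_word(words: List[str], minimum: int = 5) -> Optional[str]:
    """Return the first word with at least `minimum` characters, or None."""
    for word in words:
        if len(word) >= minimum:
            return word
    return None


lengths = word_lengths(["mypy", "python"])
found = first_long_word(["mypy", "python"])

# A static checker would be expected to reject the call below,
# because an int is not a List[str]:
# word_lengths(42)
```

`Optional[str]` makes the "may return None" case explicit, which is exactly the kind of contract a type checker can then enforce at call sites.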
13:40 - 14:25

An Introduction to Reinforcement Learning

Reinforcement learning (RL) is a subfield of machine learning focused on building agents: software that can robustly achieve a desired objective under varying states of the world. This introduction will provide you with an overview of RL and tools to build your own agents. In this talk, we will provide an overview of terminology in reinforcement learning and a Jupyter Notebook outlining basic algorithms to learn 'policies', strategies for an agent, and visualize them with numpy, pandas, and seaborn. Newer developments in reinforcement learning apply deep learning to improve performance. We will further discuss deep reinforcement learning and how to use deep learning libraries, such as TensorFlow or Theano, with the latest RL libraries: OpenAI Gym, OpenAI Universe, and DeepMind Lab.

Text is More Complicated Than You Think: Comparing and Sorting Unicode

Few people realize just how complicated text can be. Did you know sorting and even case-folding can depend on a user's locale? That different strings of characters can be semantically completely equivalent? That there are over a thousand Latin letters? Legacy text encodings like ASCII made a lot of simplifying assumptions about how written languages work, and we all put up with them because it was cool to even have computers in the first place. Unicode removes many of those assumptions and provides the tools we need to write software that can just do the right thing regardless of what text users throw at it. Even if you don't translate your UI, getting the details of string comparison, sorting, and searching right can eliminate annoying surprises for you and your users.
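
Two of the points above, equivalent strings built from different code points and case-folding, can be sketched with the standard library. The sample strings are illustrative assumptions; locale-aware sorting needs additional tooling and is not shown here.

```python
import unicodedata

composed = "caf\u00e9"        # "e with acute" as a single code point
decomposed = "cafe\u0301"     # "e" followed by a combining acute accent

# The two strings render identically but differ as raw code point sequences.
raw_equal = composed == decomposed

# Normalizing both to the same form (NFC here) makes them comparable.
nfc_equal = (unicodedata.normalize("NFC", composed)
             == unicodedata.normalize("NFC", decomposed))

# casefold() is more aggressive than lower(): German sharp s folds to "ss".
folded_equal = "Stra\u00dfe".casefold() == "STRASSE".casefold()
lowered_equal = "Stra\u00dfe".lower() == "STRASSE".lower()
```

Normalizing before comparing, and using `casefold()` rather than `lower()` for caseless matching, covers many everyday comparison surprises.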
13:55 - 14:25

How to write a Python transpiler

We all know Python is a powerful and expressive programming language. What you may not know is how much of the internals of Python itself is exposed for you to use and manipulate. In this talk, you'll be introduced to the tools and libraries Python provides to manipulate the compilation and execution of Python code. You will also see how you can use those tools to target execution environments other than the CPython virtual machine.
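
As a rough illustration of manipulating compilation, the standard `ast` module can parse source into a tree, rewrite it, and compile the result. The transformer below is a hypothetical toy, not the speaker's transpiler: it merely turns every addition into a multiplication before the code runs.

```python
import ast

source = "result = 2 + 3"

# Parse source text into an abstract syntax tree.
tree = ast.parse(source)


class AddToMult(ast.NodeTransformer):
    """Hypothetical transformer: rewrite every + into *."""

    def visit_BinOp(self, node):
        self.generic_visit(node)            # transform nested expressions first
        if isinstance(node.op, ast.Add):
            node.op = ast.Mult()
        return node


new_tree = ast.fix_missing_locations(AddToMult().visit(tree))

# Compile the modified tree to a code object and execute it.
namespace = {}
exec(compile(new_tree, filename="<ast>", mode="exec"), namespace)
```

A transpiler targeting another environment would walk the same tree (or the bytecode compiled from it) and emit output for that target instead of calling `exec`.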

Know thy self: Methods and method binding

Methods are like functions, but different. How? Why? And what's with having to type 'self' all the time? We'll explore partial application of functions and review why it might be nice to start using classes. Then to clarify how method objects work we'll examine the result of accessing the method attributes of an object without calling them. Understanding the behavior we uncover here will require more attribute lookup experiments, which will lead us to discover the power of descriptors. Along the way we'll peek in at other languages' approaches to method binding, hopefully coming to appreciate the way Python does things enough to type 'self' a few thousand more times.
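
The experiment of accessing a method without calling it can be sketched as follows. The `Greeter` class is a hypothetical example; the behaviour shown (functions acting as descriptors whose `__get__` produces bound methods) is standard Python 3.

```python
class Greeter:
    """Hypothetical class used only to illustrate method binding."""

    def __init__(self, name):
        self.name = name

    def greet(self, punctuation="!"):
        return "Hello, " + self.name + punctuation


g = Greeter("world")

# Stored in the class namespace, greet is a plain function.
plain = Greeter.__dict__["greet"]

# Looked up on an instance, it becomes a bound method that remembers g.
bound = g.greet

# Functions are descriptors: attribute access calls __get__ to do the binding.
manual = plain.__get__(g, Greeter)
```

Calling `bound()` is therefore equivalent to calling `plain(g)`, which is where the explicit `self` parameter comes from.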

Magic Method, on the wall, who, now, is the `__fairest__` one of all?

Magic methods are a very powerful feature of Python and can open a whole new door for you. However, with great power comes great responsibility. In this talk we explore magic methods' capabilities by first designing new interfaces in a series of fun experiments. Secondly, we play with creating undeletable objects and learn about the mighty Garbage Collector in CPython and how a single magic method can overturn the fate of the object. Lastly, we create a lazy Redis client to illustrate a practical application of magic methods and learn about lazy loading. Once you see what magic methods can bring to the table, the limit is only your imagination!
14:35 - 15:05

Decorators, unwrapped: How do they work?

Decorators are a syntactically-pleasing way of modifying the behavior of functions in Python. However, they can be highly opaque to Python beginners. It took me a while to learn how to write one, and even after I was confident writing my own decorators, they still felt magical. The goal of this talk is to demystify decorators by methodically stepping through how and why they work. Along the way we'll touch on closures, scopes, and how Python is compiled.
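
As a minimal sketch of the mechanics, a decorator is a function that takes a function and returns a replacement, usually a closure over the original. The `count_calls` decorator and `add` function below are hypothetical examples, not taken from the talk.

```python
import functools


def count_calls(func):
    """Hypothetical decorator that counts how often func is called."""
    @functools.wraps(func)          # keep func's name and docstring
    def wrapper(*args, **kwargs):
        wrapper.calls += 1          # state lives on the wrapper function
        return func(*args, **kwargs)
    wrapper.calls = 0
    return wrapper


@count_calls                        # same as: add = count_calls(add)
def add(a, b):
    """Return a + b."""
    return a + b


result = add(2, 3)
```

After decoration the name `add` refers to `wrapper`, which reaches the original function through its closure; `functools.wraps` also exposes it as `add.__wrapped__`.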

Designing secure APIs with state machines

Did you ever need to create an application whose behavior varies with its state, while still presenting a consistent interface to its callers? A good, layered design using state machines can help avoid the tedious 'if' checks for flags, and ensure that if your code runs at all, it will run with all the required values initialized. I will demonstrate this with examples, and talk about some available tools and libraries to build state machines in Python. I will also discuss how to effectively use the process of threat modeling to build secure web applications. Threat modeling is a computer security technique that helps you better understand the systems you create, identify attacks, and build defenses. I will talk about things that we, as software developers, can do to assess the security of our applications in the real world through this process.

How to make a good library API

It's not easy to write libraries with great APIs. We're aware of that. However, it's not always clear how we can follow abstract ideals like elegance, simplicity, and extensibility to improve our APIs. That's why in this talk we'll discuss good and bad APIs with real-world examples. For each thing learned, we'll come up with a checklist to help us with practical advice for writing good APIs.

Human-Machine Collaboration for Improved Analytical Processes

Over the last several years, Python developers interested in data science and analytics have acquired a variety of tools and libraries that aim to facilitate analytical processes. Libraries such as Pandas, Statsmodels, Scikit-learn, Matplotlib, Seaborn, and Yellowbrick have made tasks such as data wrangling, statistical modeling, machine learning, and data visualization much quicker and easier. They have accomplished this by automating and abstracting away some of the more tedious, repetitive processes involved with analyzing and modeling data. Over the next few years, we are sure to witness the introduction of new tools that are increasingly intelligent and have the ability to automate more complex analytical processes. However, as we begin using these tools (and developing new ones), we should strongly consider the level of automation that is most appropriate for each case. Some analytical processes are technically difficult to automate, and therefore require large degrees of human steering. Others are relatively easy to automate but perhaps should not be due to the unpredictability of results or outputs requiring a level of compassionate decision-making that machines simply don’t possess. Such processes would benefit greatly from the collaboration between automated machine tasks and uniquely human ones. After all, it is often systems that utilize a combination of both human and machine intelligence that achieve better results than either could on their own. In this talk, we will discuss human-machine collaboration as it applies to analyzing data with Python. We will review a framework for exploratory data analysis with the goal of identifying which tasks should be automated, which tasks should not, and which tasks would benefit from a more interactive, symbiotic, and collaborative process between the human and the machine. We will explore Python libraries that we can use to build tools that allow us to perform different types of analysis. 
We’ll also introduce the Cultivar project, an example of a hybrid analytics tool that combines a Django framework with JavaScript visualizations and Celery for task management to facilitate more efficient and effective human-machine systems for data analysis.

The Fastest FizzBuzz in the West: Make Your Own Language with RPLY and RPython

In this talk, you'll learn how I built DIVSPL (Dustin Ingram's Very Special Programming Language), a tongue-in-cheek domain-specific language, which is particularly good for implementing FizzBuzz -- as quickly as possible. We'll build DIVSPL with RPLY, an implementation of David Beazley's PLY (but with a 'cooler' API) and make it compatible with RPython, a restricted subset of the Python programming language. Along the way, you'll learn about lexers, parsers, and grammars, and in the end, you'll know how to build your own language.
15:15 - 15:45

Library UX: Using abstraction towards friendlier APIs

Complicated libraries can be a pain in the butt to use. It's not surprising that there are a lot of 'X for humans' libraries out there, some of which are mostly wrappers around more frustrating interfaces. This is not a theoretical talk. I'll touch upon theory to give you context, but will then talk about what that means for you in practice so that you can write better libraries. I'll talk about why library UX matters, about abstraction as a general concept, about what happens when you over/under abstract, and about some useful tips to help build friendly APIs. Meanwhile, I'll show some positive examples from libraries we know and love (Flask, SQLAlchemy, Requests, etc). Once you recognize these effects in play, you'll be able to apply them to your own code and make life better for everyone!

Lights, camera, action! Scraping a great dataset to predict Oscar winners

Using Jupyter notebooks and scikit-learn, you’ll predict whether a movie is likely to win an Oscar or be a box office hit. Together, we’ll step through the creation of an effective dataset: asking a question your data can answer, writing a web scraper, and answering those questions using nothing but Python libraries and data from the Internet.
15:15 - 16:00

Executing python functions in the linux kernel by transpiling to bpf

eBPF is a Linux kernel bytecode that can be used for functionality ranging from tracing system calls with kprobe to routing packets with tc. This talk is about a pure-Python front end for eBPF that allows users to write simple Python functions to be executed in the kernel. I'll first explain how this was made to work and then I'll show off some of the features and capabilities of this approach with working examples.

The Dictionary Even Mightier

Since my “Mighty Dictionary” talk at PyCon 2010, the Python dictionary has evolved dramatically. Come learn about all of the improvements, up to and including the re-architecture that has just landed with Python 3.6! The talk will discuss iterable views, the dictionary’s dedicated comprehension syntax, random key ordering, the special key-sharing dictionary designed to underlie object collections, and, most famously of all, the new “compact dictionary” that cuts dictionary storage substantially, and carries a fascinating side-effect. Each new feature that the talk discusses will be motivated by considering the trade-offs inherent in hash table data structure design, and followed up with hints about how you can now use the dictionary even more effectively in your own code!
16:30 - 17:00

No More Sad Pandas: Optimizing Pandas Code for Speed and Efficiency

When I first began working with the Python Pandas library, I was told by an experienced Python engineer: 'Pandas is fine for prototyping a bit of calculations, but it's too slow for any time-sensitive applications.' Over multiple years of working with the Pandas library, I have realized that this is only true if not enough care is put into identifying proper ways to optimize the code's performance. This talk will review some of the most common beginner pitfalls that can cause otherwise perfectly good Pandas code to grind to a screeching halt, and walk through a set of tips and tricks to avoid them. Using a series of examples, we will review the process for identifying the elements of the code that may be causing a slowdown, and discuss a series of optimizations, ranging from good practices of input data storage and reading, to the best methods for avoiding inefficient iterations, to using the power of vectorization to optimize functions for Pandas dataframes.

Slot or not: higher performance custom objects in pure Python

__slots__ are versatile for certain kinds of uses and users, if you know how they work. At first glance, they seem like a free lunch, with improvements in both time and space. At second glance, they seem to have so many caveats as to make them not worth using. This talk is a deep dive into how __slots__ work, how to wring every benefit out, as well as the actual caveats and alternatives, with recommendations for writers of core libraries as well as end users.

The Python Visualization Landscape

So you want to visualize some data in Python: which library do you choose? From Matplotlib to Seaborn to Bokeh to Plotly, Python has a range of mature tools to create beautiful visualizations, each with their own strengths and weaknesses. In this talk I’ll give an overview of the landscape of dataviz tools in Python, as well as some deeper dives into a few, so that you can intelligently choose which library to turn to for any given visualization task.

Writing a C Python extension in 2017

This talk describes the build of a C Python extension, with prebuilt binaries, in 2017, where modern packaging standards, as well as Docker, have been a game changer in the Python extensions world. Most examples come from our experience building PyMiniRacer, an embedded Python / JavaScript bridge used in production across hundreds of companies. We will describe the different aspects of building a binary extension, including: using the modern manylinux wheel type in order to ship a built binary, usable in most Linux distributions; the choices offered to developers when building an extension: the Python public C API, cffi, ...; testing of a binary module across various platforms; troubleshooting & debugging an extension: the basics you need to tackle most common issues.

Yes, It's Time to Learn Regular Expressions

Regular expressions have a reputation as opaque and inscrutable. However, the basic concepts behind 'regex' and text pattern recognition are simple to grasp. This talk is for any programmer who isn't familiar with Python's re module and its best practices. Stop putting it off, it's time to learn regular expressions!
17:10 - 17:40

Cython as a Game Changer for Efficiency

Are you running a Web application? Do you suffer from CPU bottlenecks that slow down your growth? There's a tool that can easily fix all that, and then some. C++ knowledge not required. Come learn how Instagram, the world's largest Django deployment with more than 600M active users, saved ~30% of global CPU by rewriting a handful of modules on the critical path in Cython. Learn to apply those techniques to your own projects with little effort and stop worrying about switching to other programming languages or rewriting stable components in C++.

Fuzzy Search Algorithms: How and When to Use Them

Fuzzy searching, or approximate string matching, is powerful because text data is often messy. For example, shorthand and abbreviated text are common in various data sets. In addition, outputs from OCR or voice-to-text conversions tend to be messy or imperfect. Thus, we want to be able to make the most of our data by extrapolating as much information as possible. In this talk, we will explore the various approaches used in fuzzy string matching and demonstrate how they can be used as a feature in a model or a component in your Python code. We will dive deep into the approaches of different algorithms such as Soundex, Trigram/n-gram search, and Levenshtein distances and what the best use cases are. We will also discuss situations where it’s important to take into account the meaning or intent of a word and demonstrate approaches for measuring semantic similarity using nltk and word2vec. Furthermore, we will demonstrate via live coding how to implement some of these fuzzy search algorithms using Python and/or built-in fuzzy search functions within PostgreSQL.

Python for mathematical visualization: a four-dimensional case study

This is a talk about creating pictures of a mathematical object---specifically, a 4-dimensional fractal 'dust' that has been the subject of mathematical research in hyperbolic geometry since the 1980s. In the end this is accomplished using a little algebra, a little geometry, and a healthy dose of Python. That is, I will present a case study of using Python in several aspects of a mathematical visualization project, from the computation itself, to transforming and converting data, and finally for scripting the process of generating the images. Along the way I'll explain how Python's convenient idioms and containers (e.g. sets and set comprehensions) are a good fit for some of the algebraic and geometric questions that come up, how Scipy and Numpy enable fast numerical calculations, and how Python's strength as a language for scripting and automation allows easy orchestration of rendering of still images and frames of animations. The mathematical visualization project we describe is a collaboration with François Guéritaud (Université de Lille).

Tracing, Fast and Slow: Digging into and improving your web service’s performance

Do you maintain a Rube Goldberg-like service? Perhaps it’s highly distributed? Or you recently walked onto a team with an unfamiliar codebase? Have you noticed your service responds slower than molasses? This talk will walk you through how to pinpoint bottlenecks, approaches and tools to make improvements, and make you seem like the hero! All in a day’s work. The talk will describe various types of tracing a web service, including black & white box tracing, tracing distributed systems, as well as various tools and external services available to measure performance. I’ll also present a few different rabbit holes to dive into when trying to improve your service’s performance.

Unicode: what is the big deal?

Ever wondered why people complain that text processing is a hard problem? Or why Python 3 would introduce such a big backward incompatibility with switching to Unicode? Wonder no more, this talk is for you. In 30 minutes I'm going to demonstrate real world text processing problems and how Python 3 helps solve them. The talk is going to explain how you should split your text from binary data in your application, what are sensible defaults and what are possible gotchas. All this sprinkled with a healthy dose of frustration by a guy whose first name starts with Ł.

Sunday, 21st May

13:10 - 13:40

async/await and asyncio in Python 3.6 and beyond

The talk gives an overview of async/await, asynchronous generators and comprehensions in Python 3.6, and the asyncio module. We'll discuss when and how asyncio should be used in modern applications and services, what uvloop is, and what asyncio frameworks and libraries one should use. I'll share our ideas about where asyncio is headed and what to expect in Python 3.7.

Hacking Classic Nintendo Games with Python

Do you feel like using your superpowers as a developer to bring the games of your childhood into the future with the power of the Internet? In this live coded journey, we'll build an SMS powered 'Game Genie' allowing the audience to send text messages to manipulate the Nintendo games being played in real time. This will involve working with Flask, the Twilio API, and the FCEUX NES emulator, and bridging them with quick Lua scripts.

Piecing it Together: A beginner's guide to application configuration

Assembling all the necessary setup for an application you’re building can often be more frustrating than writing the app itself. Learning to do this well is difficult, especially for those who are new to Python and might not know where to begin or what questions to ask. While there is no “right way” to set up a development environment or application, understanding the components involved and how they interact can empower you to customize your setup to best suit your needs. This talk will dive into what happens when setting up a database, the purpose and configuration of an isolated environment, how Python packages are installed, and finally, how each of these components interacts with the others and the application itself; in particular, how an application's structure facilitates said interactions.

Probabilistic Programming with PyMC3

Bayesian statistics offers robust and flexible methods for data analysis that, because they are based on probability models, have the added benefit of being readily interpretable by non-statisticians. Until recently, however, the implementation of Bayesian models has been prohibitively complex for use by most analysts. But the advent of probabilistic programming has served to abstract the complexity of Bayesian statistics, making such methods more broadly available. PyMC3 is an open-source Python module for probabilistic programming that implements several modern, computationally-intensive statistical algorithms for fitting Bayesian models, including Hamiltonian Monte Carlo (HMC) and variational inference. PyMC3’s intuitive syntax is helpful for new users, and the reliance on Theano for much of the computational work has allowed developers to keep the code base simple, making it easy to extend the software to meet analytic needs. PyMC3 itself extends Python's powerful 'scientific stack' of development tools, which provide fast and efficient data structures, parallel processing, and interfaces for describing statistical models.

Share Your Code! Python Packaging Without Complication

If you want people to use your code you should package it! You may have heard that packaging is hard but the Python packaging ecosystem has evolved a lot over the years. Taking your beautiful code and sharing it with the world is complex but it doesn't have to be complicated. In this talk you will learn how to take advantage of modern tooling and practices so you can get boring stuff out of the way, publish quickly and frequently, and focus on your code. This talk will cover: A (brief) history of Python packaging, Python Packaging User Guide recommendations, Distribution formats, Anatomy of a package, Automating package creation, Adding: Testing, CI, Documentation, Testing package installation, Releasing to PyPI. This talk is for you if you're new to Python packaging and would like to learn how to share your code or if you've worked with Python for a while and just aren't up-to-date with the latest packaging practices.
13:50 - 14:20

Asynchronous Python for the Complete Beginner

With the introduction of the asyncio package in Python 3.4, you can hear lots of people talking about asynchronous programming, most in a favorable way, some not so much. In this talk, I will tell you what this async fever is about and what it can do for you that regular Python can't, not only with asyncio, but also with other frameworks that existed long before it.

Look mum no hands! From blinking LEDs to a bike speedometer with MicroPython

In this talk I will show you how to use a micro-controller to build a wifi enabled speedometer for your bike, using MicroPython. And some hardware. And a bike (maybe). I will introduce you to the world of MicroPython: a python distribution that runs on micro-controllers. Micro-controllers are small computers that are all around us: in cars, TVs, and your internet connected fridge. We will start with making LEDs blink, then serve webpages, build an interrupt handler and finally put it all together to make a wifi enabled speedometer for a bike.

One Data Pipeline to Rule Them All

There are myriad data storage systems available for every use case imaginable, but letting application teams choose storage engines independently can lead to duplicated efforts and wheel reinvention. This talk will explore how to build a reusable data pipeline based on Kafka to support multiple applications, datasets, and use cases including archival, warehousing and analytics, stream and batch processing, and low-latency 'hot' storage.

Python in The Serverless Era

Serverless is the latest phase in the evolution of cloud development. Its building blocks are functions, a bunch of stateless “nano-services” that can scale automatically and are charged only when used. It enables teams to focus more on development while relying on fully managed servers. In this talk I'll cover serverless architecture practices, use cases, tooling, and the role Python plays in it.

The Next Step: Finding Model Parameters With Random Walks

The statistician John Tukey -- who designed the box plot and coined the term 'bit' -- wrote: 'An approximate answer to the right problem is worth a good deal more than an exact answer to an approximate problem'. Python has become one of the major languages for statistical data analysis, not least because of the expressiveness of the language itself and the availability of tools like Jupyter Notebooks, which enable iterative reasoning about a problem and its solutions. This talk takes one step beyond an introduction to statistics with Python and aims to familiarize the audience with two concepts: a class of problems (so-called inverse problems), and a powerful statistical tool (the random walk, or more formally Markov-Chain Monte Carlo (MCMC) sampling with the Metropolis algorithm). In inverse problems, model parameters are estimated from observational data. Both model and data are expected to be affected by error. The objective is not only to find parameters that best describe the observations, but also to figure out how good, or how possibly bad, a solution might be. Inverse problems are extremely common in many fields and crop up each time we attempt to reconstruct a reality from sensor, radar, scattering or imaging data. The Metropolis-Hastings algorithm offers a solution via random sampling of a Bayesian posterior distribution. Even though listed as one of the 20th century's top 10 algorithms by the journal Computing in Science & Engineering, the Metropolis algorithm is easy to understand and implement, and a fun and instructive way to explore even complicated multi-variate probability distributions.
14:30 - 15:00

aiosmtpd - A better asyncio based SMTP server

smtpd.py has been in the standard library for many years. It's been a common tool for deploying SMTP and LMTP servers that handle email-based communication in Python, providing both basic protocol implementations and a fundamental module for higher level tools, such as lazr.smtptest for testing email clients. Based on asyncore and asynchat, smtpd.py is showing its age, and its API is unwieldy. Fortunately, there's a new alternative available. aiosmtpd is a modern reinvention based on asyncio, with all the improvements that come along with such a new implementation. It provides servers for both the SMTP and LMTP protocols, as well as a higher level 'controller' API for testing SMTP and LMTP clients. It exposes a much better API for customization, allowing the user to associate a simple 'handler' to process incoming messages without having to worry about the details of the protocols, and it provides some useful hooks for subclassing. This talk will describe the purpose and history of smtpd.py and aiosmtpd, show how users can extend the servers and implement specialized handlers, and show how applications can use the testing API for ensuring that their email sending applications do the right things. Examples will be taken from GNU Mailman 3, which uses aiosmtpd extensively.

Bayesian Statistical Analysis with Python

You've got some data, and now you want to analyze it with Python. You're on your way to greatness! Now the problem comes: do I do the t-test? Chi-squared test? How do I decide? In this talk, inspired by many Pythonista Bayesians (@jakevdp, @allendowney, @twiecki, @fonnesbeck) before, I will show you how you can take common statistical decision problems, formulate them as a Bayesian analysis problem, and use PyMC3 as your workhorse tool for gaining insights. This talk will be math-light and code-heavy, and if you download the slides, you'll have a simple template for more complex Bayesian analysis down the road!

Building Stream Processing Applications

Do you have a stream of data that you would like to process in real time? There are many components with Python APIs that you can put together to build a stream processing application. We will go through some common design patterns, tradeoffs and available components / frameworks for designing such systems. We will solve an example problem during the presentation to make these points concrete. Much of what will be presented is based on experience gained from building production pipelines for the real-time processing of ad streams at Yelp. This talk will cover topics such as consistency, availability, idempotency, scalability, etc.

Gothic Colors: Using Python to understand color in nineteenth-century literature

Do you love literature and programming? Have you ever been curious about what the heck “Digital Humanities” are? Join us for a quick survey of what’s going on in this growing field and learn about a specific project, “Gothic Colors” where we set out to enumerate and analyze color references and mood in 19th century Gothic novels, using Python and a couple of popular libraries.

Hacking Cars with Python

Modern cars are networks of computers, and a high end vehicle could have nearly 100 different computers inside. These devices control everything from the engine to the airbags. By understanding how these systems work, we can interface with vehicles to read data, perform diagnostics, and even modify operation. In this talk, we'll discuss pyvit, the Python Vehicle Interface Toolkit. This library, combined with some open source hardware, allows developers to talk to automotive controllers from Python. We will begin with an introduction to automotive networks, to provide a basis for understanding the tools. Next, we will look at the tools and show the basics of using them. Finally, we'll discuss real world applications of these tools, and how they're being used in the automotive world today.
Wednesday, 17th May
 
Keynotes venue
Python Language Summit (10:00 - 16:00)

About the session

The Python Language Summit is an invitation-only event for the developers of Python implementations (CPython, IronPython, Jython, Parrot, PyPy, etc.) to share information, discuss our shared problems, and — hopefully — solve them. These issues might be related to the language itself, the standard library, the development process, Python 2 and 3, the documentation, package index, web site, et cetera. The Summit will focus on discussion more than on presentations.

Speakers

Larry Hastings
Larry has been using Python since the lateish 90s. He is a CPython core contributor, and is Release Manager for Python 3.4.

Barry Warsaw
Barry is a long time Pythonista, having worked with Guido and the rest of the Pythonlabs crew since 1995. He is the project leader for GNU Mailman, working mainly on Mailman 3 these days. He's been the release manager for several previous versions of Python, and still comes up with terrible ideas for Python.

Keynotes venue

Wednesday, 17th May, 10:00 - 16:00

  • Summit


Twist Bioscience: Using Python in Synthetic DNA Manufacturing (11:00 - 12:30)

About the session


Speakers

Kelsey Hightower

Google

Kelsey Hightower is an open source advocate and recovering sysadmin who is currently serving the application container and distributed systems community as an educator and toolsmith. He is currently employed by Google.

Katy Huff

University of Illinois

Dr. Kathryn D. Huff is an unapologetic advocate for open reproducible scientific computing and for emissions-free base-load nuclear energy. She is currently an Assistant Professor in the Department of Nuclear, Plasma, and Radiological Engineering at the University of Illinois at Urbana-Champaign where she leads the Advanced Reactors and Fuel Cycles Research Group. She holds an affiliate faculty position with the National Center for Supercomputing Applications and is one of the University of Illinois' most recent Blue Waters Professors.

Jake Vanderplas

University of Washington

Jake VanderPlas is an astronomer by training, and a long-time user and developer of the scientific Python stack. He currently works as an interdisciplinary research director at the University of Washington, where he writes, teaches, collaborates on research, and spends time consulting with local scientists from a wide range of fields.

Lisa Guo

Instagram

Lisa Guo is a networking, platform, and scalability software engineer with over 20 years of experience. She has been working with the Instagram Infrastructure team since 2014, where she led efforts to expand from a single to multiple data centers and improve efficiency and cost-effectiveness.

Hui Ding

Instagram

Hui Ding is Head of the Infrastructure org at Instagram, where he oversees the scaling of the Instagram backend platform that supports hundreds of millions of concurrent users on a daily basis. Hui has been with Instagram since 2012, and has led the development of many Instagram product launches as well as all infrastructure efforts.

Keynotes venue

Wednesday, 17th May, 11:00 - 12:30

  • Keynote


Room 1
Beginning Python Bootcamp (09:00 - 12:20)

About the session

Are you new to Python? Or do you feel like you grok the syntax, but would like to understand new idioms and where to use them? Want to watch an experienced Python developer create code from nothing? Instead of just covering the syntax, we will introduce most of Python as we build code together. Bring your laptop and we will program a predictive text engine from scratch together. Follow along as we start with IDLE (or your favorite editor) and a blank file, and end with a tested idiomatic Python module. It will learn from any text we pass into it, and predict characters or words for us. Just like your phone!

Speakers

Matt Harrison
Matt is a Python user, presenter, author, and user group organizer. He helps run the Utah Python user group. He authored Treading on Python Vol 1 & 2. His work experience covers search, business intelligence, & data science.

Room 1

Wednesday, 17th May, 09:00 - 12:20

  • Tutorials


Decorators and descriptors decoded (13:20 - 16:20)

About the session

Python developers use decorators and descriptors on a daily basis, but many don't understand them well enough to create (or debug) them. Decorators are widely deployed in popular Python Web frameworks. Descriptors are the key to the database mappers used with those frameworks, but under the covers they play an even more crucial role in Python as the device that turns plain functions into bound methods, setting the value of the self argument. This tutorial is a gentle introduction to these important language features, using a test-driven presentation and exercises, and covering enhancements in Python 3.6 that make class metaprogramming easier to get right. Decorators without closures are presented first, highlighting the difference between run time and import time that is crucial when meta-programming. We then get a firm grounding on closures and how they are implemented in Python, before moving to higher order function decorators and class decorators. Coverage of descriptors starts with a close look at Python's property built-in function and dynamic attribute look up. We then implement some ORM-like field validation descriptors, encounter a usability problem, and leverage PEP 487 -- Simpler customisation of class creation -- to solve it. Alternative implementations using a class decorator and a metaclass will be contrasted to the PEP 487 solution.

Speakers

Luciano Ramalho
Luciano Ramalho is a Technical Principal at ThoughtWorks and the author of the bestselling book Fluent Python (O'Reilly, 2015). Since 1998 he has deployed Python on some of the largest Web sites in Brazil. His speaking record includes PyCon US, OSCON, OSCON-EU, PythonBrasil, RuPy and an ACM Webinar that was attended by more than 900 people. Ramalho is a fellow of the PSF and co-founder of the Brazilian Python Association and of Garoa Hacker Clube, the first hackerspace in Brazil.

Room 1

Wednesday, 17th May, 13:20 - 16:20

  • Tutorials


Room 2
Microservices with Python and Flask
Microservices with Python and Flask (13:20 - 16:40)

About the session

Microservices are receiving the buzzword treatment these days, and as such, they have a cloud of hype surrounding them that makes it hard to separate substance from fluff. In this tutorial, Miguel Grinberg starts with an introduction to this architecture, including what's great and not so great about it, and then teaches you how a traditional monolithic application written in Flask can be refactored into a modern distributed system based on microservices.

Speakers

Miguel Grinberg
Miguel Grinberg is a Software Developer at Rackspace, where he works on cloud applications. He blogs at http://blog.miguelgrinberg.com about a variety of topics including web development, Python, robotics, photography and the occasional movie review. Miguel is the author of the O'Reilly book Flask Web Development. He lives in beautiful Portland, Oregon. Follow @miguelgrinberg on Twitter.

Room 2

Wednesday, 17th May, 13:20 - 16:40

  • Tutorials


Room 2
Web programming from the beginning
Web programming from the beginning (09:00 - 12:20)

About the session

Modern web frameworks such as Django and Flask are immensely powerful. However, these useful tools obscure the foundations of network programming upon which they are based, which can be very helpful to understand. So instead of building useful applications with these libraries, let's experiment with sockets! At this tutorial, a Python-flavored history of the web will be presented and attendees will write or modify a TCP chat client, a static site web server, an HTTP client, a CGI script, and a WSGI-compliant server and web application. We will learn what all those things are and how they fit together, bringing the architecture of modern web apps into better focus. The material will be accessible to participants with no web development experience; however, they must be able to write and run Python scripts at the command line. This tutorial might appeal to someone also attending an introductory web development tutorial, but it covers separate, complementary material. Web development experience is not required but a little exposure would be helpful; for instance, installing Flask and running the minimal application on the quickstart page. Similarly, prior exposure to HTML would be useful but is not necessary.
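
For readers unfamiliar with the term, a WSGI application is simply a callable that takes the request environment and a `start_response` function. The following is a minimal sketch using only the standard library's `wsgiref`; it is not part of the tutorial materials, and the greeting text, host, and port are arbitrary assumptions.

```python
from wsgiref.simple_server import make_server


def application(environ, start_response):
    """Minimal WSGI application: echoes the requested path as plain text."""
    path = environ.get("PATH_INFO", "/")
    body = f"Hello from {path}\n".encode("utf-8")
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]  # WSGI expects an iterable of bytes


def serve(host="127.0.0.1", port=8000):
    """Run the app with the reference WSGI server (blocks until interrupted)."""
    with make_server(host, port, application) as httpd:
        httpd.serve_forever()
```

Calling `serve()` and visiting `http://127.0.0.1:8000/chat` in a browser should show the greeting; the same `application` callable could equally be handed to any other WSGI-compliant server.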

Speakers

Thomas Ballinger
Tom is a Python enthusiast employed by Dropbox. Previously he worked as a facilitator at the Recurse Center in NYC where he explored dark corners of Python with participants and helped bring more BitTorrent clients into the world.

Room 2

Wednesday, 17th May, 09:00 - 12:20

  • Tutorials


Room 3
An introduction to secure web development with Django and Python
An introduction to secure web development with Django and Python (09:00 - 12:20)

About the session

You can't afford to have security be an optional or 'nice-to-have' feature in your applications. Luckily, Django has your back: this workshop will introduce you to thinking about security, cover a broad range of security concerns from the mundane to the arcane, and walk you through, in detail, how Django and the broader Django and Python ecosystems can help protect you and your users from them.

Speakers

James Bennett
Philosopher turned web geek. Django committer, chronic documenter, now on the engineering team at Clover Health, trying to make health insurance better.

Room 3

Wednesday, 17th May, 09:00 - 12:20

  • Tutorials


How to Write and Debug C Extension Modules
How to Write and Debug C Extension Modules (13:20 - 16:40)

About the session

The CPython interpreter allows us to implement modules in C for performance-critical code or to interface with external libraries while presenting users with a high-level Python API. This tutorial will teach you how to leverage the power of C in your Python projects. We will start by explaining the C representation of Python objects and how to manipulate them from within C. We will then move on to implementing functions in C for use in Python. We will discuss reference counting and correct exception handling. We will also talk about how to package and build your new extension module so that it may be shared on PyPI. (We will only be covering building extension modules on GNU/Linux and OS X, not Windows.) After the break, we will show how to implement a new type in C. This will cover how to hook into various protocols and properly support cyclic garbage collection. We will also discuss techniques for debugging C extension modules with gdb using the CPython gdb extension.

Speakers

Joe Jevnik
Joe Jevnik works at Quantopian, where he integrates data from various sources into the platform. Joe works on Zipline, Quantopian's open source backtester. He also works on the Blaze ecosystem, mainly on blaze core, odo, and datashape.

Room 3

Wednesday, 17th May, 13:20 - 16:40

  • Tutorials


Room 4
Contract-First API Development Using The OpenAPI Specification (Swagger)
Contract-First API Development Using The OpenAPI Specification (Swagger) (09:00 - 12:20)

About the session

Often developers will implement APIs and then only after they’re released think about things like specifications and documentation. Instead we can make the design of the API contract an explicit part of our development process using The OpenAPI Specification (Swagger) and open source tools.

In this workshop we will:

  • Discuss the contract-first approach
  • Build and validate a simple OpenAPI Specification
  • Generate reference documentation and show how you can incorporate it with other docs
  • Run a mock server so clients can test using the API
  • Generate stub code based on the specification
  • Implement a basic working API using Flask
  • Show how you can iteratively add features and make changes
  • Discuss generating specifications for existing APIs

Participants will leave with:

  • An understanding of how to incorporate a contract-first process into their API development workflow
  • An example specification that can be used as reference for their own API design
  • Working code for a basic API that can be used as a basis for their own development

Participants are expected to have a basic familiarity with HTTP / RESTful APIs, understanding of simple git operations, and some development experience.

Speakers

Dave Forgac
Dave Forgac has been a FOSS enthusiast ever since installing Linux for the first time in the late '90s. He got a taste of Python in the early '00s and was hooked. He currently works as a Sr. Software Engineer at American Greetings in Cleveland, OH, where he is responsible for API development, application deployment, and developer happiness engineering. He loves building communities and is an organizer of PyOhio, ClePy, and the Cleveland API Meetup.

Ian Zelikman
Ian has worked with web technologies for over 10 years. He started his career in the J2EE world, with a passion for developing applications for weather forecasting and analysis. After a short experience working on a white-label platform with the Grails framework, Ian discovered the Python community.

Room 4

Wednesday, 17th May, 09:00 - 12:20

  • Tutorials


Readable Regular Expressions
Readable Regular Expressions (13:20 - 16:40)

About the session

What are regular expressions, what are they useful for, and why are they so hard to read? We'll learn what regular expressions are good for, how to make our own regular expressions, and how to make our regular expressions friendly and readable (yes it's possible, sort of).
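
One common way to make a pattern more readable, offered here only as a sketch and not necessarily the approach the tutorial takes, is Python's `re.VERBOSE` flag, which lets a pattern span several lines with comments and named groups. The date format and sample sentence are arbitrary assumptions.

```python
import re

# The same pattern written compactly would be r"(\d{4})-(\d{2})-(\d{2})".
DATE_RE = re.compile(
    r"""
    (?P<year>\d{4})    # four-digit year
    -
    (?P<month>\d{2})   # two-digit month
    -
    (?P<day>\d{2})     # two-digit day
    """,
    re.VERBOSE,
)

match = DATE_RE.search("The deadline is 1999-12-31, no extensions.")
year, month, day = match.group("year", "month", "day")
```

In verbose mode, whitespace in the pattern is ignored and `#` starts a comment, so a literal space or `#` would need to be escaped or placed in a character class.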

Speakers

Trey Hunner
Trey Hunner is a Python & Django mentor specializing in on-site corporate training. Trey holds weekly online Python chats, is a director at the PSF, and is heavily involved with his local Python meetup group in San Diego.

Room 4

Wednesday, 17th May, 13:20 - 16:40

  • Tutorials


Room 5
Cross-platform Native GUI development with BeeWare
Cross-platform Native GUI development with BeeWare (09:00 - 12:20)

About the session


Speakers

Russell Keith-Magee
Russell Keith-Magee has been a member of the Django core team for 10 years, and was the President of the Django Software Foundation from 2010-15. He's also the founder of the Python BeeWare project, a set of tools and libraries for developing native user interfaces in Python. He lives in Perth, Western Australia with his wife, son, daughter, and two cats.

Room 5

Wednesday, 17th May, 09:00 - 12:20

  • Tutorials


ContainerOrchestration.py: The tutorial session
ContainerOrchestration.py: The tutorial session (13:20 - 16:40)

About the session

Container orchestration is the new hot topic in the design of scalable system architectures. In this tutorial we look at the main choices for container orchestrators: Docker Swarm, Kubernetes and Apache Mesos. We will look at the use of the respective Python APIs for interacting with those engines. This 3-hour session will provide hands-on use of those orchestrators with real use cases.

Speakers

Mike Bright
Solution Architect at Hewlett-Packard Enterprise working in the EMEA OpenNFV lab (Cloud Computing for Telecom), based in Grenoble, France. Passionate about Containers, Orchestration and Programming Languages. Runs the Grenoble Python User Group. Likes to travel and dance (Argentinian Tango, Salsa, Rock).

Haïkel Guémar
FedoraProject.org Developer & Ambassador. Senior Software Engineer in a small HPC/SaaS Startup. Lyon Coding Dojo founder & leader.

Mario Loriedo
Principal Software Engineer at Red Hat, developer and container enthusiast

Room 5

Wednesday, 17th May, 13:20 - 16:40

  • Tutorials


Room 6
Time Series Analysis
Time Series Analysis (13:20 - 16:40)

About the session

Time series analysis is more relevant than ever with the rise of big data, the internet of things, and the general availability of data that follows events through time. This tutorial will introduce participants to the many versatile tools Python offers for exploring, analyzing, and predicting time series data. The tutorial will be a mix of lecture and practice, and it will be broken down into four components: (1) handling timestamped data in Python; (2) commonly encountered problems with time series; (3) time series prediction exercises; (4) time series classification exercises.

Speakers

Aileen Nielsen
Since completing degrees in anthropology, law, and physics from Princeton, Yale, and Columbia respectively, Aileen Nielsen has worked in corporate law, physics research laboratories, and, most recently, NYC startups oriented towards improving daily life for under-served populations - particularly groups who have yet to fully enjoy the benefits of mobile technology. She has interests ranging from defensive software engineering to UX designs for reducing cognitive load to the interplay between law and technology. Coming off a recent stint as a data scientist in Hillary Clinton's presidential campaign, Aileen now engineers One Drop's diabetes-management products.

Room 6

Wednesday, 17th May, 13:20 - 16:40

  • Tutorials


Room 6
Using Functional Programming for efficient Data Processing and Analysis
Using Functional Programming for efficient Data Processing and Analysis (09:00 - 12:20)

About the session

As a multi-paradigm language, Python has great support for functional programming. For better or for worse, leading data libraries such as Pandas eschew this style for object-oriented programming. This tutorial will explain how to take advantage of Python's excellent functional programming capabilities to efficiently obtain, clean, transform, and store data from disparate sources.

Speakers

Reuben Cummings
Reuben Cummings is a data scientist and software developer skilled in business development, entrepreneurship, and programming. As Managing Director of Nerevu Development, he has worked with clients including the UN Humanitarian Data Exchange, Moringa School, and Africa’s Talking. Reuben specializes in data analysis, visualization, API development, and workflow automation. Reuben previously served in IT and business roles at social enterprise Global Cycle Solutions in Arusha, Tanzania; and as an analyst at MIDIOR Consulting in Cambridge, Massachusetts. He holds a degree in Chemical Engineering from the Massachusetts Institute of Technology, and is Lead Organizer of the Arusha Coders meetup in Tanzania.

Room 6

Wednesday, 17th May, 09:00 - 12:20

  • Tutorials


Room 7
Exploratory data analysis in python
Exploratory data analysis in python (09:00 - 12:20)

About the session

With the recent advancements in machine learning algorithms and statistical techniques, and the increasing ease of implementing them in Python, it is tempting to ignore the power and necessity of exploratory data analysis (EDA), the crucial step before diving into machine learning or statistical modeling. Simply applying machine learning algorithms without a proper orientation of the dataset can lead to wasted time and spurious conclusions. EDA allows practitioners to gain intuition for the pattern of the data, identify anomalies, narrow down a set of alternative modeling approaches, devise strategies to handle missing data, and ensure correct interpretation of the results. Further, EDA can rapidly generate insights and answer many questions without requiring complex modeling. Python is a fantastic language not only for machine learning, but also for EDA. In this tutorial, we will walk through two hands-on examples of how to perform EDA using Python and discuss various EDA techniques for cross-section data, time-series data, and panel data. One example will demonstrate how to use EDA to answer questions, test business assumptions, and generate hypotheses for further analysis. The other example will focus on performing EDA to prepare for modeling. Between these two examples, we will cover: Data profiling and quality assessment, Basic describing of the data, Visualizing the data including interactive visualizations, Identifying patterns in the data (including patterns of correlated missing data), Dealing with many attributes (columns), Dealing with large datasets using sampling techniques, Informing the engineering of features for future modeling, Identifying challenges of using the data (e.g. skewness, outliers), Developing an intuition for interpreting the results of future modeling. The intended audience for this tutorial is aspiring and practicing data scientists and analysts, or anyone who wants to be able to get insights out of data.
Students must have at least an intermediate-level knowledge of Python; some familiarity with analyzing data would be beneficial. Installation of Jupyter Notebook will be required (and potentially, we will also demonstrate analysis in JupyterLab, if its development in the next few months allows). Instructions will be sent on what packages to install beforehand.

Speakers

Chloe Mawer
Chloe Mawer is a Senior Data Scientist at Silicon Valley Data Science, a small consulting company located in Mountain View, California that focuses on transforming businesses through data strategy, science, and engineering. At SVDS, Chloe has worked on problems for pharmaceutical and retail companies, which heavily rely on using Python for data analysis and modeling. Prior to SVDS, she obtained her PhD in Environmental Engineering at Stanford, where she focused on developing methods for monitoring water’s movement in the subsurface using electrical measurements.

Jonathan Whitmore
Jonathan Whitmore, PhD, is a Senior Data Scientist at Silicon Valley Data Science. He is the author of the O'Reilly screencast: Jupyter Notebook for Data Science Teams. Before moving into the tech industry, Dr. Whitmore worked as an astrophysicist in Melbourne, Australia, researching whether the fundamental physical constants have changed over the age of the universe. Dr. Whitmore received his PhD in physics from the University of California, San Diego.

Room 7

Wednesday, 17th May, 09:00 - 12:20

  • Tutorials


Introduction to Statistical Modeling with Python
Introduction to Statistical Modeling with Python (13:20 - 16:40)

About the session

This intermediate-level tutorial will provide students with hands-on experience applying practical statistical modeling methods on real data. Unlike many introductory statistics courses, we will not be applying 'cookbook' methods that are easy to teach, but often inapplicable; instead, we will learn some foundational statistical methods that can be applied generally to a wide variety of problems: maximum likelihood, bootstrapping, linear regression, and other modern techniques. The tutorial will start with a short introduction on data manipulation and cleaning using pandas, before proceeding on to simple concepts like fitting data to statistical distributions, and how to use Monte Carlo simulation for data analysis. Slightly more advanced topics include bootstrapping (for estimating uncertainty around estimates) and flexible linear regression methods using Bayesian methods. By using and modifying hand-coded implementations of these techniques, students will gain an understanding of how each method works. Students will come away with knowledge of how to deal with very practical statistical problems, such as how to deal with missing data, how to check a statistical model for appropriateness, and how to properly express the uncertainty in the quantities estimated by statistical methods.
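
To give a flavour of what a hand-coded bootstrap might look like, here is a minimal percentile-bootstrap sketch using only the standard library. It is an assumption-laden illustration rather than the tutorial's own code: the function name, defaults, and sample data are all hypothetical.

```python
import random
import statistics


def bootstrap_ci(data, stat=statistics.mean, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for `stat` (hypothetical helper).

    Resamples `data` with replacement, recomputes the statistic each time,
    and returns the empirical alpha/2 and 1 - alpha/2 quantiles.
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    n = len(data)
    estimates = sorted(
        stat([rng.choice(data) for _ in range(n)])
        for _ in range(n_resamples)
    )
    lower = estimates[int((alpha / 2) * n_resamples)]
    upper = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Made-up measurements, purely for illustration.
sample = [4.1, 5.3, 3.8, 6.0, 5.1, 4.7, 5.9, 4.4, 5.0, 4.9]
low, high = bootstrap_ci(sample)
```

The width of the interval `(low, high)` is one way to express the uncertainty around the sample mean without assuming a particular distribution.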

Speakers

Christopher Fonnesbeck
Chris Fonnesbeck is an Assistant Professor in the Department of Biostatistics at the Vanderbilt University School of Medicine. He specializes in computational statistics, Bayesian methods, meta-analysis, and applied decision analysis. He originally hails from Vancouver, BC and received his Ph.D. from the University of Georgia.

Room 7

Wednesday, 17th May, 13:20 - 16:40

  • Tutorials


Room 8
Fantastic Data and Where To Find Them: An introduction to APIs, RSS, and Scraping
Fantastic Data and Where To Find Them: An introduction to APIs, RSS, and Scraping (09:00 - 12:20)

About the session

Whether you’re building a custom web application, getting started in machine learning, or just want to try something new, everyone needs data. And while the web offers a seemingly boundless source for custom data sets, the collection of that data can present a whole host of obstacles. From ever-changing APIs to rate-limiting woes, from nightmarishly nested XML to convoluted DOM trees, working with APIs and web scraping are challenging but critically useful skills for application developers and data scientists alike. In this tutorial, we’ll introduce RESTful APIs, RSS feeds, and web scraping in order to see how different ingestion techniques impact application development. We’ll explore how and when to use Python libraries such as feedparser, requests, beautifulsoup, and urllib. And finally we will present common data collection problems and how to overcome them. We’ll take a hands-on, directed exercise approach combined with short presentations to engage a range of different APIs (with and without authentication), explore examples of how and why you might web scrape, and learn the ethical and legal considerations for both. To prepare attendees to create their own data ingestion scripts, the tutorial will walk through a set of examples for robust and responsible data collection and ingestion. This tutorial will conclude with a case study of Baleen, an automated RSS ingestion service designed to construct a production-grade text corpus for NLP research and machine learning applications. Exercises will be presented both as Jupyter Notebooks and Python scripts.

Speakers

Nicole Donnelly
Nicole Donnelly is a data management IT specialist with the Office of the Chief Technology Officer, District of Columbia. She believes a city that consumes and understands its own data is acting in the true spirit of public service by improving the lives of its residents. She has a professional certificate in data science from Georgetown, where she continued as TA and instructor, and has completed the Data Science Immersive program at General Assembly. She is a faculty member with DC data science collaborative District Data Labs. She has Bachelor's degrees from Rutgers University in Computer Science and Art History.

Tony Ojeda
[Tony Ojeda](https://www.linkedin.com/in/tonyojeda) is a data scientist, author, and entrepreneur with expertise in streamlining business processes and over a decade of experience creating innovative data products. He is the Founder of District Data Labs and a Co-founder and former President of Data Community DC. Tony has an MS in Finance from Florida International University and an MBA in Strategy and Entrepreneurship from DePaul University. He co-authored the Practical Data Science Cookbook, published by Packt, and is also a co-author of the forthcoming O'Reilly book Applied Text Analytics with Python.

Will Voorhees
Will is a software developer who designs and builds enterprise-scale security products for distributed systems. He's worked on high performance C++ cryptography tools, distributed firewalls, and key management systems for hundreds of thousands of servers. His free time is occupied with YouTube, video games, and cooking. He lives in Seattle with his wife and dog.

Room 8

Wednesday, 17th May, 09:00 - 12:20

  • Tutorials


bokeh: Data Visualization in Python
bokeh: Data Visualization in Python (13:20 - 16:40)

About the session

Bokeh is a powerful data visualization library that creates fully interactive plots and integrates well with the data analysis tools you already know and love: pandas, matplotlib, seaborn, ggplot. Bokeh can produce stand-alone browser-based plots and much more sophisticated server-hosted visualizations. Learn to use bokeh to create everything from basic graphs to advanced interactive plots, dashboards, and data applications.

  • Incorporate bokeh within your Jupyter/IPython notebooks
  • Partner bokeh with other libraries such as matplotlib, seaborn, pandas, and ggplot
  • Learn about bokeh server, to serve up even more impressive realtime visualizations
  • Explore configurations and settings
  • Recognize and overcome common problems

Speakers

Chalmer Lowe
Chalmer Lowe founded PyHawaii, the largest programming meetup in Hawaii, and served as the Chairman of the Python Education Summit (2015, 2016). He co-founded Dark Art of Coding, a programming school to enable him and his son to fulfill their passion of teaching Python and Data Analysis to geeks at home, at work and in the community. He performs data analysis for his employer, Booz Allen Hamilton, and teaches Python to his colleagues, clients and anyone who will stand still long enough.

Room 8

Wednesday, 17th May, 13:20 - 16:40

  • Tutorials


Room 9
Network Analysis Made Simple
Network Analysis Made Simple (09:00 - 12:20)

About the session

Have you ever wondered about how those data scientists at Facebook and LinkedIn make friend recommendations? Or how epidemiologists track down patient zero in an outbreak? If so, then this tutorial is for you. In this tutorial, we will use a variety of datasets to help you understand the fundamentals of network thinking, with a particular focus on constructing, summarizing, and visualizing complex networks.

Speakers

Eric J. Ma
I am a 5th year PhD Candidate in the Runstadler Lab in the Biological Engineering department at MIT. I study the influenza virus, which is like a self-replicating deck of 8 poker cards. I am using Python to solve infectious disease data science problems.

Room 9

Wednesday, 17th May, 09:00 - 12:20

  • Tutorials


Complexity Science
Complexity Science (13:20 - 16:40)

About the session

Complexity Science is an approach to modeling systems using tools from discrete mathematics and computer science, including networks, cellular automata, and agent-based models. It has applications in many areas of natural and social science. Python is a particularly good language for exploring and implementing models of complex systems. In this tutorial, we present material from the draft second edition of Think Complexity, and from a class we teach at Olin College. We will work with random networks using NetworkX, with cellular automata using NumPy, and we will implement simple agent-based models.

Speakers

Allen Downey
Allen Downey is a professor of computer science at Olin College, a new engineering college near Boston with the mission to fix engineering education. He is the author of Think Python, Think Stats, Think Bayes, Think Complexity, and several other books all available under free licenses.

Jason Woodard
Jason Woodard is an associate professor of engineering and entrepreneurship at Olin College. He studied complex systems and computational modeling at the Santa Fe Institute, and uses complexity science to model the evolution of technology and markets.

Room 9

Wednesday, 17th May, 13:20 - 16:40

  • Tutorials


Room B110-111
Red Hat: Deploying Python web applications to OpenShift/Kubernetes
Red Hat: Deploying Python web applications to OpenShift/Kubernetes (11:00 - 12:30)

About the session

This will be a hands-on workshop where you will get to experience for yourself how easy it is to deploy a Python web application to OpenShift. The latest version of OpenShift is implemented on top of Kubernetes for container orchestration and Docker for the container runtime. On top of these tools OpenShift adds its own special magic sauce to even further simplify the deployment of applications. In the workshop you will learn how to deploy a Python web application directly from a Git repository holding the application source code, with the build process being handled by the Source-to-Image (S2I) tool. Next you will deploy a database from a pre-existing Docker-formatted container image and learn how to hook your Python web application up to it. Finally you will configure a Git repository webhook to automate the deployment process so that every time you commit and push up changes your application will be automatically rebuilt and deployed. During the workshop we will be throwing in various other tidbits to help explain what OpenShift is, how it works and how it can help you to host not only your Python web site, but also more complex applications, be they legacy systems, or new microservice architecture applications, in any language. For the workshop, you will be provided access to an online instance of OpenShift Origin with everything you need. The only piece of software you will need to install locally on your own computer will be a single program binary for our command line client for OpenShift.

Speakers

Graham Dumpleton
Graham is the author of mod_wsgi, the Apache module for hosting of Python web applications using the WSGI interface. He also has a keen interest in Docker and Platform as a Service (PaaS) technologies. He is currently a developer advocate for OpenShift at Red Hat.

Room B110-111

Wednesday, 17th May, 11:00 - 12:30

  • Sponsor tutorials


Intel: Bring deep learning to the fingertips of data scientists with Python & BigDL on Apache Spark
Intel: Bring deep learning to the fingertips of data scientists with Python & BigDL on Apache Spark (13:30 - 15:00)

About the session

We have seen the data science and big data communities begin to engage further with artificial intelligence and deep learning technologies, and efforts to bridge the gap between the deep learning community and the data science / big data communities are beginning to emerge. However, developing deep neural nets is an intricate procedure, and scaling that to big data scale is an even more challenging process. Therefore, deep learning tools and frameworks, especially visualization support, that can run smoothly on top of big data platforms are essential for scientists to understand, inspect and manipulate their big models and big data. In this talk, we will share how we bring deep learning to the fingertips of big data users and data scientists, by providing visualizations (through widely used frameworks such as Jupyter Notebooks and/or TensorBoard) as well as Python toolkits (e.g., NumPy, SciPy, scikit-learn, NLTK, Keras, etc.) on top of BigDL, an open source distributed deep learning library for Apache Spark. In addition, we will also share how real-world big data users and data scientists use these tools to build AI-powered big data analytics applications.

Speakers

Jiao Wang
Jiao Wang is a software engineer on the Big Data Technology team at Intel who works in the area of big data analytics. She is engaged in developing and optimizing distributed deep learning frameworks on Apache Spark.

Room B110-111

Wednesday, 17th May, 13:30 - 15:00

  • Sponsor tutorials


Intel: Scalable, distributed deep learning with Python and Pachyderm
Intel: Scalable, distributed deep learning with Python and Pachyderm (15:30 - 17:00)

About the session


Speakers

Daniel Whitenack
Daniel Whitenack (@dwhitena) is a Ph.D. trained data scientist working with Pachyderm (@pachydermIO). Daniel develops innovative, distributed data pipelines which include predictive models, data visualizations, statistical analyses, and more. He has spoken at conferences around the world (Datapalooza, DevFest Siberia, GopherCon, and more), teaches data science/engineering with Ardan Labs (@ardanlabs), maintains the Go kernel for Jupyter, and is actively helping to organize contributions to various open source data science projects.

Room B110-111

Wednesday, 17th May, 15:30 - 17:00

  • Sponsor tutorials


Room B118-119
Twist Bioscience: Using Python in Synthetic DNA Manufacturing
Twist Bioscience: Using Python in Synthetic DNA Manufacturing (11:00 - 12:30)

About the session

A quick introduction to how Twist Bioscience is enabling world-changing innovation, such as DNA computing and turning sugar into vaccines.

Rust Bindings with Python: We will demonstrate how we combine the best of Rust and Python by creating a Rust library with Python bindings.

Hardware Scripting with Embedded IronPython: We’ll demonstrate how we use the flexibility of the IronPython runtime to make our hardware scriptable and enable debugging within the same process. This will require a Windows machine or VM image with Visual Studio installed (free community install).

Polymorphic Marshmallowed JSON with SQLAlchemy: This exercise will guide you through creating polymorphic objects that are transparently serialized in and out of Postgres JSON fields. We will also work on creating graph objects with mutability tracking.

Speakers

Kieran Hervold
Kieran Hervold is a Senior Engineer on the bioinformatics team at Twist Bioscience. He is an expert at building complex pipelines with Python. Lately, he started using Rust through Python to provide massive speed-ups to our DNA design algorithms while retaining the flexibility and interactivity of Python.

Room B118-119

Wednesday, 17th May, 11:00 - 12:30

  • Sponsor tutorials


Clover Health: Transforming and Analyzing Healthcare Data with Python
Clover Health: Transforming and Analyzing Healthcare Data with Python (13:30 - 15:00)

About the session

This workshop will give you an introduction to how we use Python for testing, analysis, and processing at Clover. This includes a walkthrough of our tech stack along with a dive into two use cases. The first use case is from a Data Science perspective and will go over how we test SQL queries in our data pipeline. This will get into an example of statistical modeling in a particular insurance operations context. The second use case is from an Engineering perspective and will show how we transform nested JSON structures into consumable flat table structures. This will also touch on techniques for processing large amounts of data. Clover uses lots of Python tools and libraries, which we're happy to discuss. We rely heavily on Postgres as our primary database solution. However, this talk will highlight SQLAlchemy, Jupyter Notebook, pytest, generators, partial functions, and LRU caching.
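
As a generic illustration of turning nested JSON into flat, table-like rows, the sketch below flattens nested dictionaries into dotted column names and uses a generator so that records are processed one at a time. It is not Clover's implementation; the function names, the `.` separator, and the sample record are all assumptions.

```python
def flatten(record, parent_key="", sep="."):
    """Flatten a nested dict into a single-level dict with dotted keys."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested objects, carrying the key prefix along.
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


def flatten_all(records):
    """Generator: yields one flat row per record without building a list."""
    for record in records:
        yield flatten(record)


# Hypothetical input record, for illustration only.
rows = list(flatten_all([
    {"id": 1, "member": {"name": "A", "plan": {"tier": "gold"}}},
]))
```

Because `flatten_all` is a generator, a large input can be streamed row by row into a database or file rather than held in memory all at once.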

Speakers

Bijan Vakili
Bijan Vakili is a Senior Software Engineer at Clover Health, where he is building applications, improving infrastructure, and mentoring developers. Prior to joining Clover, Bijan worked in currency and derivative trading, gaming, network applications, and disaster recovery. He has worked in multiple roles including software developer, team lead, and project manager. Bijan holds a Bachelor’s degree in Software Engineering & Human Biology from the University of Toronto and an MBA from the University of Toronto - Rotman School of Management.

Room B118-119

Wednesday, 17th May, 13:30 - 15:00

  • Sponsor tutorials


Thursday, 18th May
 
Room 1
Hands-On Intro to Python for New Programmers
Hands-On Intro to Python for New Programmers (09:00 - 12:20)

About the session

Brand new to programming and want to get some hands-on Python experience? Let's learn some Python together! During this tutorial we will work through a number of programming exercises together. We'll be doing a lot of asking questions, taking guesses, trying things out, and seeking out help from others. In this tutorial we'll cover: types of things in Python (strings, numbers, lists); conditionally executing code; repeating code with loops; and getting user input.

Speakers

Trey Hunner
Trey Hunner is a Python & Django mentor specializing in on-site corporate training. Trey holds weekly online Python chats, is a director at the PSF, and is heavily involved with his local Python meetup group in San Diego.

Room 1

Thursday, 18th May, 09:00 - 12:20

  • Tutorials


Python Epiphanies
Python Epiphanies (13:20 - 16:40)

About the session

This tutorial is for developers who've been using Python for a while and would consider themselves at an intermediate level, but are looking for a deeper understanding of the language. It focuses on how Python differs from other languages in subtle but important ways that are often confusing, and it demystifies a number of language features that are sometimes misunderstood.

Speakers

Stuart Williams
Stuart Williams lives in Winnipeg. He has taught Python at University, PyCon, and various companies; and to user groups, high school students, and elementary schoolchildren. His loves include biking in sun or snow, sailing, his spouse and three children, and teaching Python.

Room 1

Thursday, 18th May, 13:20 - 16:40

  • Tutorials


Room 2
Let's build a web framework!
Let's build a web framework! (09:00 - 12:20)

About the session

If you're building a web app, you probably reach for your favorite framework -- Django, Flask, Pyramid, etc. But we rarely stop to think about what these tools are doing under the hood. In this hands-on tutorial, you'll gain a deeper understanding of what frameworks are and how they work by implementing your own framework from scratch. We'll build a complete (if minimal) web framework that handles the WSGI request/response cycle, routing, controllers, templating, and a data layer. Along the way you'll gain a deeper understanding of the decisions web frameworks make, their relative merits, and inner workings.
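As a taste of the WSGI request/response cycle and routing mentioned above, here is a minimal sketch using only the standard library. It is not the tutorial's framework: the `route` decorator, the `ROUTES` table, and the `call` helper are hypothetical names chosen for illustration.

```python
from wsgiref.util import setup_testing_defaults

ROUTES = {}  # maps a URL path to a handler function


def route(path):
    """Decorator that registers a handler for an exact path."""
    def register(handler):
        ROUTES[path] = handler
        return handler
    return register


@route("/")
def index(environ):
    return "200 OK", "Hello from a tiny framework"


@route("/about")
def about(environ):
    return "200 OK", "About page"


def app(environ, start_response):
    """The WSGI callable: look up a handler, then send status, headers, body."""
    handler = ROUTES.get(environ.get("PATH_INFO", "/"))
    if handler is None:
        status, body = "404 Not Found", "Not found"
    else:
        status, body = handler(environ)
    payload = body.encode("utf-8")
    start_response(status, [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(payload))),
    ])
    return [payload]


def call(path):
    """Invoke the app in-process (no server) and return (status, body)."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the required WSGI keys
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(app(environ, start_response))
    return captured["status"], body.decode("utf-8")
```

To serve it for real, the same `app` could be passed to `wsgiref.simple_server.make_server("", 8000, app).serve_forever()`; templating and a data layer would be layered on top of the handlers.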

Speakers

Jacob Kaplan-Moss
Jacob is a core contributor to Django, co-owner of Revolution Systems, and Engineering Security Lead at 18F.

Room 2

Thursday, 18th May, 09:00 - 12:20

  • Tutorials


Django Admin: Basics and Beyond
Django Admin: Basics and Beyond (13:20 - 16:40)

About the session

Django's admin is a great tool but it isn't always the easiest or friendliest to set up and customize. The ModelAdmin class has a lot of attributes and methods to understand and come to grips with. On top of these attributes, the admin's inlines, custom actions, custom media, and more mean that, really, you can do anything you need with the admin... if you can figure out how. The docs are good but leave a lot to experimentation, and the code is notoriously dense. In this tutorial, you'll learn the basics of setting up the admin so you can get your job done. Then we'll dive deeper and see how advanced features like autocomplete, Markdown editors, image editors, and others can be added to make the admin really shine.

Speakers

Kenneth Love
Besides teaching Python at Treehouse, Kenneth Love is a husband & father. He created a few Python libraries (django-braces being the most popular), worked at most levels of the web stack, & was the Creative Director of a newspaper. He likes board games, activism, & dry humor.

Room 2

Thursday, 18th May, 13:20 - 16:40

  • Tutorials


Room 3
Faster Python Programs - Measure, don't Guess
Faster Python Programs - Measure, don't Guess (09:00 - 12:20)

About the session

Optimization can often help to make Python programs faster or use less memory. Developing a strategy, establishing solid measuring and visualization techniques, and knowing about algorithmic basics and data structures are the foundation for a successful optimization. The tutorial will cover these topics. Examples will give you hands-on experience of how to approach optimization efficiently. Python is a great language. But it can be slow compared to other languages for certain types of tasks. If applied appropriately, optimization may reduce program runtime or memory consumption considerably. But this often comes at a price. Optimization can be time consuming and the optimized program may be more complicated. This, in turn, means more maintenance effort. How do you find out if it is worthwhile to optimize your program? Where should you start? This tutorial will help you to answer these questions. You will learn how to find an optimization strategy based on quantitative and objective criteria. You will experience that one's gut feeling about what to optimize is often wrong. The solution to this problem is: "Measure, Measure, and Measure!". You will learn how to measure program run times as well as profile CPU and memory. There are great tools available. You will learn how to use some of them. Measuring is not easy because, by definition, as soon as you start to measure, you influence your system. Keeping this impact as small as possible is important. Therefore, we will cover different measuring techniques. Furthermore, we will look at algorithmic improvements. You will see that the right data structure for the job can make a big difference. Finally, you will learn about different caching techniques.
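The "measure, don't guess" idea and the point about choosing the right data structure can be illustrated with a small sketch built on the standard library's `timeit` module. This is not material from the tutorial; the two functions and the input sizes are assumptions picked only to make the comparison visible.

```python
import timeit


def count_hits_list(items, targets):
    """Membership test against a list: each lookup scans the list."""
    return sum(1 for t in targets if t in items)


def count_hits_set(items, targets):
    """Same result, but lookups go through a set built once up front."""
    lookup = set(items)
    return sum(1 for t in targets if t in lookup)


items = list(range(1000))
targets = list(range(0, 2000, 2))  # half of these are present in items

# Check correctness first: an optimization must not change the answer.
assert count_hits_list(items, targets) == count_hits_set(items, targets)

# Take the minimum of several repeats to reduce noise from the system.
list_time = min(timeit.repeat(lambda: count_hits_list(items, targets), number=3, repeat=3))
set_time = min(timeit.repeat(lambda: count_hits_set(items, targets), number=3, repeat=3))

print(f"list: {list_time:.4f}s  set: {set_time:.4f}s")
```

The actual numbers depend on the machine and Python version, which is exactly why they should be measured rather than assumed.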

Speakers

Mike Muller
Mike Muller has been using Python as his primary programming language since 1999. He is a Python trainer and the CEO at Python Academy (www.python-academy.com). He teaches a wide variety of Python topics including Introduction to Python, Python for Scientists and Engineers, Advanced Python, as well as Optimization and Extensions of Python Programs. He is the chairman of the Python Software Verband e.V., a PSF member, a PSF community service award holder, and a user group co-founder. He chaired EuroSciPy 2008 and 2009, PyCon DE 2011 and 2012, as well as EuroPython 2014 in Berlin, Germany.

Room 3

Thursday, 18th May, 09:00 - 12:20

  • Tutorials


Intermediate Python Bootcamp
Intermediate Python Bootcamp (13:20 - 16:40)

About the session

Are you new to Python and want to learn to step it up to the next level? Have you heard about closures, decorators, context managers, generators, list comprehensions, or generator expressions? What are these and why do advanced Pythonistas keep mentioning them? Don't be intimidated; learn to take advantage of these to make your own code more idiomatic. This hands-on tutorial will cover these intermediate subjects in detail. We will modify existing Python code to take advantage of them. We will start with a basic file, and then introduce these features into it using the REPL, command line, and tests. The audience will get to follow along using their own computer and editor of choice (or can use IDLE, as the instructor does). We will teach the 'code smells' to look for. You will know when you should apply these new techniques to your code.

Speakers

Matt Harrison
Matt is a Python user, presenter, author, and user group organizer. He helps run the Utah Python user group. He authored Treading on Python Vol 1 & 2. His work experience covers search, business intelligence, & data science.

Room 3

Thursday, 18th May, 13:20 - 16:40

  • Tutorials


Room 4
Deploy and scale containers with Docker native, open source orchestration
Deploy and scale containers with Docker native, open source orchestration (09:00 - 12:20)

About the session

Deploy your own cluster! Use it to 'build, ship, and run' containerized applications! Learn how to implement logging, metrics, stateful services, and more! Learn the True Way of DevOps! Alright, we can't promise anything about the True Way of DevOps, but everything else will definitely be in this tutorial. We will run a demo app featuring Python components and see some best practices to 'Dockerize' Python code and Flask in particular; but the tutorial also includes other languages and frameworks. Come with your laptop! You don't need to install anything before the workshop, as long as you have a web browser and an SSH client. Each student will have their own private cluster during the tutorial, to get immediately applicable first-hand experience.

Speakers

Jerome Petazzoni
Jerome is a senior engineer at Docker, where he helps others to containerize all the things. In another life he built and operated Xen clouds when EC2 was just the name of a plane, developed a GIS to deploy fiber interconnects through the French subway, managed commando deployments of large-scale video streaming systems in bandwidth-constrained environments such as conference centers, operated and scaled the dotCloud PAAS, and various other feats of technical wizardry. When annoyed, he threatens to replace things with a very small shell script.

AJ Bowen
As a Solutions Engineer at Convox, AJ Bowen is on a mission to containerize all the things and help others to do the same. Her other experience includes developer advocacy at Gandi.net as well as heavy involvement in the Occupy movement in Kansas City, where she discovered the delightful and complicated world of horizontal organization in groups of passionate people with wildly diverse political perspectives, and humanitarian work in Haiti. AJ doesn’t like being told what to do—or telling others what to do—and is on a constant mission to eliminate unnecessary hierarchy.

Room 4

Thursday, 18th May, 09:00 - 12:20

  • Tutorials


Effectively running python applications in Kubernetes/OpenShift
Effectively running python applications in Kubernetes/OpenShift (13:20 - 16:40)

About the session

Google, Red Hat, Intel, Huawei, Mirantis, Deis and many, many others are investing a lot of time and effort into improving Kubernetes. I bet you have encountered that name at least once in the past twelve months, either on Hacker News, Reddit, or somewhere else. Do you want to learn more about the best container orchestration in the universe, but were afraid of the setup complexity? Do you want to see how easy it is to run any application using containers? Do you want to experience the joy of scaling an application with a single click? This, and a lot more, will be discussed in detail. In this tutorial, every attendee will be provided with an environment and the step-by-step instructions necessary to set up the environment, then build and deploy a microservices-based sample application. Alternatively, a sample application of your choosing can be used throughout the entire tutorial. All of that will be performed on OpenShift, which is a Red Hat distribution of Kubernetes with some add-ons that will be described in detail at the beginning of the tutorial. To whet your appetite even more, here are some of the topics we are going to cover: git integration, image registry integration, scaling applications, container security, batch tasks and much more. After the session, every person will be able to play around with the accompanying code repository that was used in the tutorial, which includes detailed instructions on how to run it on your own from scratch.

Speakers

Maciej Szulik
Maciej is a passionate developer with over 10 years of experience in many languages. Currently, he is hacking on bugs.python.org and CPython's IMAP library by night, while by day he works on OpenShift and Kubernetes for Red Hat. In his spare time he organizes PyCon PL, helps review talks for PyCon, and talks at various events and meetups around Europe.

Room 4

Thursday, 18th May, 13:20 - 16:40

  • Tutorials


Room 5
Creating And Consuming Modern Web Services with Twisted
Creating And Consuming Modern Web Services with Twisted (09:00 - 12:20)

About the session

This tutorial will show students how to write applications and services which efficiently publish and consume services and APIs. To do so, we will combine 4 Python-based technologies: Jupyter is a real-time development environment. Twisted is a powerful platform for network programming that supports many protocols, including HTTP. Klein is a Twisted-based web application framework. Treq is a Requests-style HTTP client based on Twisted. By combining all of these we will guide students through interactively prototyping a production quality web application that publishes both service APIs and web resources such as HTML, and that can efficiently consume many back-end services such as 3rd-party APIs.

Speakers

Moshe Zadka
Moshe has been involved with Python since 1998, when he helped obsolete math-related modules. Since then, he has made some contributions to core Python that were not just deleting code, and has been a contributor on the Twisted project since its inception.

Michele Pratusevich
I graduated from MIT with my BS in computer science in 2013 followed by an MEng in 2015. I currently work at Amazon on large-scale computer vision problems and systems, with Python and C++ being the languages of choice. I like hiking, climbing, and fun programming problems.

Room 5

Thursday, 18th May, 09:00 - 12:20

  • Tutorials


Introduction to Digital Signal Processing
Introduction to Digital Signal Processing (13:20 - 16:40)

About the session

Spectral analysis is an important and useful technique in many areas of science and engineering, and the Fast Fourier Transform is one of the most important algorithms, but the fundamental ideas of signal processing are not as widely known as they should be. Fortunately, Python provides an accessible and enjoyable way to get started. In this tutorial, I present material from my book, Think DSP, and from a class I teach at Olin College. We will work with audio signals, including music and other recorded sounds, and visualize their spectrums and spectrograms. We will synthesize simple sounds and learn about harmonic structure, chirps, filtering, and convolution.

Speakers

Allen Downey
Allen Downey is a professor of computer science at Olin College, a new engineering college near Boston with the mission to fix engineering education. He is the author of Think Python, Think Stats, Think Bayes, Think Complexity, and several other books all available under free licenses.

Room 5

Thursday, 18th May, 13:20 - 16:40

  • Tutorials


Room 6
Intro to Bayesian Machine Learning with PyMC3 and Edward
Intro to Bayesian Machine Learning with PyMC3 and Edward (09:00 - 12:20)

About the session

There has been a surge of interest in probabilistic programming and Bayesian statistics. These techniques are tremendously useful, because they help us to understand, to explain, and to predict data through building a model that accounts for the data and is capable of synthesizing it. This is called the generative approach to statistical pattern recognition. Estimating the parameters of Bayesian models has always been hard, impossibly hard actually in many cases for anyone but experts. However, recent advances in probabilistic programming have endowed us with tools to estimate models with a lot of parameters and for a lot of data. In this tutorial, we will discuss two of these tools, PyMC3 and Edward. These are black box tools, Swiss Army knives for Bayesian modeling that do not require knowledge of calculus or numerical integration. This puts the power of Bayesian statistics into the hands of everyone, not only experts of the field. And it's great that these are implemented in Python with its rich, beginner-friendly ecosystem. It means we can immediately start playing with them. We have planned three awesome parts, spread over three awesome hours: First hour: Introduction to Bayesian machine learning. Second hour: Baby steps in PyMC3 and Edward. Third hour: Solve a real-world problem with PyMC3 or Edward (model, fit, criticize).

Speakers

Torsten Scholak
In a former life, Torsten was a theoretical physicist and spent his life hacking together simulations of complex quantum systems, mostly in Python and C. More recently, he has been dedicating his time to data science and, in particular, Bayesian probabilistic modeling. Torsten lives in Toronto with his wife, and enjoys contributing to the local Meetup scene.

Diego Maniloff
Diego Maniloff is VP of Engineering at Unata, where he uses Python every day to build recommendation systems. Diego was a research fellow at MIT, where he collaborated with GE to characterize the predictability of human diseases from millions of medical records, and with Audi to design an onboard recommender. Diego was a Fulbright scholar at UIC's AI Lab, where he focused on algorithms for offline/online planning under uncertainty. He received his BS+MEng degree in Telecommunications Engineering from UBP in beautiful Córdoba, Argentina. Back home, he learned his most valuable skill: how to prepare a great asado.

Room 6

Thursday, 18th May, 09:00 - 12:20

  • Tutorials


IoT Lab with Micropython and Friends
IoT Lab with Micropython and Friends (13:20 - 16:40)

About the session

Come learn about the Internet of Things and Micropython in this hands-on hardware tutorial, no soldering or hardware experience required! We will be building a wifi-enabled temperature sensor as a vehicle for learning IoT concepts including data capture, building security into data transmission, and messaging between IoT clients and servers. Attendees will have an opportunity to take their sensors out into the conference venue to take measurements, reconvening to discuss analysis and visualization of IoT data. All the hardware needed will be provided, and attendees will be able to program the devices via a locally-hosted web interface. This tutorial will be a great introduction for folks interested in Internet of Things, Micropython, or hardware hacking. You do not need prior experience in any of these topics to attend. We will be using the ESP8266 microcontroller and the MQTT protocol for messaging. Attendees should download the mosquitto MQTT broker in addition to collateral that will be sent out to attendees ahead of the tutorial. We will be using the WebREPL interface for programming the ESP8266. If you are wondering what the heck all of this means, do not despair! These topics will be covered in the tutorial.

Speakers

Sev Leonard
Sev likes crisp fall days, ice cream, and aggregating data. He is a Python developer & sciencer of data living in Portland, OR who enjoys writing and teaching technical topics.

Room 6

Thursday, 18th May, 13:20 - 16:40

  • Tutorials


Room 7
IPython and Jupyter in Depth: High productivity, interactive Python
IPython and Jupyter in Depth: High productivity, interactive Python (09:00 - 12:20)

About the session

IPython and Jupyter provide tools for interactive computing that are widely used in scientific computing, education, and data science, but can benefit any Python developer. You will learn how to use IPython in different ways, as: an interactive shell, a graphical console, a network-aware VM (Virtual machine) in GUIs, a web-based notebook combining code, graphics and rich HTML. We will demonstrate how to deploy a custom environment with Docker that not only contains multiple Python kernels but also a couple of other languages.

Speakers

Matthias Bussonnier
Matthias is a postdoc at the UC Berkeley Institute for Data Science, and has been a core developer of IPython and Jupyter for a couple of years. With a background in physics, Matthias spends most of his time developing tools for the scientific community and for education, as well as promoting Python 3.

Mike Bright
Solution Architect at Hewlett-Packard Enterprise working in the EMEA OpenNFV lab (Cloud Computing for Telecom), based in Grenoble, France. Passionate about Containers, Orchestration and Programming Languages. Runs the Grenoble Python User Group. Likes to travel and dance (Argentinian Tango, Salsa, Rock).

Min Ragan-Kelley
Min has been a core developer of IPython (and now Jupyter) since 2006. He holds a PhD from UC Berkeley in Applied Science & Technology, with an emphasis in computational plasma physics. He now works as a postdoctoral researcher at Simula Research Laboratory in Oslo, Norway, on the Jupyter and OpenDreamKit projects, focusing on JupyterHub and the Jupyter protocols for interactive computing.

Room 7

Thursday, 18th May, 09:00 - 12:20

  • Tutorials


Parallel Data Analysis
Parallel Data Analysis (13:20 - 16:40)

About the session

An overview of parallel computing techniques available from Python and hands-on experience with a variety of frameworks. This course has two primary goals: 1. Teach students how to reason about parallel computing 2. Provide hands-on experience with a variety of different parallel computing frameworks. Students will walk away with both a high-level understanding of parallel problems and how to select and use an appropriate parallel computing framework for their problem. They will get hands-on experience using tools both on their personal laptop, and on a cluster environment that will be provided for them at the tutorial. For the first half we cover programming patterns for parallelism found across many tools, notably map, futures, and big-data collections. We investigate these common APIs by diving into a sequence of examples that require increasingly complex tools. We learn the benefits and costs of each API and the sorts of problems where each is appropriate. For the second half, we focus on the performance aspects of frameworks and give intuition on how to pick the right tool for the job. This includes common challenges in parallel analysis, such as communication costs, debugging parallel code, as well as deployment and setup strategies.

Speakers

Ben Zaitlen
Ben is a data scientist and developer at Continuum Analytics. He has several years of experience with Python and is passionate about any and all forms of data. Part of his duties at Continuum includes exploring a vast array of data (social networks, climate, astronomy, biology, finance, etc.).

Matthew Rocklin
Matthew is a full time open source developer at Continuum Analytics where he builds Python tools for parallel data analysis.

Min Ragan-Kelley
Min has been a core developer of IPython (and now Jupyter) since 2006. He holds a PhD from UC Berkeley in Applied Science & Technology, with an emphasis in computational plasma physics. He now works as a postdoctoral researcher at Simula Research Laboratory in Oslo, Norway, on the Jupyter and OpenDreamKit projects, focusing on JupyterHub and the Jupyter protocols for interactive computing.

Room 7

Thursday, 18th May, 13:20 - 16:40

  • Tutorials


Room 8
Mastering scipy.spatial
Mastering scipy.spatial (09:00 - 12:20)

About the session

The heavily-used scipy library is so large that each of the major modules could fill its own tutorial syllabus. It is also production-quality software with a 1.0 release imminent. In this tutorial, my focus is to cover the scipy.spatial component of the library in great detail, from the perspective of a heavy user and active developer of the computational geometry components of scipy. From distance matrices to Voronoi diagrams and Hausdorff distances, we will explore the corners of scipy.spatial code--both long-established features and even proposed features that haven't yet made it into a stable release.

Speakers

Tyler Reddy
I have a PhD in biochemistry and molecular biology and am a post-doctoral fellow in computational virology. I build computational models of viruses (like influenza A and dengue) to better understand their biophysical properties. This requires extensive use of the Python programming language to parse shapes, volumes and areas. This is accomplished by leveraging numpy and scipy to perform computational geometry calculations. I am working with the scientific Python community to improve our computational geometry capabilities--my most recent presentations were at PyData London 2015 and PyCon 2016, both of which focus on computational geometry in Python.

Room 8

Thursday, 18th May, 09:00 - 12:20

  • Tutorials


Best Testing Practices for Data Science
Best Testing Practices for Data Science (13:20 - 16:40)

About the session

So you're a data scientist wrangling with data that's continually avalanching in, and there are always errors cropping up! NaNs, strings where there are supposed to be integers, and more. Moreover, your team is writing code that is getting reused, but that code is failing in mysterious places. How do you solve this? Testing is the answer! In this tutorial, you will gain practical hands-on experience writing tests in a data science setting so that you can continually ensure the integrity of your code and data. You will learn how to use py.test, coverage.py, and hypothesis to write better tests for your code.

Speakers

Eric J. Ma
I am a 5th year PhD Candidate in the Runstadler Lab in the Biological Engineering department at MIT. I study the influenza virus, which is like a self-replicating deck of 8 poker cards. I am using Python to solve infectious disease data science problems.

Room 8

Thursday, 18th May, 13:20 - 16:40

  • Tutorials


Room 9
Build a data pipeline with Luigi
Build a data pipeline with Luigi (09:00 - 12:20)

About the session

Luigi is a Python library for building pipelines of batch processes. It 'handles dependency resolution, workflow management, visualization, handling failures, command line integration, and much more'. In this tutorial, we will use Luigi to build a data pipeline that runs a series of interdependent jobs. We will also discuss some real-world use cases for Luigi, and show how it can make running a data pipeline much more robust and reliable.

Speakers

Aaron Knight
Aaron Knight is a full-stack engineer at Voxy, a company that creates dynamic, personalized educational courses for English learners.

Room 9

Thursday, 18th May, 09:00 - 12:20

  • Tutorials


Applied Modern Cryptography in Python
Applied Modern Cryptography in Python (13:20 - 16:40)

About the session

Today we use cryptography almost everywhere, from surfing the web over https to working remotely over ssh. Although most developers don't need to implement cryptographic primitives, knowledge and understanding of these building blocks allows them to better deploy them in their applications. In modern crypto we have all the building blocks to develop secure applications. However, we see instances of insecure code everywhere. Most of these vulnerabilities are not because of theoretical shortcomings, but due to bad implementation or flawed protocol design. Cryptography is a delicate art where nuances matter, and failure to comprehend the subtleties of these building blocks leads to critical vulnerabilities. To add insult to injury, most of the resources available are either outdated or wrong, and inarguably, using bad crypto is more dangerous than not using it. In this tutorial we look at the basic building blocks of modern cryptography. We will cover encryption techniques, hashing mechanisms, and key derivation algorithms. Furthermore, we review two of the most widely used protocol suites, SSL and PGP. We conclude by implementing a simplified version of Pretty Good Privacy (PGP), which is used for encryption of texts, e-mails, files, directories, and whole disk partitions.

Speakers

Amirali Sanatinia
Amirali Sanatinia is a Computer Science PhD candidate at Northeastern and holds a Bachelor's degree in CS from St Andrews University. His research focuses on cyber security and privacy, and has been covered by venues such as MIT Technology Review, ACM Tech News, Ars Technica, Threatpost, etc. He is an RSAC security scholar. He has talked at different security conferences such as DEF CON, Crypto Village, Virus Bulletin, and BSides Boston.

Room 9

Thursday, 18th May, 13:20 - 16:40

  • Tutorials


Room A105-106
Python Education Summit
Python Education Summit (09:00 - 17:00)

About the session

In 2017, PyCon will be holding its fifth annual Python Education Summit. The Summit is a gathering of teachers and educators focused on bringing coding literacy, through Python, to as broad a group of audiences as possible. We invite educators from all venues to consider joining the discussion, share insights, learn new techniques and tools and generally share their passion for education. We are looking for educators from many venues: authors; schools, colleges, universities; community-based workshops; online programs; and government. Not only will we have a wide array of full-blown talks, we will also have a round of lightning talks!

Speakers

Al Sweigart
Al Sweigart is a software developer and the author of Automate the Boring Stuff with Python, Invent Your Own Computer Games with Python, Making Games with Python & Pygame, and Hacking Secret Ciphers with Python. These books are freely available under a Creative Commons license at http://inventwithpython.com. Al enjoys haunting coffee shops, writing educational materials, cat whispering, and making useful software. He lives in San Francisco.

Luciano Ramalho
Luciano Ramalho is a Technical Principal at ThoughtWorks and the author of the bestselling book Fluent Python (O'Reilly, 2015). Since 1998 he has deployed Python on some of the largest Web sites in Brazil. His speaking record includes PyCon US, OSCON, OSCON-EU, PythonBrasil, RuPy and an ACM Webinar that was attended by more than 900 people. Ramalho is a fellow of the PSF and co-founder of the Brazilian Python Association and of Garoa Hacker Clube, the first hackerspace in Brazil.

Room A105-106

Thursday, 18th May, 09:00 - 17:00

  • Summit


Python Education Summit
Python Education Summit (09:00 - 17:00)

About the session

Join us for a 5k run along the beautiful Willamette River in Portland, OR.

Room A105-106

Thursday, 18th May, 09:00 - 17:00

  • Summit


Room B110-111
Intel: Accelerating Python across the range of applications: the right tools for the job
Intel: Accelerating Python across the range of applications: the right tools for the job (09:00 - 10:30)

About the session

Python's popularity has led to its use in many areas--from web frameworks all the way to machine learning and scientific computing. However, getting the best performance from Python requires an intimate knowledge of the right tools and techniques that are available today. In this tutorial, participants will learn how to measure, tune and accelerate Python workflows across various domains. This tutorial will cover the following topics:
- Performance speedups for scientific computing using Intel® Distribution for Python, multithreading with Intel® Threading Building Blocks library, Numba, and Intel® VTune Amplifier
- Data analytics and machine learning acceleration with pyDAAL
- Web framework, scripting, and infrastructure acceleration using the PyPy JIT

Speakers

David Liu
David is a Technical Consultant Engineer at Intel Corporation in Austin, TX, where he represents Intel's Python products and projects. He is focused on solving customer problems in Python while simultaneously developing and shaping Intel's software products to match customer needs. In the past, he worked as a software engineer utilizing Python in machine learning, network infrastructure, and web work. David holds an MS in Software Engineering from the University of Texas at Austin.

Room B110-111

Thursday, 18th May, 09:00 - 10:30

  • Sponsor tutorials


Datadog: Distributed Tracing for Python
Datadog: Distributed Tracing for Python (11:00 - 12:30)

About the session

Tracing is a specialized form of logging that is designed to work effectively in large, distributed environments. When done right, tracing follows the path of a request across process and service boundaries. This provides a big step-up in application observability, and can help inform a developer why certain requests are slow, or why they might have behaved unexpectedly. This tutorial will familiarize users with the benefits of tracing, and describe a general toolkit for emitting traces from Python applications in a minimally intrusive way. We will walk through a simple example app, which receives an HTTP request, and gradually instrument it to be observable via traces. We will discuss language constructs that can generate traces - namely decorators, monkey-patching and context managers - and give users hints on how they might add tracing to their own applications and libraries. In the process users will become familiar with the existing standards for modelling traces, and some of the challenges involved in adhering to this model in a distributed, asynchronous environment.
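As a rough illustration of two of the constructs the abstract names (a context manager and a decorator), here is a minimal sketch. It is not Datadog's library or API: `span`, `traced`, `FINISHED_SPANS`, and `handle_request` are hypothetical names, and spans are kept in an in-memory list rather than sent to any backend.

```python
import functools
import time
from contextlib import contextmanager

# Hypothetical in-memory sink; a real tracer would ship spans to a backend.
FINISHED_SPANS = []


@contextmanager
def span(name):
    """Time the enclosed block and record it, even if the block raises."""
    start = time.perf_counter()
    try:
        yield
    finally:
        FINISHED_SPANS.append(
            {"name": name, "duration": time.perf_counter() - start}
        )


def traced(func):
    """Decorator form: wrap every call to func in a span named after it."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        with span(func.__name__):
            return func(*args, **kwargs)
    return wrapper


@traced
def handle_request(path):
    # A nested span for one step inside the request handler.
    with span("render"):
        return "200 OK for " + path


response = handle_request("/home")
print(response)
# Inner spans finish first, so "render" is recorded before "handle_request".
print([s["name"] for s in FINISHED_SPANS])
```

The decorator keeps instrumentation out of the function body, while the context manager marks a region inside a function; both reuse the same recording logic.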

Speakers

Aaditya Talwai
Aaditya Talwai works on large-scale monitoring systems at Datadog. He's enthusiastic about server and application observability, and curious about tools that can give a fresh view into dynamic infrastructure. Over the past year he's been working on open-source libraries and agents to absorb application transaction traces into Datadog's backend. He has also given talks on monitoring and observability in the context of the OpenStack cloud framework.

Room B110-111

Thursday, 18th May, 11:00 - 12:30

  • Sponsor tutorials


Elasticsearch and Python: Doing more with open source tools
Elasticsearch and Python: Doing more with open source tools (15:30 - 17:00)

About the session

Elasticsearch, a distributed, RESTful search and analytics engine, has a wide variety of capabilities that can be used from Python. In this workshop we will explore several different use cases and showcase how the associated Python libraries can be used to help you. It is intended for intermediate users who have basic familiarity with Elasticsearch and want to further their understanding. Some of the topics that will be covered are:
* bulk loading data into Elasticsearch
* efficiently building queries and aggregations
* using Elasticsearch for persistence in your application
* syncing data in Elasticsearch and other data stores
Attendees will leave the workshop with the skills to integrate Elasticsearch into their apps and additional tools for analyzing data with Python.

Speakers

Honza Kral
Honza is a Python programmer and Django core developer – since he is scared of the bright and shiny world of browsers, designers, and users he prefers to stay buried deep in the infrastructure code and just provides others with tools to do the actual site-building. Since 2008 Honza has been building content web sites for fun and profit. During this time he discovered Elasticsearch, which led to him joining the company behind it in 2013 to work on the Python drivers.

Room B110-111

Thursday, 18th May, 15:30 - 17:00

  • Sponsor tutorials


Google: How I Learned to Stop Worrying and Love Python at Google | Tech Talk Series
Google: How I Learned to Stop Worrying and Love Python at Google | Tech Talk Series (15:30 - 17:00)

About the session


Speakers

Jasmine Hsu
Jasmine Hsu is a software engineer with the Google Brain team, focusing on applying deep learning research to robotics. Her recent work was in imitation learning, and she is now working on robotic grasping and manipulation. Before Google, she worked for several years in the defense industry on natural language related projects. Feel free to ping her at @hellojas or find her camping out at random coffee shops.

Room B110-111

Thursday, 18th May, 15:30 - 17:00

  • Sponsor tutorials


Room B118-119
Anaconda: Data Science Apps with Anaconda
Anaconda: Data Science Apps with Anaconda (11:00 - 12:30)

About the session

Anaconda provides a rich foundation of Python and R packages for data science. This tutorial will demonstrate how Anaconda can be used to turn simple models, scripts, or Jupyter notebooks into deployable applications. Participants should have Anaconda installed and have basic Python programming experience. We'll make use of machine learning and AI libraries such as Pandas, Scikit-learn, Tensorflow, and Keras. The tutorial will also demonstrate the app deployment capabilities of Anaconda Cloud.

Speakers

Ian Stokes-Rees
Ian is a computational scientist and engineer at Continuum Analytics. He loves Python, and finding great ways to use it to solve big hairy problems in scientific computing, data analysis, and visualization. Ian helped develop a Python-based computational infrastructure for the CERN LHCb experiment during his PhD at Oxford, and followed that with work on distributed MC option pricing algorithms while a postdoctoral researcher at INRIA (France). Prior to joining Continuum, Ian spent 5 years at Harvard, first developing a science gateway for computational biology (in Python, of course), and then as lecturer in the School of Engineering.

Room B118-119

Thursday, 18th May, 11:00 - 12:30

  • Sponsor tutorials


Anaconda: Accelerating your Python Data Science code with Dask and Numba
Anaconda: Accelerating your Python Data Science code with Dask and Numba (13:30 - 15:00)

About the session

Anyone doing numerical computing with Python will have run into performance barriers. Using Anaconda is a great start to get a suite of extension packages where the underlying data structures and algorithms are written in C or Fortran. We'll briefly review the state of numerical computing in Python, look at some examples to help you remember why you should use NumPy-based packages whenever possible, and focus on two options for acceleration: faster serial computing or parallelization. Continuum Analytics has developed two popular open source packages to address these issues: Numba, which provides an LLVM-based JIT that can be easily accessed just through a decorator; and Dask, which provides a distributed computing framework and some high quality data structures that are similar to a Pandas DataFrame or a NumPy ndarray. Participants should have the latest release of Anaconda installed and have some familiarity with Python in order to follow along interactively with the tutorial, where we'll learn how to efficiently leverage Dask and Numba.

Speakers

Ian Stokes-Rees
Ian is a computational scientist and engineer at Continuum Analytics. He loves Python, and finding great ways to use it to solve big hairy problems in scientific computing, data analysis, and visualization. Ian helped develop a Python-based computational infrastructure for the CERN LHCb experiment during his PhD at Oxford, and followed that with work on distributed MC option pricing algorithms while a postdoctoral researcher at INRIA (France). Prior to joining Continuum, Ian spent 5 years at Harvard, first developing a science gateway for computational biology (in Python, of course), and then as lecturer in the School of Engineering.

Room B118-119

Thursday, 18th May, 13:30 - 15:00

  • Sponsor tutorials


Metis: Natural Language Processing and Text Mining in Python
Metis: Natural Language Processing and Text Mining in Python (15:30 - 17:00)

About the session

This session will cover NLP and text mining using Python and offer several examples of real world applications. Participants will be introduced to various text processing techniques and learn about text classification, clustering, and topic modeling. By the end of the workshop, participants will be able to use Python to explore and build their own models on text data.

Speakers

Michael Galvin
Michael Galvin is the Executive Director of Data Science at Metis. He came to Metis from General Electric where he worked to establish their data science strategy and capabilities for field services and to build solutions supporting Global operations, risk, engineering, sales, and marketing. Prior to GE, Michael spent several years as a data scientist working on problems in credit modeling at Kabbage and corporate travel and procurement at TRX. Michael holds a Bachelor's degree in Mathematics and a Master's degree in Computational Science and Engineering from the Georgia Institute of Technology where he also spent 3 years working on machine learning research problems related to computational biology and bioinformatics. Additionally, Michael spent 12 years in the United States Marine Corps where he held various leadership roles within aviation, logistics, and training units.

Room B118-119

Thursday, 18th May, 15:30 - 17:00

  • Sponsor tutorials


Friday, 19th May
 
Oregon Ballroom 201-202
Requests Under The Hood
Requests Under The Hood (10:50 - 11:20)

About the session

Requests is widely acknowledged as a library that saves users an enormous amount of time, effort, and pain through its intuitive and clear API. For this reason, most people who have never looked at the code assume that its code is as intuitive, well-structured, and clear as the API. Of course, the truth is more complex than that. Real software that deals with real problems is rarely ideal: there are edge cases, terrible hacks, and awkward workarounds for problems. Often in the software industry we pretend that these imperfections in our software don’t exist, or we try to hide them. These imperfections frequently cause people to reinvent wheels in order to simplify the code, which has benefits for understandability but frequently has downsides for resilience. When people talk about “battle-tested” code, they mean code that has been dirtied up over time from its original Platonic ideal implementation to something that is just as complex and warty as real life. In this talk, one of the Requests and urllib3 core maintainers lays bare all of the worst and hackiest corners of the codebases of these two libraries. The goal is to help expose all of the invisible work done in mature codebases to tolerate edge cases and misbehaviour, as well as to try to remind us all that the perfect is the enemy of the good.

Speakers

Cory Benfield
Cory is an open source Python developer heavily involved in the Python HTTP community. He's a Requests core contributor, a urllib3 core contributor, and the lead maintainer of the Hyper Project, a collection of HTTP and HTTP/2 tools for Python. For his sins, he also helps out with the Python Cryptographic Authority on PyOpenSSL. When he's not feeling perplexed about his life choices, Cory loves computer networks, film and TV, and travel.

Oregon Ballroom 201-202

Friday, 19th May, 10:50 - 11:20

  • Talks


Grok the GIL: Write Fast And Thread-Safe Python
Grok the GIL: Write Fast And Thread-Safe Python (12:10 - 12:55)

About the session

I wrote Python for years while holding mistaken notions about the Global Interpreter Lock, and I've met others in the same boat. The GIL's effect is simply this: only one thread can execute Python code at a time, while N other threads sleep or await network I/O. Let's read CPython interpreter source and try some examples to grok the GIL, and learn to write fast and thread-safe Python.
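The claim that waiting threads do not hold the GIL can be checked with a small experiment. The sketch below is not from the talk; it assumes CPython and uses `time.sleep` as a stand-in for blocking network I/O (`wait_for_io` and `run_in_threads` are hypothetical names).

```python
import threading
import time


def wait_for_io(delay):
    # time.sleep stands in for a blocking network call; CPython releases
    # the GIL while a thread sleeps or blocks on I/O.
    time.sleep(delay)


def run_in_threads(n_threads, delay):
    """Start n_threads waiting threads and return the wall-clock time taken."""
    threads = [
        threading.Thread(target=wait_for_io, args=(delay,))
        for _ in range(n_threads)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


elapsed = run_in_threads(4, 0.2)
print(f"4 threads x 0.2 s of waiting took {elapsed:.2f} s")
```

If the waits were serialized, the total would be about 0.8 s; because the threads wait concurrently, the elapsed time should stay close to a single 0.2 s wait. CPU-bound Python code would not overlap this way.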

Speakers

A. Jesse Jiryu Davis
Staff Engineer at MongoDB in New York City specializing in C, Python, and async. Lead developer of the MongoDB C Driver libraries libbson and libmongoc. Author of Motor, an async MongoDB driver for Tornado and asyncio. Contributor to Python, PyMongo, MongoDB, Tornado, and asyncio. Co-author with Guido van Rossum of “A Web Crawler With asyncio Coroutines”, a chapter in the “500 Lines or Less” book in the Architecture of Open Source Applications series. Blogs at emptysquare.net and for the PSF at pyfound.blogspot.com.

Oregon Ballroom 201-202

Friday, 19th May, 12:10 - 12:55

  • Talks


Community powered packaging: conda-forge
Community powered packaging: conda-forge (13:55 - 14:25)

About the session

The Python scientific community always wanted a package manager that is cross platform, does not require sudo, and lets Python be awesome! The conda package manager solved that problem, but created new ones... This talk is a tour disguised as a beginner tutorial on conda-forge packaging. We will try to discuss some myths and misconceptions about conda and conda-forge, as well as give a quick comparison with pip and wheels.

Speakers

Filipe Pires Alvarenga Fernandes
I am a physical oceanographer turned developer due to a failed PhD. It all started when I decided to ditch Matlab in favor of Python for my daily work, and soon I started rewriting all the oceanographic Matlab tools in Python. Thanks to that effort I got a job working for NOAA teaching/developing/promoting Python as a tool for Earth scientists.

Oregon Ballroom 201-202

Friday, 19th May, 13:55 - 14:25

  • Talks


Instagram Filters in 15 Lines of Python
Instagram Filters in 15 Lines of Python (14:35 - 15:05)

About the session

Images tell stories, and we love Instagram filters because they give emotion to our images. Do you want to explore what makes up Instagram filters? In this talk, we will talk about the basic elements of Instagram filters and implement them in Python. The staple libraries we will use are scikit-image and NumPy, with matplotlib and Jupyter notebooks for plotting and interactivity. In the end, we will implement the (now-defunct) Gotham Instagram filter in 15 lines of Python (not including imports). Throughout the process, there will be many pretty pictures.

Speakers

Michele Pratusevich
I graduated from MIT with my BS in computer science in 2013 followed by an MEng in 2015. I currently work at Amazon on large-scale computer vision problems and systems, with Python and C++ being the languages of choice. I like hiking, climbing, and fun programming problems.

Oregon Ballroom 201-202

Friday, 19th May, 14:35 - 15:05

  • Talks


The Glory of pdb's set_trace
The Glory of pdb's set_trace (16:30 - 17:00)

About the session

Everyone needs to debug code, and waiting for code to complete execution and write print messages to stdout can take up a non-trivial portion of our time. There’s one function in particular in the Python debugger (pdb) library that can give you a much clearer understanding of what’s going on in your code, much more quickly: pdb.set_trace(). In this talk, we’ll identify the most useful things you can do when you use set_trace that can make debugging exponentially more efficient and enjoyable.
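For readers who have not used it, here is a minimal sketch of what a set_trace session looks like. In everyday use you would simply write `pdb.set_trace()` and type commands at the `(Pdb)` prompt; so that this example runs unattended, it instead feeds two scripted commands to a `pdb.Pdb` instance. `order_total` and its inputs are hypothetical.

```python
import io
import pdb


def order_total(prices):
    total = sum(prices)
    # Interactive equivalent: pdb.set_trace()
    # Here the debugger reads scripted commands instead of the keyboard:
    # "p total" prints the variable, "continue" resumes the program.
    scripted_input = io.StringIO("p total\ncontinue\n")
    captured = io.StringIO()
    debugger = pdb.Pdb(
        stdin=scripted_input, stdout=captured, nosigint=True, readrc=False
    )
    debugger.set_trace()
    return total, captured.getvalue()


total, session = order_total([1200, 3000, 42])
print(session)  # the debugger transcript, including the printed value of total
```

The point is that the program pauses with all local state intact, so a value such as `total` can be inspected on the spot instead of adding a print call and rerunning.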

Speakers

Nicole Zuckerman
Nicole Zuckerman is a software engineer at Clover Health, where she writes the endpoints and data pipelines to help surface better health care for members. She's also deeply invested in effectively on-boarding entry-level engineers, and improving diversity and inclusion in tech. In her free time, Nicole is an avid dancer and teacher, sci-fi book fanatic, soul and jazz aficionado, and cheese lover. She holds an MA in English Literature and Women's Studies from the University of Liverpool.

Oregon Ballroom 201-202

Friday, 19th May, 16:30 - 17:00

  • Talks


Looping Like a Pro in Python
Looping Like a Pro in Python (17:10 - 17:40)

About the session

The humble loop: it's hard to write a program without it. Whether it's processing numbers in a sequence, lines in a text file, users in a database, or any other list of things, you use loops all the time. But did you know that Python has a lot of different ways to write loops? Reaching for the right looping tool can make your code cleaner, more readable, easier to test, and it can even make it run faster! By the end of this talk, you'll be looping like a pro, and your code will be better for it.

Speakers

David DB Baumgold
DB is a freelance web developer living in the Boston area. He loves Python for its simplicity, its clarity, and its community -- especially in welcoming beginners. In addition to Python, he also enjoys Javascript, databases, design, teaching, and learning. He genuinely believes that everything will work out alright in the end, somehow.

Oregon Ballroom 201-202

Friday, 19th May, 17:10 - 17:40

  • Talks


Oregon Ballroom 201-202
Passing Exceptions 101: Paradigms in Error Handling
Passing Exceptions 101: Paradigms in Error Handling (11:30 - 12:00)

About the session

Exception handling in Python can sometimes feel like a Wild West. If you have a send_email function, and the caller inputs an invalid email address, should it: A) Return None or some other special return value, B) Let the underlying exception it might cause bubble up, C) Check via a regex and type checking and raise a ValueError immediately, or D) Make a custom EmailException subclass and raise that? What if there is a network error while the email is sending? Or what if the function calls a helper _format_email that returns an integer (clearly wrong!), or raises a TypeError itself? Should it crash the program or prompt a retry? This talk will introduce the concept of an exception, explain the built-in Python exception hierarchy and the utility of custom subclasses, demonstrate try/except/finally/else syntax, and then explore different design patterns for exception control flow and their tradeoffs using examples. It will also make comparisons to error handling philosophy in other languages, like Eiffel and Go.
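To make options C and D concrete, here is one possible sketch that combines them with try/except/else/finally. It is an illustration, not the design the talk recommends: `InvalidAddressError`, `ADDRESS_PATTERN`, the `transport` callable, and the `attempts` log are assumptions, and the regex is deliberately simplistic.

```python
import re


class EmailException(Exception):
    """Base class for everything send_email raises on purpose (option D)."""


class InvalidAddressError(EmailException, ValueError):
    """Raised before any network call; also catchable as ValueError (option C)."""


# Deliberately simple pattern: something@something.something
ADDRESS_PATTERN = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")

attempts = []  # addresses for which a send was attempted


def send_email(address, body, transport):
    if not isinstance(address, str) or not ADDRESS_PATTERN.fullmatch(address):
        raise InvalidAddressError(f"not a valid address: {address!r}")
    try:
        transport(address, body)
    except ConnectionError as exc:
        # Translate the low-level error but keep it attached as __cause__.
        raise EmailException("network failure while sending") from exc
    else:
        return "sent"
    finally:
        attempts.append(address)  # runs on success and on failure


def working_transport(address, body):
    pass


def flaky_transport(address, body):
    raise ConnectionError("connection reset")


print(send_email("ada@example.com", "hi", working_transport))
try:
    send_email("not-an-address", "hi", working_transport)
except ValueError as exc:
    print("rejected:", exc)
try:
    send_email("ada@example.com", "hi", flaky_transport)
except EmailException as exc:
    print("failed:", exc, "| cause:", repr(exc.__cause__))
```

Deriving the validation error from both `EmailException` and `ValueError` lets callers choose between catching the library-specific family or the built-in type.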

Speakers

Amandine Lee
I was trained in physics and statistics, and began programming by learning MATLAB and R for modeling in research. A couple of years and a few computer science classes later, I attended the Recurse Center, where I decided to learn Python to have a versatile and concise home programming language with which to learn about CS theory and software engineering. I now work at Dropbox, on the Desktop Release Engineering team, where Python is a part of my daily life.

Oregon Ballroom 201-202

Friday, 19th May, 11:30 - 12:00

  • Talks


Python from Space: Analyzing Open Satellite Imagery Using the Python Ecosystem
Python from Space: Analyzing Open Satellite Imagery Using the Python Ecosystem (15:15 - 16:00)

About the session

Earth imaging satellites, just like our computers, are shrinking and becoming more ubiquitous than ever before. It is now possible to obtain open satellite data on a daily if not weekly basis and for this data to be put to work, helping us better understand our planet and quickly respond to disaster situations. In this talk we will work through a Jupyter notebook that covers the satellite data ecosystem and the Python tools that can be used to sift through and analyze that data. Topics include Python tools for using OpenStreetMap data, the Geospatial Data Abstraction Library (GDAL), and OpenCV and NumPy for image processing. This talk is intended for novice and intermediate Python developers who are interested in using data science and satellite imagery for social good and fundamental scientific research.

Speakers

Katherine Scott
Katherine Scott is a senior software engineer for Planet where she works on the data pipeline team making sense of millions of satellite images. Katherine was the lead developer of the SimpleCV computer vision library and co-author of the SimpleCV O'Reilly book. In her spare time she hacks together robots and teaches kids about robotics and programming.

Oregon Ballroom 201-202

Friday, 19th May, 15:15 - 16:00

  • Talks


Oregon Ballroom 203-204
Type uWSGI; press enter; what happens?
Type uWSGI; press enter; what happens? (11:30 - 12:00)

About the session

You're a pretty knowledgeable Python web application developer, but how does that web application get served to the world? For many of us, uWSGI is the magic that makes our application available, and in this talk we'll look at how uWSGI works with the OS and the networking stack to make the magic happen.

Speakers

Asheesh Laroia
By day, Asheesh Laroia is a member of the technical staff at Sandstorm in Palo Alto. By night, he is volunteer President at OpenHatch, helping create workshops that teach students how to get involved in open source. His technical background touches machine learning, security, and linguistics. He helped start the Boston Python Workshop for women and their friends, has been teaching Python to newcomers since 2004, including at Noisebridge and the EFF, and has advised user groups on how to make their events more newcomer-friendly and gender-diverse.

Philip James
Philip James is a Software Engineer at Stripe, a Civic Technologist and a Director for the City of Alameda Democratic Club and the East Bay Young Democrats. In his free time he builds ContactOtter, a smart address book you can share.

Oregon Ballroom 203-204

Friday, 19th May, 11:30 - 12:00

  • Talks


Web identity: OAuth2 and OpenIDConnect
Web identity: OAuth2 and OpenIDConnect (12:10 - 12:40)

About the session

Interested in adding single sign-on to your application, but confused about the variety of different web authentication methods out there? OAuth, OAuth2, OpenID, OpenIDConnect, SAML, Facebook Connect? This talk will clarify the different protocols, examining OAuth2 and OpenIDConnect in greater detail. It will demonstrate a basic client implementation using FLOSS libraries, and briefly touch on some of the issues involved in server implementation.

Speakers

Brendan McCollam
Brendan McCollam is a developer for the University of Chicago, working on the Globus project (globus.org): a data management platform for large-scale scientific data. Formerly he worked at Leapfrog Online, where he released the 'rstr' package. He is a graduate of Pomona College, where he studied neuroscience and mathematics; he was the recipient of a 2008 Watson Fellowship. Brendan lives outside London.

Oregon Ballroom 203-204

Friday, 19th May, 12:10 - 12:40

  • Talks


Implementing Concurrency and Parallelism From The Ground Up
Implementing Concurrency and Parallelism From The Ground Up (13:40 - 14:25)

About the session

When writing an application, it is common to want to do many things at once. For web servers, this means serving multiple web requests; for GUI applications, it means doing a background task whilst keeping the UI responsive. But how do we actually do that? This talk will go into how concurrency and parallelism work from the CPU, OS, and threads up, how state (data) is shared between them, how this interacts with the functions that you, the programmer, write, and how you can write properly behaving concurrent or parallel software.

Speakers

Amber Brown
Amber Brown (HawkOwl) is the Twisted Release Manager.

Oregon Ballroom 203-204

Friday, 19th May, 13:40 - 14:25

  • Talks


Factory Automation with Python - Stories about Robots, Serial Ports, and Barcode Readers
Factory Automation with Python - Stories about Robots, Serial Ports, and Barcode Readers (14:35 - 15:05)

About the session

In industrial automation, tried and tested always beats latest and greatest: the machines that make smartphones have a serial port and are configured with .csv files. But when your factory automates complex non-linear workflows and is jam-packed with sensors and robots, you are quickly faced with software engineering challenges that call for modern tools. Python turns out to be surprisingly versatile in this setting, whether for prototyping a single conveyor belt or taming a building full of robots. This talk explains how to use Python for interfacing with two common industrial automation devices: a barcode scanner and a Programmable Logic Controller (PLC). After a simple demo, you’ll hear about lessons learned from using Python packages that weren’t written with robots in mind, like Celery and pandas, with robots.

Speakers

Jonas Neubert
Jonas is a robotics engineer and software developer interested in automating complex processes. Among the many places where Jonas has caused downtime are: a steel processing plant, an automated genomics lab, and a cinema projection room (during a Harry Potter screening!). I'm currently making electronics prototyping faster with Tempo Automation in San Francisco. Previously, I've been building big robots at Counsyl and tiny robots at Cornell University. I've also created smartphone apps that make physical things, and a web app that 3D-prints topographical maps.

Oregon Ballroom 203-204

Friday, 19th May, 14:35 - 15:05

  • Talks


What's in your pip toolbox?
What's in your pip toolbox? (15:15 - 15:45)

About the session

pip is a great tool, but dependency management doesn't stop there. I'll explore several tools that work with pip to make managing your dependencies easier, faster, and safer. I'll cover generating dependencies a better way, maintaining your requirements.txt for the long-term, and exploring existing Python environments. Afterward, you'll never want to pip freeze > requirements.txt again.

Speakers

Jon Banafato
Jon Banafato is a Python developer and event organizer living in NYC. He would love to see you attend PyGotham.

Oregon Ballroom 203-204

Friday, 19th May, 15:15 - 15:45

  • Talks


A gentle introduction to deep learning with TensorFlow
A gentle introduction to deep learning with TensorFlow (16:15 - 17:00)

About the session

Deep learning's explosion of spectacular results over the past few years may make it appear esoteric and daunting, but in reality, if you are familiar with traditional machine learning, you're more than ready to start exploring deep learning. This talk aims to gently bridge the divide by demonstrating how deep learning operates on core machine learning concepts and getting attendees started coding deep neural networks using Google's TensorFlow library.

Speakers

Michelle Fullwood
I dissect words for science at MIT.

Oregon Ballroom 203-204

Friday, 19th May, 16:15 - 17:00

  • Talks


Oregon Ballroom 203-204
Snakes on a Hyperplane: Python Machine Learning in Production
Snakes on a Hyperplane: Python Machine Learning in Production (17:10 - 17:40)

About the session

Companies with an artificial-intelligence plan have a differentiating strategy in the intelligence economy; however, implementing robust machine learning in production is nontrivial, often requiring a close collaboration between data scientists and developers, and retooling the production stack and workflows to develop and maintain accurate models. Machine learning in production involves model application, handling missing data, data artifacts, and data outside of the training calibration. A rigorous evaluation framework draws upon logging to determine characteristics of model coverage, model performance, auditing, and run-time performance. Model coverage is the number of times the model produced sensible output relative to the number of times it is called. Model coverage is reduced if the model does not converge or model criteria are not met. Model performance is evaluated with a suite of metrics (accuracy, AUC, FPR, TPR, RMSE, MAPE, etc.), which assist in determining the most appropriate model to use in the production scenario and the validity of the model training. Regularly performing manual audits for spot checks is important for debugging and ensuring the model passes sanity checks. Run-time performance covers run times and profiling of model pieces, ensuring performance is within specified requirements and refactoring otherwise. In the AI renaissance, where ML is a critical piece of intelligent products, seamlessly integrating model evaluation into workflows is an important component of making robust products and building a satisfying customer experience. Python is a great language for building intelligent products, with its abundance of ML libraries and wrappers contributed as open-source software in addition to rich full-stack capabilities.

Speakers

Jessica Lundin
Jessica’s research journey began in astro-particle physics doing fieldwork on a neutrino-detector project at South Pole Station. She completed a Masters and PhD in applied mathematics and geophysics at the University of Washington, studying ice physics and climatology, i.e. applied machine learning. After a postdoc in Japan she joined two later-acquired start-ups. She began at Microsoft by way of VoloMetrix, an enterprise productivity company acquired by O365, and is currently compelled to address the technology-underserved field of medicine, working in the health and IoT domain in Microsoft Research.

Oregon Ballroom 203-204

Friday, 19th May, 17:10 - 17:40

  • Talks


Oregon Ballroom 203-204
Experiment Assignment on the Web
Experiment Assignment on the Web (10:50 - 11:20)

About the session

A popular way of improving websites is to run experiments on them. We split users into groups, show two or more variations of the site, measure how well each one does, and then show the best version to everyone. In this talk, I'll walk through a toy Python program that does the first step: splitting users into groups. A few interesting problems arise: grouping users, whitelists, and scaling. I'll share different ways to address them. I'll also give examples of things that can go terribly wrong when designing experiment assignment code.
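One common way to do the splitting step is to hash the user ID together with the experiment name. The sketch below is an assumption about how such a toy program might look, not the speaker's code; `assign`, `WHITELIST`, and the experiment and user names are hypothetical.

```python
import hashlib
from collections import Counter

# Hypothetical whitelist: force specific users into a group, e.g. for QA.
WHITELIST = {("new_checkout", "employee-42"): "treatment"}


def assign(user_id, experiment, variants=("control", "treatment")):
    """Return the variant for a user; the same inputs always give the same answer."""
    forced = WHITELIST.get((experiment, user_id))
    if forced is not None:
        return forced
    # Including the experiment name in the hash keeps assignments in
    # different experiments independent of each other.
    key = f"{experiment}:{user_id}".encode("utf-8")
    bucket = int(hashlib.sha256(key).hexdigest(), 16) % len(variants)
    return variants[bucket]


counts = Counter(assign(f"user-{n}", "new_checkout") for n in range(10000))
print(counts)
print(assign("employee-42", "new_checkout"))
```

Because the assignment is a pure function of its inputs, no per-user state needs to be stored, which helps with scaling; the trade-off is that changing the number of variants reshuffles existing users.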

Speakers

Jessica Stringham
Jessica is a software engineer at Yelp. At Yelp, she works on the experimentation infrastructure, and previously worked on mobile app analytics and the activity feed. She sometimes blogs at jessicastringham.com.

Oregon Ballroom 203-204

Friday, 19th May, 10:50 - 11:20

  • Talks


Portland Ballroom 251 & 258
Prehistoric Patterns in Python
Prehistoric Patterns in Python (11:30 - 12:00)

About the session

Why does some code use dictionaries that have None for all values? Is it true that you shouldn't concatenate strings with +? Will Python optimize constant calculations? This talk will go through some patterns that used to be common in Python, but which are now regarded as outdated, and see if they really are outdated and why. The results surprised me; maybe they'll surprise you.
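The three questions in the abstract can each be poked at in a few lines. This is a sketch for orientation rather than the talk's own examples, and the last check assumes CPython, where constant folding is an implementation detail.

```python
# 1. Before the built-in set type arrived in Python 2.4, a dict whose
#    values were all None was a common way to track unique items.
words = ["spam", "eggs", "spam"]
seen_old = {}
for word in words:
    seen_old[word] = None
seen_new = set(words)
print(sorted(seen_old) == sorted(seen_new))

# 2. Two ways to build a string: repeated + versus str.join.
parts = [str(n) for n in range(5)]
concatenated = ""
for part in parts:
    concatenated += part
joined = "".join(parts)
print(concatenated == joined)

# 3. Inspect the constants of compiled code to see whether the
#    interpreter folded 24 * 60 * 60 into a single value.
code = compile("seconds = 24 * 60 * 60", "<example>", "exec")
print(86400 in code.co_consts)
```

Whether the older forms are actually slower or just unidiomatic is the kind of question `timeit` can answer on a given interpreter version.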

Speakers

Lennart Regebro
Lennart Regebro has been using Python so long that he has forgotten all other programming languages. He was born in Sweden, works at Boston company Shoobx and lives in Poland, with his wife, daughter, two cats and a quince tree. His book Supporting Python 3 is available for free online.

Portland Ballroom 251 & 258

Friday, 19th May, 11:30 - 12:00

  • Talks


Packaging Let’s Encrypt: Lessons learned shipping Python code to hundreds of thousands of users
Packaging Let’s Encrypt: Lessons learned shipping Python code to hundreds of thousands of users (13:55 - 14:25)

About the session

Let's Encrypt launched on April 12th, 2016, for the first time allowing anyone access to free SSL certificates that could be automatically fetched and renewed. The demand was massive, and so was the need for a client to fetch these certificates for all of those users. This client is called Certbot, and it's written entirely in Python. Unfortunately for the sanity of Certbot developers, these users of Let's Encrypt can't decide on a single operating system to use! This requires us to ship our software, and all of its dependencies, to a variety of systems, all with different web servers, Python versions, package managers, and underlying packages. Learn how we got through this mess!

Speakers

Noah Swartz
Noah is a Staff Technologist on the Tech Projects team. He works on the various software the EFF produces and maintains, including but not limited to Privacy Badger and Certbot. Before joining EFF Noah was a researcher at the MIT Media Lab as well as a technomancer and free software/culture advocate. An avid game enthusiast, Noah has ascended in nethack four times. He lives in the Mission District of San Francisco with his family of twitterbots.

Portland Ballroom 251 & 258

Friday, 19th May, 13:55 - 14:25

  • Talks


I Installed Python 3.6 on Windows and I Liked It
I Installed Python 3.6 on Windows and I Liked It (15:15 - 16:00)

About the session

Python has a great reputation as a cross-platform language, which for many people means different varieties of Linux. But a huge number of Python users are running on Windows - a fundamentally different operating system where things do not always work the same. However, Python has always worked incredibly well across different platforms including Windows, going to great lengths to support and expose the platform without making development more complex. In this session, CPython core developer and Microsoft engineer Steve Dower will discuss some of the reasons why cross-platform support is not an accident, and how Python 3.6 makes it even easier to support both Windows and Linux.

Speakers

Steve Dower
Steve is an engineer who tells people about Python and then gives them excuses to use it and great tools to use it with. He works on Python support for Visual Studio, contributes to many of Microsoft's Python libraries, and is a core contributor and Windows expert for CPython.

Portland Ballroom 251 & 258

Friday, 19th May, 15:15 - 16:00

  • Talks


Building A Gigaword Corpus: Lessons on Data Ingestion, Management, and Processing for NLP
Building A Gigaword Corpus: Lessons on Data Ingestion, Management, and Processing for NLP (16:30 - 17:00)

About the session

As the applications we build are increasingly driven by text, doing data ingestion, management, loading, and preprocessing in a robust, organized, parallel, and memory-safe way can get tricky. This talk walks through the highs (a custom billion-word corpus!), the lows (segfaults, 400 errors, pesky mp3s), and the new Python libraries we built to ingest and preprocess text for machine learning. While applications like Siri, Cortana, and Alexa may still seem like novelties, language-aware applications are rapidly becoming the new norm. Under the hood, these applications take in text data as input, parse it into composite parts, compute upon those composites, and then recombine them to deliver a meaningful and tailored end result. The best applications use language models trained on domain-specific corpora (collections of related documents containing natural language) that reduce ambiguity and prediction space to make results more intelligible. Here's the catch: these corpora are huge, generally consisting of at least hundreds of gigabytes of data inside of thousands of documents, and often more! In this talk, we'll see how working with text data is substantially different from working with numeric data, and show that ingesting a raw text corpus in a form that will support the construction of a data product is no trivial task. For instance, when dealing with a text corpus, you have to consider not only how the data comes in (e.g. respecting rate limits, terms of use, etc.), but also where to store the data and how to keep it organized. Because the data comes from the web, it's often unpredictable, containing not only text but audio files, ads, videos, and other kinds of web detritus. Since the datasets are large, you need to anticipate potential performance problems and ensure memory safety through streaming data loading and multiprocessing. 
Finally, in anticipation of the machine learning components, you have to establish a standardized method of transforming your raw ingested text into a corpus that's ready for computation and modeling. In this talk, we'll explore many of the challenges we experienced along the way and introduce two Python packages that make this work a bit easier: Baleen and Minke. Baleen is a package for ingesting formal natural language data from the discourse of professional and amateur writers, like bloggers and news outlets, in a categorized fashion. Minke extends Baleen with a library that performs parallel data loading, preprocessing, normalization, and keyphrase extraction to support machine learning on a large-scale custom corpus.
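
The streaming data loading mentioned above can be illustrated with a generator that reads one document at a time. This is a minimal sketch under stated assumptions: the corpus is a directory tree of UTF-8 text files, and the names `stream_documents` and `count_words` are hypothetical. It is not the API of Baleen or Minke.

```python
import os


def stream_documents(root, extensions=(".txt",)):
    """Yield (path, text) pairs one file at a time instead of loading the whole corpus."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            # Skip non-text detritus such as audio or video files.
            if not name.lower().endswith(extensions):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as handle:
                yield path, handle.read()


def count_words(root):
    """Consume the stream lazily; only one document is held in memory at a time."""
    return sum(len(text.split()) for _path, text in stream_documents(root))
```

Because `stream_documents` is a generator, downstream steps such as tokenization can be chained onto it without materializing the corpus as a list.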

Speakers

Rebecca Bilbro
Dr. Rebecca Bilbro is lead data scientist at Bytecubed, a small tech company in Washington, DC, where she and her team use supervised and unsupervised machine learning, programming in Python and R, and distributed computing with Spark and Hadoop. In her free time, Rebecca works with the open source collaborative District Data Labs (check out the Yellowbrick Project), writes for the DDL blog, practices ukulele, and rides her bicycle around DC. Rebecca earned her doctorate from the University of Illinois Urbana-Champaign, conducting research on communication practices in Engineering, and holds a Bachelor's degree in Mathematics and English from Skidmore College.

Portland Ballroom 251 & 258

Friday, 19th May, 16:30 - 17:00

  • Talks


Algorithmic Music Generation
Algorithmic Music Generation (17:10 - 17:40)

About the session

Music is mainly an artistic act of inspired creation, unlike traditional math problems: it cannot be solved by a simple set of formulae. The most interesting and challenging part is producing unique music without infringing copyright. The generated music has to sound good, and what sounds good is very subjective and varies from culture to culture. Artificial neural networks and deep learning have a wide range of applications, such as image processing, natural language processing, and time series prediction. But what about their use in art? Could we use deep learning to create music? This talk is about how deep learning models were used to produce music, catering particularly to Bollywood. It will show how an exquisite piece of art, i.e. music, can be generated using a deep learning model that automates feature extraction. To automate music generation, the model must be able to remember learned features over a long period of time; this is achieved with a special type of Recurrent Neural Network (RNN) called an LSTM (Long Short-Term Memory) network. Implementing such a complex model is made much easier by Python libraries such as Keras with Theano as the backend, which allow for easy and fast prototyping. The numpy and scipy packages are used for mathematical computation on the input vectors and for reading and writing WAV files, respectively. The neural network is trained on a large number of music samples. After an adequate number of iterations and enough training time, the model generates music that is unique and original. This talk will discuss the steps involved in preprocessing the data, training the model, testing the model, and generating music from the trained model. It will also cover some of the challenges and tradeoffs of algorithmic music generation.

Speakers

Padmaja V Bhagwat
Padmaja V Bhagwat is a junior pursuing a Bachelor's in Information Technology at the National Institute of Technology Karnataka, India. Her interest lies in the field of Machine Learning and Artificial Intelligence. She is motivated to make the world a better place by using machine learning. When she isn’t working on machine learning, she dances.

Portland Ballroom 251 & 258

Friday, 19th May, 17:10 - 17:40

  • Talks


Portland Ballroom 251 & 258
Next Level Testing
Next Level Testing (12:10 - 12:40)

About the session

Unit, functional, and integration tests are great first steps towards improving the quality of your Python project. Ever wonder if there’s even more you can do? Maybe you've heard of property-based testing, fuzzing, and mutation testing but you're unsure exactly how they can help you. In this talk we’ll cover additional types of tests that can help improve the quality and robustness of your Python projects: property-based testing, fuzz testing, stress testing, long term reliability testing, and mutation testing. We’ll also go beyond just covering what these tests are. For each of the test types above, I’ll give you real world examples from open source software that I maintain that show you the types of bugs each test type can find. I’ll also show you how you can integrate these tests into your Travis CI and/or Jenkins environment.
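
As an illustration of property-based testing, the sketch below checks a round-trip property on randomly generated inputs rather than on hand-picked cases. It uses only the standard library so that it stays self-contained; in practice a library such as Hypothesis would generate and shrink the inputs. The run-length encoder is a made-up example and is not taken from the speaker's projects.

```python
import random


def run_length_encode(text):
    """Encode 'aaab' as [('a', 3), ('b', 1)]."""
    encoded = []
    for char in text:
        if encoded and encoded[-1][0] == char:
            encoded[-1] = (char, encoded[-1][1] + 1)
        else:
            encoded.append((char, 1))
    return encoded


def run_length_decode(pairs):
    """Invert run_length_encode."""
    return "".join(char * count for char, count in pairs)


def check_round_trip(trials=500, seed=0):
    """Property: decoding an encoded string returns the original string."""
    rng = random.Random(seed)
    for _ in range(trials):
        length = rng.randint(0, 20)
        # A tiny alphabet makes repeated characters (the interesting case) likely.
        text = "".join(rng.choice("ab ") for _ in range(length))
        assert run_length_decode(run_length_encode(text)) == text, text
    return True
```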

Speakers

James Saryerwinnie
James Saryerwinnie is a Software Development Engineer at Amazon Web Services where he works on Boto3, the AWS SDK for Python, and the AWS CLI, the unified command line interface for AWS. He also maintains several open source projects including jmespath, fakeredis, and python-keepassx.

Portland Ballroom 251 & 258

Friday, 19th May, 12:10 - 12:40

  • Talks


Portland Ballroom 251 & 258
Optimizations which made Python 3.6 faster than Python 3.5
Optimizations which made Python 3.6 faster than Python 3.5 (10:50 - 11:20)

About the session

Various optimizations made Python 3.6 faster than Python 3.5. Let's see in detail what was done and how. Python 3.6 is faster than any other Python version on many benchmarks. We will see results of the Python benchmark suite on Python 2.7, 3.5 and 3.6. The bytecode format and the instructions to call functions were redesigned to run bytecode faster. A new C calling convention, called 'fast call', was introduced to avoid creating a temporary tuple and dict. The way Python parses arguments was also optimized using a new internal cache. Operations on bytes and encodings like UTF-8 were optimized a lot thanks to a new API to create bytes objects. The API allows very efficient optimizations and reduces memory reallocations. Some parts of asyncio were rewritten in C to speed up code by up to 25%. The PyMem_Malloc() function now also uses the fast pymalloc allocator, giving a tiny speedup for free. Finally, we will see optimization projects for Python 3.7: using fast calls in more cases, speeding up method calls, a cache on opcodes, and a cache on global variables.

Speakers

Victor Stinner
Python core developer since 2010, I'm the author of various Python applications and libraries. See my profile on Bitbucket ( https://bitbucket.org/haypo/ ) and GitHub ( http://github.com/haypo/ ). I'm working on OpenStack for Red Hat from France.

Portland Ballroom 251 & 258

Friday, 19th May, 10:50 - 11:20

  • Talks


The Gilectomy: How's It Going?
The Gilectomy: How's It Going? (12:10 - 12:55)

About the session

One of the most interesting projects in Python today is Larry Hastings' 'Gilectomy' project: the removal of Python's Global Interpreter Lock, or 'GIL'. Come for an up-to-the-minute status report: what's been tried, what has and hasn't worked, and what performance is like now.

Speakers

Larry Hastings
Larry has been using Python since the lateish 90s. He is a CPython core contributor, and is Release Manager for Python 3.4.

Portland Ballroom 251 & 258

Friday, 19th May, 12:10 - 12:55

  • Talks


Ending Py2/Py3 compatibility in a user friendly manner
Ending Py2/Py3 compatibility in a user friendly manner (15:15 - 16:00)

About the session

'Four shalt thou not count, neither count thou two, excepting that thou then proceed to three.' Monty Python and the Holy Grail; Scene 33. Python 3 has been around for more than eight years, and much of the Python ecosystem is now available both on Python 2 and Python 3, often using a single code base. Nonetheless, this compatibility comes at a development cost, and some library authors are considering ending support for Python 2. These once-Python-2-compatible libraries are at risk of being upgraded on incompatible systems, causing user (and developer) frustration. While it may seem simple to cease support for Python 2, the challenge is not in ending support, but in doing so in a way that does not wreak havoc for users who stay on Python 2. And that is not only a communications problem, but a technical one: up until recently, it was impossible to tag a release as Python 3 only; today it is possible. Like any maintainer of a widely used library, we want to ensure that users who continue to use Python 2 continue to have functioning libraries, even after development proceeds in a way that does not support Python 2. One approach is to ensure easy installation of older versions and, if possible, avoid incompatible versions altogether. Users should not need to manually pin maximal version dependencies across their development environments and projects if all they want is to use the latest versions of libraries that are compatible with their system. Even if we did expect that of users, consider what would happen when a package they rely on converts to be only Python 3 compatible. If they were not tracking the complete dependency tree, they might discover, on upgrade, that their projects no longer work. To avert this they would need to pin those packages at the last version compatible with Python 2. Users who want to use older Python versions should not have to go through so much anguish to do so.
In order to solve this problem, and thereby make both users' and maintainers' lives easier, we ventured into the rabbit-hole called Packaging. Though we set off with a singular quest, our tale roves through many lands. We'll narrate the story of our amending PEPs, our efforts in building the ramparts of the pypa/Warehouse Castle, battles with the dragons of Pip, and errands in the 'land of no unit tests', otherwise known as PyPI legacy. By the end of the above tale, the audience members will know that the road to Python 3 only libraries once had hazards that are now easily avoidable, so long as users upgrade their package management tools.
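
The abstract does not name the mechanism, but tagging a release as Python 3 only is commonly done by declaring a minimum interpreter version in the package metadata (for example the `python_requires` argument to setuptools' `setup()`), which a sufficiently recent pip consults when choosing a release. The sketch below imitates that selection step in plain Python. The release table, the helper names, and the restriction to specifiers of the form `>=X.Y` are all simplifying assumptions; this is not pip's actual resolver.

```python
def parse_min_python(requires_python):
    """Parse a specifier of the simple form '>=X.Y' into a version tuple."""
    if not requires_python:
        return (0,)  # no metadata: assume the release works everywhere
    if not requires_python.startswith(">="):
        raise ValueError("this sketch only understands '>=X.Y'")
    return tuple(int(part) for part in requires_python[2:].split("."))


def newest_compatible(releases, python_version):
    """Return the newest release whose minimum Python version is satisfied."""
    compatible = [
        version
        for version, requires in releases.items()
        if python_version >= parse_min_python(requires)
    ]
    return max(compatible, key=lambda v: tuple(int(p) for p in v.split(".")), default=None)


# Hypothetical release history of a library that dropped Python 2 in version 2.0.
RELEASES = {"1.8": None, "1.9": ">=2.7", "2.0": ">=3.3"}
```

With this table, a Python 2.7 user asking for the latest version would get 1.9 rather than a broken 2.0, without pinning anything by hand.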

Speakers

Matthias Bussonnier
Matthias is a postdoc at the UC Berkeley Institute for Data Science, and has been a core developer of IPython and Jupyter for a couple of years. With a background in physics, Matthias spends most of his time developing tools for the scientific community and for education, as well as promoting Python 3.

Min Ragan-Kelley
Min has been a core developer of IPython (and now Jupyter) since 2006. He holds a PhD from UC Berkeley in Applied Science & Technology, with an emphasis in computational plasma physics. He now works as a postdoctoral researcher at Simula Research Laboratory in Oslo, Norway, on the Jupyter and OpenDreamKit projects, focusing on JupyterHub and the Jupyter protocols for interactive computing.

Portland Ballroom 251 & 258

Friday, 19th May, 15:15 - 16:00

  • Talks


Portland Ballroom 252-253
In-Memory Event Resequencing: Realistic Testing For Impossible Bugs
In-Memory Event Resequencing: Realistic Testing For Impossible Bugs (10:50 - 11:20)

About the session

As we all know, we should write testable code, and automated tests. But as we also know, no test plan survives contact with the real world. Complex, distributed systems fail in complex, distributed ways, and even the simplest web app today is a complex distributed system. So, as our code accrues little fixes to bugs that only show up in production, our test suites eventually either become slow integration testing monstrosities that are 'realistic' but flaky and unreliable, or useless piles of mocks which are fast and deterministic but don't give you confidence. In this talk, we'll explore how to leverage event-driven programming, or 'async I/O', to structure code in such a way that its tests are fast, realistic, and reliable, even in the face of horrible race-conditions you only discover in production.

Speakers

Michele Pratusevich
I graduated from MIT with my BS in computer science in 2013 followed by an MEng in 2015. I currently work at Amazon on large-scale computer vision problems and systems, with Python and C++ being the languages of choice. I like hiking, climbing, and fun programming problems.

Portland Ballroom 252-253

Friday, 19th May, 10:50 - 11:20

  • Talks


Solid Snakes or: How to Take 5 Weeks of Vacation
Solid Snakes or: How to Take 5 Weeks of Vacation (13:40 - 14:25)

About the session

No matter whether you run a web app, search for gravitational waves, or maintain a backup script: being responsible for a piece of software or infrastructure means that you either get a pager right away, or that you get angry calls from people affected by outages. Being paged at 4am in everyday life is bad enough. Having to fix problems from hotel rooms while your travel buddies go for brunch is even worse. And while incidents can’t be prevented completely, there are ways to make your systems more reliable and minimize the need for (your!) manual intervention. This talk will help you to get calm nights and relaxing vacations by teaching you some of them.

Speakers

Hynek Schlawack
Hynek Schlawack is a lead infrastructure and software engineer from Berlin, Germany, PSF fellow, and contributor to a wide variety of open source projects including high-profile ones like Twisted and CPython. His main areas of interest are security, networks, and solid software engineering.

Portland Ballroom 252-253

Friday, 19th May, 13:40 - 14:25

  • Talks


Dr. Microservices, Or How I Learned to Stop Worrying and Love the API
Dr. Microservices, Or How I Learned to Stop Worrying and Love the API (14:35 - 15:05)

About the session

Assuming that you already know how to build a monolithic app, you must be wondering how you can use all this 'microservice' stuff that you keep hearing about. Well, a good word of advice is that you probably don't need it. If designed properly, a monolithic app should be able to scale and fit the needs of most businesses. Even so, you should keep your development as simple as possible until you have proven and solidified your business concepts. But if you do need to grow to Internet scale, then you have a long road ahead of you. Moving from a monolithic application to microservices is a natural evolution that is often born of necessity. There are several competing schools of thought that are still being battle-tested in these early days of microservice architecture. The competing paradigms agree on most of the requirements and differ mainly in the tools used to fulfill them. This talk will cover setting up the required infrastructure, and demonstrate how to migrate a sample monolithic Django application into a microservices platform. The demo application will use the following technologies: Django, Flask, Fabric, Terraform, Ansible, CentOS, Docker, Mesos, Consul, Nginx, Pgbouncer, Kafka.

Speakers

Ryan Anguiano
Ryan has been designing and implementing technological solutions in various industries for over 13 years. Over the past decade, he has been a part of helping a number of companies venture into the digital realm and implement policies that allow for vast creativity and innovation to occur. He has built many platforms that utilize state-of-the-art technology in order to gain a competitive edge and created tools that help businesses achieve their goals. Ryan currently works at RevPoint Media developing www.jangl.com

Portland Ballroom 252-253

Friday, 19th May, 14:35 - 15:05

  • Talks


5 ways to deploy your Python web app in 2017
5 ways to deploy your Python web app in 2017 (15:15 - 15:45)

About the session

You’ve built a fine Python web application and now you’re ready to share it with the world. But what’s the best way to deploy your app in 2017? This talk will demonstrate popular techniques for deploying Python web applications. We’ll start with a simple Flask application and expose it to the world five times over as we learn to use different tools and services available to the modern Python developer. Specific topics covered include: Exposing your local dev environment with ngrok, Using a Platform-as-a-Service (PaaS) like Heroku, Going “serverless” with AWS Lambda, Configuring your own VM with Google Compute Engine, Thinking inside the box using Docker. We’ll also briefly touch on the pros and cons of each technique to help you figure out which one is right for your app. At the end of this talk you will have a basic understanding of how each of these techniques works and you’ll be ready to try them out yourself.

Speakers

Andrew T. Baker
Andrew Baker is a Developer Educator at Twilio where he’s on a mission to change the way developers learn how to use new APIs. Equal parts Python developer and educator, he loves writing well tested code and running hands on workshops. Andrew found his calling focusing on Twilio's documentation, where he is equally upset spotting errant semicolons in Python code or tutorial prose.

Portland Ballroom 252-253

Friday, 19th May, 15:15 - 15:45

  • Talks


Leveraging Serverless Architecture for Powerful Data Pipelines
Leveraging Serverless Architecture for Powerful Data Pipelines (17:40 - 17:40)

About the session

Serverless architectures that allow us to run Python functions in the cloud in an event-driven, parallel fashion can be used to create extremely dynamic and powerful data pipelines for use in ETL and data science. Join me for an exploration of how to build data pipelines on Amazon Web Services Lambda with Python. We'll cover a simple introduction to event-driven programming. Then, we'll walk through building an example pipeline while discussing some of the frameworks and tools that can make building your pipeline easier. Finally, we'll discuss how to maintain observability on your pipeline to ensure proper performance and troubleshooting information.
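
To give a feel for the event-driven style described above, here is a minimal sketch of a Lambda-style handler. It assumes the function is triggered by an S3 "object created" notification, whose payload lists affected objects under `Records`. The helper name `extract_s3_objects` and the shape of the return value are illustrative choices, not part of the speaker's pipeline.

```python
def extract_s3_objects(event):
    """Return (bucket, key) pairs from an S3-style notification event."""
    return [
        (record["s3"]["bucket"]["name"], record["s3"]["object"]["key"])
        for record in event.get("Records", [])
    ]


def handler(event, context):
    """Lambda-style entry point: invoked once per incoming event."""
    objects = extract_s3_objects(event)
    # A real pipeline step would read each object, transform it, and write the
    # result somewhere that triggers the next step; here we only report it.
    return {"processed": len(objects), "keys": [key for _bucket, key in objects]}
```

Because the handler is a plain function of its event, it can be exercised locally with a hand-built dictionary before being deployed.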

Speakers

Jason Myers
Jason works at Juice Analytics as a Software Engineer after spending 15 years as a systems architect and building datacenters and cloud architectures for several of the largest tech companies, hospitals, stadiums, and telecomm providers. He's a passionate developer who regularly speaks at local and national events about technology. He's also the chair of the PyTennessee conference. He's also the author of Essential SQLAlchemy 2nd via O'Reilly Media, Introduction to Relational Databases via Data Camp, and the upcoming Serverless Data Pipelines with Python via Manning Press.

Portland Ballroom 252-253

Friday, 19th May, 17:40 - 17:40

  • Talks


Portland Ballroom 252–253
Big picture software testing: unit testing, Lean Startup, and everything in-between
Big picture software testing: unit testing, Lean Startup, and everything in-between (11:30 - 12:00)

About the session

There are many ways you can test your software: unit testing, manual testing, end-to-end testing, and so forth. Take a step back and you'll discover even more forms of testing, many of them very different in their goals: A/B testing, say, where you see which of two versions of your website results in more signups or ad clicks. How do these forms of testing differ, and how do they relate to each other? How do you choose which kind of testing to pursue, given limited time and resources? How do you deal with strongly held yet opposite views arguing either that a particular kind of testing is essential or that it's a waste of time? This talk will provide you with a model, a way to organize all forms of testing and understand what exactly they provide, and why. Once you understand the model you will be able to choose the right form of testing for your situation and goals.

Speakers

Itamar Turner-Trauring
Itamar Turner-Trauring has been attending Python conferences since 2000. Currently he works at Datawire, building infrastructure for microservices. You can read his blog at https://codewithoutrules.com.

Portland Ballroom 252–253

Friday, 19th May, 11:30 - 12:00

  • Talks


Portland Ballroom 252–253
Dask: A Pythonic Distributed Data Science Framework
Dask: A Pythonic Distributed Data Science Framework (16:15 - 17:00)

About the session

Dask is a general purpose parallel computing system capable of Celery-like task scheduling, Spark-like big data computing, and Numpy/Pandas/Scikit-learn level complex algorithms, written in Pure Python. Dask has been adopted by the PyData community as a Big Data solution. This talk focuses on the distributed task scheduler that powers Dask when running on a cluster. We'll focus on how we built a Big Data computing system using the Python networking stack (Tornado/AsyncIO) in service of its data science stack (NumPy/Pandas/Scikit Learn). Additionally we'll talk about the challenges of effective task scheduling in a data science context (data locality, resilience, load balancing) and how we manage this dynamically with aggressive measurement and dynamic scheduling heuristics.
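
The task scheduling discussed above operates on task graphs. As I understand Dask's graph format, a graph is a plain dictionary that maps keys either to literal values or to tuples of a callable followed by its arguments, where arguments may name other keys. The sketch below evaluates such a graph recursively in a single thread. It is a toy stand-in meant to show the data structure, not Dask's distributed scheduler, and the helper names are hypothetical.

```python
from operator import add, mul


def get(graph, key, cache=None):
    """Evaluate `key` in a Dask-style task graph by resolving dependencies first."""
    cache = {} if cache is None else cache
    if key in cache:
        return cache[key]
    task = graph[key]
    if isinstance(task, tuple) and task and callable(task[0]):
        func, *args = task
        # Arguments that name other keys in the graph are computed first.
        resolved = [get(graph, a, cache) if _is_key(graph, a) else a for a in args]
        result = func(*resolved)
    else:
        result = task  # a literal value
    cache[key] = result
    return result


def _is_key(graph, value):
    """Return True if `value` is a key of the graph; unhashable values never are."""
    try:
        return value in graph
    except TypeError:
        return False


graph = {
    "x": 1,
    "y": 2,
    "z": (add, "x", "y"),
    "w": (mul, "z", 10),
}
```

A real scheduler would additionally decide where each task runs, which is where the data locality, resilience, and load balancing concerns from the abstract come in.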

Speakers

Matthew Rocklin
Matthew is a full time open source developer at Continuum Analytics where he builds Python tools for parallel data analysis.

Portland Ballroom 252–253

Friday, 19th May, 16:15 - 17:00

  • Talks


Portland Ballroom 254-255
Title Available On Request: An Introduction to Lazy Evaluation
Title Available On Request: An Introduction to Lazy Evaluation (10:50 - 11:20)

About the session

Lazy evaluation, also known as 'call by need', is an evaluation strategy where values are produced only when needed. Lazy evaluation is the opposite of eager evaluation, Python's normal evaluation model, where functions are executed as seen and values are produced immediately. In this talk we will define lazy evaluation and contrast it with eager evaluation. We will discuss tools that exist in Python for using lazy evaluation and show how we can build on the primitives to better represent computations. We will introduce common vocabulary for discussing evaluation models, and compare different systems for implementing lazy evaluation. Finally, we will discuss optimizations that can be applied to lazily evaluated expressions.
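
The contrast between eager and lazy evaluation can be sketched with two tools: a hand-written thunk that delays a call until its value is requested, and a generator expression, which is one of Python's built-in lazy primitives. The `Thunk` class and the `expensive` function below are illustrative names, not an API from the talk.

```python
import itertools


class Thunk:
    """A delayed computation that runs at most once, when its value is requested."""

    def __init__(self, func, *args):
        self._func = func
        self._args = args
        self._evaluated = False
        self._value = None

    def force(self):
        if not self._evaluated:
            self._value = self._func(*self._args)
            self._evaluated = True
            # Drop references so the inputs can be garbage collected.
            self._func = self._args = None
        return self._value


calls = []


def expensive(x):
    calls.append(x)  # record each real evaluation
    return x * x


lazy_value = Thunk(expensive, 12)  # nothing has run yet

# Squares are produced on demand, so an infinite source is fine.
squares = (n * n for n in itertools.count(1))
first_three = list(itertools.islice(squares, 3))
```

Calling `lazy_value.force()` a second time returns the cached result without invoking `expensive` again, which is the "need" in call by need.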

Speakers

Joe Jevnik
Joe Jevnik works at Quantopian, where he integrates data from various sources into the platform. Joe works on Zipline, Quantopian's open source backtester. He also works on the Blaze ecosystem, mainly on blaze core, odo, and datashape.

Portland Ballroom 254-255

Friday, 19th May, 10:50 - 11:20

  • Talks


Dial M For Mentor
Dial M For Mentor (13:55 - 14:25)

About the session

One of the nicest things about the Python community is the availability of mentors willing to help you. Various mentors have helped me navigate the open source community and helped advance my skills. I realized finding a mentor is not as easy as it seems, and it takes a lot of courage to reach out in the first place. And then, there is impostor syndrome, where one may feel like they don’t deserve the help. In this talk, I will provide advice about working with a mentor. Asking for help is not a failure.

Speakers

Mariatta Wijaya
Mariatta is a new Python core developer. She works as a Software Engineer in Vancouver, where she also helps organize Vancouver PyLadies. She's credited in three movies: Storks, The Angry Birds Movie, and Hotel Transylvania 2.

Portland Ballroom 254-255

Friday, 19th May, 13:55 - 14:25

  • Talks


Rants and Ruminations From A Job Applicant After 💯 CS Job Interviews in Silicon Valley
Rants and Ruminations From A Job Applicant After 💯 CS Job Interviews in Silicon Valley (14:35 - 15:05)

About the session

What is it like to interview at 1 technology company? Stressful and tiring. What is it like to interview at 100 technology companies? I have done that. In late August 2016, I quit an uninspiring full-time software job and talked to 100 employers in the San Francisco Bay Area to find the best fit. The hiring process reflects the company culture and its values. Listen to my rants and ruminations on interviewing everywhere from tiny seed-stage startups to large technology companies in Silicon Valley. Learn how to reform your own hiring process to be more considerate and thoughtful. Learn how to prepare for interviews efficiently.

Speakers

Susan Tan
Susan is a software engineer in San Francisco. She was a web applications engineer at Cisco via the acquisition of Piston, a cloud computing startup located in San Francisco. Prior to Piston, she was a web applications engineer at Rotten Tomatoes the movie review website. She likes to use Python-based web frameworks. She’s a core committer of a Django-based web application project at www.openhatch.org. Susan loves to drink warm cups of oolong tea while coding.

Portland Ballroom 254-255

Friday, 19th May, 14:35 - 15:05

  • Talks


Constructive Code Review
Constructive Code Review (15:15 - 16:00)

About the session

“Your code is bad and you are bad. Have a bad day.” Too many code reviews feel like this, and it saps the enthusiasm that drives open source. Instead, let’s explore how to give reviews that are truthful but encouraging, boosting the skill level of contributors and the quality of the project. We’ll look at “tact hacks” that nudge communication in a friendly direction, antipatterns to avoid, the pesky human emotions that can tempt us into reviewing poorly, and techniques for leveling up newcomers without losing all your coding time.

Speakers

Erik Rose
Erik Rose coordinates the impact of 108 spring-loaded buttons at Mozilla, venting a byproduct of static analysis, search, and pattern-finding software. By day, he leads the development of DXR, a comprehension tool for large codebases; by night, a bevy of eclectic Python libraries. Skeletons in his closet include the self-bootstrapping mechanism for Let's Encrypt, the hash-verification functionality in pip, and a book about Plone. He is a frequent PyCon speaker and has keynoted PyCon Argentina and Bangalore's The Fifth Elephant.

Portland Ballroom 254-255

Friday, 19th May, 15:15 - 16:00

  • Talks


Re-Programming the Human Genome with Python
Re-Programming the Human Genome with Python (16:30 - 17:00)

About the session

Modern genome editing techniques such as CRISPR-Cas9 are revolutionizing the way we discover and treat the root genetic causes of disease. Many of the most popular tools and libraries in this cutting edge application are written in Python. This talk will provide a general, software-centric introduction to the exciting new area of genome editing, describe the central string search, machine learning, and data management problems involved, and review how Python frameworks and libraries are used today to solve these problems in Production in order to benefit human health. This talk assumes no prior lab experience: only a proficiency with Python and curiosity!
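
The string search problem mentioned above can be illustrated with a toy scan for candidate Cas9 target sites. The sketch assumes the commonly used SpCas9 convention of a 20-nucleotide guide immediately followed by an NGG PAM, and it looks at the forward strand only. Real tools also scan the reverse complement and score off-target matches. The function name is hypothetical.

```python
import re


def find_guide_sites(sequence, guide_length=20):
    """Return (position, guide, pam) for each forward-strand site followed by an NGG PAM."""
    sequence = sequence.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % guide_length)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(sequence)]
```

For example, `find_guide_sites("A" * 20 + "TGG")` reports a single site at position 0 with PAM `TGG`.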

Speakers

Riley Doyle
Riley is the principal architect of the DESKGEN genome editing platform, its underlying sequence storage and search engine, and inventor of the patent-pending AutoClone search algorithm. Currently his focus is applying CRISPR, machine learning, data engineering, and automated control systems to cell and gene therapy. Prior to Desktop Genetics, Riley was a BioEngineer at Genentech, Inc. Riley is interested in the business impact of synthetic biology, biotech information and automation systems on the world of science and business. He’s currently working on finding biotech’s next “killer app.” Mostly he just sends emails.

Portland Ballroom 254-255

Friday, 19th May, 16:30 - 17:00

  • Talks


The Wild West of Data Wrangling
The Wild West of Data Wrangling (17:10 - 17:40)

About the session

Data science introductory courses might give you the impression that dealing with data is neat, tidy, and simple. They present you with a simplistic dataset and the scikit-learn or Pandas documentation, and a day or so later, you're done! Piece of cake, right? The real world of data isn't that easy! As a data scientist who has worked in the industry for several years, I have had a lot of experience dealing with messy, inaccurate, incomplete data, and I want to share those experiences with you. I'll talk my way through three real-world situations where I've had to analyze and build models on untidy and complex data, going through how I've preprocessed the data and prepared it for modeling. You'll leave with an understanding of how a data scientist thinks about data and what she does when the data is complicated.

Speakers

Sarah Guido
Sarah is a Senior Data Scientist at Mashable where she studies user behavior through data. She is the chair of the Machine Learning/Artificial Intelligence track at the 2017 SciPy Conference and is an accomplished conference speaker. She is also an O'Reilly Media author, having co-authored Introduction to Machine Learning with Python. Community involvement is very important to Sarah, and she is a co-organizer of the NYC Python Meetup, the largest Python meetup in the world. Sarah attended graduate school at the University of Michigan's School of Information.

Portland Ballroom 254-255

Friday, 19th May, 17:10 - 17:40

  • Talks


Portland Ballroom 254–255
Immutable Programming - Writing Functional Python
Immutable Programming - Writing Functional Python (11:30 - 12:00)

About the session

The world of Haskell and functional programming may seem like a distant place to many working Python developers, but some of the techniques used there are remarkably useful when developing in Python. In this talk, I will cover some of the pitfalls of mutability that you may run into while writing Python programs, and some tools and techniques that Python has built in that will let you avoid them. You'll see namedtuples, enums and properties, and also some patterns for structuring immutable programs that will make them easier to build, extend, and test.
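As a minimal sketch of the built-in tools the abstract names (namedtuples, enums, and properties), and not material taken from the talk itself, the following shows one way to model values without mutating them. The names `Suit`, `Card`, and `Hand` are hypothetical and chosen only for illustration.

```python
from collections import namedtuple
from enum import Enum


class Suit(Enum):
    """A fixed set of values; members cannot be reassigned."""
    HEARTS = "hearts"
    SPADES = "spades"


# A namedtuple is an immutable record: its fields are read-only.
Card = namedtuple("Card", ["rank", "suit"])


class Hand:
    """Wraps a tuple of cards and exposes it through a read-only property."""

    def __init__(self, cards):
        self._cards = tuple(cards)

    @property
    def cards(self):
        return self._cards

    def with_card(self, card):
        # Return a new Hand instead of mutating this one.
        return Hand(self._cards + (card,))


ace = Card(rank="A", suit=Suit.SPADES)
king = ace._replace(rank="K")  # builds a new Card; `ace` is unchanged

empty = Hand([])
one_card = empty.with_card(ace)
```

Assigning to `ace.rank` or `one_card.cards` raises `AttributeError`, so any change has to be expressed as a new value.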

Speakers

Calen Pennington
Cale is a lead architect at edX.org. Python engineer by day, Haskell hacker by night, he loves to use just the right amount of magic to clarify his designs. When not programming, he plays boardgames and parents a very active toddler.

Portland Ballroom 254–255

Friday, 19th May, 11:30 - 12:00

  • Talks


Introduction to Threat Modeling
Introduction to Threat Modeling (12:10 - 12:55)

About the session

Are you a website or application developer? Are you worried about security? Don’t know what you need to know, and what you can safely leave to the experts? Come learn about how to analyze your application’s design for potential security flaws, how to think like a security engineer, and see some of the most common pitfalls that programs fall victim to. In this talk we will work through the process of threat modeling - understanding how your system might get attacked, what its weak points are, and how to defend it.

Speakers

Ying Li
Software developer, mainly in Python and Go. Varied interests, including security.

Portland Ballroom 254–255

Friday, 19th May, 12:10 - 12:55

  • Talks


Room 3
Snek in the Browser
Snek in the Browser (14:35 - 15:05)

About the session

Python is a decades-strong language with a large community, and it has a solid foundation on the server, but it doesn't have a good user story in the browser... until now. The BeeWare project aims to bring Python natively, everywhere. Using a combination of the Batavia and Toga projects, we can develop an entirely native web experience in Python, no JavaScript required. During this talk, you will learn about how the BeeWare project has built Batavia, a Python virtual machine in JavaScript; and Toga, a multi-platform native API wrapper; a combination of which can be used to build an entire web platform in Python only.

Speakers

Katie McLaughlin
Katie has worn many different hats over the years. She has previously been a software developer for many languages, systems administrator for multiple operating systems, and speaker on many different topics. She's currently an operations engineer, core contributor for the BeeWare project, DjangoCon AU organiser, and former council member of Linux Australia. When she's not changing the world, she enjoys making tapestries, cooking, and seeing just how well various application stacks handle emoji.

Room 3

Friday, 19th May, 14:35 - 15:05

  • Talks


Saturday, 20th May
 
Oregon Ballroom 201-202
Debugging in Python 3.6: Better, Faster, Stronger
Debugging in Python 3.6: Better, Faster, Stronger (10:50 - 11:20)

About the session

Python 3.6 was released in December of 2016 and it has a lot of cool new features. Some of them are easy to use: a developer can read about f-strings, for example, and start using them right away. But sometimes features are not so evident, and the new frame evaluation API is one of them. The frame evaluation API was introduced to CPython in PEP 523, and it allows specifying a per-interpreter function pointer to handle the evaluation of frames. It might not be evident how to use this new feature in everyday life, but it’s quite easy to understand how to build a fast debugger based on it. In this talk we are going to explain how the standard way of debugging in Python works and how the new frame evaluation API may be useful for creating a fast debugger. We will also consider why such fast debugging was not possible in previous versions of Python. If you haven’t made a final decision to move to Python 3.6, this talk will provide some new reasons to do it.

Speakers

Elizaveta Shashkova
Elizaveta Shashkova is a software developer on the PyCharm IDE at JetBrains. She works on the Python debugger, which is part of PyCharm and PyDev. She is also the author of PyCharm's Concurrency Visualiser.

Oregon Ballroom 201-202

Saturday, 20th May, 10:50 - 11:20

  • Talks


Readability Counts
Readability Counts (11:30 - 12:00)

About the session

Have you found unreadable PEP8-compliant code and wondered how to fix it? Have you ever seen code that was simply a pleasure to read? If you've ever wondered what makes code easy to read, this talk is for you. During this talk we'll learn a number of techniques for refactoring code to improve readability and maintainability. We'll discuss: whitespace, self-documenting code, modularity, expectation management. We'll end with a checklist for improving the readability of your own code.

Speakers

Trey Hunner
Trey Hunner is a Python & Django mentor specializing in on-site corporate training. Trey holds weekly online Python chats, is a director at the PSF, and is heavily involved with his local Python meetup group in San Diego.

Oregon Ballroom 201-202

Saturday, 20th May, 11:30 - 12:00

  • Talks


Static Types for Python
Static Types for Python (12:10 - 12:55)

About the session

Over the past year and a half, Dropbox has been investing in the development of mypy, a static type checker for Python, as a way to make our multimillion-line Python codebase easier to understand, navigate, and maintain. In this talk, we will discuss the benefits of type annotations, explain how to use them, and give a peek into how mypy works behind the scenes. Mypy is an open-source type-checker for Python which supports the PEP 484 standard for gradual typing. Originally created by Jukka Lehtosalo as part of his PhD thesis in 2013, it is now under active development by a small team at Dropbox which includes David Fisher, Greg Price, and Guido van Rossum. It supports Python 3.2 and higher, as well as Python 2.7 (via type comments).
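For readers new to PEP 484, here is a small sketch of what annotated code can look like. It is an illustration only, not an example from the talk, and the function names are hypothetical. The last function uses the comment form that the abstract mentions for Python 2.7.

```python
from typing import Dict, List, Optional


def word_counts(words: List[str]) -> Dict[str, int]:
    counts: Dict[str, int] = {}  # variable annotations need Python 3.6+
    for word in words:
        counts[word] = counts.get(word, 0) + 1
    return counts


def most_common(counts: Dict[str, int]) -> Optional[str]:
    """Return the most frequent word, or None for an empty mapping."""
    if not counts:
        return None
    return max(counts, key=lambda word: counts[word])


# Python 2.7-compatible form: the types live in a comment.
def shout(text):
    # type: (str) -> str
    return text.upper()
```

The annotations are not enforced at run time. A checker such as mypy reads them separately and would be expected to flag a call like `most_common(3)` before the code ever runs.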

Speakers

Jukka Lehtosalo
Jukka Lehtosalo is the creator of mypy, a static type checker for Python. He is also one of the authors of the type hinting notation introduced in Python 3.5 (PEP 484). He enjoys working at Dropbox and helping engineers be productive when working on multi-million-line Python codebases. Hailing from Finland, Jukka currently lives in the UK.

Oregon Ballroom 201-202

Saturday, 20th May, 12:10 - 12:55

  • Talks


Decorators, unwrapped: How do they work?
Decorators, unwrapped: How do they work? (14:35 - 15:05)

About the session

Decorators are a syntactically pleasing way of modifying the behavior of functions in Python. However, they can be highly opaque to Python beginners. It took me a while to learn how to write one, and even after I was confident writing my own decorators, they still felt magical. The goal of this talk is to demystify decorators by methodically stepping through how and why they work. Along the way we'll touch on closures, scopes, and how Python is compiled.
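A short sketch may help make the closure idea concrete. It is not drawn from the talk; `count_calls`, `greet`, and `farewell` are hypothetical names. The inner `wrapper` closes over `func`, and the `@` line is equivalent to rebinding the name by hand.

```python
import functools


def count_calls(func):
    """Decorator that records how many times `func` has been called."""

    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)  # `func` comes from the enclosing scope

    wrapper.calls = 0
    return wrapper


@count_calls
def greet(name):
    return "Hello, " + name


# The @ syntax above is shorthand for rebinding the name by hand:
def farewell(name):
    return "Goodbye, " + name


farewell = count_calls(farewell)
```

After one call to `greet("Ada")`, `greet.calls` is 1, and `greet.__name__` is still `"greet"` thanks to `functools.wraps`.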

Speakers

Katie Silverio
Software engineer at Venmo. Recurse Center Fall 2013. Ask me about text adventures, Markov chains, bytecode, dead code, comics, astrophysics, and Harry Potter.

Oregon Ballroom 201-202

Saturday, 20th May, 14:35 - 15:05

  • Talks


The Dictionary Even Mightier
The Dictionary Even Mightier (15:15 - 16:00)

About the session

Since my “Mighty Dictionary” talk at PyCon 2010, the Python dictionary has evolved dramatically. Come learn about all of the improvements, up to and including the re-architecture that has just landed with Python 3.6! The talk will discuss iterable views, the dictionary’s dedicated comprehension syntax, random key ordering, the special key-sharing dictionary designed to underlie object collections, and, most famously of all, the new “compact dictionary” that cuts dictionary storage substantially, and carries a fascinating side-effect. Each new feature that the talk discusses will be motivated by considering the trade-offs inherent in hash table data structure design, and followed up with hints about how you can now use the dictionary even more effectively in your own code!
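Several of the features listed above can be tried in a few lines. The sketch below is an illustration rather than material from the talk, and it assumes Python 3.7 or later, where insertion order is a language guarantee (in CPython 3.6 it was an implementation detail of the compact dictionary).

```python
# Dict comprehension: keys are inserted in the order 3, 1, 2.
squares = {n: n * n for n in (3, 1, 2)}

# Iterating a dict yields keys in insertion order.
order = list(squares)

# keys() returns a live view, not a copy: it reflects later changes.
keys = squares.keys()
squares[4] = 16
view_sees_new_key = 4 in keys

# Key views support set operations.
other = {2: "two", 5: "five"}
shared = squares.keys() & other.keys()
```

Here `order` is `[3, 1, 2]`, `view_sees_new_key` is `True`, and `shared` is `{2}`.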

Speakers

Brandon Rhodes
Brandon Rhodes works at Dropbox where he strives to deliver a world-class Python development environment to his fellow Dropboxers. His most popular open source projects are astronomy libraries for Python that he has been maintaining since the late 1990s. He speaks at several Python conferences a year, and in 2016–2017 is the Chair of the main worldwide PyCon conference as it is held in Portland.

Oregon Ballroom 201-202

Saturday, 20th May, 15:15 - 16:00

  • Talks


No More Sad Pandas: Optimizing Pandas Code for Speed and Efficiency
No More Sad Pandas: Optimizing Pandas Code for Speed and Efficiency (16:30 - 17:00)

About the session

When I first began working with the Python Pandas library, I was told by an experienced Python engineer: 'Pandas is fine for prototyping a bit of calculations, but it's too slow for any time-sensitive applications.' Over multiple years of working with the Pandas library, I have realized that this is only true if not enough care is put into identifying proper ways to optimize the code's performance. This talk will review some of the most common beginner pitfalls that can cause otherwise perfectly good Pandas code to grind to a screeching halt, and walk through a set of tips and tricks to avoid them. Using a series of examples, we will review the process for identifying the elements of the code that may be causing a slowdown, and discuss a series of optimizations, ranging from good practices of input data storage and reading, to the best methods for avoiding inefficient iterations, to using the power of vectorization to optimize functions for Pandas dataframes.

Speakers

Sofia Heisler
Sofia Heisler is the Lead Data Scientist at Upside Travel, where she develops pricing and product selection algorithms for the travel industry. Previously, she headed up data analytics for a D.C. startup dedicated to connecting small businesses to vendors, as well as performed data analysis on behalf of some of the largest Fortune 500 companies as a Senior Consultant at an economic consulting company. She holds a Master’s degree in Predictive Analytics from Northwestern University, as well as a B.A. and a B.S. in Economics with a concentration in Statistics from the University of Pennsylvania.

Oregon Ballroom 201-202

Saturday, 20th May, 16:30 - 17:00

  • Talks


Tracing, Fast and Slow: Digging into and improving your web service’s performance
Tracing, Fast and Slow: Digging into and improving your web service’s performance (17:10 - 17:40)

About the session

Do you maintain a Rube Goldberg-like service? Perhaps it’s highly distributed? Or you recently walked onto a team with an unfamiliar codebase? Have you noticed your service responds slower than molasses? This talk will walk you through how to pinpoint bottlenecks, approaches and tools to make improvements, and make you seem like the hero! All in a day’s work. The talk will describe various types of tracing a web service, including black & white box tracing, tracing distributed systems, as well as various tools and external services available to measure performance. I’ll also present a few different rabbit holes to dive into when trying to improve your service’s performance.

Speakers

Lynn Root
Lynn Root is a Site Reliability Engineer at Spotify in NYC. She is also a global leader of PyLadies, an international mentorship group for women and friends in the Python community, and the founder & former leader of the San Francisco PyLadies. When her hands are not on a keyboard, they are usually holding a pair of knitting needles.

Oregon Ballroom 201-202

Saturday, 20th May, 17:10 - 17:40

  • Talks


Oregon Ballroom 201–202
Know thy self: Methods and method binding
Know thy self: Methods and method binding (13:55 - 14:25)

About the session

Methods are like functions, but different. How? Why? And what's with having to type 'self' all the time? We'll explore partial application of functions and review why it might be nice to start using classes. Then to clarify how method objects work we'll examine the result of accessing the method attributes of an object without calling them. Understanding the behavior we uncover here will require more attribute lookup experiments, which will lead us to discover the power of descriptors. Along the way we'll peek in at other languages' approaches to method binding, hopefully coming to appreciate the way Python does things enough to type 'self' a few thousand more times.
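The experiment the abstract describes, accessing a method attribute without calling it, can be sketched as follows. This is an illustration and not the speaker's code; the `Greeter` class is hypothetical.

```python
import functools


class Greeter:
    def __init__(self, name):
        self.name = name

    def greet(self, greeting):
        return greeting + ", " + self.name


g = Greeter("Ada")

# Looked up on the class, `greet` is a plain function (Python 3).
plain = Greeter.greet

# Looked up on an instance, it is a bound method that remembers `g`.
bound = g.greet

# Functions are descriptors: attribute lookup calls __get__ to do the binding.
manual = Greeter.__dict__["greet"].__get__(g, Greeter)

# Binding resembles partial application of the first argument.
partial = functools.partial(Greeter.greet, g)
```

All of `plain(g, "Hi")`, `bound("Hi")`, `manual("Hi")`, and `partial("Hi")` return `"Hi, Ada"`; the bound method keeps its instance in `bound.__self__` and the underlying function in `bound.__func__`.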

Speakers

Thomas Ballinger
Tom is a Python enthusiast employed by Dropbox. Previously he worked as a facilitator at the Recurse Center in NYC where he explored dark corners of Python with participants and helped bring more BitTorrent clients into the world.

Oregon Ballroom 201–202

Saturday, 20th May, 13:55 - 14:25

  • Talks


Oregon Ballroom 203-204
It's time for datetime
It's time for datetime (10:50 - 11:20)

About the session

Working with time is not a trivial challenge. Python includes a native module in the standard library to work with it but datetime keeps being together with unicode a common source of errors. This of