<pre>
pre-release: PyConZA meeting announcement

Please take a moment to review your details and reply with OK or edits.
The subject line and everything below it is what will go out, and it will also be used to title the videos.

Subject: 
ANN: PyConZA at Cedarwood Wed October 10, 9 AM


PyConZA
=========================
When: 9 AM Wednesday October 10, 2018
Where: Cedarwood

https://za.pycon.org/

Topics
------
1. Hello Types
Sheena O'Connell
tags: Other
# Introduction
Lately there has been a drive toward adding static typing to dynamically typed languages. Clojure, JavaScript and Lua are some examples of this. And of course Python. Why? Because it's a damn fine idea! Having a variable and knowing what it *is* and how it behaves and being able to enforce that in your code is golden.

In this tutorial session we'll briefly cover some of the pros and cons of Python type checking (mostly the pros to be honest) and then get our hands dirty. We'll start off with writing some fresh code and adding Type annotations as we go. We'll then cover a bit of how to add type checking to pre-existing codebases.

# Prerequisite skillz: 
Basic knowledge of Python. If you can run your code and you are comfortable with the syntax around variables, functions, classes and loops then you should be quite comfortable. It would also be useful if you know how to drive a virtual environment but this is not strictly necessary.

# Prerequisite tech:
We'll be working in both Python2 and 3. You'll need to bring a laptop with the following things installed:

Most of this tutorial will be in Python3.6+:

- Python3.6+
- python3 -m pip install mypy
- python3 -m pip install typing-extensions

We will also cover type checking in Python2.7:

- Python2.7
- python -m pip install typing

You are welcome to use virtual environments or pipenv.

# What we'll cover
- What is type checking and why should we use it?
- Python3 type annotations
- Python2 type annotations
- Adding annotations to existing projects
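
As a rough illustration of the two annotation styles listed above (a minimal sketch, not material from the tutorial itself; the function names `mean` and `mean_py2` are made up for this example), Python 3 puts annotations in the signature, while Python 2.7 compatible code carries them in a type comment that mypy can read:

```python
from typing import List, Optional


def mean(values: List[float]) -> Optional[float]:
    """Python 3 style: annotations live in the signature."""
    if not values:
        return None
    return sum(values) / len(values)


def mean_py2(values):
    # type: (List[float]) -> Optional[float]
    """Python 2.7 compatible style: the signature goes in a type comment."""
    if not values:
        return None
    return sum(values) / len(values)
```

Annotations are not enforced at runtime; a checker such as mypy is what would flag a call like `mean("abc")`.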

# By the end of this you...
Should be able to consistently write saner, more robust and more explicit code, which is always a worthwhile goal.
 recording release: yes license: CC BY  

2. Introduction to Python for Data Science, Part 1
Andrew Collier
tags: PyData
**This is the first half of the 2-session tutorial**

Python is a popular platform for doing Data Science. The two dominant libraries, <a href="https://pandas.pydata.org/">pandas</a> and <a href="http://scikit-learn.org/stable/">sklearn</a>, provide extensive functionality for data preparation, data manipulation and Machine Learning. This workshop will provide an introduction to using these libraries.

Specifically we’ll cover the following topics:

<ul>
    <li>What is Data Science?</li>
    <li>Grabbing data from various sources</li>
    <li>Working with <code>Series</code> and <code>DataFrame</code> objects</li>
    <li>Dealing with funky data (missing data and outliers)</li>
    <li>Overview of Machine Learning</li>
    <li>Keeping it simple using Nearest Neighbours</li>
    <li>Capturing a trend: <code>LinearRegression</code></li>
    <li>Predicting categories: <code>DecisionTreeClassifier</code></li>
    <li>Binary outcomes: <code>LogisticRegression</code></li>
    <li>Using <code>Pipeline</code> to streamline your workflow</li>
    <li>Cross Validation</li>
</ul>

The workshop will be intensely hands on, so you will definitely need a laptop. Instructions for getting everything set up will be provided prior to the workshop.

No prior knowledge of Data Science or Machine Learning is assumed, although it will be helpful if you have worked with a spreadsheet before and are moderately competent with basic Python.

We will work with a diverse selection of data sets and perform a variety of analyses. Along the way we’ll build and submit an entry to a <a href="https://www.kaggle.com/">Kaggle</a> competition. By the end of the day you will be functionally competent to venture forth on your own Data Science projects.

## Setup instructions

Please ensure that you have the following installed and tested:

- Python 3
- Jupyter
- Modules: numpy, pandas, scipy, matplotlib and sklearn.

Two easy ways to get all of the above are:

- install Anaconda or
- use the datawookie/jupyterhub Docker image.
 recording release: yes license: CC BY  

3. Introduction to Python for Data Science, Part 2
Andrew Collier
tags: PyData
**This is the second half of the 2-session tutorial.**

Python is a popular platform for doing Data Science. The two dominant libraries, <a href="https://pandas.pydata.org/">pandas</a> and <a href="http://scikit-learn.org/stable/">sklearn</a>, provide extensive functionality for data preparation, data manipulation and Machine Learning. This workshop will provide an introduction to using these libraries.

Specifically we’ll cover the following topics:

<ul>
    <li>What is Data Science?</li>
    <li>Grabbing data from various sources</li>
    <li>Working with <code>Series</code> and <code>DataFrame</code> objects</li>
    <li>Dealing with funky data (missing data and outliers)</li>
    <li>Overview of Machine Learning</li>
    <li>Keeping it simple using Nearest Neighbours</li>
    <li>Capturing a trend: <code>LinearRegression</code></li>
    <li>Predicting categories: <code>DecisionTreeClassifier</code></li>
    <li>Binary outcomes: <code>LogisticRegression</code></li>
    <li>Using <code>Pipeline</code> to streamline your workflow</li>
    <li>Cross Validation</li>
</ul>

The workshop will be intensely hands on, so you will definitely need a laptop. Instructions for getting everything set up will be provided prior to the workshop.

No prior knowledge of Data Science or Machine Learning is assumed, although it will be helpful if you have worked with a spreadsheet before and are moderately competent with basic Python.

We will work with a diverse selection of data sets and perform a variety of analyses. Along the way we’ll build and submit an entry to a <a href="https://www.kaggle.com/">Kaggle</a> competition. By the end of the day you will be functionally competent to venture forth on your own Data Science projects.
 recording release: yes license: CC BY  

4. Building Web Mapping Applications using GeoDjango and other FOSS GIS
Christian Christelis
tags: Web
GeoDjango is a web application development framework, extending the Django project to include support for GeoSpatial web application development. With GeoDjango you can create web-enabled forms that capture both text-based data and geographical data (e.g. polygons / lines / points).

The Django framework makes use of the model/view/controller (MVC) design pattern (which we will explain) to allow you to build a clean application architecture. Django also provides all the infrastructure to do object-relational mapping (ORM). ORM is used to model your data structures in a database backend and automatically save and retrieve objects from the database as they are needed. There are many other great features of Django which we will try to give you a flavour of during this tutorial course.

Through the tutorial we will build a simple Django application that integrates map services. To get the most out of this tutorial it would be good if you have had some exposure to Django.
 recording release: yes license: CC BY  

5. Distributed microservices in the real world
Imraan Parker
tags: Web
This talk is intended for anyone interested in deploying microservices in their environment. It examines what is involved in developing, deploying and maintaining a distributed microservices architecture. Moving from a monolithic architecture to a services oriented architecture has many benefits and tradeoffs that need to be addressed.

One has to consider whether to go with an off-the-shelf solution or build your own. When does it make sense to do the latter? Tasks like monitoring and debugging are more difficult with the added complexity that is inherent in a distributed system.

The first part of the talk reviews what needs to be considered when deciding to build applications on a distributed microservices architecture. Topics that will be covered include:
<ul>
    <li>Why choose a microservices architecture, and when not to?</li>
    <li>Objectives of a distributed services architecture</li>
    <ul>
        <li>Service presence</li>
        <li>Heartbeating</li>
        <li>Logging</li>
        <li>Security</li>
        <li>Reliability</li>
        <li>Performance and Scalability</li>
        <li>Monitoring</li>
        <li>Troubleshooting</li>
        <li>Message contracts</li>
    </ul>
</ul>

The second part of the talk delves into the microservices framework built and used by CareerJunction. It is an in-house framework written in Python 3 (3.5+) which uses asyncio for nonblocking I/O and ZeroMQ as a concurrency networking library.

In simple terms, the framework implements a reliable service-oriented request-reply dialog between a set of client applications, a set of brokers and a set of worker applications. 

The features of the framework include those discussed in the first part of the talk and will be showcased by coding and deploying a service. Above and beyond that, the following topics will be covered:
<ul>
    <li>Conceptual architecture</li>
    <li>Architecture goals</li>
    <li>Coding a service</li>
        <ul>
            <li>Creating and running a service</li>
            <li>Message contract parameters</li>
            <li>Exposing services via HTTP</li>
            <li>Scheduled Jobs</li>
        </ul>
</ul>

The last part of the talk examines the lessons learnt over the past few years, what to avoid and the benefits it brought not only to the IT team, but to the business as a whole.
Topics that will be covered include:
<ul>
    <li>IT benefits</li>
    <li>Business benefits</li>
    <li>What to avoid?</li>
</ul>
 recording release: yes license: CC BY  

6. Playing with Python's internals
Alex Hall
tags: Other
This talk will look at two of my libraries which stretch the limits of what's possible with Python:

1. [birdseye](https://github.com/alexmojaki/birdseye), a debugger that records the value of every expression for easy viewing, and
2. [sorcery](https://github.com/alexmojaki/sorcery), a framework for writing magical functions which know the context in which they are called.

They work by inspecting and manipulating Python's inner workings: execution frames, code objects, and most importantly the Abstract Syntax Tree (AST). I will give an overview of these concepts and explain how some parts of the libraries work.
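
To give a flavour of the three concepts named above, here is a minimal sketch using only the standard library. It is not how birdseye or sorcery are implemented; the helper names `caller_locals`, `names_used` and `demo` are invented for this illustration.

```python
import ast
import inspect


def caller_locals():
    """Return a copy of the local variables of whichever function called us."""
    frame = inspect.currentframe().f_back  # the caller's execution frame
    try:
        return dict(frame.f_locals)
    finally:
        del frame  # drop the frame reference to avoid a reference cycle


def names_used(expression):
    """Parse an expression into an AST and list the variable names in it."""
    tree = ast.parse(expression, mode="eval")
    return sorted({n.id for n in ast.walk(tree) if isinstance(n, ast.Name)})


def demo():
    x, y = 2, 5
    return caller_locals()


print(demo())                     # {'x': 2, 'y': 5}
print(names_used("x * (y + x)"))  # ['x', 'y']
print(demo.__code__.co_varnames)  # ('x', 'y'), read from the code object
```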

This is for people interested in peeking under the hood of Python from within Python, i.e. no C and no messing with the interpreter.
 recording release: yes license: CC BY  

7. Python as a tool for e-health systems
Diana Pholo
tags: PyData
E-health has proven to have many benefits including reduced errors in medical diagnosis.
A number of machine learning (ML) techniques have been applied in medical diagnosis, each
having its benefits and disadvantages.
With its powerful pre-built libraries, Python is great for implementing machine learning in the medical field, where many people do not have an Artificial Intelligence background.

This talk will focus on applying ML on medical datasets using Scikit-learn, a Python module that comes packed with various machine learning algorithms. It will be structured as follows:

<ul>
    <li>An introduction to e-health.</li>
    <li>Types of medical data.</li>
    <li>Some Benchmark algorithms used in medical diagnosis: Decision trees, K-Nearest Neighbours, Naive Bayes and Support Vector Machines.</li>
    <li>How to implement benchmark algorithms using Scikit-learn.</li>
    <li>Performance evaluation metrics used in e-health.</li>
</ul>

This talk is aimed at people interested in real-life applications of machine learning using Python. Although centered around ML in medicine, the acquired skills can be extended to other fields.

About the speaker: Diana Pholo is a PhD student and lecturer  in the department of Computer Systems Engineering, at the Tshwane University of Technology.
 recording release: yes license: CC BY  

8. Elementary, my dear Python
Erin Versfeld
tags: Web
Sherlock is a diagnostics framework used to assess fleet health within Oracle Public Cloud. It is developed in Python, and operates within a restricted Python environment, but has been designed to overcome the challenges of our enterprise environment.

This talk will provide an overview of the restricted Python environment in which Sherlock operates and discuss how Sherlock has leveraged features of Python to maintain its independence. It will highlight the design challenges that were faced to ensure that the framework could be robust and lightweight. As part of the talk, I will showcase how we've utilised Python's strengths to provide a framework which allows developers from vastly different teams to write diagnostic scripts that easily assess service health across a global fleet.
 recording release: yes license: CC BY  

9. Bring Django Girls Workshop to Mozambique.
Cecilia Tivir, Carina Matimbe
tags: Other
In Mozambique, as well as in other African countries, ICT is still seen as being exclusively for boys. Social norms do not teach women to choose coding or other ICT areas.

Bringing the Django Girls Workshop to Maputo was meant to teach girls more than how to code and create amazing blogs; we also wanted to use it as a platform to create opportunities that would empower them and promote diversity in technology.

Today we have started to hold small and informal meetings to teach girls and boys how to program in Python. Some of the girls who have participated in Django Girls Maputo workshops have continued to learn more and develop their skills. We hope to have more girls learn how to code in Python so they can also share their experiences.

Our talk will cover the importance of bringing the Django Girls community to girls in Mozambique, and how powerful it was, is and will be in helping them demonstrate their abilities, skills and belief in their capabilities.

The message we intend to convey to the audience is that Django Girls is not only for some women, but for all the women we can reach. Encouraging more women to strive for programming careers benefits the tech community at large. It is also a way to give them tools and show them how they can empower themselves and contribute to technological development, and so better the societies they live in.

One of the most amazing pieces of feedback that we got was: "In this event it is possible to realize that anyone is capable of developing technological tools. Django Girls has come to say that women are capable of programming, coding and creating technological solutions".
 recording release: yes license: CC BY  

10. Reproducible Data Science with Docker
Richard Ackon
tags: PyData
Collaboration is a major part of doing Data Science. This means Data Scientists are always sharing their work with their colleagues, whether to continue the Data Science process or for review. A problem commonly faced in this process is the "It works on my machine" problem.

Docker is a tool that is used to package and run applications with all their dependencies in an isolated environment. 

In this talk, I'll use Python to analyse some data in Jupyter notebooks and show how Docker can be used to ensure reproducibility of that analysis in a different environment.

This talk will cover:
<ul>
    <li>The basics of the data science workflow</li>
    <li>The basics of Docker</li>
    <li>A demonstration of sharing and reproducing data analysis work in a Jupyter notebook.</li>
</ul>
 recording release: yes license: CC BY  

11. The Developer's Guide to Data Science
Helge Reikeras
tags: Other
The myth of data science holds that you need an army of machine learning PhDs to be able to implement anything impactful with data science. In this talk I will attempt to dispel this myth and show how software developers can skip getting the machine learning PhD and start building awesome software with Data Science and Machine Learning today.

I believe developers are well situated to implement data science projects as they possess the understanding of how the product works, how users like to interact with the product and their opinions are already valued within the business.

To help developers level up their data science skills, I’ll discuss the core concepts behind the most prevalent methods in data science, how the data science process works, how to think like a data scientist, which frameworks and programming languages to choose (surprise: Python!) and how to measure and communicate the value-add of data science projects to the business.

<em>About the speaker: Helge Reikeras is a Data Scientist at Offerzen with over 6 years' experience in practical data science and has also worked as a software developer at various points in his career.
</em>
 recording release: yes license: CC BY  

12. An introduction to concurrent programming with asyncio
Bruce Merry
tags: Other
Concurrent programming is useful any time one needs to deal with multiple concurrent tasks: a server answering requests from multiple clients, a client scraping data from multiple servers, a workflow manager running external processes in a pipeline, and more.

While there are many concurrent programming frameworks for Python, there is one that is included out of the box: asyncio. I will introduce the framework and explain the syntax and APIs. Perhaps more importantly, I will offer practical tips on development with asyncio, such as exception handling, testing, debugging, and integration with existing code.
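
For readers who have not seen the syntax, a minimal sketch of the async/await style follows. It is not taken from the talk; `fetch` is a hypothetical stand-in for an I/O-bound task, with `asyncio.sleep` simulating the wait.

```python
import asyncio


async def fetch(name, delay):
    """Stand-in for an I/O-bound task such as a network request."""
    await asyncio.sleep(delay)  # yields control to the event loop
    return f"{name} done"


async def main():
    # Run the three coroutines concurrently; gather returns the results
    # in argument order, regardless of which coroutine finishes first.
    return await asyncio.gather(
        fetch("a", 0.03),
        fetch("b", 0.01),
        fetch("c", 0.02),
    )


results = asyncio.run(main())  # asyncio.run requires Python 3.7+
print(results)  # ['a done', 'b done', 'c done']
```

The three waits overlap, so the whole run takes roughly as long as the slowest task rather than the sum of all three.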

Attendees will come away with an understanding of why they will want to use asyncio instead of multi-threading, an understanding of the basic concepts, and knowledge of some additional libraries that will help them be productive with asyncio.
 recording release: yes license: CC BY  

13. A Brief Introduction to PyGame Zero
Neil Muller
tags: PyData
PyGame Zero is designed to be a boilerplate free wrapper around PyGame, avoiding the need to manage the PyGame event loop and simplifying the API significantly.

PyGame Zero is designed as an educational tool, but it does not compromise on the ability to create complex games and so it also serves as a nice general purpose introduction to writing graphical games in Python.

In this talk, I will give a brief introduction to PyGame Zero, demonstrate its core functionality and delve into some of the more advanced features it provides.
 recording release: yes license: CC BY  

14. From Idea to Product: Customer Profiling in Apache Zeppelin with PySpark
Sarah Sprich
tags: PyData
Zeppelin is a web-based notebook which enables interactive data analytics on big data.  Data can easily be ingested from a variety of databases and analysis can be performed in Python and PySpark.  Visualisations can be built and displayed together with the code, using Zeppelin’s built-in tool Helium, or Python-specific tools such as Matplotlib and Bokeh.  The web-based interface facilitates easy sharing of results, and collaboration on projects.
 
Developing in Zeppelin has changed the way we approach model development.  We are able to take a project from an idea to a product all within one tool using the following process:
<ol>
    <li>Come up with an idea. Write some notes in a Zeppelin notebook describing how we would like the idea implemented.</li>
    <li>Slowly start fleshing out the idea, with real code, until the solution is built.  This is great to demo, as the code is in bite size chunks, and visualisations can be added directly in.</li> 
    <li>Take the code into production.  It can be scheduled to run directly in Zeppelin with a cron scheduler, or from a tool such as Nifi.  Interactive visualisations can be embedded in a web-based frontend.</li>
</ol>
 
This talk is aimed at data scientists, particularly those working with big data.  We will demonstrate how we have built a catalogue of subscriber attributes based on customer mobile usage and purchase behavior using Zeppelin and PySpark.  These attributes can be used to profile subscribers, and are the starting point for individualised customer engagement.  Anyone who attends this talk will get an introduction to Zeppelin and PySpark and an overview of what can be achieved with these tools.
 recording release: yes license: CC BY  

15. Creative ideas specific to Python
Alex Hall
tags: Other
Come talk about your craziest and most creative ideas that only other Python programmers could understand. Let's talk about decorators and descriptors, meta-classes and magic, reflection and recursion. It can be something you've implemented yourself, something you want to implement, or something you're not even sure is possible but you wish *someone* would implement. Maybe you'll find someone else interested in doing just that.
 recording release: yes license: CC BY  

16. Deploying and Managing Python with Kubernetes
Joannah Nanjekye
tags: Testing
Because of the benefits of containers, many Python applications have recently been containerized. Containers have changed the way we deploy and manage Python applications, allowing us to build, develop, test, and even deploy Python applications on a single system with no upgrade downtime.

Kubernetes is the missing layer that gives us the ability to manage many containers by providing features that enable containers to scale, talk to each other and work in harmony. 

This talk will focus on how Python developers can leverage Kubernetes to manage any sort of Python application, from simple to complex.

The talk will cover:

- The basics of Kubernetes.
- The basics of containerizing Python applications.
- How to run and deploy simple, web and deep learning Python pipelines on Kubernetes.
- How to manage or work with Kubernetes using the Kubernetes Python client.

This talk will cover in summary the topics I talk about in my upcoming book "Deploying and Managing Python with Kubernetes" published by Apress sometime this year.
 recording release: yes license: CC BY  

17. Jupyter Notebooks for Data Science
Ari Ramkilowan
tags: PyData
This talk is intended for beginner and intermediate data scientists/ analysts/ engineers, although I hope that even experienced data scientists can gain something from the talk. 

The talk will focus on using Jupyter notebooks in data science applications. I will discuss the basics of how to get it up and running and the common features like using markup and code in the same notebook, and I will highlight the advantages of working in a notebook rather than a traditional IDE. I will also discuss other features like using code snippets, autocomplete, linting and creating a table of contents. Inserting images and videos into a notebook alongside your code can be a handy way of learning something new. I will end the talk with a look into JupyterLab.

Attendees of this talk will gain an understanding and appreciation of the quick prototyping that is afforded to you when using Jupyter notebooks in your data science pipeline, especially when it comes to exploratory data analysis. I want to be able to showcase all the common features of Jupyter notebooks but also some less known ones, so that everyone attending the talk will learn something.
 recording release: yes license: CC BY  

18. Testing in the wild
Heather Williams
tags: Testing
<strong>Introduction</strong>

This talk is aimed at anyone interested in testing real code with real deadlines. In this talk Test Driven Development (TDD) will be explored along with alternatives to TDD. By the end of this talk participants will have a greater understanding of how to apply one of the tools of testing in the real world and how to ensure they have time in the development cycle to test their code.

<strong>Questions covered</strong>

Almost everyone talks about TDD these days. TDD is taken as the thing to aim for and the one thing you must do as a developer. But does TDD actually work when you are out of the classroom context and in the real world? Are there alternatives to TDD that are better to use? How do you begin to test legacy code so that you can change it safely? What about that new feature that marketing wants ASAP and you have to quickly pull a rabbit out of the hat for? How should one ensure that this new code is added rapidly but with good tests? These and many other questions will be briefly explored in this talk.

<strong>Tools used</strong>

This talk will focus on unit testing and will use <a href="https://pypi.org/project/nose/">Python nose</a> as the test runner. Python nose extends <a href="https://docs.python.org/2/library/unittest.html">unittest</a> and makes it easier to set up, discover and run tests with Python. Some brief examples of tests will be shown using Python 2.7 but the broader concepts apply equally well to Python 3.
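
For readers new to unit testing, a small unittest-style test case might look like the sketch below. It is not one of the talk's examples; `slugify` is a hypothetical function under test, and the suite is run here with unittest's own runner, although a runner such as nose can collect `unittest.TestCase` subclasses like this one too.

```python
import unittest


def slugify(title):
    """Hypothetical function under test: build a URL-friendly slug."""
    return "-".join(title.lower().split())


class SlugifyTests(unittest.TestCase):
    def test_lowercases_and_joins_words(self):
        self.assertEqual(slugify("Testing in the Wild"), "testing-in-the-wild")

    def test_collapses_extra_whitespace(self):
        self.assertEqual(slugify("  two   words "), "two-words")


# Load and run the test case programmatically.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(SlugifyTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

The code runs unchanged on Python 2.7 and Python 3.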
 recording release: yes license: CC BY  

19. Teach kids (7-17) to code with python & CoderDojo
David Campey
tags: Other
For the last 6 years (since PyCon 2012) David has run a weekly class teaching kids to code. See what the first steps look like, and get involved in a dojo!
 recording release: yes license: CC BY  

20. How to deploy your Python Web App on Google Cloud Platform
James Mwai
tags: Web
The goal of this talk is to give a basic understanding of how you can configure and deploy your Python web app on various Google Cloud Platform services. 

We will start by demonstrating how to set up and deploy a Python web app on both Standard and Flexible Google App Engine platforms. After that, we will show how to provision a Linux VM on Google Cloud to run a Python web app. We will then explore how to package our Python web app into a Docker image and deploy the same on Google Kubernetes Engine (GKE). Finally, we will explore Google Cloud Functions and how to write a serverless app in Python.

The audience for this talk is generally Python web developers and devops engineers.
 recording release: yes license: CC BY  

21. Test your Docker images with Python
Jamie Hewland
tags: Testing
As more and more software is packaged in Docker images, it has become increasingly important that the Dockerfiles and scripts that these images are built from are correct. If Docker images are built and deployed as part of an automated pipeline, it is also important that they continue to work as expected when changes are made upstream.

Start testing your Docker images without relying on Bash scripts! We’ll cover why we decided to write a testing library and how to use it. We’ll also talk about some of the test fixtures we developed for common infrastructure such as RabbitMQ and PostgreSQL. Finally, we’ll explore some of the limitations and workarounds of creating a test environment of Docker containers.

Some of the best tools for working with Docker are already written in Python, for example, docker-compose. Bringing together the Python ecosystems around Docker and test frameworks, we created a new Python library called Seaworthy. Seaworthy can be used to verify that a Docker image works as expected in an isolated environment. It provides rich tools for asserting on processes, logs, and HTTP requests.
 recording release: yes license: CC BY  

22. Thursday Lightning Talks
Adrianna Pińska
tags: No Track
Thursday Lightning Talks
========================

* Bruce Merry: "The amazing disappearing import"
* Neil Muller: "The Python Events Calendar"
* Johan Zietsman: "Cython - Writing C integrations to Python"
* Simon Cross: "What do you get if ..."
* Gordon Inggs: "Time series prediction with Facebook Prophet"
 recording release: yes license: CC BY  

23. Python on Azure
Toros Gökkurt
tags: Other
Python is a general purpose programming language which has broad usage areas from web applications to data science. Microsoft Azure is an ever-expanding set of cloud services to help your organization meet your business challenges. It’s the freedom to build, manage, and deploy applications on a massive, global network using your favorite tools and frameworks. In this session you will learn about Azure tools and services for Python developers and data scientists and learn how to build and deploy Python apps in Azure, with a range of apps and data services.

This talk will provide a high-level introduction to the wide range of Azure tools, app and data services for Python developers and data scientists, and will conclude with a demonstration of end-to-end application lifecycle management of a Python app using Azure services and open source Microsoft tools.
 recording release: yes license: CC BY  

24. Batteries Included
Jonatas Baldin
tags: Other
Python is a “batteries included” language, but how many batteries are you using today? It’s so easy to install and import a third-party package that we end up forgetting how vast and valuable the Python Standard Library is. In this talk I’ll be exploring the default Python ecosystem, the one that does not need a pip install to be used. From collections to bisect, everyone with any Python knowledge is welcome to have some fun and learn about the tools you already have in your toolbelt.
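
As a taste of the two modules named above, here is a minimal sketch using only the standard library. The sample data is made up for illustration and is not from the talk.

```python
import bisect
from collections import Counter, defaultdict, deque

words = "the quick brown fox jumps over the lazy dog the end".split()

# Counter: tally hashable items.
counts = Counter(words)
print(counts.most_common(1))  # [('the', 3)]

# defaultdict: group words by length without checking for missing keys.
by_length = defaultdict(list)
for word in words:
    by_length[len(word)].append(word)
print(by_length[5])  # ['quick', 'brown', 'jumps']

# deque: a fixed-size window that drops the oldest item automatically.
recent = deque(maxlen=3)
for word in words:
    recent.append(word)
print(list(recent))  # ['dog', 'the', 'end']

# bisect: keep a list sorted while inserting into it.
scores = [10, 20, 40]
bisect.insort(scores, 30)
print(scores)  # [10, 20, 30, 40]
print(bisect.bisect_left(scores, 25))  # 2
```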
 recording release: yes license: CC BY  

25. So What's the Story?
Kerryn Gammie
tags: PyData
<h2>As the world's data grows, so does its aptitude for AI.</h2> 
In the context of business, however, translating black-box-magic into something more accessible for business-users to engage with is tricky. While this speaks to a larger problem of upskilling and making education more accessible, one method of translation is through storytelling.

I learn best when an <strong>idea is relatable, simple, and colourful</strong>. This talk is going to look at how to convey complex ideas simply.  I'm going to be covering two sections: 

<ul>
    <li>1. That's So Random (Forests)!</li>
    <li>2. You Gotta (Neural) Network to Get Work</li>
</ul>

I'll run through the high-level concepts and methodologies, and then show the work/code that was done to create a random forest and a neural network. Note: this will cover how I built the RF and NN using Python via Jupyter Notebook.

This session is for anyone who uses/wants to use ML to solve problems but struggles with translating the black-box-magic. 
It's going to be an engaging, and slightly animated, talk with the intention of reinforcing concepts and showcasing different ways of explaining them.
 recording release: yes license: CC BY  

26. Python Community Development in East Africa
Linus Wamanya, Kato Joshua, Buwembo Murshid
tags: Other
The talk is about the growth of the Python user community in East Africa, outlining the role the Afrodjango initiative is playing in building and empowering people with Python software development skills in the region.
 
Our audience is Python community developers and accelerators of Python programs.
We want these participants to learn about our activities and how we are helping expand the Python user community across the region.
We will be covering the growth and expansion of the Python community in East Africa.
 recording release: yes license: CC BY  

27. (Re)solving an appliance traffic dilemma with the DNS loophole
Marco Slaviero
tags: Other
Cloud-based components are an all too common speed bump when installing new gear or software. While not an issue in home networks, outbound connections are shunned by default in regulated environments. Enabling communications between the newly installed technology and its cloud service then involves change control requests, committees, firewall admins, and (worst of all) delays... hardly the high-speed future we were promised.

Product builders: *it doesn't have to be this way*. Right now in your network one type of traffic almost certainly can exit your network without restriction: DNS. That VOIP network you think is isolated? Pretty good chance it can resolve DNS.

This is the story of how we grew one of the larger DNS overlay networks around using Python Twisted. We built a secure and reliable channel between thousands of appliances (hardware and virtual) and hundreds of servers, over the inherently unreliable DNS. 
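
As a rough illustration of what carrying data over DNS involves (this is not the speaker's protocol), the sketch below packs a payload into the labels of a query name and unpacks it again. The base32 encoding, the chunking scheme and the `tunnel.example.com` zone are assumptions made for the example; the only hard constraints it relies on are the DNS limits of 63 bytes per label and 253 characters per name.

```python
import base64

MAX_LABEL = 63   # DNS limit: bytes per label
MAX_NAME = 253   # DNS limit: characters in a full name


def encode_query_name(payload: bytes, zone: str = "tunnel.example.com") -> str:
    """Pack payload into DNS labels in front of a (hypothetical) zone."""
    if not payload:
        raise ValueError("payload must not be empty")
    # Base32 survives DNS case-insensitivity; '=' padding is dropped
    # because it is not a valid hostname character.
    text = base64.b32encode(payload).decode("ascii").rstrip("=").lower()
    labels = [text[i:i + MAX_LABEL] for i in range(0, len(text), MAX_LABEL)]
    name = ".".join(labels + [zone])
    if len(name) > MAX_NAME:
        raise ValueError("payload too large for a single query name")
    return name


def decode_query_name(name: str, zone: str = "tunnel.example.com") -> bytes:
    """Recover the payload from a name built by encode_query_name."""
    suffix = "." + zone
    if not name.endswith(suffix):
        raise ValueError("name is not inside the expected zone")
    text = name[: -len(suffix)].replace(".", "").upper()
    text += "=" * (-len(text) % 8)  # restore the base32 padding
    return base64.b32decode(text)
```

A real channel would also need sequencing, retries and authentication on top of this, which is where the unreliability of DNS mentioned above comes in.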

The talk covers designing and building custom network channels in Twisted, Twisted limitations we bumped into, unexpected DNS behaviours, challenges in scaling the channel, and more. Network knowledge is useful but not necessary to follow along, and while we used Twisted, the lessons are applicable in other frameworks too. If you've got a hankering for network code, then this heady mix of network stacks and Python hacks is for you!
 recording release: yes license: CC BY  

28. From Zero to kind of a hero: Getting your Python side project ready for deployment
Sewagodimo Matlapeng
tags: Web
One evening, my little sister asked me for help with her homework at 9pm. She had already asked her classmates on a WhatsApp group, but nobody was able to help as all her peers had the same limited resources available to them. This gave me the idea for Buza (Zulu for “ask”), a platform for high school learners to ask questions that could be answered by volunteer university students in their free time.

Buza is the first side project of mine that I’ve ever truly wanted to deploy. After months of coding and adding loads of features, I finally reached out to a senior developer to help me deploy my truck of features. The first thing she said was “A lot of this code will have to be replaced with production-ready code”. These are words you never want to hear.

University Computer Science equipped me with the ability to write code to solve problems, but in industry extra skills were required to build production-ready software. This talk will share the valuable lessons I learned from getting this Django web app production-ready. This will start with how to <strong>choose the right tools, framework and environment for your project</strong>. I will also cover how to set up <strong>testing and continuous integration for your project</strong>.

<em>What I will cover (Zero to Hero):</em>
<ul>
    <li>No code is the best code</li>
    <li>Framework: Django</li>
    <li>Environment: Pipenv</li>
    <li>Static Tests: Flake8, Mypy (checks types), isort (sorts imports)</li>
    <li>Test Driven Development</li>
    <li>Automated testing: Tox and Travis</li>
    <li>Code Coverage</li>
</ul>
 recording release: yes license: CC BY  

29. Dimensionality reduction - squeezing out the good stuff with PCA
Aabir Abubaker Kar
tags: PyData
Data is often high-dimensional - millions of pixels, frequencies, categories. A lot of this detail is unnecessary for data analysis - but how much exactly? This talk will discuss the ideas and techniques of dimensionality reduction, provide useful mathematical intuition about how it's done, and show you how Netflix uses it to lead you from binge to binge.

In this session, we'll start by remembering what data really is and what it stands for. Data is a structured set of numbers, and these numbers typically (hopefully!) hold some information. This will lead us naturally to the concept of a *high-dimensional dataspace*, the mystical realm in which data lives. It turns out that data in this space displays an extremely useful 'selection bias' - ***a datapoint can be known by the company it keeps***. This is one of the basic ideas behind **k-means clustering**, which we will briefly discuss.

We'll then talk about the *informative-ness* of certain dimensions of the data space over others. This lays the mathematical foundation for the technique of **Principal Component Analysis** (PCA), which we will run on the Netflix movie dataset using *scikit-learn*.
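
As a rough illustration of the idea behind PCA (not the scikit-learn code used in the session), the sketch below finds the first principal component of a small dataset by power iteration on its covariance matrix, using only the standard library. The function names and the toy data are assumptions made for the example.

```python
import math


def column_means(rows):
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]


def first_principal_component(rows, iterations=200):
    """Return the unit vector along which the rows vary the most."""
    n, d = len(rows), len(rows[0])
    means = column_means(rows)
    centred = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[a] * c[b] for c in centred) / (n - 1) for b in range(d)]
           for a in range(d)]
    # Power iteration: repeated multiplication by the covariance matrix
    # converges to its top eigenvector, provided the starting vector is
    # not orthogonal to it (an assumption that holds for the toy data).
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def project(rows, direction):
    """Reduce each row to one number: its centred coordinate along direction."""
    means = column_means(rows)
    return [sum((r[j] - means[j]) * direction[j] for j in range(len(direction)))
            for r in rows]


# Toy data lying close to the line y = 2x.
DATA = [(0, 0.1), (1, 1.9), (2, 4.2), (3, 5.8), (4, 8.1)]
component = first_principal_component(DATA)
reduced = project(DATA, component)
```

Here two dimensions collapse to one with very little information lost, because almost all of the variance lies along a single direction.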

We will also touch upon **t-SNE**, another popular dimensionality-reduction algorithm.

I will be using *scikit-learn* for processing and *matplotlib* for visualization. The purpose of this session is to introduce dimensionality-reduction to those who do not know it, and to provide useful guiding intuitions to those who do. We'll also discuss some seminal use-cases, with tips and warnings for your own applications.
 recording release: yes license: CC BY  

30. Building Python communities in Africa
Linus Wamanya, Kato Joshua, Buwembo Murshid
tags: Other
A discussion about building resilient python communities in Africa
 recording release: yes license: CC BY  

31. Guide to choose right deep Learning framework for your AI project
Rishikesh
tags: PyData
As the world revolves around Artificial Intelligence (AI), demand for AI-based products has seen exponential growth, and so has AI research. Deep learning algorithms and techniques are widely used in the research and development of these products. The good news is that, year by year, deep learning has seen the release of many open source frameworks that ease the pain of developing and implementing these algorithms.

There are many deep learning frameworks out there, which can lead to confusion about which one is better for your task. Choosing a deep learning framework for an AI project is as important as choosing a programming language for a product, and a data science project coupled with the right deep learning framework can greatly amplify overall productivity.

In this talk, I will discuss the common points that help developers understand which framework is the best fit for a given business challenge. We will also look at some of the most widely used frameworks and compare them against standard benchmarks.

The following deep learning/machine learning frameworks will be discussed:<br/>
1. **TensorFlow**<br/>
2. **PyTorch**<br/>
3. **Chainer** and/or **MXNet**<br/>

Key highlights of this talk:

* define key points to judge any deep learning framework.

* hardware dependencies.

* anatomy of widely used open source frameworks.

* comparison of above-mentioned frameworks as per defined key points. 


**Who is the audience?**

Anyone who is inspired to code deep learning algorithms.

**Audience Level**: Beginner to Intermediate
 recording release: yes license: CC BY  

32. Developing good ORMs is HARD!
Nickolas Grigoriadis
tags: Other
As with many people, I was looking for an ORM for `asyncio` Python.

Whilst `asyncio` is a great framework for I/O bound applications, there aren't any mature, recommendable ORMs for it.  
Many attempts to wrap an existing sync Python ORM (such as peewee or SQLAlchemy) by running it in a separate thread, and then synchronising between the event loop and the threads, were abandoned due to a myriad of problems, including performance, correctness and blocking.  
Other ORMs were abandoned before they worked, or had so many layers of abstraction that I feared to touch them.

Then I came across [Tortoise ORM](http://tortoise-orm.readthedocs.io/)  
It had a simple design. (Inspired by the Django ORM syntax)  
It actually worked when I tried it out.

So I decided to jump in, and help with development.

In this talk I'll talk about my experience of being on the development team of an ORM.  
There is a reason there are so few successful ORM projects out there.

Developing good ORMs is **HARD**
 recording release: yes license: CC BY  

33. Two approaches to python web services
Kenneth Goldswain, Matthew French
tags: Web
We discuss two different approaches to building web services using Python. The first and more traditional approach uses Flask to build a web service, while the second approach builds a web service using only native Python 2.7 libraries without dependencies on any additional software.
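
To make the second approach more concrete, here is a minimal sketch of a JSON web service that depends only on the standard library. It is an assumption that something like `wsgiref` fits the bill: the abstract does not say which native modules the speakers use, and the `/status` route is hypothetical. The sketch is written for Python 3, although `wsgiref` is also available in Python 2.7.

```python
import json
from wsgiref.simple_server import make_server


def app(environ, start_response):
    """A tiny JSON web service built only on the standard library."""
    is_status = (environ.get("REQUEST_METHOD") == "GET"
                 and environ.get("PATH_INFO") == "/status")
    if is_status:
        status, payload = "200 OK", {"status": "ok"}
    else:
        status, payload = "404 Not Found", {"error": "not found"}
    body = json.dumps(payload).encode("utf-8")
    start_response(status, [("Content-Type", "application/json"),
                            ("Content-Length", str(len(body)))])
    return [body]


def serve(port=8000):
    """Blocking helper: serve the app on localhost until interrupted."""
    with make_server("127.0.0.1", port, app) as httpd:
        httpd.serve_forever()
```

Because `app` is a plain WSGI callable, it can be exercised in tests without opening a socket, and the same function could later be mounted behind a production WSGI server.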

The talk starts by discussing web services in general before moving onto the different environments our services run in, and what they will be used for. We will briefly cover the code used, but the focus will be on the reason why we need two different approaches and we will compare the risks and benefits of using these two contrasting methods. 

We hope the talk will inspire people to experiment even further with web services, or at least give them insight into new ways they could use web services.
 recording release: yes license: CC BY  

34. Sanic: Async Python (uvloop) with a familiar flask like feel.
Christo Goosen
tags: Web
Sanic was born from an article (https://magic.io/blog/uvloop-blazing-fast-python-networking/) and the premise that async/await syntax should be combined with a familiar Flask-like feel.

The blazing speed of uvloop is combined with a familiar flask-like API to create a framework with less blocking code and faster response times. 

This talk will cover the specifics of Sanic as well as my real world exercise of building an insurance API with expensive blocking network calls to legacy (and slow) insurance APIs and services. 
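
The sketch below is not Sanic code; it uses only the standard library's `asyncio` to show why awaiting slow upstream calls concurrently helps. The `fetch_quote` coroutine and the insurer names are hypothetical, and `asyncio.sleep` stands in for a slow legacy service.

```python
import asyncio
import time


async def fetch_quote(insurer: str, delay: float) -> dict:
    """Stand-in for a slow legacy API call; the sleep simulates network latency."""
    await asyncio.sleep(delay)
    return {"insurer": insurer, "delay": delay}


async def gather_quotes() -> list:
    """Await all upstream calls concurrently instead of one after another."""
    return await asyncio.gather(
        fetch_quote("alpha", 0.05),
        fetch_quote("beta", 0.05),
        fetch_quote("gamma", 0.05),
    )


def timed_run():
    """Run the coroutine to completion and report how long it took."""
    start = time.perf_counter()
    quotes = asyncio.run(gather_quotes())
    return quotes, time.perf_counter() - start
```

Run sequentially, the three calls would take roughly the sum of their delays; gathered, the total is closer to the slowest single call, since the event loop is free to do other work while each call waits.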

Also I might have forgotten to mention the web framework has a mascot from the meme for sanic:

<pre>
   Sanic go.......................fast
                    ▄▄▄▄▄
           ▀▀▀██████▄▄▄       _______________
         ▄▄▄▄▄  █████████▄  /                 \
        ▀▀▀▀█████▌ ▀▐▄ ▀▐█ |   Gotta go fast!  |
      ▀▀█████▄▄ ▀██████▄██ | _________________/
      ▀▄▄▄▄▄  ▀▀█▄▀█════█▀ |/
           ▀▀▀▄  ▀▀███ ▀       ▄▄
        ▄███▀▀██▄████████▄ ▄▀▀▀▀▀▀█▌
      ██▀▄▄▄██▀▄███▀ ▀▀████      ▄██
   ▄▀▀▀▄██▄▀▀▌████▒▒▒▒▒▒███     ▌▄▄▀
   ▌    ▐▀████▐███▒▒▒▒▒▐██▌
   ▀▄▄▄▄▀   ▀▀████▒▒▒▒▄██▀
             ▀▀█████████▀
           ▄▄██▀██████▀█
         ▄██▀     ▀▀▀  █
        ▄█             ▐▌
    ▄▄▄▄█▌              ▀█▄▄▄▄▀▀▄
   ▌     ▐                ▀▀▄▄▄▀
    ▀▀▄▄▀
</pre>
 recording release: yes license: CC BY  

35. Building Rest API with Django Rest Framework
Jose Machava
tags: Web
Most web applications provide a REST API so that other applications can access specific information, and Django REST framework is a powerful way for Django developers to create robust REST APIs. I'll demonstrate the process of creating a complex API using Django REST framework.

This talk is intended for beginners who have some familiarity with Python and want a web framework for creating a REST API to connect with a mobile application.

The following content will be covered:
<ul>
<li>HTTP methods</li>
<li>Django Models</li>
<li>Request & Response</li>
<li>Status Codes</li>
<li>Serializers</li>
<li>Viewsets and Routers</li>
</ul>
<strong>Prerequisites:</strong>
<ol>
    <li>Some experience with Django</li>
    <li>Basic python concepts</li>
    <li>Knowledge about HTTP and web development</li>

</ol>
 recording release: yes license: CC BY  

36. Bayesian Analysis in Python: A Starter Kit
Andrew Collier
tags: PyData
Bayesian techniques present a compelling alternative to the frequentist view of statistics, providing a flexible approach to extracting a swathe of meaningful information from your data. The learning curve is somewhat steep, but the benefits of adding Bayesian techniques to your tool suite are enormous!

What are the bare essentials that you need to know to start applying Bayesian techniques? This talk will provide an entry level discussion covering the following topics:

<ul>
    <li>What can Bayes do for me? (A brief introduction to Bayesian methods)</li>
    <li>Understanding Markov Chain Monte Carlo. (MCMC is what happens behind the scenes)</li>
    <li>What is Stan? (Writing models in Stan)</li>
    <li>Using Stan in Python. (The PyStan package)</li>
</ul>
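
To give a feel for what MCMC does behind the scenes, here is a minimal random-walk Metropolis sampler in plain Python. It is not Stan or PyStan code, and the model (estimating a coin's bias from 7 heads in 10 tosses under a uniform prior) is an assumption chosen because the exact posterior, Beta(8, 4), is known.

```python
import math
import random


def log_posterior(theta, heads, tosses):
    """Log of (uniform prior x binomial likelihood), up to a constant."""
    if not 0.0 < theta < 1.0:
        return -math.inf
    return heads * math.log(theta) + (tosses - heads) * math.log(1.0 - theta)


def metropolis(heads, tosses, n_samples=20000, burn_in=1000, step=0.2, seed=0):
    """Draw samples of the coin bias with a random-walk Metropolis chain."""
    rng = random.Random(seed)
    theta = 0.5
    current = log_posterior(theta, heads, tosses)
    samples = []
    for _ in range(n_samples + burn_in):
        proposal = theta + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, heads, tosses)
        # Accept with probability min(1, posterior ratio); 1 - random()
        # lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < candidate - current:
            theta, current = proposal, candidate
        samples.append(theta)
    return samples[burn_in:]


samples = metropolis(heads=7, tosses=10)
posterior_mean = sum(samples) / len(samples)
```

The sample mean should land close to the exact posterior mean of 8/12. Stan uses a considerably more sophisticated sampler, but the output is the same kind of object: a list of draws from the posterior.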

The talk will be peppered with useful tips for dealing with the initial challenges of using Stan with Python.
 recording release: yes license: CC BY  

37. Python as a tool to boost productivity in (electronic) product and system development.
Johan Hartman
tags: Other
The general takeaway from this talk will be to motivate the use of Python not only as the implementation language to develop your application, but also to use Python to develop your own automation tools that fit your development process. By doing this, you will be able to implement faster and more accurately, end up with a better-tested result, and along the way derive many benefits that you won’t foresee when you start. 

Although my talk will speak from an embedded engineering point of view and not an IT or web development point of view, I believe the principles apply generally. Over many years and through the life cycle of many products I have evolved a development methodology that uses Python as the language of choice to develop my own tools for code generation (for ‘C’ and Python), software and firmware test automation, systems compliance testing and hardware manufacturing test automation. 

My process is that I start a project by defining XML object definitions for all instances of configuration, data or communication objects. From the XML, on the embedded firmware side, I generate ‘C’ structures, enums and prototype function bodies to use in my coding. On the PC or server side, I generate Python data object definitions from the XML. These are imported into Python programs that are written to operate on the object definitions, giving me free reuse of all the interfacing, testing, visualization etc. code that I have developed in the past.
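
The workflow described above could be sketched roughly as follows. This is only an illustration: the XML element and attribute names, the type mappings and the generator functions are all hypothetical, not the speaker's actual schema or tools.

```python
import xml.etree.ElementTree as ET

# Hypothetical object definition; element and attribute names are assumptions.
DEFINITION = """
<object name="SensorReading">
    <field name="sensor_id" type="uint16"/>
    <field name="temperature" type="float"/>
</object>
"""

# Assumed mappings from definition types to C and Python types.
C_TYPES = {"uint16": "uint16_t", "float": "float"}
PY_TYPES = {"uint16": "int", "float": "float"}


def parse_definition(xml_text):
    """Return the object name and its (field name, field type) pairs."""
    root = ET.fromstring(xml_text)
    fields = [(f.get("name"), f.get("type")) for f in root.findall("field")]
    return root.get("name"), fields


def generate_c_struct(name, fields):
    """Emit a C typedef for the firmware side."""
    lines = ["typedef struct {"]
    lines += ["    {} {};".format(C_TYPES[t], n) for n, t in fields]
    lines.append("}} {};".format(name))
    return "\n".join(lines)


def generate_python_class(name, fields):
    """Emit a matching Python class for the PC or server side."""
    args = ", ".join("{}: {}".format(n, PY_TYPES[t]) for n, t in fields)
    lines = ["class {}:".format(name),
             "    def __init__(self, {}):".format(args)]
    lines += ["        self.{0} = {0}".format(n) for n, _ in fields]
    return "\n".join(lines)


name, fields = parse_definition(DEFINITION)
c_source = generate_c_struct(name, fields)
python_source = generate_python_class(name, fields)
```

Both outputs come from the same definition, so the firmware and the test tooling cannot drift apart without the XML changing first.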
 recording release: yes license: CC BY  

38. My Journey into Artificial Intelligence
Blessing Malumi
tags: PyData
The buzzword in recent times has been AI. What is AI? How can I learn AI? What is the usefulness of AI? How would it impact my business? Why should I learn AI? The questions just keep coming. This talk presents in practical terms how a young African girl got her start in AI, the journey so far, where she is now, and the possibilities of a future in AI.

It will also explain in detail the resources and materials she used in her learning, with Python being the most useful. It will cover Python's many functionalities and tools that she has found and used, especially in its application to data science and artificial intelligence, and the projects she has worked on, one of which was an image classifier. She will also talk about the mistakes she made, the successes she attained, the lessons she learned and, finally, the BIG PICTURE of her AI dream.
 recording release: yes license: CC BY  

39. Fast random number generation in Python and NumPy
Bernardt Duvenhage
tags: Scientific Computing
A fast Random Number Generator (RNG) is key to doing Monte Carlo simulations, efficiently initialising machine learning models, shuffling long sequences of numbers and many tasks in scientific computing. CPython and NumPy use implementations of the Mersenne Twister RNG and rejection sampling to generate random numbers in an interval. The NumPy implementation trades more samples for cheaper division by a power of two. The implementation is very efficient when the random interval is a power of two, but on average generates many more samples than the GNU C or Java algorithms. The Python RNG uses an algorithm similar to GNU C.

A recent paper by Daniel Lemire (https://arxiv.org/abs/1805.10941) discusses an efficient division-light method to generate uniform random numbers in an interval. This method is apparently used in the Go language. On 64-bit platforms there are also fast alternative RNGs that perform comparably in statistical examinations, passing tests like BigCrush. The splitmix64 (part of the standard Java API) and lehmer64 RNGs, for example, require approximately 1.5 cycles to generate 32 random bits (without using SIMD) while the 32-bit Mersenne Twister requires approximately 10 cycles per 32 bits.

This talk will discuss the inclusion of Lemire's method in NumPy as an alternative to the current rejection sampling implementation. My implementation results in a 2x - 3x improvement in the performance of generating a sequence of random numbers. Using splitmix64 or lehmer64 RNGs in NumPy instead of the Mersenne Twister results in a further 2x performance improvement.
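
For readers unfamiliar with the two ingredients, here is a pure-Python sketch of Lemire's multiply-and-shift method driven by splitmix64. It illustrates the algorithms from the paper and is not the speaker's NumPy patch; the `bits` parameter and the `splitmix32_source` helper are assumptions added for the example, and being pure Python it demonstrates correctness rather than speed.

```python
MASK64 = (1 << 64) - 1


def splitmix64(state):
    """Advance a splitmix64 state; return (new_state, 64-bit output)."""
    state = (state + 0x9E3779B97F4A7C15) & MASK64
    z = state
    z = ((z ^ (z >> 30)) * 0xBF58476D1CE4E5B9) & MASK64
    z = ((z ^ (z >> 27)) * 0x94D049BB133111EB) & MASK64
    return state, z ^ (z >> 31)


def splitmix32_source(seed):
    """Return a function yielding 32-bit words (high half of each output)."""
    state = seed & MASK64

    def next_word():
        nonlocal state
        state, out = splitmix64(state)
        return out >> 32

    return next_word


def bounded(next_word, s, bits=32):
    """Lemire's method: uniform integer in [0, s) from `bits`-bit random words."""
    if not 0 < s <= (1 << bits):
        raise ValueError("s must be in (0, 2**bits]")
    mask = (1 << bits) - 1
    m = next_word() * s
    low = m & mask
    if low < s:
        # Only on this uncommon path is a division (the modulo) needed.
        threshold = ((1 << bits) - s) % s
        while low < threshold:
            m = next_word() * s
            low = m & mask
    return m >> bits


source = splitmix32_source(12345)
dice = [bounded(source, 6) + 1 for _ in range(10)]
```

The key point is that the common path costs one multiplication and one shift; the modulo is only computed when the low half of the product falls below `s`.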

The random module in Python does not do the rejection sampling in C like NumPy does. Much of the time to get a random number is therefore spent in the Python code. This talk will also discuss a fast random Python module that implements Lemire's method instead of the current rejection sampling, provides alternative RNGs and moves more of the code into C. 

I'm considering doing pull requests for both the NumPy modification and the Python fast random module and will present a detailed analysis of the proposed modifications.
 recording release: yes license: CC BY  

40. Teaching coding to children (with python)
David Campey
tags: Other
A discussion on how to teach Python to children in the 7 to 17 age bracket, how to get started and how to help existing projects.
 recording release: yes license: CC BY  

41. Custom metadata plugins for Calibre: cataloguing an old paper library
Adrianna Pińska
tags: Other
Calibre is a cross-platform program for managing an e-book library: organising the books, annotating them with metadata, converting them between different formats and moving them between devices. Its organisation and metadata functionality can also be used to catalogue a collection of paper books.

By default, Calibre fetches its metadata from a few large, popular online sources which focus on recently published English-language books, and often have little to no information about older editions or books in other languages. However, there are many user-created custom metadata plugins which make it possible to integrate Calibre with more specialised book databases. Calibre is written in Python, and so are the plugins!

In this talk I will give an overview of how to find resources to help you start writing your own plugin, and describe how I forked and re-wrote a plugin for downloading metadata from the Internet Speculative Fiction Database.
 recording release: yes license: CC BY  

42. Parallel Programming with (Py)OpenCL for Fun and Profit
Gordon Inggs
tags: Scientific Computing
## Overview
It's never been easier to use all manner of interesting computing devices such as multicore CPUs, GPUs and FPGAs using [OpenCL](https://www.khronos.org/opencl/), an open heterogeneous computing standard, supported by major hardware vendors: [Intel](https://software.intel.com/en-us/articles/opencl-drivers), [NVIDIA](https://developer.nvidia.com/opencl), [AMD](https://www.amd.com/en-us/solutions/professional/hpc/opencl), [ARM](https://developer.arm.com/graphics/resources/tutorials/opencl-tutorials), etc. And it's never been easier to use OpenCL via the excellent Python bindings, [PyOpenCL](https://documen.tician.de/pyopencl/).

In this talk, I will introduce the basics of the OpenCL programming and runtime APIs, using examples run in Jupyter notebooks on a variety of devices. I will also help identify the situations where it makes sense to accelerate portions of a codebase.

## Audience
This talk is aimed at anyone who loves the expressiveness of Python, but has bumped into its performance limitations. I assume no background in HPC and/or heterogeneous computing, and will be using simple, yet hopefully relevant examples such as fundamental linear algebra and analysis applications.

By the end of the talk, provided it isn't a post-lunch slot, the audience should be ready to identify the hotspots in their code, and start accelerating using the CPUs, GPUs and FPGAs in their laptops and favourite public clouds such as AWS, Azure and GCE.
 recording release: yes license: CC BY  

43. Insight into Customer Segmentation
Cornelia van der Walt
tags: PyData
In retail, understanding your customer is all, and when you do not have a brick and mortar storefront to attract new shoppers, it is even more important to get insight into the array of visitors who frequent your site. You need an idea of who they are, what they want and how to attract them. This is a universal truth of all businesses and the lessons learned here could be easily applied to other industries.

Enter data science and the ability to segment your customers. What is customer segmentation? What types of segmentation are there? What models could you use? How do you do it? What is it good for? Having done it a few times, first for Superbalist, then for the Spree customers during the Superbalist/Spree merger, I might have a few tricks and tips to share. 

The talk will look at a high-level overview of clustering, then deep-dive into the code a bit before coming up at the end with a few use cases and conclusions. I'll discuss a few potential model algorithms we investigated, but focus mostly on the K-Means clustering model. 
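
As a rough illustration of how K-Means works (not the code from the talk), the sketch below runs Lloyd's algorithm in plain Python on a toy dataset. The customer features (orders per year, average basket value) and all the numbers are invented for the example; real data would normally be scaled first so that no single feature dominates the distance.

```python
import random


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20, seed=0):
    """Lloyd's algorithm: returns (centroids, labels) for a list of tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: squared_distance(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = tuple(sum(axis) / len(members)
                                     for axis in zip(*members))
    return centroids, labels


# Invented customers: (orders per year, average basket value).
CUSTOMERS = [(1, 20), (2, 25), (1, 22), (30, 300), (28, 310), (32, 290)]
centroids, labels = kmeans(CUSTOMERS, k=2)
```

On this toy data the occasional shoppers and the frequent big spenders end up in separate segments, and each centroid summarises a typical member of its segment.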

If you have some data science experience it would be helpful, but the talk should provide interesting information for everyone. The talk aims to leave you with a solid idea of how to build a customer segmentation model of your own. Come discover the joys of classification models with me!
 recording release: yes license: CC BY  

44. Friday Lightning Talks
Adam Piskorski
tags: No Track
Friday Lightning Talks
======================

* Jonatas Baldin: "Three Small Techniques to Break the Ice While Meeting People at Conferences"
* Christian Christelis: "How much architecting does your project need and when?"
* Sheena O'Connell: "GraphQL vs REST"
* Whitney Tennant: "The photon - what this particle has to do with building better software and living the good life."
* Mary Racter: "What not to do when your Dockered Django app is consuming Vault secrets"
* Nickolas Grigoriadis: "Http client performance"
* Bruce Merry: "birdisle: an in-process redis for unit testing."
* Toufeeq Ockards: "Everyone wants to be a blogger ..."
 recording release: yes license: CC BY  

45. Machine Learning in Real Life
Jade Abbott
tags: PyData
Thanks to the openness of the machine learning community, any developer with an interest in machine learning can, these days, get a model up to recognise characters or generate Trump-like tweets in a couple of hours. But what happens when we need to train a model to do a customer-facing task that we trust enough to deploy to a production system? And how do we get that model into production and maintain it once it is there? 

My talk aims to share some of the struggles, trade-offs and strategies from the trenches of training and building the infrastructure for a complex deep learning model for production use.

The talk is aimed at tech leads and developers who are interested in machine learning and are working on training their own models that they'd like to deploy.
 recording release: yes license: CC BY  


Location
--------
Cedarwood


About the group
---------------


</pre>