pre-release: Enthought meeting announcement

Please take a moment to review your details and reply with OK or edits.
The subject and body below are what will go out; they will also be used to title the videos.

Subject: 
ANN: Enthought at 105 Mon July 16, 8a


Enthought
=========================
When: 8 AM Monday July 16, 2012
Where: 105


Topics
------
1. Introduction to NumPy and Matplotlib
Eric Jones
tags: Introductory/Intermediate
NumPy is the most fundamental package for scientific computing with Python. It adds to the Python language a data structure (the NumPy array) that has access to a large library of mathematical functions and operations, providing a powerful framework for fast computations in multiple dimensions. NumPy is the basis for all SciPy packages, which vastly extend the computational and algorithmic capabilities of Python, as well as for many visualization tools like Matplotlib, Chaco or Mayavi.

This tutorial will teach students the fundamentals of NumPy, including fast vector-based calculations on NumPy arrays, the origin of its efficiency, and a short introduction to the matplotlib plotting library. In the final section, more advanced concepts will be introduced, including structured arrays, broadcasting and memory mapping.
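
For a taste of the vectorized style the tutorial covers, a minimal sketch (illustrative only, not from the tutorial materials)::

    import numpy as np

    # Vectorized arithmetic: no explicit Python loop over the elements.
    x = np.linspace(0.0, 2.0 * np.pi, 1000)
    y = np.sin(x) ** 2 + 0.5 * x

    # Broadcasting: a (3, 1) column combines with a (4,) row into a (3, 4) grid.
    grid = np.arange(3).reshape(3, 1) * 10 + np.arange(4)

    # A structured array with named fields.
    data = np.zeros(3, dtype=[('time', 'f8'), ('flux', 'f8')])
    data['time'] = [0.0, 1.0, 2.0]
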
 recording release: yes license: CC BY-SA  
 Video: http://pyvideo.org/video/1190/introduction-to-numpy-and-matplotlib 
2. scikit-learn
Jake Vanderplas
tags: Advanced
Machine Learning has been getting a lot of buzz lately, and many software libraries have been created which implement these routines. scikit-learn is a python package built on numpy and scipy which implements a wide variety of machine learning algorithms, useful for everything from facial recognition to optical character recognition to automated classification of astronomical images. This tutorial will begin with a crash course in machine learning and introduce participants to several of the most common learning techniques for classification, regression, and visualization. Building on this background, we will explore several applications of these techniques to scientific data -- in particular, galaxy, star, and quasar data from the Sloan Digital Sky Survey -- and learn some basic astrophysics along the way. From these examples, tutorial participants will gain knowledge and experience needed to successfully solve a variety of machine learning and statistical data mining problems with python.
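
A minimal classification example in this spirit (the dataset and estimator here are stand-ins, not necessarily those used in the tutorial)::

    from sklearn.datasets import load_iris
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    # Fit a classifier on a training split and score it on held-out data.
    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
    clf = RandomForestClassifier(n_estimators=100).fit(X_train, y_train)
    print("test accuracy:", clf.score(X_test, y_test))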

 recording release: yes license: CC BY-SA  
 Video: http://pyvideo.org/video/1342/scikit-learn 
3. Advanced Matplotlib
Ryan May
tags: Advanced
Matplotlib is one of the main plotting libraries in use within the scientific Python community. This tutorial covers advanced features of the Matplotlib library, including many recent additions: laying out axes, animation support, Basemap (for plotting on maps), and other tweaks for creating aesthetic plots. The goal of this tutorial is to expose attendees to several of the chief sub-packages within Matplotlib, helping users make full use of the library's capabilities. Additionally, attendees will be run through a 'grab bag' of plot tweaks that help increase the aesthetic appeal of their figures. Attendees should be familiar with creating basic plots in Matplotlib as well as basic use of NumPy for manipulating data.
 recording release: yes license: CC BY-SA  
 Video: http://pyvideo.org/video/1344/advanced-matplotlib 
4. HDF5 is for Lovers
Anthony Scopatz
tags: Introductory/Intermediate
HDF5 is a hierarchical, binary database format that has become a *de facto* standard for scientific computing. While the specification may be used in a relatively simple way (persistence of static arrays) it also supports several high-level features that prove invaluable. These include chunking, ragged data, extensible data, parallel I/O, compression, complex selection, and in-core calculations. Moreover, HDF5 bindings exist for almost every language - including two Python libraries (PyTables and h5py).
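
A sketch of chunking, compression, and extensible data with h5py (parameter choices are illustrative)::

    import numpy as np
    import h5py

    with h5py.File('example.h5', 'w') as f:
        # Chunked, gzip-compressed dataset that can grow along its first axis.
        dset = f.create_dataset('samples', shape=(0, 64), maxshape=(None, 64),
                                chunks=(1024, 64), compression='gzip')
        dset.resize(4096, axis=0)
        dset[:] = np.random.rand(4096, 64)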

This tutorial will discuss tools, strategies, and hacks for really squeezing every ounce of performance out of HDF5 in new or existing projects. It will also go over fundamental limitations in the specification and provide creative and subtle strategies for getting around them. Overall, this tutorial will show how HDF5 plays nicely with all parts of an application making the code and data both faster and smaller. With such powerful features at the developer's disposal, what is not to love?!

This tutorial is targeted at a more advanced audience that has prior knowledge of Python and NumPy. Knowledge of C or C++ and basic HDF5 is recommended but not required.

 recording release: yes license: CC BY-SA  
 Video: http://pyvideo.org/video/1196/hdf5-is-for-lovers 
5. Time Series Data Analysis with pandas
Wes McKinney
tags: Advanced
In this tutorial, I'll give a brief overview of pandas basics for new users, then dive into the nuts and bolts of manipulating time series data in memory. This includes such common topics as date arithmetic, alignment and join/merge methods, resampling and frequency conversion, time zone handling, and moving window functions like the moving mean and standard deviation. A strong focus will be placed on working with large time series efficiently using array manipulations. I'll also illustrate visualization tools for slicing and dicing time series to make informative plots. There will be several example data sets taken from finance, economics, ecology, web analytics, or other areas.
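
A small sketch of these operations, written against the current pandas API (method names have shifted since 2012)::

    import numpy as np
    import pandas as pd

    idx = pd.date_range('2012-01-01', periods=1000, freq='h', tz='UTC')
    ts = pd.Series(np.random.randn(1000), index=idx)

    daily = ts.resample('D').mean()        # resampling / frequency conversion
    local = ts.tz_convert('US/Central')    # time zone handling
    rolling = ts.rolling(24).std()         # moving window function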

The target audience for the tutorial includes individuals who already work regularly with time series data and are looking to acquire additional skills and knowledge as well as users with an interest in data analysis who are new to time series. You will be expected to be comfortable with general purpose Python programming and have a modest amount of experience using NumPy. Prior experience with the basics of pandas's data structures will also be helpful.

 recording release: yes license: CC BY-SA  
 Video: http://pyvideo.org/video/1198/time-series-data-analysis-with-pandas 
6. Efficient Parallel Python for High-Performance Computing
Kurt Smith
tags: Introductory/Intermediate
This tutorial is targeted at the intermediate-to-advanced Python user who wants to extend Python into High-Performance Computing. The tutorial will provide hands-on examples and essential performance tips every developer should know for writing effective parallel Python. The result will be a clear sense of possibilities and best practices using Python in HPC environments.

Many of the examples you often find on parallel Python focus on the mechanics of getting the parallel infrastructure working with your code, and not on actually building good portable parallel Python. This tutorial is intended to be a broad introduction to writing high-performance parallel Python that is well suited to both the beginner and the veteran developer.

We will discuss best practices for building efficient high-performance Python through good software engineering. Parallel efficiency starts with the speed of the target code itself, so we will first look at how to evolve code from for-loops to list comprehensions and generator expressions to using Cython with NumPy. We will also discuss how to optimize your code for speed and memory performance by using profilers.

The tutorial will cover some of the common parallel communication technologies (multiprocessing, MPI, and cloud computing) and introduce the use of parallel map and map-reduce.
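
For example, a parallel map with the standard library's multiprocessing module (a minimal sketch, not the tutorial's materials)::

    import multiprocessing as mp

    def work(x):
        return x * x  # stand-in for a real computation

    if __name__ == '__main__':
        with mp.Pool() as pool:            # one worker per CPU core by default
            results = pool.map(work, range(100))
        print(sum(results))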

At the end of the tutorial, participants should be able to write simple parallel Python scripts, make use of effective parallel programming techniques, and have a framework in place to leverage the power of Python in High-Performance Computing.

 recording release: yes license: CC BY-SA  
 Video: http://pyvideo.org/video/1345/efficient-parallel-python-for-high-performance-co 
7. statsmodels
Skipper Seabold
tags: Advanced
This tutorial will give users an overview of the capabilities of statsmodels, including how to conduct exploratory data analysis, fit statistical models, and check that the modeling assumptions are met.

The use of Python in data analysis and statistics is growing rapidly. It is not uncommon now for researchers to conduct data cleaning steps in Python and then move to some other software to estimate statistical models. Statsmodels, however, is a Python module that attempts to bridge this gap and allow users to estimate statistical models, perform statistical tests, and conduct data exploration in Python. Researchers across fields ranging from economics and the social sciences to finance and engineering may find that statsmodels meets their needs for statistical computing and data analysis in Python.

All examples in this tutorial will use real data. Attendees are expected to have some familiarity with statistical methods.

With this knowledge, attendees will be ready to jump in and use Python for applied statistical analysis, and will have an idea of how they can extend statsmodels for their own needs.
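
A minimal taste of the workflow, on synthetic data rather than the real datasets used in the tutorial::

    import numpy as np
    import statsmodels.api as sm

    x = np.random.rand(100)
    y = 2.0 + 3.0 * x + 0.1 * np.random.randn(100)

    X = sm.add_constant(x)      # add an intercept column
    model = sm.OLS(y, X).fit()  # ordinary least squares
    print(model.summary())      # coefficients, R^2, diagnostics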

 recording release: yes license: CC BY-SA  
 Video: http://pyvideo.org/video/1200/statsmodels 
8. IPython in-depth: Interactive Tools for Scientific Computing
Fernando Perez, Min Ragan-Kelley
tags: Introductory/Intermediate
IPython provides tools for interactive and parallel computing that are widely used in scientific computing. We will show some uses of IPython for scientific applications, focusing on exciting recent developments, such as the network-aware kernel, web-based notebook with code, graphics, and rich HTML, and a high-level framework for interactive parallel computing.
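
A sketch of the parallel framework (the IPython.parallel of this era now lives in the separate ipyparallel package; this uses the latter's names)::

    import ipyparallel as ipp

    rc = ipp.Client()       # connect to a running cluster of engines
    view = rc[:]            # a direct view on all engines
    squares = view.map_sync(lambda x: x ** 2, range(32))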

 recording release: yes license: CC BY-SA  
 Video: http://pyvideo.org/video/1343/ipython-in-depth-interactive-tools-for-scientifi 
9. Welcome

tags: Plenary
 recording release: no  

10. matplotlib: Lessons from middle age.  Or, how you too can turn a hundred lines of patch rejection into two hundred thousand lines of code.
John Hunter
tags: Plenary
 recording release: yes license: CC BY-SA  
 Video: http://pyvideo.org/video/1192/matplotlib-lessons-from-middle-age-or-how-you 
11. Numba Python bytecode to LLVM translator
Travis Oliphant, Jon Riehl
tags: HPC
Numba is a Python bytecode to LLVM translator that allows the creation of fast machine code from Python functions. The Low Level Virtual Machine (LLVM) project is rapidly becoming a hardware-industry standard for the intermediate representation (IR) of compiled codes. Numba's high-level translator to the LLVM IR gives Python the ability to take advantage of the machine code generated by hardware manufacturers' contributions to LLVM. Numba translates a Python function written in a subset of Python syntax to machine code using simple type inference and the creation of multiple machine-code versions. In this talk, I will describe the design of Numba, illustrate its applications to multiple domains and discuss the enhancements to NumPy and SciPy that can benefit from this tool.
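
The flavor of the tool, using the current decorator API (early Numba's entry points differed slightly)::

    import numpy as np
    from numba import jit

    @jit(nopython=True)      # compile to machine code via LLVM
    def sum2d(a):
        total = 0.0
        for i in range(a.shape[0]):
            for j in range(a.shape[1]):
                total += a[i, j]
        return total

    print(sum2d(np.ones((1000, 1000))))
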
 recording release: yes license: CC BY-SA  
 Video: http://pyvideo.org/video/1194/numba-python-bytecode-to-llvm-translator 
12. Unlock: A Python-based framework for rapid development of practical brain-computer interface applications
Byron V. Galbraith, Jonathan S. Brumberg, Sean D. Lorenz, Frank H. Guenther
tags: General
The Unlock Project aims to provide brain-computer interface (BCI) technologies to individuals suffering from locked-in syndrome, the complete or near-complete loss of voluntary motor function. While several BCI techniques have been demonstrated as feasible in a laboratory setting, limited effort has been devoted to translating that research into a system for viable home use. This is in large part due to the complexity of existing BCI software packages which are geared toward clinical use by domain experts. With Unlock, we have developed a Python-based modular framework that greatly reduces the time and programming expertise needed to develop BCI applications and experiments. Furthermore, the entire Unlock system, including data acquisition, brain signal decoding, user interface display, and device actuation, can run on a single laptop, offering exceptional portability for this class of BCI.

In this talk, I will present the Unlock framework, starting with a high-level overview of the system then touching on the acquisition, communication, decoding, and visualization components. Emphasis will be placed on the app developer API with several examples from our current work with steady-state visually evoked potentials (SSVEP).
 recording release: yes license: CC BY-SA  
 Video: http://pyvideo.org/video/1193/unlock-a-python-based-framework-for-rapid-develo 
13. Copperhead: Data Parallel Python
Bryan Catanzaro
tags: HPC
Copperhead is a data parallel language embedded in Python, which aims
to provide both a productive programming environment as well as
excellent computational efficiency on heterogeneous parallel
hardware. Copperhead programs are written in a small, restricted
subset of Python, using standard constructs like map and reduce, along
with traditional data parallel primitives like scan and
sort. Copperhead programs are written in standard Python modules and
interoperate with existing Python numerical and visualization
libraries such as NumPy, SciPy, and Matplotlib. The Copperhead runtime
compiles Copperhead programs to target either CUDA-enabled GPUs or
multicore CPUs using OpenMP or Threading Building Blocks. On several
example applications from Computer Vision and Machine Learning,
Copperhead programs achieve 45-100% of the performance of
hand-coded CUDA code, running on NVIDIA GPUs. In this talk, we will
discuss the subset of Python that forms the Copperhead language, the
open source Copperhead runtime and compiler, and selected example
programs.
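
A sketch following the project's published samples (the @cu decorator marks a function in the restricted subset for compilation)::

    from copperhead import *   # star import follows the project's samples
    import numpy as np

    @cu
    def axpy(a, x, y):
        return map(lambda xi, yi: a * xi + yi, x, y)

    x = np.arange(100, dtype=np.float64)
    y = np.ones(100, dtype=np.float64)
    z = axpy(2.0, x, y)   # the runtime compiles and dispatches to GPU or CPU
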
 recording release: yes license: CC BY-SA  
 Video: http://pyvideo.org/video/1197/copperhead-data-parallel-python 
14. Self-driving Lego Mindstorms Robot
Iqbal Mohomed
tags: General
In this talk, I'll describe the workings of my personal hobby project - a self-driving Lego Mindstorms robot! The body of the robot is built with Lego Mindstorms. An Android smartphone is used to capture the view in front of the robot. A user first teaches the robot how to drive; this is done by making the robot go around the track a small number of times. The image data, along with the user actions, are used to train a neural network. At run-time, images of what is in front of the robot are fed into the neural network and the appropriate driving action is selected. This project showcases the power of Python's libraries, as they enabled me to put together a sophisticated working system in a very short amount of time. Specifically, I made use of the Python Image Library to downsample images, as well as the PyBrain neural network library. The robot was controlled using the nxt-python library. A high-level description + videos are available here: http://slowping.com/2012/self-driving-lego-mindstorms-robot/
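
A rough sketch of the learning loop described above (array sizes, file names, and network shape are invented for illustration)::

    from PIL import Image
    from pybrain.tools.shortcuts import buildNetwork
    from pybrain.datasets import SupervisedDataSet
    from pybrain.supervised.trainers import BackpropTrainer

    def features(path, size=(16, 12)):
        # Downsample a camera frame to a small grayscale feature vector.
        img = Image.open(path).convert('L').resize(size)
        return [p / 255.0 for p in img.getdata()]

    net = buildNetwork(16 * 12, 32, 3)   # 3 outputs: left, straight, right
    ds = SupervisedDataSet(16 * 12, 3)
    ds.addSample(features('frame0001.png'), (0, 1, 0))  # user drove straight
    BackpropTrainer(net, ds).trainEpochs(50)
    action = net.activate(features('live_frame.png'))
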
 recording release: yes license: CC BY-SA  
 Video: http://pyvideo.org/video/1195/self-driving-lego-mindstorms-robot 
15. Mid-morning Break
tags: ---
 recording release: yes license: None  

16. Solving the import problem: Scalable Dynamic Loading Network File Systems
Aron Ahmadia, Jed Brown, William Scullin
tags: HPC
The most common programming paradigm for scientific computing, SPMD (Single Program Multiple Data), catastrophically interacts with the loading strategies of dynamically linked executables and network-attached file systems on even moderately sized high performance computing clusters.  This difficulty is further exacerbated by "function-shipped" I/O on modern supercomputer compute nodes, preventing the deployment of simple solutions.  In this talk, we introduce a two-component solution: collfs, a set of low-level MPI-collective file operations that can selectively shadow file system access in a library, and walla, a set of Python import hooks for seamlessly enabling parallel dynamic loading scalable to tens of thousands of cores.
 recording release: yes license: CC BY-SA  
 Video: http://pyvideo.org/video/1201/solving-the-import-problem-scalable-dynamic-load 
17. PySAL: A Python Library for Exploratory Spatial Data Analysis and Geocomputation
Sergio Rey
tags: HPC
This talk presents an overview and update of PySAL.  PySAL is designed to support the development of high level applications in exploratory spatial data analysis and geocomputation. The library includes a comprehensive suite of modules that cover the entire spatial data analysis research stack from geospatial data processing and integration, to exploratory spatial data analysis, spatial dynamics, regionalization, and spatial econometrics. A selection of these modules is illustrated, drawing on research in spatial criminology, epidemiology and urban inequality dynamics. A number of geovisualization packages that have been implemented using PySAL as an analytical core are also demonstrated. Future plans for additional modules and enhancements are also discussed.
 recording release: yes license: CC BY-SA  
 Video: http://pyvideo.org/video/1199/pysal-a-python-library-for-exploratory-spatial-d 
18. Implicit Multicore Parallelism using CnC-Python
Shams Imam, Vivek Sarkar
tags: HPC
We introduce CnC-Python (CP), an approach to implicit multicore parallelism for
Python programmers based on a high-level macro data-flow programming model
called Concurrent Collections (CnC). With the advent of the multi-core era, it
is clear that improvements in application performance will primarily come from
increased parallelism. Extracting parallelism from applications often involves
the use of low-level primitives such as locks and threads. CP is implicitly
parallel and enables programmers to achieve task, data and pipeline parallelism
in a declarative fashion while only being required to describe the program as a
coordination graph with serial Python code for individual nodes (steps). Thus,
CP makes parallel programming accessible to a broad class of programmers who are
not trained in parallel programming. The CP runtime requires that Python objects
communicated between steps be picklable, but imposes no restriction on the
Python idioms used within the serial code.  Most data structures of interest to
the SciPy community, including NumPy arrays, are included in the class of
picklable data structures in Python.

The CnC model is especially effective in exploiting parallelism in scientific
applications in which the dependences can be represented as arbitrary directed
acyclic graphs (``dag parallelism'').  Such applications include, but are not
limited to, tiled implementations of iterative linear algebra algorithms such as
Cholesky decomposition, Gauss-Jordan elimination, Jacobi method, and Successive
Over-Relaxation (SOR).  Rather than using explicit threads and locks to exploit
parallelism, the CnC-Python programmer decomposes their algorithm into
individual computation steps and identifies data and control dependences among
the steps to create such computation DAGs. Given the DAG (in the form of
declarative constraints), it is the responsibility of the CP runtime to extract
parallelism and performance from the application. By liberating the scientific
programmer, who is not necessarily trained to write explicitly parallel
programs, from the nuances of parallel programming, CP provides a
high-productivity path for scientific programmers to achieve multi-core
parallelism in Python.

LINKS:
CnC-Python: http://cnc-python.rice.edu
Concurrent Collections: http://habanero.rice.edu/cnc
 recording release: yes license: CC BY-SA  
 Video: http://pyvideo.org/video/1212/implicit-multicore-parallelism-using-cnc-python 
19. mystic: a simple model-independent inversion framework
Michael M. McKerns, Alta Fang, Tim Sullivan, Michael A.G. Aivazis
tags: HPC
Not released.
 recording release: no  

20. QNDArray: A Numpy Clone for C++/Qt
Glen W. Mabey
tags: General
While Numpy/Scipy is an attractive implementation platform for many algorithms, in some cases C++ is mandated by a customer.  However, a foundation of numpy's behavior is the notion of reference-counted instances, and implementing an efficient, cross-platform mechanism for reference counting is no trivial prerequisite.

The reference counting mechanisms already implemented in the Qt C++ toolkit provide a cross-platform foundation upon which a numpy-like array class can be built.  In this talk one such implementation is discussed, QNDArray.  In fact, by mimicking the numpy behaviors, the job of implementing QNDArray became much easier, as the task of "defining the behavior" became "adopting the behavior," down to the function names.

In particular, the following aspects of the implementation were found to be tricky and deserve discussion in this presentation:
  * slicing multidimensional arrays given the limitations of operator[] in C++,
  * const
  * partial specialization
  * implicit vs. explicit data sharing in Qt

QNDArray has been deployed in scientific research applications and currently has the following features:
  * bit-packed boolean arrays
  * nascent masked array support
  * unit test suite that validates QNDArray behavior against numpy behavior
  * bounds checking with Q_ASSERT() (becomes a no-op in release mode)
  * memmap()ed arrays via QFile::map()
  * easily integrated as a QVariant value, leading to a natural mapping from QVariantMap to Python dict.
  * float16 implementation including in-place compare

The author has approval from his management to submit the source code for QNDArray to the Qt Project and plans to have it freely available for download via http://qt.gitorious.org/ before the SciPy conference begins.
 recording release: yes license: CC BY-SA  
 Video: http://pyvideo.org/video/1203/qndarray-a-numpy-clone-for-cqt 
21. Julia: A Fast Dynamic Language for Technical Computing
Jeff Bezanson
tags: General
Julia is a dynamic language designed for technical applications and
high performance. Its design is based on a sophisticated but unobtrusive
type system, type inference, multiple dispatch instead of class-based OO,
and a code generator based on LLVM. These features work together to run
high-level code efficiently even without type declarations. At the same time,
the type system provides useful expressiveness for designing libraries,
enables forms of metaprogramming not traditionally found in dynamic languages,
and creates the possibility of statically compiling whole programs and
libraries. This combination of high performance and expressiveness makes it
possible for most of Julia's standard library to be written in Julia itself,
with an interface to call existing C and Fortran libraries.

We will discuss some ways that Python and Julia can interoperate, and
compare Julia's current capabilities to Python and NumPy.
 recording release: yes license: CC BY-SA  
 Video: http://pyvideo.org/video/1204/julia-a-fast-dynamic-language-for-technical-comp 
22. OpenMG: A New Multigrid Implementation in Python
Tom Bertalan, Akand W. Islam
tags: HPC
In most large-scale computations, systems of equations arise in the form Au=b, where A is a linear operation to be performed on the unknown data u, producing the known right-hand side, b, which represents some constraint of known or assumed behavior of the system being modeled. Since u can have many millions to billions of elements, direct solution is too slow. A multigrid solver solves partially at full resolution, and then solves directly only at low resolution. This creates a correction vector, which is then interpolated to full resolution, where it corrects the partial solution.

This project aims to create an open-source multigrid solver library, written only in Python. The existing PyAMG multigrid implementation (a highly versatile, highly configurable, black-box solver) is fully sequential, and is difficult to read and modify due to its C core. OpenMG is a pure Python experimentation environment for developing multigrid optimizations, not a new production solver library. By making the code simple and modular, we make the algorithmic details clear. We thereby create an opportunity for education and experimental optimization of the partial solver (Jacobi, Gauss-Seidel, SOR, etc.), the restriction mechanism, the prolongation mechanism, and the direct solver, using GPGPU, multiple CPUs, MPI, or grid computing. The resulting solver is tested on an implicit pressure reservoir simulation problem with satisfactory results.
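
One ingredient of such a solver, a weighted-Jacobi partial solver for the 1-D Poisson problem, fits in a few lines of NumPy (a toy sketch, not OpenMG code)::

    import numpy as np

    def jacobi(u, b, h, sweeps=3, omega=2.0 / 3.0):
        # Weighted Jacobi relaxation for -u'' = b on a uniform grid; the
        # right-hand side is evaluated before assignment, so each sweep
        # uses only the previous iterate.
        for _ in range(sweeps):
            u[1:-1] += omega * 0.5 * (u[:-2] + u[2:] - 2.0 * u[1:-1]
                                      + h * h * b[1:-1])
        return u
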
 recording release: yes license: CC BY-SA  
 Video: http://pyvideo.org/video/1205/openmg-a-new-multigrid-implementation-in-python 
23. Lunch
tags: ---
 recording release: yes license: None  

24. TBA

tags: Plenary
(Needs description.) 
 recording release: no  

25. Performance Python Panel Discussion
Andy Terrel, moderator
tags: Plenary
Travis Oliphant (Continuum Analytics), Kurt Smith (Enthought) and Jeff Bezanson (MIT, Julia author) discuss Python performance issues.  Andy Terrel (UT/TACC) is the moderator.

 recording release: yes license: CC BY-SA  
 Video: http://pyvideo.org/video/1189/performance-python-panel-discussion 
26. SciPy + MapReduce with Disco
Al Barrentine
tags: HPC
MapReduce has become one of two dominant paradigms in distributed computing
(along with MPI). Yet many times, implementing an algorithm as a MapReduce job
- especially in Python - forces us to sacrifice efficiency (BLAS routines, etc.)
in favor of data parallelism.

In my work, which involves writing distributed learning algorithms for processing terabytes of 
Twitter data at SocialFlow, I've come to advocate a form of "vectorized MapReduce"
which integrates efficient numerical libraries like numpy/scipy into the MapReduce setting,
yielding both faster per-machine performance and reduced I/O, which is often a major
bottleneck. I'll also highlight some features of Disco (a Python/Erlang MapReduce 
implementation from Nokia) which make it a very compelling choice for writing scientific 
MapReduce jobs in Python.
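
The shape of such a job, following Disco's canonical examples (a sketch; assumes a running Disco cluster and an input tag of that invented name)::

    import numpy as np
    from disco.core import Job, result_iterator

    def fun_map(line, params):
        # Vectorized parse of one record: emit a partial sum, not raw values.
        values = np.array(line.split(','), dtype=float)
        yield 'partial', (values.sum(), len(values))

    def fun_reduce(iter, params):
        total, n = 0.0, 0
        for _, (s, c) in iter:
            total, n = total + s, n + c
        yield 'mean', total / n

    job = Job().run(input=['tag://data:numbers'],
                    map=fun_map, reduce=fun_reduce)
    for key, value in result_iterator(job.wait(show=False)):
        print(key, value)
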
 recording release: yes license: CC BY-SA  
 Video: http://pyvideo.org/video/1207/scipy-mapreduce-with-disco 
27. Time Series Manipulation with pandas
Wes McKinney
tags: General
In this talk I'll discuss major developments in pandas over the last year
related to time series handling and processing. This includes the integration
of the new NumPy datetime64, implementation of rich and high performance
resampling methods, better visualization, and a generally cleaner, more
intuitive and productive API. I will also discuss how functionality from the
defunct scikits.timeseries project has been integrated into pandas, thus
providing a unified, cohesive set of time series tools for many different
problem domains. Lastly, I'll give some details about the pandas development
roadmap and opportunities for more people to get involved.
 recording release: yes license: CC BY-SA  
 Video: http://pyvideo.org/video/1206/time-series-manipulation-with-pandas 
28. Bringing High Performance to Python/Numpy Without Changing a Single Line of Code.
Simon Andreas Frimann Lund, Mads Ruben Burgdorff Kristensen, Brian Vinter, Troels Blum
tags: HPC
Recent years have provided a wealth of projects showing that using Python for scientific applications can outperform even popular choices such as Matlab. A major factor driving these successes is the efficient utilization of multi-core CPUs and general-purpose GPUs, and the scaling of computations to clusters.

However, often these advances sacrifice some of the high-productivity features of Python by introducing new language constructs, enforcing new language semantics and/or enforcing explicit data types. The result is that the user will have to rewrite existing Python applications to use the Python extension.

In order to use GPGPUs in Python, a popular approach is to embed CUDA/OpenCL code kernels directly in the Python application. This approach is more productive and readable than C/C++ applications, but it is still inferior to native Python code. Furthermore, the approach enforces hardware-specific programming and thus requires intimate knowledge of the underlying hardware and the CUDA/OpenCL programming model.

Copenhagen Vector Byte Code (cphVB) strives to provide a high-performance back-end for Numerical Python (NumPy) without reducing the high-productivity of Python/NumPy. Without any involvement of the user, cphVB will transform regular sequential Python/NumPy applications into high-performance applications. The cphVB runtime system is capable of utilizing a broad range of computing platforms efficiently, e.g. Multi-core CPUs, GPGPUs and clusters of such machines.

cphVB consists of a bridge that translates NumPy array operations into cphVB vector operations. The bridge will send these vector operations to a Vector Engine that performs the actual execution of the operations. cphVB comes with a broad range of Vector Engines optimized to specific hardware architectures, such as multi-core CPUs, GPGPU and clusters of said architectures. Thus, cphVB provides a high-productivity, high-performance framework that supports legacy NumPy applications without changing a single line of code.
 recording release: yes license: CC BY-SA  
 Video: http://pyvideo.org/video/1209/bringing-high-performance-to-pythonnumpy-without 
29. SymPy Stats - Uncertainty Modeling
Matthew Rocklin
tags: General
SymPy is a symbolic algebra package for Python. In SymPy.Stats we add a 
stochastic variable type to this package to form a language for uncertainty 
modeling. This allows engineers and scientists to symbolically declare the 
uncertainty in their mathematical models and to make probabilistic queries. We 
provide transformations from probabilistic statements like $P(X*Y > 3)$ or 
$E(X**2)$ into deterministic integrals. These integrals are then solved 
using SymPy's integration routines or through numeric sampling. 
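
The core transformation in a few lines (standard sympy.stats usage)::

    from sympy.stats import Normal, P, E

    X = Normal('X', 0, 1)    # a standard normal random variable
    print(P(X > 1))          # probabilistic query -> deterministic integral
    print(E(X ** 2))         # second moment: evaluates to 1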

This talk touches on a few rising themes:
 * The rise in interest in uncertainty quantification 
 * The use of symbolics in scientific computing 
 * Intermediate representation layers and multi-stage compilation 

Historically solutions to uncertainty quantification problems have been 
expressed by writing Monte Carlo codes around individual problems. By creating 
a symbolic uncertainty language we allow the expression of the 
problem-to-be-solved to be written separately from the numerical technique. 
SymPy.stats  serves as an interface layer. The statistical programmer doesn't 
need to think about the details of numerical techniques and the computational 
methods programmer doesn't need to think about the particular domain-specific 
questions to be solved. 

We have implemented multiple computational backends including purely symbolic 
(using SymPy's integration engine), sampling, and code generation. 

In the talk we discuss these ideas with a few illustrative examples taken from 
basic probability and engineering. The following is one such example

http://sympystats.wordpress.com/2011/07/02/a-lesson-in-data-assimilation-using-sympy/
 recording release: yes license: CC BY-SA  
 Video: http://pyvideo.org/video/1208/sympy-stats-uncertainty-modeling 
30. Mid-afternoon Break
tags: ---
 recording release: yes license: None  

31. Parallel High Performance Statistical Bootstrapping in Python
Aakash Prasad
tags: HPC
BLB ("Bag of Little Bootstraps") is a method to assess the quality of a statistical estimator based upon subsets of sample distributions. BLB is a variant of and solves the same class of problems as the general bootstrap. Unfortunately, the general bootstrap is a computationally demanding operation when given large data sets, and does not parallelize easily. BLB is an attractive alternative due to its its structural and computational properties which allow for much better parallelization. However, two obstacles exist to realizing this parallelism in practice. First, expressing the parallelism inherent in the algorithm requires quite different code depending on the platform (for example, multi-core with Cilk-aware compiler vs. GPU with CUDA or OpenCL vs. shared-nothing cloud computing with Spark or Hadoop). Second, even given the skeleton code for a particular platform, the specific estimator function being computed is different for each application, making it difficult to encapsulate the BLB pattern in a library. We apply the SEJITS technology (Selective Embedded Just-in-Time Specialization) to solve both problems: scientists can write applications in Python that make use of estimator functions also written in (a subset of) Python, and just-in-time code generation techniques are used to "lower" these functions to efficiency-level languages and inline them into an execution template optimized for each type of parallel platform; in this paper we focus on the multicore environment. The result is that Python applications that use BLB are source- and performance-portable, with performance comparable to hand-written code in low-level languages. We expect that code variants produced for the multicore environment can reasonably support data sets in the order of tens of gigabytes in size, and that variants for the cloud environment can support data sets in the order of terabytes in size. Preliminary results from a simple application of linear regression show a 13.6x speedup as a result of using 16 cores instead of 1 core, and this parallel performance was obtained by simply coding the linear estimator function in Python. Our runtime compiler (specializer) for BLB augments a growing family of such compilers that will ultimately allow Python applications to make use of generic computational patterns across different platforms.
 recording release: yes license: CC BY-SA  
 Video: http://pyvideo.org/video/1210/parallel-high-performance-statistical-bootstrappi 
32. *** CANCELLED *** Zen: Introducing a pythonic, high-performance network analysis library
Derek Ruths
tags: HPC
Network datasets are becoming larger; network algorithms are becoming more computationally expensive; and interest in network research is becoming more widespread.  These and other factors are creating a widespread need for a network library that can both handle massive network datasets and still not sacrifice the conventions and designs that make a python library quick-to-learn, elegant, and easy-to-use.

In this talk, we will introduce the Zero-Effort Network (Zen) library.  Developed from scratch in a combination of python and cython, it aims to be the most memory- and computation-efficient network library available while still adhering to the conventions of good pythonic design.  Nearly all functions are implemented in cython, yielding C-level performance.  Currently in its first release, it implements a large suite of network-based functions including algorithms involved in finding shortest paths, clustering, community detection, and matchings.  Overall, these implementations outperform (sometimes by more than an order of magnitude) all popular network libraries (including both those that work with python and those that do not).

From the outset, we intended for Zen to be applicable to massive network datasets (e.g., hundreds of millions of edges).  In order to achieve this, the data structures involved in holding the networks themselves in memory have been designed to be very memory-efficient.  Furthermore, new storage formats have been designed that both minimize the storage space required to store network connectivity on disk and maximize the speed with which it can be loaded from disk into memory.  The result of this combination is that massive network datasets that previously could not be loaded and analyzed in python libraries can now be loaded in a matter of minutes.

Our aim in this presentation is to introduce the Zen library to the community: showing how it can be used to analyze datasets that have previously been beyond the reach of python-based network tools and highlighting the unique designs that allow it to achieve this high performance without sacrificing pythonic conventions.  Attention will be given to discussing both the overarching design and use of the library and also those specific innovations that make working with massive network datasets possible.
 recording release: yes license: CC BY-SA  

33. TBA
tags: HPC
(Needs description.) 
 recording release: yes license: None  

34. A Tale of Four Libraries
Alejandro Weinstein, Michael Wakin
tags: General
(Needs description.) 
 recording release: maybe  
 Video: http://pyvideo.org/video/1211/a-tale-of-four-libraries 
35. ROFL: a functional Python dialect for science
Jonathan Riehl
tags: HPC
jon.riehl@resilientscience.com

Current parallel programming models leave a lot to be desired and fail
to maintain pace with improvements in hardware architecture.  For many
scientific research groups these models only widen the gap between
equations and scalable parallel code.  The Resilient Optimizing Flow
Language (ROFL) is a data-flow language designed with the purpose of
solving the problems of both domain abstraction and efficient
parallelism.  Using a functional, declarative variant of the Python
language, ROFL takes scientific equations and optimizes for both
scalar and parallel execution.

ROFL is closely tied to Python and the SciPy libraries.  ROFL uses
Python expression syntax, is implemented in Python, and emits
optimized Python code.  ROFL's implementation in Python allows ROFL to
be embedded in Python.  Using Python as a target language makes ROFL
extensible and portable.  By removing imperative loop constructs and
focusing on integration with the NumPy and SciPy libraries, ROFL both
supports and encourages data parallelism.

In this presentation, we introduce the ROFL language, and demonstrate
by example how ROFL enables scientists to focus more on the equations
they are solving, and less on task and data parallelism.
 recording release: yes license: CC BY-SA  
 Video: http://pyvideo.org/video/1191/rofl-a-functional-python-dialect-for-science 
36. Lightning Talks - Wednesday

tags: Plenary
1. SciPy Sparse Graphs, Jake Vanderplas.
2. Animation for Traits and Chaco, Corran Webster.
3. Pynthantics, Jon Roland.
4. State of the Numba, Jon Riehl.
5. Pipe-o-matic call, Walker Hale.
6. A Command ND-Array, Frédéric Bastien.
 recording release: yes license: CC BY-SA  
 Video: http://pyvideo.org/video/1346/lightning-talks-wednesday 
37. ALGES: Geostatistics and Python
Felipe Lema
tags: Geophysics Mini-Symposia
ALGES is a laboratory that develops tools applied to geostatistics.  We've been using python for a while and it has brought us very good results. Its ease-of-use and portability allow us to rapidly offer practical solutions to problems.
Along with a brief introduction to the laboratory, we cover two particular projects we are currently working on.
One project is an application for multivariate geostatistical analysis. Most available applications provide analysis for a single variable at a time and either ignore how variables relate to one another or make it very difficult to consider any relationship. Our proposal provides an interface that is both easy for beginners to use and offers fine tuning for experienced users. 
The other presented project covers a problem in geological modeling and resource estimation. Commonly, when modeling geological volumes, continuity in data is assumed. This is often not true, as there are different kinds of faults that break this continuity, which is very hard to incorporate when modeling.
We propose a solution that restores the original continuous volume for better modeling, as well as restitution to the real distorted volume, all of which provides a better estimation.
Both projects have lots of heavy computations and no shortage of input data. We take this as a challenge to build fast processing solutions, so we take advantage of both the ease of a Python interface and the speed of C/C++ code. 

 recording release: yes license: CC BY-SA  
 Video: http://pyvideo.org/video/1213/alges-geostatistics-and-python 
38. Astropy - Herding Snakes or Astronomers: which is more difficult?
Perry Greenfield, Thomas Robitaille, Erik Tollerud
tags: Astronomy Mini-Symposia
Developers and users of astronomical software have long bemoaned the absence of shared efforts in the field. While there are well known, free software tools available for astronomy, most have been developed by large institutions, and the past few decades have seen comparatively little progress in fostering a community-based set of software tools. There is hope that this is changing now. The continuing growth of Python in astronomy has led to an increasing awareness of needless duplication of efforts within the community and the need to make existing packages work better with each other; such discussions came to a head on the astropy email list in the spring of 2011, leading to the formation of the astropy project. The first coordination meeting was held in the fall of 2011, and significant progress has been made in setting up a community repository of core astronomical packages. We will describe the general goals of astropy and the progress that has been made to date.

 recording release: yes license: CC BY-SA  
 Video: http://pyvideo.org/video/1214/astropy-herding-snakes-or-astronomers-which-is 
39. A Unified Release of Python & IRAF software for astronomy
James E.H. Turner
tags: Astronomy Mini-Symposia
As astronomical software development expands from historical data reduction
platforms towards more sophisticated Python applications,
non-technically-focused users can struggle with installing and maintaining
a large number of heterogeneous dependencies. PyRAF has successfully
bridged the gap between IRAF and Python, but managing dependencies falls
outside its scope. A few existing Python distributions make installation
easy, but don't cater for specific needs (such as dependence on IRAF). STScI
and Gemini have therefore developed a prototype, easy-to-install software
distribution for Linux and OSX known provisionally as the 'Unified Release'
(UR).

Currently the UR includes STScI Python and its dependencies (e.g. Python,
NumPy, IRAF 2.15), as well as Matplotlib & Tk, SciPy, a number of IRAF
packages, DS9, X11IRAF and some testing and documentation tools. Its scope
extends to complementary non-Python/IRAF software, but we do not intend to
produce a comprehensive (Scisoft-like) distribution of tools for astronomy,
nor to satisfy every installation preference. Our focus is on providing a
simple way to run key tools, for users with minimal support resources and
who may not have administrative privileges. Unlike most comparable
distributions, our approach includes basic provision for in-place software
additions and updates.

Recently we have completed a first internal version of the UR for both
Linux and OSX, which we shall briefly demonstrate. We plan to make our
first public release during the coming months.

 recording release: yes license: CC BY-SA  
 Video: http://pyvideo.org/video/1216/a-unified-release-of-python-iraf-software-for-a 
40. Parallel Computational Methods and Simulation for Coastal and Hydraulic Applications Using the Proteus Toolkit
Chris Kees
tags: Geophysics Mini-Symposia
(Needs description.) 
 recording release: yes license: CC BY-SA  
 Video: http://pyvideo.org/video/1215/parallel-computational-methods-and-simulation-for 
41. AstroML: data mining and machine learning for Astronomy
Jake Vanderplas, Zeljko Ivezic, Andrew Connolly, Alex Gray
tags: Astronomy Mini-Symposia
Python is currently being adopted as the language of choice by many astronomical researchers.  A prominent example is in the Large Synoptic Survey Telescope (LSST), a project which will repeatedly observe the southern sky 1000 times over the course of 10 years.  The 30,000 GB of raw data created each night will pass through a processing pipeline consisting of C++ and legacy code, stitched together with a python interface.  This example underscores the need for astronomers to be well-versed in large-scale statistical analysis techniques in python.  We seek to address this need with the AstroML package, which is designed to be a repository for well-tested data mining and machine learning routines, with a focus on applications in astronomy and astrophysics.  It will be released in late 2012 with an associated graduate-level textbook, 'Statistics, Data Mining and Machine Learning in Astronomy' (Princeton University Press).  AstroML leverages many computational tools already available in the python universe, including numpy, scipy, scikit-learn, pymc, healpy, and others, and adds efficient implementations of several routines more specific to astronomy. A main feature of the package is the extensive set of practical examples of astronomical data analysis, all written in python.  In this talk, we will explore the statistical analysis of several interesting astrophysical datasets using python and astroML.
 recording release: yes license: CC BY-SA  
 Video: http://pyvideo.org/video/1218/astroml-data-mining-and-machine-learning-for-ast 
42. Building a Solver Based on PyClaw for the Solution of the Multi-Layer Shallow Water Equations
Kyle Mandli
tags: Geophysics Mini-Symposia
The multi-layer shallow water equations are an active topic for researchers in 
geophysical fluid dynamics looking for ways to increase the validity of shallow 
water modeling techniques without using a fully three dimensional model which 
may be too costly for the domain size being looked at. In this talk we will 
step through the effort needed to convert a Fortran based solver to one using 
the PyClaw framework, a Python framework targeted at the solution of hyperbolic 
conservation laws. Once the application is converted the ease of implementing 
parallel and other solver strategies is greatly simplified. Discussion of how 
this is accomplished and design decisions and future extensions to PyClaw will 
also be presented.

 recording release: yes license: CC BY-SA  
 Video: http://pyvideo.org/video/1217/building-a-solver-based-on-pyclaw-for-the-solutio 
43. yt: An Integrated Science Environment for Astrophysical Simulations
Matthew Turk
tags: Astronomy Mini-Symposia
The usage of the high-level scripting language Python has enabled new
mechanisms for data interrogation, discovery and visualization of
scientific data. We present yt ( http://yt-project.org/ ), an open
source, community-developed astrophysical analysis and visualization
toolkit for both post-processing and in situ analysis of data
generated by high-performance computing (HPC) simulations of
astrophysical phenomena.  We report on successes in astrophysical
computation through development of analysis tasks, visualization,
cross-code compatibility, and community building.
 recording release: yes license: CC BY-SA  
 Video: http://pyvideo.org/video/1202/yt-an-integrated-science-environment-for-astroph 
44. Domain Specific Languages for Partial Differential Equations using Ignition
Andy Terrel
tags: Geophysics Mini-Symposia
As scientific computing pushes towards extreme scales, the programming
wall is becoming more apparent. For algorithms to scale on new
architectures, they often must be rewritten accounting for completely
different performance characteristics. A handful of the communities
fastest codes have already turned to automatic code generation to
tackle these issues. Code generation gives a user the ability to use
the expressiveness of a domain-specific language and promises
better portability as architectures rapidly change.

In this presentation, I will show Ignition, a project for creating
numerical code generators.  Python and SymPy make exceptional
languages for developing these code generators in a way that domain
experts can understand and manipulate.  I will show examples of how
Ignition can generate several different parts of geophysical simulations.
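
Ignition's own API is not shown here; as a stand-in, SymPy's code generation utilities illustrate the expression-to-kernel idea the talk builds on::

    from sympy import symbols, sin
    from sympy.utilities.codegen import codegen

    x, y = symbols('x y')
    # Generate a C source/header pair for the expression sin(x)*y**2.
    (c_name, c_code), (h_name, h_code) = codegen(
        ('flux', sin(x) * y ** 2), 'C', 'flux')
    print(c_code)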

 recording release: yes license: CC BY-SA  
 Video: http://pyvideo.org/video/1219/domain-specific-languages-for-partial-differentia 
45. Discussion

tags: Astronomy Mini-Symposia
 recording release: yes license: None  

46. Discussion

tags: Geophysics Mini-Symposia
 recording release: yes license: None  

47. Welcome

tags: Plenary
 recording release: no  

48. Python as Super Glue for the Modern Scientific Workflow
Joshua Bloom
tags: Plenary
 recording release: yes license: CC BY-SA  
 Video: http://pyvideo.org/video/1220/python-as-super-glue-for-the-modern-scientific-wo 
49. Interactive Visualization Widgets Using Chaco and Enable
Corran Webster
tags: Visualization
Interactivity is an important part of computer visualization of data, but all too often the user interfaces to control the visualization are far from optimal. This talk will show how you can use Enable and Chaco to build interactive visualization widgets which give much better user feedback than sliders or text fields.

Chaco is an open-source interactive 2D plotting library that is part of the Enthought tool-suite, which is in turn built upon the Enable interactive 2D drawing library; both are compatible with PyQt, wxPython, Pyglet and VTK. These libraries are written in Python and are key tools that Enthought uses to deliver scientific applications to our clients.

This talk will show how to use these tools to build UI widgets that can be used to control visualizations interactively.  Rather than building a complex, monolithic control, the approach that we will demonstrate builds the control out of many smaller interactions, each controlling a small piece of the overall state of a visualization, with a high level of reusability.

As a simple but useful case study, we'll show how we built an interactive histogram widget that can be used to adjust the brightness, contrast, gamma and other attributes of an image in real time.  We'll also discuss some of the tricks we used to keep the user interactions responsive in the face of having to visualize larger images.
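
A minimal Chaco component in the Traits idiom (a sketch; the histogram widget in the talk is considerably richer)::

    import numpy as np
    from traits.api import HasTraits, Instance
    from traitsui.api import View, Item
    from chaco.api import ArrayPlotData, Plot
    from enable.api import ComponentEditor

    class Histogram(HasTraits):
        plot = Instance(Plot)
        traits_view = View(Item('plot', editor=ComponentEditor(),
                                show_label=False))

        def _plot_default(self):
            # Build a bar plot of a histogram of random data.
            counts, edges = np.histogram(np.random.randn(10000), bins=50)
            plot = Plot(ArrayPlotData(x=edges[:-1], y=counts))
            plot.plot(('x', 'y'), type='bar', bar_width=edges[1] - edges[0])
            return plot

    Histogram().configure_traits()
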
 recording release: yes license: CC BY-SA  
 Video: http://pyvideo.org/video/1222/interactive-visualization-widgets-using-chaco-and 
50. IPython: tools for the entire lifecycle of research computing
Fernando Perez, Brian Granger, Min Ragan-Kelley, Thomas Kluyver, Evan Patterson
tags: General
IPython started as a better interactive Python interpreter in 2001, but over
the last decade it has grown into a rich and powerful set of interlocking tools
aimed at enabling an efficient, fluid and productive workflow in the typical
use cases encountered by scientists in everyday research.

Today, IPython consists of a kernel executing user code and capable of
communicating with a variety of clients, using ZeroMQ for networking via a
well-documented protocol. This enables IPython to support, from a single
codebase, a rich variety of usage scenarios through user-facing applications
and an API for embedding:

*   An interactive, terminal-based shell with many capabilities far beyond the
    default Python interactive interpreter (this is the default application
    opened by the ``ipython`` command that most users are familiar with). 

*   A Qt console that provides the look and feel of a terminal, but adds
    support for inline figures, graphical calltips, a persistent session that
    can survive crashes of the kernel process, and more.

*   A web-based notebook that can execute code and also contain rich text and
    figures, mathematical equations and arbitrary HTML. This notebook presents
    a document-like view with cells where code is executed but that can be
    edited in-place, reordered, mixed with explanatory text and figures, etc.

*   A high-performance, low-latency system for parallel computing that supports
    the control of a cluster of IPython engines communicating over ZeroMQ, with
    optimizations that minimize unnecessary copying of large objects
    (especially numpy arrays). 

In this talk we will show how IPython supports all stages in the lifecycle of a
scientific idea: individual exploration, collaborative development, large-scale
production using parallel resources, publication and education.  In particular,
the IPython Notebook supports multiuser collaboration and allows scientists to
share their work in an open document format that is a true "executable paper":
notebooks can be version controlled, exported to HTML or PDF for publication,
and used for teaching.  We will demonstrate the key features of the system.
 recording release: yes license: CC BY-SA  
 Video: http://pyvideo.org/video/1221/ipython-tools-for-the-entire-lifecycle-of-resear 
51. Bokeh: An Extensible Implementation of the Grammar of Graphics for Python
Peter Wang, Hugo Shi
tags: Visualization
Bokeh is a new plotting framework for Python that natively understands the relationships in multidimensional datasets, uses a Protovis-like expression syntax scheme for creating novel visualizations, and is designed from the ground up to be used on the web.

Although it can be thought of as "ggplot for Python", the goals of Bokeh are much more ambitious.  The Grammar of Graphics primarily addresses the mapping of pre-built aesthetics and layouts to a particular data schema and tuples of measure variables.  It has limited facility for expressing data interactivity, and its small set of graph types (aka "geoms" or glyphs) is somewhat limited in both number and in the ways they can be combined with one another.

On the flip side, most existing Python plotting frameworks adopt a "tell me how" instead of a "tell me what" approach.  Thus, user plotting code can frequently become mired down in what amounts to details of the rendering system.

In our talk, we will show various features of Bokeh, and talk about future development.  We will also go into some detail about how Bokeh unifies the tasks of describing data mapping, building data-driven layout, and composing novel visualizations using a single, multi-purpose scene and data graph.
 recording release: yes license: CC BY-SA  
 Video: http://pyvideo.org/video/1224/bokeh-an-extensible-implementation-of-the-gramma 
52. Total Recall: flmake and the Quest for Reproducibility
Anthony Scopatz
tags: General
`FLASH`_ is a high-performance computing (HPC) multi-physics code which is used to perform astrophysical and high-energy density physics simulations.  It runs on the full range of systems from laptops to workstations to 100,000-processor supercomputers, such as the Blue Gene/P at Argonne National Laboratory.

Historically, FLASH was born from a collection of unconnected legacy codes written primarily in Fortran and merged into a single project.  Over the past 13 years major sections have been rewritten in other languages.  For instance, I/O is now implemented in C.  However building, testing, and documentation are all performed in Python.

FLASH has a unique architecture which compiles *simulation specific* executables for each new type of run.  This is aided by an object-oriented-esque inheritance model that is implemented by inspecting the file system's directory hierarchy.  This allows FLASH to compile to faster machine code than a compile-once strategy.  However it also places a greater importance on the Python build system.

To run a FLASH simulation, the user must go through three basic steps: setup, build, and execution.  Canonically, each of these tasks are independently handled by the user.  However, with the recent advent of `flmake`_ - a Python workflow management utility for FLASH - such tasks may now be performed in a repeatable way.

Previous workflow management tools have been written for FLASH.  (For example, the "Milad system" was implemented entirely in Makefiles.)  However, none of the prior attempts have placed reproducibility as their primary concern.  This is in part because fully capturing the setup metadata requires alterations to the build system.

The development of flmake started by rewriting the existing build system to allow FLASH to be run outside of the main line subversion repository.  It separates out project and simulation directories independent of the FLASH source directory.  These directories are typically under their own version control.

Moreover for each of the important tasks (setup, build, run, etc), a sidecar metadata *description* file is either written or appended to.  This is a simple dictionary-of-dictionaries JSON file which stores the environment of the system and the state of the code when each flmake command is run.  This metadata includes the version information of both the FLASH main line and project repositories.  However, it also may include *all* local modifications since the last commit.  A patch is automatically generated using the Python standard library ``difflib`` module and stored directly in the description.  
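A minimal sketch of the idea using only the standard library (the file names
and dictionary keys here are hypothetical, not flmake's actual schema)::

    import difflib
    import json
    import os

    # one sidecar entry per flmake command (setup, build, run, ...)
    desc = {"run": {"env": dict(os.environ),
                    "repo_version": "<revision id>"}}

    # capture all local modifications since the last commit as a patch
    old = open("flash.par.orig").readlines()
    new = open("flash.par").readlines()
    desc["run"]["patch"] = "".join(difflib.unified_diff(old, new))

    with open("run.desc.json", "w") as f:
        json.dump(desc, f, indent=2)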

Along with universally unique identifiers, logging, and Python run control files, the flmake utility may use the description files to fully reproduce a simulation by re-executing each command in its original environment and state.  While ``flmake reproduce`` makes a useful debugging tool, it fundamentally increases the scientific merit of FLASH simulations.  

The methods described above may be used whenever source code itself is distributed.  While this is true for FLASH (uncommon amongst compiled codes), most Python packages also distribute their source.  Therefore the same reproducibility strategy is applicable and highly recommended for Python simulation codes.  Thus flmake shows that reproducibility - which is notably absent from most computational science projects - is easily attainable using only version control and standard library modules.

.. _FLASH: http://flash.uchicago.edu/site/

.. _flmake: http://flash.uchicago.edu/site/flashcode/user_support/tools4b/usersguide/flmake/index.htm
 recording release: yes license: CC BY-SA  
 Video: http://pyvideo.org/video/1223/total-recall-flmake-and-the-quest-for-reproducib 
53. Poster Session
None
tags: Plenary
None
 recording release: yes license: None  

54. nD image segmentation using learned region agglomeration with the Ray Python library
Juan Nunez-Iglesias
tags: General
One of the principal goals of the Janelia Farm Research Campus is the
reconstruction of complete neuronal circuits. This involves 3D
electron-microscopy (EM) volumes many microns across with better than
10nm resolution, resulting in gigavoxel scale images.  From these,
individual neurons must be segmented out. Although image segmentation
is a well-studied problem, these data present unique challenges in
addition to scale: neurons have an elongated, irregular branching
structure (with processes as thin as 50nm yet hundreds of micrometers
long); one neuron looks much like the next, with only a thin cellular
boundary separating densely packed neurons; and internal neuronal
structures can look similar to the cellular boundary. The first problem
in particular means that small errors in segment boundary predictions
can lead to large errors in neuron shape and neuronal network
connectivity.

Our segmentation workflow has three main steps: a voxelwise edge
classification, a fine-grained segmentation into supervoxels (which
can reasonably be assumed to be atomic groups of voxels), and
hierarchical region agglomeration.

For the first step, we use Ilastik, a pixel-level interactive learning
program.  Ilastik uses the output of various image filters as features
to classify voxels as labeled by the user. We then use the watershed
algorithm on the resulting edge probability map to obtain supervoxels.
For the last step, we developed a new machine learning algorithm
(Nunez-Iglesias et al, in preparation).

Prior work has used the mean voxel-level edge-probability along the
boundaries between regions to agglomerate them. This strategy works
extremely well because boundaries get longer as agglomeration proceeds,
resulting in ever-improving estimates of the mean probability. We
hypothesized that we could improve agglomeration accuracy by using a
classifier (which can use many more features than the mean). However, a
classifier can perform poorly because throughout agglomeration we may
visit a part of the feature space that has not yet been sampled. In our
approach, we use active learning to ensure that we have examples from
all parts of the space we are likely to encounter.
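To make the baseline concrete, here is a toy version of the mean-probability
score in plain numpy (array names are illustrative, not Ray's API)::

    import numpy as np

    # voxelwise probability that each voxel lies on a cell boundary
    edge_prob = np.random.rand(64, 64, 64)

    # coordinates of the voxels on the boundary between regions A and B
    boundary = np.array([[10, 4, 7], [10, 5, 7], [11, 5, 7]])

    # classical agglomeration merges the pair of regions with the
    # lowest mean boundary probability first
    score = edge_prob[tuple(boundary.T)].mean()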

We implemented our algorithm in arbitrary dimensions in an open-source,
MIT-licensed Python library, Ray (https://github.com/jni/ray). Ray
combines leading scientific computing Python libraries, including
NumPy, SciPy, NetworkX, and scikit-learn, to deliver state-of-the-art
segmentation accuracy in Python.

 recording release: yes license: CC BY-SA  
 Video: http://pyvideo.org/video/1225/nd-image-segmentation-using-learned-region-agglom 
55. luban: a minimalist UI 'language'
Jiao Lin
tags: Visualization
Luban (http://lubanui.org) is a Python package for building user interfaces.
With Luban, one can easily create dynamic, AJAX-based web
interfaces that behave like desktop UIs using pure Python:
no knowledge of HTML or JavaScript is required.

Luban differs from existing web frameworks in philosophy:
it provides a generic specification "language" for describing user interfaces,
and a Luban specification of a user interface can be
automatically rendered into web or native user interfaces
using media-specific languages.

Luban focuses on providing a simple, easy-to-understand syntax for
describing user interfaces, and hence allows users to concentrate
on the business logic behind those interfaces.

In this talk I will discuss recent developments of luban and some of its applications.
 recording release: yes license: CC BY-SA  
 Video: http://pyvideo.org/video/1226/luban-a-minimalist-ui-language 
56. PythonTeX: Fast Access to Python from within LaTeX
Geoffrey M. Poore
tags: General
The LaTeX document preparation system is frequently used to create scientific documents and presentations.  This process is often inefficient.  The user must switch back and forth between the document and external scientific software that is used for performing calculations and creating figures.  PythonTeX_ is a LaTeX package that allows Python code to be entered directly within a LaTeX document.  The code is automatically executed and its output is included within the original document.  The code may also be typeset within the document with syntax highlighting provided by Pygments.

.. _PythonTeX: https://github.com/gpoore/pythontex

PythonTeX is fast and user-friendly.  Python code is separated into user-defined sessions, and each session is only executed when its code is modified.  When code is executed, sessions run in parallel.  The contents of stdout and stderr are synchronized with the LaTeX document, so that printed content is easily accessible and error messages have meaningful line numbering.

PythonTeX greatly simplifies scientific document creation with LaTeX.  For example, SymPy can be used to automatically solve and typeset step-by-step mathematical derivations.  It can also be used to automate the creation of mathematical tables.  Plots can be created with matplotlib and then easily customized in place.  Python code and its output can be typeset side by side.  The full power of Python is conveniently available for programming LaTeX macros and customizing and automating LaTeX documents.
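A minimal sketch of what this looks like in practice (the ``pycode``
environment and ``\py`` macro are PythonTeX's; the SymPy computation is just
an example)::

    \documentclass{article}
    \usepackage{pythontex}
    \begin{document}
    \begin{pycode}
    from sympy import symbols, integrate, latex
    x = symbols('x')
    antideriv = latex(integrate(x**2, x))
    \end{pycode}
    An antiderivative of $x^2$ is $\py{antideriv}$.
    \end{document}

Compiling interleaves LaTeX and PythonTeX runs (latex, then the ``pythontex``
script, then latex again), after which the computed result appears in the
final document.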
 recording release: yes license: CC BY-SA  
 Video: http://pyvideo.org/video/1227/pythontex-fast-access-to-python-from-within-late 
57. Python's role in VisIt
Cyrus Harrison, Harinarayan Krishnan
tags: Visualization
VisIt is an open source, turnkey application for scientific data analysis and visualization that runs on a wide variety of platforms from desktops to petascale class supercomputers. This talk will provide an overview of Python’s role in VisIt with a focus on use cases of scripted rendering, data analysis, and custom application development. 

Python is the foundation of VisIt’s primary scripting interface, which is available from both a standard python interpreter and a custom command line client.  The interface provides access to all features available through VisIt’s GUI. It also includes support for macro recording of GUI actions to python snippets and full control of windowless batch processing.
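A minimal sketch of a scripted, windowless rendering session (the database
and variable names are placeholders; ``OpenDatabase``, ``AddPlot``,
``DrawPlots``, and ``SaveWindow`` are functions the VisIt CLI injects into
the Python namespace)::

    # run with: visit -cli -nowin -s render.py
    OpenDatabase("example.silo")
    AddPlot("Pseudocolor", "temperature")   # plot type, variable name
    DrawPlots()
    SaveWindow()                            # write the rendered image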

While Python has always played an important scripting role in VisIt, two recent development efforts have greatly expanded VisIt’s python capabilities:

1)  We recently enhanced VisIt by embedding python interpreters into our data flow network pipelines. This provides fine grained access, allowing users to write custom algorithms in python that manipulate mesh data via VTK’s python wrappers and leverage packages such as numpy and scipy. Current support includes the ability to create derived mesh quantities and execute data summarization operations.

2) We now support custom GUI development using Qt via PySide. This allows users to embed VisIt’s visualization windows into their own python applications. This provides a path to extend VisIt’s existing GUI and for rapid development of streamlined GUIs for specific use cases.

The ultimate goal of this work is to evolve Python into a true peer to our core C++ plugin infrastructure.

This work performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344 (LLNL-ABS-552316).
 recording release: yes license: CC BY-SA  
 Video: http://pyvideo.org/video/1228/pythons-role-in-visit 
58. Lunch
None
tags: ---
None
 recording release: yes license: None  

59. Forking your way to success and happiness: how GitHub style collaboration is ushering in a new era of amateur led innovation.
Tim Clem
tags: Plenary
Come hear about the tools, technology, corporate structure, and ethos
that lets GitHub use GitHub to build GitHub. From a couple of guys in
a coffee shop to almost 100 employees, millions of users, and massive
open source projects that are powering businesses around the world,
it's been a bit of a wild ride. Hear about some lessons learned and
challenges we've faced: things we've done right and others that didn't
work out so well. Learn a little bit about our growing technology
stack and how we design and deploy features. Get some insight into why
we still have no managers and how everyone decides what to work on.
Finally, hear about how open source has shaped the company and our
vision of 'open' in everything from hardware to politics to education
and science. The social web is old news, but the collaborative web is
just in its infancy and GitHub sees that as a very bright future.
 recording release: yes license: CC BY-SA  
 Video: http://pyvideo.org/video/1229/forking-your-way-to-success-and-happiness-how-gi 
60. Param: Declarative programming using Parameters
James A. Bednar, Christopher E. Ball
tags: General
As a scientific Python application grows, it can be increasingly
difficult to use and maintain, because of implicit assumptions made
when writing each component.  Users can pass any possible data type
for any argument, so code either fills up with assertions and tests to
see what type of data has been supplied, or else has undefined
behavior for some datatypes or values.  Once software is exchanged
with other users, obscure error messages or even incorrect results are
the likely outcome.  Programming languages that require types to be
declared alleviate some of these issues, but are inflexible and
difficult to use, both in general and when specifying details of types
(such as ranges of allowed values).  Luckily, Python metaobjects make
it possible to extend the Python language to offer flexible
declarative typing, offering the best of both worlds.

The Param module provides a clean, low-dependency, pure-Python
implementation of declarative parameters for Python objects and
functions, allowing library and program developers to specify
precisely what types of arguments or values are allowed.  A Parameter
is a special type of class attribute that supports type declarations
(based on subtypes of a specified class, support for specified methods
(duck typing), or any other criterion that can be tested), ranges,
bounds, units, constant values, and enumerations.  A Parameter has a
docstring (visible at the command line or in generated documentation),
inherits its default value, documentation, etc. along the class
hierarchy, and can be set to dynamic values that generate a stream of
numbers for use in controlling scientific code.  In essence, a
Parameter is a Python attribute extended to support clean, simple,
robust, maintainable, and declarative scientific programming.
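A minimal sketch of the declared-parameter style (the class and parameter
names are invented for illustration)::

    import param

    class Simulation(param.Parameterized):
        timestep = param.Number(default=0.1, bounds=(0, None),
                                doc="Integration timestep in seconds.")

    s = Simulation()
    s.timestep = 0.05    # accepted
    s.timestep = -1.0    # rejected: violates the declared bounds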

Param has been under continuous development and use since 2002 as part
of the Topographica simulator (topographica.org), but is now being
released as a separate package due to demand from users who want
similar functionality in their own code.  Param is very similar in
spirit to the Enthought Traits library, despite having been developed
independently, and offers much of the same functionality.  Param is
particularly useful for people who find that Traits is difficult to
integrate into their work flow, since it consists of only two pure
Python files with no dependencies outside the standard library.  Param
is also useful for people building Tk applications, and provides an
optional Tk property-sheet interface that can automatically generate a
GUI window for viewing and editing an object's Parameters.


Param is freely available under a BSD license from:
http://ioam.github.com/param/

 recording release: yes license: CC BY-SA  
 Video: http://pyvideo.org/video/1230/param-declarative-programming-using-parameters 
61. Enaml - A Framework for Building Declarative User Interfaces
S. Chris Colbert
tags: Visualization
Overview
--------
Enaml is a new domain specific declarative language for specifying user 
interfaces in Python applications. Its syntax, a strict superset of the
Python language, provides a clean and compact representation of UI 
layout and styling, and uses dynamic expressions to bind a view's logic 
with an application's underlying computational model.
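For flavor, a minimal sketch of the declarative syntax (this is an
``.enaml`` file; the import path follows later Enaml releases, and the
widgets shown are illustrative)::

    from enaml.widgets.api import Window, Container, PushButton

    enamldef Main(Window):
        title = 'Hello Enaml'
        Container:
            PushButton:
                text = 'Click me'
                clicked :: print('button pressed')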

Design Goals
------------
A number of considerations were given during the design of Enaml with the
ultimate goal being the creation of a dynamic UI framework that has a low
barrier of entry and can scale in complexity and capability according to 
the needs of the developer.

**Influence** Enaml improves upon existing technologies and ideas for specifying 
user interfaces. Much of Enaml's inspiration comes from Qt's QML, a declarative 
UI language derived from ECMAScript and designed specifically for developing 
mobile applications with the Qt toolkit. In contrast, Enaml is designed for the
development of scientific and enterprise level applications, and makes use of 
a Python derived syntax and standard desktop-style widget elements. For layout,
Enaml raises the bar by providing a system based on symbolic constraints. The 
underlying technology is the same that powers the Cocoa Auto Layout system in 
OS X 10.7; in Enaml, however, the constraints are exposed in a friendly, 
Pythonic fashion.

**Toolkit Independence** In large projects, the costs of changing infrastructure 
can be extremely high. Instead of forcing an application to be tied to a single 
underlying toolkit, Enaml is designed to be completely toolkit agnostic. This 
decoupling provides the benefit of being able to migrate an entire project 
from one gui library to another by changing only a single line of code or 
setting an environment variable. Enaml currently supports both Qt (via PySide 
or PyQt4) and WxPython backends, with plans for HTML5 in the future. The 
authoring of new toolkit backends has been designed to be a simple affair. 
Adding new or custom widgets to an existing toolkit is trivial.

**Extensibility** A good framework should be usable by a wide variety of 
audiences and should be able to adapt to work with technologies not yet 
invented. Enaml can provide the UI layer for any Python application, with few 
limitations placed on the architecture of the underlying computational model. 
While Enaml understands Enthought's Traits based models by default, it provides 
simple hooks that the developer can use to extend its functionality to any 
model architecture that provides some form of notification mechanism. 
Possibilities include, but are not limited to, models built upon databases, 
sockets, and pub-sub mechanisms.

**Continuity** No matter how easy it is to get started with a new framework, it 
will not be adopted if the cost of switching is exceedingly high. Enaml is 
positioned to become the next generation of TraitsUI, the user interface layer 
of the Traits library. Enaml can both include existing TraitsUI views in an 
application as well as itself be embedded within a TraitsUI. Enaml also 
interacts seamlessly with the Chaco plotting library, allowing easy integration 
of interactive graphics. Enaml cleanly exposes the toolkit specific objects 
that it manages, allowing a user with a large amount of toolkit specific code 
to continue to use that code with little or no changes. This provides a path 
forward for both TraitsUI and non-TraitsUI applications.
 recording release: yes license: CC BY-SA  
 Video: http://pyvideo.org/video/1231/enaml-a-framework-for-building-declarative-user 
62. Blaine
Stephen McQuay, William Blattman
tags: Visualization
(Needs description.)
 recording release: maybe  
 Video: http://pyvideo.org/video/1233/surface-subdivision-schemes-for-python 
63. Object Oriented Finite Elements at NIST
Andrew Reid
tags: General
The Object Oriented Finite-Element project at NIST is a Python and C++
tool designed to bring sophisticated numerical modeling capabilities to
users in the field of Materials Science.  The software provides numerous
tools for constructing finite-element meshes from microstructural
images, and for implementing material properties from a very broad class
which includes elasticity, chemical and thermal diffusion, and
electrostatics.

The current series of releases has a robust interface for defining new
nonlinear properties, and provides both first and second order
time-dependence in the equations of motion.

The development team is currently working on a fully-3D version of the
code, as well as expanding the scope of available properties to include
surface interactions, such as surface tension and chemical reactions,
and inequality constraints, such as arise in mechanical surface contact
and plasticity.

The software is a hybrid of Python and C++ code, with the high level
user interface and control code in Python, and the heavy numeric work
being done in C++.  The software can be operated either as an
interactive, GUI-driven application, as a scripted command-line tool, or
as a supporting library, providing useful access to users of varying
levels of expertise.  At every level, the user-interface objects are
intended to be familiar to the materials-science user.

This presentation will focus on an interesting example of a nonlinear
property, called Ramberg-Osgood elasticity, and the process for
incorporating this feature into the OOF architecture.
 recording release: yes license: CC BY-SA  
 Video: http://pyvideo.org/video/1232/object-oriented-finite-elements-at-nist 
64. Mid-afternoon Break
None
tags: ---
None
 recording release: yes license: None  

65. QuTiP: An open-source Python framework for the dynamics of open quantum systems
Paul Nation, Robert Johansson
tags: HPC
We present QuTiP, an object-oriented open-source framework for solving the dynamics of open quantum systems. The QuTiP framework is written in a combination of Python and Cython, and uses SciPy, NumPy and matplotlib to provide an environment for computational quantum mechanics that is easy and efficient to use. Arbitrary quantum systems, including time-dependent systems, may be built up from operators and states defined by a quantum object class, and then passed on to a choice of unitary and dissipative evolution solvers. We give an overview of the basic structure for the framework and the techniques used in its implementation. We also present a few selected examples from contemporary research on quantum mechanics that illustrate the strengths of the framework, and the types of calculation that can be performed. The framework described here is particularly well suited to the fields of quantum optics, superconducting circuit devices, nanomechanics, and trapped ions, while also being ideal as an educational tool.
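As a flavor of the API, a minimal sketch of a damped harmonic oscillator
(a standard textbook example; the solver names follow the current QuTiP
release)::

    import numpy as np
    from qutip import basis, destroy, mesolve

    N = 10                      # truncated Hilbert-space dimension
    a = destroy(N)              # annihilation operator
    H = a.dag() * a             # oscillator Hamiltonian (hbar*omega = 1)
    psi0 = basis(N, 1)          # start with a single excitation
    times = np.linspace(0.0, 10.0, 100)

    # dissipative evolution with one collapse operator, tracking <n>
    result = mesolve(H, psi0, times, [np.sqrt(0.1) * a], [a.dag() * a])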

For more information see http://qutip.googlecode.com.
 recording release: yes license: CC BY-SA  
 Video: http://pyvideo.org/video/1347/qutip-an-open-source-python-framework-for-the-dy 
66. Fcm - A python library for flow cytometry
Jacob Frelinger
tags: Computational Bioinformatics
Cellular populations in biology are often heterogeneous, and aggregate assays
such as expression arrays can obscure the small differences between these
populations. Examples where these differences can be highly significant include
the identification of antigen-specific immune cells, stem cells and circulating
cancer cells. As the frequency of such cells in the blood can be vanishingly
small, assays to detect signals at the single cell level are essential. Flow
cytometry is probably the best established single cell assay, and has been an
integral tool in immunology and biology for decades, able to measure cellular
marker levels for individual cells, as well as population statistics over
millions of cells.

Recent technological innovations in flow cytometry have increased the number of
cell markers capable of being resolved simultaneously, and visual analysis
(gating) is difficult and error prone with increasing data dimensionality.
Hence there is increasing demand for tools to automate the analysis and
management of flow data, so as to increase accuracy and reproducibility.
However, essentially all software used by flow cytometry laboratories is
commercial and based on the visual analysis paradigm. With the exception of the
R BioConductor project, we are not aware of any other full-featured open source
tools for analyzing flow data. The few open source flow software modules that
exist simply extracts data from FCS (flow cytometry standard) files into
tabular/csv format, losing all metadata associated with the file, and provide
no additional tools for analysis. We therefore decided to develop the *fcm*
library in python that would provide a foundation for flow cytometry data
management and analysis.

The *fcm* library provides functions to load fcs files, apply spectral
compensation, and perform standard log and log-like transforms for
visualization.  The library also provides objects and methods for traditional
gating-based analysis, including standard polygon, threshold, interval, and
quadrant gates.  Using *fcm* and other common python libraries, one can quickly
write scripts for doing large scale batch analysis.  In addition to
gating-based analysis, *fcm* provides methods to do model-based analysis,
utilizing GPU-optimized statistical models to identify cell subsets. These
statistical models provide a data-driven way to construct generative
probability models that scale well with the increasing dimensionality of flow
data and do not require expert input to identify cell subsets.  High
performance computational routines to fit statistical models are optimized
using cython and pycuda. More specialized tools for the analysis of flow data
include the use of a novel information measure to optimize reagent panels and
analysis strategies, and optimization methods for automatic determination of
positivity thresholds.
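A minimal sketch of basic use (hedged: ``loadFCS`` follows the *fcm*
documentation of the time; treat the exact names and the array conversion
as assumptions if your version differs)::

    import fcm
    import numpy as np

    data = fcm.loadFCS('sample.fcs')   # load events plus FCS metadata
    arr = np.asarray(data)             # events-by-channels numpy array
    print(arr.shape)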

We are currently using the *fcm* library for the analysis of tetramer assays for
cancer immunotherapy, as well as intracellular expression of effector molecules
in the NIAID-sponsored External Quality Assurance Policy Oversight Laboratory
(EQAPOL) program to standardize flow cytometry assays in HIV studies. An
illustrative example is the use of *fcm* in building a pipeline for the
Cytostream application to automate the analysis of 459 FCS files from 12
laboratories, reducing the analysis time from one month to a single evening.
 recording release: yes license: CC BY-SA  
 Video: http://pyvideo.org/video/1234/fcm-a-python-library-for-flow-cytometry 
67. nmrglue: a Python Module for Working with NMR Data.
Jonathan J. Helmus
tags: Computational Bioinformatics
Nuclear magnetic resonance (NMR) spectroscopy is a key analytical technique
in the biomedical field, finding uses in drug discovery, metabolomics, and
imaging as well as being the primary method for the determination of the
structures of biological macromolecules in solution.   In the course of a
modern NMR structural or dynamic study of proteins and other biomolecules,
experiments typically generate multiple gigabytes of 2D, 3D and even 4D data
sets which must be collected, processed, analyzed, and visualized to extract
useful information.  The field has developed a number of software products to
perform these functions, but few software suites exist that can perform all
of the tasks which a typical scientist requires.  For example, it is not
uncommon for NMR data to be collected using software provided by the
spectrometer vendor, processed and visualized using software from the NIH,
and analyzed using software from a University, collaborator or developed in
house.  Complicating this process is the lack of a standard format for
storing NMR data; each software program typically uses its own format for
data storage.

nmrglue is an open source Python module for working with NMR data which acts
as the "glue" to tie together existing NMR programs, and can be used to
rapidly develop new NMR processing, analysis or visualization methods.  With
nmrglue, spectral data from a number of common NMR file formats can be
accessed as numpy arrays.  This data can be sliced, rearranged or modified as
needed and written out to any of the supported file formats for later use in
existing NMR software programs.  In this way, nmrglue can act as the "glue"
to tie together NMR workflows which employ existing NMR software.
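A minimal sketch of the conversion workflow (file names are placeholders)::

    import nmrglue as ng

    # read an NMRPipe spectrum; `data` is an ordinary numpy array
    dic, data = ng.pipe.read("test.ft2")

    # convert and write out in Sparky format for use in other software
    C = ng.convert.converter()
    C.from_pipe(dic, data)
    ng.sparky.write("test.ucsf", *C.to_sparky())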

In addition, nmrglue can be used in conjunction with other scientific python
libraries to rapidly test, prototype, and develop new methods for processing,
analyzing, and visualizing NMR data.  The nmrglue package provides a number
of common NMR processing functions, as well as implementation of scientific
routines which may be of interest to other Python projects including peak
pickers, multidimensional lineshape fitting routines, linear prediction
functions, and a bounded least squares optimization.  These functions
together, with the ability to read, write and convert between a number of
common file formats, allow developers to harness nmrglue for established
routines while focusing on the novel portion of the new method being created.
In addition, the numerical routines in numpy and scipy can be used to
further speed this process.  If these packages are used with the IPython
shell and matplotlib, a robust, interpreted environment for exploring and
visualizing NMR data can be created using only open source software.

nmrglue is distributed under the New BSD license. Documentation, tutorials,
examples, and downloadable install files and source code are available at
http://code.google.com/p/nmrglue/. Despite limited exposure in the
scientific field, nmrglue is already used in a number of university research
labs and portions of the package have been adapted for use in VeSPA, a
software suite for magnetic resonance spectroscopy.
 recording release: yes license: CC BY-SA  
 Video: http://pyvideo.org/video/1236/nmrglue-a-python-module-for-working-with-nmr-dat 
68. TuLiP: a toolbox for hybrid and reactive systems research
Scott C. Livingston, Richard M. Murray
tags: General
=========================================================
TuLiP: a toolbox for hybrid and reactive systems research
=========================================================

:Author: Scott C. Livingston  and Richard M. Murray 
:Affiliation: California Institute of Technology

We present a toolbox for the creation and study of controllers for
hybrid systems. It contains modules for

 - working with n-dimensional polytopes (see the sketch after this list),

 - refining continuous state space partitions to satisfy reachability
   properties,

 - synthesizing, manipulating, and visualizing finite automata as
   winning strategies for a class of temporal logic-based games,

 - simulating hybrid executions, and

 - reading and writing problem solutions to an XML format.
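For instance, a polytope is represented as the solution set of a system of
linear inequalities (a minimal sketch; the module was later released
standalone as the ``polytope`` package, whose import path is used here)::

    import numpy as np
    import polytope as pc

    # the unit square as {x | A x <= b}
    A = np.array([[ 1.0,  0.0],
                  [-1.0,  0.0],
                  [ 0.0,  1.0],
                  [ 0.0, -1.0]])
    b = np.array([1.0, 0.0, 1.0, 0.0])
    square = pc.Polytope(A, b)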

The toolbox is named TuLiP (for "Temporal Logic Planning") and written
almost entirely in Python, making critical use of NumPy, SciPy,
CVXOPT, and matplotlib. While software for hybrid systems research is
commonly written in Matlab scripts or otherwise requires the end-user
to build from source for her particular platform, TuLiP requires
neither. For a standard scientific Python environment, the only
additional library may be CVXOPT. Code (re)use and experimentation are
easy, and because of this, TuLiP has provided a natural basis for
further research and development.

Source code and documentation are currently available at
http://tulip-control.sourceforge.net

In this talk we will describe the problem domain addressed by TuLiP,
various use cases, and lessons learned in the Python implementation.
We shall include a full example making use of all components and show
ways that individual modules are useful more broadly. Major items of
the talk will be

 1. related work, and the paucity of Python use in hybrid control
    research, which we argue is a matter of inheritance rather than
    best practices;

 2. overview of the type of hybrid systems represented in TuLiP and
    relevance to other fields;

 3. summary of the major steps going from problem statement to solution;

 4. using only the "polytope computations" module;

 5. using only "discrete reactive synthesis" related modules, with a
    brief description about temporal logic synthesis to provide
    background for those not working on computer aided verification;

 6. snippets about recent research using and building on TuLiP; and

 7. discussion about the Python-based implementation and lessons learned.

For the last item, we will describe challenges faced while developing
TuLiP, given its role of "stitching together" several external tools,
e.g., Gephi for large graph visualization and
gr1c for game solving. We will also
touch on liberation from a Matlab-only tool (the Multi-Parametric Toolbox;
see http://control.ee.ethz.ch/~mpt/), achieved by creating our own
Python module for working with polytopes, using NumPy and CVXOPT for
computations and matplotlib for visualization.


A tool paper describing an earlier version of TuLiP was presented at
the conference Hybrid Systems: Computation and Control (HSCC) in April 2011.
There have since been substantial additions and improvements.
Furthermore, a broader audience can be reached at SciPy 2012, with a new
opportunity to address design issues likely shared by other scientific
Python developers.

Development of TuLiP has been supported in part by the AFOSR through
the MURI program, the Multiscale Systems Center (MuSyC) and the Boeing
Company.
 recording release: yes license: CC BY-SA  
 Video: http://pyvideo.org/video/1235/tulip-a-toolbox-for-hybrid-and-reactive-systems 
69. Lightning Talks - Thursday

tags: Plenary
1. Scalable Python, Travis Oliphant.
2. Big Data in the Cloud with Python, Chris Cope.
3. CMake and Cython, Matt McCormick.
4. Psychometric Python, Mark Moulton.
5. Evolutionary Comp. in Python, Alan Lockett.
6. Generative Art with Neural Networks, Byron Galbraith.
7. Cellulose Based Serialization, Matt Terry.
8. NumFocus, Fernando Perez.
9. Software Carpentry, Matt Davis.

 recording release: yes license: CC BY-SA  
 Video: http://pyvideo.org/video/1348/lightning-talks-thursday 
70. Utilizing Python in a Real-Time, Quasi-Operational Meteorological Environment
Patrick Marsh
tags: Meteorology Mini-Symposia
The National Oceanic and Atmospheric Administration's (NOAA) Hazardous
Weather Testbed (HWT) is a facility jointly managed by NOAA's National
Severe Storms Laboratory (NSSL), the NOAA National Weather Service's (NWS)
Storm Prediction Center (SPC), and the NOAA NWS Oklahoma
City/Norman Weather Forecast Office (OUN) within the National Weather
Center building on the University of Oklahoma South Research Campus.
The HWT is designed to accelerate the transition of promising new
meteorological insights and technologies into advances in forecasting
and warning for hazardous weather events throughout the United States.
The HWT facilities include a combined forecast and research area
situated between the operations rooms of the SPC and OUN, and a nearby
development laboratory. The facilities support enhanced collaboration
between research scientists and operational weather forecasters on
specific topics that are of mutual interest.

The cornerstone of the HWT is the yearly Experimental Forecast Program
(EFP) and Experimental Warning Program (EWP) which take place every
spring. In each of those programs, forecasters, researchers, and
developers come together to participate in a real-time operational
forecasting or warning environment with the purpose of testing and
evaluating cutting-edge tools and methods for forecasting and warning.
In the EFP program, between 5 and 10 TB of meteorological data are
processed for evaluation over the course of a 5-week period. These
data come from a variety of sources in a variety of formats, each
requiring different processing.

This talk will discuss how the data flow and data creation processes
of the EFP are accomplished in a real-time setting through the use of
Python. The utilization of Python ranges from simple shell scripting,
to speeding up algorithm development (and runtimes) with Numpy and
Cython, to creating new, open source data-visualization platforms,
such as the Skew-T and Hodograph Analysis and Research Program in
Python, or SHARPpy.
 recording release: yes license: CC BY-SA  
 Video: http://pyvideo.org/video/1242/utilizing-python-in-a-real-time-quasi-operationa 
71. Python @ Life
Daniel Williams
tags: Bioinformatics Mini-Symposia
Life Technologies relies heavily on Python for product development. Here we present examples of using Python with the Numpy/SciPy/Matplotlib stack at Life Technologies for sequencing analysis, Bayesian estimation, mRNA complexity study, and customer survey analysis. We also display our use of Django for developing scientific web tools in Python. These applications, taken together, demonstrate scientific Python’s vital position in Life Technologies’ tool chain.
 recording release: yes license: CC BY-SA  
 Video: http://pyvideo.org/video/1238/python-life 
72. How Sequencing Saved Python
Chris Mueller
tags: Bioinformatics Mini-Symposia
(Needs description.) 
 recording release: yes license: CC BY-SA  
 Video: http://pyvideo.org/video/1240/how-sequencing-saved-python 
73. Py-ART: Python for remote sensing science
Scott Collis
tags: Meteorology Mini-Symposia
The drive to publish often leaves scientists working with old, inflexible, poorly documented, dead-end software. Even operational systems can end up being a mash of legacy systems cobbled together. As the Atmospheric Radiation Measurement (ARM) Climate Facility brings its 30+ cloud- and precipitation-sensitive radars into operation, a concerted effort is underway to modernize, modularize, and adapt existing code, and to write new code, to retrieve geophysical parameters from the remotely sensed signals. Due to its open nature, active development community, and lack of licensing issues, Python is a natural choice of development environment. This presentation will outline the challenges involved in retrieving model-comparable geophysical parameters from scanning weather radars, introduce the framework behind the Python ARM Radar Toolkit (Py-ART), and discuss the challenges involved in building high-performance code while maintaining portability, readability, and ease of use.

 recording release: yes license: CC BY-SA  
 Video: http://pyvideo.org/video/1239/py-art-python-for-remote-sensing-science 
74. Domain Analysis of Mosaic Proteins in Purple Sea Urchin
Adam Hughes
tags: Bioinformatics Mini-Symposia
Purple sea urchins (Strongylocentrotus purpuratus or Sp) are invertebrates that share more than 7,000 genes with humans, more than other common model invertebrate organisms like fruit flies and worms.  In addition, the innate immune system of sea urchins demonstrates unprecedented complexity.  These factors make the sea urchin a very interesting organism for investigations of immunology.  Of particular interest is the set of proteins in Sp that contain C-type lectin (CLECT) domains, a functional region of the protein that recognizes sugars.  Proteins containing CLECTs may be particularly important to immune system robustness because of sugars that are present on pathogens.

The primary goals of this research project are first to identify all the CLECT-containing proteins in the Sp genome, and then to predict their function based on similarity to characterized proteins in other species (protein homology or similarity).  The latter goal is particularly challenging and requires new and creative analysis methods.  

From an informational viewpoint, proteins are represented by a unique sequence of letters, each letter corresponding to an amino acid.  For example G-A-V indicates the sequence glycine, alanine and valine.  Commonality between proteins is usually measured by sequence alignments; that is, by directly comparing the sequence of letters between two proteins.  Algorithms and tools for these alignments are among the most standardized and available tools in bioinformatics.
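As an illustration of how standardized these tools are, a pairwise global
alignment takes only a few lines (a sketch using Biopython's ``pairwise2``,
which is an assumption here; the project's own scripts may use other tools)::

    from Bio import pairwise2

    # globally align two short amino-acid fragments, scoring 1 per match
    alignments = pairwise2.align.globalxx("GAVGAV", "GAVAV")
    print(pairwise2.format_alignment(*alignments[0]))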

Sequence similarity between homologous proteins can degrade over long evolutionary timescales.  This is in part because some mutations at the sequence level can occur without compromising a protein's overall function.  This is akin to the evolution of a language, e.g. modern English and Middle English, which initially appear to be separate languages due to spelling differences.  Because domains are regions of a protein which can function semi-independently, they are less able to accommodate mutations.  By comparing proteins based on the ordering of their domains, or their ``domain architecture'', it becomes possible to identify homology through similarities in domain order, even after extensive evolution.

Alignment tools based on domain architecture are promising, but are still in their infancy.  Consequently, very few researchers utilize both sequence and domain alignment methodologies corroboratively.  Using Python scripts in tandem with various web tools and databases, we have identified the top alignment candidates for the CLECT-containing Sp proteins using both methods.  With the help of the Enthought Tool Suite, we have created a simple visualization tool that allows users to examine the sequence alignments side-by-side with two types of domain alignments.  The information provided by these three results together is much more informative with respect to predicting protein function than any single method alone.  Finally, we have developed a systematic set of heuristic rules to allow users to make objective comparisons among the three sets of results.  The results can later be parsed using Python scripts to make quantitative and qualitative assessments of the dataset.  We believe that these new comparison and visualization techniques will apply in general to computational proteomics.

 recording release: yes license: CC BY-SA  
 Video: http://pyvideo.org/video/1241/domain-analysis-of-mosaic-proteins-in-purple-sea 
75. Can Python's web and science communities run concurrently?
Eric Bruning
tags: Meteorology Mini-Symposia
Python has been adopted by many disciplinary communities, showing its adaptability to many problems. Scientific computing and web development are two examples of such communities. These might, at first glance, seem to share few common interests, especially at the level of algorithms and libraries. However, at the level of integrated practice in time-constrained academic environments, where framework development is less valued than research and teaching productivity, ease of adoption of tools from each of these communities can be tremendously valuable.

Using examples from the recently-deployed West Texas Lightning Mapping Array, which is processed and visualized in real time, this paper will argue that a shared sense among disciplinary communities of how one deploys Python for specific problems is beneficial for the continuation and growth of Python's status as a go-to language for practitioners in academic settings.

 recording release: yes license: CC BY-SA  
 Video: http://pyvideo.org/video/1237/can-pythons-web-and-science-communities-run-conc 
76. Running a Coupled General Circulation Model with Python
Luiz Irber
tags: Meteorology Mini-Symposia
(Needs description.) 
 recording release: yes license:   
 Video: http://pyvideo.org/video/1349/running-a-coupled-general-circulation-model-with 
77. Discussion

tags: Bioinformatics Mini-Symposia
None
 recording release: yes license: None  

78. Discussion

tags: Bioinformatics Mini-Symposia
None
 recording release: yes license: None  

79. Discussion

tags: Meteorology Mini-Symposia
None
 recording release: yes license: None  



Location
--------
105


About the group
---------------