Name:
Data Mining and Processing for fun and profit
--client pyconza --show pyconza2016 --room tugela_room 11495 --force
Next: 1 Python-assisted creative writing: managing dynamic gender in RPG scenarios
Marks
Author(s):
Reuben Cummings
Location
Tugela Room
Date
Fri Oct 07
Days Raw Files
Start
13:30
First Raw Start
error-in-template
Duration
01:30:00
Offset
None
End
15:00
Last Raw End
Chapters
Total cuts_time
None min.
https://za.pycon.org/talks/38/
raw-playlist
raw-mp4-playlist
encoded-files-playlist
mp4
svg
png
assets
release.pdf
Data_Mining_and_Processing_for_fun_and_profit.json
logs
Admin:
episode
episode list
cut list
raw files day
marks day
image_files
State:
---------
borked
edit
encode
push to queue
post
richard
review 1
email
review 2
make public
tweet
to-mirror
conf
done
Locked:
clear this to unlock
Locked by:
user/process that locked.
Start:
initially scheduled time from master, adjusted to match reality
Duration:
length in hh:mm:ss
Name:
Video Title (shows in video search results)
Emails:
email(s) of the presenter(s)
Released:
Unknown
Yes
No
has someone authorised publication
Normalise:
Channelcopy:
m=mono, 01=copy left to right, 10=right to left, 00=ignore.
Thumbnail:
filename.png
Description:
# AUDIENCE

- data scientists (current and aspiring)
- those who want to know more about data mining, analysis, and processing
- those interested in functional programming

# DESCRIPTION

Data mining is a key skill that involves transforming data found online and elsewhere from a hodgepodge of numbers into actionable information. Using examples ranging from RSS feeds, open data portals, and web scraping, this tutorial will show you how to efficiently obtain and transform data from disparate sources.

# ABSTRACT

Data mining is a key skill that any self-proclaimed data scientist should possess. It involves transforming data from disparate sources and a hodgepodge of numbers into actionable information. Tabular data, e.g., CSV/Excel files, is very common in data mining and greatly benefits from Python's functional programming idioms. For better or for worse, the leading Python data libraries, NumPy and Pandas, eschew the functional programming style for object-oriented programming.

Using examples ranging from RSS feeds, the South Africa Data Portal API, raw Excel files, and basic web scraping, this tutorial will show how to efficiently locate, obtain, transform, and remix data from the web. These examples will show that you can do a lot with functional programming and without the need for NumPy or Pandas.

Finally, it will introduce meza: a pure Python, functional, data analysis library and alternative to Pandas. IPython notebooks and sample data files will be distributed beforehand on GitHub to facilitate code distribution.

# OBJECTIVES

Attendees will learn what data and data mining are, and why they are important. They will learn some basic functional programming idioms and see how functional programming is ideally suited to data mining. They will also see in what areas the 20lb gorilla (Pandas) shines and when a lightweight alternative (meza) is more practical.

# ADDITIONAL INFO

## Level

Intermediate

## Prerequisites

Students should have at least basic knowledge of Python itertools and functional programming paradigms, e.g., map, filter, reduce, and list comprehensions.

Laptops should have python3 and the following PyPI libs installed: bs4, requests, and meza.

## Format

Students will be guided through a series of exercises that explore using Python for data mining. The tutorial will involve lessons to introduce concepts; demos which implement the concepts using meza, Beautiful Soup, and requests; and exercises for students to apply the concepts.

# OUTLINE

- [10 min] Part I
  - [2 min] Intro (lecture)
    - Who am I?
    - Topics to cover
    - Format
  - [8 min] Definitions (lecture)
    - What is data?
    - What is data mining?
    - Why is data mining important?
- [35 min] Part II
  - [15 min] You might not need pandas (demo)
    - Obtaining data
    - Analyzing and transforming data
  - [20 min] Interactive data gathering (exercise)
- [45 min] Part III
  - [10 min] Introducing meza (demo)
  - [20 min] Interactive data processing (exercise)
  - [15 min] Q&A
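
The abstract's point that tabular data suits functional idioms (map, filter, reduce, itertools) can be illustrated with a minimal sketch. It uses only the Python standard library rather than meza, bs4, or requests, and the sample data, column names, and helper functions are invented for illustration; none of it is taken from the tutorial materials.

```python
import csv
import io
from functools import reduce
from itertools import groupby
from operator import itemgetter

# Hypothetical sample data, invented purely for illustration.
SAMPLE_CSV = """city,year,population
Durban,2015,100
Durban,2016,110
Cape Town,2015,200
Cape Town,2016,
"""


def read_records(text):
    """Lazily yield one dict per CSV row."""
    return csv.DictReader(io.StringIO(text))


def has_population(record):
    """Keep only rows whose population field is not blank."""
    return bool(record["population"].strip())


def parse(record):
    """Return a new record with numeric fields converted to int."""
    return {
        **record,
        "year": int(record["year"]),
        "population": int(record["population"]),
    }


def total_by_city(records):
    """Sum population per city using sorted + groupby + reduce."""
    by_city = itemgetter("city")
    ordered = sorted(records, key=by_city)  # groupby needs sorted input
    return {
        city: reduce(lambda acc, rec: acc + rec["population"], group, 0)
        for city, group in groupby(ordered, key=by_city)
    }


# Compose the pipeline: read -> filter -> map -> aggregate.
records = map(parse, filter(has_population, read_records(SAMPLE_CSV)))
totals = total_by_city(records)
print(totals)
```

Each step is a small function over plain dicts, and rows flow through `filter` and `map` lazily until the aggregation step sorts them. The row with a blank population is dropped before parsing, so it never reaches `int()`.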
markdown
Comment:
production notes
Rf filename:
root is .../show/dv/location/, example: 2013-03-13/13:13:30.dv
Sequence:
get this:
check and save to add this
Veyepar
Video Eyeball Processor and Review