Name:
Dynamic Data Pipelining with Luigi
--client pyohio --show pyohio_2019 --room cartoon2 14877 --force
Author(s): Trey Hakanson
Location: Cartoon 2
Date: Sun Jul 28, 2019
Start: 16:15
First Raw Start: 16:08
Duration: 0:30:00
Offset: 0:06:46
End: 16:45
Last Raw End: 17:03
Chapters: 00:00, 0:23:10
Total cuts_time: 28 min.
https://www.pyohio.org/2019/presentations/93
State: borked, edit, encode, push to queue, post, richard, review 1, email, review 2, make public, tweet, to-miror, conf, done
Locked: clear this to unlock
Locked by: user/process that locked.
Start: initially scheduled time from master, adjusted to match reality
Duration: length in hh:mm:ss
Name: Video Title (shows in video search results)
Emails: email(s) of the presenter(s)
Released: Unknown / Yes / No (has someone authorised publication?)
Normalise:
Channelcopy: m=mono, 01=copy left to right, 10=right to left, 00=ignore.
Thumbnail: filename.png
Description:
As the scale of modern data has grown, so too has the need for modern tooling to handle it. Databases have had to become more horizontally scalable, less centralized, and more fault tolerant to meet the expectations of modern users. Data warehousing and data engineering are relatively new disciplines, and engineers are still hard at work solving the core problems of this new sector.

One problem of particular interest is dynamic data pipelining and workflows. Ingesting large amounts of data, transforming streams dynamically into a standardized format, and maintaining checkpoints and dependencies so that the proper prerequisites are met before a given task begins are all difficult problems.

This talk will describe how these problems can be solved using Luigi, Spotify’s robust tool for constructing complex data pipelines and workflows. Luigi allows complex pipelines to be described programmatically, handling multiple dependencies and dependents. This makes it suitable for a wide variety of batch jobs, and the optional centralized scheduler makes it easy to monitor job progress across data warehouses. In addition, Luigi’s checkpoint system allows pipelines to be resumed from whatever point they failed at. Each task is well-defined, specifying its required inputs and resulting outputs, so creating or editing pipelines is a breeze.

As the scale of modern data has grown, so has the need for tooling to handle its growing list of challenges. Whether performing reporting, bulk ingestion, or ETL processes, it is important to maintain flexibility and ensure proper monitoring. Luigi provides a robust toolkit for a wide variety of data pipelining tasks and can be easily integrated into existing workflows.
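The checkpoint and dependency idea described above can be illustrated with a minimal sketch. It uses only the Python standard library and does not call Luigi itself: the `Task` base class, the `build` helper, and the `Extract` and `Transform` tasks are hypothetical stand-ins that mimic the pattern of a task declaring what it requires, what it outputs, and how it runs. A task whose output file already exists is treated as complete and skipped, which is what lets a failed pipeline resume partway through.

```python
import tempfile
from pathlib import Path


class Task:
    """Hypothetical stand-in for a pipeline task (not Luigi's own class)."""

    def __init__(self, workdir: Path):
        self.workdir = workdir

    def requires(self):
        # Tasks that must be complete before this one runs.
        return []

    def output(self) -> Path:
        raise NotImplementedError

    def run(self) -> None:
        raise NotImplementedError

    def complete(self) -> bool:
        # The output file doubles as the checkpoint.
        return self.output().exists()


class Extract(Task):
    def output(self) -> Path:
        return self.workdir / "raw.txt"

    def run(self) -> None:
        self.output().write_text("3\n1\n2\n")


class Transform(Task):
    def requires(self):
        return [Extract(self.workdir)]

    def output(self) -> Path:
        return self.workdir / "sorted.txt"

    def run(self) -> None:
        raw = self.requires()[0].output().read_text().split()
        self.output().write_text("\n".join(sorted(raw, key=int)) + "\n")


def build(task: Task, log: list) -> None:
    """Run dependencies first; skip any task whose output already exists."""
    if task.complete():
        return
    for dep in task.requires():
        build(dep, log)
    task.run()
    log.append(type(task).__name__)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        workdir = Path(tmp)
        first_run = []
        build(Transform(workdir), first_run)
        # Simulate losing only the final step's output, then resume.
        Transform(workdir).output().unlink()
        second_run = []
        build(Transform(workdir), second_run)
        result = Transform(workdir).output().read_text()
    return first_run, second_run, result
```

Calling `demo()` runs both tasks the first time, then reruns only `Transform` on the second pass because the output of `Extract` is still on disk.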
markdown
Comment: production notes
2019-07-28/16_08_14.ts
Apply: 16:08:14 - 16:15:03 ( 00:06:49 )
S: 16:08:14 - E: 16:38:13 D: 00:29:59 ( End: 409.0)
vlc ~/Videos/veyepar/pyohio/pyohio_2019/dv/cartoon2/2019-07-28/16_08_14.ts :start-time=00.0 --audio-desync=0
Raw File | Cut List
16:08:14 | seconds: 0.0, Wall: 16:08:14
Duration 00:29:59
16:38:13 | seconds: 409.0, Wall: 16:15:03
2019-07-28/16_08_14.ts
Apply: 16:15:03 - 16:38:13 ( 00:23:10 )
S: 16:08:14 - E: 16:38:13 D: 00:29:59 ( Start: 409.0)
vlc ~/Videos/veyepar/pyohio/pyohio_2019/dv/cartoon2/2019-07-28/16_08_14.ts :start-time=0409.0 --audio-desync=0
Raw File | Cut List
16:08:14 | seconds: 409.0, Wall: 16:15:03
Duration 00:29:59
16:38:13 | seconds: 0.0, Wall: 16:08:14
2019-07-28/16_38_14.ts
Apply: 16:38:14 - 16:43:15 ( 00:05:01 )
S: 16:38:14 - E: 17:03:49 D: 00:25:35 ( End: 301.0)
vlc ~/Videos/veyepar/pyohio/pyohio_2019/dv/cartoon2/2019-07-28/16_38_14.ts :start-time=00.0 --audio-desync=0
Raw File | Cut List
16:38:14 | seconds: 0.0, Wall: 16:38:14
Duration 00:25:35
17:03:49 | seconds: 301.0, Wall: 16:43:15
2019-07-28/16_38_14.ts
Apply: 16:43:15 - 17:03:49 ( 00:20:34 )
S: 16:38:14 - E: 17:03:49 D: 00:25:35 ( Start: 301.0)
vlc ~/Videos/veyepar/pyohio/pyohio_2019/dv/cartoon2/2019-07-28/16_38_14.ts :start-time=0301.0 --audio-desync=0
Raw File | Cut List
16:38:14 | seconds: 301.0, Wall: 16:43:15
Duration 00:25:35
17:03:49 | seconds: 0.0, Wall: 16:38:14
Rf filename: root is .../show/dv/location/, example: 2013-03-13/13:13:30.dv
Sequence:
get this: check and save to add this
2019-07-28/16_08_14.ts
2019-07-28/16_38_14.ts
Veyepar
Video Eyeball Processor and Review