The Big Data Puzzle Where Does the Eclipse Piece Fit? by J. Langley

Channel:

Subscribers:

24,200

Published on July 11, 2017 4:06:25 PM ● Video Link: https://www.youtube.com/watch?v=gc7f2yV447k

Duration: 35:21

85 views

We will introduce a Big Data configuration that uses Avro & Parquet for data formats, Hadoop for storage, and Spark / Hive for running queries. All of these projects are from the Apache Software Foundation and are widely used in the Data Science field. We will show how Eclipse provides an excellent foundation for IDE support and tooling to make it easier to develop solutions based on this technology stack.

CohesionForce has put together a data set consisting of over 200M samples based on actual records from New York City taxi cabs. This data has been used to compare file size, read/write time, and query speeds using the tooling configuration provided above. We have also created tools in Eclipse that help us transform the data between formats, and we have made those available under the EPL:

https://github.com/CohesionForce/dis-toolkit
https://github.com/LangleyStudios/eclipse-avro
https://github.com/CohesionForce/avroToParquet

We will give a short description of these projects (not a sales pitch) and discuss possibilities for other ways to use Eclipse in the Big Data field.

Eclipse Modeling Framework - define the data formats
XTend - generate code and other files
BIRT - create reports of the analysis results

We would also like to solicit feedback from others that may be using Eclipse in the Data Science field.

Note - the attached slides are from ECNA 2016 and will be updated based on recent changes to Spark & Zeppelin

Other Videos By Eclipse Foundation

2017-07-11	EcoreTools Next: Executable DSL made (more) accessible, by Cedric Brun
2017-07-11	Formalizing Financial Protocols with Xtext, by Denis Ignatovich
2017-07-11	Deep dive into Xtext scoping - local and global scopes explained, by H. Schill & S. Zarnekow
2017-07-11	Debugging DSLs with Xtext's new Tracing API, by Christian Schneider & Miro Spoenemann
2017-07-11	How EASE unleashes the scientific power of Airbus' engineers in Eclipse, by Alain Bernard
2017-07-11	Ignite talks, session 2
2017-07-11	Debug Java code like a Pro, by Mikaël Barbero
2017-07-11	Driving Intelligent Transportation System with Capella, by Jerome Montigny
2017-07-11	Jenkins at Scale, Baptiste Mathus & Michael Pailloncy
2017-07-11	TypeScript, Future of JavaScript and rise of the transpilers, by Sebastien Pertus
2017-07-11	The Big Data Puzzle Where Does the Eclipse Piece Fit? by J. Langley
2017-07-11	Building an IoT product from scratch using Eclipse IoT Technologies
2017-07-11	Eclipse and Java 9, by Jay Arthanareeswaran
2017-07-11	Ignite talks, Session 1
2017-07-11	Keynote - The Past, Present, and Future of Robotics, by Francesco Ferro
2017-07-03	Rapid IoT Prototyping with Eclipse Vorto - Virtual IoT
2017-06-29	vECM \| Eclipse Sirius 5.0, All about UX -Eclipse Oxygen Series
2017-06-29	vECM \| Generic Editor and Language Server Protocol (LSP) -Eclipse Oxygen Series
2017-06-28	Virtual IoT \| Rapid IoT Prototyping with Eclipse Vorto
2017-06-27	vECM \|What's New in the Eclipse Platform? -Eclipse Oxygen Series
2017-06-26	vECM \| New in Xtext: Core Framework, LSP, Tracing Code Generators -Eclipse Oxygen Series

Channel	Latest
NazaVictor - A Vida é Boa, Bora Jogar! - Fly	6 hours ago
IndieGamerRetro	6 hours ago
MultiClassicGamer	6 hours ago
SHADOW-MAN	6 hours ago
D Glock	6 hours ago
Game Hauntings	6 hours ago
LeCLoutGOAT	6 hours ago
HowsItJoeIn	6 hours ago
Op Glück oder Können Win ist Win	6 hours ago
Tivibu Spor	6 hours ago
JorRaptor	6 hours ago
Sonic The Hedgehog 1991 Animations	7 hours ago
MALICEDOLL79	7 hours ago
Royale News	7 hours ago
SlimKirby	7 hours ago
Sunwu Gaming	7 hours ago
JoeCactus64	7 hours ago
classically important	7 hours ago
Symphoniac	7 hours ago
Pids	7 hours ago
Sabrina's Let's Plays	7 hours ago
Germanarih Games	7 hours ago
Kingpingamer	7 hours ago
DODO -PLAYER	7 hours ago
THEREALSPARTAN	7 hours ago