Hacker News | romming's comments

Etleap | Senior Backend Engineer | San Francisco | Full Time | Onsite

Etleap came out of frustration with how much time data wrangling takes away from the actual analysis. We were tired of building and maintaining data pipelines ourselves, and then we noticed that everyone else was, too! That's why we've created an intuitive ETL tool that lets data analysts themselves integrate data from any source, so they can do their most significant work faster than ever before.

Now we are looking to add engineers to our core engineering team to help build the infrastructure that modern data teams depend on to create and operate their data warehouse! It shouldn't take a CS degree to use big data effectively, and abstracting away the difficult parts is our mission.

What we want to see in you:

- You love data engineering
- You build robust and scalable data systems three times as fast as other developers
- Coding in Java is second nature to you
- You have a passion for improving data analytics
- You're excited to work in a scrappy environment
- You're down to earth and fun to be around. This is an absolute must!

Big plus if you have the following:

- Experience with Cascading, Docker, and AWS
- Knowing the ins and outs of current big data frameworks like Hadoop, Spark, or Flink (not an absolute requirement, since you're a quick learner!)
- Startup experience

More details here: https://etleap.com/jobs/

To apply, send your resume to jobs@etleap.com.



I think there's a more common reason why companies end up with "awful-to-work-with messes": ETL is deceptively simple.

Moving data from A to B and applying some transformations on the way through seems like a straightforward engineering task. However, building a system that is fault-tolerant, handles data-source changes, surfaces errors in a meaningful way, and requires little ongoing maintenance is hard. Getting to a level of abstraction where data scientists can build on top of it without development skills is harder still.
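To illustrate the first point (this is a generic sketch, not Etleap's implementation), here is a minimal pipeline step in Python that retries transient failures and surfaces persistent errors with enough context to debug, rather than silently dropping records:

```python
import time


class StepError(Exception):
    """Raised when a pipeline step exhausts its retries; carries context."""

    def __init__(self, step, record, cause):
        super().__init__(f"step '{step}' failed on record {record!r}: {cause}")
        self.step, self.record, self.cause = step, record, cause


def run_step(name, fn, records, retries=3, backoff=0.0):
    """Apply fn to each record, retrying transient failures and
    collecting persistent ones as StepError instead of crashing."""
    out, errors = [], []
    for rec in records:
        for attempt in range(retries):
            try:
                out.append(fn(rec))
                break
            except Exception as exc:
                if attempt == retries - 1:
                    errors.append(StepError(name, rec, exc))
                else:
                    time.sleep(backoff)  # simple fixed backoff for the sketch
    return out, errors
```

For example, `run_step("parse_age", lambda r: int(r), ["1", "2", "x"])` returns the two parsed records along with a `StepError` naming the step and the offending record, which is the kind of meaningful error surfacing a real system needs at every stage.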

I don't think most data engineers are mediocre or find their job boring. The expectation from management that ETL doesn't require significant effort is unrealistic, and it leads to a technology gap between developers and scientists that tends to be filled with ad-hoc scripting and poor processes.

Disclosure: I'm the founder of Etleap[1], where we're creating tools to make ETL better for data teams.

[1] http://etleap.com/


Great question. We find that the most effective approach (whether you're a coder or not) is to interact with samples of the data directly to specify your transformations, so our approach is similar to Stanford's Data Wrangler (http://vis.stanford.edu/wrangler/). Once the transformations are specified, Etleap compiles them into code and continuously applies them to the data at scale.
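The compile-then-apply idea can be sketched generically (this is an illustration of the concept, not Etleap's actual engine): a declarative list of per-column rules, derived by a user inspecting data samples, is turned into a single function that can then be applied to every record:

```python
def compile_transforms(spec):
    """Compile a declarative list of (column, function) transform rules
    into one function that applies all of them to a record."""
    def apply(record):
        out = dict(record)  # leave the input record untouched
        for column, fn in spec:
            out[column] = fn(out.get(column))
        return out
    return apply


# Hypothetical rules a user might settle on after looking at samples:
clean = compile_transforms([
    ("email", lambda v: v.strip().lower() if v else None),
    ("age",   lambda v: int(v) if v not in (None, "") else None),
])
```

Here `clean({"email": " Bob@Example.com ", "age": "42"})` yields a normalized record; the same compiled function can then run continuously over the full stream, which is the separation the comment describes: humans specify against samples, machines apply at scale.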


> perhaps the data scientist needs to be involved in the data wrangling in order to understand the source better?

Founder of an ETL startup here. This is exactly what we believe: the end-user of the data should be involved as early in the data pipeline as possible, including the wrangling. If you eliminate the engineer from the ETL process you remove a lot of painful back-and-forth and get more flexible pipelines.


I'd love to hear about your startup and/or beta test.


Hit us up at info at etleap dot com.



