Ever log every single sensor output (plus 200 more) every 50 ms of a vehicle over an 18-hour road test while recording audio and thermal video from two dozen different points? How about on a fleet of 40 vehicles daily, for 2 months, in the middle of Alberta or Death Valley?
Have a good way of querying, analyzing, processing, and securing all this time series data that can handle literally receiving >100TB per hour? And can keep up with the expected geometric growth? (I've actually got a decent solution for this; it currently needs to be vetted.) (Also, security has to be provided on a per-channel basis, not per-test: T1/T2 companies need access to their test data, but not global data.)
Contact me. I'm a member of the ASAM standards committee that recently met to discuss how basically every automaker and tier 1 has NO CLUE how to implement this. And easily two dozen companies are just waiting to throw money at this problem.
Currently no such solution exists, and the ones that do exist only manage paths to raw data blobs, not actual records/data points.
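For concreteness, a minimal sketch of what per-channel (rather than per-test) access control could look like; the channel names, supplier identifiers, and ACL table below are hypothetical, not anything shipping:

    # Minimal sketch of per-channel (not per-test) access control; the channel
    # names, supplier identifiers, and ACL table are all hypothetical.
    CHANNEL_ACL = {
        "engine.coolant_temp": {"oem", "tier1_supplier_a"},
        "brake.pad_thermal":   {"oem", "tier2_supplier_b"},
    }

    def can_read(principal: str, channel: str) -> bool:
        """True if this OEM/supplier may read this channel, regardless of test."""
        return principal in CHANNEL_ACL.get(channel, set())

    def filter_channels(principal: str, channels: list[str]) -> list[str]:
        """Strip channels the caller is not entitled to before serving a query."""
        return [c for c in channels if can_read(principal, c)]

    print(filter_channels("tier1_supplier_a",
                          ["engine.coolant_temp", "brake.pad_thermal"]))
    # -> ['engine.coolant_temp']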
If you want to know how to capture and handle automotive test data, look into process automation systems for the manufacturing industry. I used to work at petrochemical facilities in the '90s as a process control engineer. We not only received data from thousands of I/O inputs (for example, temperature, flow, pressure, vibration, and RPM sensors) every second to every tenth of a second, but we processed them, made decisions, and sent out I/O signals to controllers (for example, flow control valves). We used a combination of programmable logic controllers and distributed control systems.
Talk to companies like Honeywell (I believe they do something similar to what you are trying to do for the aviation industry), ABB, Foxboro, etc. These guys have been doing this since the '80s.
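For readers who haven't seen a PLC/DCS scan cycle, a rough sketch of the read-inputs / evaluate-logic / write-outputs loop described above; the tags, setpoint, and proportional correction are made up for illustration, and a real system would do this in ladder logic or function blocks rather than Python:

    # PLC-style scan cycle sketch: read inputs, evaluate logic, write outputs.
    import time

    SETPOINT_FLOW = 120.0   # hypothetical flow setpoint (m3/h)
    SCAN_PERIOD_S = 0.1     # one scan every tenth of a second

    def read_inputs() -> dict:
        # Placeholder for reading analog/digital I/O (temperature, flow, pressure...).
        return {"flow": 118.7, "pressure": 6.2}

    def write_output(tag: str, value: float) -> None:
        # Placeholder for driving an actuator, e.g. a flow control valve position.
        print(f"{tag} -> {value:.1f}%")

    def control_logic(inputs: dict, valve_pos: float) -> float:
        # Crude proportional correction toward the flow setpoint, clamped 0-100%.
        error = SETPOINT_FLOW - inputs["flow"]
        return max(0.0, min(100.0, valve_pos + 0.5 * error))

    valve = 50.0
    for _ in range(3):                      # a few scans for illustration
        start = time.monotonic()
        valve = control_logic(read_inputs(), valve)
        write_output("FCV-101.position", valve)
        time.sleep(max(0.0, SCAN_PERIOD_S - (time.monotonic() - start)))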
They do measurement and live analysis. If there is a method of storing and collating data after the fact, they haven't announced it to the industry at large.
I could go on for hours about all of this, but long story short, I'm the head of product at a company that builds a data system designed to handle exactly this kind of scenario (we provide data collection services to Pioneer, to mention one).
Called into the company, and I suggest looking into adding an operator: if you press 0, it just keeps repeating the same menu. Submitted an information request.
>Have a good way of querying, analyzing, processing, and securing all this time series data in a way that can handle literally getting >100TB per hour?
Is this data logging the changes or the states of your sensors? If it's the states, then I am guessing most of this is highly compressible. If it is actually 100TB of changes logged, then that's a pretty difficult problem.
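As a rough illustration of how compressible state logs are, plain run-length encoding alone collapses long runs of unchanged readings; the values below are made up:

    # Run-length encode a mostly-unchanged sensor state log (values are made up).
    from itertools import groupby

    def rle(values):
        """[(value, run_length), ...] for a sequence of repeated states."""
        return [(v, len(list(g))) for v, g in groupby(values)]

    samples = [21.5] * 1000 + [21.6] * 400 + [21.5] * 600   # 2000 raw samples
    print(rle(samples))   # -> [(21.5, 1000), (21.6, 400), (21.5, 600)]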
CloudFlare has an approach that may scale to the levels you are looking for; rather than storing the logs, they analyze and roll up the expected responses in real time, and store additional detail for items that appear anomalous. John Graham-Cumming gave a talk on this topic earlier this month at dotScale:
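A rough sketch of that rollup-plus-anomaly idea, assuming a hypothetical per-channel "expected band": aggregates are kept for everything, full-fidelity records only for out-of-band samples:

    # Keep aggregates for "normal" samples, full detail only for anomalies.
    from collections import defaultdict

    EXPECTED = {"coolant_temp": (70.0, 110.0)}   # hypothetical normal range (degC)

    rollups = defaultdict(lambda: {"n": 0, "sum": 0.0,
                                   "min": float("inf"), "max": float("-inf")})
    anomalies = []   # full-fidelity records kept only for out-of-band samples

    def ingest(channel: str, ts: float, value: float) -> None:
        lo, hi = EXPECTED.get(channel, (float("-inf"), float("inf")))
        agg = rollups[channel]
        agg["n"] += 1
        agg["sum"] += value
        agg["min"] = min(agg["min"], value)
        agg["max"] = max(agg["max"], value)
        if not lo <= value <= hi:
            anomalies.append({"channel": channel, "ts": ts, "value": value})

    for ts, v in enumerate([95.0, 96.2, 131.4, 97.0]):
        ingest("coolant_temp", float(ts), v)
    print(rollups["coolant_temp"], anomalies)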
If you have no working model of what is/is not correct, how do you determine anomalous responses?
To expand:
This method is already used (to a degree) in data logging compression, where one stores channel deltas/timestamps and reconstructs the value ad hoc when necessary. This is a good way to compress non-volatile datasets.
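A small sketch of that delta/timestamp scheme, using integer timestamps and values for a hypothetical channel:

    def delta_encode(samples):
        """samples: list of (timestamp_ms, value) ints; returns first sample + deltas."""
        if not samples:
            return None, []
        deltas = [(t2 - t1, v2 - v1)
                  for (t1, v1), (t2, v2) in zip(samples, samples[1:])]
        return samples[0], deltas

    def delta_decode(first, deltas):
        """Rebuild the original series ad hoc when a query actually needs it."""
        out = [first]
        t, v = first
        for dt, dv in deltas:
            t, v = t + dt, v + dv
            out.append((t, v))
        return out

    # Hypothetical channel: timestamps in ms, temperature in tenths of a degree.
    series = [(0, 215), (50, 215), (100, 216), (150, 216)]
    first, deltas = delta_encode(series)
    assert delta_decode(first, deltas) == series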
I've actually watched the talk already. And while it seems to apply, it doesn't. Every data point is important, because the real problem is comparing different tests, with time between tests, to get an idea of how hardware ages. Or component swapping, where a known test is performed on several different items and the results are compared in post-processing. To use the suggested method, your storage solution requires knowledge of what's being stored.
The goal is to unify these storage solutions, and present a unified front end for querying/report generation.
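A hypothetical sketch of what such a unified front end could look like: one query interface that routes each channel to whichever backing store holds it. The backend classes and query shape below are assumptions, not any existing product's API:

    from abc import ABC, abstractmethod

    class TestDataBackend(ABC):
        @abstractmethod
        def query(self, channel: str, start_ms: int, end_ms: int) -> list[tuple[int, float]]:
            """Return (timestamp_ms, value) samples for one channel in a window."""

    class InMemoryBackend(TestDataBackend):
        def __init__(self, data):                      # {channel: [(ts, value), ...]}
            self.data = data
        def query(self, channel, start_ms, end_ms):
            return [(t, v) for t, v in self.data.get(channel, [])
                    if start_ms <= t <= end_ms]

    class UnifiedFrontEnd:
        def __init__(self, routing: dict[str, TestDataBackend]):
            self.routing = routing                     # channel -> backend
        def query(self, channel, start_ms, end_ms):
            return self.routing[channel].query(channel, start_ms, end_ms)

    backend = InMemoryBackend({"engine.rpm": [(0, 800.0), (50, 1200.0)]})
    front = UnifiedFrontEnd({"engine.rpm": backend})
    print(front.query("engine.rpm", 0, 100))           # -> [(0, 800.0), (50, 1200.0)]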
> a good way of querying, analyzing, processing, and securing all this time series data
Could you expand a little more on what sort of features you would like to see in a solution? I have some relevant experience, and I could see myself taking a stab at this problem.
Having looked at just the raw data from OBD-II on a few vehicles: data format standardization, there is none. Not between car makes, models, years, or versions (e.g. 2014 Suzuki Swift vs 2014 Suzuki Swift S). This would require either months (or years) of reverse engineering, or unfettered access to automakers' internal documentation (for some, I know it's minimal). (It would likely require a partnership with the auto industry to avoid litigation.)
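(For what it's worth, the handful of SAE-standardized mode 01 PIDs are roughly the only place with documented decode formulas; a toy decoder for those is sketched below, and everything proprietary needs the reverse engineering or OEM documentation described above.)

    # Toy decoder for a few SAE-standardized OBD-II mode 01 PIDs; everything
    # outside these (most of a modern vehicle's data) is proprietary.
    def decode_mode01(pid: int, data: bytes) -> float:
        if pid == 0x0C:                       # engine RPM
            return (256 * data[0] + data[1]) / 4.0
        if pid == 0x0D:                       # vehicle speed, km/h
            return float(data[0])
        if pid == 0x05:                       # engine coolant temperature, degC
            return data[0] - 40.0
        raise ValueError(f"PID 0x{pid:02X}: no standard formula known here")

    print(decode_mode01(0x0C, bytes([0x1A, 0xF8])))   # -> 1726.0 rpm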
If you pulled that together and offered it in a usable format... Wow.
So I'm listening to this Radiolab episode [0] about drones with cameras that can solve crime (traffic and other societal ills), and it occurs to me that with everyone soon to be walking around with DSLR-quality smartphones, couldn't we triangulate all this video/audio data and provide substantially better resolution to daily life? Think of it as continuous Meerkat/Periscope localized around an event in four dimensions.
I was involved in a project conceptualizing real-time video streams from smartphones: synchronizing them and adjusting/correcting quality before making them presentable... in real time!
Think of a soccer stadium, with fans taking "video" of the game. All the feeds would be gathered, synchronized, quality-adjusted, and put online for anyone to view, from any angle.
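One building block for that kind of synchronization is estimating the time offset between two phones' audio tracks by cross-correlation; a toy sketch with a synthetic signal and an assumed sample rate:

    # Estimate the offset between two audio tracks by cross-correlation.
    import numpy as np

    SR = 2000                                   # assumed sample rate (Hz), kept low for the demo

    def estimate_offset(ref: np.ndarray, other: np.ndarray) -> float:
        """Seconds by which `other` lags `ref` (positive = starts later)."""
        corr = np.correlate(other, ref, mode="full")
        lag = np.argmax(corr) - (len(ref) - 1)
        return lag / SR

    # Synthetic demo: the same burst of "crowd noise" recorded with a 0.25 s delay.
    rng = np.random.default_rng(0)
    event = rng.standard_normal(SR)             # 1 s of noise
    ref = np.concatenate([event, np.zeros(SR)])
    other = np.concatenate([np.zeros(SR // 4), event, np.zeros(3 * SR // 4)])
    print(round(estimate_offset(ref, other), 3))   # ~0.25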
I couldn't see a way to contact you. I was about to email you some ideas but looking at your angel.co page it seems like you'd already know how to do this?
I know the technical side/requirements, and the people. My business and design skills are lacking. Also, a whole front end for interfacing/formatting/report generation needs to be created.