An open source Rails API to track "quantified self" data.
Right now, the hardware companies in this space are storing your data in closed systems. Let's change that.
Take the Withings WiFi scale or the FitBit, for example. Both are essential pieces of the quantified self hacker's toolkit, yet both store your data in their respective, siloed databases. What happens when one of them goes bust? Or decides to pivot towards an ad-based business model?
This project includes a Rails + MySQL RESTful API to store your data.
This is one of the use cases we had in mind when developing Tent (https://tent.io).
Tent is an evented data store for arbitrary JSON blobs which can be typed by the developer. It has the added benefit of operating as a decentralized system that lets you push and receive posts with other users via webhooks.
Tent can be used to create applications and systems from the quantified self to microblogging to cloud-backed file sync to collaborative real-time document editing.
One day we will own our own node. Where we have control over everything, what we want to share with whom and why.
All we need is a place for storage and we add services like apps.
With IPv6 around the corner, everyone will get their own personal IP address anyway, so we can build upon that.
Many of us have this idea, but no one has made a breakthrough so far. It will emerge, though. We have not figured it out quite yet: we have the technology, but we need to get the idea of how we glue everything together.
I look forward to the day when I pay a certain amount of money per year to have my own node without being the product of some megacorp that's only interested in quarterly performance.
I have been working on something like this: a home server (called Amahi). See my profile. One app you can run on a home server that stands out for this kind of use is ownCloud. We have packaged it for distribution in Amahi servers and it's resonating well with the users. I am not sure it has all the elements you would want, but many of them are there.
I'm working on exactly the kind of thing you're describing. Except ipv6 isn't really going to solve the connectivity problem as all the NATs and middleboxes on the network aren't going anywhere. We'll be using DNS instead as the 'identity' piece.
Since you said you would pay for such a thing, can you elaborate how much and what you'd expect (in terms of functionality)?
I would pay for storage of videos, pictures, documents, etc. This could be one monetization cornerstone. My gut feeling is maybe around $9.99 / month or $50 / year.
The basics everyone would expect are normal social network features: sharing of "items", galleries, a friend system, feeds, and groups.
Maybe this part should be an open source solution like for example diaspora.
On top of that you can build an app/service network. Maybe something like app.net.
Important on all parts is that you own the data and you should be able to pull it at will from any part of the network.
Amusing ourselves to death in the 21st century... this technology does raise, once again, the question of why people are sharing so much personal information online. As if they had a giant amphitheatre.
This is so cool. Somebody should turn this into a platform - a mobile app that at least helps you record the stats you want, paired with an API platform that lets you grant and request permission for other "personal APIs".
Then other API calls can be built on top of individual people - College students in the US, citizens of India, etc. The possibilities are pretty endless.
The Biggest Loser - play along at home! Be the "biggest loser" in [your region] to [acquire swag], etc.
Sorry, maybe I'm being a bit facetious. There's definitely a lot of cool stuff that could be done with a personal API though. But how do you get people on-board? You have to have equipment to track/update stats right? Are people willing to buy into this potential network?
If Fitbit and all of those types of companies start having export APIs, a service could have a wrapper around all of them. Then anyone with one of those devices could join easily. Could be the way to start, since you definitely don't want to be building the hardware and API altogether for an MVP.
Interesting. Not the API part but the idea itself. What we need is a Bitcoin-like, encrypted, decentralized personal file with an open spec. This file can be edited only by the owner but can be read by anyone. Then we need exporter applications which can read this data and provide it as an API. Now there are two levels of abstraction: data owners and data providers. There can be "data unions", groups which can negotiate with data providers and data consumers for a better deal (money, privacy). Identity verification can now be handled by data unions instead of data providers (as is the case now).
"Take control" anyone?
I like the idea of this. I know there was a post recently about a guy selling 24 hour chunks of data about himself.
Is there an opportunity here to create a secure data store for your own structured data hosted securely somewhere else and that providers would pay YOU to access?
Of course you'd give free access to it for your doctor who needed to pull some info. But what if you had total control over your own API? This is kind of like Facebook, but I would never let Facebook hold some of my most personal data. I would like to control authorization to my own data for certain people, though.
Interesting take. I looked at it almost the opposite way: "You are turning yourself into a set of numbers that has no identity outside of those numbers." Certainly if this were to be a thing (a la online charts or Microsoft Health) it would have that end result.
I actually started hacking on this idea a few weeks ago. A personal API could have tons of uses. Many others have commented on the benefits of aggregation, but this could allow users to control and selectively sell/provide information to companies, apps, and advertisers. Applications could be built which utilize the personal APIs of friends. In Naveen's example, my app could compare my fitness level with my friends, recommend new restaurants based on ones my friends actually went to, track progress on goals, etc.
What I'm trying to say is that with a personal API platform, apps simply become a front end to standardized personal data. Switching costs would be reduced significantly, which would encourage a lot more innovation.
Here's a spec I've written out for my work on my own personal API. It's not done yet, and the code isn't open sourced yet, but hopefully it will inspire others to hack on the idea.
I work for a non-profit called Open mHealth. We are working to build an open architecture around health data from mobile apps and devices. The basic idea is that existing applications can have their own proprietary way of getting data into their systems, and we want to evolve a standard way of getting data out of these systems. We are developing an API specification around this. https://github.com/openmhealth/developer/wiki
Once app developers adopt the Open mHealth approach, much of the work of getting data out of these silo'ed apps will be done. This is important because the real value of this data comes about when the streams are combined to be analyzed and visualized.
I am posting here because many of the issues outlined in this thread and in the original post are things we are actively trying to solve.
This is awesome. I've also been working on a personal data dashboard app, which is similar. It pulls in data from various social and fitness app APIs. You can also enter in data manually. It's mainly for viewing the data and trends, but I'll add an API too if there's demand.
I've been spending a bit of time thinking about this lately, too. Here are the things that I think need to be answered for this to be truly useful.
* Secure access. It's great you decided to share everything publicly. I'm looking at fairly confidential data like blood tests, though.
* How do you express permissions? It's likely you'd give out sets of data, not per-url permissions. (I.e. my photos from date A to date B, in this general vicinity - your vacation pics)
* How can others discover what data sets & types you've shared?
* Related: A type description. Some events cover time ranges. Some are momentary samples. They all use different units. Some have location info attached, some don't. (I.e. my heart rate doesn't. Pictures do. Steps/day - geolocation is meaningless)
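To make the type-description question concrete, here is a minimal sketch of what one record might look like. It is only an illustration under assumptions: the `Measurement` class and all of its field names are hypothetical, not part of any existing personal API.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional, Tuple


@dataclass(frozen=True)
class Measurement:
    """One data point in a hypothetical personal API (all field names are assumptions)."""
    kind: str                        # e.g. "heart_rate", "steps", "photo"
    value: float
    unit: str                        # e.g. "bpm", "count"
    start: datetime                  # when the sample was taken, or when the range begins
    end: Optional[datetime] = None   # None means a momentary sample
    location: Optional[Tuple[float, float]] = None  # (lat, lon), only where meaningful

    def is_range(self) -> bool:
        """True when the measurement covers a span of time rather than an instant."""
        return self.end is not None


# A momentary sample with no location attached.
heart_rate = Measurement("heart_rate", 62.0, "bpm", datetime(2013, 5, 1, 8, 0))

# A value that covers a whole day.
steps = Measurement(
    "steps", 9500.0, "count",
    start=datetime(2013, 5, 1, 0, 0),
    end=datetime(2013, 5, 2, 0, 0),
)
```

Making `end` and `location` optional is one way to let range events, momentary samples, and location-free data share a single shape. A real spec might well split these into separate types instead.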
Either way, it's exciting to see people push ahead on this - many of these questions will probably be answered as we go.
you're totally right on all these points: for the time being, i've only put out information that's okay being 'public', but i imagine when you do want to share a set with a small group (let's say your physicians), you'll want an additional layer of access control.
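One rough way to picture that extra layer is a list of grants that each cover a set of data (kinds plus a date range) rather than individual URLs. The sketch below is an assumption-laden illustration: `Grant`, `can_read`, and the grantee name are all made up for the example.

```python
from dataclasses import dataclass
from datetime import date
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class Grant:
    """Hypothetical permission: who may read which kinds of data, for which dates."""
    grantee: str
    kinds: FrozenSet[str]
    start: date
    end: date

    def allows(self, requester: str, kind: str, day: date) -> bool:
        return (
            requester == self.grantee
            and kind in self.kinds
            and self.start <= day <= self.end
        )


def can_read(grants: Iterable[Grant], requester: str, kind: str, day: date) -> bool:
    """Deny by default; allow only if some grant covers the request."""
    return any(g.allows(requester, kind, day) for g in grants)


# Example: a physician may read blood tests and weight recorded during 2013.
grants = [
    Grant("dr_smith", frozenset({"blood_test", "weight"}),
          date(2013, 1, 1), date(2013, 12, 31)),
]
```

Here `can_read(grants, "dr_smith", "blood_test", date(2013, 5, 1))` is allowed, while any other requester, data kind, or date outside the range is refused.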
Oh, I didn't mean it as criticism. I admire that you went ahead and just created a v0 - that's further than I've gotten on my project, that's for sure :)
This is also something that I hope Huginn can tackle. Huginn makes it easy to retrieve and react to your data. Adding an API beyond that should be straightforward.
Kind of interesting to think about slowly building up and aggregating more layers - e.g. activity, location, shopping, internet, phone, financial, educational, medical, legal, work, relationships etc - and interesting how much of this exists already.
So if this comprehensive personal dataset becomes the norm at some point in the future - what happens when this is under somebody else's control against your interests (whether an illegal attacker or a legal authority)? And what about more gray areas, such as marketing, insurance etc (which would likely encourage the creation of that kind of API in the first place)?
Had a similar idea before but looked at it as more of a decentralized social platform where everyone hosts their own node with all personal information (maybe even locally on a personal device). Kind of like a virtual avatar that could connect and create a real social layer for the web, one that we could control and trust. Technically this should be pretty easy to create, nothing too fancy here. However, standardization and major real-world adoption seem to be a pipe dream (look at all the failed attempts like diaspora, etc.).
Love the idea. If you wish to incorporate social stream data, check out http://socialsafe.net
I've used it to back up and then delete my personal FB account, and continue to archive instagram, linkedin, twitter, blogs, etc. etc.
What's more, even though their interface is kinda weird, they store all kinds of granular data in a very nice SQL database which contains more than they represent in their reports, and can be repurposed in many different ways...
The FOAF crowd already did a lot of this style of work - see also http://wiki.foaf-project.org/w/DataSources ; you probably have a few data feeds already, particularly focused on social networking site activity.
Other vocabs work - ie, crschmidt's menow vocab.
Express it as JSON-LD to make it more modern, and you have a distributed ecology of services that all talk the same sort of language, about you.
Interesting. I think the real challenge behind this though, is not the API itself, but that for it to be really useful there has to be some sort of semi-automated form of data collection, that can be generally applied.
this is why i believe a smart scale is one of the best examples of good data tracking: you step on it whenever you think of it (not on some weird schedule; nor do you have to wear it or take it with you). it does the mundane stuff of capturing the data and sending it to the cloud, which you would otherwise have to write into some notebook.
if it's a system that's fully automated though, one might lose interest in collecting it in the first place: the QSers always like to say – when you add a little friction to a process, you end up paying more attention to why it is. so instead of just blindly capturing your weight, setting up alerts for goals and seeing trends fall and rise will get you to make the most of the data – instead of just collecting data for data's sake.
Ooh, we have a solution to that problem (losing interest in collecting it if it's fully automated). Assuming you're collecting data that you want to change, like, for many people, their weight, or, say, how much time you spend on Hacker News: http://beeminder.com/d/hn
Namely, set up a commitment device that forces you to keep your datapoints on a path to some goal. That's what Beeminder does.
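As a rough illustration of "keeping your datapoints on a path to some goal", here is a sketch that draws a straight line from a starting value to a goal value and checks whether a datapoint is on the right side of it. This is an assumed simplification for the sake of example, not a description of how Beeminder actually computes anything; the function names and numbers are made up.

```python
from datetime import date


def target_on(day, start_day, start_value, goal_day, goal_value):
    """Linearly interpolate the value the path expects on a given day."""
    total_days = (goal_day - start_day).days
    if total_days <= 0:
        raise ValueError("goal_day must be after start_day")
    fraction = (day - start_day).days / total_days
    return start_value + fraction * (goal_value - start_value)


def is_on_track(day, value, start_day, start_value, goal_day, goal_value):
    """For a decreasing goal, at or below the path is on track; otherwise at or above."""
    target = target_on(day, start_day, start_value, goal_day, goal_value)
    if goal_value < start_value:
        return value <= target
    return value >= target


# Example: go from 80.0 kg to 73.0 kg over 70 days.
START, GOAL = date(2013, 5, 1), date(2013, 7, 10)
halfway = date(2013, 6, 5)  # day 35 of 70
print(target_on(halfway, START, 80.0, GOAL, 73.0))  # 76.5
```

A weigh-in of 76.0 on the halfway day would be on track, and 77.0 would not. The commitment part (what happens when you fall off the path) is deliberately left out.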
This may be a bit of a privacy issue to publish, but an interesting dataset is that of https://openpaths.cc/ ... keep the app running on your phone and it will keep track of your location. This data is then available to you over an API, or to researchers that you approve to get access to your anonymized data.
I had actually mocked up something similar with openapi.me, but never got to release it.
It included different level of permissions and data including addresses and payment information. The goal was to have a centralized repo of data to update only once and have it propagate to all the other services asking for the same data over and over (email, address, links, etc).
Friends/relationships, for checking general activity and check-ins.
Potential businesses, for seeing interests and the like. Or, if currently employed, the employer could randomly check to see how the employee has been sleeping (are they going to be tired/cranky today?) or to see how often they check-in at the office.
Aggregated with more data from people doing similar things and you can start to get a look at populations (what if everyone in SF was logging this data? What if a good chunk of those in the US were?).
The quantified self people all claim that if you can't measure something, you can't improve it. It is very popular with runners. If you don't track your time or run competitively, you tend not to improve your times. Runners are willing to put a lot of data in, even how they felt after each run, which has good uses for planning workouts, etc. But the same concept applies to anything: eating healthy/less, walking more, smoking less, etc. Like good old Ben Franklin, who used to track virtues until he mastered them.
I like the approach you're taking, Naveen. I do wonder whether individual, disparate data streams have any value, or whether it's a case of the sum of the parts being less than the whole, so to speak (along with concerns like security, etc. that others have raised), but the concept is intriguing. Will be watching / following.
I've actually been toying with the idea of having a private facebook account to hold all this data and more. Facebook's pretty much become the default social layer these days and almost every app can publish data to your news feed.
Facebook also has a pretty cool Timeline UI to boot which will probably only get better as time passes.
I'm starting to think this is a really great idea - to have somewhere to keep a private personal history - particularly for checking in to keep track of where I've been without appearing to be a self-important moron. You know who you are, FB friends!
you could probably do it today without the use of a private account
i believe in app settings, you can restrict the visibility of posts by any particular app. this way, your scale's posts could be seen by just you and not your friends.
If there were an easy way for the average person to collect and publish this data, would you be willing to sell it to companies directly? We're basically paying companies like Facebook and Twitter with our information in order to have, well, Facebook and Twitter. Maybe we could do something else with our "money".
Nice! I like the idea of keeping all the data about me in one place. Given that so many companies are entering the wearable gadgets and personal fitness space, we will surely see some "Personal Portal", with the company behind it making tons of money selling your personal data or showing "relevant" ads to you.
I've had a similar idea for a while but never executed it well. I wanted to have my personal website merge all of my various feeds from other services into posts on my site to be the "one feed" of things happening in my life.
I suppose API access to that would be a logical extension - this is very cool.
Federating data is a fairly 'solved' issue. Ironically (or not) enterprise IT is ahead of the average organism in this space.
Faced with legacy systems communicating (or not communicating) via various channels, Middleware, ESBs, SOA, message queues, content-based routers, business process management, event processing, and data virtualization have all been created so that a development group can factor these systems into canonical representations of business events.
At that point, the input method ranges from the simple (existing web app to API) to complex (treating a batch-oriented system in a semi-synchronous manner or screen-scraping an ancient smart terminal as a queriable data source).
At that point, the real challenge isn't the input method but master data management and security/authority/privilege.
Would love to show you a working example of a similar project with a public API. We have a number of users giving it a spin and would really appreciate the technical feedback. Email?
I've had several Fitbits and own the Withings scale; love them to pieces. My early Fitbits broke when I sat down and caught them on the arms of a chair (once at a beer garden, natch) but the new Flex bracelet is incredibly nice.
i own them all and have tried each one on and off for the last couple of years. i like each one in its own way (and, also, am friends with guys at each place)
if you're just getting started and you aim to keep it light, then i'd start with the withings scale. you leave it in your bathroom, set your goals, step on it when you feel like it and watch trends and alerts from time to time. i find my weight tells me more about me over time than many other metrics: it shows my bad weeks from my good and, in indirect ways, shows when i stress out and get bad sleep (and therefore, end up eating poorly and exercising inconsistently)
They're pretty lenient -- there are all sorts of awful websites that I'm able to visit without issue. They only tend to block sites related to hacking and piracy.
I was about to dismiss this but have thought about it a bit more, and there's something interesting here.
There are already APIs built on top of you. There are the ones you know about, like Facebook and Twitter. There are the ones we know exist but have no direct access to, such as Amazon's backend or your police record. And there are the ones we may never know about, such as the NSA bot scanning this post right now and correlating it to my recent overseas cell phone calls.
I'm intrigued by the concept of aggregating this data. Imagine if the wealth of data collected about you all went to one place, and, most importantly, was controlled by you. Imagine if Company X had to pay you to know about your Amazon spending habits instead of paying Amazon. Imagine if scientists could perform long-term health studies in an instant because the data they need is already collected and just needs to be crunched.
I think the concept that's missing from your blog post is the idea of discoverability.
Have you read Jaron Lanier's new book "Who Owns the Future"? You have similar ideas ("Imagine if the wealth of data collected about you all went to one place, and, most importantly, was controlled by you. Imagine if Company X had to pay you to know about your Amazon spending habits instead of paying Amazon.")
It's a thought provoking book.. and slightly worrisome to boot (about the state of things at the moment, with regard to the economics of "sharing data").
yes, i do think of it that way: as we leave all this data exhaust behind us, i keep wanting a system to have it all in one place – or at least, an "interface" that shows it to me in one place, even if that's not how the backend really is.
thanks for the book recommendation - i will check it out this weekend!
Academia has been thinking about this for a while. I'm involved with several projects that aim to put users in control of their 'digital footprint' and what you've described are two aspects of the work. First is how to collate and access all the data and second is how to allow third parties to query/process that data in a way that respects a user's privacy. [1]
I'm keen to commercialize the work when it's ready but it's a research project precisely because it's a tough problem. Of course, one approach is to just make that personal info public but I don't think that will really benefit users in the long term (companies would certainly benefit, though)
Jaron Lanier looks and behaves like a total eccentric, and his ideas can seem pretty out there, but I found his latest book to be a really interesting read. This idea is not the same thing though, because the data ends up being more or less public and demonetized. Companies are already collecting lots of data about you. This, in theory, just gives them even more of it, free of charge.
Your point is of course correct. I was trying to point out the similarity in viewpoint of some of what abraininavat had replied to the OP (not the OP himself).
Ignoring how JL looks and behaves (which is not really germane to the point), the commoditisation of everyone's data by a select few "Siren Servers" (to use Lanier's term) without any payment to the originators themselves is - in the long term - not a viable strategy.
Regardless, I stand by my point that his book is thought-provoking and worrisome, though it does not give any concrete answers.. that of course is up to us as a tech "community" to fix (before we all merrily cut off our noses to spite our faces).
edit: corrected the username i was referring to, and spelling.
I think this is a very libertarian notion that misses out on the social aspects of reputation. What you claim about yourself is often less valuable to a third party than what someone else claims about you. Perhaps you could store signed endorsements made by other people. But then you need support for revocation, and what about negative claims? Free speech includes the ability to say negative things about others.
Social software is very complicated because of issues like this. Something like a personal dropbox is straightforward - single customer, no community. You only have to worry about people hacking you, not what they do to each other. Anyone who gets into social software inevitably sets themselves up for making rules and judging disputes, and that often requires hard decisions.
Or compare email with social messaging; the rights and responsibilities are different and it's due to the software. Personal servers only support setting ground rules in a certain way (for example, the inability to revoke or prevent copies). Centralized services can support communities with different rules, for better or worse.
Agree 100%! The point that's important to remember about discoverability is that it is usually mission-centric.
e.g. - you want to discover the 'right' kind of people to follow, as to have better or more entertaining or more useful or _____________ information. What is 'right' for you may not be 'right' for me, since our goals most likely aren't the same.
I love Naveen's idea (and more importantly his implementation) & will be playing with it a bit more this weekend. But, the value may not be apparent right away.
Aggregating and mining one type of data across a relationship may be more valuable. It may be that I have to track a similar set of data for myself and my significant other to make sense of the aggregates.
Yes, an s-expression is a tree. Imagine that you want to know the average score of a list of films, like (average listOfFilms), and someone else would like the same average for the same list. You will end up with two trees that are in fact the same and can be calculated only once. A DAG is a way to represent dependencies, so two cells can be linked to the same cell (average listOfFilms).
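A small sketch of that idea, assuming s-expressions are written as nested tuples: identical subtrees hash to the same cache key, so a shared node is evaluated once and then reused. The `Evaluator` class, the `OPS` table, and the film scores are all invented for illustration.

```python
from statistics import mean

# Hypothetical operator table; each operator takes a list of evaluated arguments.
OPS = {"average": mean, "sum": sum}


class Evaluator:
    """Evaluates nested-tuple s-expressions, computing each distinct subtree once."""

    def __init__(self):
        self.cache = {}
        self.computations = 0  # how many subtrees were actually computed

    def evaluate(self, expr):
        if not isinstance(expr, tuple):
            return expr  # a literal value
        if expr in self.cache:
            return self.cache[expr]  # shared node: reuse the earlier result
        op, *args = expr
        values = [self.evaluate(arg) for arg in args]
        result = OPS[op](values)
        self.computations += 1
        self.cache[expr] = result
        return result


average_of_films = ("average", 7.0, 8.0, 9.0)

ev = Evaluator()
first = ev.evaluate(average_of_films)                   # computed
second = ev.evaluate(("average", 7.0, 8.0, 9.0))        # same tree, served from cache
total = ev.evaluate(("sum", average_of_films, 1.0))     # reuses the cached average
```

After these three calls only two subtrees have been computed: the average once, and the sum that depends on it. That is the sense in which the two trees collapse into one node of a DAG.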
> Imagine if the wealth of data collected about you all went to one place, and, most importantly, was controlled by you. Imagine if Company X had to pay you to know about your Amazon spending habits instead of paying Amazon.
Even more than Project VRM linked above, this is precisely the eventual goal of Tent: https://tent.io/
That said, I'm skeptical of these projects', or anyone's, ability to achieve this utopia. Why would Amazon or Google adopt such a system? Your shopping/searching history is valuable data that gives them a business advantage.
Also, while a personal data store that you control may be a geek's wet dream, for an average nontechnical person, it would likely be difficult to use, insecure and offer no clear benefits. And if it ends up just a toy for geeks, the lack of scale would make it even harder to do anything interesting with it or get any companies of significant size to play ball.
I do hope ideas like this surface in some form, but I'm skeptical of the personal data store ever really becoming a thing.
> Why would Amazon or Google adopt such a system?
I could imagine laws requiring that personal data be controlled by the subjects.
> it would likely be difficult to use, insecure...
Who says? How can you assume this?
> and offer no clear benefits
Some would consider the ability to keep your information private at will a benefit. Others would consider the ability to sell said information a benefit.
> the lack of scale would make it even harder to do anything interesting
Lack of scale? An individual's data would be controlled by that individual, but the system would scale across all of humanity. I don't think you're thinking big enough -- I'm not talking about each person running his personal EC2.
Look, I'm not saying don't go for this idea. I'd love to see you try it. My doubts shouldn't get in your way. Code it up.
As for why it would be difficult to use and insecure? I'm guessing most of the people you hang out with are probably geeks. For perspective, take a look at this video:
Imagine each one of those people in the video has a personal data store that contains their complete medical history, right alongside all their Google searches, Amazon purchases and tweets.
How secure of a password do you think they picked?
How hard would it be to trick them with a phishing attack?
How much do you think they even know how to do with it in the first place?
The questions you pose also apply to centralized services in exactly the same way. There is no reason that personal clouds would be any less secure than centralized services are today. [1]
Just because some people are confused about the difference between browsers and search engines doesn't mean they don't know how to use either. You'd probably have got the same response if you'd asked about the difference between the Web and the Internet. Or any other complex system they don't have to deal with day-to-day (internal combustion engine, electricity transmission, etc.)
[1] edit: think of things like mass-assignment issue with rails and github or the recent facebook exploit that was posted. Centralized services are just as much of a honey-pot as a personal cloud might be.
No one who steals your Facebook password can get into anything as serious as your entire medical history. The proposal is dangerous precisely because it's centralized — in the sense of centralizing all different types of data on a particular individual. It also, as a side effect of giving the user more control, would give the user more to lose if their account is compromised. Again, with no clear gain for most people.
I feel we're just going to disagree over this but it's probably because neither of us has clearly stated the threat models we're dealing with. Also, it's somewhat hyperbolic to claim that such proposals are 'dangerous'. I could claim the same about the current situation where more and more personal data is handed over to companies, almost by default.
A trivial example from my own recent experience: getting my highlights and notes out of the Amazon Kindle app on my phone.
Amazon could make these available through a simple RSS feed or export feature, but they don't. The only way I can get my hands on them is to scrape their website, which depends on some fragile scripting and is subject to their terms of service.
I really like this idea, but without some kind of positive government regulation, I don't see it happening. At least not in the sort of meaningful ways the OP envisions.
I'm no expert, but it occurs to me that a mountain of data coming in from disparate sources needs the concept of discoverability to be adequately used. That is, there has to be a way for a machine to find occurrences of data that it cares about without specifically knowing the exact attributes of the data.
A simple example off the top of my head. Let's say there's a popular app that collects your pulse from your cellphone somehow, and that data is stored in your "Personal API" in some format. Some scientists write a routine that scans people's pulse patterns (with their permission, say) and warns them of possible risk factors. Later on hospitals start feeding their pulse data into the same personal API, but in a different format and using different units (bpm instead of bph, say), because they've never heard of the cellphone app.
I take "discoverability" as a general concept addressing how the scientists' routine might discover that the hospital data exists in a person's "Personal API", discover that the thing being measured by the hospital is the same as the thing being measured by the app, and discover how to map both formats into a common workable format and units.
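Here is a very rough sketch of the mapping half of that, under a big assumption: every record carries a `kind` and a `unit` drawn from some shared vocabulary, and a registry knows how to convert each unit to a canonical one. The registry, the record layout, and the source names are all hypothetical; agreeing on the vocabulary is the hard part and is not solved here.

```python
# Divisors that convert a (kind, unit) pair to the canonical unit for that kind.
# An assumed registry for illustration, not a real standard.
PER_CANONICAL = {
    ("pulse", "bpm"): 1.0,   # beats per minute is treated as canonical
    ("pulse", "bph"): 60.0,  # beats per hour, as in the example above
}


def normalize(record):
    """Return the record's value in canonical units, or None if the unit is unknown."""
    divisor = PER_CANONICAL.get((record["kind"], record["unit"]))
    if divisor is None:
        return None
    return record["value"] / divisor


def discover(records, kind):
    """Collect every usable value of the requested kind, regardless of source."""
    found = []
    for record in records:
        if record["kind"] != kind:
            continue
        value = normalize(record)
        if value is not None:
            found.append(value)
    return found


records = [
    {"source": "phone_app", "kind": "pulse", "unit": "bpm", "value": 64.0},
    {"source": "hospital", "kind": "pulse", "unit": "bph", "value": 3600.0},
    {"source": "scale", "kind": "weight", "unit": "kg", "value": 80.0},
]

print(discover(records, "pulse"))  # [64.0, 60.0]
```

The scientists' routine only asks for "pulse" and never needs to know that the hospital exists or which units it reports in.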
couldn't one also keep track of the 'source' from where a data point comes (i do this, but don't reveal it in the API - yet)? this way, when it comes to discoverability via some algorithm, you could always separate the 'verified' bucket from the 'unknown' bucket?
It's more about money. This data is worth a lot of money to marketers, retailers, pharmaceutical companies, etc. They have no incentive to share it freely or, I suspect, at a price reasonable to most individual consumers. (Maybe I'm wrong here?)
There was that interesting article a while back about how Target was able to figure out a teenage girl was pregnant before her own father did:
> Target can buy data about your ethnicity, job history, the magazines you read, if you’ve ever declared bankruptcy or got divorced, the year you bought (or lost) your house, where you went to college, what kinds of topics you talk about online, whether you prefer certain brands of coffee, paper towels, cereal or applesauce, your political leanings, reading habits, charitable giving and the number of cars you own. (In a statement, Target declined to identify what demographic information it collects or purchases.)
I'd be very curious, and a little scared, to see all this kind of information collected on me.
I don't think so. Documentation is for people. Discoverability is for machines. Notice I suggested it's the scientists' routine which does the discovery, not the scientists themselves.
Caring about your health and well-being isn't narcissism. It's one of the most important things that you can do for your loved ones - kids, spouse, family and close friends - because a healthy and happy person can support others in need. And I'm a strong believer that our personal data will play a big part in preventive healthcare in the future.
Well in defense of this snarky comment, I have to say that the social sharing part of personal data like this can come across in this way (doesn't make it narcissism but still...).
The thing I like about personal data collection and cool new APIs is what they could do for the individual, i.e. offer new insight and motivation, and encourage better behaviours.
But I hate the sharing part because it opens the door to over-sharing. Not just in terms of corporations having all kinds of data and using it in their interests, but also just simply the noise that gets added by people sharing information that frankly, has very little relevance.
I don't really care if someone just completed a 2 mile run and I certainly don't go to twitter to read automatically generated tweets about it.
Right now, things are fairly sparse. You can check the repo out here: https://github.com/jdjkelly/bodyimage
Contact in profile.