Facebook confirms that it tracks mouse movements (indiatoday.in)
345 points by shahocean on June 13, 2018 | 182 comments



I have always assumed that Facebook uses heat maps to track what is under your pointer. Doesn't every serious site do that to gauge user behavior, interest and interface utilization? I guess the difference is FB is putting it in an available data set along with everything else about consumers' individual lives.

More interesting is "a patent held by the company states that the Facebook app uses voice recognition algorithm, which uses audio recorded by the microphones, to modify the ranking scores of stories in users News Feed." and their speculation that Facebook could soon reveal details about their use of surreptitiously recorded user audio.

Facebook makes a curiously specific denial about audio, which is that it is not used for advertising. Considering their entire business is basically advertising, what does that leave? But all they mean is ad selection. When they were found to be recording audio during the posting of statuses, I believe they claimed it was so they could recognize the music you were listening to, and know something about your mood. So for a long time, I have thought that they use audio to select other content, like friend suggestions, or to inform the selection of stories that appear on your newsfeed.


Like heat mapping a visual UI, there are a lot of people out there recording audio.

However, I think that I disagree with you on whether or not sharing the data is important. If you are heat mapping me, like facebook and probably everyone else from Microsoft to CNN and FoxNews does, or you are recording me like everyone from Facebook to Samsung does, I'm sorry, I've got a problem with it. I don't care if you don't share that data. I don't want Samsung recording what's going on in my living room. Doesn't matter if the data isn't shared. It's just the principle of the thing.

It's gotten to the point where I actually purchased a certain model of Sony TV, because the teardown verified that there is no microphone in it. Then I tossed the remote control and got a generic remote with no voice control.

People joke about me being paranoid, but I'm not paranoid. Sheez... I'm old and boring, I know that no one cares about what's going on in my house or on my computers.

I'm just stubborn.

Why let the privacy invaders win?


I'm on your side, mate. Everyone seems to think people like us are nuts. Only time will tell what side of history we were on. Be the change you want to see in the world and all that. Big faceless corporations make me uncomfortable in general; having their tentacles in my home, with my family, makes me even more uncomfortable. Don't pitch me any nonsense about trying to provide me with a service because that's absurd: you literally have to use technology to be a part of modern society. There's no option. I don't care if people think I'm a wacko for being privacy conscious, I think they're wacko for the opposite. Even if there are no horrible future implications of the surveillance state, which is hard to imagine, I think just trying to exist without having every micro detail of my life profiled by what seems like every mega international on the planet is a worthwhile ambition.

I got carried away rambling, but I mean, come on. Mouse movements? Really? I suppose you have to give them credit: they are creative in a very perverted sense.


Not only that, but it makes the sites literally unplayable on most devices and computers manufactured more than like 2 years ago. The technical bloat is horrendous.


May I ask what model of television that was? Might be worth getting one myself.


43x800d

You need to be careful though because the microphone is in the remote control instead. So buy the remote control WITHOUT voice control, get rid of the remote that comes with the TV, and you're good. There will be no mics in your TV setup. (This will cost you a bit extra. About 20 or 30 bucks for a generic one.)

A second problem is that it is a smart TV, so you have to make sure to NOT connect it to the internet and get your smart services via a box you control. (I haven't been able to prove that Sony siphons smart data usage from its TVs the way the Samsung Android TVs do, but it is technically possible for them to do so.)

In any case, under no circumstances should you get a samsung. You can't even use those tvs without the privacy invasion equipment activated.


Thank you for sharing this information. I have a 10 year-old tv that I might get replaced soon, so your comment is very helpful.

>You can't even use those tvs without the privacy invasion equipment activated.

Could you elaborate further? If I don't connect a samsung smart TV to my network, shouldn't I still be able to use it as a normal TV?


Thanks. I'm in similar circumstances to jake_the_third and have been wondering for some time about how to find a decent television that's not full of this.

Indeed, I would never buy a Samsung smart TV (or phone).


I agree with all of that. Personally I don't even use a smart phone because of the lack of controls over knowing what or who is tracking whatever crazy shit. I either carry a laptop or a netbook when I go places, which isn't perfect still, but with proper precautions is fairly secure.

I don't do anything I'm worried about people finding out, but that doesn't mean I want to let them listen in either. Maybe if someone was upfront about it from the beginning or showed us what they were truly doing, but seeing how shady it all seems to be makes me assume they are doing shady shit with it.


The difference is how the data is used and whether it's associated with an individual's permanent personal data. If it's only gathered anonymously and used for internal UI improvement, that isn't objectionable. At the other extreme, association with a real-world individual enables many uses that are potentially harmful, such as unmasking those who legitimately prefer to be anonymous.

Edit: I wanted to add that I didn't intend to focus on whether the data is shared. I think FB having and using it is bad enough, especially if they're ubiquitous. Also, once anyone creates such data, other entities such as governments will seek to obtain it and likely do so eventually.


Anonymous data is one identifier or clever match away from identification. This is particularly severe for sites/services/hardware that records audio (man, who would knowingly allow that kind of abuse??), but it can apply to mouse tracking too. Mouse tracking fingerprints could be used to re-identify all sorts of other things.


It depends how much data is collected in the first place, and how much is available to the person trying to break anonymization. If I'm not mistaken, everything is deanonymizable with global traffic analysis.


For some people, there is no difference ultimately. It's not about whether they are doing anything malicious with the collected data now; it's about the fact that they CAN do it in the future if they desire to (ethical or not). The data is being collected, stored (perhaps indefinitely), and will always be accessible. In a world where capitalism reigns, to think that any large corporate business would treat our data with the best care and in our best interests seems a little silly. I have always held the belief that businesses are not people and it's reasonable to expect that businesses may not be inclined to always do the right thing, especially if it gains them more money and power.


I'm picturing the minimum being a system that collects nothing more than mouse movements, rather than also IP address, full URL, user account, and other details that could easily tie it to a person. I mean the minimum more as an abstract ideal than something anyone actually does.


There is no such thing as anonymous data. The belief that something is anonymous is simply an aspect of statistical ignorance or naiveté.

And, since we can't know in advance all the ways data can be combined, recombined, projected, and analyzed there is no such thing as informed consent to use said data unless specifically restricted to a single analysis using only given data.


I realize this, but I can picture a bare minimum store of heatmap generated data that would be extremely difficult to use for anything other than knowing what people on the website clicked on. Indeed, the more info collected, the more likely someone can combine it with other data to make broader conclusions.

For example, any time you store a precise time in connection with user actions, that has privacy implications. I picture simply not recording the time or exact URL. If the system is designed without any sort of privacy in mind, and just records whatever data is convenient (and too much of it), that's easier to abuse than one that intentionally records a minimum with privacy in mind. I agree it's amazing the way all of this can be subverted, and yes, I realize that HN is stocked with data scientists who are more knowledgeable about this than I am.


Perhaps a noob question but doesn't the browser ask for permission to use the microphone? And hence: they can't really listen in if you haven't given it explicitly.


If you attempted to record a video once with the Facebook app, it then has the microphone permissions forever.


Plus: At least Instagram and Messenger will force you to enable the microphone to let you use the camera even if just to take pictures. Now I always take pictures from the Camera app and simply load it into the app instead of giving them permission to use the microphone.


Can apps with access to the camera simply record video and audio whenever they want, even while running in the background?


Yes. AVFoundation provides both a turn-key camera UI view and direct control to the camera and microphone.

Correction: Not in the background, but it could do it silently while the app is in the foreground.


Looks like Android P blocks background access to camera and mic too: https://www.theverge.com/2018/3/7/17091104/android-p-prevent...

I can only assume that up until P developers could record both sound and video in the background. That's some scary shit.

[edit] Why isn't there a system level visual indication or audit trail of any camera or mic access? Surely this would be trivial to implement, you could even disable it if you just didn't give a toot about privacy at all.


Indeed. Finally, in Mojave, Apple is adding direct control over which apps have access to the camera and microphone, but in general iOS is miles ahead of macOS with regard to privacy controls.


This is just truly bad if they use this permission to then record you without your knowing! You gave the permission to take a video when you wanted to, not when they want to.

I wonder if the apps and browsers will get a "grant access for 5 minutes" setting, so that one can feel safe using these services. EDIT: Or maybe a lock-screen/pulldown-screen (or whatever it's called) notification that has a "remove permission" button so that you can remove it when you're done.


Firefox has both the option of granting the permission only for the current pageload (this is in fact the default behavior!) and exactly the "click the icon and revoke the permission" behavior you are asking for once you have granted the permission. So "browsers" already do this, for some values of "browsers". :)

[Disclaimer: I work on Firefox.]


Many permissions like that are used in ways a typical user doesn't expect. Companies will explain it away by saying you agreed to it in the terms of service, although it's generally acknowledged that almost nobody reads or understands those terms for any company. For example, most people don't realize that when you give an app read access to your photos, it probably will scan every single one of them for data and upload it to the app's maker. Practices like this are so common that I don't even think the developers understand that they're abusing users' trust.


On macOS Safari, any location request through the browser API (not based on IP geolocation server-side) only gives the option to allow it for a day, not permanently. It's a start.


Does it?

In Firefox you can grant the permissions once (for the current page only) or grant them forever, your choice. If your browser doesn't offer that choice, that's a problem with your browser.

[Disclaimer: I work on Firefox.]


You are assuming the facebook.com page. He was talking about one of their apps.


That's fair, now that I reread his comment, but he was responding to a comment that was talking about browsers, not apps...


Is it like this in Android, iOS or both?


I’m pretty sure iOS doesn’t let you access either the camera or the microphone without displaying a banner.


Only the first time, then any use is permitted while the app is in the foreground (on iOS at least) unless the user manually disables it.


"Facebook App"


Webtrends had a product that launched, I wanna say more than 10 years ago, that generated heat maps of mouse movement

The media and general public seem to be 10-15 years behind when it comes to understanding how the things they rely on, and the tools being used to “improve” them, work

Though IMO, a lot of the blame is on Facebook and the whole lot for avoiding discussing openly in order to avoid fallout. Just asking customers too is out of the question, of course. Cause BIG Corps are smarter than their customers

Only once you’re “too big to fail” can you be honest about your shady BS


> "a patent held by the company states that the Facebook app uses voice recognition algorithm, which uses audio recorded by the microphones, to modify the ranking scores of stories in users News Feed.

Does the patent really state that Facebook does that, or did Facebook just spam the patent office with obvious ideas about how they could do that? Big tech companies have loads of trivial patents on stuff they have no firm plans to build, just to stake out IP territory.

> I believe they claimed it was so they could recognize the music you were listening to, and know something about your mood.

When did this happen, and why wasn't it frontpage on HN and all the news sites?


I don't know, does the patent state that? We should find it and read it.

As far as the status audio, I'm sure it has been discussed on HN. I don't have the time to dig up all the info right now but here is FB's take on that: https://newsroom.fb.com/news/2014/05/a-new-optional-way-to-s...


Audio might be used to locate users that are nearby (and hear the same sound). Or to detect what TV program/radio/music the user is listening to. Both of these are, of course, invasions of the user's privacy.


I feel like they even want to know the basics, like whether you are in a loud place or a quiet place.


Just to be clear - users had to give explicit permission for that status update recording feature back in 2014. There are no external apps that can record audio on Android without permission unless Google explicitly whitelists the permission for the app. This never happens because there is no use case where the user shouldn't know.

One bad thing about this system for Android is how much control Google has over permissions - for example, their own built-in Shazam...


Facebook's various apps ask for the microphone permission, anyway, for other reasons. How could we verify they aren't recording or analyzing audio at times?


They are using speech spectral analysis to gather emotion context from users, used in conjunction with syntactic emotional features to tailor ads based on user mood.


I'm not surprised. Honestly, I stopped trusting them years ago, so pretty much all the words out of their mouths I consider to be BS. I'm not even sure why people apologize for creepy-ass multi-billion dollar companies that treat their users like products and study them like lab rats.

I'm looking forward to technical solutions to verifiably disable and prevent this kind of tracking.

To those talking about "tracking UI usage": do a UX study. Sit some people down and watch how they use the site. Ask them questions about what works and what doesn't. Stop spying. You got all this damn money and you can't be bothered to actually lift a finger to do some difficult work that involves interacting with all those "dirty" people out there. FFS most people would probably be happy to fill out a survey if it actually would impact the product in a positive way. Creepiness is creepy.


I am not affiliated with Facebook myself; I barely use it, but given how big it is I bet they do user studies in addition to metrics collection. Metrics alone leave out important parts of the human element, but user studies alone are a biased sample.


Given the speed and comfort of not doing user studies, I expect "move fast and break things" Facebook to not care about user studies at all


I know several UX researchers at facebook. Yes, they do in person user studies.


I'm not sure if this is actually how reCAPTCHA v2 works, but I've found moving my mouse and highlighting text immediately after ticking the box almost certainly passes the tests for human. I very rarely get asked to recognise images or pick X out of Y. When I don't do this, i.e., I don't move my mouse or highlight text on a page at random, I most certainly have to sit through a couple of screens of tests (I'm behind a shared IP with lots of users).

All this leads me to think mouse movements tracking is much more widespread.


I have a similar experience. Filling in fields and switching with the TAB button then pressing ENTER always brings me to a visual recognition test, while manually clicking on the fields (and adding a bit of sloppiness) is most of the time an immediate validation from reCAPTCHA2.

However I'm pretty sure this was advertised or at least acknowledged by Google in the launching of reCAPTCHA v2.


This is correct, mouse movement is one factor for Google's one-click Captcha:

https://www.wired.com/2014/12/google-one-click-recaptcha/

> IP addresses and cookies provide evidence that the user is the same friendly human Google remembers from elsewhere on the Web. And Shet says even the tiny movements a user’s mouse makes as it hovers and approaches a checkbox can help reveal an automated bot.


Oh, that's probably why i always have to go through those tests, reCAPTCHAv2 is not vimium-friendly.


I rarely use the mouse, so this might explain why recrapcha v2 always thinks I am a robot


According to lcamtuf[1], how a person moves the mouse and uses the keyboard is unique to them; it's your personal digital fingerprint, which can be transferred to track you elsewhere.

[1] From Book "Silence on the Wire"


You should check out https://www.typingdna.com/ a startup making this technology available for everyone.


I could see using this for fraud detection and then kick off a 2FA flow if it couldn't verify.

However, most biometrics aren't the best single line of verification. You still have to add a backup verification of some kind.

Examples: I was working on my car all weekend, my hands have new calluses on them, and my fingerprint is off. I got a new keyboard and my typing pattern is different, etc.


Unfortunately, it's not just JS, but several other indicators. In practice, if you don't use the Tor Browser, it's as if you decided to leave tracks everywhere, almost identifying yourself. Yet, this is not common knowledge among web users. These are the things that children need to be taught in schools. The society as a whole needs to be aware and learn how to protect itself.


That is victim blaming, just because you do not know that everybody out there is tracking you does not mean that you have decided to let them.

I agree that this is something people should be more aware of and school is a good place to start. However it is up to browser manufacturers to fix this, not users.


The comment says "it's as if" you are intentionally leaving traces everywhere. So, the effect is the same as doing it intentionally, not that users are to blame somehow.


Wonder if the Facebook Containers for Firefox can stop this type of fingerprinting too.


True, but I'd wager that if Facebook is using mouse movements to track users, it means those other methods aren't foolproof.



I have a network meter on my xfce taskbar, and it shows uploads and downloads whenever I move my cursor on a website I view with a JS enabled browser. It's almost all of them.


Correlation doesn't imply causation.

(You can always open the browser inspector and check network traffic for each page or, if you are using Chrome, dive into chrome://net-internals/ )


Hang on - the one time correlation does strongly suggest causation is when trials include randomized interventions, which is pretty much the case here (once in a while, move my mouse and see if network traffic spikes).


Yes, certainly. But I don't have background programs that connect randomly, and I have observed this enough times that I can say this.


As somebody who worked in the past on a piece of software that generated heatmaps from cursor movements on websites, I can confirm that it's a very widespread thing. Well, it was ~5 years ago, so I'd guess it's even worse now.


How did you use this data? Well, except for bot detection that is. I can't think of a particularly useful way this data can be utilised.


For a non-nefarious use case, it can be used to iterate on the UI to create a better user experience because it can expose areas that people aren't seeing on the webpage. Your site might have the important content or useful navigation in a place that users aren't noticing which causes them to leave the site in frustration.


Yeah, basically this. The data was aggregated and a graphical heatmap was displayed on top of the website, with some fancy accommodation for responsive designs. You could see heatmaps for hovers and for clicks. Customers then optimized their shop flow, adjusted graphics that looked like they were clickable but weren't, moved important content into more visible places, etc.
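For anyone wondering what "aggregated into a heatmap" can look like in practice, here is a minimal sketch. It is not the tool described above: the function names, the 50 px cell size and the sample coordinates are all made up for illustration, and it ignores the responsive-design handling entirely.

```python
from collections import Counter

def build_heatmap(points, page_width, page_height, cell=50):
    """Bin (x, y) pointer positions into a grid of cell-by-cell pixel squares.

    Only the per-cell count is kept, so individual sessions are not
    recoverable from the result. (Hypothetical helper, for illustration.)
    """
    counts = Counter()
    for x, y in points:
        # Ignore positions that fall outside the page.
        if 0 <= x < page_width and 0 <= y < page_height:
            counts[(int(x // cell), int(y // cell))] += 1
    return counts

def hottest_cells(counts, n=3):
    """Return the n grid cells with the most recorded positions."""
    return counts.most_common(n)

# Made-up click positions from several visitors.
clicks = [(10, 10), (20, 30), (49, 49), (120, 80), (130, 90), (900, 400)]
heat = build_heatmap(clicks, page_width=1000, page_height=800)
print(hottest_cells(heat, 2))
```

The same binning works for hover positions; a renderer would then color each cell by its count and overlay it on the page.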


> Facebook said that it tracks mouse movements to help its algorithm distinguish between humans and bots.

Stupid cat and mouse game. How difficult would it be for a bot to simulate a human's mouse movements? I suppose not very difficult.

Also, doesn't this conflict with rare types of input devices? Or people with a motor function disability?

> to also determine if the window is foregrounded or backgrounded

Shouldn't there be an API for that?


    How difficult would it be for a bot
    to simulate a human's mouse movements?
Very very hard. Because the bot author does not have the giant database that FB has to analyze how humans move the mouse around. Also the bot author does not know which aspects FB looks at to determine if it's a human.

And even if the bot author had all that information, it would still be super hard to write an AI that accomplishes a given task in a way that mimics a human successfully. It would mean winning a 'mouse Turing test'.

    Shouldn't there be an API for that?
What the API returns is under the control of the user. So the API does not help FB to fingerprint you.

This issue touches on the real privacy problem the net is facing. It's not the wrong cookies or privacy policies. It's fingerprinting. There is no technical solution to it.


> Very very hard. Because the bot author does not have the giant database that FB has to analyze how humans move the mouse around.

Don't forget that Facebook's false positive rate should be very low. There are lots of humans on their platform, and they should all pass the test.

This makes it easier to construct a bot that will pass the test.


It won't hurt if humans fail the test every so often, as long as it's under a threshold that humans regularly can overcome.

I can imagine it would be easy to trick the system a few times (either as a bot pretending to be human, or a human acting like a bot), but tricking it consistently over months or years is going to be damn near impossible.


Also don't forget that Facebook probably has to do all detection in Javascript on the client, i.e. with limited resources. I suspect they don't send every mouse-movement to the server. This also means they probably don't have fine-grained historical data.


Not necessarily.

I've only given it a few minutes thought, but position and time data is really small, and easy to compress (you don't need to send anything while the user isn't moving the mouse). If it's sent in batches or over an already open websocket, it's not like it's using a ton of resources on the client.

Assuming all of their users (guessing a billion daily active users) are on desktop half of the time (a wildly incorrect assumption I'm sure), and the mouse position data is 1 MB per person for the data you care about (which again, seems like a lot), that's 500 TB.

For $25k you could store it all. That's nothing compared to the benefits of being able to identify bots on your platform.
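The back-of-envelope numbers are easy to check. The sketch below only restates the guesses from the comment (user count, desktop share, 1 MB per person) and works out the storage price per terabyte that the $25k figure implies; none of these are measured values.

```python
# All inputs are the comment's own guesses, not real Facebook figures.
daily_active_users = 1_000_000_000
desktop_share = 0.5          # "on desktop half of the time"
mb_per_user = 1              # mouse data per person

total_mb = daily_active_users * desktop_share * mb_per_user
total_tb = total_mb / 1_000_000          # decimal units: 1 TB = 1,000,000 MB

# Price per TB implied by "for $25k you could store it all".
implied_price_per_tb = 25_000 / total_tb

print(f"{total_tb:.0f} TB, implying ${implied_price_per_tb:.0f} per TB")
```

So the estimate assumes raw storage at roughly $50 per terabyte, with no allowance for replication or backups.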


Yes, the standard way to do this a few years back, for conversion optimization, was to RLE-compress the data and send it at intervals. Also, the measurement resolution does not need to be in the milliseconds.
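A rough sketch of why run-length encoding suits this data: if positions are sampled at a fixed interval, long idle stretches and steady movements turn into repeated deltas that collapse into a single run. The encoding below (deltas plus a repeat count) is an assumption for illustration, not the format any particular vendor used.

```python
def rle_encode_path(samples):
    """Encode fixed-interval (x, y) samples as a start point plus runs.

    Each run is [dx, dy, count]: the same step repeated count times.
    An idle pointer becomes a single [0, 0, n] run.
    """
    if not samples:
        return None, []
    runs = []
    prev = samples[0]
    for cur in samples[1:]:
        dx, dy = cur[0] - prev[0], cur[1] - prev[1]
        if runs and runs[-1][0] == dx and runs[-1][1] == dy:
            runs[-1][2] += 1          # same step as before: extend the run
        else:
            runs.append([dx, dy, 1])  # new step: start a new run
        prev = cur
    return samples[0], runs

def rle_decode_path(start, runs):
    """Rebuild the original samples from a start point and runs."""
    if start is None:
        return []
    x, y = start
    out = [(x, y)]
    for dx, dy, count in runs:
        for _ in range(count):
            x, y = x + dx, y + dy
            out.append((x, y))
    return out

# Idle, then a short steady move, then idle again (12 samples in total).
path = [(100, 100)] * 5 + [(102, 101), (104, 102), (106, 103)] + [(106, 103)] * 4
start, runs = rle_encode_path(path)
print(start, runs)
```

Twelve samples shrink to a start point and three runs here; a batch like this could then be sent every few seconds rather than on every mouse event.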


You could probably start with recording all of your mouse movements over a period of a month or so. Record speed, acceleration, how straight each move is, how much each move deviates from a straight line, how many times the mouse movement stops along its way to its target, where you place the mouse when you scroll, etc.

Using these metrics you could probably start to draw some characteristics of how your mouse acts based on what you are doing and where you are moving your mouse.
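To make the metrics concrete, here is a minimal sketch that computes a few of them for one recorded movement. The sample format (timestamp in milliseconds, x, y), the function name and the pause threshold are assumptions for illustration; a real recorder would need to handle much noisier input.

```python
import math

def path_features(samples, pause_speed=0.05):
    """Summarize one pointer movement.

    samples: list of (t_ms, x, y) ordered by time, at least two entries.
    pause_speed: segments slower than this (px per ms) count as a stop.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")

    speeds = []
    travelled = 0.0
    for (t0, x0, y0), (t1, x1, y1) in zip(samples, samples[1:]):
        dist = math.hypot(x1 - x0, y1 - y0)
        travelled += dist
        dt = t1 - t0
        speeds.append(dist / dt if dt > 0 else 0.0)

    (_, sx, sy), (_, ex, ey) = samples[0], samples[-1]
    direct = math.hypot(ex - sx, ey - sy)

    # Largest perpendicular distance from the straight start-to-end line.
    max_dev = 0.0
    if direct > 0:
        for _, x, y in samples:
            dev = abs((ex - sx) * (sy - y) - (sx - x) * (ey - sy)) / direct
            max_dev = max(max_dev, dev)

    # Count stops: each unbroken run of slow segments is one pause.
    pauses = 0
    in_pause = False
    for s in speeds:
        slow = s < pause_speed
        if slow and not in_pause:
            pauses += 1
        in_pause = slow

    return {
        "mean_speed": sum(speeds) / len(speeds),
        "max_speed": max(speeds),
        "straightness": direct / travelled if travelled > 0 else 1.0,
        "max_deviation": max_dev,
        "pauses": pauses,
    }

# A made-up movement: up and right, a brief stop, then down and right.
move = [(0, 0, 0), (100, 30, 40), (200, 30, 40), (300, 60, 0)]
features = path_features(move)
print(features)
```

Straightness is the direct distance divided by the distance actually travelled, so 1.0 means a perfectly straight move.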

This could then probably be used to build some form of algorithm that moves the mouse for you with noise (accelerating up & down along the way, deviation from a straight line, stopping in the middle of the line, etc.).
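A toy version of such a generator might look like the sketch below. Every parameter here (step count, wobble size, pause probability) is an arbitrary placeholder rather than something derived from recorded data, and the independent per-sample jitter is much cruder than real hand motion, so this only shows the shape of the idea and says nothing about whether it would fool a real detector.

```python
import math
import random

def humanlike_path(start, end, steps=40, wobble=8.0, pause_chance=0.05, rng=None):
    """Generate pointer positions from start to end with simple noise.

    - smoothstep easing: slow start, faster middle, slow finish
    - sideways wobble of at most `wobble` pixels that vanishes at both ends
    - occasional repeated sample to imitate a hesitation
    """
    rng = rng or random.Random()
    (sx, sy), (ex, ey) = start, end
    dx, dy = ex - sx, ey - sy
    length = math.hypot(dx, dy) or 1.0
    nx, ny = -dy / length, dx / length    # unit vector perpendicular to the line

    points = []
    for i in range(steps + 1):
        u = i / steps
        eased = u * u * (3 - 2 * u)                              # acceleration profile
        offset = wobble * math.sin(math.pi * u) * rng.uniform(-1, 1)
        points.append((sx + dx * eased + nx * offset,
                       sy + dy * eased + ny * offset))
        if 0 < i < steps and rng.random() < pause_chance:
            points.append(points[-1])                            # brief stop mid-move
    return points

# Seeded so the demo is repeatable.
demo = humanlike_path((0, 0), (300, 200), rng=random.Random(42))
print(len(demo), demo[0], demo[-1])
```

Replacing the placeholder parameters with distributions measured from your own recordings would be the next step the comment describes.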


I was thinking more of a machine-learning approach, e.g. using a GAN.


Many ways to skin a cat. Not an impossible thing to solve in my book. But I don't deny that it would probably be difficult.


I was thinking the same. Maybe we should do it together...


All those "the attacker won't have the knowledge" underestimate that the attacker can simply run their own websites tracking the exact same stuff, and can then just get the same knowledge.

You need to break ReCaptcha? Simple, you implement your own captcha on your own site that's frequently used and whenever you need to solve one you copy the challenge and present it to one of your users.

Same with recording mouse data.

It's an old idea even, very similar to https://xkcd.com/792/


> Very very hard. Because the bot author does not have the giant database that FB has

You don't need to learn from all humans. You need to learn from very few (or just even one).

Not all problems are machine learning problems.


Not only that, this can be made a machine learning problem if needed. I'm a human, so if I train my computer to act like my mouse movement, it's sufficient to fool Facebook. Well, now I realize that this is not as easy as it sounds since, as others pointed out, we don't know how fine-grained Facebook's data is and what they're paying attention to. I'm just saying that theoretically I should be able to train my agent to act just like me.


I don't think it should be that difficult. Project for a weekend hackathon or so. Collect same mouse movement data with a js on your own website if you own anything mildly popular or partner with someone and buy it off, stick it into some off the shelf GAN, job's done. Turing test is broken by modern deep learning

At least the mouse movements themselves shouldn't be difficult to do given a source of data. Simulating that you click on the same FB UI elements as real people, with the same statistical properties, on the other hand, is where you might be lacking the data to do it properly.


> stick it into some off the shelf GAN, job's done. Turing test is broken by modern deep learning

This is a cartoon version of deep learning.


Google's already doing it with captcha: https://security.stackexchange.com/questions/78807/how-does-...

But does Facebook track users during everyday sessions, or just during some validation action?


As for the former, I'm pretty sure I'd trigger as robot since I use keyboard mostly (using some special features in Firefox) just because they make it a lot faster and nicer.

As for the latter, iirc there are blur/focus(-like) events for the window object. Maybe mouse movements give them better confidence? Because of course you want to make absolutely sure your users are seeing all the ads.


I wouldn't be surprised if they tracked keyboard input as well.


Yeah that's why I think they'd classify me as a robot.


>Stupid cat and mouse game. How difficult would it be for a bot to simulate a human's mouse movements? I suppose not very difficult.

http://idlewords.com/talks/website_obesity.htm An interesting read on this cat and mouse game.


> Stupid cat and mouse game.

I suspect this is part of covering their ass for GDPR. Pose everything as a security problem, so you can claim you have legit interest in tracking all of that.

> How difficult would it be for a bot to simulate a human's mouse movements?

Simulating? Extremely difficult. Perturbing pre-recorded paths is a bit easier to do, but requires pre-recording of a lot of paths. One of the fastest ways to get your Poker bot banned is to not fix this one way or the other.

> Also, doesn't this conflict with rare types of input devices? Or people with a motor function disability?

It still tracks the mouse movements, just now being able to classify their users as disabled or using an arcane device (both of which are interesting tidbits to add to your advertisement profile).


> Shouldn't there be an API for that?

There is one : https://developer.mozilla.org/en-US/docs/Web/API/Page_Visibi...

Although I think it can be a privacy concern and somewhat of an anti-feature. For instance, Youtube uses that API to stop playback on mobile when the page or browser is not in the foreground. Of course, there's an extension for that ...


> How difficult would it be for a bot to simulate a human's mouse movements? I suppose not very difficult.

I also suppose not very difficult in principle (imitate any nearby human's movements and Facebook should not complain), but it is not within the focus of a general bot developer and therefore makes the whole project exponentially more difficult.


I actually think it's difficult to simulate mouse movements. Is there even a way to do so using PhantomJS or headless Chrome?


I would imagine that FB is not using these libraries but instead using their engineering teams to develop these "solutions." Perhaps some of our friends are actually "working against" our privacy.


Ironically, the more interested I am in part of a page the less likely my mouse is going to be over or even anywhere near it. I find the pointer distracting.

If I'm interested in the content of a page, I either swipe the mouse off to the edge of the screen or put it on an area of whitespace so I can scroll with the scroll wheel.

The thought that people are hovering their pointers over stuff they're actively looking at strikes me as odd. Oh well.


This is beyond absurd. Why would you do that? What could go wrong? Is this a social network or spy agency?


Lots of websites you would not expect do this; it's not for spying on users but to improve the user experience.

Being able to aggregate data or inspect individual sessions is a useful tool to learn how users navigate a site.

Keyboard keystrokes get captured too but the systems are intelligent enough to filter out passwords and payment details.

I don't really like this form of monitoring either but I've seen it in several companies.


> Keyboard keystrokes get captured too but the systems are intelligent enough to filter out passwords and payment details.

Sure. Except they are not.


> Keyboard keystrokes get captured too but the systems are intelligent enough to filter out passwords and payment details.

Citation needed. But I did not realize that before, so thank you very much for this information. I will deactivate JS on every page with a password field from now on.


Relevant:

"Following the recent report that Mixpanel, a popular analytics provider, had been inadvertently collecting passwords that users typed into websites, we took a deeper look. While Mixpanel characterized it as a “bug, plain and simple” — one that it had fixed — we found that:

- Mixpanel continues to grab passwords on some sites, even with the patched version of its code.

- The problem is not limited to Mixpanel; also affected are session replay scripts, which we revealed earlier to be scooping up various other types of sensitive information.

- There is no foolproof way for these third party scripts to prevent password collection, given their intended functionality. In some cases, password collection happens due to extremely subtle interactions between code from different entities."

https://freedom-to-tinker.com/2018/02/26/no-boundaries-for-c...


I have worked in the past on a tool that recorded user sessions on websites and keystroke collection didn't end up implemented only because we were a small enough company that my strong stance against it could actually block it. It was a feature that often came up from the product team after discussions with customers, and IIRC some competitors already had it implemented back then.

Our own prototype, from before I actively joined that particular project, was tested on a live website and helpfully displayed the full content of a textarea in a request form that somebody had started to fill in before deciding not to include some of the details. That was a big eye-opener for me that it's absolutely not the right thing to do. We ended up implementing a debounced indicator, "some typing activity is occurring right now on this field", but we still had to deal with feature requests about content collection.

Judging from the quality of some of those competing solutions, I certainly wouldn't bet that they're "intelligent enough". Maybe in 75% of the most common cases, maybe.
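
To make the "debounced indicator" idea concrete, here is a rough sketch, assuming the recorder only ever needs a boolean and never the field content. The name `createTypingIndicator`, the one-second quiet period and the injectable `timers` object are all illustrative assumptions, not the actual tool's code:

```javascript
// Hypothetical helper: reports only *that* typing is happening, never *what* is typed.
// onChange(true) fires on the first keystroke of a burst, onChange(false) after
// `quietMs` milliseconds without any keystroke.
const defaultTimers = {
  set: (fn, ms) => setTimeout(fn, ms),
  clear: (handle) => clearTimeout(handle),
};

function createTypingIndicator(onChange, quietMs = 1000, timers = defaultTimers) {
  let handle = null;
  let active = false;
  return function notifyKeystroke() {
    if (!active) {
      active = true;
      onChange(true);
    }
    // Restart the quiet-period countdown on every keystroke.
    if (handle !== null) timers.clear(handle);
    handle = timers.set(() => {
      active = false;
      handle = null;
      onChange(false);
    }, quietMs);
  };
}

// Browser wiring would be something like:
//   field.addEventListener('input', createTypingIndicator(reportActivity));
// where `field` and `reportActivity` are placeholders. The event object is
// deliberately ignored, so the field's value never leaves the page.
```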


Just because it may help out the company and you've seen others do it doesn't make it any less spying on your users.

It would also help out Facebook if someone sent them a daily minute-by-minute log of what I was doing. That doesn't mean they should go try and do it.


There are some posts from Princeton on password and CC leaks for trackers such as this, e.g.

https://freedom-to-tinker.com/2017/11/15/no-boundaries-exfil...


>it's not for spying on users but to improve the user experience

So, it's for spying on users. What happens then?


> Is this a social network or spy agency?

A modern social network is a spy agency where you file your own reports on yourself in return for being able to read those of other people.


Under the hypothesis that people move their mouse to where they're intending to interact with your page, you can build a heat map of mouse locations and show your UX people where on the page users are focusing their attention. I'd probably rate it as one of the more benign forms of telemetry.
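
As a sketch of what "build a heat map" means in practice: bucket the recorded pointer positions into a coarse grid and count hits per cell. The function name `buildHeatmap` and the 50-pixel default cell size are assumptions for illustration:

```javascript
// Hypothetical helper: counts pointer samples per grid cell.
// `points` is an array of { x, y } page coordinates in pixels.
function buildHeatmap(points, pageWidth, pageHeight, cellSize = 50) {
  const cols = Math.ceil(pageWidth / cellSize);
  const rows = Math.ceil(pageHeight / cellSize);
  const grid = Array.from({ length: rows }, () => new Array(cols).fill(0));
  for (const { x, y } of points) {
    const col = Math.floor(x / cellSize);
    const row = Math.floor(y / cellSize);
    // Ignore samples that fall outside the page.
    if (row >= 0 && row < rows && col >= 0 && col < cols) {
      grid[row][col] += 1;
    }
  }
  return grid;
}
```

Cells with high counts get drawn hot, cells with zero stay cold. Note that this aggregates across users, which is part of why it feels more benign than per-session replay.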


Pretty much every major website does this. Not only that, if you are using Windows 10, it does this too. Not only does Windows 10 track your mouse movements, it tracks your keystrokes, scans your filesystem and sends the metadata back to Microsoft.

It's a very anti-privacy environment we live in right now. But it's for our own good or so we are told.


To be fair, you can turn that stuff off. But most people probably don't understand that it's doing that, so they don't even think of it.


> Is this a social network or spy agency?

Yes

Centralised social networks are by nature spy networks in that they collect a huge amount of data on a large group of people, both data concerning those people as well as data relating those people to any of the others. As to whether a centralised social network uses these data for nefarious purposes or only to provide a service to the users is up to those currently in control of the network. Ownership can change, a once-benign social network can turn into a nightmare overnight as the data will be there for the new management to exploit.


What!? Plenty of websites do this.


To be able to detect robots. Makes sense since most users will move their mouse in a massively different way than default robots. There is also probably some product reason like being able to see if people had their mouse around an ad before clicking it, etc.
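
One very naive signal of that kind, purely as an illustration: compare the straight-line distance between the first and last pointer sample with the distance actually travelled. A default robot that jumps or glides in a perfect line scores 1, while a human path usually wanders. The name `pathStraightness` is made up, and nothing here is known to be what Facebook does:

```javascript
// Hypothetical heuristic: ratio of direct distance to travelled distance.
// 1 means a perfectly straight path; values closer to 0 mean a wandering one.
// `points` is an array of { x, y } samples in the order they were recorded.
function pathStraightness(points) {
  if (points.length < 2) return 1;
  let travelled = 0;
  for (let i = 1; i < points.length; i++) {
    travelled += Math.hypot(
      points[i].x - points[i - 1].x,
      points[i].y - points[i - 1].y
    );
  }
  const first = points[0];
  const last = points[points.length - 1];
  const direct = Math.hypot(last.x - first.x, last.y - first.y);
  // A pointer that never moved counts as "straight".
  return travelled === 0 ? 1 : direct / travelled;
}
```

A bot that adds a bit of curvature and jitter defeats this immediately, so a real system would need far more than one ratio.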


If robots can mimic human conversations, this should also be trivial. This not only looks like a lame excuse, but also massively underestimates the ability of automatons that mimic humans.


This should be no surprise — received chat messages are marked as read on Facebook if you wiggle your mouse after a few minutes of inactivity.

To be fair, this makes sense from a UX standpoint. You don't want messages to be marked as read if you have your window open but have walked away from the computer.


This one is quite a big quality of life improvement IMO. It used to be that I sometimes left my browser accidentally open on facebook, and people would get offended because I 'read' their messages and didn't respond.


Maybe tell your friends that you didn't actually see their message, and that the algorithm FB uses to record mouse activity sometimes reports messages as read when they have not been. This sounds dangerous and could be harmful to a person's credibility if someone with bad intentions exploits it.


From a UI perspective it makes sense to persist my mouse activity to a database for eternity?


Earnest question from someone without much front-end knowledge: how is browser-based mouse tracking performed in a way that doesn't significantly degrade performance?

Do they use some type of client-side library that caches data for a while and asynchronously uploads it occasionally? Or occasionally try to asynchronously sample the mouse position and just get a coarser set of data?

It seems like real-time requests that respond to mouse changes would create huge performance problems and/or be easily stopped with browser extensions.


Capturing, collecting and forwarding mouse movement events can be done at almost no cost. Ten lines of code at most.
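
A rough sketch of the "buffer locally, upload occasionally" approach the question describes. Everything named here (`createMouseBatcher`, the `/hypothetical/mouse-log` endpoint, the batch size and the 5-second interval) is an assumption for illustration, not anything Facebook is known to run:

```javascript
// Hypothetical helper: collects pointer samples and hands them to `send`
// in batches instead of making one request per event.
function createMouseBatcher(send, maxBatch = 50) {
  let buffer = [];
  function flush() {
    if (buffer.length === 0) return;
    const batch = buffer;
    buffer = [];
    send(batch);
  }
  function record(x, y, t = Date.now()) {
    buffer.push({ x, y, t });
    if (buffer.length >= maxBatch) flush();
  }
  return { record, flush };
}

// Browser wiring (skipped when there is no `document`, e.g. under Node).
if (typeof document !== 'undefined') {
  const batcher = createMouseBatcher((batch) => {
    // sendBeacon queues the upload without blocking the page.
    navigator.sendBeacon('/hypothetical/mouse-log', JSON.stringify(batch));
  });
  document.addEventListener(
    'mousemove',
    (e) => batcher.record(e.clientX, e.clientY),
    { passive: true }
  );
  setInterval(batcher.flush, 5000); // also flush on a timer
}
```

The listener only pushes onto an array, so the per-event cost is tiny; the network work happens once per batch. An extension can still block the upload URL, of course.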


> We collect information from and about the computers, phones, connected TVs [emphasis mine]

Personally, I find this even more disturbing. Does this "only" apply to TVs where a user is logged on, or are they also building shadow profiles for any smart TV that comes with the Facebook App preloaded and is connected to the internet?

At least for Samsung and Sony, I can easily see them cooperating with Facebook for a negligible fee.


Lol @ the Facebook army astro-turfing the comment section.

"But it's not nefarious..."

"But everybody else is doing it too..."

"It's not for surveillance, profiling and shadow profiling, pinky promise, trust us..."


> Please don't impute astroturfing or shillage. That degrades discussion and is usually mistaken. If you're worried about it, email us and we'll look at the data.

https://news.ycombinator.com/newsguidelines.html


I'm sorry to say this, but as a part of the community maybe I should state my opinion. I don't entirely agree with this policy. If there is some shillage in the comment section, it's perfectly fine to email the admins, but I think it's also beneficial for the community to bring it up. I personally find it hard to recognize actual shillage sometimes, and appreciate it when my friends bring it up. I don't know why admins should have access to more information about what we see than we do.


"Everyone else is doing it" is totally a valid thing to point out here. It means that, while you can get up in arms about it, you need to get up in arms about the practice industry-wide. Using the Facebook name to make it sound worse than it is makes this either dishonest or ill-informed.

It's a lot easier to get one company (or person) to stop a practice only they do than it is to get them to stop doing something that everyone else does too.


Not sure I agree, there is a difference between a private company that can collect covert private data on over 2/3 of the US population, and one that has access to a handful of users.

Facebook has ties to and influence over government, it is so big and far reaching that I feel it's right to be more concerned about FB doing something like this than other smaller players.

The scale of FB is what makes it a special case.

However, I think that the proper approach to something like this is educating users. Companies are gonna capture your mouse movements, it's not something we should legislate over, but users should be informed as to what it means to give companies like Facebook information about yourself.

Worrying about mouse movements when you freely send clear text messages to their data pile about your most intimate feelings and thoughts is ass-backwards.


While it's fine to point out, it is not fine to use as a reason not to take action. It sort of sounds like you're saying we can't hold any company accountable for anything if at least x% of their competitors are doing it.

Going after the largest offenders and making a big splash is more effective than doing nothing until you can get every company to simultaneously stop something.


>using the facebook name to make it sound worse than it is makes this either dishonest or ill-informed

No. Not only does facebook not get cover from "everyone else is doing it," they are actively perpetuating that cover existing for others. They are one of a few entities with the weight to set "industry standard" simply by changing their practices.

>while you can get up in arms about it, you need to get up in arms about the practice industry-wide

I am.


Yeah, it's not like your mouse movements are a serious privacy issue.


I guess mouse movements are a very good way to identify people across websites.

It's also an efficient way to distinguish people from machines. Think about the click-only reCaptcha for example.


>> I guess mouse movements are a very good way to identify people across websites.

I absolutely do not believe that. For real. This is something I keep reading, but I seriously do not see how mouse movement can be unique from one person to another. It's not a fingerprint, come on.

> It's also an efficient way to distinguish people from machines.

Well, that I can believe, but nothing more.


Change mouse brand to surreptitiously become another person without telling a single lie or changing any normal user data :)


It's about the pattern of movement, not the brand of mouse. Identification algorithms are more robust than that.


I wonder to what degree DPI settings, tracking speed, and acceleration influence this. People will move their mice differently depending on how fast the cursor goes, and it's cursor movement that's being tracked, not physical mouse position.


How often would you be ready to change mouse brand?

Would you change mouse each time you change browser tabs, keeping a mouse for Facebook and a mouse for other websites?

Also, keep in mind mouse tracking is just another technique deployed to identify you. It's not supposed to be a silver bullet.


""We collect information from and about the computers, phones, connected TVs and other web-connected devices users use that integrate with our Products, and we combine this information across different devices users use," Facebook wrote in the document adding that the collected information is used to "give better personalize the content (including ads), to measure whether they took an action in response to an ad we showed them on their phone"."

In other news, Google Tag Manager and crazyegg exist.


Is there even anything left that Facebook doesn't track?


They claim they don’t record and track what you say. My spouse claims otherwise, saying she’s gotten hyper-specific advertising for medications that she’s discussed with her patients with the phone in the room.


Facebook didn't claim they don't record what you say.

They said very specifically they don't record what you say for advertising. It was specific enough that it almost seems an admission that they use it for something else.

However, there are plausible explanations for your spouse getting ads for things she has spoken about with her phone in the room. For example, the patient googled the medication later or wrote about it on Facebook, and Facebook knows that your spouse and the patient likely had a conversation, based on location data showing that they walked down a hallway together.


I still call maximum shenanigans on that whole concept. You'd think there'd be one hacker out there with a packet dump or disassembly of the application (yknow.. concrete evidence) rather than lame anecdotes that reek of confirmation bias.

Furthermore, I don't believe that Google and Apple (well.. less Google, more Apple) are in cahoots with Facebook to give them a backdoor to device permissions.


> (well.. less Google, more Apple)

That's funny, because very recently, there was a discovery that fb gave most of the device makers deep access to user data, including Apple.

Against all evidence, Apple still gets a pass.


Unless I’m mistaken the “deep access” is just bad reporting (the media are surfing on the wave of Facebook bashing).

What they mean by deep access is that the Facebook sharing system extension built into some OSes (including iOS pre iOS 9) had the possibility of accessing a lot of information from the connected Facebook account, which is not really anything to worry about (if your device’s OS is malicious you have way more to worry about).


Source?


As does every person here using Mouseflow, LogRocket, or a dozen other off-the-shelf solutions. Is this really that surprising?


"Everyone does it" != "it's the right thing to do"


Seriously, my thought too. Pretty much every company does that. Not especially for detecting whether it's a human or an AI, but for understanding whether the user has problems with the usability of the product.


Yes, of course, I'm sure Facebook only uses this data for usability studies. I'm also sure they could pretext all other tracking they do to some other marginally positive use case.


Where I work we track: mouse movements, how long you spend on a specific page, the path that you took to get to a specific page, what you clicked, IP addresses, and basically all information included in the headers of the request. We also screenshot everything from when you log in to when you log out.


Um. This isn’t nefarious.

We’ve used inspectlet from time to time to help figure out where there are problems in our UI. It tracks mouse movements as part of a complete session. It’s been really helpful.


You should still only do this when the user is aware you’re tracking these behaviors and you have their consent. That’s the problem here.


I'm not so worried about this, as long as it's only on their site and when I'm logged in. They're using it to test features and engagement, and can't really learn much about me as a person with it.


The average user is unaware their mouse is being tracked. Without this knowledge, or without being given an informed choice, it could be argued that they can't act with full consent. They might be less willing to use a site which does this if they did know.


It isn't just Facebook that has been doing this. Web development tools are able to track scroll and mouse movements as well, and have been used to test website usage.


There is a big difference between using this kind of technology to run usability studies or to provide more targeted advertising.


Not really. If you record and save that data, the reason is secondary.


Tracking an individual's mouse movement across devices in perpetuity, and packaging said data as a product offering to potential advertisers, seems to be a more extensive undertaking than simply UI/UX improvements.


Every website that uses google analytics does this.


I assume from this they can infer if you're left or right handed and store that in your profile.


The really sad thing is that all this data is not available for academic research. Instead of understanding human nature to make a better world, it is being used to make better targeted ads.


Does Facebook just grab everything they can? Does anybody know if they grab your device's battery status, or gyroscopic data about how you are tilting/holding your device?


I mean, likely. Given the number of devices they are on, someone has to be using the Battery Historian or similar tools to optimize for power usage.


It's stated right in this article that Facebook "also admitted that it collects information about operating systems, hardware, software versions, battery levels, signal strength, available storage space, Bluetooth signals, file names and types, device Ids, browser and browser plugins".


Presumably Facebook can do this because Javascript gives up this information.

What's the easiest way to hack your browser to give no information or dummy information here?


At this point Facebook can come out and say they track any and everything. They might as well because no one will do a single thing.


I remember just a few years ago there was an article here about a website tracking mouse movement to predict what the user will click and pre-loading the content in the background. It was seen as a huge improvement to the experience and the comments section was full of people saying how cool it is. But hey, now Facebook is doing it, so it's suddenly the worst thing in the world?


There's a difference between pre-loading content to improve user experience, and tracking mouse movements to gather information about users that's then used to show them advertisements, or potentially track them across the web[0].

There's also the difference between announcing "hey, look at this cool tech we have to make the web faster!" vs "we are legally required to admit that we've been watching you like a hawk, for... reasons."

[0] https://news.ycombinator.com/item?id=17301769


I can't speak to Facebook's intentions or policies (I don't use the big blue app at all), but many sites use tools like FullStory to discover bugs or watch a customer's user journey to discover flows that aren't working well. Some portions of the page are automatically filtered (inputs with type=password), and the rest depends on the team being very thoughtful about marking sensitive portions of the screen as such.

It's a very manual process, but probably one of the most powerful tools for improving user experience I've ever seen. And typically for most businesses, you are keeping these sessions for 2 weeks or thirty days at most.
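
To illustrate the split between automatic filtering and manual marking, here is a sketch of the kind of check a recorder might run before capturing a field. This is not FullStory's actual logic: `shouldMask` and the `data-private` attribute are hypothetical, while the `autocomplete` tokens are standard HTML values:

```javascript
// Standard HTML autocomplete tokens that indicate credentials or card data.
const SENSITIVE_AUTOCOMPLETE = new Set([
  'cc-number', 'cc-csc', 'cc-exp', 'current-password', 'new-password',
]);

// Hypothetical check: should this field's content be masked in a recording?
// `field` is an input element (or any object with the same properties).
function shouldMask(field) {
  const type = (field.type || '').toLowerCase();
  const autocomplete = (field.autocomplete || '').toLowerCase();
  // Manual opt-out: the team marks a field with a (made-up) data-private attribute.
  const flagged = Boolean(field.dataset && 'private' in field.dataset);
  return (
    type === 'password' ||
    SENSITIVE_AUTOCOMPLETE.has(autocomplete) ||
    flagged
  );
}
```

The gaps are easy to see: a "show password" toggle that switches the field to type=text slips past the type check, and a free-text textarea is only protected if somebody remembered to flag it.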


I absolutely agree - I used something similar at my previous job.

But neither of us knows Facebook's policy (although we can easily guess their intentions, based on past behavior), and nobody ought to be expected to cut them any slack.


Is there a more legitimate source for this announcement? Seems like HN has fallen prey to fake news.


https://www.judiciary.senate.gov/imo/media/doc/Zuckerberg%20...

Quick search indicates the tracking information starts on page 84.

Page 86: "Device operations: information about operations and behaviors performed on the device, such as whether a window is foregrounded or backgrounded, or mouse movements (which can help distinguish humans from bots)."


Using mouse movements to detect that a tab is inactive is an old trick. Compared to other facts, I think that tracking mouse movements is absolutely not a problem. Collecting things like IMEI or IP addresses needs more attention. For example, I think it is not necessary to keep IP addresses longer than a week or two if you respect your users' privacy.


Every company and their mother is tracking mouse movements. How is this trending???


This is hardly interesting. A lot of websites have been doing this for a while. Just open your network console while you are browsing and you'll see all the things they are tracking.


"Facebook confirms it receives your IP address!!!"

We need a new science of hysterias. The internet provides ample data for robust analysis and prediction. Any takers?


Seriously. The hysteria is getting a bit stale if this is what people are now outraged about. We've been using CrazyEgg, or any number of other mouse and scroll tracking tools, for ages.

"But Facebook is using it to signal advertising!"

Well. Everything you do on Facebook is used to signal advertising. I thought we were kind of all aware that was what happens with this free service. /shrug


  document.body.addEventListener('mouseenter', e => {
    console.log('this just in, erickj tracks mouse movements');
  })


It's just for security, like he said to Congress, right?


That's a great step towards security and privacy.


Mark is tracking lots of user data


Does this allow detecting drug addicts and recommending treatment?


"We think you may have Parkinson's, would you like to visit the doctor?"


"We think you may have Parkinson's, your driver's license has been suspended until you can confirm your fitness to drive. Also you're no longer registered to vote"


This seems like a reasonable thing to do to detect bots.


BREAKING NEWS: A free service on the web monitors how you use it.



