Senate Testimony on Privacy Rights and Data Collection in a Digital Economy (idlewords.com)
313 points by aaronbrethorst on May 12, 2019 | hide | past | favorite | 52 comments



I should explain how this process works, and how I wound up giving this testimony.

Basically, a staffer on the committee found me online, was interested in my ideas, and asked for a phone call. A couple of days later, he scheduled a second phone call, this time including the committee chairman's staff from the opposing party.

I got a formal invitation to appear with about eight days' notice, along with a request for written testimony on specific topics. You pay your own way to get to D.C. for the hearing.

The testimony was due 24 hours before the hearing, so that staff had a chance to review it and plan their questions.

In the hearing, there is a two-hour block of time, and senators each get five minutes to ask questions, alternating parties, in order of seniority. Some of them want to know stuff, others want to read a statement into the record. You get a glass of ice water and a little countdown clock at your desk that tells you when the five minute blocks of time are up.

At the start of the process, the three witnesses each get a five minute opening statement. In my hearing, we had a former diplomat and expert on the GDPR, and someone from the financial industry. The hearings are open to the public and if you are in D.C., you should attend one! It is quite a spectacle!

I'm very thankful to Matt Blaze and others who wrote about what hearings are like, to help me prepare for this. And I am happy to answer questions in turn.


In your written testimony, the goals for privacy regulation seem quite watered down from past proposals of yours like prohibiting behavioral ad targeting, or eliminating data brokers altogether.

Does this represent a shift in what you think is reasonable, or achievable, or just language tailored for the audience?


I wanted to stay within the parameters of the hearing, and also talk about stuff that I thought had a fair chance at getting bipartisan support (like regulation for making binding voluntary commitments about privacy).

I don't think I've watered down my proposals, but I think choosing winnable battles is important now that the legislative wheels are turning. For example, I think a good way to get rid of behavioral ad targeting is to give publishers a way to demonstrate that they make more money by not doing it (like we saw with the NYT in Europe).


I'm really glad that US senators are hearing your perspective on these issues. Thanks for putting in the time and for your patience with their endless name-mangling. :)

Do you feel any more or less optimistic about what's coming in privacy law, having done this?

Any part of the experience that was especially surprising?

Edit: wording.


There seems to be a strong desire for regulation, which gives me some hope. On the other hand, the big tech companies have a massive lobbying operation in D.C., and I'm not naive enough to think my testimony will have much effect.

I was ready for the questioning to be far more adversarial. That was a very pleasant surprise. Another surprise was the party imbalance—many of the Republican senators did not attend.

My sense is that the current direction in drafting privacy law is "GDPR lite", on the basis that 'everybody owns their data'. I don't think that framing works, but it seems to have captured many of the senators' imaginations. I believe the time pressure for regulation is coming from the California law, which will go into effect in 2020 unless pre-empted by Federal regulation.

In principle, the privacy issue cuts across party lines and could have a real bipartisan consensus form around it. In practice, we live in polarized times. I am not optimistic, but we have to try our best.


What is likely to happen is that new internet based companies will be snubbed to appease the latest round of pitchfork masses.

Much like how Equifax got away without punishment, nothing will be done for the entrenched data broker industry which has been invasively profiling consumers since before the internet was invented.

Where Facebook mostly collects its data via voluntary submission in exchange for a free service, traditional data brokers get their mitts on it without any consent. They are the real demons. I'm sure their people are working behind the scenes to keep the attention focused on Facebook, so that legislative theater can make it look like something is being done without affecting their bottom line.


Without getting into the voluntary/involuntary debate, which I think is a bit murkier than your last paragraph would imply, I think companies like FB are much more dangerous in the long run than companies like Equifax.


At ~53:00 or so Mr. Brown asked about the impact of analytics on newspapers. You pointed out that the newspapers are basing editorial decisions on the analytics (article metrics).

As we all know, click-fraud's a huge problem on the internet. Have you come across any recent numbers on click-fraud in the newspaper analytics? It seems like if your description is accurate, if I wanted more stories about a topic I could just have a botnet give those articles lots of views.


This information is very jealously guarded. The best public info I know of is here: https://cdn2.hubspot.net/hubfs/3400937/White%20Papers/ANA_WO...

I'd welcome any other info, public or not so public, people care to share.


I watched a portion of the video, and I get the impression that the people asking the questions appear to have a pretty good idea what they're talking about. Do you feel the same? Those are the senators, right? Gives some good hopes for the future.


Those are all senators, yeah. They have staff experts who brief them for these hearings, so they can go in knowing what they want to ask (or say), as well as what they are likely to hear. The level of staff expertise is extremely high, and the senators themselves are no fools. They are used to regulating the financial industry, which is even more slippery than big tech.


Perhaps today's Senators have learned not to completely wing it - we're a long way from the "series of tubes" era.


Perhaps in the legislative branch, the executive branch suffers no such limitations. Eg. https://www.whitehouse.gov/briefings-statements/remarks-pres...


> I get the impression that the people asking the questions appear to have a pretty good idea what they're talking about

This video is edited so perhaps it's misleading, but I did not get that sense at all: https://www.youtube.com/watch?v=t-lMIGV-dUI

Edit: As mentioned below this is not from the same hearing. I interpreted skrebbel's praise more generally and the representatives in these clips didn't inspire much confidence.


That's... different people in a different hearing.


Not to mention the editing cut footage mid-sentence. I can't watch something like that.


Many thanks for taking advantage of your opportunity to give a thoughtful and reasoned statement to the committee! Your advocacy of late has done a lot to raise the profile of these privacy issues, and you've been able to keep your explanations clear and understandable. That's not easy, and you've clearly taken time to prepare your critiques.

This extends to your Tech Solidarity [1] work, coaching politicians and journalists... (Thank you for updating/clarifying your recommendations last month to keep it current!)

Your Tech Solidarity security guidelines are purely defensive in nature, while this testimony was much more proactive and forward-thinking... Have you given consideration to spending more of your time lobbying in favor of specific legislation or educating politicians about these topics on a longer term basis? You seem to have their ear, and could conceivably fundraise around these issues. (I realize the personal commitment it would require on your part - just curious whether you're thinking about next steps...)

In the meantime, are there things that average internet citizens can do beyond following the Tech Solidarity guidelines to both protect their own privacy and also support/advocate for progress on these issues? I can pretty easily block trackers, use 2FA, use HTTPS Everywhere, but as you've articulated in your testimony, what we really need is a sort of "herd immunity" like what vaccines attempt to achieve. How do we get there from here?

[1] https://techsolidarity.org/resources/basic_security.htm


The problem with privacy in our age is that it's a collective harm issue (like public health, or environmental protection) so there really isn't much individuals can do, other than petition for regulatory change.

There are lots and lots of people working in this space worth following. One person in particular working on the California privacy law is Ashkan Soltani, and his twitter account (@ashk4n) is worth a follow. And I think no one thinks more deeply or incisively about these issues than Zeynep Tufekci. If you haven't read her book (Twitter and Tear Gas) it's fantastic.

I know it seems otherwise because of my fundraising work last year, but I really don't have strong connections to D.C. or any inside knowledge on what is on the regulatory agenda. Everyone I know in this space is kind of just winging it.


>In the meantime, are there things that average internet citizens can do beyond following the Tech Solidarity guidelines to both protect their own privacy and also support/advocate for progress on these issues?

Focusing on tech solutions is a band aid to a systemic and structural problem that can only be addressed in law. That's why GDPR is good and why Congress grappling with passing privacy legislation is good.


Coming up with proof of concept technical solutions is a good first step.

It helps determine facts about the problem (What type of trackers are used? How widely? What changes when they're blocked?), which is critical for quality legislation.


Here's the video of the question and answer part: https://www.banking.senate.gov/hearings/privacy-rights-and-d...


Hearing begins at 10m30s

Maciej's remarks at 42 minutes.

The video (or audio) can be played directly from:

    mpv --ytdl 'https://www.senate.gov/isvp/?comm=banking&type=live&stt=&filename=banking050719&auto_play=false&wmode=transparent&poster=https%3A%2F%2Fwww%2Ebanking%2Esenate%2Egov%2Fthemes%2Fbanking%2Fimages%2Fvideo%2Dposter%2Dflash%2Dfit%2Epng'


For any given type of item of information about me that a site may be able to see and want to use, it is going to fall into one of these groups:

• Items that I'm OK with sharing with every site.

• Items that I'm OK with sharing with some sites, but not with others.

• Items I'm not OK with sharing.

It might be nice if there were a standard list of items, a way to tell my browser which group each item falls under, and a standard way for the site to tell the browser which items it wants. The browser could then tell the site which group each such item falls under, and the site could dynamically generate a permission request and privacy disclosure that covers just those things I'm not willing to share with every site.

Maybe add some other dimensions to this covering things like how the site uses the item (internal use or shared with third parties; just to provide the services I use or also for marketing, for example), the category of site, and the location of the site.
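The negotiation described above can be sketched in a few lines. This is purely illustrative: the item names, the three-way policy values, and the `negotiate` function are all invented for the example, not part of any existing standard.

```python
# Hypothetical sketch of the per-item consent negotiation described above.
# Item names and policy values are invented for illustration.

ALWAYS, ASK, NEVER = "always", "ask", "never"

# The user's browser-side preferences, one entry per standard item.
user_policy = {
    "email": ASK,        # share with some sites, but not others
    "location": NEVER,   # never share
    "language": ALWAYS,  # fine to share with every site
}

def negotiate(site_wants):
    """Split the items a site requests into what the browser can send
    immediately, what needs an explicit per-site prompt, and what is
    refused outright."""
    granted, prompt, refused = [], [], []
    for item in site_wants:
        policy = user_policy.get(item, ASK)  # unknown items default to asking
        if policy == ALWAYS:
            granted.append(item)
        elif policy == ASK:
            prompt.append(item)
        else:
            refused.append(item)
    return granted, prompt, refused

granted, prompt, refused = negotiate(["email", "location", "language"])
print(granted)  # ['language']
print(prompt)   # ['email']
print(refused)  # ['location']
```

The site would then only need to generate a disclosure covering the `prompt` list, since everything else was resolved without user interaction.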


Where this consent model gets tricky is things like group chat. Who owns this exchange between us? You? Me? Hacker News?


Why not everyone who participates in the exchange? Or perhaps only the words they contribute?


Well, there's a fact that idlewords responded to tzs, which could be viewed as a fact about idlewords, a fact about tzs, a fact about both of them, or a fact about this overall conversation, among other things.

For copyright purposes idlewords already has copyright in his post, but I don't think that's the kind of "owning" that we're talking about here.

There's also the fact that, for example, you read what idlewords and tzs had to say to each other, and the fact that some people in the discussion upvoted their comments.

That's already a lot of different sorts of facts about this conversation, and doesn't even reflect everything that HN (or readers) knows.


It's not really possible, due to small leakages and the difficulty of classifying events.

For example, you might be ok with sending errors your app generates to the vendor so they can improve their product but there could be leakage.

Filenames could leak and if you named something with PII then that leaks as well.


This is the most succinct explanation and description I've ever seen regarding privacy and regulation.

Maciej - thank you for writing this and spending time with congress.


It was an honor! I am still flabbergasted I got the opportunity, but I didn't want to waste it.


I'd say you definitely didn't waste your time, even if you had only posted the essay online. That's a great piece of writing on the subject, especially the perspective on the GDPR's effects. Not being an EU resident, I had no idea it was that cumbersome.


I advise everyone with a few spare minutes to spin up an EC2 instance in Frankfurt and proxy your web surfing through it for a few hours. You'll be amazed at what you see.


Strictly speaking I would call that the effect of companies' resisting the spirit, and in some cases letter, of the law, not an effect of the law itself.


The same thing happened with the "cookie law" too. The law was created to gently nudge companies away from abusive tracking, by requiring sites to inform users about third-party cookies. But instead of dropping their third-party trackers, the web world simply showed the finger to the regulators, and that's how we got cookie notices on every site.

My guess is that's where GDPR got its teeth from - the regulators tried the "industry self-regulation" route before, and it failed spectacularly.


If there is one company I hope gets the spotlight in this debate, it is Plaid. I think they are among the least transparent when it comes to what data they collect, have zero way of auditing/ensuring compliance among devs, and are arguably dealing with some of the most sensitive personal data (banking, transactional).


> They [Silicon Valley] see a regulation and they find a way around it. We don't like banking regulations? So we invent cryptocurrency and we're going to disrupt the entire financial system. We don't like limits on discrimination in lending? So we're going to use machine learning. Which is a form of money laundering for bias.

Could you expand on what you meant by this Maciej? The first sounds a bit like a conspiracy theory - that Bitcoin (I presume) was invented by Silicon Valley to avoid banking regulations? Or did I completely misunderstand?

EDIT: to be clear, I'm not contradicting or being nasty. I would just like to learn more as I've never heard that take on it before.


No, I don't think Bitcoin is a conspiracy theory. I think it was a genuine novelty that came out of tech. But at a certain point in its rise, it became clear that you could create an unregulated securities market with initial coin offerings, and it was off to the races.

The dynamic that I am describing is new technologies being used to circumvent regulation, with the excuse that this is something new and technical, and so should be exempt.

We've seen the pattern over and over again, from sales tax exemptions to cryptocurrency to taxi and hotel laws.


What about ICOs make it unique in creating a securities market vs somebody just doing it with java, payment provider integration and a mysql database?


Bitcoin was designed from the start so that no one can reverse a transaction and no actor in the network can block one. This makes "systematic" regulation impossible.

It doesn't say in the whitepaper "this is for making illegal transactions and circumventing securities laws", but that's what the advocates picked up on almost immediately.


Maciej - I think my viewpoint on the surveillance industry is pretty similar. But reading your description of the current state of affairs still put a shiver up my spine. I can't even pinpoint a single passage that did it. Rather the sheer disconnect between the technological world we envisioned and the world they built is overwhelming, and you really got that across.

I think throwing the GDPR under the bus based on how the surveillance industry is doing its best to misinterpret it is a bit off. A popup with a list of task-orthogonal surveillance companies demanding consent is obviously an anti-pattern. The prudent course feels like waiting to see how the GDPR actually turns out in practice, and then hopefully adopting it as-is.

What I worry about in the meantime is a half-baked "Americanized" implementation that guts its strongest provisions ("Right To Download" -> "correct" feels like already trying to fortify a backstop! Why can't I simply just erase?), and blesses ongoing abuses. A disingenuous standard of purported consent is pervasive in our entire society, and I don't see why this topic will turn out any different. I can foresee EU regulators concluding that surveillance-based advertising is not a necessary part of simply viewing a news article, whereas I can see a US regulator blessing that practice.

Which maybe implies that "copy GDPR" is actually a decent answer right now, even not knowing how it will turn out. For one, it tells US companies that they need to take the GDPR seriously rather than throwing "block EU" hissy fits. If it's really as unworkable as the surveillance industry public relations make it sound, I'm sure they'll have no problems modifying it.


I'm sorry that you read my statement as so hostile to the GDPR. The consensus at the hearing I think was unanimous—that so much of the GDPR is open to regulatory interpretation that it is hard to evaluate yet. My fellow witness said there's a value to not going first, and I wholly agree with him. I believe we all testified it has made people safer.

That said, there is time pressure on Congress because of the 2020 California law. The tech companies and data brokers are scared of 50 state laws on this and are pushing for the mildest form of Federal regulation they can get, to pre-empt state laws, and that is the context of the fight.

I think the GDPR is inadequate but much better than nothing, and I tried to convey that in my testimony.


I think what I didn't like about your comments about GDPR is that it used the worst example of the effects on users when you could have used a better one.

Take the iPhone for example. The iOS permissions system has basically nailed the "how do we present this to the user" problem pretty much from day one of the iPhone and the App Store. It's specific, it's easy to understand, it's not bundled with anything else and it shows you the prompt exactly when the app actually needs the location data (not just a succession of 5 different permission requests when you first launch the app). The annoyance to users that comes from GDPR is because, like you said, the existing surveillance ecosystem is trying to respond to GDPR by changing as little about their practices as possible. Maybe the surveillance ad ecosystem shouldn't even exist?

In iOS 11.3 Apple released a new API for ad attribution [1]. It seems like it could be very interesting from a privacy perspective [2].

>A successful app installation relays just five data points back to the ad network — ad network ID, transaction identification, ad campaign ID, app ID installed and attribution code — while excluding the private user information offered by Facebook and Google.

If they chose to mandate the use of this API, it would vastly restructure the surveillance ad ecosystem overnight (at least within Apple's platforms, which is a very big economy) [3]. I think the same could be done for a user surfing the web under GDPR, it doesn't necessarily have to be a hellscape.

[1] https://developer.apple.com/documentation/storekit/skadnetwo...

[2] https://www.mobilemarketer.com/news/apple-ios-privacy-and-th...

[3] https://adexchanger.com/mobile/is-apple-angling-to-cut-out-a...


I didn't take it as outright "hostile", but rather more of a basis with which to pivot into the idea of doing something different. Which feels like playing to lawmakers' interest / American exceptionalism of doing something "better", but counterproductive given how the US lawmaking process works.

I'm also not personally a huge fan of the GDPR as is, because I agree we have no idea how it will actually play out. But I don't think the American philosophy will be productive at coming up with a different approach. So if it's to be one national regime, why not simply match the EU for now?

If we can't do that, then honestly each state having a different approach sounds better for individuals (assuming it doesn't devolve into companies demanding one's physical address for compliance). It's more likely that one state will properly codify the idea that users should have full control over data about themselves.

Ultimately I don't think we'll end up with sensible-looking legislation unless legislators actually get down and dirty with the technical specifics. For example, the brokenness of the cookie law actually necessitating cookies could have been avoided by mandating a fixed cookie/header format to express the preference rather than allowing every site to come up with its own bespoke implementation.
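The fixed-header idea above has real precedents (Do Not Track, and more recently the Sec-GPC signal). A minimal sketch of what server-side handling could look like, with an entirely hypothetical header name and value syntax:

```python
# Sketch of the "fixed header format" idea: one machine-readable signal
# the browser sends with every request, instead of per-site consent popups.
# The "Consent:" header name and its tokens are invented for illustration;
# DNT and Sec-GPC are real (much simpler) precedents for this kind of signal.

def parse_consent_header(value):
    """Parse a hypothetical 'Consent: analytics=deny; ads=deny;
    functional=allow' header into a dict the server can honor
    without ever showing a banner."""
    prefs = {}
    for part in value.split(";"):
        part = part.strip()
        if not part:
            continue
        key, _, verdict = part.partition("=")
        prefs[key.strip()] = verdict.strip() == "allow"
    return prefs

prefs = parse_consent_header("analytics=deny; ads=deny; functional=allow")
print(prefs)  # {'analytics': False, 'ads': False, 'functional': True}
```

Because the format is fixed by regulation rather than left to each site, there is nothing for a bespoke consent popup to collect: the preference already arrived with the request.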


The problem with saying "let's just match the EU" is that key parts of the GDPR are still undecided. What is a "legitimate business interest" for data collection? What are the limits to algorithmic decisionmaking? No one knows.

A regime with 50 state privacy laws would result in every website you visit having a consent click-through where you agree to abide by Alabama privacy laws (or whoever wins that race to the bottom).


But how can it not come down to judgment calls? We're essentially trying to define the contour of a new right. If we stick with the axiomatic approach, then it feels like we can only end up essentially right back where we are now - data collected by companies is theirs to do with as they like.

FWIW, are you sure it would be a race to the bottom (companies going to the least restrictive state)? In the context of sales-tax/cookie nexus, I would have hopes that it would be a race to the top (users benefiting from being in a more restrictive state). Which is why I'd think the worry would be every random service demanding to verify your address, to keep you from claiming to be in the more restrictive states.


It seems like this is emphasizing the big five a bit much, when there is an entire ecosystem of smaller adtech firms that will be, if anything, harder to regulate since they're scrappy firms operating under the radar, sometimes overseas. At least the big five are likely (under pressure) to put in place the bureaucracy to follow regulations, and attempt to impose their rules on smaller vendors as well.

I also wonder about the idea that it's "traditional" that users own their own data. Maybe that's true in Europe, but in the US, selling customer lists in the direct-mail industry goes back many years. I'm guessing that salesmen keeping customer lists in rolodexes goes way back as well, and these lists were sometimes shared or sold. No user ownership of their own data there! Gossip seems more traditional than privacy.

GDPR-style regulations seem more like a new thing, long overdue as this stuff scales up rapidly.


This is fantastic, thanks for writing it and for your testimony!

my favorite bits

"The emergence of this tech oligopoly reflects a profound shift in our society, the migration of every area of commercial, social, and personal life into an online realm where human interactions are mediated by software."

"Consumers will just as rightly point out that they never consented to be the subjects in an uncontrolled social experiment, that the companies engaged in reshaping our world have consistently refused to honestly discuss their business models or data collection practices, and that in a democratic society, profound social change requires consensus and accountability."

Brilliant!


"The training process behaves as a kind of one-way function. It is not possible to run a trained model backwards to reconstruct the input data; nor is it possible to “untrain” a model so that it will forget a specific part of its input."

I don't have a concrete reference, but my understanding is that it is quite likely to be possible to reconstruct outliers in the training data by inspecting the model's weights.


This is a good point. We have some concerns and evidence that neural networks do memorize their training data.

Also, the "untraining" idea is an open research question (I just saw a talk about it). We don't know exactly how to do it yet, definitely not in general, but "impossible" is too strong.


>The internet economy today resembles the earliest days of the nuclear industry. We have a technology of unprecedented potential, we have made glowing promises about how it will transform the daily lives of our fellow Americans, but we don’t know how to keep its dangerous byproducts safe.

This is one of the best analogies about the internet that I've ever heard.


The title on this is horked, it should be "Senate Testimony on Privacy Rights and Data Collection in a Digital Economy"


If it isn’t obvious from the domain name and parent’s username, the author of the linked page would like the HN title changed to the original one.


Looks like a moderator fixed this a while ago.

The submitted title was "Notes from an emergency", which I think was a crossed wire with a different article.

