Privacy

Engineer goes off and adjusts the plan. At a high level,it looks like this:

    Client side code

        Set up

            snippet lets users know there is a new options in preferences

            if the app locale does map to a supported language, the pref panel is greyed out with a message that their locale is not supported by the EU service

            if not, user clicks ToS, privacy policy checkbox, confirms language

            app contacts server

            if server not available, throw up offline error

            if server available, upload device id, language, url list

            server sends back the guid assigned to this device id

            notify user setup is complete

            enable  upload service

        When new tab is opened or refreshed

            send msg to server with guid + url

        Turning off feature

            prompt user ‘are you sure’ & confirm

            notify server of deletion

            delete local translated pages

Server side code

    Set up

        poked by client,

        generate guid

        insert into high risk table: guid+device id

adds rows for tabs list (med table)

adds rows for the urls (low table)

    Periodic background translation job:

        finds urls of rows where the translated blob is missing

        for each url, submits untranslated blob to EU’s service

        sticks the resulting translated text blob back into the table

    Periodic background deletion jobs:

        finds rows older than 2 days and evicts them in low risk table & medium risk tables

        find rows in sensitive table older than 90 days and evict.

secure destruction used.

        user triggered deletion

            delete from sensitive table. secure destruction

            delete from medium table

    Database layout

        sensitive data/high risk table columns: user guid, device id

            maps guid to device id

        medium risk table columns: user guid, url

            maps guid to tabs list

        low risk table columns: url, timestamp, language, blob of translated text

            maps urls to translated text

The 4th Meeting

Engineer: Hey, so what do you guys think of the new plan? The feedback on the mailing list was pretty positive. People seem pretty excited about the upcoming feature.

Engineering Manager: indeed.

DBA: much better.

Operations Engineer: I agree. I’ll see about getting you a server in stage to start testing the new plan on.

Engineer: cool, thanks!

Engineer: Privacy Rep, one of the questions that came up on the mailing list was about research access to the data. Some phd students at the Sorbonne want to study the language data.

Privacy Rep: Did they say which bits of data they might be interested in?

Engineer: the most popular pages and languages they were translated into. I think it would really be just the low risk table to start.

Privacy Rep: I think that’d be fine, there’s no personal data in that table. Make we send them the basic disclosure & good behavior form.

Engineering Manager: A question also came to me about exporting data. I don’t think we have anything like that right now.

Engineer: No, we don’t.

Privacy Rep: well, can we slate that do after we get the 1.0 landed?

Engineering Manager: sounds like a good thing to work on while it’s baking on the alpha-beta channels.

Who brought up user data safety & privacy concerns in this conversation?

Engineer, Engineering Manager, & Privacy Rep.

Engineer goes off and creates the initial plan & sends it around. At a high level,it looks like this:

    Client side code

        Set up

            snippet lets users know there is a new options in preferences

            if the app language pref  does not map to a supported language, the pref panel is greyed out with a message that their language is not supported by the EU service

            if it does, user clicks ToS, privacy policy checkbox, confirms language

            app contacts server

            if server not available, throw up offline error

            if server available, upload device id, language, url list

            notify user setup is complete

            enable  upload service

        When new tab is opened or refreshed

            using device id, send a sql query for matching row of url+device id

 

Server side code

    Set up

        poked by client, adds rows to table, one for each url (tab)

    Periodic background translation job:

        finds urls of rows where the translated blob is missing

        for each url, submits untranslated blob to EU’s service

        sticks the resulting translated text blob back into the table

    Periodic background deletion job:

        finds rows older than 2 days and evicts them

        unless that is the last row for a given device id, in which case hold for 90 days

    Database layout

        table columns: device id, url, timestamp, language, untranslated text, translated text

 

 The 3rd Meeting:

Engineer: Did y’all have a chance to look over the plan I sent out? I haven’t received any email feedback yet.

DBA: I have a lot of concerns about the database table design. The double text blobs are going to take up a lot of space and make the server very slow for everyone.

Operations Engineer: ..and the device id really should not be in the main table. It should be in a separate database.

Engineer: why? That’s needless complexity!

Operations Engineer: Because it should be stored in a more secure part of the data center. Different parts of the data center have different physical & software security characteristics.

Engineer: “ oh”

Engineering Manager: You should design your interactions with the server to encapsulate where the data comes from inside the server. Just query the server for what you want and let it handle finding it. Building the assumption that it all comes from one table into your code is bad practice.

Engineer: ok

DBA: The double blobs of text is still a problem.

Engineer: Ok, well, let’s get rid of the untranslated text blob. I’ll take the url and load the page and acquire the text right before submitting it to the EU service. How does that sound?

Engineering Manager: I like that better. Also allows you to get newest content. Web pages do update.

Privacy Rep: I was wondering what your plans are for encryption? I didn’t see anything in your draft after it either way.

Engineer: If it’s on our server and we’re going to transmit it to an outside service anyway, why would we need encryption?

Privacy Officer: We’ll still have a slew of device ids stored. At the very least we’ll need to encrypt those. Risk mitigation.

Engineer: But they’re our servers!

Operations Engineer: Until someone else breaks in. We’ve agreed that the device ids will go in a separate secure table, so it makes sense to encrypt them.

Engineer: fine.

Privacy Officer: The good news is that if you design the url/text table well, I don’t think we’ll need to encrypt that.

Engineer: That’s your idea of good news?

Engineering Manager: Privacy Officer is here to help. We’re all here to help.

Engineer: Right, Sorry Privacy Officer.

Privacy Rep: It’s ok. Doing things right is often harder.

Privacy Officer: Anyway, I wanted to let you all know that I’ve started talking with Legal about the terms of service and privacy policy. They promise to get to it in detail next month. So, as long as we stay on the stage servers, it won’t block us.

Engineering Manager: thanks. Do you have any idea how long it might take them once they start?

Privacy Officer: Usually a couple weeks. Legal is very, very through. I think it’s a job requirement in that department.

Engineer: Fair enough.

Operations Engineer: Speaking of legal & privacy, do you/they care about the server logs?

Privacy Officer: I don’t think so. The device id is the most sensitive bit of information we’re handling here and that shouldn’t appear in the logs right? I’ll still talk it over with the other privacy folks in case there’s a timestamp to ip address identification problem.

Operations Engineer: That’s correct.

Engineering Manager: Always good to double check.

Privacy Officer: By the way, what happens to a user’s data when they turn the feature off?

Engineer: … I don’t know. Product Manager didn’t bring it up. I don’t have a mock from him about what should happen there.

Privacy Officer: we should probably figure that out.

Engineer: yeah

Engineering Manager: I look forward to seeing the next draft. I think after you’ve applied the changes & feedback here you should send it out to the public engineering feature list and get some broader feedback.

Engineer: will do. Thanks everyone

Who brought up user data safety & privacy concerns in this conversation?

DBA, Engineering Manager, Privacy Rep.

The day after the first meeting…

Engineering Manager: Welcome DBA, Operations Engineer, and Privacy Officer. Did you all get a chance to look over the project wiki? What do you think?

Operations Engineer: I did.

DBA: Yup, and I have some questions.

Privacy Officer: Sounds really cool, as long as we’re careful.

Engineer: We’re always careful!

DBA: There are a lot of pages on the web, Keeping that much data is going to be expensive. I didn’t see anything on the wiki about evicting entries and for a table that big, we’ll need to do that regularly.

Privacy Officer: Also, when will we delete the device ids? Those are like a fingerprint for someone’s phone, so keeping them around longer than absolutely necessary increases risk for the user & the company’s risk.

Operations Engineer: The less we keep around, the less it costs to maintain.

Engineer: We know that most mobile users have only 1-3 pages open at any given time and we estimate no more than 50,000 users will be eligible for the service.

DBA: Well that does suggest a manageable load, but that doesn’t answer my question.

Engineer: Want to say if a page hasn’t been accessed in 48 hours we evict it from the server? And we can tune that knob as necessary?

Operations Engineer: As long as I can tune it in prod if something goes haywire.

Privacy Officer:: And device ids?

Engineer: Apply the same rule to them?

Engineering Manager: 48 hours would be too short. Not everyone uses their mobile browser every day. I’d be more comfortable with 90 days to start.

DBA: I imagine you’d want secure destruction for the ids.

Privacy Officer:: You got it!

DBA: what about the backup tapes? We back up the dbs regularly?

Privacy Officer:: are the backups online?

DBA: No, like I said, they’re on tape. Someone has to physically run ‘em through a machine. You’d need physical access to the backup storage facility.

Privacy Officer:: Then it’s probably fine if we don’t delete from the tapes.

Operations Engineer: What is the current timeline?

Engineer: End of the quarter, 8 weeks or so.

Operations Engineer: We’re under water right now, so it might be tight getting the hardware in & set up. New hardware orders usually take 6 weeks to arrive. I can’t promise the hardware will be ready in time.

Engineering Manager: We understand, please do your best and if we have to, Product Manager won’t be happy, but we’ll delay the feature if we need to.

Privacy Officer:: Who’s going to be responsible for the data on the stage & production servers?

Engineering Manager: Product Manager has final say.

DBA: thanks. good to know!

Engineer: I’ll draw up a plan  and send it around for feedback tomorrow.

 

Who brought up user data safety & privacy concerns in this conversation?

Privacy Officer is obvious. The DBA & Operations Engineer also raised privacy concerns.

The 1st Meeting

Product Manager: People, this could be a game changer! Think of the paid content we could open up to non-english speakers in those markets. How fast can we get it into trunk?

Engineer: First we have to figure out what *it* is.

Product Manager: I want users to be able to click a button in the main dropdown menu and translate all their text.

Engineering Manager: Shouldn’t we verify with the user which language they want? Many in the EU speak multiple languages. Also do we want translation per page?

Product Manager:  Worry about translation per page later. Yeah, verify with user is fine as long as we only do it once.

Engineering Manager: It doesn’t quite work like that. If you want translation per page later, we’ll need to architect this so it can support that in the future.

Product Manager: …Fine

Engineer: What about pages that fail translation? What would we display in that case?

Product Manager: Throw an error bar at the top and show the original page. That’ll cover languages the service can’t handle too. Use the standard error template from UX.

Engineering Manager: What device actually does the translation? The phone?

Product Manager: No, make the server do it, bandwidth on phones is precious and spotty. When they start up the phone next, it should download the content already translated to our app.

Engineer: Ok, well if there’s a server involved, we need to talk to the Ops folks.

Engineering Manager: and the DBAs. We’ll also need to find who is the expert on user data handling. We could be handling a lot of that before this is out.

Project Manager: Next UI release is in 6 weeks. I’ll see about scheduling some time with Ops and the database team.

Product Manager: Can you guys pull it off?

Engineer: Depends on the server folks’ schedule.

Who brought up user data safety & privacy concerns in this conversation?

The Engineering Manager.