Highlights From Facebook’s Security@Scale (@SecatScale)

Facebook hosted Security@Scale in Boston on November 12, 2015, and I attended. It took place at a fun venue, Artists for Humanity, a nonprofit organization dedicated to enhancing the arts in Boston public schools. Facebook will post videos and notes on their Engineering blog (here are the notes from 2014), but the following are my own notes and highlights.

Engineering Security @ Scale

Gregg Stefancik of Facebook talked a bit about history, bug bounties, PHP, and Hack. Hack is a PHP-based language developed at Facebook. It provides type checking, which allows developers to find bugs in their editor. Finding bugs in the editor improves productivity, which developers like. He also touched on the ThreatExchange service.

Takeaway Quotes:

  • “Move fast and build things”
  • “If it’s not super easy and intuitive to use then it’s kind of a fail for us”
  • “When we don’t share with each other, we have the possibility of reinventing the wheel.”

Making Security Usable at HubSpot

Ken Breeman of HubSpot spoke about the tradeoff between usability and security. Examples of #fails on this spectrum include passwords, Windows Vista, and the U.S. OPM hack. Given the choice between two paths, people will choose the easier one. He also described categories of problems and how they balance them at HubSpot:

  • Authentication: SSO (single sign on), phrase-based passwords, 2FA, self-service, timeouts, Vault
  • Authorization: sane defaults (shouldn’t need to file a ticket to do your job), JITA (just in time access, built by HubSpot)
  • Accounting: use good judgement (trust but verify), CCS (compliance & control service: breaks down audit into manageable chunks)
  • Code Reviews: freedom to override (plan for exceptions)
  • Engineering
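
The just-in-time access idea can be sketched as time-limited grants plus an audit trail ("trust but verify"). This is a minimal illustration, not HubSpot's JITA: the `AccessBroker` class, its method names, and the 60-second window are all hypothetical.

```python
import time
from dataclasses import dataclass, field


@dataclass
class AccessBroker:
    """Hypothetical self-service broker: grants expire, every request is logged."""
    default_ttl: float = 3600.0                    # assumed default: one hour
    grants: dict = field(default_factory=dict)     # (user, resource) -> expiry time
    audit_log: list = field(default_factory=list)  # reviewed after the fact

    def request(self, user, resource, reason, now=None):
        # Sane default: grant right away without a ticket, but record who and why.
        now = time.time() if now is None else now
        self.grants[(user, resource)] = now + self.default_ttl
        self.audit_log.append((now, user, resource, reason))

    def is_allowed(self, user, resource, now=None):
        now = time.time() if now is None else now
        expiry = self.grants.get((user, resource))
        return expiry is not None and now < expiry


broker = AccessBroker(default_ttl=60.0)
broker.request("alice", "prod-db", "debugging an incident", now=1000.0)
print(broker.is_allowed("alice", "prod-db", now=1030.0))  # inside the window
print(broker.is_allowed("alice", "prod-db", now=1061.0))  # after expiry
```

Passing `now` explicitly keeps the example deterministic; a real service would rely on the clock.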

Takeaway Quotes:

  • “They’re going to find clever ways to work around what you designed”
  • “As engineers, we’re not normal”
  • “You can’t trust if you don’t verify”

Hive Anonymization

Karen Sittig of Facebook shared how her goal was to catalog all of the data in Hive (Facebook’s flavor of Hadoop). Any solution has to scale to exabytes. How much is that? Picture an Olympic swimming pool full of hard drives. Her background guided her solution: machine learning (ML). They built the tool from the ground up. ML needs a large, representative set of labeled data. They already had a bunch of labeled data, whereas others following along at home might have to label their own. The solution is called Bii (a bee monitors the hive).

  1. Get labeled data.
  2. Bootstrap where you can.
  3. Performance is key – and “done” is better than “perfect”

If you know where your data is, it’s easier to use/protect.
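
The "bootstrap where you can" step can be sketched with a few hand-written rules that propose labels for columns, which a person could then confirm before the labels are used for training. This sketch is not Bii: the regular expressions, the label names, and the 80% threshold are assumptions for illustration, and the rule-based guesser is a stand-in for the ML classifier described in the talk.

```python
import re

# Hypothetical rules: label -> pattern that a value of that type should fully match.
RULES = {
    "email": re.compile(r"[^@\s]+@[^@\s]+\.[A-Za-z]{2,}"),
    "ipv4": re.compile(r"(\d{1,3}\.){3}\d{1,3}"),
    "phone": re.compile(r"\+?\d[\d\- ]{6,14}\d"),
}


def guess_label(sample_values, threshold=0.8):
    """Propose a label for a column from a sample of its values.

    Returns the best-matching label if at least `threshold` of the
    non-empty values match its rule, otherwise "unknown".
    """
    values = [v for v in sample_values if v]
    if not values:
        return "unknown"
    best_label, best_share = "unknown", 0.0
    for label, pattern in RULES.items():
        share = sum(1 for v in values if pattern.fullmatch(v)) / len(values)
        if share > best_share:
            best_label, best_share = label, share
    return best_label if best_share >= threshold else "unknown"


# Made-up sample data standing in for values sampled from real tables.
columns = {
    "contact": ["a@example.com", "b@example.org", "", "c@example.net"],
    "client_addr": ["10.0.0.1", "192.168.1.20", "172.16.0.5"],
    "notes": ["called twice", "n/a", "follow up"],
}
proposed = {name: guess_label(vals) for name, vals in columns.items()}
print(proposed)
```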

Takeaway Quotes:

  • “Stay Focused & Keep Shipping”
  • “Everyone wants labeled data”

Safety at Scale

Kevin Riggle of Akamai discussed how they use incident management for everything. They also did some collaboration with Nancy Leveson, who wrote the Therac-25 paper. Software moves faster than physical systems, but we’re missing critical tooling. It’s a truism in our industry that software sucks. He highly recommends that we all read “Engineering a Safer World,” especially since the PDF is free. We should also check out the PSAS conferences at MIT.

Takeaway Quotes:

  • “A great body of literature says safety at this scale is impossible”
  • “Safety is an emergent property of systems”

Crypto Drafts, Curves & Making Web-Scale Impact

Deirdre Connolly of Brightcove presented an introduction to security work through standards bodies. Her claim: if she can do it, then you can and should too. Check out the IETF, IRTF, and CFRG. They are independent and open to everyone. She then described the technology behind the RFC that she supported. She explained the improvements Edwards curves have over Weierstrass curves: simpler, faster, and fewer edge cases. Her contribution timeline consisted of joining the mailing list, asking clarifying questions, and then getting acknowledged in the RFC. Anyone can contribute. Lurk on mailing lists, IRC/Jabber, and GitHub. Lots of projects are in flight right now, e.g., Let’s Encrypt.
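
The "fewer edge cases" point shows up in the Edwards addition law: one formula handles adding distinct points, doubling, and adding the identity. The sketch below uses a toy curve x^2 + y^2 = 1 + d*x^2*y^2 over a tiny prime field. The parameters p = 13 and d = 2 are chosen purely for illustration (d is a non-square mod 13, which is what makes the formula work for every input) and are far too small for real cryptography.

```python
P = 13   # toy field prime (illustrative only)
D = 2    # curve parameter; 2 is not a square mod 13


def on_curve(pt):
    x, y = pt
    return (x * x + y * y) % P == (1 + D * x * x * y * y) % P


def add(p1, p2):
    """Edwards addition: the same formula covers P+Q, P+P, and P+identity."""
    x1, y1 = p1
    x2, y2 = p2
    t = D * x1 * x2 * y1 * y2 % P
    # The denominators are never zero on this curve because D is a non-square.
    x3 = (x1 * y2 + y1 * x2) * pow((1 + t) % P, -1, P) % P
    y3 = (y1 * y2 - x1 * x2) * pow((1 - t) % P, -1, P) % P
    return (x3, y3)


IDENTITY = (0, 1)
points = [(x, y) for x in range(P) for y in range(P) if on_curve((x, y))]

# No branching on special cases: every pair of points, including equal ones,
# goes through the same code path and lands back on the curve.
assert all(on_curve(add(a, b)) for a in points for b in points)
assert all(add(a, IDENTITY) == a for a in points)
```

A short Weierstrass implementation, by contrast, typically needs separate branches for doubling, for adding a point to its negative, and for the point at infinity.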

Building Open Source Software for Security

Javier Marcos de Prado of Facebook was one of the creators of osquery. His background had him move from offense (pen testing) to defense. Reasons for open source: accelerated innovation, better software, shared challenges, and proper documentation. Osquery uses 100% native APIs; no fork/execv. It requires secure DevOps! They started on TravisCI but moved to Jenkins after hitting lots of out-of-memory (OOM) failures. They ran it all on Mac Minis so they could virtualize OS X.

Jenkins is low-hanging fruit for pen testers, so how do we secure it? Look at all of the ways Jenkins has been pwned and blacklist them! It’s not the best, but it is a good place to start. Use GitHub OAuth because then you can do MFA through GitHub. Implement CSRF (cross-site request forgery) protection. Facebook uses Duo. You probably don’t want to just run every pull request (dry understatement). Consider the doomsday scenario: a commit slipped in after a pull request has been approved for testing. Always presume you can get compromised, and figure out how fast you can react. They run osquery on osquery (so meta) and then forward logs to ELK (Elasticsearch, Logstash, and Kibana).
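
One way to reason about the commit-after-approval scenario is to make approval apply to a specific commit rather than to the pull request as a whole. The sketch below illustrates that idea under stated assumptions and is not the osquery project's actual CI logic: `PullRequest`, `approve_for_testing`, `should_build`, and the author allowlist are hypothetical, and a real setup would read these values from the code host's API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PullRequest:
    author: str
    head_sha: str                        # commit currently at the tip of the PR
    approved_sha: Optional[str] = None   # commit a maintainer actually reviewed


TRUSTED_AUTHORS = {"maintainer-a", "maintainer-b"}   # hypothetical allowlist


def approve_for_testing(pr):
    # Approval is pinned to the exact commit that was reviewed.
    pr.approved_sha = pr.head_sha


def should_build(pr):
    if pr.author in TRUSTED_AUTHORS:
        return True
    # Untrusted PRs build only while the tip is still the reviewed commit,
    # so anything pushed after approval needs a fresh review.
    return pr.approved_sha is not None and pr.approved_sha == pr.head_sha


pr = PullRequest(author="new-contributor", head_sha="abc123")
before_approval = should_build(pr)    # not yet approved
approve_for_testing(pr)
after_approval = should_build(pr)     # approved commit is at the tip
pr.head_sha = "def456"                # contributor pushes after approval
after_new_push = should_build(pr)     # stale approval no longer counts
print(before_approval, after_approval, after_new_push)
```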

Takeaway Quotes:

  • “You get the Jenkins tiger”
  • “This is fine”
  • “Everything is urgent then nothing is urgent.”

Rapid Identification and Classification of Mobile Malware

Seth Hardy, Lookout – Types of mobile malware: trojan, ransomware, click fraud. Classify type, family, and variant.

Takeaway Quote:

  • “My handphone always be malware” – Ted Goats

Improving Code Health with Invariant Detector

Marjori Pomarole of Facebook said their goal is to build tools and frameworks so developers don’t have to think about security. The failure scenario occurs when an evil request hits bad code. Everything must go through code review at Facebook, but that doesn’t scale if every review requires involvement from the security team. There’s also tons of legacy code at Facebook that might have vulnerabilities.

One solution: “Whitehat researchers” and Facebook’s bug bounty program. Their internal solution is Invariant Detector, an automatic approach that looks for extraneous behavior. Essentially, it generates rules based on normal traffic. TAO is a read-optimized graph data store built at Facebook. They reported negligible overhead across 100% of the rules at Facebook. Then there’s another tool: Configerator. It deploys configurations across all the different web servers. False positives happen, so they maintain a whitelist. These tools have blocked a bunch of stuff, and they’re still improving.
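
The idea of generating rules from normal traffic can be sketched like this: record which predicates hold for each kind of operation, promote a predicate to an invariant if it held every time, and flag later requests that violate it. This is a simplified illustration, not Facebook's Invariant Detector: the operation names, the predicates, the `min_support` threshold, and the whitelist format are all hypothetical.

```python
from collections import defaultdict


def learn_invariants(observations, min_support=3):
    """observations: iterable of (operation, {predicate_name: bool}).

    A predicate becomes an invariant for an operation if it was seen at
    least `min_support` times and was true in every observation.
    """
    seen = defaultdict(int)
    held = defaultdict(int)
    for op, predicates in observations:
        for name, value in predicates.items():
            seen[(op, name)] += 1
            held[(op, name)] += bool(value)
    return {key for key, n in seen.items() if n >= min_support and held[key] == n}


def violations(request, invariants, whitelist=frozenset()):
    """Names of learned invariants that this request breaks, minus whitelisted ones."""
    op, predicates = request
    return [
        name
        for name, value in predicates.items()
        if (op, name) in invariants and not value and (op, name) not in whitelist
    ]


# Made-up traffic: the owner check always holds, the mobile flag varies.
normal_traffic = [
    ("delete_photo", {"actor_is_owner": True, "is_mobile": True}),
    ("delete_photo", {"actor_is_owner": True, "is_mobile": False}),
    ("delete_photo", {"actor_is_owner": True, "is_mobile": True}),
]
rules = learn_invariants(normal_traffic)
evil = ("delete_photo", {"actor_is_owner": False, "is_mobile": True})
print(rules)
print(violations(evil, rules))
```

Only `actor_is_owner` is promoted to a rule here, because `is_mobile` was false in one observation.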

Visualizing Security Data at Scale

John Langton & Alex Baker, Bit9 + Carbon Black – John and Alex discussed three topics: scalability issues, methods that scale, and use cases.

  • Use humans where they’re still better than computers: Pattern finding/matching. Jock Mackinlay’s hierarchy in 1986, revised several times. Different things can affect human processing. Occlusion is a common visualization problem. Humans are not good at depth cues.
  • Pixelization. NDVis (dissertation work). Hilbert Curves. Take a single line and wrap it up in a fractal (IPv4 Census Map). Treemaps. Transparency. Heatmaps (Internet census project). Sacrificing variables for clarity. Scatterplot with nugget of data visualization. Integrated views. Correlate different views so clicking on one highlights the others.
  • Anything you buy off the shelf will load all the data into memory, which means you need to build it yourself. PCA and MDS. Machine learning. Clustering is great for making labels.
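
The Hilbert curve trick of wrapping a single line into a square can be sketched with the standard index-to-coordinate conversion. The example maps the 256 possible first octets of an IPv4 address onto a 16x16 grid, in the spirit of the census maps mentioned above. The function name `hilbert_d2xy` and the choice of one cell per /8 block are illustrative assumptions.

```python
def hilbert_d2xy(n, d):
    """Convert distance d along a Hilbert curve to (x, y) on an n-by-n grid.

    n must be a power of two and 0 <= d < n * n.
    """
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        # Rotate or flip the current quadrant so the sub-curves join up.
        if ry == 0:
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


# One cell per /8 block: first octet 0..255 -> position on a 16x16 grid.
cells = [hilbert_d2xy(16, octet) for octet in range(256)]

# Neighboring octets always land in neighboring cells, which is why
# adjacent address ranges stay visually close on such a map.
assert all(
    abs(x1 - x2) + abs(y1 - y2) == 1
    for (x1, y1), (x2, y2) in zip(cells, cells[1:])
)
print(cells[:4])
```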

Additional resources at http://datadrivensecurity.info/blog and visitrend.tumblr.com.

Trusted Computing @Scale

Sahil Rihan, Facebook – Pretty much a horror story. The movies make it sound like it’s “us” vs. “them.” Think of Sandra Bullock in The Net. We all know it’s really “us” vs. some others of “us” and “them.” Consider racks full of servers. How do we secure the platform? Just talking about boot-time protection: not protecting against runtime attacks.

  1. The root of trust is in the CPU: a signed “quote” from the TPM (Trusted Platform Module). Build off of the root of trust all the way up. Now to do it at scale: CM talks with the attestation service, and both of those talk to the servers.
  2. Advanced paranoia: malicious devices, BIOS, firmware updates, remote access. Think about the Sony attack: what if the attackers had flashed firmware?
  3. Black belt security: hardware implants, SGX, NSA ANT Catalog, the network inside. So where do we start and what do we do?
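
The boot-time chain of trust can be sketched with the TPM's extend operation, where each stage's measurement is folded into a register as new = H(old || measurement), so the final value depends on every stage and on their order. The sketch below is a simplified model: the stage names and the "golden" value are hypothetical, and it leaves out the signed quote, which is what lets an attestation service trust the reported value in a real deployment.

```python
import hashlib


def extend(pcr, component):
    """Model of a PCR extend: new = SHA-256(old || SHA-256(component))."""
    measurement = hashlib.sha256(component).digest()
    return hashlib.sha256(pcr + measurement).digest()


def measure_boot(components):
    pcr = bytes(32)   # modeled as all zeros at reset
    for component in components:
        pcr = extend(pcr, component)
    return pcr


# Hypothetical boot stages standing in for firmware, bootloader, and kernel images.
known_good = [b"firmware v1", b"bootloader v7", b"kernel 4.1"]
golden = measure_boot(known_good)

# The attestation side compares a reported value against the expected one.
reported_ok = measure_boot(known_good)
reported_bad = measure_boot([b"firmware v1 (modified)", b"bootloader v7", b"kernel 4.1"])
print(reported_ok == golden)
print(reported_bad == golden)
```

Because each value is hashed into the next, a change in an early stage cannot be undone by later stages, which is the property that makes building "off of the root of trust all the way" possible.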

Takeaway Quotes:

  • “There’s a lot of snake oil out there but some of it actually works”
  • “Danger lurks below the syscall boundary”
  • “Wiping your hard drive is not going to get rid of a BIOS malware”
  • “Who’s going to go and check all of these things?”
  • “Carbon Black people, I’m looking at you”

Tool or Hacker: Which one should I use?

Geoff Vaughan, Security Innovation – Consider the SDLC:

  1. Catching security issues during the requirements phase can save a ton of money if done correctly. A hacker might miss some issues but also catch some issues a tool will miss.
  2. A hacker shines during the Design/Architecture phase, since not a lot of tools can do this.
  3. Development. The hacker doesn’t scale well.
  4. QA/Testing.
  5. Production.

Takeaway Quote:

  • “Think more like a hacker”


Topics discussed include:

  • Visualization merging with cyber
  • Tracking the bad guys can actually help in practice
  • Performance overhaul
  • Often the tradeoffs don’t exist: we can get savings and security

That’s it! Or, more specifically, that’s what I managed to take down during the event. Thanks to Ryan Mack and Facebook for putting on a great conference!