Trail of Bits

W/Internships

Episode Summary

Meet the Trail of Bits interns who represent the next generation of security engineers. They’re creating new tools that will be used by software developers around the world, and updating existing ones to optimize their efficiency. They’re even inspiring their supervisors to write poetry celebrating their accomplishments.

Episode Notes

Featured Voices in this Episode:

Trent Brunson

Trent Brunson is a Principal Security Engineer and Research Practice Manager at Trail of Bits. He has worked in computer security since 2012 as a researcher and engineer at Assured Information Security in Rome, NY, and at the Georgia Tech Research Institute, where he served as the Threat Intelligence Branch Chief and the Associate Division Chief of Threat Intelligence & Analytics. 

Dan Guido

Dan Guido is the CEO of Trail of Bits, a cybersecurity firm he co-founded in 2012 to address software security challenges with cutting-edge research. In his tenure leading Trail of Bits, Dan has grown the team to more than 80 engineers, led the team to compete in the DARPA Cyber Grand Challenge, built an industry-leading blockchain security practice, and refined open-source tools for the endpoint security market. In addition to his work at Trail of Bits, he runs Empire Hacking, a 1,500-member meetup group focused on NYC-area cybersecurity professionals. His latest hobby coding project, AlgoVPN, is the Internet's most recommended self-hosted VPN.

Suha Hussain

Suha Hussain is a software security engineer who specializes in machine learning assurance. Her work also involves data privacy, program analysis, and applied cryptography. She’s currently an intern at Trail of Bits, where she’s worked on projects such as PrivacyRaven and Fickling. She’s also pursuing a BS in Computer Science at Georgia Tech.

Sam Alws

Sam Alws is a computer science student at Vanderbilt University, hoping to take part in shaping the future of tech. He was a Trail of Bits wintern and also previously interned at Bloomberg LP. He serves as a volunteer software developer for Change++, writing code for charities, and spent two years with Project Spark, designing a programming curriculum for schools in India.

Nick Selby (Host)

An accomplished information and physical security professional, Nick leads the Software Assurance practice at Trail of Bits, giving customers at some of the world's most targeted companies a comprehensive understanding of their security landscape. He is the creator of the Trail of Bits podcast, and does everything from writing scripts to conducting interviews to audio engineering to Foley (e.g. biting into pickles). Prior to Trail of Bits, Nick was Director of Cyber Intelligence and Investigations at the NYPD; the CSO of a blockchain startup; and VP of Operations at an industry analysis firm. 

Production Staff

Story Editor: Chris Julin
Associate Editor: Emily Haavik
Executive Producer: Nick Selby
Executive Producer: Dan Guido

Recording

Recorded at Rocky Hill Studios, Ghent, NY - Nick Selby, Engineer
22Springroad Tonstudio, Übersee, Germany - Volker Lesch, Engineer

Remote recordings: New York, NY; Brooklyn, NY; Virginia; Atlanta, GA (Emily Haavik); Silver Spring, MD (Jason An). 
Trail of Bits supports and adheres to the Tape Syncers United Fair Rates Card.

Edited by Emily Haavik and Chris Julin
Mastered by Chris Julin  

Video

You can watch a video of this podcast.

Special Thanks

Dominik Czarnota
Josselin Feist

Music

TRAIL OF BITS THEME: DISPATCHES FROM TECHNOLOGY'S FUTURE, Chris Julin
ELEMENT, Frank Bentley
FOUR AM, Curtis Cole
DRIVING SOLO, Ben Fox
OPEN WINGS, Liron Meyuhas
SHAKE YOUR STYLE, Stefano Mastronardi
THE QUEEN, Jasmine J. Walker
ILL PICKLE, Phil David
PIRATE BLUES, Leon Laudenback
SCAPES, Gray North

Reproduction

With the exception of any Copyrighted music herein, Trail of Bits Season 1 Episode 2; Internships and Winternships © 2022 by Trail of Bits is licensed under Attribution-NonCommercial-NoDerivatives 4.0 International.  This license allows reuse: reusers may copy and distribute the material in any medium or format in unadapted form and for noncommercial purposes only (noncommercial means not primarily intended for or directed towards commercial advantage or monetary compensation), provided that reusers give credit to Trail of Bits as the creator. No derivatives or adaptations of this work are permitted. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-nd/4.0/

Referenced in this Episode:

Learn more about the work done by Trail of Bits interns over the years on the company blog.

Apply for an internship or winternship at https://www.trailofbits.com/careers

Suha Hussain and lead engineer Evan Sultanik describe the Fickling project: Never a Dill Moment: Exploiting Machine Learning Pickle Files. The Python manual refers specifically to the security issues discussed in this episode:  

"The pickle module is not secure. Only unpickle data you trust... It is possible to construct malicious pickle data which will execute arbitrary code during unpickling. Never unpickle data that could have come from an untrusted source, or that could have been tampered with."

Read more about PrivacyRaven and watch Suha’s video introducing the project: PrivacyRaven Has Left the Nest

Sam Alws describes his journey to speed up Echidna: Optimizing a Smart Contract Fuzzer

For those interested in CTFs, especially for those who seek to start their own, Trail of Bits has posted a CTF Field Guide in the company github repository. It contains details on past CTF challenges, guidance to help you design and create your own toolkits, and case studies of attacker behavior – both in the real world, and in past CTF competitions. Each lesson is supplemented by links to supporting reference materials.

Check out the AngstromCTF site here: angstromctf.com

And here’s the Montgomery Blair High School Cybersecurity Club’s github repository: github.com/blairsec

The Blair students you met in this podcast were Jason An, Clarence Lam, Harikesh Kailad and Patrick Zhang. 

Meet the Team:

Chris Julin

Chris Julin has spent years telling audio stories and helping other people tell theirs. These days he works as a story editor and producer for news outlets like APM Reports, West Virginia Public Broadcasting, and Marketplace. He has also taught and mentored hundreds of young journalists as a professor. For the Trail of Bits podcast, he serves as story and music editor, sound designer, and mixing and mastering engineer.

Emily Haavik

For the past 10 years Emily Haavik has worked as a broadcast journalist in radio, television, and digital media. She’s spent time writing, reporting, covering courts, producing investigative podcasts, and serving as an editorial manager. She now works as an audio producer for several production shops including Us & Them from West Virginia Public Broadcasting and PRX, and APM Reports. For the Trail of Bits podcast, she helps with scripting, interviews, story concepts, and audio production.

Episode Transcription


 

TRENT BRUNSON:  5-4-3-2-1. 

TRENT BRUNSON: [POETRY]: 

Is binary analysis no fun?/

Well look what our Wintern has done!

NARRATOR (NICK SELBY): That's Trent Brunson. He's not a poet by trade. He’s the Practice Director for Research and Engineering at Trail of Bits. But he posted this poem to Twitter a while back, and we asked him to read it.

TRENT BRUNSON:  [POETRY]   

Specified declarations,/

Transparent explorations,/

And it's mutable, not run-and-done./

NARRATOR: Trent was moved to verse by the work of an intern. 

When you move the practice director to write limericks praising your work...it's a pretty safe bet you aced your internship.

Trail of Bits has some great interns. 

DAN GUIDO:  -  The competition is really tough because you're going to be working on real things. Things that matter, in high assurance software. Which means that from day one, you’ll be working on systems that have catastrophic consequences when they go wrong.

NARRATOR: That’s Trail of Bits CEO Dan Guido, talking about his company’s internship program that runs every summer - and the Winternship program that runs, well, every winter. 

DAN GUIDO:  -  We’re really methodical about how we select people, and that's because we know we're going to throw them into the deep end of the pool, that's where you learn the most, the fastest.

NARRATOR: The company, and everyone in it, takes internships very seriously. When he was a grad student, Dan did a demanding internship himself - at the National Security Agency... 

DAN GUIDO:  - NSA was one of the earliest internships that I had, and it really set the tone for the rest of my career because it showed me what's important in security. There's a lot of work being done, but I don't feel like there's a lot of progress being made. And that's because a lot of the activity that we spend time on, a lot of the research time, a lot of the software development, it doesn't align directly with what attackers do. So having an inside look at what other people are doing against the United States and how intelligence collection works was extremely valuable for me to orient myself early in my career.

NARRATOR: In today’s economy, there’s incredible demand for talented engineers. So for Trail of Bits, an internship program isn’t some vague way of “giving back to the community” or cutting a kid a break. To Trail of Bits, the internship program? It’s strategic.

DAN GUIDO: I'm running a business, I'm not just giving away money to kids, that's a side effect. The side effect is that these students get opportunities to teach themselves things, to bank a little bit of cash. But we do this because it helps Trail of Bits. Our clients and our projects, and our people need the extra assistance from smart, ambitious young students to be able to deliver services that Trail of Bits offers …it goes beyond altruism.

MUSIC in__Four AM, Curtis Cole

NARRATOR: In this episode, you'll meet some of our Security Engineering Interns, and hear about some of the real world projects they work on - like making one of our most popular free and open-source software packages run faster and use less memory. And working in the obscure world of deep-learning systems security, another intern built her own project from the ground up.

MUSIC out

SUHA HUSSAIN: Hi, I’m Suha Hussain, and I am a security engineering intern here at Trail of Bits. I work mainly on machine learning security, I’m also a student at Georgia Tech. 

NARRATOR: Suha was hired by NYU’s Center for Cyber Security - after she attended a high school science fair. The NSA tried to recruit her, but as she said, she ended up at Georgia Tech. Which she loves, but... 

SUHA HUSSAIN:  - it's very easy to get frustrated in school… one, knowing a reasonable percentage of the material and also knowing what is and what is not relevant anymore. It does feel like we're jumping hoops sometimes, especially when there's extremely dated syllabi. 

NARRATOR: Suha plans to graduate next year. But she already has skills that can be used now, in real life. She wants to take these abstract solutions from the classroom - and apply them to real problems.

SUHA HUSSAIN:  the idea that I could come in with a project of my own for an internship instead of being assigned to work on something was pretty cool.

NARRATOR: Suha’s project was called: Privacy Raven.

SUHA HUSSAIN:  [from video at Trail of Bits website]: “This summer I worked on Privacy Raven, a comprehensive privacy testing suite for deep learning.”

NARRATOR: That's Suha in a video that's on the Trail of Bits website. Like all the tools you will hear about on this podcast, Privacy Raven is free and open source, and available in our Github repository - the link’s in the show notes. 

SUHA HUSSAIN: “I built a Python library optimized for usability and efficiency, that allows users to simulate privacy attacks on any deep learning system.”

MUSIC IN

NARRATOR: Deep learning allows software to perform tasks without explicit programming. But unlike other software, Deep Learning systems developers don’t have a comprehensive suite of assurance tools. Here’s Dan Guido again:

DAN GUIDO: Deep learning systems are an open area of research for the security community. 

NARRATOR: For users, Deep Learning systems are simple: Let’s say you have a deep learning system that’s holding medical diagnosis data. A user - like a doctor or technician - can submit a query through a web application, and get a “yes” or “no” answer. 

So, one expression of this could be that the user uploads a medical image, then asks the system whether the image contains evidence of, say, a brain bleed - and they would get that Yes or No answer. 

From a security standpoint, it would seem that the answer is given with as little access as possible to the deep learning model behind the scenes. 

MUSIC OUT

NARRATOR: But is that true? We’ve just described the front door user experience, but from a systems perspective, that front door prevents us from seeing that there’s a whole lot going on in there. So, when we get down to it, does that limited and simple access mean the system isn’t vulnerable?

DAN GUIDO: Not everybody in the security community understands what a deep learning system is. And that's why it's so important that tools like Privacy Raven are created because it lowers the bar for entry to do the kinds of analysis required to make these systems safe.

NARRATOR: The data held by these ML systems - fraud, medical, actuarial, even marketing information, or what have you - can be incredibly sensitive. Suha and the team built Privacy Raven to learn how exploitable the potential vulnerabilities are.

SUHA HUSSAIN: It was born out of my frustrations doing machine learning research. People kept listing all of these cool attacks that are possible with ML systems, and I wanted to build like a privacy testing system. So all these previous attacks could be put in one place, 

NARRATOR: To test the security, Privacy Raven simulates an adversary. It launches different types of attacks, to probe the vulnerabilities of a deep learning system. 

It turns out that, even when they were granted limited access, these simulated adversaries were able to successfully exploit vulnerabilities.   

First, Privacy Raven tests for what’s called model extraction attacks: Piecing together enough information from those yes or no answers, to create a copycat model. 

SUHA HUSSAIN:  Companies spend huge sums of money to build a training dataset and hire engineers to build these tools. Model extraction is used in industrial or even nation-state espionage, to steal this intellectual property and cheaply make a clone of it. These attacks can also be used for reconnaissance, to collect information about a machine learning model in order to mount future attacks. 
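To make the idea of model extraction concrete, here is a minimal sketch in Python. It does not use PrivacyRaven and is not how PrivacyRaven works internally; the "victim" is a hypothetical one-parameter model invented for illustration, and every name in it (`victim_predict`, `extract_threshold`, `SECRET_THRESHOLD`) is an assumption. It only shows that yes/no answers alone can be enough to build a copycat.

```python
import random

# Hypothetical secret held inside the "victim" model. The attacker's code
# below never reads this value directly; it only calls victim_predict().
SECRET_THRESHOLD = 0.618


def victim_predict(x: float) -> bool:
    """Black-box front door: the caller sees only a yes/no answer."""
    return x >= SECRET_THRESHOLD


def extract_threshold(query, lo: float = 0.0, hi: float = 1.0, budget: int = 30) -> float:
    """Recover the decision boundary with a binary search over yes/no answers."""
    for _ in range(budget):
        mid = (lo + hi) / 2
        if query(mid):
            hi = mid  # answer was "yes": the boundary is at or below mid
        else:
            lo = mid  # answer was "no": the boundary is above mid
    return (lo + hi) / 2


stolen_threshold = extract_threshold(victim_predict)


def copycat_predict(x: float) -> bool:
    """The attacker's clone, built only from the victim's answers."""
    return x >= stolen_threshold


# Measure how often the clone agrees with the victim on fresh inputs.
rng = random.Random(0)
samples = [rng.random() for _ in range(10_000)]
agreement = sum(victim_predict(x) == copycat_predict(x) for x in samples) / len(samples)
print(f"recovered threshold ~ {stolen_threshold:.6f}, agreement = {agreement:.4f}")
```

Real deep learning models have far more than one parameter, so real extraction attacks need many more queries and a trained substitute model, but the principle of learning from the answers is the same.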

NARRATOR: Privacy Raven also tests for model inversion attacks, which can recreate the private input data that was used to train the deep learning model.

SUHA HUSSAIN: Machine learning models are built by using a set of confidential data to train the system. Criminals can use a model inversion attack to extract and steal that training data. 

NARRATOR: Membership inference attacks can have devastating results, so Privacy Raven tests for those, too. The tests determine if an attacker is able to figure out whether a certain data point, like a name, is in a dataset.

SUHA HUSSAIN: - A membership inference attack can be used to expose - or “de-anonymize” - the true identities of anonymized data within a machine learning system. 

NARRATOR: The stakes are especially high for an attack like this when the dataset includes private medical information. Here’s Suha in another clip from that video:

SUHA HUSSAIN:  [from video]: The creation of that software relied upon many patients being willing to trust the developers with their private medical data. If a patient’s participation and in conjunction with that, their image and diagnosis, could be recovered, it would first of all violate HIPAA and then it would diminish the trustworthiness of the whole enterprise.

MUSIC IN

NARRATOR: One of the most important goals for Privacy Raven was that it would be easy to use.

SUHA HUSSAIN: The tools out there are often built by researchers for researchers. When I  thought that… this is such an important problem. Anyone with a knowledge of Python and maybe a little bit of machine learning should be just able to run this attack.

NARRATOR: Privacy Raven is a good example of something designed from scratch as an internship project. But Trail of Bits internships are designed to find ways to match fledgling engineers with real-world problems. 

MUSIC OUT

SAM ALWS: My name is Sam Alws. I'm a student at Vanderbilt University and a software engineer. /  and I did a winternship at Trail of Bits.

NARRATOR: When Sam first applied at Trail of Bits, he didn’t get the job. At least, not exactly. 

SAM ALWS: I applied for a winternship at a completely different role,

NARRATOR: Sam applied to work with a tool called Dylint. 

SAM ALWS: someone emailed me saying, "We're not sure if you're very qualified for the dylint role. …Instead, what would be best is if you work for Echidna that matches more closely with your skill set." 

NARRATOR: Sam’s supervisor wanted him to work on Echidna, which is a tool Trail of Bits released in 2018 to help test smart contracts. 

A smart contract is a small computer program that runs on a blockchain, and sets the conditions of a transaction. For example, you might develop a smart contract that will execute the logic of a lending platform, where users deposit their funds and receive interest when other users take loans, without third-parties or central banks involved. Smart contracts help ensure that, even in a world in which no one trusts anyone else, each party holds up their end of the bargain. 

But there’s a problem.

SAM ALWS: Smart contracts do exactly what you tell them to, which you would expect to be a good thing. But oftentimes, there's millions of dollars at play, and these contracts are hard to update. If exactly what you tell it to do is just slightly off, then that can cause people to lose their money or people to be able to steal money from the smart contract.

MUSIC IN

NARRATOR: To help check smart contracts for bugs, Trail of Bits developed a smart contract fuzzer - that’s what Echidna is. 

At its heart, a fuzzer is used to verify that places within a program that accept inputs protect against the unexpected. So, for example, when you log into your bank account, instead of entering your name and account number, a fuzzer would just type a whole lot of random things - spaces, letters, symbols, etc., and maybe throw in some database code - and an engineer can observe whether anything weird happens. 

Echidna is a much more sophisticated tool than that. It helps engineers find weaknesses in smart contracts - weaknesses that could let, for example, people steal money, or delete accounts.

A key feature of a fuzzer like Echidna is its speed. In a minute, a human might be able to enter maybe a dozen inputs to a program, while a fuzzer can do thousands or even tens of thousands.
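The fuzzing loop described here can be sketched in a few lines of Python. This is not Echidna and does not use its API; `ToyToken` is a hypothetical stand-in for a smart contract, with a deliberate bug planted in it, and the property being checked (total supply never changes) is an assumption chosen for illustration.

```python
import random

TOTAL_SUPPLY = 200


class ToyToken:
    """Hypothetical stand-in for a token contract (not real contract code)."""

    def __init__(self):
        self.balances = {"alice": 100, "bob": 100}

    def transfer(self, sender: str, recipient: str, amount: int) -> None:
        sender_balance = self.balances[sender]
        recipient_balance = self.balances[recipient]
        if amount > sender_balance:
            return  # reject overdrafts
        # Deliberate bug: both balances were read before either was written,
        # so a transfer to yourself overwrites the debit with the credit.
        self.balances[sender] = sender_balance - amount
        self.balances[recipient] = recipient_balance + amount


def fuzz(seed: int = 0, max_calls: int = 10_000):
    """Call transfer() with random arguments until the invariant breaks."""
    rng = random.Random(seed)
    token = ToyToken()
    accounts = list(token.balances)
    for call_number in range(1, max_calls + 1):
        sender = rng.choice(accounts)
        recipient = rng.choice(accounts)
        amount = rng.randint(0, 150)
        token.transfer(sender, recipient, amount)
        # Property under test: tokens are never created or destroyed.
        if sum(token.balances.values()) != TOTAL_SUPPLY:
            return call_number, (sender, recipient, amount)
    return None  # no violation found within the call budget


finding = fuzz()
if finding is not None:
    calls_needed, failing_input = finding
    print(f"invariant broken after {calls_needed} calls by transfer{failing_input}")
```

The fuzzer knows nothing about the bug. It only knows the property that should always hold, and it keeps throwing inputs at the code until one breaks it.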

MUSIC OUT

NARRATOR: Echidna runs fast, but Sam was asked to look into making it run faster.

SAM ALWS: even though Echidna ran perfectly well, it provided the right outputs given the right inputs… / Sometimes it would take super long to run and sometimes it would take up too much memory. So given certain inputs, it would either take forever to run or it would run out of memory on your computer, and then it would have to stop halfway through. 

NARRATOR: In the software industry, this is called technical debt. Developers take shortcuts to make something give you the features you need now, even though you know you’re putting off until later doing some hard, boring work on things like efficiency. It’s actually a lot like credit card debt: you get the shiny things you want now, and you pay a lot more for it over time. Technical debt affects everyone, and Trail of Bits is no exception. 

As long as it doesn’t create an actual security vulnerability, we can sometimes let our own tech debt linger. Until we have a chance to burn it down.

Trail of Bits recruited Sam - a highly qualified wintern - to burn down some of Echidna’s technical debt.

SAM ALWS: My task was that I had to improve the performance of Echidna so that when the average person ran it, it didn't take super long to fuzz a contract. 

NARRATOR: Sam went to work.

SAM ALWS: I found that there were a few very easy problems that I could fix. The first one that I found was that there was this function that scanned through the entire … contract code that was being called… like every single frame, and that could be dealt with pretty easily. I found another problem where the solution was available for an older version of the code, but it hadn't been added yet. So I just needed to add that solution, and then that would fix the problem.

NARRATOR: Sam was making great progress, and actually he surprised himself:

SAM ALWS: When I came in, I was expecting to get like 10 percent faster or 20 percent faster. 

NARRATOR: Overall, we reckon the work Sam did made Echidna six times faster. And that’s just the official number.

MUSIC IN

NARRATOR: Sam improved the memory problem, too. That’s a little harder to measure. 

SAM ALWS: we went from, my computer is not able to run this input at all, and it runs out of memory, into, my computer is able to run this input fairly easily.

NARRATOR: It doesn’t take much imagination to picture the wide-ranging impact it’s going to have.

SAM ALWS: [Echidna] is open source, so it's being used in a lot of different scenarios, often with millions of dollars at stake. … / And I like to think that my fixes are helping all these people who are trying to make their contracts more secure. 

MUSIC OUT

NARRATOR: While Sam was working on Echidna, Suha’s winternship at Trail of Bits, fueled by the progress with Privacy Raven, turned into a summer internship. And then it became a gap semester. 

SUHA HUSSAIN: I kept on thinking, Wow, there's so many cool problems and cool things I can do here in this space at Trail of Bits, and I really just wanted to do this more. 

NARRATOR: She’s stayed on while finishing school, doing work on problems in machine learning assurance. And a lot of this comes down…

[SFX: Pickle Crunch] 

… to pickles. 

Yeah, not cucumbers. Pickle files. 

MUSIC IN ILL PICKLE - PHIL DAVID

NARRATOR: Let’s say you have a Python object, like a list. The list has a hierarchy. Pickling is a way Python can convert that object into a character string containing everything you would need to reconstruct the object in another python script. But there’s a problem: 

SUHA HUSSAIN: Pickle files are commonly known to be insecure.

NARRATOR: If you’ve ever endured the security awareness talk at work, you’ll remember being told never to open a spreadsheet that someone emails you, because it might contain a malicious macro. Well, right there in the Python manual - we link to it in the show notes - you’ll see almost the same warning about Pickle files. Because of the way Pickling works, two operation codes that are an inherent part of the process are capable of executing arbitrary Python code outside the Pickle Machine.
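The hazard described here can be reproduced with nothing but the Python standard library. The sketch below is illustrative only: the `Sneaky` class and its payload are hypothetical, and the payload calls the harmless built-in `sorted` where an attacker would substitute something like `os.system`. Note that in current Python, `pickle.dumps` returns a `bytes` object rather than a text string.

```python
import pickle

# 1. The normal case: an object goes in, bytes come out, and the same
#    object can be rebuilt later or in another script.
shopping = {"fruit": ["apple", "pear"], "count": 2}
blob = pickle.dumps(shopping)
restored = pickle.loads(blob)
print(type(blob).__name__, restored == shopping)


# 2. The hazard: an object may tell pickle "to rebuild me, call this
#    function with these arguments". Unpickling then makes that call.
class Sneaky:
    """Hypothetical malicious object used only for this demonstration."""

    def __reduce__(self):
        # Harmless stand-in: call sorted([3, 1, 2]) at load time.
        # An attacker would return something like (os.system, ("...",)).
        return (sorted, ([3, 1, 2],))


payload = pickle.dumps(Sneaky())
result = pickle.loads(payload)  # this line runs the callable chosen above

# The loader did not rebuild a Sneaky object. It called a function and
# handed back whatever that function returned.
print(result, isinstance(result, Sneaky))
```

Nothing in the call to `pickle.loads` hints that a function of the file author's choosing is about to run, which is why the Python manual says to unpickle only data you trust.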

MUSIC OUT

NARRATOR:  It’s just a known hazard, and it can’t be changed. 

To help with this, Trail of Bits launched a project known as “Fickling”:

SUHA HUSSAIN: Fickling is a custom Python interpreter that will symbolically execute pickle files instead of overtly executing them. This will help people build and reverse engineer malicious pickle files.
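The idea of examining a pickle file without running it can be tried with Python's standard `pickletools` module. This sketch is not Fickling and does not use its API; it is a much cruder check that only lists the opcodes able to import and call things. The helper name `risky_opcodes`, the `RISKY` set, and the `Sneaky` class are assumptions made for illustration.

```python
import pickle
import pickletools

# Opcodes that import a named object (GLOBAL, STACK_GLOBAL) or call one
# (REDUCE). This short list is an illustrative assumption, not a complete
# catalogue of everything a malicious pickle might use.
RISKY = {"GLOBAL", "STACK_GLOBAL", "REDUCE"}


def risky_opcodes(data: bytes) -> list:
    """Walk the pickle's opcodes without ever executing the pickle."""
    return [op.name for op, _arg, _pos in pickletools.genops(data) if op.name in RISKY]


class Sneaky:
    """Hypothetical object that asks the unpickler to call a function."""

    def __reduce__(self):
        # Harmless stand-in for a dangerous call such as os.system.
        return (sorted, ([3, 1, 2],))


plain = pickle.dumps({"weights": [0.1, 0.2], "layers": 3})
suspect = pickle.dumps(Sneaky())

print("plain:  ", risky_opcodes(plain))
print("suspect:", risky_opcodes(suspect))
```

Many legitimate pickles, including saved machine learning models, also contain these opcodes because they rebuild custom classes, so a scan like this flags files for a closer look rather than proving anything is malicious.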

NARRATOR: Through all these projects, there’s a common theme. Trail of Bits and the interns who apprentice with them are working to make software more secure for the people and systems that use it.

SUHA HUSSAIN:   it is very common for machine learning researchers to download pickle files containing machine learning models and just cross their fingers and pray that those files aren't malicious. There's a lot of trust that shouldn't be there. …Within machine learning, there's a move fast and break things culture, and once they break things, that's where we come in with tools like privacy raven and fickling.

NARRATOR: People like Suha, and Sam, are the future of information security - but they don’t just appear when they’re ready for the big leagues. 

SUHA HUSSAIN:   in my freshman year of high school, I emailed 100 startups in New York City begging for an internship. 

NARRATOR: They often start developing these skills when they’re kids.

NAT SOUND BLAIR CYBERSECURITY CLUB DISCUSSIONS

NARRATOR: This is the Montgomery Blair High School Cybersecurity Club in Silver Spring, MD.

NAT SOUND: BLAIR CYBER SECURITY CLUB

FADE UNDER

NARRATOR: These students are collaborating on a challenge. 

NAT SOUND: BLAIR CYBER SECURITY CLUB 

DAN GUIDO: Montgomery Blair is one of these extraordinarily specialized tech high schools. 

NARRATOR: Dan Guido loves Blair: 

DAN GUIDO: It's probably like one of the top five in the country.

NAT SOUND: BLAIR CYBER SECURITY CLUB

NARRATOR: As we recorded this, the students at Blair were getting set to host the seventh iteration of their annual Capture-the-Flag cybersecurity competition. It's called ångstromCTF. 

MUSIC IN

NARRATOR: In the information security world, a Capture the Flag is basically a contest in which individuals or teams compete to break into computer networks and then accomplish a mission. It’s a terrific training activity that emulates and concentrates work on a set of real world problems. In 2021, more than 3000 users, in 1700 teams, competed in ångstromCTF, making it one of the largest high-school oriented CTFs in the country. 

NARRATOR: Trail of Bits has sponsored ångstrom, and other CTFs at high schools around the United States, for years. In fact, if there is a high school CTF out there, chances are that Trail of Bits sponsors it. One reason is that CTFs offer people the same kind of plunge into real world issues that we see every day at Trail of Bits.

MUSIC OUT

DAN GUIDO: So your ability, your demonstrated ability to solve those challenges and complete those exercises tells me that you're going to be a really good candidate for Trail of Bits.

NARRATOR: The other reason is pure pragmatism:

DAN GUIDO: the really excellent, ambitious, talented young students get recruited early and then stick with the job for 10 years, and you might have a chance to hire them again when they're 30 or when they're 40, but they're not actively looking for jobs, they're going to be very sticky. They're going to work with a single firm for a very long period of time. 

DAN GUIDO: it's kind of the same strategy for diversity recruiting. … you put your name out there on the expectation that three or five years later, when that person was looking for a job that they remember you offer some service that aligns with their interests and speaks to them, and they give you that chance that one out of three times in their career that they're willing to switch jobs to a new employer. You hope that Trail of Bits makes the running for that. 

MUSIC UNDER: SCAPES, GRAY NORTH: 

NARRATOR: The people who worked on this podcast are Emily Haavik, Chris Julin, Dominik Czarnota, Josselin Feist, Dan Guido, and hi, I’m Nick Selby, the director of the software assurance practice at Trail of Bits. With thanks to Blair Cybersecurity Club members Jason An, Clarence Lam, Harikesh Kailad, and Patrick Zhang. 

Chris Julin made our theme music. I did the foley - which means, I bit into that pickle. 

Season One of Trail of Bits is available for download now, wherever you get your podcasts. 

To learn much more about the work done by Trail of Bits interns over the years, visit blog dot trailofbits dot com slash category slash internship hyphen projects.

There are links in the show notes to these articles and more: 

In "Never a Dill moment: Exploiting Machine Learning Pickle Files", Suha Hussain and lead engineer Evan Sultanik describe the Fickling project we heard about in this episode. There are puns. 

You'll also find links to the blog post, "PrivacyRaven Has Left the Nest" as well as a video introducing Privacy Raven.   

In "Optimizing a smart contract fuzzer", Sam Alws describes his journey to speed up Echidna. 

For those interested in CTFs, especially for those who seek to start their own, our CTF Field Guide contains details on past CTF challenges, guidance to help you design and create your own toolkits, and case studies of attacker behavior - both in the real world, and in past CTF competitions. Each lesson is supplemented by links to supporting reference materials. It's in our GitHub repository: Trailofbits dot github dot IO slash ctf

The AngstromCTF site is at angstromctf dot com

Trail of Bits helps secure some of the world's most targeted organizations and devices. We combine high-end security research with a real-world attacker mentality to reduce risk and fortify code. We believe the most meaningful security gains hide at the intersection of human intellect and computational power. Learn more at trailofbits dot com. On Twitter we are AT trail of bits. Dan Guido’s Twitter is AT DGuido and I’m AT fuzztech.