Co-ops at Recorded Future wrangle, structure, and package data to help clients stay secure
Author: Attrayee Chakraborty
Date: 07.22.24
Recorded Future is a threat intelligence company whose cloud platform helps organizations identify and mitigate threats across cyber, supply-chain, physical, and fraud domains. The company uses machine learning and natural language processing to collect, process, and analyze threat data from open, deep, and dark web sources, giving clients real-time visibility into cyber threats, adversaries, and infrastructure. It also integrates its intelligence with security products from companies like Cisco, Splunk, and Palo Alto Networks.
Khoury student Maya Prasad and alumnus Felix Yang have both completed co-ops with Recorded Future, and they shared their experiences with Khoury News.
Maya Prasad
Prasad, a rising third-year data science and math major, has had a passion for coding since high school and especially enjoys making deductions from the information she is given. Data science aligned well with these passions, as did her data engineering position with Recorded Future’s Structured Data and Signals team.
“Recorded Future is incredibly focused on building threat intelligence to share with customers,” Prasad says. “Data science is a really key part of building that, because in large part, we’re the backbone.”
Prasad’s team receives data from many different clients using many different platforms, whether through Recorded Future’s APIs or structured data provided by vendors. The team packages the data to give clients a more open view of the threats they face.
“Problems can vary from tickets for addressing software bugs to full-on products to solution designs for a new product,” Prasad says. “One of the first problems that I worked on was using dynamic attributes to understand where malicious data was coming from on a scanner IP, which was difficult as we received around 20 million records in a day on our platform.”
Dynamic attributes are properties of an object that can be added, changed, or defined during a program’s runtime rather than being fixed when the program is written. For instance, you might add a “current speed” attribute to a car object only when the car starts moving.
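The car example can be sketched in a few lines of Python. This is only an illustration of the general idea, not the code Prasad's team uses; the `Car` class, its `start_moving` method, and the `heading` attribute are hypothetical names chosen for the example.

```python
class Car:
    """A car that starts out with only a model name."""

    def __init__(self, model):
        self.model = model  # fixed when the object is created

    def start_moving(self, speed_mph):
        # The attribute is created here, at runtime, the first time
        # the car starts moving. It does not exist before this call.
        self.current_speed = speed_mph


car = Car("sedan")
had_speed_before = hasattr(car, "current_speed")  # attribute not yet defined

car.start_moving(30)
had_speed_after = hasattr(car, "current_speed")   # attribute now exists

# Attributes can also be attached from outside the class by name.
setattr(car, "heading", "north")
```

Before `start_moving` is called, the object has no `current_speed` at all; afterward it does, which is what makes the attribute dynamic.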
The challenge, Prasad notes, was figuring out the source of malicious content, including whether the scanner IP itself was the source or whether malicious data was attached to it. The work was a great introduction to the realm of applied data science and to different ways of storing data.
Prasad feels that her Northeastern courses, especially “Advanced Programming with Data,” helped her to improve the efficiency and runtime of algorithms.
“Now I have a way to apply that knowledge in a real-world setting,” Prasad says. “I was able to increase the code run, which allowed us to process more data and in turn give more information to the clients.”
Prasad even ended up working on a crypto mining tracker project during a company-wide hackathon.
“We created fake crypto miners and used machine learning to detect data compromise,” Prasad says. “We ended up being 95% accurate and won the hackathon. It was great to know my team better and how to code better in a fun setting!
“There’s so much potential for growth and understanding in cybersecurity,” Prasad adds. “I could definitely see myself doing a career in this!”
Felix Yang
Felix Yang, who graduated from Northeastern in May with a math and data science degree, works as a business intelligence engineer at Chewy, an online retailer specializing in pet products and services. Yang did his co-op at Recorded Future as a data analyst for the Product Insights team.
“I loved the ease in interaction at Recorded Future,” Yang says. “I felt like I had a sense of agency to pick my projects and help out in ways I wanted to.”
Those projects included building applications and dashboards, plus providing reports to product managers. One such dashboard allowed salespeople to interact with clients and notify them of outstanding alerts and notifications.
“My primary question was ‘How we can automate alerts for a specific client instead of having to generate it over and over again?’” Yang says. “I built an interactive Slackbot service in response.”
Yang worked with API access to understand the data coming out of servers, then created a streamlined service that provided alerts to salespeople from client companies, alerts they could use without much technical knowledge.
Yang also feels that his classes at Northeastern, specifically “Foundations of Data Science” and “Advanced Programming with Data,” helped a great deal.
“I learned how to work with Python code and websites in class,” Yang says. “This gave me the foundational basis for working with JSON and CSV files that I needed to make the Slackbot.”
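A rough sketch of that kind of JSON and CSV handling might look like the following. It is not Yang's Slackbot and does not call Slack or any Recorded Future API; the sample data, the field names (`client`, `title`, `status`), and the function names are all assumptions made up for illustration.

```python
import csv
import io
import json

# Hypothetical alert data, standing in for what an API might return.
ALERTS_JSON = """
[
  {"client": "Acme", "title": "Leaked credentials", "status": "open"},
  {"client": "Acme", "title": "Phishing domain", "status": "resolved"},
  {"client": "Globex", "title": "Exposed server", "status": "open"}
]
"""


def outstanding_alerts(raw_json, client):
    """Parse the JSON text and keep one client's open alerts."""
    alerts = json.loads(raw_json)
    return [a for a in alerts if a["client"] == client and a["status"] == "open"]


def format_message(client, alerts):
    """Build a plain-language summary a non-technical reader can use."""
    if not alerts:
        return f"{client}: no outstanding alerts."
    lines = [f"{client}: {len(alerts)} outstanding alert(s)"]
    lines += [f"- {a['title']}" for a in alerts]
    return "\n".join(lines)


def to_csv(alerts):
    """Write the same alerts as CSV text, for example for a report."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["client", "title", "status"])
    writer.writeheader()
    writer.writerows(alerts)
    return buffer.getvalue()


acme_open = outstanding_alerts(ALERTS_JSON, "Acme")
message = format_message("Acme", acme_open)
report = to_csv(acme_open)
```

In a real bot, the `message` string would be what gets posted to a chat channel, so the per-client filtering and formatting run automatically instead of being repeated by hand.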
Yang’s biggest challenge was adjusting to Recorded Future’s business-to-business approach and learning to understand business needs. But the more he worked, the more he gained the sort of interpersonal experience and savvy needed to retain customers.
“Talking with the managers in a professional setting, along with working with project managers and coworkers, has been really helpful,” Yang says. “Discussing work, especially being introverted, helped me highlight my successes.
“I got to wear many hats — engineering and business — in one role,” Yang adds. “Being a data scientist and also helping to set business goals has been extremely enriching.”