NextRoll Wins Second Place in Machine Learning Competition

The competition consisted of click and sales prediction tasks under privacy constraints.

NextRoll’s machine learning technology team recently took second place in Criteo’s Privacy Preserving Machine Learning Competition, finishing just behind Facebook and ahead of more than 170 participants. The challenge gave the team the opportunity to apply its machine learning expertise to aggregated data in hopes of accurately predicting advertising clicks and sales.

“We were leading for a good portion of the competition, so despite our second-place [ranking], I am proud of what we accomplished together,” said NextRoll Senior Data Science Engineer Marco Lugo.

Currently, NextRoll's core machine learning technology relies on first- and third-party cookies to determine the value of 150 billion ad opportunities per day to our advertisers, using information from over 2.5 billion daily events.

As Google prepares to deprecate third-party cookies, a move already made by Mozilla's Firefox and Apple's Safari, NextRoll and others in the marketing technology space continue to prepare privacy-first systems that can operate without third-party data.

“Most machine learning is done on granular data, so Google’s proposal to aggregate views of user data complicates things,” said Matt Wilson, a NextRoll Senior Staff Data Science Engineer who participated in the Criteo competition. “Criteo decided to host the competition to see who could learn on aggregated data the best.”

While the $20,000 in prizes was a nice incentive, perhaps the greatest value came from the learning and workshopping that participants engaged in while navigating and testing these complicated changes together.

“This project allowed us to engage with the problem and think deeply about machine learning solutions for the future,” said Matt Wilson. 

The Competition 

Criteo, an online advertising company, launched the competition in May 2021 and proposed that teams explore ways to learn click and sales prediction models on noisy, aggregated data. Criteo donated the dataset for research purposes. To make the exercise genuinely useful in preparing participants for a future without third-party cookies, Criteo engineered the data to simulate proposals in Google’s Privacy Sandbox.

“This competition was a test of our agility and adaptability of our machine learning,” said NextRoll Senior Staff Data Science Engineer Bartek Siudeja.

While most of the data provided was the noisy, aggregated data mentioned above, Criteo also provided a small granular dataset – something more similar to what NextRoll’s machine learning technology ingests today. 

The NextRoll team, which included Matt Wilson, Bartek Siudeja, and Marco Lugo, built its first model on the granular data and enriched it with information from the aggregated data. As a result, the team was able to predict clicks and sales using aggregated data as input, but more work is needed to stop relying on granular data altogether.
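For readers who want a concrete picture of what "enriching granular data with aggregated data" can look like, here is a minimal sketch in Python. It is not the team's actual method. The Laplace noise, the "site" feature, the smoothing prior, and every function name below are assumptions made purely for illustration.

```python
import random
from collections import defaultdict


def laplace_noise(rng, scale):
    # The difference of two exponentials with mean `scale` is Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def aggregate_with_noise(rows, feature, rng, noise_scale=2.0):
    """Return {feature value: (noisy display count, noisy click count)}.

    Hypothetical stand-in for a privacy-preserving aggregate report.
    """
    counts = defaultdict(lambda: [0.0, 0.0])
    for row in rows:
        counts[row[feature]][0] += 1
        counts[row[feature]][1] += row["click"]
    return {
        value: (d + laplace_noise(rng, noise_scale),
                c + laplace_noise(rng, noise_scale))
        for value, (d, c) in counts.items()
    }


def smoothed_rate(displays, clicks, prior_rate, prior_weight=20.0):
    # Noise can push counts negative or above the display count, so clamp first.
    displays = max(displays, 0.0)
    clicks = min(max(clicks, 0.0), displays)
    # Shrink toward the prior so rarely seen values are not over-trusted.
    return (clicks + prior_rate * prior_weight) / (displays + prior_weight)


def enrich(granular_rows, aggregates, feature, prior_rate):
    """Attach an aggregate-derived click rate to each granular row."""
    enriched = []
    for row in granular_rows:
        displays, clicks = aggregates.get(row[feature], (0.0, 0.0))
        new_row = dict(row)
        new_row[f"{feature}_agg_ctr"] = smoothed_rate(displays, clicks, prior_rate)
        enriched.append(new_row)
    return enriched


# Synthetic data only: a large log that is visible solely through noisy
# aggregates, plus a small granular sample that a model could train on.
rng = random.Random(0)
true_ctr = {"news": 0.02, "sports": 0.10, "games": 0.30}
large_log = [
    {"site": site, "click": int(rng.random() < p)}
    for site, p in true_ctr.items()
    for _ in range(5000)
]
aggregates = aggregate_with_noise(large_log, "site", rng)

granular = [
    {"site": "news", "click": 0},
    {"site": "games", "click": 1},
    {"site": "unseen", "click": 0},
]
enriched = enrich(granular, aggregates, "site", prior_rate=0.1)
for row in enriched:
    print(row)
```

Each granular row now carries an extra column estimated from the noisy aggregates, which a downstream click model could use as an input feature. A feature value missing from the aggregates falls back to the prior rate.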

The Facebook team, which ultimately won, used a similar thought process but came closer to “solving the problem,” said Matt Wilson. 

For a more detailed and technical explanation of the project, watch the NextRoll team’s project video submission.

What’s Next

Both the NextRoll and Facebook teams based their winning projects partially on results stemming from granular data, but in the future, machine learning technology will likely need to rely heavily on aggregate data. If Criteo hosts another competition, it may not include granular data as this one did.

“We didn’t solve the problem, but we did better than others,” said Marco Lugo. “Facebook did better than us, but also didn’t solve the problem and used the granular data – so the problem remains. However, we feel much better about training a machine learning model on aggregate data than we did earlier this year.” 

In the meantime, NextRoll’s engineering teams will continue to develop new ideas and test them to ensure the company is ready for a future without third-party cookie data. And our leaders will continue to support competitions and other learning opportunities to test those ideas.

“Having our work pay us and pay for the cloud infrastructure to do a competition like this isn’t available to everyone, and engineers don’t see that at every company,” said Marco Lugo. “We’re very fortunate that we’re encouraged to participate.”

Want to stay up to date on NextRoll’s tech team happenings? Check out NextRoll’s Engineering blog.