TEEs: What They Are and Why They’re Critical for Privacy Sandbox Testing

A cloud-based “Trusted Execution Environment” (TEE) is a set of infrastructure maintained by one organization that securely and opaquely runs code developed by a different organization. The organization hosting the TEE can monitor the environment and invoke the code, but cannot change it, and in many cases, cannot see the details of how it works. In the context of adtech, specifically Google’s Privacy Sandbox initiative, a TEE is a critical piece of the privacy puzzle that allows the processing of a user’s data without revealing their identity.

How Are TEEs Used in the Privacy Sandbox?

With Chrome’s impending deprecation of third-party cookies, the Privacy Sandbox initiative, in which NextRoll and other adtech providers participate, is developing an industry-wide solution to support the future of digital advertising. Historically, the adtech industry has used third-party cookies to track a user’s actions across sites and associate those actions with later ones. Without third-party cookies, linking those actions becomes considerably more complicated. The Privacy Sandbox focuses on correlating user activity within the Chrome browser and preventing that information from leaving the browser in a de-anonymized way. This information is then securely provided to advertisers using TEEs, maintaining a privacy-forward environment.

Within the Privacy Sandbox, several proposed APIs leverage TEEs for data processing. At NextRoll, we’re exploring these capabilities for ad performance measurement and attribution, including the Attribution Reporting API (ARA), for recording attributable conversions and measuring ad performance, and the Private Aggregation API, for tracking the health of the system and key user metrics.

NextRoll's Use of TEEs

As part of this implementation, NextRoll receives valuable, encrypted reports from users’ browsers when an attribution event is detected, such as a click, purchase, or other conversion. These reports, which are later aggregated and anonymized, detail the conversion, the ad that was viewed, the associated ad campaign, and other parameters that we set.

However, to prevent the identification of individual user interactions and cross-site tracking, we’re not able to read these encrypted reports directly. Instead, we process them through the “Aggregation Service”, a capability developed largely by Google that runs in a NextRoll-owned cloud environment. We invoke, monitor, scale, and pay for this service, which gives us full control over the underlying infrastructure. The implementation itself, however, is provided in an open-source, auditable repository on GitHub. This separation ensures user-level privacy, as promised by Chrome, while still allowing us to make design decisions about how we use the data, such as:

  • We define the “aggregation keys,” which are used to summarize the incoming reports. This allows us to define keys that are relevant to our use cases and our customers. However, the more granular the keys, the less data falls under each one. The service also adds a degree of “noise” to the aggregate value for each key, which enhances privacy, so sparsely populated keys are affected more by it. This technique is known as “differential privacy.”

  • We determine how much data is pushed through the service. We can operate on an hourly, daily, weekly, or any other cadence we set. However, the shorter the timeframe, the less data we have, and the more we are impacted by the “noise” mentioned above.

  • We can expect to receive reports within a short time frame after an attributable conversion (around 10 minutes). However, by waiting and collecting many reports, we increase the accuracy of the data. This is also good for user privacy, as it prevents us from determining which users saw particular ads.
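
To make the trade-off between granularity, cadence, and noise concrete, here is a minimal simulation sketch. It assumes the noise added to each key is drawn from a zero-mean Laplace distribution and does not depend on how many reports fall under that key. The noise scale, the value contributed per report, and the function names are hypothetical numbers and helpers chosen for illustration, not the service's actual parameters.

```python
import random
import statistics


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw one sample from a zero-mean Laplace distribution."""
    # The difference of two independent exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def mean_relative_error(true_total: float, noise_scale: float,
                        trials: int = 2000, seed: int = 0) -> float:
    """Average size of the noise relative to the true aggregate value."""
    rng = random.Random(seed)
    errors = [abs(laplace_noise(noise_scale, rng)) / true_total
              for _ in range(trials)]
    return statistics.fmean(errors)


# Hypothetical values, for illustration only.
NOISE_SCALE = 100.0       # assumed fixed noise scale per key
VALUE_PER_REPORT = 10.0   # assumed contribution of each report to a key

for reports_per_key in (10, 100, 1_000, 10_000):
    true_total = reports_per_key * VALUE_PER_REPORT
    err = mean_relative_error(true_total, NOISE_SCALE)
    print(f"{reports_per_key:>6} reports per key: "
          f"mean relative error {err:.2%}")
```

Because the noise in this sketch does not shrink when a key holds fewer reports, splitting the same traffic across more granular keys or shorter batches leaves each aggregate smaller and the noise proportionally larger.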

At NextRoll, we’ve been testing the Aggregation Service, via the TEE, on production-level attribution data. Our diverse customer base has helped us understand the accuracy of the data and the impact of privacy-preserving measures for customers of all sizes, spend levels, and campaign cadences. These inputs are helping us think through the desired data granularity and the impacts of these privacy-forward changes.

There are a few interesting technical points to consider in using these particular services and TEEs in general:

  • It’s important to factor use of the environment into cost planning. As with any infrastructure, it’s critical to understand how the underlying services consume cloud resources and what the implications of scaling are.

  • To preserve privacy, a report can only be run through this TEE Aggregation Service once. It’s very important to create test data for experimentation rather than squander precious production data. Mishandling this is a major risk, which drives the need for expertise and planning.

  • It’s imperative to carefully consider what we want to measure for both APIs, as the Aggregation Service also has something of a privacy budget, such that if you ask too much of your data, you get nothing. Imagine trying to play 20 questions, but only being able to ask three. 

  • Debugging capabilities are temporarily available in Chrome, allowing noiseless reports to be received and privacy-preserving variables to be configured. We recommend implementing and testing sooner rather than later.
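
Since a report can only be processed once, one way to reduce the risk of wasting production data is to keep a local record of what has already been submitted and filter every batch against it before invoking the service. The sketch below is a hypothetical pre-check of that kind. The `BatchGuard` class, its method names, and the plain-string report IDs are assumptions for illustration and are not part of the Aggregation Service itself.

```python
from dataclasses import dataclass, field


@dataclass
class BatchGuard:
    """Hypothetical local ledger of report IDs already submitted."""

    submitted: set = field(default_factory=set)

    def select_unsent(self, report_ids):
        """Return IDs never submitted before, in their original order.

        Duplicates within the incoming batch are also dropped.
        """
        fresh = []
        seen = set()
        for report_id in report_ids:
            if report_id in self.submitted or report_id in seen:
                continue
            seen.add(report_id)
            fresh.append(report_id)
        return fresh

    def mark_submitted(self, report_ids):
        """Record IDs once a batch has actually been handed to the service."""
        self.submitted.update(report_ids)


guard = BatchGuard()

# First batch: the in-batch duplicate "r2" is dropped.
first_batch = guard.select_unsent(["r1", "r2", "r2", "r3"])
guard.mark_submitted(first_batch)

# Second batch: "r2" and "r3" were already submitted, so only "r4" remains.
second_batch = guard.select_unsent(["r2", "r3", "r4"])

print(first_batch)
print(second_batch)
```

In practice a ledger like this would need to live in durable storage shared by every job that can trigger aggregation. The in-memory set here only illustrates the check.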

Overall, TEEs, and specifically the TEEs associated with the Privacy Sandbox, provide a good separation of responsibilities and a clear trust model. Even though the service details are abstracted, it’s necessary to have a strong understanding of how the privacy-preserving measures work and of the variables that companies can use to tailor their use of the services.

Tom Polchowski is a Senior Manager of Software Engineering at NextRoll.

Contributions from Marco Lugo, a Staff Data Science Engineer at NextRoll.