If you caught our last blog, Privacy by Design & Default in Analog's Development Processes, then you know we shared our thoughts on how systems can be designed to achieve their purpose while also protecting the privacy of their users. Today, we take the subject a bit further and discuss how to put those principles into practice.
Frameworks often present ideal, and at times impractical, expectations that leave engineers scratching their heads over how to apply them to real-world development. Below, we introduce some of the tools that can be leveraged to protect privacy, then return to our real-world use case of electronic toll systems (described in the previous blog) to bridge the gap between privacy by design principles and their application in practice.
There are many privacy-enhancing technologies (PETs) to protect users' privacy, from differential privacy to secure multi-party computation. While PETs are powerful tools, applying them correctly requires domain expertise, and on their own they are unlikely to resolve all of the relevant privacy concerns. Rather than focus on any particular PET, let's examine how to apply the principles of privacy by design during the product design stage. Before walking through an example use case (2), it is important to understand the concept of data minimization: sharing as little sensitive data as possible with entities the user does not control. A goal of privacy by design is to minimize the privacy risks and trust assumptions placed on other entities, since those assumptions take control of the data away from the user.
The usual approach to data collection begins with a surplus of data, far more than is required. This surplus is then reduced after reviewing the relevant data protection and privacy regulations, which restrict what data is allowed to be collected. However, just because collecting a piece of data is not disallowed does not mean that it needs to be collected.
If instead we follow the privacy by design approach, we would begin by identifying the absolute minimum amount of data that must be collected by other entities to solve the underlying problem. This minimal set may need to be expanded in order to ensure the final system functions as intended.
Let’s return to our example electronic highway toll system, and specify two requirements: (1) the toll authority must know the correct fee to charge each user, and (2) the users cannot cheat the system. There are clear privacy concerns around the collection of users' location data for determining the fee based on highway entry/exit, as well as service integrity requirements to prevent users from cheating the toll system.
A straightforward solution would be to transmit GPS data from each user's vehicle to the toll authority along with their identity so that the toll authority can compute the fee to charge. However, GPS data reveals an incredible amount of auxiliary information: which interchanges the user entered/exited, what time this occurred, how fast the user was driving, etc. In fact, none of that information is needed to satisfy the system requirement of charging the correct fee! Rather, the minimum amount of data that must be shared with the toll authority is the total distance traveled (3). This can be accomplished by calculating the distance locally at the user's vehicle, and then transmitting the distance traveled to the toll authority. The sensitive GPS data doesn't need to leave the user's control (their vehicle) which minimizes their privacy risk and the amount of trust that must be placed in the toll authority.
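To make the idea concrete, here is a minimal sketch of the on-vehicle computation. The haversine-based distance sum and the `vehicle_id` payload field are illustrative assumptions, not a description of any real toll system; the point is simply that the raw GPS trace never leaves the vehicle, and only a single aggregate number is transmitted.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two GPS samples, in kilometers."""
    r = 6371.0  # mean Earth radius, km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

def total_distance_km(gps_trace):
    """Sum segment distances over an ordered list of (lat, lon) samples.

    Runs locally in the vehicle; the trace itself is never transmitted.
    """
    return sum(
        haversine_km(lat1, lon1, lat2, lon2)
        for (lat1, lon1), (lat2, lon2) in zip(gps_trace, gps_trace[1:])
    )

# The raw trace stays under the user's control; only this payload is sent.
trace = [(48.8566, 2.3522), (48.8666, 2.3522), (48.8766, 2.3622)]
payload = {"vehicle_id": "ABC-123",
           "distance_km": round(total_distance_km(trace), 2)}
```

Note that the toll authority receives only `payload`: the entry/exit points, timestamps, and speeds that could be derived from the trace are simply never shared.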
What about the second requirement, that users cannot cheat the system? This is the type of problem that PETs are designed to solve. In this case, a cryptographic tool can provide the toll authority with sealed commitments containing the GPS data that backs up the fee calculation. Since the commitment is sealed, the toll authority can't access the GPS data on its own. However, if other evidence calls into question the validity of the fee, a court can compel the user to open the commitment and prove that the fare was calculated correctly. Here, the cryptographic commitment is additional data that is required to maintain the integrity of the service, even though it wasn't in the original minimum set of required data.
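A simple hash-based commitment illustrates the hiding and binding properties described above. This is a sketch under assumptions, not the scheme any real toll system uses: the vehicle commits to its trace with a random nonce, sends only the digest alongside the fee, and keeps the trace and nonce private unless compelled to open the commitment.

```python
import hashlib
import json
import secrets

def commit(gps_trace):
    """Commit to a GPS trace: hiding (digest reveals nothing without the
    nonce) and binding (the user cannot later open it to a different trace).

    The vehicle keeps (gps_trace, nonce); only the digest is transmitted.
    """
    nonce = secrets.token_bytes(32)
    payload = json.dumps(gps_trace, sort_keys=True).encode()
    digest = hashlib.sha256(nonce + payload).hexdigest()
    return digest, nonce

def verify_opening(digest, gps_trace, nonce):
    """Run by the toll authority (or a court) if the user is compelled
    to open the commitment and justify the fee."""
    payload = json.dumps(gps_trace, sort_keys=True).encode()
    return hashlib.sha256(nonce + payload).hexdigest() == digest

trace = [[48.8566, 2.3522], [48.8666, 2.3522]]
digest, nonce = commit(trace)  # digest accompanies the fee; nonce stays local
```

Because SHA-256 is collision-resistant, the user can't substitute a shorter trace after the fact, and because the nonce is random, the authority can't brute-force the digest to recover the route on its own.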
Privacy is not only a requirement for products to comply with regulations; it also demonstrates our commitment to protecting the rights and freedoms of our users. Importantly, Privacy by Design principles must be integrated throughout the product development lifecycle, from inception through end-of-life. As we saw in our example, it's worthwhile to identify the minimum amount of data required for the system to achieve its stated goals before making design decisions. Proceeding without factoring in privacy considerations can result in a solution that cannot comply with privacy regulations due to the design itself. For instance, consider in our example the privacy implications of whether the toll operator or the vehicle owner controls the collection and transmission of the data. Poor privacy design decisions can result in re-design costs, regulatory fines, and a loss of customer trust. By raising awareness of these principles, we hope to encourage engineers to begin discussions around data privacy and protection early in the lifecycle.