Project Name:
Secure Multiparty Computation: A Case Study
Contractor: Stealth Software Technologies, Inc.
Lessons Learned
- Why SMPC?
- A long-standing hypothesis of SMPC pioneers is that this technology will make data sharing much safer and, therefore, make it easier to convince stakeholders to participate. Several examples from the past five years have shown this to be the case, but success has come from a combination of technological and administrative alignment, not from SMPC alone. Even among privacy-enhancing technologies, SMPC is only one tool in the toolbox, but it uniquely offers stronger privacy and functionality guarantees than the other tools can achieve (at the cost of performance). In some cases this may be overkill, and “secure enough” is secure enough, but we have found that the additional strength may be what ultimately moves the needle.
-
- However, there is still a general sense that SMPC is “too new” or lacks enough previous use cases to point to. One hospital we spoke with preferred a Privacy-Preserving Record Linkage solution based on secure hashing over SMPC simply because they had seen it used in a previous city-wide study. As with many technologies, there is a network effect: adoption becomes easier as more organizations use the technology and have heard of it. These points further highlight the importance of this project.
-
- For our specific use case, we have noted that offering technology to protect privacy is not sufficient on its own; the private-sector partner also appears to need an internal business need to motivate it to engage its legal team to collaborate on this work.
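To make the contrast between the two approaches mentioned above more concrete, the following is a minimal Python sketch under stated assumptions, not the project's actual implementation. It shows (1) a keyed-hash linkage token of the kind used in hashing-based record linkage, and (2) additive secret sharing, one common SMPC building block, used to compute a sum without any party revealing its input. All function names, the modulus, and the normalization rule are hypothetical and chosen only for illustration.

```python
import hashlib
import hmac
import secrets

# Illustrative modulus (an assumption); all share arithmetic is done mod this value.
MODULUS = 2**61 - 1


def linkage_token(identifier: str, shared_key: bytes) -> str:
    """Hashing-based linkage: keyed hash (HMAC-SHA256) of a normalized identifier."""
    normalized = identifier.strip().lower().encode("utf-8")
    return hmac.new(shared_key, normalized, hashlib.sha256).hexdigest()


def share(value: int, n_parties: int) -> list[int]:
    """Split value into n additive shares that sum to value mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def reconstruct(shares: list[int]) -> int:
    """Recombine additive shares into the value they encode."""
    return sum(shares) % MODULUS


def secure_sum(private_values: list[int]) -> int:
    """Simulate all parties locally: each shares its input, only the total is rebuilt."""
    n = len(private_values)
    # Row i holds the shares that party i would send out, one per party.
    distributed = [share(v, n) for v in private_values]
    # Party j adds the j-th share from every party; each single share is uniformly random.
    partial_sums = [sum(row[j] for row in distributed) % MODULUS for j in range(n)]
    return reconstruct(partial_sums)
```

In the hashing approach, whoever performs the linkage sees which tokens match, and anyone holding the key can test guessed identifiers. In the secret-sharing sketch, only the final aggregate is reconstructed, which is one way to see the stronger guarantee and the extra communication cost described above.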
- Technology Insights
-
- For the SMPC technology to scale to multiple concurrent studies, a standardized, auditable account management capability is required. To support future studies, we should implement (or adopt) an account management system that allows each user to authenticate and manage the studies they are involved in (e.g., view status, track necessary actions). For this demonstration project, we are assessing whether existing secure data platforms, within a future National Secure Data Service, can integrate with an open-source identity/account stack (e.g., Django-based authentication or comparable frameworks) or whether a separate, agency-approved solution is required.
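As a rough illustration of the account management capability described above, the sketch below models per-user study membership, status, outstanding actions, and an audit trail in plain Python. It is a hypothetical data model, not the project's design: the class names, status values, and action labels are assumptions, and authentication itself is left to whatever identity stack is ultimately selected.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Study:
    study_id: str
    status: str = "pending_approval"  # hypothetical status value
    members: set[str] = field(default_factory=set)
    pending_actions: dict[str, list[str]] = field(default_factory=dict)


@dataclass
class StudyRegistry:
    studies: dict[str, Study] = field(default_factory=dict)
    # Each audit entry is (UTC timestamp, user, action, study_id).
    audit_log: list[tuple[str, str, str, str]] = field(default_factory=list)

    def _audit(self, user: str, action: str, study_id: str) -> None:
        stamp = datetime.now(timezone.utc).isoformat()
        self.audit_log.append((stamp, user, action, study_id))

    def add_member(self, study_id: str, user: str, required_actions=()) -> None:
        """Enroll a user in a study (creating it if needed) with their to-do items."""
        study = self.studies.setdefault(study_id, Study(study_id))
        study.members.add(user)
        study.pending_actions[user] = list(required_actions)
        self._audit(user, "added_to_study", study_id)

    def dashboard(self, user: str) -> dict:
        """Return status and outstanding actions for each study the user belongs to."""
        self._audit(user, "viewed_dashboard", "*")
        return {
            s.study_id: {
                "status": s.status,
                "actions": list(s.pending_actions.get(user, [])),
            }
            for s in self.studies.values()
            if user in s.members
        }

    def complete_action(self, study_id: str, user: str, action: str) -> None:
        """Mark one of the user's outstanding actions as done; members only."""
        study = self.studies[study_id]
        if user not in study.members:
            raise PermissionError(f"{user} is not a member of {study_id}")
        study.pending_actions[user].remove(action)
        self._audit(user, f"completed:{action}", study_id)
```

A usage example: after `registry.add_member("study-1", "alice", ["sign_dua"])`, calling `registry.dashboard("alice")` lists that study with its status and the one outstanding action, and every call is appended to `audit_log`. Recording each operation in the log is the design choice that addresses the "auditable" requirement.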
- Data Governance and Access
-
- For commercial data governed by restrictive data-use and redisclosure controls, project teams should engage the data governance or legal stakeholders at project initiation and treat approval timelines as a critical-path dependency. In early outreach, it is important to use precise language that clearly distinguishes (a) internal project use vs. redisclosure to additional agencies/contractors, (b) access to schema/metadata vs. access to full datasets (including synthetic), and (c) who will have access, where data will be stored, and under what controls. For this demonstration project, data access approvals have remained unresolved after ~3 months. Future project plans should therefore allocate 4-6 months for commercial data acquisition that involves multi-party access and redisclosure constraints, especially when approvals span multiple organizations. In this case study, we initially expected data acquisition from J.P. Morgan Chase (JPMC) to be straightforward because the datasets are synthetic and publicly listed, and we had a supportive point of contact. In practice, however, JPMC's terms of use restricted downstream sharing, which conflicted with our multi-agency project structure; subsequent clarification requests were declined by the data governance team, creating a schedule impact. To mitigate schedule risk (particularly given year-end holiday slowdowns), the team is continuing to engage JPMC’s data governance stakeholders to identify a compliant access/sharing path while, in parallel, searching for alternative data sources as a contingency should JPMC ultimately be unable to support the project’s multi-party access requirements.
-
- For the other datasets in the project, data acquisition has progressed more efficiently because we had a clear path to the decision-makers and the U.S. Department of Veterans Affairs (VA) saw clear value in participating, i.e., securely analyzing its dataset alongside others to benefit its users, veterans. A key accelerant was the team’s preparation of a concise, one-page partner brief that used VA’s own terminology and publicly available references, clearly mapped the project objectives to VA’s stated priorities, and articulated the mutual benefits of participation. Additionally, having the Western Institute for Veterans Research represented on the project team (and funded under the project) improved prioritization, responsiveness, and timeline adherence. For future efforts, we should explicitly align participation benefits for each data contributor, package the ask in the partner’s language with a brief, evidence-backed value proposition, and ensure early engagement with the correct owners of data access and approval pathways.
-
- Future projects should treat data rights and governance alignment as a formal prerequisite to cross-organization linkage work and plan accordingly. This does not necessarily require completing all approvals before submitting a project proposal; rather, proposals should scope and resource these activities as early-phase tasks with clear entry/exit criteria, and allocate a realistic schedule margin for multi-party review cycles. Specifically: 1) confirm the data use and redisclosure terms early and document them as an entry criterion for cross-agency analysis; 2) obtain full stakeholder buy-in and documented commitment; and 3) ensure that empowered points of contact are identified at each partner organization and that incentives and accountability (deliverables/timelines and, when applicable, funded participation) are aligned to support on-schedule execution.
Disclaimer: America’s DataHub Consortium (ADC), a public-private partnership, implements research opportunities that support the strategic objectives of the National Center for Science and Engineering Statistics (NCSES) within the U.S. National Science Foundation (NSF). These results document research funded through ADC and are being shared to inform interested parties of ongoing activities and to encourage further discussion. Any opinions, findings, conclusions, or recommendations expressed above do not necessarily reflect the views of NCSES or NSF. Please send questions to [email protected].




