Project Name: Linking State and Federal Data to Strengthen Federal Program Evaluation

Contractor: The Urban Institute

Lessons Learned

During the project’s first quarter, Urban submitted and revised detailed analytic plans for the two major project tasks: Task 1, conducting a qualitative study on barriers and opportunities for federal–state data linkages, and Task 2, executing privacy-preserving record linkage (PPRL) within the National Secure Data Service (NSDS).

A key lesson learned is that successful PPRL projects require extensive coordination across legal, procedural, and technical governance domains. Understanding the privacy and security capabilities of existing systems proved challenging, particularly when engaging stakeholders with varied technical expertise. To address this, we tailored outreach materials to distinct audiences to ensure consistent interpretation of project goals, privacy requirements, and technical constraints. This approach improved our ability to assess system readiness and identify logistical barriers early in the PPRL process.

Another major challenge was obtaining timely access to raw data; access is often delayed by the high coordination burden inherent in PPRL projects. To mitigate this bottleneck, we developed an end-to-end PPRL simulation using synthetic data informed by publicly available sources. This workaround enabled parallel progress on code development and governance planning while formal data access processes were underway.
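To make the idea of an end-to-end simulation concrete, the sketch below shows one way such a pipeline could be structured. It is illustrative only and is not the project's actual code: it assumes a keyed Bloom-filter encoding of character bigrams compared with Dice similarity, which is one common PPRL technique, and every name in it (`SECRET_KEY`, `encode`, `link`, `make_synthetic_records`, the street list, the 0.8 threshold) is hypothetical.

```python
import hashlib
import hmac
import random

# All constants below are illustrative assumptions, not project settings.
SECRET_KEY = b"key-shared-by-both-data-owners"
FILTER_BITS = 256   # length of each Bloom filter
NUM_HASHES = 4      # hash functions applied per bigram


def bigrams(text):
    """Split a padded, lower-cased string into its set of character bigrams."""
    padded = f"_{text.lower().strip()}_"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def encode(text):
    """Return the set of bit positions set in a keyed Bloom filter for text."""
    bits = set()
    for gram in bigrams(text):
        for seed in range(NUM_HASHES):
            message = f"{seed}:{gram}".encode()
            digest = hmac.new(SECRET_KEY, message, hashlib.sha256).digest()
            bits.add(int.from_bytes(digest[:4], "big") % FILTER_BITS)
    return bits


def dice(a, b):
    """Dice similarity between two sets of bit positions."""
    if not a and not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


def make_synthetic_records(n, seed=0):
    """Generate made-up address strings; no real data is involved."""
    rng = random.Random(seed)
    streets = ["Maple St", "Oak Ave", "Pine Rd", "Cedar Ln"]
    return [f"{rng.randint(100, 9999)} {rng.choice(streets)}" for _ in range(n)]


def link(encoded_a, encoded_b, threshold=0.8):
    """For each record in A, keep its best-scoring record in B above threshold."""
    matches = []
    for i, filter_a in enumerate(encoded_a):
        j, score = max(
            ((j, dice(filter_a, filter_b)) for j, filter_b in enumerate(encoded_b)),
            key=lambda pair: pair[1],
        )
        if score >= threshold:
            matches.append((i, j, score))
    return matches


if __name__ == "__main__":
    records = make_synthetic_records(5)
    # Each party encodes locally; only the encodings are compared.
    for i, j, score in link([encode(r) for r in records],
                            [encode(r) for r in records]):
        print(i, j, round(score, 2))
```

Because each party only needs the shared key and the encoding routine, a simulation of this shape can be exercised on synthetic inputs while data access agreements are still being negotiated.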

Lessons from this quarter highlight opportunities for NSDS to strengthen its role beyond providing secure computing infrastructure. Most federal–state linkages rely on shared unique identifiers or common PII fields; however, for this project, address data provided by states is likely to be central due to both the nature of HUD data and the incompleteness of state records. While PPRL systems can be designed to handle address data, many existing open-source solutions (such as those offered by MRAIA) do not currently support address pre-processing under robust security certifications.

These findings suggest potential value in NSDS investing in PPRL software and infrastructure that supports secure address pre-processing. The broader utility of such an investment depends on whether this challenge is common across federal–state linkages or largely specific to HUD and CCDF data. Combining insights from Task 1 (qualitative interviews) and Task 2 (technical PPRL testing) will help NSDS determine where standardized tools would have the greatest impact.
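As a rough illustration of what address pre-processing involves, the sketch below standardizes free-text addresses before they would be encoded for linkage. It is a minimal example under stated assumptions, not a description of any existing tool: the `ABBREVIATIONS` table and the `normalize_address` function are hypothetical, and a production system would need far broader rules and would have to run inside an appropriately certified environment.

```python
import re

# Hypothetical, deliberately small lookup table; real address
# standardization needs a much larger set of rules.
ABBREVIATIONS = {
    "street": "st",
    "avenue": "ave",
    "road": "rd",
    "lane": "ln",
    "apartment": "apt",
    "north": "n",
    "south": "s",
    "east": "e",
    "west": "w",
}


def normalize_address(raw):
    """Lower-case, strip punctuation, and abbreviate common address words."""
    text = raw.lower()
    text = re.sub(r"[^\w\s#]", " ", text)  # replace punctuation (except '#') with spaces
    text = text.replace("#", " apt ")      # treat '#' as a unit designator
    tokens = [ABBREVIATIONS.get(token, token) for token in text.split()]
    return " ".join(tokens)


if __name__ == "__main__":
    print(normalize_address("123 North Maple Street, Apartment 4B"))
    print(normalize_address("123 N. Maple St. #4B"))
```

Both example inputs reduce to the same string, which is the property that matters for linkage: two differently formatted versions of one address should produce identical or near-identical encodings.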

More broadly, NSDS could play a proactive role in improving data interoperability by facilitating the collection and dissemination of metadata and linkage-quality assessments. This would make potential linkages more discoverable while helping researchers anticipate match quality issues and mitigate risks associated with low-quality or incomplete identifiers.
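One form a linkage-quality assessment could take is a set of standard match-quality metrics computed against a sample of known true matches. The sketch below is illustrative only; the function name `linkage_quality` and the pair-of-record-IDs input format are assumptions, and the source does not specify which metrics NSDS would report.

```python
def linkage_quality(predicted_pairs, true_pairs):
    """Compute precision, recall, and F1 for a set of predicted record pairs.

    Both arguments are iterables of (id_in_file_a, id_in_file_b) tuples.
    """
    predicted = set(predicted_pairs)
    truth = set(true_pairs)
    true_positives = len(predicted & truth)

    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    predicted = [(1, 1), (2, 2), (3, 4)]
    truth = [(1, 1), (2, 2), (3, 3), (4, 4)]
    print(linkage_quality(predicted, truth))
```

Published alongside dataset metadata, summary figures of this kind would let researchers gauge expected match quality before committing to a linkage.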

Disclaimer: America’s DataHub Consortium (ADC), a public-private partnership, implements research opportunities that support the strategic objectives of the National Center for Science and Engineering Statistics (NCSES) within the U.S. National Science Foundation (NSF). These results document research funded through ADC and are being shared to inform interested parties of ongoing activities and to encourage further discussion. Any opinions, findings, conclusions, or recommendations expressed above do not necessarily reflect the views of NCSES or NSF. Please send questions to [email protected].