Most techniques and tools for converting data to a common standard assume that the team working out how to convert the data will also have access to it. We wanted a central team to ensure the data was converted consistently, but we did not want this team to have access to the data, so that the confidentiality of the data hosted by each Data Partner was maintained. We therefore needed to research a new way of undertaking data conversion in which the central team would never have access to the data.
Data Partners run an open-source tool called WhiteRabbit on a pseudonymised version of their data. This generates a report containing metadata about the tables, fields and values. The Data Partner always retains control of what data WhiteRabbit can access and of how its parameters are configured. Once the Data Partner has checked that the report contains only metadata (no identifiable or row-level data and no small numbers), they share it with the CO-CONNECT technical team. The CO-CONNECT technical team uses the metadata to develop a set of rules to apply to the data and shares these rules back with the Data Partner. The Data Partner then runs our software, which applies the rules and outputs the data in a standard format, and checks that the data has been transformed appropriately.
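The workflow above can be illustrated with a minimal Python sketch. It is not WhiteRabbit or the CO-CONNECT software: the function names, the rule format, the `sex` field and the small-number threshold of 5 are all assumptions made for illustration. The point it shows is that the central team's step receives only value frequencies, never the rows themselves.

```python
from collections import Counter

MIN_COUNT = 5  # hypothetical small-number threshold, chosen for illustration


def scan_report(rows, min_count=MIN_COUNT):
    """Data Partner side: summarise each field as value -> frequency.

    Values seen fewer than min_count times are left out of the report.
    """
    report = {}
    for field in rows[0]:
        counts = Counter(row[field] for row in rows)
        report[field] = {v: n for v, n in counts.items() if n >= min_count}
    return report


def write_rules(report):
    """Central team side: works from the metadata report only."""
    # Hypothetical mapping from source values to standard terms.
    standard = {"M": "MALE", "F": "FEMALE"}
    return {"sex": {v: standard[v] for v in report["sex"] if v in standard}}


def apply_rules(rows, rules):
    """Data Partner side: apply the shared rules locally to the real rows."""
    converted = []
    for row in rows:
        # Values with no rule become None so the Data Partner can review them.
        converted.append(
            {field: mapping.get(row[field]) for field, mapping in rules.items()}
        )
    return converted


if __name__ == "__main__":
    rows = [{"sex": "M"}] * 6 + [{"sex": "F"}] * 7 + [{"sex": "X"}]
    report = scan_report(rows)      # shared with the central team
    rules = write_rules(report)     # shared back with the Data Partner
    print(report)
    print(apply_rules(rows, rules)[:2])
```

In this sketch the rare value `X` never appears in the report, so no rule is written for it, and the affected row surfaces as `None` in the output for the Data Partner to review during their final check.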