We all remember the Life of Julia, where the Obama Administration laid out how government programs were going to affect someone's entire life. But in order for those programs and policies to be there to ferry Julia from one life stage to another, some government agency had to design and implement them. Someone had to anticipate future problems and create programs that would address them. In today's world you can only do that with data, which is collected, crunched, analyzed and finally used to justify policy. That data collection begins at birth and ends at death. A social security number is applied for at birth, which creates a permanent record for that individual. A death certificate is registered at the end of that life. In between, other data are collected: a student ID, a driver's license, a mortgage account, a credit report, a criminal record, a health record, etc. All this data tells the story of us. Or it would if it were all easily accessible in one place, which until now has not been possible.
Enter the Data Quality Campaign, whose goal is "to ensure that every citizen is prepared for the knowledge economy." In their most recent document, "Pivotal Role of Policymakers as Leaders of P–20/Workforce Data Governance," the DQC wrote, "Achieving this goal requires unprecedented alignment of policies and practices across the early childhood; elementary, secondary, and postsecondary education; and workforce sectors (P–20W). Consequently, many policy questions require data from multiple agencies to answer."
See, they need data from all these agencies in order to answer policy questions about education. But they have a problem. Though states have independent databases that track the information policy makers claim they need (we'll get back to that in a minute) they run into "challenges" accessing this information due to: turf, time, technical issues, and trust.
Challenge 1 Turf - Data is power and money. One does not just casually hand that over to another agency just because the other agency has claimed a need for it. Those who currently manage the data "silos" need assurance that they will not lose control or have another entity assigned oversight on what they do. This is a reasonable concern, since education data collection, which started in the states, has had rules and restrictions placed on it by the states that cannot and should not be violated. DQC's response is to "define clear and distinct roles and responsibilities aligned to commonly established goals. This creates and fosters a culture of shared responsibility..."
Challenge 2 Time - There are only so many hours in a day and only so much money to pay people to manage all this data. And since all that money comes from taxpayers, regardless of whether it goes to a government employee or a government-contracted company, there need to be assurances in place that the time and money are well spent on data management.
Challenge 3 Technical Issues - Each agency defines its own data standards, protocols and procedures for data use, making sharing data difficult and inefficient. Here is where DQC can really shine, because their goal is to make all these databases talk to each other so sharing data across them is (they use the word efficient, but let's call it) easy. These inefficiencies and mismatches may be the last thing protecting your privacy, and DQC is working like bunnies to strip that away.
Challenge 4 Trust - "Agencies are concerned about how their data might be used once the data are linked, matched, and shared." How about parents? Mightn't they be concerned about how this data will be used once matched and shared? Throughout this entire document the people who really "own" this data, the children and those who speak for them, their parents, are never mentioned.
Maybe I came too late to the discussion. When was it discussed that the government had a right to collect and use personal data on every single American? That seems to already have been agreed upon by unelected bureaucrats who don't answer to parents. Here are the Board members of DQC.
- Tom Luce (Chair): Chairman, National Math and Science Initiative
- John Bailey: Director, Dutko Worldwide
- Tammi Chun: Policy Analyst, Office of the Governor, State of Hawaii
- Kathy Cox: CEO, U.S. Education Delivery Institute
- Kati Haycock: President, The Education Trust
- Bruce Hoyt: Former Board Member, Denver Public Schools Board of Education
- Sharon Robinson: President and CEO, American Association of Colleges for Teacher Education
- Bob Swiggum: Chief Information Officer, Georgia Department of Education
- Gene Wilhoit: Executive Director, Council of Chief State School Officers
Their process looks like this:
- Link Systems to allow for efficient matching of data that have been deemed necessary for specified purposes.
- Match Data to create datasets with connected records on the same individuals from two or more databases.
- Share information to provide participating agencies and institutions knowledge that was unavailable prior to the data matching.
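To make the "Match Data" step concrete, here is a minimal sketch of what connecting records on the same individual could look like. This is not DQC's actual method or any real agency's system; the field names, the shared `person_id` key, and every record below are invented purely for illustration.

```python
# Hypothetical records from two separate agency databases (all values invented).
school_records = [
    {"person_id": "A1", "student_id": "S-100", "graduated": True},
    {"person_id": "B2", "student_id": "S-101", "graduated": False},
]
labor_records = [
    {"person_id": "A1", "receives_unemployment": True},
    {"person_id": "C3", "receives_unemployment": False},
]


def match_records(left, right, key):
    """Join two lists of records on a shared key.

    Only individuals that appear in both lists are kept, and their
    fields are combined into a single connected record.
    """
    right_by_key = {row[key]: row for row in right}
    matched = []
    for row in left:
        other = right_by_key.get(row[key])
        if other is not None:
            matched.append({**row, **other})
    return matched


# The linked dataset now holds facts that neither agency had on its own.
linked = match_records(school_records, labor_records, "person_id")
for row in linked:
    print(row)
```

Notice that the whole thing hinges on both databases using the same identifier for the same person. That shared key is exactly what the "Link Systems" step is meant to supply, and it is also exactly what today's mismatched databases lack.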
There are circumstances where some data would be useful. How could colleges improve their course offerings if they didn't track how many of their graduates got jobs and in what fields? How would high schools know whether they were truly preparing their graduates for the real world if they didn't track how many went to college and how many got jobs?
The problem is more in the Field of Dreams area. If you build it, they will come. If you begin to create a completely integrated data stream of personal data (which everyone always refers to as lacking individually identifiable data, right?), with guidelines on how to set up new databases that can link to it and job descriptions that include making sure your data is compatible with the integrated system, you begin to create something so powerful that its governance should not be in the hands of any single individual or agency. Try preventing that from happening.
Most people only look at the privacy issues in terms of the individual databases. So what if someone knows my kid's student ID? Who cares if I'm part of the public record as someone who receives unemployment payments? But with groups like DQC working to connect all this data and develop policy on it, who knows what kinds of policies could be developed because of someone's interpretation of that data? Maybe a policy gets established that requires an automatic visit by Child Protective Services for every child whose parent has become unemployed, because past data showed a statistical potential for neglect when a parent loses a job.
The bigger issue is that government agencies will be self-directed by data to address problems that the public has not asked to be addressed. Our elected representatives could, in essence, be replaced by databases. Whatever efficiencies or solutions might be gained by creating such a system should be weighed heavily against the possibility of such systems being abused by someone you don't agree with. In addition, there should always be concern about such data being compromised, maybe even by entities outside the U.S. One of the key elements in the P-20 system is that it be accessible. That means, by definition, outside entities need to have a way in. There is no such thing as a completely secure system that needs broad access, and any honest IT person will confirm that. So how much data do we want to put in such a system? Has anyone asked us?