CGCS presents Big Data in India: Transparency & Accountability vs. Privacy & Protection. In this piece, originally posted in blog format on February 12th, 2012, incoming CGCS visiting scholar Malavika Jayaram provides in full her insight on the privacy policy debates brought up by recent SMS- and social-media-driven events in India.

“Big data” projects generally, and biometric schemes in particular, have faced intense scrutiny and opposition the world over. Apart from posing fairly obvious threats to civil liberties and constitutional freedoms, they are prone to more practical errors around reliability, security, accuracy and access. India is one example of a country on the cusp of a radical shift towards increasing digitisation and e-governance, seemingly without much thought being given to the potential dangers of such technocratic ambitions. Controversial programs like the National Population Register, the Unique Identity Scheme, the NatGrid[1] and a host of other e-governance initiatives potentially expose physical data and intangible identities to immense threat. Some of these schemes are yet to receive parliamentary sanction, some duplicate each other’s efforts, and some appear to be simplistic technical solutions to more complex socio-political/economic problems. While the overarching impetus for rolling out sometimes hasty or ill-considered (and always hugely expensive) projects might well be benign rather than sinister, and while such projects might often be transformative if implemented with the right checks and balances in place (such as strong data protection and privacy laws, which India is still grappling with drafting), there is a compelling need to be cautious and thorough. Several critics of such grand projects have stressed the especial need to be nuanced and balanced in implementing technologies that seek to (or at least, have the side effect of) fundamentally alter the relationship between the citizen and the state.

The technologies have not been validated for this population size. The possibility of errors, leakage, scope creep and abuse (both online and offline) during the capture, transmission, storage and use of the data cannot be discounted. The federated architecture of such systems, with enrolment and service provision farmed out to a vast ecosystem of private and governmental registrars and other actors (of varying degrees of literacy, education and understanding of such schemes), puts sensitive personal information at risk of access and abuse by many. Devices can be spoofed, people can be made to register or authenticate under duress (an offline “human” subversion that no technology can pre-empt) and processes can fail.

To take a step back from flaws, errors and leakage (i.e. the “it simply won’t work as designed/conceived” risk), a more fundamental issue is whether this sort of massive information collection should be done at all. The risk of data leaking out and being misused raises a debate about the robustness of the methodology, processes and technologies, whether the collection is carried out by an inept bureaucracy or through the preferred solution of today: outsourcing to independent vendors. The former is normally callous about use of and access to data, and the latter is focused on maximizing its return on capital and harvesting the data for more commercial and marketing purposes. However, such massive data gathering exercises are usually justified on grounds of stated “social” goals. They seek to achieve increased efficiencies through centralized databases, more equitable distribution, more accurate targeting and delivery of government benefits, anti-terrorist and national security purposes and all manner of other worthy objectives. To put it very baldly, the argument often goes that the conservatives want to gather information allegedly for security reasons or to monitor deviant behavior (the public order or national interest argument that often trumps individual rights and liberties); the liberals because they have figured out what is good for you and want to implement it whether you like it or not (the nanny state problem). In India, there is another high-level goal, a laudable one if successfully achieved, and this is to eliminate or at least minimize the rampant and grotesque corruption that is endemic to Indian society. There is a desire to use data to expose the rot in the system, to disincentivize corrupt bureaucrats from taking bribes and siphoning monies, food grains and other public distribution away from their intended beneficiaries and recipients, as well as to track and prevent misuse and fraud at the customer end.

In an era of increasing pressure for governments and agencies to promote transparency, openness and accountability in India, it would be easy for privacy to fall by the wayside. Many do believe that it is a small price to pay for a larger goal that is deemed more useful in a country that is crippled by bribery, middlemen benefiting from fraudulent transactions, and significant hardship in carrying out the most basic transactions. The initial successes of the Right to Information regime led many to believe that greater openness would foster greater accountability. However, agencies are getting better at working around information requests or providing data of little value, and certain limitations of the RTI Act ensure that it is not as powerful a tool as was hoped.

It is often the case that there is an uneasy tension between transparency and accountability on the one hand and privacy (and perhaps security) on the other. However, it is not quite the zero-sum game that it is often portrayed as (i.e. you can have one or the other, not both). The issue can be better parsed if we think of transparency and accountability of state actors, versus the privacy of individuals and their sensitive personal information, and recognize that both serve important functions. In India, there are many complex issues around open data (e.g., people’s caste, religion and exact place in the social hierarchy can be revealed by something as simple as their name) and many instances of misuse. People with access to electoral databases have targeted and committed atrocities and hate crimes against a particular ethnic group or religious minority. Information about HIV-positive residents of a village has led to great ostracism, as such data is not anonymized.

While some presumed benefits of biometric and Big Data schemes appeal to nations paranoid about national security, immigration and porous borders, or seemingly offer a magic wand to clean up corruption, fraud and malaise – laudable goals, all – implementing them without the requisite checks and balances by way of privacy laws, data protection principles, informed consent and a robust public debate would be at best foolish, and at worst, dangerous. The naïve view that Indians, for example, simply don’t care about privacy when their basic needs of food, shelter, clothing, employment and healthcare are not met is not a tenable or compelling argument against being guarded and proportional in the use of open data and the drive towards greater accountability and transparency. On the contrary, in a country that is still struggling with crippling poverty, disparity, discrimination, ossified social structures and illiteracy, those in a position of power have an even greater responsibility to design systems to be privacy-preserving by default, to protect those who lack the sophistication or ease with technology to make decisions about the use of their data. Unless the citizen is put back in the centre of the discourse, the very logic of open data collapses. Transparency is not a value in and of itself; humanizing systems and focusing on the real people behind the numbers and metrics is of paramount importance.

[1] The “national intelligence grid”, which seeks to join up 21 disparate databases containing banking, healthcare, tax, police and other records and make them available on a searchable basis to law enforcement agencies as a counter-terrorism measure.

//Malavika Jayaram is a practising technology lawyer, Fellow at the Centre for Internet and Society, Bangalore and PhD candidate focusing on data and privacy.



