The Wire: New Delhi: Thursday,
January 26, 2017.
The ministry
of electronics and information technology carried out a public consultation for
an open license for government data use in August 2016. The draft license
prepared had provisions designed to suppress the right to information and hold no
government official accountable. Every major organisation’s submission for the
consultation opposed these provisions. However, these objections were ignored
by the department at the end of the consultation process.
There was
virtually no major change between the draft and the final license version,
essentially making the whole public consultation a waste of time.
India’s open
data ecosystem has not grown significantly over the last few years, especially
when compared to ecosystems seen in the European Union or the US, even though
the National Data Sharing and Accessibility Policy (NDSAP) was introduced in
2012.
The growth
around it was dampened by several factors along with government data being
under copyright in India. As all the data published through the open
government data platform carried a copyright notice, it created ambiguity
around licensing for data users. Almost four years after the data.gov.in portal
went live, an open government license that limits provisions granted to citizens
under the Right to Information Act is out.
Government Open Data Use License
The draft of
the Government Open Data Use License has been prepared by a committee led by
Suresh Chandra (Law Secretary, Department of Legal Affairs), including
representatives from several government departments, academia and civil society
organisations. The license applies to all data published under
NDSAP, and also to data published through data.gov.in. The license covers issues
like attribution, distribution and usage permissions, but there are certain
provisions which are normally not found in a license. Section (6) of the draft
license has the following parts:
“The license
does not cover the following kinds of data
(a) Personal
information;
(b) Data that
the data provider(s) is not authorised to license, that is data that is
non-shareable and/or sensitive
(f) Identity
documents; and
(g) Any data
that should not have been publicly disclosed for the grounds provided under
Section 8 of the Right to Information
Act, 2005”
The above
exemptions sound like guidelines aimed at data providers (to not publish
datasets that contain those types of data), but are clearly directed at data
users. The policy principles have not been followed in defining these
exemptions:
· Personal information can’t be published by a data
provider under the policy.
· How will a data user know if the data provider had
authorisation to release a data set?
· Identity documents such as birth and death certificates
are public. How is access to them restricted?
· Section 8 of the RTI Act deals with the state’s non-obligation to
release data to citizens and is clearly not applicable to a data user of the
license.
There are certain
issues which NDSAP did not cover during its formation and doesn’t clarify:
· It doesn’t clearly specify how the classification of
data should be done, instead letting departments take the decision, thus
creating ambiguity in data classification among data providers.
· It doesn’t specify how the negative list of sensitive
datasets should be managed, whether through
cyber security practice guidelines or policies.
· It doesn’t clarify how an oversight committee should
ideally monitor every new dataset being generated by various government
departments nor does it include provisions to safeguard data as an asset.
· The policy fails to recognise the necessity of data skills in government departments and
doesn’t mandate any capacity building mechanisms.
· The timelines envisioned in the policy were too short for
departments to act on it with enough consultations. The policy expected every
public dataset to be uploaded within one year of notification, i.e. by March
2013.
Clearly the
committee was hoping to rectify some of these issues within NDSAP by including
certain clauses in the license. They instead ended up making the license
complicated by mixing it with bizarre policy statements.
Warranty of Data
Section 4 of
the draft license has clauses disclaiming warranty and continuity for the
datasets being published. Any data published under this license has no
warranty. That basically implies you can’t make the data provider liable; the
data provider here is technically a government department and not an
individual alone. This clearly violates
our fundamental right to expression (by limiting access to information) and
to the disclosure of public records under Article 19(1) of the Constitution,
a right upheld by the Supreme Court in the case of Raj Narain vs State of UP.
Section 4,
clause (d) ‘No Warranty’ also states: “The data provider(s) are not liable for
any errors or omissions, and will not under any circumstances be liable for any
direct, indirect, special, incidental, consequential, or other loss, injury or
damage caused by its use or otherwise arising in connection with this license
or the data, even if specifically advised of the possibility of such loss,
injury or damage. Under any circumstances, the user may not hold the data
provider(s) responsible for: i) any error, omission or loss of data, and/or ii)
any undesirable consequences due to the use of the data as part of an
application/product/service (including violation of any prevalent law).”
Clause (e)
Continuity of Provision states: “The data provider(s) will strive for
continuously updating the data concerned, as new data regarding the same
becomes available. However, the data provider(s) do not guarantee the continued
supply of updated or up-to-date versions of the data, and will not be held
liable in case the continued supply of updated data is not provided”
These clauses
are outright against the rights guaranteed under the RTI Act: government
documents have some warranty and are definitely admissible in the courts. Every
public document out there could be brought under NDSAP and will potentially
fall under this license with a no-warranty clause. Clearly a dataset that
contains the text of various Supreme Court judgments has some warranty. The
scope of the license is too vague, and if it wants to be the default license for all
government data, if it wants to replace RTI, it needs a legal basis to do
so. The license is overstepping its boundaries and needs to remove the no-warranty
clause for government data. The Indian Customs and Central Excise Department,
for instance, shut down access to a high-value open dataset of every product
exported from and imported into the country after demonetisation, moving away
from providing a continuous supply of public information and data.
Data mismanagement and security
The National
Cyber Security Policy of 2013 rightfully identifies data leakages as a cyber
threat and sets safeguarding the privacy of citizen data as an objective. Personal
information or any other sensitive data typically falls under the negative list of
NDSAP and thus clearly needs to be handled with care, potentially by encrypting
it. While India’s draft encryption policy was a disaster, we still have the
Information Technology Act, which mandates the central government to provide
minimum guidelines to be followed to secure data from theft; there have been
none so far. But a license is no place to announce these intentions or
restrictions on a potentially published/leaked dataset.
Data has been
mismanaged countless times by government officials, but publishing personal
information, knowingly or accidentally, and then trying to regulate it through a
license is unheard of in practice. During a hackathon in 2015, Bangalore Police
released the call data records of people who were potentially under investigation
and called it ‘open data’. On the launch day of Sikkim’s open data portal, two
datasets revealing names, religion, caste and other personal information of
students and teachers in Sikkim were released. During the public consultation on
net neutrality in India, the Telecom Regulatory Authority of India (TRAI)
published the email addresses of every respondent. All the datasets in question
violate the very definition of ‘open data’ and were reported responsibly and
taken down. Accidents like these can happen again and will need a legal framework
to stop them. Current frameworks do not let data users appropriately
report cyber incidents to any authority at all.
Passing the
responsibility of not accessing sensitive data on to the data user, and making
the distributor not liable, threatens every user and the community that is
built around that data. To quote another example, in 2014 the government started
publishing details of every RTI request. In doing so it was exposing
personal details of individuals making the RTI requests. When questioned about
the personal information being published as part of the RTI requests, officials
were quick to respond that they couldn’t afford to redact the personal details due to
lack of resources. Yet Indian Railways was found digitising the names and addresses of
individuals who filed RTIs. An RTI request I made is accessible to the public
through the railways’ search portal. The railways should be made liable for publishing
personal information on the web without providing enough security. Incidents
like these may stop people and activists from filing RTIs; security through
obscurity won’t help us.
User and citizen rights
Government
departments inevitably have personal information about India’s citizens. Some
of this personal information, like electoral rolls, is public and has been
easily accessible by marketing agencies or political parties for their own data
needs and analysis.
Initiatives
around Aadhaar and Digital India are creating multiple interconnected
databases, which are prone to all sorts of data mismanagement issues. For
example, most details around a student’s performance are being recorded in
certain states. Someone from Hyderabad used education board data to lure young
girls under the pretext of counselling and ended up sexually exploiting them.
What can you do as a parent to stop your child’s data becoming public without
any laws in place?
Digital
rights have been at the centre of debates around the Internet with the rise of apps,
which more often than not either obfuscate a consumer’s rights or simply take
them away. Service providers on the
Internet often have complex terms and conditions, which take away your basic
rights and give them unlimited leverage over you and your data. Dropbox,
Facebook and Google can, at the drop of a hat, suspend access to your own account,
making you lose your personal data without giving any reason at all.
It is
unfortunate that open data is also being crippled with these strange terms and
conditions, designed to make sure that there is no accountability if mistakes
are committed by government authorities.
The recently
released BHIM app also fails to acknowledge the usage of open source software
anywhere in the application.
The
disclosure and usage of software licenses in government applications is low;
most officials may not even be aware that such licenses exist and seem to
always claim copyright over open source work. In a democracy, public data and
documents help bring transparency and accountability. But restricting the usage
of these documents and data can harm us rather than help empower citizens. At the same
time government departments and agencies are digitising their records faster
than ever before, collecting personal sensitive data at every stage.
Initiatives like Digital India and the Smart Cities mission can be both boon
and bane, helping digitally empower citizens while also creating new problems
of security, privacy or a new digital inequality for them. Open data is
a relatively new concept and can harm developing countries if we don’t tread
carefully. What works for the West may not necessarily work for us. Open data
is “my data, your data and our data”. Let’s be careful with it and make sure we
as citizens have our say in keeping officials transparent and accountable.
Disclosure:
The author is co-founder of Open Stats, a startup focused on open data.
Srinivas
Kodali is an interdisciplinary researcher working on issues of cities, data and
the internet. He volunteers with internet movements and communities.