Information Security on a Budget: Data Classification & Data Leakage Prevention

A guest post by Zouheir Abdallah, CISA, Cyber Security – Senior Specialist Q-CERT/ictQatar

This article first appeared in the “Towards a Secure Cyber Space” special publication booklet 2014-2015 of the Research & Strategic Studies Center (RSSC), Lebanese Armed Forces.

Not long ago, protecting information was as easy as locking documents in suitcases and safes. Physical security was all that was needed to safeguard a top-secret document. Today, Word documents have taken the place of printed papers, folders have replaced suitcases, and hard drives are now the preferred storage medium (the modern safes).

Data is no longer confined to a single place. Files can now be effortlessly modified, copied, and transmitted, and a trained smuggler can do all of this without leaving a trace.

There is no denying that modern technology has made our lives much easier, but such ease comes at a price. New technology presents new challenges: the challenge of maintaining organizational secrets, the challenge of safeguarding organizational data from modification and loss of integrity, and the challenge of ensuring that the data is available when needed.

These challenges are summarized in what is known as the CIA triad: Confidentiality, Integrity, and Availability, all of which are at the heart of information security. Information security is the process of protecting the confidentiality, integrity, and availability of information.

Confidentiality refers to limiting information access and disclosure to authorized users (“the right people”) and preventing access by, or disclosure to, unauthorized ones (“the wrong people”). For example, if an unauthorized employee is able to view payroll data, this is a loss of confidentiality. Similarly, if an attacker is able to access a customer database including names and credit card information, this is also a loss of confidentiality.

Integrity refers to the trustworthiness of information resources, and the loss of integrity means that the information has been modified or destroyed. For example, if a file is infected with a virus, the file has lost its integrity. Similarly, if a message within an email is modified in transit, the email has lost its integrity.

Availability means that information is accessible to authorized users whenever they need it. Information that is not available when you need it is almost as bad as no information at all. It may be much worse, depending on how reliant the organization has become on a functioning computer and communications infrastructure.

Protecting against loss of Confidentiality

Organizations protect against loss of confidentiality with access controls. For example, users are first required to log in, and access is then granted based on their proven identity. In short, users are granted access to data via permissions. If users do not have permissions, they are denied access.
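The permission check described above can be sketched in a few lines of Python. This is only an illustration: the user names, file names, and the PERMISSIONS table are hypothetical, and a real organization would rely on the access controls of its operating system, directory service, or applications rather than on a hand-written table.

```python
# Hypothetical users, resources, and permissions, for illustration only.
PERMISSIONS = {
    "alice": {"payroll.xlsx": {"read", "write"}},
    "bob": {"newsletter.docx": {"read"}},
}


def is_allowed(user: str, resource: str, action: str) -> bool:
    """Return True only if the (already authenticated) user was granted
    the requested action on the resource. Anything not granted is denied."""
    return action in PERMISSIONS.get(user, {}).get(resource, set())


print(is_allowed("alice", "payroll.xlsx", "read"))  # alice holds this permission
print(is_allowed("bob", "payroll.xlsx", "read"))    # bob has no entry for payroll
```

The important design choice is the default: a user or resource that does not appear in the table gets no access at all.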

Encryption is also used to assure confidentiality. Encryption changes clear data into ciphered data that cannot be read. The only way that the encrypted data can be read is by decrypting it with an electronic key, which should be properly secured. Anyone with access to this key can decrypt the encrypted data and change it back to clear text.

Data can sometimes be intercepted while in transit, and if it is not properly protected, its confidentiality can be lost. For that reason, data should be encrypted whenever it is being moved around. Additionally, data at rest (stored data) should preferably also be encrypted, since the devices that hold it could be stolen or lost.

Protecting against loss of Integrity

One way to ensure the integrity of data is hashing. A hash is a fixed-length value calculated from the data using a mathematical function, and hashing is the process of calculating it. With a strong hash function, it is practically impossible for two different pieces of data to produce the same hash. As long as the data has not changed, its hash will remain the same (when calculated using the same function).

As an example, if you calculate the hash of “RSSC-Revue”, it will be different from the hash of “RSSC-Revuu”. So by comparing two hashes, we can verify whether the data is identical or has been altered.
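The comparison can be tried with Python's standard hashlib module. The article does not name a particular hash function, so the choice of SHA-256 here is an assumption, and the helper name sha256_hex is made up for this sketch.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 hash of the text as a hexadecimal string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


original = sha256_hex("RSSC-Revue")
altered = sha256_hex("RSSC-Revuu")

print(original)
print(altered)
# The same input always produces the same hash...
print("unchanged data matches:", original == sha256_hex("RSSC-Revue"))
# ...while changing a single character produces a different one.
print("alteration detected:", original != altered)
```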

Protecting against loss of Availability

Backups are one of the many methods that organizations should use to ensure that important data is available for restoration in case the original data becomes corrupt. A backup is a copy of the data, and ideally backups should be stored separately from the original so that one set is available in case the other is not.
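As a rough sketch of the idea, the Python snippet below copies a file to a separate backup folder and uses a hash to confirm that the copy matches the original. The function names and the folder layout are assumptions made for this example; real backup tools add scheduling, versioning, and off-site storage.

```python
import hashlib
import shutil
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Return the SHA-256 hash of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def backup_file(source: Path, backup_dir: Path) -> Path:
    """Copy source into backup_dir and verify that the copy is identical."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    copy = backup_dir / source.name
    shutil.copy2(source, copy)
    if file_sha256(source) != file_sha256(copy):
        raise IOError(f"backup of {source} does not match the original")
    return copy


def restore_file(copy: Path, destination: Path) -> None:
    """Restore a file from its backup copy, overwriting the destination."""
    shutil.copy2(copy, destination)
```

In practice, backup_dir would point to a different disk or location than the original, so that a single failure cannot destroy both sets.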

Data Classification

A data classification program is an extremely important first step in building a secure organization. Classifying data is the process of categorizing data assets according to their sensitivity and their value to the organization. For example, data might be classified as public, internal, confidential (or highly confidential), restricted, regulatory, or top secret.

Data and information assets are classified according to the risk posed by unauthorized disclosure (e.g. being lost or stolen, intentionally or unintentionally). High-risk data, typically classified “Confidential”, requires a greater level of protection, while lower-risk data, possibly labeled “Internal”, requires proportionately less protection. Public data, typically classified as “Public”, requires no protection against disclosure (e.g. public press releases). In short, data classification gives you an overview of how your data should be protected. The riskier the data, the more confidential it is, and the more protection it should get.

Data classification is not necessarily a complex issue. It can be initiated on a personal level by following the company’s policies and using logic and common sense. Ideally in an organization, all data and information are classified as internal by default. That means that only employees should handle this data, and no outsider should have access to it. Public data contains no sensitive information and is meant for the general public, e.g. press releases. Confidential data, on the other hand, is highly sensitive data intended for limited, specific use by a workgroup, department, or group of individuals with a legitimate need to know.
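One way to picture such a scheme is as an ordered list of levels, each mapped to handling rules. The Python sketch below uses the three levels discussed in this article; the specific rules in the HANDLING table are hypothetical examples, not a recommended policy.

```python
from enum import IntEnum


class Classification(IntEnum):
    """Classification levels, ordered from least to most sensitive."""
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2


# Hypothetical handling rules per level, for illustration only.
HANDLING = {
    Classification.PUBLIC: {"encrypt_at_rest": False, "external_sharing": True},
    Classification.INTERNAL: {"encrypt_at_rest": False, "external_sharing": False},
    Classification.CONFIDENTIAL: {"encrypt_at_rest": True, "external_sharing": False},
}


def needs_more_protection(a: Classification, b: Classification) -> bool:
    """Return True if data at level a is more sensitive than data at level b."""
    return a > b


print(HANDLING[Classification.CONFIDENTIAL])
print(needs_more_protection(Classification.CONFIDENTIAL, Classification.INTERNAL))
```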

It is very important for an organization to adopt a common set of terms, and relationships between those terms, in order to communicate clearly and begin to classify data types. Consequently, it is highly important that all data is labeled and that the classification is explicit and clear.

The label can be thought of as a code that the author uses to tell the users of the data how it should be protected, based on its sensitivity. For example, when data is labeled “Confidential”, one communicates to all custodians and users of the data that it is only to be seen by those with a need to know. When one labels it “Top Secret”, one asserts that, among other measures, the data should be locked up when not in use.
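Labels are also what make simple automated checks possible. The Python sketch below assumes a hypothetical convention in which the first line of every document reads “Classification: <level>”, and refuses to release anything that is not explicitly labeled public. Both the convention and the function names are invented for this example.

```python
import re
from typing import Optional

# Hypothetical convention: the first line reads "Classification: <level>".
LABEL_PATTERN = re.compile(r"^\s*Classification:\s*(\w[\w ]*?)\s*$", re.IGNORECASE)


def read_label(document: str) -> Optional[str]:
    """Return the lower-cased label from the first line, or None if absent."""
    lines = document.splitlines()
    match = LABEL_PATTERN.match(lines[0]) if lines else None
    return match.group(1).lower() if match else None


def may_leave_organization(document: str) -> bool:
    """Allow release only for documents explicitly labeled public.
    Unlabeled documents are treated as internal and therefore blocked."""
    return read_label(document) == "public"


print(may_leave_organization("Classification: Public\nPress release text"))
print(may_leave_organization("Classification: Confidential\nPayroll data"))
print(may_leave_organization("No label at all"))
```

Blocking unlabeled documents follows the principle stated earlier that data should be considered internal unless it has been classified otherwise.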

Data Loss Prevention

Data loss occurs when information is lost due to an intentional action, an unintentional action, a failure, a disaster, or a crime. Data loss prevention employs several techniques and technologies to prevent the loss of data; it is a shared responsibility of everyone in the organization and is usually fairly straightforward.

The majority of data loss occurs due to human error, and thus addressing the human factor through awareness and education is probably the most effective method of minimizing data loss.

A couple of years ago, a contractor working for an international intelligence agency forgot a suitcase full of backup tapes on a train. The tapes contained personal information about all agency employees, contacts and overseas informants.

The extremely sensitive personal data included Social Security Numbers, home addresses, information about family members, phone numbers, dates of birth, medical information, bank account numbers, employment information, driver’s license numbers, passport numbers, and biometric information.

Such accidents could have been avoided if the contractor had been better aware of the consequences of data loss and of how to protect the data he was handling.

We don’t need to be working for an intelligence agency to practice data loss prevention. As a matter of fact, we are prone to losing data on a daily basis using much more rudimentary technologies. Removable storage devices, USB drives, mobile devices, tablets, and laptops all pose a graver risk, as they are much more widespread in organizations, and most often these devices are transporting critical information, unprotected.

Organizations with limited resources can kick off their information security programs with initiatives that require few resources. Information security awareness, data classification, data labeling, DLP, and other topics that address the human factor in information security are quick wins that are sure to lay the proper basis for an effective and successful organizational information security program.
