Generative AI and Data Protection:
What Are The Biggest Risks For Employers?

by Stephen Toland, Head of Austin Office, CIPP certified, Ekaterina Lyapustina, Data Privacy Specialist, CIPM certified, and Jared Maslin, Professor at University of California, Berkeley

 

In May 2023, Bloomberg broke the news to the tech world that Samsung Electronics Co., a pioneer in consumer electronics, was banning the use of generative AI tools, including ChatGPT, for its employees in the workplace. The abrupt policy change was introduced after Samsung engineers uploaded the company’s sensitive internal source code into ChatGPT. In so doing, Samsung’s engineers exposed the company’s valuable and private resources to the catastrophic risk of a ransomware attack or public dissemination of private intellectual property.

Samsung’s story isn’t unique. According to the Cisco 2024 Data Privacy Benchmark Study, internal and external data privacy and security concerns are leading many organizations to restrict the use of generative AI (GenAI). The numbers are stark: most companies are limiting GenAI use, while nearly a third of all companies have banned the use of GenAI altogether. What is behind this growing apprehension? Businesses are particularly worried about the threats GenAI poses to their legal and intellectual property rights, as well as the risk of confidential information leaking to competitors and the public at large.

While the concerns around data security and the implementation of GenAI are understandable, banning GenAI completely can be shortsighted. To fully understand the risks, let’s analyze the concerns raised by companies such as Samsung. Armed with a better appreciation of the risks, companies can craft solutions for adopting GenAI in a responsible way. When GenAI is implemented correctly, a company can reap the benefits afforded by GenAI without incurring unnecessary data security risk.

 

Biggest Risks of GenAI for Employers

GenAI is a double-edged sword for employers who want to automate operations while keeping their data security intact. Here are three significant risks employers face when using GenAI:

Legal and IP Theft

The potential misuse of GenAI to effectuate theft of a company’s intellectual property is a major concern. GenAI models are trained on vast datasets that can include copyrighted material. This raises concerns about AI tools generating near-replicas of protected works, like music or software code. In February 2023, Getty Images, a stock photo giant, filed a lawsuit against Stability AI, the creators of Stable Diffusion, a popular image-generating tool. Getty Images claims that Stability AI used nearly 12 million copyrighted images without permission to train Stable Diffusion. With a looming trial date, the parties remain at odds on the copyright suit. If asked privately, however, both sides would likely agree that GenAI has the potential to create unauthorized copies of copyrighted works that blur the lines of originality.

With the Getty suit pending, any changes to existing copyright laws remain unclear. What is clear, however, is the dogged determination of regulators to actively address concerns around IP theft and the use of GenAI. The EU AI Act and the rules of the U.S. Federal Trade Commission (FTC) are just two examples of regulatory efforts to address issues related to AI and IP theft. The EU AI Act includes a transparency obligation mandating the labeling of AI-generated outputs to prevent confusion for users of that content and misappropriation of the underlying source material. Similarly, the FTC has signaled that the failure to label GenAI-created products could be considered deceptive under the FTC Act.

Data Leaks & Unintentional Sharing

Another significant risk when using GenAI involves data security breaches. Employees might inadvertently upload sensitive company data, like source code, customer information, or internal communications, to GenAI platforms, as happened with Samsung’s engineers. Once uploaded, the data is potentially exposed to unauthorized access or leaks during processing. This is particularly problematic when the underlying platform uses information it ingests to continually train and retrain the algorithm(s) at its core.

Privacy Violations

Finally, processing personal data through GenAI tools raises privacy breach concerns. GenAI tools can collect, process, or even disseminate personal data beyond the scope originally intended. Such breaches, in many cases, are considered to be violations of data privacy regulations like the EU’s GDPR, as well as multiple data privacy laws in the United States.

 

Steps to Safeguarding Company Data with GenAI

Here are 5 steps you can take to mitigate risk and protect your company’s sensitive information when using GenAI:

1. Classify & Control

Not all data is equal. Implement a system to categorize your data based on sensitivity. Highly confidential information, such as customer records or trade secrets, should carry stricter access controls, including limits on whether such data may be used in environments that host GenAI solutions. Restrict employee access to only the data needed to perform their specific jobs. Finally, document acceptable practices to establish unambiguous guardrails related to the use of specific types of information.
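In practice, a classification policy can be enforced in code before any data reaches a GenAI tool. The sketch below is a minimal, hypothetical illustration: the category names, sensitivity tiers, and the `allowed_in_genai` helper are all illustrative assumptions, not a reference to any real system. Note the default-deny behavior for unclassified data.

```python
from enum import Enum

class Sensitivity(Enum):
    PUBLIC = 1
    INTERNAL = 2
    CONFIDENTIAL = 3

# Hypothetical registry mapping data categories to sensitivity tiers.
DATA_CLASSIFICATION = {
    "marketing_copy": Sensitivity.PUBLIC,
    "internal_memo": Sensitivity.INTERNAL,
    "customer_records": Sensitivity.CONFIDENTIAL,
    "source_code": Sensitivity.CONFIDENTIAL,
}

# Policy guardrail: only data at or below this tier may reach GenAI tools.
GENAI_MAX_SENSITIVITY = Sensitivity.INTERNAL

def allowed_in_genai(category: str) -> bool:
    """Return True if data in this category may be sent to a GenAI tool."""
    # Unclassified data is treated as CONFIDENTIAL (default deny).
    tier = DATA_CLASSIFICATION.get(category, Sensitivity.CONFIDENTIAL)
    return tier.value <= GENAI_MAX_SENSITIVITY.value

print(allowed_in_genai("marketing_copy"))  # True
print(allowed_in_genai("source_code"))     # False
```

A real deployment would pull classifications from a governed data catalog rather than a hard-coded dictionary, but the gating logic stays the same.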

2. Educate & Empower

Educate your employees on the responsible use of GenAI tools. Highlight potential data security risks, thereby equipping your team with best practices for handling sensitive information. In addition, make learning a continual journey: consistently remind employees of their obligations and keep them informed about cutting-edge technology and industry changes.

3. Isolate & Monitor

Create isolated environments, often called “sandboxes,” specifically for GenAI tasks and technologies. These sandboxes should be separate from your main company network, include bespoke monitoring and logging, and implement strict guidelines around data usage and exfiltration in order to minimize the risk and potential impact of breaches.
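One concrete way to get the "bespoke monitoring and logging" described above is to route every GenAI call through a thin wrapper that records an audit trail. The sketch below is a hypothetical illustration, not any vendor's API: `send_fn` stands in for the real GenAI API call, and the audit-record fields are illustrative assumptions. Logging metadata (user, timestamp, prompt size) rather than the prompt text itself avoids copying sensitive content into the logs.

```python
import logging
from datetime import datetime, timezone

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("genai-sandbox")

class SandboxedGenAIClient:
    """Hypothetical wrapper that audits every prompt leaving the sandbox."""

    def __init__(self, send_fn):
        self._send = send_fn       # the actual GenAI API call would go here
        self.audit_trail = []      # in production, write to durable storage

    def prompt(self, user: str, text: str) -> str:
        # Record metadata only, not the prompt body, to keep logs clean.
        self.audit_trail.append({
            "ts": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "chars": len(text),
        })
        log.info("GenAI prompt from %s (%d chars)", user, len(text))
        return self._send(text)

# Usage with a stubbed model, for illustration:
client = SandboxedGenAIClient(send_fn=lambda t: "stub response")
reply = client.prompt("alice", "Summarize this public press release.")
```

Because all traffic funnels through one chokepoint, the same wrapper is a natural place to attach the classification and redaction checks from the other steps.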

4. Encrypt & Anonymize

Before feeding data into GenAI tools, implement foundational encryption to render information unreadable in the event of an interception in transit or a breach at rest. Furthermore, explore data de-identification and anonymization techniques to reduce personally identifiable information while preserving the data’s usefulness for GenAI processes.
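A simple form of de-identification is to scrub or pseudonymize PII before text ever leaves your environment. The sketch below is a minimal illustration using only standard-library tools; the regex patterns, salt, and token format are illustrative assumptions, and production systems should use purpose-built PII-detection tooling rather than two hand-rolled patterns. Hashing emails with a salt keeps each address consistently tokenized (so the GenAI output can still be re-linked internally) without exposing the address itself.

```python
import re
import hashlib

# Illustrative patterns only; real PII detection needs far broader coverage.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def pseudonymize(text: str, salt: str = "rotate-me") -> str:
    """Replace emails with stable salted-hash tokens; mask SSNs entirely."""
    def email_token(match: re.Match) -> str:
        digest = hashlib.sha256((salt + match.group()).encode()).hexdigest()[:8]
        return f"<EMAIL:{digest}>"
    text = EMAIL_RE.sub(email_token, text)
    return SSN_RE.sub("<SSN-REDACTED>", text)

safe = pseudonymize("Contact jane.doe@example.com, SSN 123-45-6789.")
print(safe)
```

Only the pseudonymized text would then be sent onward to the GenAI tool; the salt functions like a key and should be stored and rotated accordingly.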

5. Vet & Verify

Conduct thorough security and privacy audits of any GenAI vendors with which you plan to work. Assess their data security practices and ensure they meet your company’s compliance standards. Do not allow any GenAI tool access to your network until you are confident in the vendor’s security posture, as third-party exposure continues to be a critical area of unaddressed privacy risk.

 

Wrapping Up

GenAI sent the business world into a trance with the release of ChatGPT in November 2022. In many ways, business owners remain awestruck by the seemingly endless possibilities GenAI affords them. Slowly, however, the risks of incorporating this technology are becoming more apparent. A proactive and measured approach to GenAI is best. Stay vigilant, protect your business, and curb the use of GenAI when handling sensitive information.