AI Needs Data and Data Needs Governance

Supreeth Rao
Grace Rotondo, Director Routes to Market - Theom

“We shape our AI; thereafter, it shapes us. Why it's so important that we be thoughtful and intentional about how we train, tune, and safeguard AI.” - Clara Shih, CEO, Salesforce AI (1)

The generative AI landscape is changing at an unprecedented pace. While exciting, that pace can lead to ethical lapses that put society at risk. As Clara Shih alluded to, the data we feed to AI models will, in turn, define what our world becomes. Dr. Gary Marcus echoed a similar sentiment, adding that “the data on which LLMs are trained can have bias effects on the model output,” so much so that “bad actors can use these systems for deliberate abuse, from spreading harmful medical misinformation or disrupting elections, which could gravely threaten society.” (2) With such serious consequences at stake, we are responsible for protecting our data from bias and bad actors so that what we feed to AI models benefits our future rather than harming it.

How does data become biased, and how can bad actors misuse it? 

Though “data-driven” often connotes fact and objectivity, data is not always objective. It can become biased in several ways: parts of the dataset may be overrepresented, outliers may skew the results, or the data may come from a biased sample. When LLMs consume biased data, their outputs reinforce existing biases, fueling what can become a harmful cycle.
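
To make the overrepresentation point concrete, here is a minimal Python sketch that compares each group's share of a dataset against an even split. The function name, the sample labels, and the 50% tolerance are all illustrative assumptions, and an even split is only one possible baseline; a real check would compare against the population the data is meant to represent.

```python
from collections import Counter

def representation_report(labels, tolerance=0.5):
    """Compare each group's share of the data with an even split.

    `tolerance` is a hypothetical relative threshold: a group is flagged
    when its share differs from the even-split share by more than 50%.
    """
    counts = Counter(labels)
    if not counts:
        return {}
    total = sum(counts.values())
    expected = 1 / len(counts)  # share each group would have if balanced
    report = {}
    for group, n in counts.items():
        share = n / total
        if share > expected * (1 + tolerance):
            status = "overrepresented"
        elif share < expected * (1 - tolerance):
            status = "underrepresented"
        else:
            status = "balanced"
        report[group] = (round(share, 3), status)
    return report

# Illustrative sample: one region dominates the dataset.
sample = ["north"] * 70 + ["south"] * 20 + ["east"] * 10
report = representation_report(sample)
for group, (share, status) in report.items():
    print(f"{group}: {share:.0%} ({status})")
```

A check like this only surfaces imbalance in a labeled attribute. It says nothing about outliers or about how the sample was collected, which need their own review.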

Bad actors can also intentionally create biased data or spread harmful misinformation. For example, LLMs like ChatGPT can repeat dangerous health misinformation founded on Chinese propaganda that has been disguised with generic, Western-sounding phrases.

These examples demonstrate just how critical data governance is in the era of AI. While AI thrives on data to make informed decisions, data, in turn, requires stringent governance to ensure its integrity, security, and usability. This is where Theom comes into play. Designed as a cloud data access governance and security platform, Theom is the linchpin that holds AI and data together in a secure and governed ecosystem.

Built to integrate seamlessly with Snowflake, Databricks, Azure, and AWS, Theom follows your enterprise data from the inside, providing a comprehensive understanding of your data and every data access event. It identifies where the most sensitive data resides and who is accessing what, how, and when, providing insights into over-provisioning, atypical usage patterns, and policy violations. These insights help ensure that only authorized personnel can access sensitive data and AI models and that your AI models are secure and compliant with relevant regulations. This governance, in turn, helps ensure that the data being fed into your AI models is reliable and less prone to bias.
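
As a rough illustration of what an over-provisioning insight means, the Python sketch below compares the tables each user has been granted with the tables they actually touched in an access log. It is not Theom's implementation; the function name, user names, table names, and data shapes are hypothetical.

```python
def unused_grants(granted, access_events):
    """Return, per user, the granted tables that never appear in the access log.

    granted: dict mapping user -> set of tables the user may read (assumed shape)
    access_events: iterable of (user, table) pairs from an access log (assumed shape)
    """
    used = {}
    for user, table in access_events:
        used.setdefault(user, set()).add(table)

    findings = {}
    for user, tables in granted.items():
        unused = tables - used.get(user, set())
        if unused:
            findings[user] = sorted(unused)
    return findings

# Illustrative data only.
granted = {
    "analyst_a": {"sales", "customers_pii", "payroll"},
    "analyst_b": {"sales"},
}
access_events = [
    ("analyst_a", "sales"),
    ("analyst_b", "sales"),
    ("analyst_a", "sales"),
]
findings = unused_grants(granted, access_events)
print(findings)
```

Grants that go unused over a long enough window are candidates for review, not automatic revocation, since some access is legitimately rare.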

Theom’s governance over data access extends to data sharing as well. While data sharing enables collaboration and data-driven decision-making, it also opens avenues for potential unauthorized access and data breaches. Effective sharing governance ensures that data is shared securely with the right people and for the right reasons. Theom tracks how your data is being shared, who it’s being shared with, when it’s being shared, and how your data is being changed by those it’s being shared with. Theom’s robust sharing governance covers data sharing in clean rooms, data exchange marketplaces, and anywhere in between.

In addition to data access governance, Theom protects against insider threats to your enterprise data. Driven by AI, Theom automatically detects data access risks, such as anomalies in login behavior, and remediates them using Security Information and Event Management (SIEM) and Security Orchestration, Automation, and Response (SOAR) actions. Theom also has built-in auto-remediations. This proactive approach ensures that potential threats are neutralized before they escalate into breaches.
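
For readers unfamiliar with login anomaly detection, here is a deliberately simple Python sketch that flags a day's login count when it sits far from a user's historical average. It uses a plain z-score, not Theom's detection logic; the function name, the threshold of 3 standard deviations, and the sample counts are assumptions chosen for illustration.

```python
from statistics import mean, stdev

def is_anomalous(history, observed, threshold=3.0):
    """Flag `observed` if it is more than `threshold` standard deviations
    from the mean of `history` (a simple z-score test).

    history: past daily login counts for one user (assumed shape)
    """
    if len(history) < 2:
        return False  # not enough data to estimate spread
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return observed != mu  # any change from a constant baseline stands out
    return abs(observed - mu) / sigma > threshold

# Illustrative baseline: a user who logs in about five times a day.
baseline = [4, 5, 6, 5, 4, 6, 5]
print(is_anomalous(baseline, 25))  # far above the usual range
print(is_anomalous(baseline, 6))   # within the usual range
```

In practice, a flag like this would feed a review or remediation workflow instead of acting alone, because a single threshold cannot tell a compromised account from a busy day.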

Today’s AI landscape requires that we uphold the highest data protection standards. While many governing bodies are still navigating how to regulate AI, data security standards are being increasingly enforced (see the SEC’s new cybersecurity rules). Luckily, Theom can help automate compliance with these regulations so that you can focus on accelerating enterprise growth.

(1) Source:

(2) Source: Goldman Sachs Research