Theom Databricks Onboarding

Theom provides a cloud-native security platform that discovers, tracks, and protects enterprise data in cloud environments. Engineered to deploy quickly, Theom delivers immediate value to businesses of any size by uncovering data-loss risks and prioritizing corrective actions.

Theom builds situational awareness of all data flows, known and unknown, in multi-cloud environments. Using a specialized set of Security Dimensions, the platform detects and summarizes critical risks for data at rest or in motion across thousands of cloud data stores. Theom offers a rich set of security features to ensure data access governance, protection, integrity, and compliance for data in Databricks and a variety of other data store platforms. Theom integrates with Databricks Unity Catalog.

Theom delivers continuous visibility and actionable insights against emerging threats and new data breaches, and no data has to leave the enterprise's environment for Theom to deliver these outcomes. Theom customers can focus on growing their business through digital transformation by securely using data in the cloud.

Security Tenets

Theom can be onboarded using a secure connector to your AWS, Azure, or GCP Databricks environment.

Theom prides itself on security at every step of the product development process. Our guiding principles towards this goal are:

  1. Customer data never leaves the Customer's jurisdiction
  2. Encryption at rest and in transit for pseudonymized metadata (to be displayed)

Theom Components

Before diving into the prerequisites, let us understand the main components of Theom.

  1. Theom Cloud - this component hosts the web service and the unification layer across all the different cloud data warehouses, data lakes, and clouds. This allows Theom to deliver policies and compliance that follow the data, without access to customer data.
  2. Theom Engine - this component resides within the customer’s environment. On AWS, it is deployed as native Databricks components within the customer’s Databricks workspaces. On Azure, it is deployed via the Azure enterprise app registration process.

Deployment in Azure

Onboarding Pre-requisites
Theom Cloud

  • User 0 - To onboard a tenant, we need the following identifiers of a user capable of invoking the setup process within Theom.
    1. Full Name
    2. Email address - This may be an email alias to which User 0 is subscribed.
  • The user should either have the following permissions or be able to work with personnel who have them:
    • Azure user with privileges to:
      • Register Azure enterprise applications.
      • Grant the Theom service principal permission to read the Log Analytics workspace.
    • Databricks Admin Privileges for the Databricks specific tasks
  • Have the subscription ID from Azure ready. This is an existing subscription ID in Azure where Databricks has been deployed.
Steps for Deployment
Theom UI

On the Databricks Deployment page, install the Theom App and note down the application ID.

Azure Portal
  1. Create an Azure Log Analytics workspace (if one does not already exist) in the same subscription as the Databricks workspace. Assign the Theom user the Log Analytics Reader role.
    • Note: you can skip creating the workspace if you already have an existing Log Analytics workspace.
  2. On the Azure Databricks page, add a Diagnostic setting under Monitoring. Select the allLogs category group, and select the Log Analytics workspace from step 1 as the destination.
Azure Databricks Workspace
  1. Create a new catalog called theom. Add a new schema called internal. Change the ownership of the catalog to the Theom application.
  2. Create a SQL warehouse (serverless x-small) and assign the Theom application the “Can Manage” permission.
  3. Enable Verbose Audit Logs under Admin Settings -> Workspace Settings.
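
Step 1 can also be carried out from a SQL editor rather than through the UI. The following is a minimal sketch, not the only way to do it. It assumes that Unity Catalog is enabled on the workspace, that the metastore has a default storage location (otherwise CREATE CATALOG needs a MANAGED LOCATION clause), and that `<THEOM APPLICATION ID>` is replaced with the application ID noted in the Theom UI step.

```sql
-- Sketch only: run as a user who is allowed to create catalogs in the metastore.
CREATE CATALOG IF NOT EXISTS theom;
CREATE SCHEMA IF NOT EXISTS theom.internal;

-- Transfer ownership of the catalog to the Theom application.
-- Replace the placeholder with the application ID noted earlier.
ALTER CATALOG theom OWNER TO `<THEOM APPLICATION ID>`;
```
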
CLI
  • Enable the system schemas for billing and access (the audit system table lives in the access schema).
  • Grant the Theom application access to the system schemas by providing USE and SELECT privileges on the selected system schemas.
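
The grants in the second bullet can be issued in SQL. This is a sketch under a few assumptions: the access and billing schemas are the ones being shared, they have already been enabled, the grantee is the Theom application ID (shown as a placeholder), and a USE CATALOG grant on the system catalog is also needed for the schema grants to be usable. Adjust the schema list to match what was actually enabled.

```sql
-- Sketch only: run as a metastore admin after the system schemas are enabled.
-- Assumption: the grantee also needs USE CATALOG on the system catalog.
GRANT USE CATALOG ON CATALOG system TO `<THEOM APPLICATION ID>`;

-- Audit logs are in system.access.
GRANT USE SCHEMA ON SCHEMA system.access TO `<THEOM APPLICATION ID>`;
GRANT SELECT ON SCHEMA system.access TO `<THEOM APPLICATION ID>`;

-- Usage data is in system.billing.
GRANT USE SCHEMA ON SCHEMA system.billing TO `<THEOM APPLICATION ID>`;
GRANT SELECT ON SCHEMA system.billing TO `<THEOM APPLICATION ID>`;
```
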
Theom UI
  • Add the Databricks hostname (include https://), SQL warehouse ID, and Log Analytics workspace ID, then click Create.
Access to Datastores

Provide read-only access to catalogs, schemas and tables you would like Theom to scan and protect.
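
One way to express read-only access in Unity Catalog is a combination of USE CATALOG, USE SCHEMA, and SELECT. The sketch below grants these at the catalog level, so they are inherited by every schema and table inside it. The catalog name `sales_catalog` is hypothetical, and treating these three privileges as "read-only" is an assumption rather than a requirement stated by Theom.

```sql
-- Sketch only: sales_catalog is a hypothetical catalog name.
GRANT USE CATALOG ON CATALOG sales_catalog TO `<THEOM APPLICATION ID>`;
GRANT USE SCHEMA ON CATALOG sales_catalog TO `<THEOM APPLICATION ID>`;
GRANT SELECT ON CATALOG sales_catalog TO `<THEOM APPLICATION ID>`;
```

To limit the scope, the USE SCHEMA and SELECT grants can be issued on an individual schema (ON SCHEMA) or table (ON TABLE) instead of the whole catalog.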

Privilege Insights

Run the following SQL command as a metastore owner in a SQL editor:

CREATE OR REPLACE TABLE theom.internal.table_privileges AS SELECT * FROM system.information_schema.table_privileges;
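
Because CREATE OR REPLACE TABLE ... AS SELECT copies the rows at the time it runs, the resulting table is a point-in-time snapshot and does not follow later grant changes until the command is run again. A quick sanity check of the snapshot might look like the sketch below. It assumes the copied table keeps the `grantee` and `privilege_type` columns of the information schema view.

```sql
-- Sketch only: summarize the snapshot by grantee and privilege type.
SELECT grantee, privilege_type, COUNT(*) AS grant_count
FROM theom.internal.table_privileges
GROUP BY grantee, privilege_type
ORDER BY grant_count DESC;
```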

Databricks Sharing Insights

If you would like Theom to provide insights and protect Databricks shares, then provide the following permissions to the Theom application ID:

GRANT USE SHARE ON METASTORE TO `<THEOM APPLICATION ID>`;

GRANT USE RECIPIENT ON METASTORE TO `<THEOM APPLICATION ID>`;
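
To confirm that both grants took effect, the metastore-level privileges of the Theom application can be listed. This is an optional check, shown as a sketch. The result should include USE SHARE and USE RECIPIENT for the principal.

```sql
-- Sketch only: list metastore-level privileges held by the Theom application.
SHOW GRANTS `<THEOM APPLICATION ID>` ON METASTORE;
```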

Deployment in AWS

Onboarding Pre-requisites
Theom Cloud

  • User 0 - To onboard a tenant, we need the following identifiers of a user capable of invoking the setup process within Theom.
    1. Full Name
    2. Email address - This may be an email alias to which User 0 is subscribed.
  • Administrative access to the Databricks account.
Steps for Deployment

AWS Databricks Account
  1. Create a service principal for Theom. This service principal will later be added to all the workspaces where Theom is to be deployed.
  2. Create an OAuth secret and make a note of the client ID and secret.
AWS Databricks Workspace
  1. Add the Theom service principal created in the previous step to the workspace under Settings -> Identity and Access.
  2. Create a new catalog called theom. Add a new schema called internal. Change the ownership of the catalog to the Theom application.
  3. Create a SQL warehouse (serverless x-small) and assign the Theom application the “Can Monitor” permission.
  4. Enable Verbose Audit Logs under Settings -> Advanced.
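
After step 2, the catalog setup can be checked from a SQL editor. The sketch below only reads metadata. It assumes the catalog and schema names given in step 2 and changes nothing.

```sql
-- Sketch only: the returned metadata includes the catalog owner,
-- which should be the Theom application after the ownership change.
DESCRIBE CATALOG EXTENDED theom;

-- List the schemas in the catalog; internal should appear.
SHOW SCHEMAS IN theom;
```
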
CLI
  • Enable the system schemas for billing and access (the audit system table lives in the access schema).
  • Grant the Theom application access to the system schemas by providing USE and SELECT privileges on the selected system schemas.
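
Once the access schema is enabled and the privileges are granted, a short query against the audit system table helps confirm that audit events are readable. This is a sketch that assumes the standard `system.access.audit` table and its `event_time`, `event_date`, `service_name`, and `action_name` columns. Run it as a principal that holds the grants.

```sql
-- Sketch only: show the ten most recent audit events from the last day.
SELECT event_time, service_name, action_name
FROM system.access.audit
WHERE event_date >= current_date() - INTERVAL 1 DAY
ORDER BY event_time DESC
LIMIT 10;
```
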
Theom UI
  • Add the Databricks hostname (include https://), SQL warehouse ID, client ID, and client secret for the Theom service principal.
Network Allow List

Ensure that the following Theom SaaS head-end IP addresses are allowlisted in your Databricks workspace IP access lists.

  • 3.230.209.61
  • 54.205.175.56
Access to Datastores

Provide read-only access to catalogs, schemas and tables you would like Theom to scan and protect.

Privilege Insights

Run the following SQL command as a metastore owner in a SQL editor:

CREATE OR REPLACE TABLE theom.internal.table_privileges AS SELECT * FROM system.information_schema.table_privileges;

Databricks Sharing Insights

If you would like Theom to provide insights and protect Databricks shares, then provide the following permissions to the Theom application ID:

GRANT USE SHARE ON METASTORE TO `<THEOM APPLICATION ID>`;

GRANT USE RECIPIENT ON METASTORE TO `<THEOM APPLICATION ID>`;

Bringing a unified and consistent approach to data access governance and data security

Govern Lakehouse & Delta Sharing


Govern clean room access with an understanding of data and identities, and manage access to data exchange marketplaces to unlock new revenue streams.

Insider risk management & stop data attacks/breaches


Leverage AI to detect insider risks and stop data breaches before they happen.

Simplify compliance & access auditing


Harmonize data access, security, and compliance policies across cloud data stores and understand both the criticality of your data and the dollar value of your data within each store, all within one policy.