Microsoft SQL Server
About
Omnata's integration service replicates data from your SQL Server source database and loads it into your destination at regular intervals, and also provides direct query capabilities against your SQL Server database.
Supported Services
Generic SQL Server
Azure SQL Database
Azure SQL Managed Instance
Amazon SQL Server RDS
Compatibility Levels
The SQL Server plugin is supported for SQL Server databases with a compatibility level of 100 (SQL Server 2008) or higher.
Connectivity options
Omnata's Snowflake Native App can connect to your database in three ways: using built-in secure tunnelling, Privatelink (only available on Snowflake Business Critical Edition, on AWS and Azure), or directly if the database network is publicly accessible.
Privatelink
To use Privatelink:
You must have Snowflake Business Critical Edition
Your Snowflake account must be using the same cloud provider as the one you are connecting to. For example, to connect to an Azure SQL Database instance via Privatelink, your Snowflake account must be on Azure.
Omnata-ngrok tunnels
Omnata's tunnelling uses a secure TLS tunnel provided by our OEM partner, ngrok. Omnata establishes a TLS connection from your Snowflake account to the ngrok service and connects to the SQL Server database through that tunnel. This ensures that your database is not exposed to the internet and that your data remains secure.
Authentication
SQL Server Authentication
Omnata supports SQL Server Authentication. When you create a connection, you will need to provide the username and password for the SQL Server user you want to use to connect to the database.
If connecting to Azure SQL Database, you will need to provide username@servername as the username instead of just the username.
Connection Configuration
SQL Server Configuration
No additional configuration is required for SQL Server databases unless you want to use Change Data Capture (CDC).
Azure SQL Database Configuration
Private Access - Create a new private endpoint for the Azure SQL Database
Connectivity - Update the connection policy to Default or Proxy to allow access from the private endpoint
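If you manage your server with the Azure CLI, the connection policy can be updated with a command along these lines (the resource group and server names below are placeholders, not values from this guide):

```shell
# Hypothetical resource group / server names -- substitute your own
az sql server conn-policy update \
  --resource-group my-resource-group \
  --server my-sqlserver \
  --connection-type Proxy
```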
Should I use Privatelink or ngrok?
Privatelink and ngrok are mutually exclusive. If you have Snowflake Business Critical Edition and the required cloud infrastructure, then we recommend using Privatelink and you won't need ngrok.
If you have any other Snowflake edition, mismatched cloud providers, and/or on-premises data, then ngrok is the best solution for private connectivity.
Privatelink
You can use Azure or AWS Privatelink to connect to databases hosted in your cloud VPC; both examples are documented below.
It is also possible to connect via Privatelink to databases hosted on-premises. This uses the same network load balancer approach shown in the AWS example below, together with an existing network bridge between your cloud and on-premises environments.
Azure
To connect via Privatelink to an Azure SQL database, first provision a Privatelink endpoint in your Snowflake account:
Then, simply choose the Privatelink option when creating the connection.
You can also connect to SQL Server running on a VM in Azure; the architecture will be similar to that described below.
AWS
To connect via Privatelink to an AWS RDS service, you may need to deploy some extra infrastructure in your AWS account before provisioning the Privatelink endpoint. The following diagram explains how to connect to RDS via a network load balancer:
ngrok Tunnel Configuration
The Omnata SQL Server plugin uses ngrok to establish a secure tunnel to your SQL Server database. This ensures that your database is not exposed to the internet and that your data is secure.
The connection workflow is as follows:
Omnata Sync establishes an outbound TLS connection from Snowflake to the ngrok service
You install the ngrok agent on your network, where it can access your SQL Server
The agent establishes an outbound TLS connection from your network to the ngrok service; you do not open any inbound ports.
Omnata Sync connects through this tunnel to your SQL Server, again using TLS.
Download & Install the ngrok Agent
For macOS, use Homebrew:
For Linux, use Apt:
For Windows, use Chocolatey (You'll need to run this in an Administrator Command Prompt):
If you don't have one of these package managers installed or prefer to install the ngrok agent yourself, you can download the ngrok agent from the Ngrok website and install it on your network.
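For reference, the package-manager commands from ngrok's install documentation look like the following (verify against ngrok's current docs before running):

```shell
# macOS (Homebrew)
brew install ngrok/ngrok/ngrok

# Windows (Chocolatey, from an elevated prompt)
choco install ngrok

# Debian/Ubuntu (apt) -- ngrok's package repository must be added first
curl -s https://ngrok-agent.s3.amazonaws.com/ngrok.asc | sudo tee /etc/apt/trusted.gpg.d/ngrok.asc >/dev/null
echo "deb https://ngrok-agent.s3.amazonaws.com buster main" | sudo tee /etc/apt/sources.list.d/ngrok.list
sudo apt update && sudo apt install ngrok
```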
Configure & launch the ngrok agent
You can run the ngrok agent either as a standalone process or as a service. Instructions for each are provided in tabs on the 'launch agent' step in the Omnata UI.
Complete the Omnata UI form to populate values
The Omnata UI will provide you with a code snippet that contains the auth token (--authtoken) and edge label (--label edge) for the connection. These pre-populated values configure the outbound connection from your network to Omnata's ngrok instance.
You will need to provide the [address]:[port] value that points the ngrok agent to the location of your SQL Server inside your network.
Common variations in agent-to-database configuration
If you run the agent on the same server as SQL Server, you can use just the port number [port]. For SQL Server, the default port is 1433.
If you run the agent on a different server, you will need to use the hostname of the server where the SQL Server is running [address], as well as the port [port].
If using a Named Instance, ensure it is accessible via [address]:[port] instead of the [address]\[instance-name] form.
Run as a standalone process
To run the ngrok agent as a standalone process, follow the steps provided in the Omnata UI. Omnata will provide you with a code snippet that looks like the following:
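A representative shape of that snippet is shown below; the token and edge label are generated by the Omnata UI, and only the address and port placeholders come from your own network. This is illustrative only, not the literal command:

```shell
# Illustrative only -- copy the exact snippet from the Omnata UI
ngrok tunnel \
  --authtoken <omnata-provided-token> \
  --label edge=<omnata-provided-edge-id> \
  tcp://<address>:<port>
```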
No modifications to this code are required; the configuration is applied and the agent is launched in a single command.
Run as a service
To configure the ngrok agent to run as a service, follow these steps:
Locate the configuration directory for the ngrok agent.
For macOS, the configuration directory is
~/Library/Application Support/ngrok
For Linux, the configuration directory is
~/.config/ngrok
For Windows, the configuration directory is
%HOMEPATH%\AppData\Local\ngrok
In that directory, create a configuration file (called omnata.yml) which will be used by the ngrok agent. The configuration file should contain the following information:
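As a sketch, an ngrok agent configuration for a labelled tunnel typically looks like the following; the token and edge label come from the Omnata UI, and "omnata" is just an example tunnel name:

```yaml
# omnata.yml -- illustrative values only; use those generated by the Omnata UI
version: "2"
authtoken: <omnata-provided-token>
tunnels:
  omnata:
    labels:
      - edge=<omnata-provided-edge-id>
    addr: <address>:<port>
```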
Install the ngrok agent as a service by running the following command:
Start the ngrok service by running the following command:
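Those two commands typically take the following form (the config path shown is the Linux location; adjust for your platform):

```shell
# Install the agent as a service, pointing it at the omnata.yml created above
ngrok service install --config ~/.config/ngrok/omnata.yml

# Start the service
ngrok service start
```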
Troubleshooting Connectivity Issues
The following are common issues and solutions when connecting.
Cannot open server "xxxxxxx.ngrok.app" requested by the login. The login failed
Please enter your password again in the connection setup. If you are using Azure SQL Database, enter username@servername for the username.
The TCP/IP connection to the host xxxxxxx.ngrok.app, port 443 has failed.
Please ensure that the ngrok agent is running and that it can see the SQL Server.
The TCP/IP connection to the host xxxxxx-a.worker.database.windows.net (redirected from xxxxxxx.ngrok.app), port 11063 has failed.
Azure SQL Database: You need to modify the connection policy to Default or Proxy to allow access from the private endpoint.
Azure SQL Managed Instance: You need to ensure your private endpoint is accessing the Managed Instance via private IP addresses.
Inbound Syncs
Omnata connects to your database using SQL Server's default transaction isolation level, Read Committed. This ensures that the data you are reading is not locked by other transactions. Omnata uses the following methods to replicate data from your SQL Server source database:
Change Data Capture (CDC)
Change Tracking (CT)
Highwater Mark
Change Data Capture (CDC)
If you have a SQL Server database that supports Change Data Capture (CDC), Omnata can use CDC to replicate data from your source database. CDC is a feature that captures insert, update, and delete operations performed on a table and makes the details of those changes available in change tables. Omnata reads the change tables to replicate the changes to your destination database, and then extracts from the source table any records not yet present in your CDC instance.
Using the APPEND strategy will run the get_all_changes function to get all changes from the CDC instance and append them to the destination table. This is the default merge strategy for CDC syncs. Using the MERGE strategy will run the get_net_changes function to get net changes from the CDC instance and merge them into the destination table. This merge strategy is useful when you only need to keep the latest version of a record in the destination table.
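For illustration, this is roughly what those CDC function calls look like in T-SQL for a hypothetical dbo_Orders capture instance (Omnata issues the equivalent queries for you):

```sql
-- Hypothetical capture instance 'dbo_Orders'
DECLARE @from_lsn binary(10) = sys.fn_cdc_get_min_lsn('dbo_Orders');
DECLARE @to_lsn   binary(10) = sys.fn_cdc_get_max_lsn();

-- APPEND strategy: every individual change in the LSN range
SELECT * FROM cdc.fn_cdc_get_all_changes_dbo_Orders(@from_lsn, @to_lsn, N'all');

-- MERGE strategy: one net row per changed key
SELECT * FROM cdc.fn_cdc_get_net_changes_dbo_Orders(@from_lsn, @to_lsn, N'all');
```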
Change Tracking (CT)
If you have a SQL Server database that supports Change Tracking (CT), Omnata can use CT to replicate data from your source database. Change Tracking is a feature that tracks changes to a table and makes the details of those changes available in change tables. Omnata reads the change tables to replicate the changes to your destination database, and then extracts from the source table any records not yet present in your CT instance.
Full Refresh / Incremental (High Watermark)
If you have tables which are not CDC-enabled, or views you need to sync, Omnata can use high watermarks to replicate your data from your source database. This method is less efficient than CDC and only captures the current state each run; therefore, the only history available is from the last sync.
If a table or view does not have at least one field of the following types available: date, datetime, datetime2, smalldatetime, datetimeoffset, or rowversion, then only the full refresh option will be available.
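Conceptually, an incremental high-watermark extraction is a query of this shape (the table and column names are hypothetical; Omnata generates the actual query for you):

```sql
-- Hypothetical table with a datetime2 watermark column
DECLARE @last_watermark datetime2 = '2024-01-01T00:00:00';  -- value stored from the previous sync

SELECT *
FROM dbo.Orders
WHERE LastModified > @last_watermark;  -- only rows changed since the last sync
```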
Composite keys
Omnata supports composite keys for high watermarking.
Managing Performance
SQL Server workload
We generally want to optimize syncs so that they run in the shortest timeframe possible. However, we need to be mindful of the load these routines place on SQL Server instances. When you run a sync for the first time, the plugin requests all the historical data for the Streams (Tables), and this is when the heaviest load is placed on your SQL Server instance. Subsequent loads are a much lighter touch for the CDC and incremental high watermark strategies. Full refresh strategies place a similar load on the server for each run of the sync, so use the full refresh strategy carefully.
Concurrency settings
There are several levers you can set to optimize performance. Each SQL Server database instance is unique in its setup; CPU cores, memory, disk configuration, network bandwidth, and server configuration all play a part in how much load we place on the server and the performance we can achieve.
Below are the three tuning levers you can use to tune the initial historic backfill of data.
Maximum Concurrent Streams (MCS)
This setting represents how many Streams (Tables) the plugin will query concurrently.
Maximum Concurrent Queries (MCQ)
This setting represents how many queries per Stream (Table) the plugin will run concurrently.
Maximum Records Per Query (MRQ)
This setting represents the maximum number of rows that each query will request.
Example
A sync has 16 Streams (Tables), each with 5 million rows; MCS = 8, MCQ = 8, MRQ = 1,000,000.
In this example, we have 80 million rows in total to acquire from our SQL Server database, which works out to 80 queries of 1,000,000 rows each for the sync. The maximum number of queries the plugin will send to the server concurrently is 64 (MCS x MCQ), and each of those queries asks for 1 million (MRQ) rows. Once a query is processed, another is sent to the server until all 80 queries have been serviced, but 64 is the maximum number of concurrent queries at any one time.
Note: the above is only true if each of those tables has a clustered index. Heap tables (tables with no clustered index) ignore MCQ and MRQ (but not MCS). Take care when acquiring data from heap tables, as there is limited concurrency control.
Snowflake warehouse settings
We have designed Omnata to be able to run on a single cluster XS (Extra Small) Warehouse. However, during these initial sync runs, we are requesting, receiving and processing a lot of data rows. With a single cluster XS Warehouse you will likely see query queueing on the Snowflake Warehouse. In order to reduce this queueing, you can temporarily set the Warehouse to be multi-clustered, this will reduce queueing and the overall duration of these initial sync runs. We recommend you set the Scaling policy to 'Standard', not 'Economy'. In our testing, a Warehouse set to 'Economy' doesn't scale out with this workload. Once the initial sync has run, you can return the Warehouse to a single cluster. We advise you continue to monitor the query queueing events and adjust your Snowflake Warehouse as needed.
Outbound Syncs
Omnata currently does not support outbound syncs for SQL Server databases.
Functions
To enable you to query your SQL Server database directly from your destination database, Omnata provides a set of UDFs. These UDFs are available in the OMNATA_SQL_SERVER_PLUGIN database, in the UDFS schema.
SQL_QUERY
Executes a SQL query on the source database and returns the result as a table.
Parameters:
CONNECTION_SLUG (VARCHAR): The slug of the connection to query
DATABASE_NAME (VARCHAR): The name of the database to query
QUERY (VARCHAR): The SQL query
Examples
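As an example, a call might look like the following (the connection slug, database, and table names here are hypothetical):

```sql
SELECT *
FROM TABLE(OMNATA_SQL_SERVER_PLUGIN.UDFS.SQL_QUERY(
    'my-sql-server-connection',          -- CONNECTION_SLUG
    'AdventureWorks',                    -- DATABASE_NAME
    'SELECT TOP 10 * FROM dbo.Orders'    -- QUERY
));
```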
SQL_QUERY_FILTER
Executes a SQL query (with a filter applied) on the source database and returns the result as a table. This has been designed for use with Lambda views.
Parameters:
CONNECTION_SLUG (VARCHAR): The slug of the connection to query
DATABASE_NAME (VARCHAR): The name of the database to query
QUERY (VARCHAR): The SQL query
COLUMN_NAME (VARCHAR): The name of the column to use in the where clause
OPERATOR (VARCHAR): The operator for the where clause
COLUMN_VALUE (VARCHAR): The value to use in the where clause (a string must be enclosed in quotes)
Examples
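As an example, a filtered call might look like the following (the connection slug, database, table, and column names here are hypothetical); note the extra quoting around the string value:

```sql
SELECT *
FROM TABLE(OMNATA_SQL_SERVER_PLUGIN.UDFS.SQL_QUERY_FILTER(
    'my-sql-server-connection',      -- CONNECTION_SLUG
    'AdventureWorks',                -- DATABASE_NAME
    'SELECT * FROM dbo.Orders',      -- QUERY
    'Status',                        -- COLUMN_NAME
    '=',                             -- OPERATOR
    '''Shipped'''                    -- COLUMN_VALUE (string values enclosed in quotes)
));
```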
Managing CDC on SQL Server
If you don't already have CDC enabled on the SQL Server instance you want to use with Omnata, then follow the Microsoft documentation.
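Per the Microsoft documentation, enabling CDC is a two-step operation; a minimal example for a hypothetical dbo.Orders table:

```sql
-- 1. Enable CDC at the database level (requires sysadmin)
EXEC sys.sp_cdc_enable_db;

-- 2. Enable CDC for each table to sync (requires db_owner)
EXEC sys.sp_cdc_enable_table
    @source_schema = N'dbo',
    @source_name   = N'Orders',
    @role_name     = N'cdc_reader',     -- gating role; NULL to disable gating
    @supports_net_changes = 1;          -- needed for the MERGE strategy; requires a primary key or unique index
```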
Grant read access to CDC tables
You will need a database role in the database you are syncing from; that role will need the following permissions for each of the objects that the sync will read from.
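A sketch of the grants involved, for a hypothetical dbo.Orders source table (the role name cdc_reader is an example):

```sql
-- Read the source table (used to backfill records not present in the CDC instance)
GRANT SELECT ON OBJECT::dbo.Orders TO cdc_reader;

-- Read the change tables and CDC query functions
GRANT SELECT ON SCHEMA::cdc TO cdc_reader;
```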
Create a new CDC reader role for Omnata
Below is a script that will create a role (cdc_reader), add the user (omnata_reader), and add the grant statements for each CDC-enabled table in your SQL Server instance. It will also update the role_name field in the cdc.change_tables table. Use this at your own risk; if you are unsure, contact your DBA.
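A sketch of such a script follows (again, use at your own risk and review with your DBA; the user name omnata_reader is an example):

```sql
-- Create the reader role and add the Omnata user to it
CREATE ROLE cdc_reader;
ALTER ROLE cdc_reader ADD MEMBER omnata_reader;

-- Build and run a GRANT SELECT statement for every CDC change table
DECLARE @sql nvarchar(max) = N'';
SELECT @sql = @sql + N'GRANT SELECT ON '
    + QUOTENAME(s.name) + N'.' + QUOTENAME(t.name)
    + N' TO cdc_reader;' + CHAR(10)
FROM cdc.change_tables ct
JOIN sys.tables  t ON t.object_id = ct.object_id
JOIN sys.schemas s ON s.schema_id = t.schema_id;
EXEC sys.sp_executesql @sql;

-- Set cdc_reader as the gating role on each capture instance
UPDATE cdc.change_tables SET role_name = N'cdc_reader';
```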