Infor Data Lake

About this plugin

The Infor Data Lake is a cloud-based data repository provided by Infor, designed to store and manage large volumes of structured and unstructured data. It is part of the Infor OS (Operating Service) platform, which offers a suite of tools and services for data integration, analytics, and business intelligence.

The Omnata Infor Data Lake Plugin provides an alternative approach to extracting data from the Infor Data Lake using the Infor Compass JDBC Driver (BYO: you supply your own copy of the driver). Data is only loaded into the Infor Data Lake every 10 minutes, so this plugin is well suited to near real-time extraction. Infor's streaming from the Data Lake may be more suitable for real-time extraction, but it has limitations; for example, it does not handle schema drift, which this plugin does.

Prerequisites

Data Catalog API

The Infor Data Lake provides a Data Catalog API that allows you to query the metadata of the data stored in the Data Lake. The Omnata Infor Data Lake Plugin uses this API to retrieve the metadata of the tables and fields in the Data Lake.

Compass JDBC Driver

Before you create a connection in Omnata, you must upload your own copy of infor-compass-jdbc-2025.08.jar for use by the Omnata plugin app.

Follow these steps to upload the driver:

ION download

  1. Log in to Infor OS and start the ION application.

  2. Expand the ION menu and select Configuration > Downloads.

  3. Select the Compass JDBC driver download. (For details, see the Infor ION User Guide.)

Installing the driver into Snowflake

  1. Upload the infor-compass-jdbc-2025.08.jar to an internal stage within your Snowflake account.

  2. Run the following command (ensuring you specify the stage you uploaded the jar file to):

COPY FILES INTO @OMNATA_INFOR_DATA_LAKE_PLUGIN.CUSTOM_PACKAGES.PACKAGES
FROM @<<DATABASE_NAME>>.<<SCHEMA_NAME>>.<<STAGE_NAME>>;
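
For reference, the end-to-end upload might look like the following sketch. The database, schema, stage name, and local file path are placeholders; substitute your own, and run the PUT command from a client that supports local file uploads (such as SnowSQL):

-- Create a stage to hold the driver (names are placeholders).
CREATE STAGE IF NOT EXISTS MY_DB.MY_SCHEMA.JDBC_STAGE;

-- Upload the jar from your local machine; PUT must be run from SnowSQL or a similar client.
PUT file:///tmp/infor-compass-jdbc-2025.08.jar @MY_DB.MY_SCHEMA.JDBC_STAGE AUTO_COMPRESS=FALSE;

-- Copy the jar into the plugin's package stage.
COPY FILES INTO @OMNATA_INFOR_DATA_LAKE_PLUGIN.CUSTOM_PACKAGES.PACKAGES
FROM @MY_DB.MY_SCHEMA.JDBC_STAGE;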

Authentication

ION API Security Credentials

The Infor Compass JDBC Driver requires ION API security credentials to authenticate with the Infor Data Lake. The credentials are downloaded through the Infor OS Portal or Infor Ming.le. The IONAPI-Administrator role is required to set up security credentials. For more information, see the Infor OS documentation.

Generating ION API Security Credentials

Follow these steps to download the ION API Security Credentials:

  1. Log on to Infor OS Portal or Infor Ming.le.

  2. From the application tray, select the API Gateway application.

  3. Select Authorized Apps.

  4. Add a new non-Infor authorized application.

  5. Specify this information:

  • Name: Infor Compass JDBC Driver (The name must be exactly as shown)

  • Description: Infor Compass JDBC Driver (The description must be exactly as shown)

  6. Select the backend service type.

  7. Save the app.

  8. From the Authorized Apps menu, navigate to the new Infor Compass JDBC Driver application.

  9. Click Download.

Caution: The Download Credentials dialog box is displayed. Do not click Download yet.

  • Select Create Service Account.

  • In the User Name field, specify the first few characters of your user name, then select your user name from the type-ahead list.

  10. Click Download.

  11. Open the downloaded file and copy its contents into the Omnata Connection Wizard.

Inbound Syncs

The Omnata Infor Data Lake Plugin provides a set of inbound syncs that can be used to load data from the Data Lake into Snowflake. These syncs can be scheduled to run at regular intervals to keep the data up to date.

It is recommended that all syncs be incremental syncs to reduce load on the Infor Data Lake. After the initial run, the plugin uses the infor.lastModified() field to determine which records to load, loading only records that have been modified since the last sync run.
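
To preview which records an incremental run would pick up, you can experiment with a similar filter through the INFOR_DIRECT_QUERY function described under Functions below. This is an illustrative sketch only: the MITMAS table, the watermark value, and the timestamp literal syntax are assumptions, and the exact Compass SQL accepted may differ:

select *
from table(OMNATA_INFOR_DATA_LAKE_PLUGIN.UDFS.INFOR_DIRECT_QUERY(
    'my-infor-connection-slug',
    -- Records modified since a hypothetical last-sync watermark:
    'select * from MITMAS where infor.lastModified() > TIMESTAMP ''2025-01-01 00:00:00'''
))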

Storage Strategy

Using the APPEND strategy will run the Variations to Retrieve specified in the sync. Using the MERGE strategy will enforce the HighestVariation strategy and keep only the latest version of each record in the destination table.

Additional Fields Retrieved

In addition to the table's own fields, the following fields are also retrieved (see the sketch after this list):

  • infor.lastModified() - The timestamp at which the data object was added to the Data Lake.

  • infor.DataObjectId() - The unique identifier of the Data Lake data object in which the record is stored.

  • infor.DataObjectSeqId() - The record or line number associated with the record's location in the data object.

  • infor.Partition() - The partition in which the data object is stored.
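
These fields can also be selected explicitly through the INFOR_DIRECT_QUERY function described under Functions below. A sketch, assuming ITNO is a column of MITMAS (the exact casing of the infor functions accepted by Compass SQL may differ):

select *
from table(OMNATA_INFOR_DATA_LAKE_PLUGIN.UDFS.INFOR_DIRECT_QUERY(
    'my-infor-connection-slug',
    -- Return the item number plus the Data Lake bookkeeping fields for each record:
    'select ITNO, infor.lastModified(), infor.DataObjectId(), infor.DataObjectSeqId(), infor.Partition() from MITMAS'
))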

Customizing the Omnata storage location

It is recommended that you configure a custom storage location in Omnata.

Managing performance

There are several levers you can adjust to optimize performance. The following should be viewed as general guidance; depending on the size of the data and the performance of the Infor Data Lake, these settings may need to be adjusted.

Concurrency Level Settings

The plugin can adjust the size and concurrency of the queries sent to the source database. The following settings are available in the Omnata UI:

Maximum Concurrent Streams (MCS) This setting represents how many Streams (Tables) the plugin will query concurrently.

Maximum Concurrent Queries (MCQ) This setting represents how many queries per Stream (Table) the plugin will run concurrently.

Maximum Queries Per Stream Per Sync (MQS) This setting represents how many queries per Stream (Table) the plugin will send to the server during a sync.

Maximum Records Per Query (MRQ) This setting represents the maximum number of rows that each query will request.

Usage Example

A sync has 16 Streams (Tables), each with 5 million rows, and MCS = 8, MCQ = 8, MRQ = 1,000,000.

In this example there are 80 million rows in total to acquire from the Data Lake, which works out to 80 queries for the sync at 1,000,000 rows per query. The maximum number of queries the plugin will send to the Data Lake concurrently is 64 (MCS x MCQ = 8 x 8), and each of those queries asks for up to 1 million (MRQ) rows. As each query completes, another is sent to the server until all 80 queries have been serviced, but 64 remains the maximum number of concurrent queries at any one time.

Initial data load

The initial data load behaves like the example above, except that the plugin keeps sending queries to the server until all of the data has been loaded. To determine the number of queries to send, the plugin partitions the data by time using the following procedure (see the sketch after this list):

  1. Determine the number of records (based on timestamp) that are available for a given table grouped by year

  2. Determine the number of records (based on timestamp) that are available for a given table grouped by year and month for the years where the number of records is greater than the specified MRQ

  3. Determine the number of records (based on timestamp) that are available for a given table grouped by year, month and day for the years and months where the number of records is greater than the specified MRQ

  4. Determine the number of records (based on timestamp) that are available for a given table grouped by year, month, day and hour for the years, months and days where the number of records is greater than the specified MRQ
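
For M3 tables, you can preview these per-period counts yourself using the GET_M3_TABLE_COUNTS function described under Functions below. A sketch, assuming FGLEDG with date field EGRGDT and time field EGRGTM, and assuming that the year/month/day fields can be omitted at the "year" count level:

select *
from table(OMNATA_INFOR_DATA_LAKE_PLUGIN.UDFS.GET_M3_TABLE_COUNTS(
    'my-infor-connection-slug',
    'FGLEDG',
    -- Counts grouped by year, mirroring step 1 above:
    to_object(parse_json('{"date_field_name": "EGRGDT", "time_field_name": "EGRGTM", "count_level": "year"}'))
))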

Snowflake warehouse size

Omnata typically runs sync workloads on a single-cluster XS warehouse. For this endpoint, large initial sync runs can often lead to query queueing. If you see queueing, you can reduce sync times by adding multi-clustering. We recommend setting the scaling policy to 'Standard', not 'Economy'. After the initial sync, you can return the warehouse to a single cluster for incremental syncs. We advise you to monitor query queueing events and adjust your Snowflake warehouse as needed.
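
A minimal sketch of the warehouse changes described above; the warehouse name and cluster counts are placeholders, and multi-cluster warehouses require Snowflake Enterprise Edition or higher:

-- Allow the warehouse to scale out during the initial sync.
ALTER WAREHOUSE MY_SYNC_WH SET
    MIN_CLUSTER_COUNT = 1
    MAX_CLUSTER_COUNT = 4
    SCALING_POLICY = 'STANDARD';

-- After the initial sync, return to a single cluster for incremental runs.
ALTER WAREHOUSE MY_SYNC_WH SET MAX_CLUSTER_COUNT = 1;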

Alternatively, you can opt for a serverless warehouse, which will scale out as needed so you will not see queueing events. However, keep in mind that you have less control over costs when using serverless compute.

Task timeout

Initial syncs can often run for longer than the default Omnata timeout of 4 hours, and longer than your Snowflake account- or warehouse-level query timeout(s). We recommend setting these timeouts to a value greater than the expected duration of the sync, adjusting as needed for the size of the data being loaded. As a rough benchmark, a table with 1 billion records may take more than 6 hours to sync.
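
On the Snowflake side, the query timeout can be adjusted with the STATEMENT_TIMEOUT_IN_SECONDS parameter (the Omnata task timeout itself is configured in the Omnata UI). A sketch, assuming a warehouse named MY_SYNC_WH and an 8-hour ceiling:

-- Allow queries on the sync warehouse to run for up to 8 hours (28,800 seconds).
ALTER WAREHOUSE MY_SYNC_WH SET STATEMENT_TIMEOUT_IN_SECONDS = 28800;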

Record History Retention

Due to the volume of data moving through this plugin from the Infor Data Lake, we recommend setting the retention period to 0 days. This will reduce overall sync times; however, you will no longer be able to view the record-level sync activity log.

Functions

Omnata provides a set of UDFs that you can use to query the Infor Data Lake. These UDFs are available in the OMNATA_INFOR_DATA_LAKE_PLUGIN database, in the UDFS schema.
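
To list the functions available in your account, you can run:

SHOW USER FUNCTIONS IN SCHEMA OMNATA_INFOR_DATA_LAKE_PLUGIN.UDFS;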

INFOR_DIRECT_QUERY

The INFOR_DIRECT_QUERY function allows you to query the Infor Data Lake directly from your destination database. The function takes the following parameters:

  • CONNECTION_SLUG (VARCHAR): The slug of the connection to query

  • QUERY (VARCHAR): The SQL query

Examples

select *
from table(OMNATA_INFOR_DATA_LAKE_PLUGIN.UDFS.INFOR_DIRECT_QUERY(
    'my-infor-connection-slug',
    'select * from infor.include(''MITMAS'', ''ACTIVE, DELETED, ARCHIVED'', ''AllVariations'')'
))

INFOR_TABLE_STATS

The INFOR_TABLE_STATS function allows you to get statistics from the Infor Data Lake for a given table. The function takes the following parameters:

  • CONNECTION_SLUG (VARCHAR): The slug of the connection to query

  • TABLE_NAME (VARCHAR): The name of the table to get statistics for

  • COMPANIES (ARRAY): The list of companies to get statistics for

  • FROM_TIMESTAMP (STRING): The timestamp from which to gather statistics, based on each record's timestamp

  • ADDITIONAL_WHERE_CLAUSE (STRING): An additional where clause to filter the statistics

Examples

select *
from table(OMNATA_INFOR_DATA_LAKE_PLUGIN.UDFS.INFOR_TABLE_STATS(
    'my-infor-connection-slug',
    'CINACC',
    [100],
    '',
    ''
))

INFOR_TABLE_DISTINCT_STATS

The INFOR_TABLE_DISTINCT_STATS function allows you to get statistics from the Infor Data Lake for a given table, reporting the number of distinct records, since duplicates can exist within the Data Lake.

The function takes the following parameters:

  • CONNECTION_SLUG (VARCHAR): The slug of the connection to query

  • TABLE_NAME (VARCHAR): The name of the table to get statistics for

  • PRIMARY_KEYS (ARRAY): The list of primary keys to get statistics for

  • COMPANIES (ARRAY): The list of companies to get statistics for

  • FROM_TIMESTAMP (STRING): The timestamp from which to gather statistics, based on each record's timestamp

  • ADDITIONAL_WHERE_CLAUSE (STRING): An additional where clause to filter the statistics

Examples

select *
from table(OMNATA_INFOR_DATA_LAKE_PLUGIN.UDFS.INFOR_TABLE_DISTINCT_STATS(
    'my-infor-connection-slug',
    'CINACC',
    ['CONO', 'DIVI', 'ANBR', 'SENO'],
    [100],
    '',
    ''
))

INFOR_TABLE_ROW_STATS

The INFOR_TABLE_ROW_STATS function allows you to get statistics from the Infor Data Lake for a given table, grouped by the columns passed in. The function takes the following parameters:

  • CONNECTION_SLUG (VARCHAR): The slug of the connection to query

  • TABLE_NAME (VARCHAR): The name of the table to get statistics for

  • PRIMARY_KEYS (ARRAY): The list of primary keys to get statistics for

  • COMPANIES (ARRAY): The list of companies to get statistics for

  • FROM_TIMESTAMP (STRING): The timestamp from which to gather statistics, based on each record's timestamp

  • ADDITIONAL_WHERE_CLAUSE (STRING): An additional where clause to filter the statistics

Examples

select *
from table(OMNATA_INFOR_DATA_LAKE_PLUGIN.UDFS.INFOR_TABLE_ROW_STATS(
    'my-infor-connection-slug',
    'CINACC',
    ['CONO', 'DIVI', 'ANBR', 'SENO'],
    [100],
    '',
    ''
))

GET_OBJECT_CATEGORIES

The GET_OBJECT_CATEGORIES function allows you to retrieve the available categories and their associated tables from the Data Lake. The function takes the following parameter:

  • CONNECTION_SLUG (VARCHAR): The slug of the connection to query

Examples

select *
from table(OMNATA_INFOR_DATA_LAKE_PLUGIN.UDFS.GET_OBJECT_CATEGORIES('my-infor-connection-slug'))

GET_INBOUND_STREAM_METADATA

The GET_INBOUND_STREAM_METADATA function allows you to retrieve the metadata required to add a stream to a sync for a given set of object names. The function takes the following parameters:

  • CONNECTION_SLUG (VARCHAR): The slug of the connection to query

  • OBJECT_NAMES (ARRAY): The list of objects to retrieve metadata for

Examples

select *
from table(OMNATA_INFOR_DATA_LAKE_PLUGIN.UDFS.GET_INBOUND_STREAM_METADATA('my-infor-connection-slug', ['MITMAS', 'CINACC']))

GET_M3_TABLE_COLUMN_CHECKS

The GET_M3_TABLE_COLUMN_CHECKS function allows you to validate whether the specified columns exist for a given M3 table. The M3 prefix is added to the front of the column names in the actual queries. The function takes the following parameters:

  • CONNECTION_SLUG (VARCHAR): The slug of the connection to query

  • TABLE_NAME (VARCHAR): The table to retrieve the columns for

  • COLUMN_CHECKS (ARRAY): The list of columns to check for existence

Examples

select *
from table(OMNATA_INFOR_DATA_LAKE_PLUGIN.UDFS.GET_M3_TABLE_COLUMN_CHECKS('my-infor-connection-slug', 'FGLEDG', ['RGTM', 'RGDT', 'LMDT', 'LMTS']))

GET_M3_TABLE_COUNTS

The GET_M3_TABLE_COUNTS function allows you to retrieve record counts for the specified M3 table. The function takes the following parameters:

  • CONNECTION_SLUG (VARCHAR): The slug of the connection to query

  • TABLE_NAME (VARCHAR): The table to retrieve the counts for

  • PARAMETERS (OBJECT): The object containing the field names and date fields to use for the table counts. The object should contain the following fields:

    • date_field_name (str): The field to use for date filtering (should be in YYYYMMDD format).

    • time_field_name (str): The field to use for time filtering (should be in HHMMSS format).

    • year (int): The year for which to get data.

    • month (int): The month for which to get data.

    • day (int): The day for which to get data.

    • hour (int): The hour for which to get data.

    • minute_group_size (int): The size of the minute groupings (e.g., 15 for 15-minute intervals).

    • count_level (str): The level of counts to retrieve. Can be one of the following:

      • "year": Counts grouped by year

      • "month": Counts grouped by month

      • "day": Counts grouped by day

      • "hour": Counts grouped by hour

      • "minute_group": Counts grouped by minute groupings

      • "custom": Custom counts based on a where clause

    • where_clause (str): An optional where clause to filter the counts

    • count_all_companies (bool): Whether to count all records or only those associated with the company specified for the connection. Defaults to true.

Examples

select *
from table(OMNATA_INFOR_DATA_LAKE_PLUGIN.UDFS.GET_M3_TABLE_COUNTS(
    'my-infor-connection-slug',
    'FGLEDG',
    to_object(parse_json('{"date_field_name": "EGRGDT", "time_field_name": "EGRGTM", "year": 2025, "month": 1, "day": 5, "hour": 14, "minute_group_size": 15}'))
))

GET_M3_TABLE_DATA

The GET_M3_TABLE_DATA function allows you to retrieve the data of the specified M3 table. However, due to restrictions with the EXPORTMI API, the results returned are limited to 10,000 records. The function takes the following parameters:

  • CONNECTION_SLUG (VARCHAR): The slug of the connection to query

  • TABLE_NAME (VARCHAR): The table to retrieve the data for

  • PARAMETERS (OBJECT): The object containing the field names and date fields to use when retrieving the table data. The object should contain the following fields:

    • date_field_name (str): The field to use for date filtering (should be in YYYYMMDD format).

    • time_field_name (str): The field to use for time filtering (should be in HHMMSS format).

    • year (int): The year for which to get data.

    • month (int): The month for which to get data.

    • day (int): The day for which to get data.

    • hour (int): The hour for which to get data.

    • start_minute (int): The starting minute of the group (e.g. 0, which will be set to 00:00:00).

    • end_minute (int): The ending minute of the group (e.g. 14, which will be set to 14:59:59).

    • where_clause (str): An optional where clause to filter the data

    • data_level (str): The level of data to retrieve. Can be one of the following:

      • "minute_group": Data grouped by minute intervals

      • "hour": Data grouped by hour

      • "month": Data grouped by day

      • "custom": Custom data based on a where clause

    • columns (str): A comma-separated list of columns to retrieve. If not specified, all columns will be retrieved.

    • max_records (int): The maximum number of records to fetch (default is 10,000)

Examples

select *
from table(OMNATA_INFOR_DATA_LAKE_PLUGIN.UDFS.GET_M3_TABLE_DATA(
    'my-infor-connection-slug',
    'FGLEDG',
    to_object(parse_json('{"date_field_name": "EGRGDT", "time_field_name": "EGRGTM", "year": 2025, "month": 1, "day": 5, "hour": 14, "start_minute": 0, "end_minute": 14}'))
))
