Data Discovery Settings

Ole Dyring

September 16, 2025 12:08

When opening the Data Discovery interface, you will have an option to configure a wide range of Data Discovery Settings.

Note: This option is only available if your user has been added to a Rights Group (in TARGIT Management) that grants you 'Data Discovery Administrator' permissions.

Main

Visualize errors

This is a legacy setting that has no practical use anymore.

Legacy functionality: Enable this option to get error messages on top of popups and windows when errors are detected. Disabled by default to reduce the number of warnings and error messages the user would otherwise see.

Query statistics

These are some general statistics for data being queried by the Data Service engine.

TARGIT Server URL

Information about the location of the TARGIT Server picked up from installation files and configuration files.

The Data Discovery application and the TARGIT server may be installed on the same local machine or may be installed on different servers.

Time zone

Changing the time zone affects schedules, e.g., schedules for reloading data sources. Your Data Discovery server may be installed in a time zone that is different from the time zone where the users are located.

Plugins

A number of data source plugins require additional parameters - e.g., Timeout settings, Paths, Authentication settings, etc. - to run smoothly or to be able to access data at all.

The listed plugins have all been set with default parameters, but these can be changed and customized when necessary.

Plugin management

You can enable or disable plugins that should be available to Data Discovery users.

Cache

The query cache is by default enabled to generally improve the performance of your Data Discovery queries.

The cache should probably only be disabled if queries are showing wrong results, and only to check if caching is causing the issue.

How the Cache Works

When a cube is queried, the system checks if the same query result is already cached:

Cache miss: The query runs against the cube, and the result is stored in memory.
Cache hit: The stored result is returned without re-executing the query.
Cache is refreshed whenever the cube is reloaded with new data.
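
The hit/miss behavior described above can be illustrated with a minimal sketch. This is not the Data Service implementation: the class name, method names, and the (cube, query) key are hypothetical, and "refreshed" is modeled here as simply dropping the cached results for the reloaded cube.

```python
from typing import Callable, Dict, Tuple


class QueryCache:
    """Toy in-memory result cache keyed by (cube, query). Hypothetical names."""

    def __init__(self) -> None:
        self._results: Dict[Tuple[str, str], object] = {}
        self.hits = 0
        self.misses = 0

    def get(self, cube: str, query: str, run_query: Callable[[], object]) -> object:
        key = (cube, query)
        if key in self._results:
            # Cache hit: return the stored result without re-executing the query.
            self.hits += 1
            return self._results[key]
        # Cache miss: run the query against the cube and keep the result in memory.
        self.misses += 1
        result = run_query()
        self._results[key] = result
        return result

    def on_cube_reloaded(self, cube: str) -> None:
        # Assumption: a reload discards every cached result for that cube.
        for key in [k for k in self._results if k[0] == cube]:
            del self._results[key]
```

In this sketch, the first call for a given query runs it, a repeated call is served from memory, and the next call after on_cube_reloaded runs the query again.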

Max Cache Size

The maximum cache size controls how much memory is used for storing query results. Typical usage is 10–20 MB, with larger sizes being uncommon.

When to Increase It

Increase the cache if:

Repeated queries are not being served from cache, slowing performance.
Logs or monitoring show frequent cache evictions.
Many users are running the same queries concurrently.
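
To make the term "cache eviction" concrete, here is a small sketch of a size-bounded cache. It assumes a least-recently-used eviction policy and measures size as the byte length of each stored result. Both are assumptions for illustration, as the actual eviction policy of the Data Service cache is not described here.

```python
from collections import OrderedDict
from typing import Optional


class BoundedCache:
    """Size-limited cache that evicts least-recently-used entries (assumed policy)."""

    def __init__(self, max_bytes: int) -> None:
        self._max_bytes = max_bytes
        self._entries: "OrderedDict[str, bytes]" = OrderedDict()
        self._used = 0
        self.evictions = 0  # a high count suggests the cache is too small

    def get(self, key: str) -> Optional[bytes]:
        value = self._entries.get(key)
        if value is not None:
            # Mark the entry as recently used.
            self._entries.move_to_end(key)
        return value

    def put(self, key: str, value: bytes) -> None:
        if key in self._entries:
            self._used -= len(self._entries.pop(key))
        self._entries[key] = value
        self._used += len(value)
        # Evict the oldest entries until the cache fits within the limit again.
        while self._used > self._max_bytes and self._entries:
            _, old = self._entries.popitem(last=False)
            self._used -= len(old)
            self.evictions += 1
```

With a limit that is too small for the working set, repeated queries keep pushing each other out and the eviction counter climbs. With a larger limit, the same results stay cached.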

Lifetime of a Dynamic Cube

A dynamic cube is created from a dynamic data source, such as a SQL stored procedure. The data source itself does not hold data—data is only retrieved when a query is executed.

When a query (e.g., from a cross-tab) is made:

The cube uses dimension values as parameters to execute the stored procedure.
The query result is stored in a local .targitdb file.

To improve performance, this result is cached for a defined lifetime of the dynamic cube, avoiding repeated database calls for the same data during that period. After the lifetime expires, the stored procedure will be executed again when requested.
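
The lifetime behavior can be sketched as a time-limited cache keyed by the parameter values. This is an illustration only: the class and parameter names are hypothetical, the result is kept in memory rather than in a .targitdb file, and the clock is injectable purely to make the example easy to test.

```python
import time
from typing import Callable, Dict, Tuple


class DynamicCubeCache:
    """Keeps stored-procedure results for a fixed lifetime. Hypothetical names."""

    def __init__(self, lifetime_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._lifetime = lifetime_seconds
        self._clock = clock
        # parameter values -> (time the result was fetched, result)
        self._entries: Dict[Tuple, Tuple[float, object]] = {}

    def query(self, params: Tuple, run_procedure: Callable[[Tuple], object]) -> object:
        now = self._clock()
        entry = self._entries.get(params)
        if entry is not None and now - entry[0] < self._lifetime:
            # Still within the lifetime: reuse the stored result.
            return entry[1]
        # No result yet, or the lifetime has expired: execute the procedure again.
        result = run_procedure(params)
        self._entries[params] = (now, result)
        return result
```

A longer lifetime means fewer database calls but older data. A shorter lifetime means fresher data at the cost of more stored procedure executions.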

Proxy

(Obsolete)

Database Engine

Enable Parallel script execution

(Obsolete)

About out-of-process (OOP) generation

Why OOP is better:

Keeps the main service more responsive.
Isolates resource leaks (bad memory usage won’t affect the whole service).
Each entity creation is short-lived, so less risk of long-term memory/resource problems.

If OOP is disabled:

Everything happens inside the main process.
Side effects may occur (e.g., static/global settings interfering with other parts, memory leaks accumulating, slowdowns for other requests).
Higher risk of the service being bogged down.
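
The general idea of out-of-process work can be sketched in a few lines. The example below uses a second Python interpreter as a stand-in for a short-lived worker executable. It does not show how Data Service invokes its own helper processes, and the function name is hypothetical.

```python
import subprocess
import sys


def run_out_of_process(code: str, timeout: float = 30.0) -> str:
    """Run `code` in a separate, short-lived Python process and return its output."""
    completed = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,  # a stuck worker cannot block the caller forever
        check=True,       # a failing worker raises CalledProcessError in the caller
    )
    return completed.stdout.strip()


# The child allocates a large list. That memory belongs to the child process
# and is returned to the operating system as soon as the child exits.
print(run_out_of_process("data = [0] * 1_000_000; print(len(data))"))
```

Anything the worker leaks or changes globally disappears with the worker, which is the isolation benefit listed above. The cost is the overhead of starting a separate process for each job.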

Enable data source generation out of process

When out-of-process data source generation is enabled, each data source is processed and the resulting .targitdb file is generated by the embedded tiimport.exe. This shifts CPU and RAM usage away from the main TARGIT.DataService.exe process, allowing it to handle other requests more efficiently and with lower resource consumption. Any memory leaks within the data source generation engine become less critical, because the tiimport.exe process is terminated after the data source is processed, so no resources linger in the main process. However, out-of-process generation may increase overall CPU load, as the processing is distributed across multiple separate processes.

In-process generation

When this strategy is used (i.e., out-of-process generation is disabled), data sources are generated and processed in the main DataService.exe process. This can lead to heavy CPU usage in that process while data sources are being processed, and the Data Service UI may become unresponsive due to a lack of CPU cycles, especially on heavily loaded Data Service installations.

Enable format generation out of process

When out-of-process format generation is selected, each format is processed and the resulting .targitdb file is generated by TARGIT.DataService.FormatProcessor.exe. This moves CPU and RAM usage outside the main TARGIT.DataService.exe process, letting it handle other requests faster and with fewer resources. Any memory leaks in the format calculation engine become less important, because the process is killed once the format has been processed and no resources remain allocated in the main process. Out-of-process format generation may put more pressure on the server's CPU, since the processing is spread across separate processes.

When calling FormatProcessor.exe, the main Data Service process passes the path to the format metadata file, which is created under the storage/datasources folder (e.g., MyFormat.G-U-I-D). The metadata file contains modifications to the parent data source file that is used to generate the resulting .targitdb file.

In-process format generation

This is the default behavior for format generation.

When this strategy is used, formats are generated and processed in the main DataService.exe process. This can lead to heavy CPU usage in that process while formats are being processed, and the Data Service UI may become unresponsive due to a lack of CPU cycles, especially on heavily loaded Data Service installations.

Data Service will kill all the FormatProcessor.exe processes on service shutdown.

Enable cube generation out of process

This strategy moves the resource consumption of cube generation outside the main process. The inmemory script is written to the storage folder using the cubeName.G-U-I-D naming pattern. tiimport.exe is then called with the path to the inmemory script passed as a parameter.

In-process cube generation

This is the default cube generation strategy used in Data Service. The inmemory script is generated from the cube's metadata and executed against the targitdb object model: the script is passed to the API, which produces the inmemory table and saves it to disk (storage/cubes by default). As with in-process format generation, this increases the CPU and RAM load on the main process, which may delay Data Service responses in the UI.

Max number of parallel data source generation processes

This setting specifies the maximum number of data source generation or update operations that can run in parallel.

Default value: 0 (no limit) — All data source operations will run concurrently, limited only by system capacity.

Purpose:

To control the level of parallelism during data source generation.
To optimize CPU utilization and prevent system overload, especially on multi-core servers.

Implications:

Higher values can speed up the processing of multiple data sources in environments with sufficient CPU resources.
Lower values reduce CPU load and help maintain overall system responsiveness, especially on servers with limited processing power.

Recommendation:
Adjust this value based on the server's hardware capabilities and the expected workload. For most systems, a value equal to or slightly below the number of logical CPU cores is a good starting point.
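
As an illustration of how such a limit behaves, the sketch below caps the number of jobs running at once, treats 0 as "no limit", and derives a starting value from the number of logical CPU cores. All names are hypothetical, and threads stand in for the separate generation processes. It is not a description of the Data Service scheduler.

```python
import os
import threading
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def resolve_limit(setting: int, job_count: int) -> int:
    """Translate the setting into a worker count; 0 means 'no limit'."""
    if setting < 0:
        raise ValueError("limit must be 0 (no limit) or a positive number")
    if setting == 0:
        return max(1, job_count)  # every job may run concurrently
    return setting


def suggested_starting_limit() -> int:
    """Starting point from the recommendation: the number of logical CPU cores."""
    return os.cpu_count() or 1


def run_generation_jobs(jobs: Sequence[Callable[[], str]], max_parallel: int) -> List[str]:
    workers = resolve_limit(max_parallel, len(jobs))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # At most `workers` jobs run at the same time; the rest wait in a queue.
        return list(pool.map(lambda job: job(), jobs))


class ConcurrencyProbe:
    """Records the highest number of jobs that were running at once."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._running = 0
        self.peak = 0

    def job(self, name: str) -> Callable[[], str]:
        def work() -> str:
            with self._lock:
                self._running += 1
                self.peak = max(self.peak, self._running)
            time.sleep(0.02)  # stand-in for real generation work
            with self._lock:
                self._running -= 1
            return name
        return work
```

Running six probe jobs with max_parallel=2 never has more than two active at once, while max_parallel=0 lets all six start together. That is the trade-off between throughput and CPU load described above.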

Max number of parallel format generation processes

See description for max number of parallel data source generation processes.

Max number of parallel cube generation processes

See description for max number of parallel data source generation processes.

Disable parallel cube reloads triggered by a data source update

A cube may depend on multiple data sources.

If those sources reload at the same time, each one tries to reload the cube separately, which is wasteful and risky.

This setting makes sure the cube reloads only once, after all sources have finished updating.

In cloud setups, this is always ON to prevent file write conflicts and ensure cubes are reloaded with the newest data.
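
The "reload once, after all sources have finished" behavior can be sketched as follows. The coordinator class and its method names are hypothetical, and the sketch only records which reloads would be triggered. It makes no claim about how Data Service tracks dependencies internally.

```python
from typing import Dict, List, Set


class CubeReloadCoordinator:
    """Triggers one cube reload after all of its updating sources have finished."""

    def __init__(self, dependencies: Dict[str, Set[str]]) -> None:
        self._dependencies = dependencies  # cube name -> data sources it uses
        self._updating: Set[str] = set()   # sources currently being updated
        self._pending: Set[str] = set()    # cubes waiting for a reload
        self.reloads: List[str] = []       # reloads triggered, in order

    def source_update_started(self, source: str) -> None:
        self._updating.add(source)

    def source_update_finished(self, source: str) -> None:
        self._updating.discard(source)
        # Every cube that uses this source now needs a reload.
        for cube, sources in self._dependencies.items():
            if source in sources:
                self._pending.add(cube)
        # Reload only the cubes that have no source still updating.
        for cube in sorted(self._pending):
            if not (self._dependencies[cube] & self._updating):
                self._pending.discard(cube)
                self.reloads.append(cube)
```

If a cube depends on two sources that update at the same time, the first source to finish triggers nothing and the second triggers a single reload, instead of two competing reloads writing the same cube file.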

Use old cube data if reload fails

If a cube reload fails, you have two options:

Break the cube → clients see errors, and the admin must fix it.
Keep serving old cube data → clients still get results, but with stale data. The cube is flagged with a warning saying it’s using old data.
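
The two outcomes can be expressed as a small sketch. The CubeState fields and the reload_cube function are hypothetical names chosen for illustration and do not reflect the actual Data Service API.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CubeState:
    data: Optional[object]
    stale_warning: bool = False  # cube is serving data from before the failed reload
    broken: bool = False         # cube cannot be queried until an admin fixes it


def reload_cube(current: CubeState, load: Callable[[], object],
                use_old_data_on_failure: bool) -> CubeState:
    try:
        # Successful reload: fresh data, no warning.
        return CubeState(data=load())
    except Exception:
        if use_old_data_on_failure and current.data is not None:
            # Keep serving the previous data, flagged with a warning.
            return CubeState(data=current.data, stale_warning=True)
        # Otherwise the cube is broken and clients see errors.
        return CubeState(data=None, broken=True)
```

With the setting enabled, a failed reload leaves clients with working but stale results. With it disabled, the failure is immediately visible to everyone who uses the cube.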

Transactional logging

Enable heartbeat

Purpose: Sends usage statistics from the data service to a central “heartbeat” server.
What’s included in the stats:
- RAM usage of the data service.
- Size of cache in use.
- Number of data sources (plus details: plugin type, number of rows and columns).
- Number of cubes (with similar details).

Essentially, it’s a health check that reports how the system is being used.

Telemetry

Purpose: Sends warnings and errors from the code to a telemetry service, where they’re stored in a database.
Can be monitored proactively by engineers to detect and fix issues early.

Transaction logging

Do not enable.
A special transaction logging mechanism to track every transaction happening in the config database.
Risk if enabled:
- The transaction log file (transactionsDB.db) could grow very large (2–6 GB).
- This could cause errors or crashes in the data service.
- Fix would require manual cleanup: stop DS, rename/delete the file, restart DS.
- Recommendation: Keep this disabled.

Severity (for transaction logging)

Defines how much detail is logged.
Low severity = only logs changes (inserts, updates, deletes).
High severity = logs everything, including simple reads (SELECT queries).
Risk: High severity → massive log file growth + big performance hit.
Rule of thumb: Don’t touch this unless you really know what you’re doing.

File Storage

When users add files - e.g., Excel files and CSV files - with the 'Add file' feature in the TARGIT client, these files are by default stored in C:\Program Files\TARGIT\TARGIT Data Service\Files.

You may configure a different file storage path - however, beware of the implications on data sources already added in this way.

File Browser

For those data sources where the end-user can open a file browser to locate a file or a folder, you can set up Allow or Deny permissions to specific folders.

The permissions are applied from top to bottom.

Notifications

Enable email notifications to receive notifications about failed data source updates.

In the Settings tab, you should specify the email server to be used for forwarding these notifications. You can choose between email servers with SMTP settings or with MS Graph settings.