
IBM (InfoSphere Optim)

Enterprise Applicability
HEADQUARTERS

Armonk, NY

FOUNDED

1911

FUNDING

NA

FOUNDERS

Charles Ranlett Flint

Thomas J. Watson Sr.

Thomas J. Watson Jr.

EMPLOYEES

350000

PRIVATE | PUBLIC

Public

COTS | OSS

COTS
USE CASES

Test Data Management (TDM)

GTM Insights

GTM Domain Insights


Data de-identification is an evolving domain driven by customer expectations, which are expressed through privacy regulations (e.g., GDPR, PIPL). Many use cases require de-identification, and the emphasis on re-identification risk keeps increasing.


Taking GDPR as the benchmark for privacy regulation: even if all direct identifiers are stripped out of a data set, the data is still considered personal data if data subjects can be linked to information in the data set relating to them (per Recital 26 GDPR). In other words, under GDPR a person does not have to be named to be identifiable. If other information allows an individual to be connected to data about them, they may still be considered ‘identified’.
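To make the linking risk concrete, here is a minimal sketch in Python of a k-anonymity style check. It assumes a data set with names already removed and treats a hypothetical set of columns (`zip`, `birth_year`, `sex`) as quasi-identifiers. The records, column names, and function names are illustrative assumptions, not part of any vendor's product.

```python
from collections import Counter


def group_sizes(rows, quasi_identifiers):
    """Count how many rows share each combination of quasi-identifier values."""
    return Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)


def smallest_group_size(rows, quasi_identifiers):
    """Return k, the size of the smallest quasi-identifier group (0 if no rows)."""
    sizes = group_sizes(rows, quasi_identifiers)
    return min(sizes.values()) if sizes else 0


def singled_out(rows, quasi_identifiers):
    """Return rows whose quasi-identifier combination is unique in the data set."""
    sizes = group_sizes(rows, quasi_identifiers)
    return [
        row for row in rows
        if sizes[tuple(row[q] for q in quasi_identifiers)] == 1
    ]


# Hypothetical records: direct identifiers (names) are already stripped.
records = [
    {"zip": "10504", "birth_year": 1980, "sex": "F", "diagnosis": "asthma"},
    {"zip": "10504", "birth_year": 1980, "sex": "F", "diagnosis": "diabetes"},
    {"zip": "10504", "birth_year": 1975, "sex": "M", "diagnosis": "flu"},
]
quasi_ids = ["zip", "birth_year", "sex"]

k = smallest_group_size(records, quasi_ids)
unique_rows = singled_out(records, quasi_ids)
print("k =", k)
print("rows that can be singled out:", unique_rows)
```

In this example the third record is the only one with its combination of ZIP code, birth year, and sex, so anyone who knows those three facts about a person can connect that person to the diagnosis even though no name is present.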


Of the vendors in this space, here are some notables:


  • Delphix - Delphix's differentiation is its snapshot-tree approach to virtual databases, which creates whole, unique databases with a minimal storage footprint. Its engines can be deployed ephemerally, it offers a broad set of masking generators, and it can integrate with CI/CD for automation. Its challenge is that it does NOT offer synthetic data (on roadmap) and does NOT provide data subsetting (not on roadmap). Best practice is to subset data to the specific test being conducted, and to use synthetic data for some functional requirements (FRs: unit/system tests) and non-functional requirements (NFRs: load, endurance, performance tests).


Use Delphix if you need full copies of the organization's databases.

  • Tonic.ai - Tonic's differentiation is in synthetic data generation (best in market), with a very broad set of generators, and in its ability to provide differential privacy. Its solution can evaluate new combinations of data in test sets to determine whether row linking or singling out may now be possible, and can add calibrated noise to the aggregated data set. Tonic's challenge is that it does NOT provide data virtualization or virtual database capability, focusing instead on synthetic data generation and privacy ONLY.

Use Tonic if you prioritize data security (synthetic data and differential privacy), but be cautious about the usability of the resulting data.

  • K2View - K2View's differentiation is its data virtualization. Creating a composite view from many data sources is a powerful abstraction layer. K2View generates synthetic data across a broad set of sources, and because it owns the abstraction layer it can also perform dynamic masking, which Tonic and Delphix cannot. K2View's challenge is its general applicability: it does not go as deep into data security as Tonic, nor is it as focused on creating copies of underlying databases as Delphix.
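As a rough illustration of what "calibrated noise" means in differential privacy, the sketch below applies the textbook Laplace mechanism to a counting query. It is not a description of how any vendor listed above implements the feature. The function names, the sample data, and the choice of `epsilon` are assumptions made for the example.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a Laplace(0, scale) distribution.

    The difference of two independent exponential samples with mean `scale`
    follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(values, predicate, epsilon, rng=None):
    """Return a count with Laplace noise calibrated to the privacy budget epsilon.

    A counting query changes by at most 1 when one person is added or removed,
    so its sensitivity is 1 and the noise scale is 1 / epsilon.
    """
    if rng is None:
        rng = random.Random()
    true_count = sum(1 for v in values if predicate(v))
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)


# Hypothetical ages in a test data set.
ages = [23, 35, 41, 52, 29, 67, 38]
result = noisy_count(ages, lambda age: age >= 40, epsilon=1.0)
print("noisy count of people aged 40+:", round(result, 2))
```

A smaller `epsilon` gives stronger privacy but noisier answers, which is the usability trade-off noted for privacy-focused tools.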




Source Types:

  • Databases including DB2, Oracle, SQL Server, Sybase, Informix, and more


Masking (classifiers & generators):

  • K-Anonymity and L-Diversity: Not specified

  • Custom Masking: Yes

  • Data Classification: Yes, in combination with other IBM solutions like IBM Security Guardium


Automation Support:

  • API Support: Yes

  • CI/CD Integrations: Not specified


Deployment Types:

  • SaaS or On-Premises: Both

  • Container Deployment: Not specified


Data Management:

  • Source Code Management: Not specified

  • Subsetting: Yes

  • Referential Integrity: Yes

  • Automated Provisioning: Not specified

  • Data Masking Reports: Yes

  • Synthetic Test Data: Yes
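The sketch below shows one generic way that masking and referential integrity can coexist: a deterministic, keyed mask maps the same input to the same output, so masked foreign keys still join to masked primary keys. It is an illustration only and does not describe IBM InfoSphere Optim's masking functions. The key, table contents, and `mask_id` helper are hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret; in practice this would come from a secrets manager.
SECRET_KEY = b"example-key-not-for-production"


def mask_id(value, key=SECRET_KEY):
    """Deterministically mask an identifier with a keyed hash (HMAC-SHA256)."""
    digest = hmac.new(key, str(value).encode("utf-8"), hashlib.sha256).hexdigest()
    return "CUST-" + digest[:12]


# Hypothetical parent and child tables linked by customer_id.
customers = [
    {"customer_id": 101, "name": "Alice"},
    {"customer_id": 102, "name": "Bob"},
]
orders = [
    {"order_id": 1, "customer_id": 101},
    {"order_id": 2, "customer_id": 101},
    {"order_id": 3, "customer_id": 102},
]

masked_customers = [
    {**c, "customer_id": mask_id(c["customer_id"]), "name": "REDACTED"}
    for c in customers
]
masked_orders = [
    {**o, "customer_id": mask_id(o["customer_id"])}
    for o in orders
]

# Check referential integrity: every masked order must point at a masked customer.
known_ids = {c["customer_id"] for c in masked_customers}
orphans = [o for o in masked_orders if o["customer_id"] not in known_ids]
print("orphaned orders after masking:", len(orphans))
```

Because the mask is keyed, the mapping cannot be reproduced without the key, and because it is deterministic, the same customer receives the same masked ID in every table.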


Gartner Data Masking Review





https://www.ibm.com/docs/en/iodg/11.7?topic=data-masking-functions



