
OpenRefine

Enterprise Applicability
Headquarters: NA
Founded: 2010
Funding:
Founders: Metaweb Technologies, Google
Employees: 200
Private | Public: NA
COTS | OSS: OSS
Use cases: ETL Masking

GTM Insights




  • Project name: OpenRefine

  • Year project was started: 2010

  • Number of contributors: Over 200

  • Founders: Metaweb Technologies, Google

  • URL to GitHub repository: https://github.com/OpenRefine/OpenRefine

  • Brief description of the project: OpenRefine, formerly known as Google Refine, is an open source data cleaning and transformation tool. It provides a user-friendly interface for exploring, cleaning, and transforming messy data, making it more usable for analysis and processing.

  • Brief description of the data masking capabilities: OpenRefine is primarily focused on data cleaning and transformation rather than data masking specifically. While it does not provide built-in data masking capabilities, it can be used as part of a broader data pipeline to clean and transform data in a way that effectively masks sensitive information.

  • List of compatible data source types: OpenRefine can work with various data source types, including CSV files, Excel spreadsheets, JSON files, XML files, databases (via JDBC), and web-based APIs.

  • Support for anonymization models such as k-anonymity and l-diversity: OpenRefine does not directly support privacy models such as k-anonymity and l-diversity, which are designed to limit the risk of re-identification. It focuses on data cleaning and transformation rather than privacy-preserving techniques.

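  • Support for custom masking: OpenRefine provides a wide range of transformation functions and operations that can be used to implement custom masking techniques. Users can write their own transformation expressions in GREL (General Refine Expression Language), Jython, or Clojure, for example to hash, truncate, or replace the values in a sensitive column.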

  • Ability to discover and classify sensitive data: OpenRefine does not have built-in capabilities for automatic discovery and classification of sensitive data. Users need to manually identify and define sensitive data based on their domain knowledge.

  • API availability: OpenRefine exposes an HTTP API that returns JSON, the same one its web interface uses. It allows users to create projects, apply operations, and export data programmatically.

  • Integration with CI/CD solutions: OpenRefine has no dedicated CI/CD integration, but it can be included in CI/CD workflows by running it as a pipeline step and applying a saved operation history through its API as part of the data cleaning and transformation process.

  • Deployment in containers like Docker: OpenRefine can be deployed in containers like Docker, facilitating easy deployment and management.

  • Maintenance of data masking configurations in source code management: OpenRefine does not explicitly support maintaining data masking configurations in source code management. However, a project's operation history can be exported as JSON, and that file, along with exported projects, can be kept under version control using standard source code management practices.

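  • Support for subsetting: OpenRefine does not provide dedicated subsetting capabilities; it is primarily focused on data cleaning and transformation. Facets and filters can limit which rows are exported, but they do not preserve relationships across tables.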

  • Support for referential integrity: OpenRefine does not have built-in support for enforcing referential integrity constraints as it primarily focuses on data cleaning and transformation.

  • Support for automated provisioning of test environments and data: OpenRefine does not provide specific features for automated provisioning of test environments and data. However, it can be integrated into automated workflows through scripting and automation.

  • Production of data masking reports: OpenRefine does not produce specific data masking reports. However, its Undo/Redo history records every operation applied to a project, which can serve as a record of the data cleaning and transformation steps.

  • Support for synthetic creation of test data: OpenRefine does not explicitly support synthetic creation of test data. Its main focus is on data cleaning and transformation.

  • Support for data virtualization: OpenRefine does not explicitly support data virtualization.
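
As a rough illustration of the custom masking and source code management entries above, the sketch below builds an OpenRefine operation history in Python: a JSON list of `core/text-transform` steps whose GREL expressions hash or redact a column. The column names (`email`, `phone`) and the helpers `masking_operation` and `build_masking_history` are hypothetical. The JSON field names follow the shape OpenRefine typically exports from its Undo/Redo history, but they may vary between versions, so treat them as an assumption and compare against an export from your own installation.

```python
import json


def masking_operation(column, grel_expression, description):
    """Return one text-transform step shaped like an entry in an exported
    OpenRefine operation history (field names are an assumption to verify)."""
    return {
        "op": "core/text-transform",
        "engineConfig": {"facets": [], "mode": "row-based"},
        "columnName": column,
        "expression": "grel:" + grel_expression,
        # Blank the cell on failure so an error never leaves the raw value behind.
        "onError": "set-to-blank",
        "repeat": False,
        "repeatCount": 10,
        "description": description,
    }


def build_masking_history():
    """Assemble the masking steps for two hypothetical columns."""
    return [
        # Replace each email with its SHA-1 digest (unsalted, so only a weak pseudonym).
        masking_operation("email", "sha1(value)", "Hash column email"),
        # Overwrite every digit in the phone number with '#'.
        masking_operation("phone", r'value.replace(/\d/, "#")', "Redact digits in column phone"),
    ]


if __name__ == "__main__":
    # Pretty-print so the file diffs cleanly under version control.
    print(json.dumps(build_masking_history(), indent=2))
```

The printed JSON could be committed to a repository and later pasted into the Apply dialog of the Undo/Redo tab, or submitted through the API, to replay the same masking steps on another project. Choosing `set-to-blank` over `keep-original` is a deliberate trade-off here: a failed expression loses the cell rather than exposing the unmasked value.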


© 2023 by GTM Catalyst
