causalAssembly

📅 April 1, 2024

Description

Semisynthetic data generator for benchmarking causal discovery algorithms with realistic production data and known ground truth causal relationships.

Languages

Python

Status: maintained

causalAssembly is a Python tool designed to facilitate benchmarking of causal discovery methods. It generates semisynthetic data based on real-world manufacturing assembly line measurements with established ground truth causal relationships.

Key Features

  • Generates benchmark datasets with known causal structures
  • Based on real production data from assembly lines
  • Uses distributional random forests for privacy-preserving data generation
  • Supports reliable validation of causal discovery algorithms against a known ground truth
  • Guarantees that the generated data is Markovian with respect to the ground-truth graph
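
To make the Markovian guarantee listed above concrete: data is Markovian with respect to a directed acyclic graph when its joint distribution factorizes into one conditional per node given that node's parents. The sketch below illustrates the idea with ancestral sampling on a hypothetical three-node graph. It does not use causalAssembly's API. The linear Gaussian conditionals are a simple stand-in for the distributional random forests the tool relies on, and the names `parents`, `weights`, and `sample_from_dag` are invented for illustration.

```python
import numpy as np

# Hypothetical ground-truth DAG: x1 -> x2 -> x3 and x1 -> x3.
# Keys are listed in topological order (parents before children).
parents = {"x1": [], "x2": ["x1"], "x3": ["x1", "x2"]}

# Illustrative edge weights (an assumption, not fitted to real data).
weights = {("x1", "x2"): 0.8, ("x1", "x3"): -0.5, ("x2", "x3"): 1.2}


def sample_from_dag(n_samples, seed=0):
    """Draw samples node by node, each conditional on its parents only."""
    rng = np.random.default_rng(seed)
    data = {}
    for node, pa in parents.items():
        # Conditional mean depends only on already-sampled parent values.
        mean = sum(weights[(p, node)] * data[p] for p in pa)
        data[node] = mean + rng.normal(size=n_samples)
    return data


samples = sample_from_dag(1000)
```

Because every variable is drawn from a conditional that depends only on its parents, the resulting joint distribution factorizes according to the graph by construction, which is the sense in which sampled data adheres to the ground truth.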

Use Cases

  • Benchmarking causal discovery algorithms
  • Validating new causal inference methods
  • Testing robustness of structure learning approaches
  • Privacy-preserving data generation for research
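
As an illustration of the benchmarking use case, one common way to score a causal discovery result against a known ground truth is the structural Hamming distance (SHD). The sketch below is a minimal, generic implementation on adjacency matrices. It is not part of causalAssembly, and the function name `structural_hamming_distance` and the example matrices are assumptions made for this example.

```python
import numpy as np


def structural_hamming_distance(true_adj, est_adj):
    """Count edge insertions, deletions, and reversals between two DAGs.

    Both inputs are square 0/1 adjacency matrices where entry [i, j] == 1
    means an edge i -> j. A reversed edge is counted once.
    """
    true_adj = np.asarray(true_adj, dtype=bool)
    est_adj = np.asarray(est_adj, dtype=bool)
    diff = true_adj != est_adj
    # A reversal shows up as a mismatch at both [i, j] and [j, i];
    # merging the two triangles counts each node pair at most once.
    pair_mismatch = diff | diff.T
    return int(np.triu(pair_mismatch, k=1).sum())


# Hypothetical ground truth: 0 -> 1 -> 2.
ground_truth = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
# Hypothetical estimate with the edge 1 -> 2 reversed.
estimate = [[0, 1, 0], [0, 0, 0], [0, 1, 0]]

shd = structural_hamming_distance(ground_truth, estimate)
```

A lower SHD means the estimated graph is closer to the ground truth, with 0 indicating an exact match. Some authors count a reversal as two errors instead of one, so the convention should be stated when reporting results.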

Citation

If you use causalAssembly in your research, please cite:

Göbler, K., Windisch, T., Drton, M., Pychynski, T., Roth, M. & Sonntag, S. (2024). causalAssembly: Generating Realistic Production Data for Benchmarking Causal Discovery. Proceedings of Machine Learning Research, 236, 609-642.