Causal-Copilot: An Autonomous Causal Analysis Agent

Autonomous Causal Analysis with Causal-Copilot

The Challenge

Causal analysis is fundamental to scientific discovery and evidence-based decision-making. However, despite rapid advances in causal learning methods, a significant gap remains between theoretical sophistication and practical applicability. Domain experts often cannot leverage these powerful tools due to:

  • Algorithmic complexity: 20+ methods, each with distinct assumptions and hyperparameters
  • Steep learning curves: expertise is required in both causal theory and implementation
  • Configuration challenges: the appropriate method depends on specific data characteristics

Our Solution

Causal-Copilot is an LLM-powered autonomous agent that democratizes causal analysis by automating the entire analytical pipeline. Users simply upload their data and describe their analysis goals in natural language—Causal-Copilot handles the rest.

End-to-end workflow: from natural language query to comprehensive causal analysis report

System Architecture

Causal-Copilot is built on a modular architecture with five core components:

Modular architecture of Causal-Copilot

| Module | Function |
| --- | --- |
| User Interaction | Natural language query parsing, domain knowledge integration, interactive feedback loop |
| Preprocessing | Data cleaning, schema extraction, statistical diagnostics (linearity, stationarity, heterogeneity) |
| Algorithm Selection | LLM-guided filtering, ranking based on data characteristics, hyperparameter configuration |
| Postprocessing | Bootstrap confidence evaluation, LLM-guided graph refinement, user revision loop |
| Report Generation | Causal graph visualization, result interpretation, LaTeX report compilation |
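The bootstrap confidence evaluation named in the Postprocessing module can be illustrated with a minimal sketch. This is not Causal-Copilot's implementation: `toy_discover` is a hypothetical stand-in (simple correlation thresholding) for whichever discovery algorithm was selected, and `bootstrap_edge_confidence` is an illustrative name. The idea is to rerun discovery on resampled data and report how often each edge reappears.

```python
import numpy as np


def toy_discover(data: np.ndarray, threshold: float = 0.3) -> np.ndarray:
    """Hypothetical stand-in for a real discovery algorithm.

    Links two variables whenever their absolute correlation exceeds
    `threshold`. Returns a symmetric 0/1 adjacency matrix.
    """
    corr = np.corrcoef(data, rowvar=False)
    adj = (np.abs(corr) > threshold).astype(int)
    np.fill_diagonal(adj, 0)  # no self-loops
    return adj


def bootstrap_edge_confidence(data, discover, n_boot=100, seed=0):
    """Fraction of bootstrap resamples in which each edge is recovered."""
    rng = np.random.default_rng(seed)
    n_samples, n_vars = data.shape
    counts = np.zeros((n_vars, n_vars))
    for _ in range(n_boot):
        # Resample rows with replacement, then rerun discovery.
        idx = rng.integers(0, n_samples, size=n_samples)
        counts += discover(data[idx])
    return counts / n_boot


# Synthetic example: y depends on x, z is independent of both.
rng = np.random.default_rng(42)
x = rng.normal(size=500)
y = 2.0 * x + rng.normal(size=500)
z = rng.normal(size=500)
data = np.column_stack([x, y, z])

confidence = bootstrap_edge_confidence(data, toy_discover)
print(np.round(confidence, 2))
```

Edges with a frequency near 1 are stable under resampling, while low-frequency edges are natural candidates for the refinement and user revision steps.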

Supported Algorithms

Causal-Copilot integrates 20+ state-of-the-art algorithms across three categories:

Causal Discovery

| Family | Algorithms | Data Type |
| --- | --- | --- |
| Constraint-based | PC, FCI, CD-NOD, PCMCI | Tabular & Time Series |
| Score-based | GES, FGES, XGES, GRaSP | Tabular |
| Continuous Optimization | NOTEARS, GOLEM, CALM, CORL, DYNOTEARS | Tabular & Time Series |
| LiNGAM Family | ICA-LiNGAM, DirectLiNGAM, VAR-LiNGAM | Tabular & Time Series |
| MB-based | InterIAMB, IAMBnPC, HITON-MB, BAMB | Tabular |
| Granger Causality | Linear & Nonlinear Granger | Time Series |
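As a rough illustration of the first filtering step in algorithm selection, the table above can be encoded as a small catalog and filtered by data type. The names `Family`, `CATALOG`, and `candidates` are hypothetical, and the real system uses LLM-guided filtering and ranking rather than a fixed rule. Data types are listed per family here, so a family marked for both types may still contain algorithms that apply to only one of them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Family:
    name: str
    algorithms: tuple
    data_types: frozenset  # subset of {"tabular", "time_series"}


BOTH = frozenset({"tabular", "time_series"})
TABULAR = frozenset({"tabular"})
TIME_SERIES = frozenset({"time_series"})

# Transcribed from the Causal Discovery table (family-level data types).
CATALOG = [
    Family("Constraint-based", ("PC", "FCI", "CD-NOD", "PCMCI"), BOTH),
    Family("Score-based", ("GES", "FGES", "XGES", "GRaSP"), TABULAR),
    Family("Continuous Optimization",
           ("NOTEARS", "GOLEM", "CALM", "CORL", "DYNOTEARS"), BOTH),
    Family("LiNGAM Family",
           ("ICA-LiNGAM", "DirectLiNGAM", "VAR-LiNGAM"), BOTH),
    Family("MB-based", ("InterIAMB", "IAMBnPC", "HITON-MB", "BAMB"), TABULAR),
    Family("Granger Causality",
           ("Linear Granger", "Nonlinear Granger"), TIME_SERIES),
]


def candidates(data_type: str) -> list:
    """Return the families whose listed data types include `data_type`."""
    return [family for family in CATALOG if data_type in family.data_types]


for family in candidates("time_series"):
    print(family.name, "->", ", ".join(family.algorithms))
```

A coarse filter like this only narrows the search space; ranking by data characteristics and hyperparameter configuration would follow as separate steps.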

Causal Inference

| Method | Description |
| --- | --- |
| Double Machine Learning | LinearDML, SparseLinearDML, CausalForestDML |
| Doubly Robust Learning | LinearDRL, SparseLinearDRL, ForestDRL |
| Instrumental Variables | DRIV Family (Linear, Sparse, Forest) |
| Matching Methods | PSM, CEM |
| Counterfactual | DoWhy-based counterfactual estimation |
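To make the Double Machine Learning entry concrete, here is a minimal NumPy sketch of the partialling-out idea behind it, assuming a partially linear model with a constant treatment effect. It is not the LinearDML estimator itself: ordinary least squares stands in for the machine-learning nuisance models, and `fit_linear`, `predict_linear`, and `dml_effect` are illustrative names.

```python
import numpy as np


def fit_linear(X, y):
    """Least-squares fit with an intercept; returns the coefficient vector."""
    Xb = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return coef


def predict_linear(coef, X):
    return np.column_stack([np.ones(len(X)), X]) @ coef


def dml_effect(Y, T, X, n_folds=2, seed=0):
    """Cross-fitted residual-on-residual estimate of a constant effect.

    1. On each held-out fold, predict Y and T from X using models
       trained on the remaining folds (cross-fitting).
    2. Regress the outcome residuals on the treatment residuals.
    """
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(Y)), n_folds)
    y_res = np.empty(len(Y))
    t_res = np.empty(len(Y))
    for k, test_idx in enumerate(folds):
        train_idx = np.concatenate(
            [fold for j, fold in enumerate(folds) if j != k]
        )
        y_model = fit_linear(X[train_idx], Y[train_idx])
        t_model = fit_linear(X[train_idx], T[train_idx])
        y_res[test_idx] = Y[test_idx] - predict_linear(y_model, X[test_idx])
        t_res[test_idx] = T[test_idx] - predict_linear(t_model, X[test_idx])
    return float(t_res @ y_res / (t_res @ t_res))


# Synthetic data with a known treatment effect of 2.0 and confounders X.
rng = np.random.default_rng(1)
n = 2000
X = rng.normal(size=(n, 3))
T = X @ np.array([1.0, -0.5, 0.2]) + rng.normal(size=n)
Y = 2.0 * T + X @ np.array([0.5, 0.5, -1.0]) + rng.normal(size=n)

theta_hat = dml_effect(Y, T, X)
print(f"estimated effect: {theta_hat:.3f}")
```

In practice the nuisance models would be flexible learners, and a library estimator would also provide confidence intervals and heterogeneous effects.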

Auxiliary Analysis

  • Feature Importance: SHAP-based attribution
  • Anomaly Attribution: Causal structure-based root cause analysis

Performance

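Across the benchmark scenarios below, Causal-Copilot matches or outperforms individual algorithms in most settings. The exception is the default tabular setting (p=10, n=1000), where PC, FCI, and GES score slightly higher (0.91 to 0.92 versus 0.89).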

Tabular Data (F1 Score)

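| Scenario | Causal-Copilot | PC | FCI | GES | DirectLiNGAM |
| --- | --- | --- | --- | --- | --- |
| Default (p=10, n=1000) | 0.89 | 0.92 | 0.91 | 0.92 | 0.22 |
| Dense Graph (p=0.5) | 0.65 | 0.41 | 0.44 | 0.40 | 0.26 |
| Large Scale (p=50) | 0.94 | 0.70 | 0.79 | N/A | 0.23 |
| Super Large (p=100) | 0.90 | 0.68 | 0.74 | N/A | N/A |
| Extreme Large (p=500) | 0.60 | N/A | N/A | N/A | N/A |
| Non-Gaussian Noise | 0.97 | 0.84 | 0.85 | 0.86 | 0.57 |
| Heterogeneous Domains | 0.77 | 0.51 | 0.62 | 0.40 | 0.23 |
| Measurement Error | 0.86 | 0.69 | 0.80 | 0.79 | 0.28 |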
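For readers unfamiliar with how an F1 score applies to graphs, the sketch below computes it over directed edges from two adjacency matrices. The source does not state the exact evaluation protocol (for example, whether edge orientation counts), so the directed-edge convention and the function name `edge_f1` are assumptions made for illustration.

```python
import numpy as np


def edge_f1(true_adj, pred_adj) -> float:
    """F1 over directed edges; entry [i, j] != 0 means an edge i -> j."""
    true_adj = np.asarray(true_adj, dtype=bool)
    pred_adj = np.asarray(pred_adj, dtype=bool)
    tp = np.sum(true_adj & pred_adj)   # edges present in both graphs
    fp = np.sum(~true_adj & pred_adj)  # predicted but not in the truth
    fn = np.sum(true_adj & ~pred_adj)  # true edges that were missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return float(2 * precision * recall / (precision + recall))


# True graph: 0 -> 1 -> 2. Prediction gets 0 -> 1 right but reverses 1 -> 2.
true_adj = np.array([[0, 1, 0],
                     [0, 0, 1],
                     [0, 0, 0]])
pred_adj = np.array([[0, 1, 0],
                     [0, 0, 0],
                     [0, 1, 0]])

print(edge_f1(true_adj, pred_adj))  # 0.5 (one hit, one false edge, one miss)
```

Under this convention a reversed edge counts as both a false positive and a false negative, which is why the example scores 0.5 rather than 1.0.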

Time Series Data (F1 Score)

| Scenario | Causal-Copilot | PCMCI | DYNOTEARS | VARLiNGAM |
| --- | --- | --- | --- | --- |
| Small (p=5, l=3) | 0.98 | 0.92 | 0.97 | 0.97 |
| Large Lag (l=20) | 0.85 | 0.84 | 0.77 | 0.77 |
| Very Large (p=100) | 0.18 | N/A | N/A | 0.12 |

Interactive Demo

Web-based interface for interactive causal analysis

Try it yourself in the Hugging Face Space.


Sample Output

Example PDF report: dataset statistics, causal graph visualization, and analysis results

Quick Start

# Command-line usage
python main.py --data_file data.csv --apikey YOUR_KEY --initial_query "Discover causal relationships"

# Web interface deployment
python Gradio/demo.py

Citation

@article{causalcopilot2025,
  title={Causal-Copilot: An Autonomous Causal Analysis Agent},
  author={Wang, Xinyue and Zhou, Kun and Wu, Wenyi and others},
  journal={arXiv preprint arXiv:2504.13263},
  year={2025}
}
Xinyue Wang
PhD Student (HDSI @ UCSD)

Scalable causal learning, world models, and agents.