#correlation #statistics #coefficients #ord #xi-correlation #chatterjee #sourav

bin+lib xicor

An implementation of Sourav Chatterjee's xi-correlation coefficient

1 unstable release

0.1.0 Dec 31, 2024

#791 in Math

140 downloads per month

MIT/Apache

13KB
157 lines

Xicor

This crate provides a reasonably efficient implementation of Sourav Chatterjee's xi-correlation coefficient, based on the original paper.

Chatterjee's xi measures how strongly one variable depends on another, in a much more general sense than, for example, Pearson's correlation coefficient. Suppose we have a sequence of random x values uniformly distributed from zero to tau (2π), and for each one we compute y = cos(x). Pearson's correlation coefficient will be roughly zero for this data, because it measures only linear dependence and cos(x) has no linear trend over a full period. Chatterjee's xi, on the other hand, will be close to 1, indicating that y is very nearly a function of x, regardless of what that function may be.
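To make the contrast concrete, here is a small self-contained sketch. It deliberately does not use this crate's API: `xi_f64` and `pearson` are hypothetical helpers written only for illustration, following the tie-aware formula from the paper as we understand it, xi = 1 - n * sum|r[i+1] - r[i]| / (2 * sum l[i] * (n - l[i])). To avoid an external random number generator it uses evenly spaced x values instead of random ones, and it takes y = cos(x) over one full period, a case where the linear correlation vanishes.

```rust
use std::f64::consts::TAU;

/// Illustrative xi coefficient for f64 data (hypothetical helper, not the crate's API).
/// Assumes no NaN values and that y is not constant.
fn xi_f64(x: &[f64], y: &[f64]) -> f64 {
    assert_eq!(x.len(), y.len());
    let n = x.len();
    assert!(n >= 2);

    // Indices of the pairs, ordered by x (ties in x keep their input order here;
    // the paper breaks them at random).
    let mut order: Vec<usize> = (0..n).collect();
    order.sort_by(|&a, &b| x[a].total_cmp(&x[b]));

    // Sorted copy of y, used to look up ranks by binary search.
    let mut ys = y.to_vec();
    ys.sort_by(|a, b| a.total_cmp(b));

    // r(v) = number of y values <= v, l(v) = number of y values >= v.
    let r = |v: f64| ys.partition_point(|&u| u <= v) as f64;
    let l = |v: f64| (n - ys.partition_point(|&u| u < v)) as f64;

    // Sum of absolute rank changes between x-neighbours.
    let num: f64 = order
        .windows(2)
        .map(|w| (r(y[w[1]]) - r(y[w[0]])).abs())
        .sum();
    let den: f64 = y.iter().map(|&v| l(v) * (n as f64 - l(v))).sum();

    1.0 - (n as f64) * num / (2.0 * den)
}

/// Plain Pearson correlation coefficient, for comparison.
fn pearson(x: &[f64], y: &[f64]) -> f64 {
    let n = x.len() as f64;
    let mx = x.iter().sum::<f64>() / n;
    let my = y.iter().sum::<f64>() / n;
    let (mut sxy, mut sxx, mut syy) = (0.0, 0.0, 0.0);
    for (a, b) in x.iter().zip(y) {
        sxy += (a - mx) * (b - my);
        sxx += (a - mx).powi(2);
        syy += (b - my).powi(2);
    }
    sxy / (sxx * syy).sqrt()
}

fn main() {
    let n = 1000;
    // Evenly spaced sample points covering one full period [0, tau).
    let x: Vec<f64> = (0..n).map(|i| TAU * (i as f64 + 0.5) / n as f64).collect();
    let y: Vec<f64> = x.iter().map(|v| v.cos()).collect();

    println!("pearson = {:+.4}", pearson(&x, &y));
    println!("xi      = {:+.4}", xi_f64(&x, &y));
}
```

On this data the Pearson value should be indistinguishable from zero while xi lands just below 1; xi only approaches 1 as n grows, so a small sample will report a visibly lower value even for a perfect functional relationship.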

Highlights

  • Extremely simple to use (just call xicor(), xicorf(), etc., with two slices containing the data)
  • Generic over Ord, as xi does not require calculations on the elements themselves, only the ability to compare them. In principle even strings could be correlated in this manner, using their lexicographic order.
  • Quite fast. In release mode on a 12-year-old machine (Dell M4700), xicorf processed 1,000,000 pairs in 0.33 seconds. Profiling showed that about 80% of the run time was spent in the standard library's sorting routines.
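The point about Ord can be illustrated with a short sketch. This is not the crate's code and makes no claim about the real signatures of xicor() or xicorf(); `xi_ord` is a hypothetical stand-in that uses only comparisons, which is enough to compute ranks. The strings are zero-padded so that their lexicographic order matches the numeric order of the values they encode.

```rust
/// Illustrative xi coefficient for any totally ordered types
/// (hypothetical helper, not the crate's API). Assumes y is not constant.
fn xi_ord<X: Ord, Y: Ord>(x: &[X], y: &[Y]) -> f64 {
    assert_eq!(x.len(), y.len());
    let n = x.len();
    assert!(n >= 2);

    // Indices of the pairs, ordered by x using only comparisons.
    let mut order: Vec<usize> = (0..n).collect();
    order.sort_by(|&a, &b| x[a].cmp(&x[b]));

    // Sorted references to y, used for rank lookups by binary search.
    let mut ys: Vec<&Y> = y.iter().collect();
    ys.sort();

    // r(v) = number of y values <= v, l(v) = number of y values >= v.
    let r = |v: &Y| ys.partition_point(|u| *u <= v) as f64;
    let l = |v: &Y| (n - ys.partition_point(|u| *u < v)) as f64;

    let num: f64 = order
        .windows(2)
        .map(|w| (r(&y[w[1]]) - r(&y[w[0]])).abs())
        .sum();
    let den: f64 = y
        .iter()
        .map(|v| {
            let lv = l(v);
            lv * (n as f64 - lv)
        })
        .sum();

    1.0 - (n as f64) * num / (2.0 * den)
}

fn main() {
    // x is an integer; y is a string that depends on x through a
    // non-monotone function, (x - 250)^2, zero-padded to six digits.
    let x: Vec<u32> = (0..500).collect();
    let y: Vec<String> = x
        .iter()
        .map(|&i| {
            let d = i as i64 - 250;
            format!("{:06}", d * d)
        })
        .collect();

    println!("xi(integers, strings) = {:.4}", xi_ord(&x, &y));
}
```

Because only `cmp` is ever called on the elements, the same function works unchanged for integers, strings, tuples, or any other type implementing Ord. Floating-point types do not implement Ord, which is presumably why a separate entry point such as xicorf exists.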

Progress

  • Calculation of the xi coefficient itself
  • P-values for testing independence
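As a rough idea of what the p-value item involves, the sketch below uses the asymptotic result from the original paper as we understand it: when x and y are independent and y is continuous (no ties), sqrt(n) * xi is approximately normal with mean 0 and variance 2/5. Everything here is an assumption made for illustration: `xi_p_value` and `erfc_approx` are hypothetical names, not part of this crate, and the complementary error function uses the Abramowitz and Stegun polynomial approximation (formula 7.1.26) because the Rust standard library does not provide one.

```rust
use std::f64::consts::SQRT_2;

/// Approximate complementary error function (Abramowitz & Stegun 7.1.26,
/// absolute error around 1.5e-7). Hypothetical helper for this sketch.
fn erfc_approx(x: f64) -> f64 {
    let z = x.abs();
    let t = 1.0 / (1.0 + 0.3275911 * z);
    let poly = t
        * (0.254829592
            + t * (-0.284496736
                + t * (1.421413741 + t * (-1.453152027 + t * 1.061405429))));
    let e = poly * (-z * z).exp();
    // The polynomial is valid for x >= 0; use erfc(-x) = 2 - erfc(x) otherwise.
    if x >= 0.0 { e } else { 2.0 - e }
}

/// One-sided asymptotic p-value for the null hypothesis of independence,
/// assuming continuous y with no ties (hypothetical helper, not the crate's API).
fn xi_p_value(xi: f64, n: usize) -> f64 {
    // Under independence, sqrt(n) * xi ~ N(0, 2/5), so standardise by sqrt(2/5).
    let z = xi * (n as f64 / 0.4).sqrt();
    // Upper tail of the standard normal: 1 - Phi(z) = erfc(z / sqrt(2)) / 2.
    0.5 * erfc_approx(z / SQRT_2)
}

fn main() {
    for &(xi, n) in &[(0.0, 1000usize), (0.04, 1000), (0.2, 1000)] {
        println!("xi = {:.2}, n = {} -> p = {:.3e}", xi, n, xi_p_value(xi, n));
    }
}
```

The test is one-sided because large positive xi is the evidence against independence. With ties in y the limiting variance is different and has to be estimated from the data, so this shortcut should not be used for discrete or heavily tied data.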

Dependencies

~245KB