This document right now is normative and aspirational, not a description of the testing that's currently done .
The academic keyword to search for in relation to this document is "algebra of random variables ". Squiggle doesn't yet support getting the standard deviation, denoted by σ \sigma σ , but such support could yet be added.
m e a n ( f + g ) = m e a n ( f ) + m e a n ( g ) mean(f+g) = mean(f) + mean(g) m e an ( f + g ) = m e an ( f ) + m e an ( g )
σ ( f + g ) = σ ( f ) 2 + σ ( g ) 2 \sigma(f+g) = \sqrt{\sigma(f)^2 + \sigma(g)^2} σ ( f + g ) = σ ( f ) 2 + σ ( g ) 2
In the case of normal distributions,
m e a n ( n o r m a l ( a , b ) + n o r m a l ( c , d ) ) = m e a n ( n o r m a l ( a + c , b 2 + d 2 ) ) mean(normal(a,b) + normal(c,d)) = mean(normal(a+c, \sqrt{b^2 + d^2})) m e an ( n or ma l ( a , b ) + n or ma l ( c , d )) = m e an ( n or ma l ( a + c , b 2 + d 2 ))
m e a n ( f − g ) = m e a n ( f ) − m e a n ( g ) mean(f-g) = mean(f) - mean(g) m e an ( f − g ) = m e an ( f ) − m e an ( g )
σ ( f − g ) = σ ( f ) 2 + σ ( g ) 2 \sigma(f-g) = \sqrt{\sigma(f)^2 + \sigma(g)^2} σ ( f − g ) = σ ( f ) 2 + σ ( g ) 2
m e a n ( f ⋅ g ) = m e a n ( f ) ⋅ m e a n ( g ) mean(f \cdot g) = mean(f) \cdot mean(g) m e an ( f ⋅ g ) = m e an ( f ) ⋅ m e an ( g )
σ ( f ⋅ g ) = ( σ ( f ) 2 + m e a n ( f ) ) ⋅ ( σ ( g ) 2 + m e a n ( g ) ) − ( m e a n ( f ) ⋅ m e a n ( g ) ) 2 \sigma(f \cdot g) = \sqrt{ (\sigma(f)^2 + mean(f)) \cdot (\sigma(g)^2 + mean(g)) - (mean(f) \cdot mean(g))^2} σ ( f ⋅ g ) = ( σ ( f ) 2 + m e an ( f )) ⋅ ( σ ( g ) 2 + m e an ( g )) − ( m e an ( f ) ⋅ m e an ( g ) ) 2
Divisions are tricky, and in general we don't have good expressions to characterize properties of ratios. In particular, the ratio of two normals is a Cauchy distribution, which doesn't have to have a mean.
Specifying the pdf of the sum/multiplication/... of distributions as a function of the pdfs of the individual arguments can still be done. But it requires integration. My sense is that this is still doable, and I (Nuño) provide some pseudocode to do this.
Let f , g f, g f , g be two independently distributed functions. Then, the pdf of their sum, evaluated at a point z z z , expressed as ( f + g ) ( z ) (f + g)(z) ( f + g ) ( z ) , is given by:
( f + g ) ( z ) = ∫ − ∞ ∞ f ( x ) ⋅ g ( z − x ) d x (f + g)(z)= \int_{-\infty}^{\infty} f(x)\cdot g(z-x) \,dx ( f + g ) ( z ) = ∫ − ∞ ∞ f ( x ) ⋅ g ( z − x ) d x
See a proof sketch here
Here is some pseudocode to approximate this:
// pdf1 and pdf2 are pdfs,
// and cdf1 and cdf2 are their corresponding cdfs
let epsilonForBounds = 2 ** - 16 ;
let getBounds = ( cdf ) => {
let cdf_min = - 1 ;
let cdf_max = 1 ;
let n = 0 ;
while (
( cdf (cdf_min) > epsilonForBounds || 1 - cdf (cdf_max) > epsilonForBounds) &&
n < 10
) {
if ( cdf (cdf_min) > epsilonForBounds) {
cdf_min = cdf_min * 2 ;
}
if ( 1 - cdf (cdf_max) > epsilonForBounds) {
cdf_max = cdf_max * 2 ;
}
}
return [cdf_min, cdf_max];
};
let epsilonForIntegrals = 2 ** - 16 ;
let pdfOfSum = ( pdf1 , pdf2 , cdf1 , cdf2 , z ) => {
let bounds1 = getBounds (cdf1);
let bounds2 = getBounds (cdf2);
let bounds = [
Math. min (bounds1[ 0 ], bounds2[ 0 ]),
Math. max (bounds1[ 1 ], bounds2[ 1 ]),
];
let result = 0 ;
for ( let x = bounds[ 0 ]; (x = x + epsilonForIntegrals); x < bounds[ 1 ]) {
let delta = pdf1 (x) * pdf2 (z - x);
result = result + delta * epsilonForIntegrals;
}
return result;
};
With ∀ d i s t , p d f : = x ↦ pdf ( d i s t , x ) ∧ c d f : = x ↦ cdf ( d i s t , x ) ∧ q u a n t i l e : = p ↦ quantile ( d i s t , p ) \forall dist, pdf := x \mapsto \texttt{pdf}(dist, x) \land cdf := x \mapsto \texttt{cdf}(dist, x) \land quantile := p \mapsto \texttt{quantile}(dist, p) ∀ d i s t , p df := x ↦ pdf ( d i s t , x ) ∧ c df := x ↦ cdf ( d i s t , x ) ∧ q u an t i l e := p ↦ quantile ( d i s t , p ) ,
∀ x ∈ ( 0 , 1 ) , c d f ( q u a n t i l e ( x ) ) = x ∧ ∀ x ∈ dom ( c d f ) , x = q u a n t i l e ( c d f ( x ) ) \forall x \in (0,1), cdf(quantile(x)) = x \land \forall x \in \texttt{dom}(cdf), x = quantile(cdf(x)) ∀ x ∈ ( 0 , 1 ) , c df ( q u an t i l e ( x )) = x ∧ ∀ x ∈ dom ( c df ) , x = q u an t i l e ( c df ( x ))
cod ( c d f ) = ( 0 , 1 ) = cod ( p d f ) \texttt{cod}(cdf) = (0,1) = \texttt{cod}(pdf) cod ( c df ) = ( 0 , 1 ) = cod ( p df )
Write out invariants for CDFs and Inverse CDFs
Provide sources or derivations, useful as this document becomes more complicated
Provide definitions for the probability density function, exponential, inverse, log, etc.
Provide at least some tests for division
See if playing around with characteristic functions turns out anything useful