Foothills of Combinatorics (part 2)
Introducing the simplex generator for higher dimension pascal/combinatorics and looking at the bigger picture: exhaustive function composition.
See also part 1
The simplex generator
We have to broaden our humble 2D pascal generator matrix to higher dimensions to generate the cells for the simplex.
Geometrically, a 2D triangle extends to 3D as a tetrahedron, and to higher dimensions still as the simplex, the simplest possible polytope in any dimension.
In maths there’s a standard simplex definition, but it doesn’t quite suit our use here with a generator, so we’re defining our own slight variation.
Right-simplex recipe:
- the 0 dimensional simplex is a point1
- to make a simplex into the next-dimension-up version, join all its existing points to a new point at 1 on the new dimension’s axis
Thus, the progression of simplices’ vertices2 through the dimensions:
$$ \begin{aligned} \text{dim 0:} \quad &() \\ \\ \text{dim 1:} \quad &(1) \\ &(0) \\ \\ \text{dim 2:} \quad &(1, 0) \\ &(0, 1) \\ &(0, 0) \\ \\ \text{dim 3:} \quad &(1, 0, 0) \\ &(0, 1, 0) \\ &(0, 0, 1) \\ &(0, 0, 0) \\ \\ \text{dim 4:} \quad &(1, 0, 0, 0) \\ &(0, 1, 0, 0) \\ &(0, 0, 1, 0) \\ &(0, 0, 0, 1) \\ &(0, 0, 0, 0) \\ \end{aligned} $$And now we’re almost seeing the generators for d-dimensional pascal’s simplex, but not quite.
First we need to choose which of the \(d+1\) points is our collector vertex \(v_c\) — this is the vertex where the rest of the vertices get their values summed into. Once we’ve chosen \(v_c\), we remove that coordinate and translate all remaining ones by \(v_c\). This makes the right-simplex coordinates into generator matrix coordinates.
For \(v_c\) we are going to use \((0, 0, \dots 1)\)3.
Here’s an example of the process for pascal’s triangle (2D). We start with the points for the 2D simplex:
$$ \begin{aligned} &(1, 0) \\ &(0, 1) \\ &(0, 0) \\ \end{aligned} $$We choose \(v_c = (0, 1)\), remove it from the coordinates, and add \(v_c\) to the remaining coords:
$$ \begin{aligned} (1, 0) + (0, 1) &= (1, 1) \\ (0, 0) + (0, 1) &= (0, 1) \end{aligned} $$And these two coordinates are recognisable as making up the generator matrix for pascal’s triangle in eqn. 1:
$$ \begin{pmatrix} \hfill 1 & \hfill 1 \\ \hfill 0 & \hfill 1 \end{pmatrix} $$If we do the same process to make the generator for the 3D and 4D simplices, we end up with these generator matrices:
$$ \begin{aligned} M_{p}^{3} &= \begin{pmatrix} \hfill 1 & \hfill 0 & \hfill 1 \\ \hfill 0 & \hfill 1 & \hfill 1 \\ \hfill 0 & \hfill 0 & \hfill 1 \\ \end{pmatrix} \\ \\ M_{p}^{4} &= \begin{pmatrix} \hfill 1 & \hfill 0 & \hfill 0 & \hfill 1 \\ \hfill 0 & \hfill 1 & \hfill 0 & \hfill 1 \\ \hfill 0 & \hfill 0 & \hfill 1 & \hfill 1 \\ \hfill 0 & \hfill 0 & \hfill 0 & \hfill 1 \\ \end{pmatrix} \end{aligned} $$These matrices might look familiar — like identity matrices but with a column of ones. They’re actually homogenous translation matrices as used in 3D graphics.4
So we can generally define the pascal simplex generator as:
$$ g_p^d = \left[ \Sigma, \text{Trans}(1_{d-1}) \right] \tag{2} $$with \(\text{Trans}(1_n)\) being the homogenous translation matrix for the 1s vector for dimension \(n\).
The generated combinatorial (pascal) simplex
To denote the generated combinatorial simplex in its entirety, we’re going to use a plain triangle: \(\triangle\). This is a multidimensional array, which we’ll address into using NumPy slice notation.
\(\triangle\) has infinite entries for its first index. The slice in this first index at \(n\), written \(\triangle[n]\) or more simply \(\triangle_n\), is the entire probability structure for repeated choose with \(n\) objects.
If you run the sample code, it will generate the first \(m + 1\) slices by doing \(m\) generation steps. This corresponds to starting with \(0 \dots m\) objects and making from 0 to \(m\) choices.
You access the generated repeated choose values just by indexing into \(\triangle\) in the obvious way:
$$ \binom{a}{b_0, b_1, \dots b_n} = \triangle_a[b_0, b_1, \dots b_n] $$Note that \(\triangle_0\) is zero-dimension and contains a single \(1\), which is \(\binom{0}{0}\), the number of ways of choosing \(0\) objects from \(0\).
Slices \(\triangle_1\) and up are all \(m\)-dimensional. That might not seem to make sense – e.g. consider \(\triangle_2\), the possibilities when starting with 2 objects. If we’ve generated data for \(m=3\), i.e. three choices, how does that work? It works because with repeated choices, taking zero items is a valid at any time. So in this case \(\triangle_2\) includes the numbers for:
$$ \binom{2}{0}, \, \binom{2}{0,1}, \, \binom{2}{0,0,1}, \, \binom{2}{0,0,2}, \, \binom{2}{0,1} \dots $$(and so on).
The bigger picture
When we think of pascal’s triangle, we usually think of the obvious things: binomials, probability, coin tosses, etc.
But there’s something slightly deeper here that’s worth acknowledging.
Pascal’s triangle and similar generated objects are a kind of exhaustive function composition, which we might compare to monte carlo simulations, which are a kind of stochastic function composition5.
In pascal’s triangle the two functions are \(+0\) and \(+1\). There are two ‘base’ starting values \( 0 \) and \( 1 \); the \( 0 \) comes from the space outside the triangle. It’s zero because that’s the identity value for addition: \( \mathrm{I}_{+} = 0\).
There’s only one generator function, but that function enapsulates two separate functions \(+0\) and \(+1\) and manages to exercise them individually in some places by ‘pointing’ the function applications in different directions, sometimes towards the identity value, which ‘does nothing’.
Generally, the simplex of any dimension D ‘glues together’ D homogenous fuction variants into one exhuastive whole, with only one non-identity seed cell needed to kick everything off.
Consider this picture of Pascal 2D shown as the composed functions:
1
0+1 1+0
0+(0+1) (0+1)+(1+0) (1+0)+0
0+(0+(0+1)) (0+(0+1))+((0+1)+(1+0)) ((0+1)+(1+0))+(1+0)+0 ((1+0)+0)+0
. . .
The brackets aren’t strictly necessary, but shown the hierarchical structure. The sums (amount of 1s) are the usual triangle values.
So this isn’t a very interesting function. What else is possible?
Suppose you’re thinking about Euler angles and how they compose rotations6.
Firstly we need some data type and a homogenous function for it. Let’s have a pair of angles \((\theta, \phi)\) for a two-angle Euler system. We need a function that composes two of them (for a 2D triangle) into a new pair of angles. Just adding the parts isn’t that interesting; we want some asymmetric function of the two input vaues \(a\) and \(b\) (just like the unfair coin is more interesting with its asymmetric generator function).
We could choose this:
$$ f(a, b) = \textrm{rot}_\psi(a, 30) + \textrm{rot}_\theta(b, 30) $$That’s more interesting! We’d need to choose what we mean by \(+\) exactly (combine the two angle components somehow), but we’re incrementing different angles in the two input values then summing them in some way.
The asymmetry makes it more interesting. If we use the above generator, what are traditionally two arms of 1s on left and right edges of our triangle become pure ‘seed’ values for \(\theta\) and \(\phi\) increments, e.g. (0, 30, 60, 90…). Values inside the triangle are where these seed ranges interact.
One of the notable things about Euler angles is their habit of getting into gimbal lock, but that requires three angles to be in play. So in that case we’re talking a 3D pascal simplex with a generator like
$$ f(a, b, c) = \textrm{rot}_\psi(a, 30) + \textrm{rot}_\theta(b, 30) + \textrm{rot}_\phi(c, 30) $$and the tetrahedron has thre ‘seed’ lines (edges) that have increments of just the pure \(\psi\), \(\theta\), or \(\phi\) angle on each.
Beyond the Euler angle idea, you might do a simplex with a generator for quaternion-based rotations, for comparison (which should show that there are no degenerate cases like gimbal lock)7.
Wastage: the curse of dimensionality
The pascal simplex is very wasteful. By which I mean there’s repetition of the same information, apparent even in very low dimensions, which only gets worse in higher dimensions; for more on the general principle see Curse of dimensionality.
Part of this is down to the order of numbers on the bottom of a multinomial (or the order of numbers in a path into the simplex; same thing) not mattering; only the (multi)set of numbers is relevant. So we’re talking about the number of partitions \(p(n)\) here being proportional to amount of original information.
Compare this to the number compositions of \(n\), denoted \(C(n)\) — where order does matter — which grows much faster (exponentially). Note that it’s the number of compositions that closely relates to how much space the simplex takes up.
So the ‘wastage’ can be seen in this statement:
$$ \lim_{n \to \infty} \frac{p(n)}{C(n)} \to 0 $$which basically tells us ‘as dimensions increase, the percentage of original information tends to 0’. This tending-to-zero is because a super-polynomial8 function like \(p(n)\) grows much more slowly than an exponential function9 like \(C(n)\).
Wastage example: for the case of a pool of 5 objects, the ‘original information’ is already down to 43% by the reckoning of the above ratio.
Part 3 is to come.
note that the coordinate of the single point in the 0 dimensional simplex is the empty tuple \(()\) — the only thing it can be in 0 dimensions ↩︎
the 0d to 1d move feels like some sneaky trick has happened, but at the point we embed the 0d point in 1 dimension, it effectively gets a 0 tacked on to the empty tuple, which makes it have coordinate \((0)\) ↩︎
this isn’t an arbitrary choice; more on that later, perhaps ↩︎
the actual translation this matrix represents is \((1, 1, 1, \dots 1)\), written more precisely as \(1_d\). The ‘official’ definition of simplex vertices in mathematics, which we deviate from here (for generator purposes), is recognised as having the form of an affine transformation. The translation matrix we used here is an affine transformation too ↩︎
I imagine the exhaustive (pascal etc.) as a solid, shallow triangle, and the stochastic (monte carlo) as a much taller jellyfish thing with sparse tendrils descending far down. When you have a large search space, the jellyfish is doable, and the solid triangle isn’t, see curse of dimensionality. For smaller search space, generating all the data and then having ability to slice and inspect it can be nice (brute force analysis) ↩︎
it doesn’t matter too much, but let’s assume I’m using the Tait-Bryan system, used in aerospace: rotations in order are \(\psi\) yaw, \(\theta\) pitch, \(\phi\) roll ↩︎
consider: would a quaternion rotation simplex need to be done in 3D or 4D? ↩︎
growth rates: polynomial \(<\) super-polynomial \(<\) exponential ↩︎
for the record, the definitions of partition size and approx composition size:
$$ \begin{aligned} p(n) &\sim \frac {1} {4n\sqrt3} \exp\left({\pi \sqrt {\frac{2n}{3}}}\right) \\ \\ C(n) &= 2^{n-1} \end{aligned} $$ ↩︎