Mathematical details#
To simulate time series with multivariate effects, the user has to minimally provide:

- `X`: the design matrix (\(n_{trials} \times n_{conditions}\))
- `effects`: specifies the `condition` for which the effect should be present, its time `windows`, and the `effect_size` (\(\Delta\))
- `noise_std`: the standard deviation of the additive noise within each subject (\(\sigma\))
- `n_channels`: the number of channels/sensors
- `tmin, tmax, sfreq`: the timing information of each trial, required to determine the number of samples (\(n_t\)) per trial
- `ch_cov`: the channel-by-channel covariance (\(\Sigma\))
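For concreteness, these inputs can be assembled with numpy as follows. All dimensions, the dummy coding of `X`, and the identity `ch_cov` are illustrative assumptions, not multisim defaults:

```python
import numpy as np

n_trials, n_conditions, n_channels = 100, 2, 32
sfreq, tmin, tmax = 100.0, -0.2, 0.8

# Trial-wise design matrix: one column per condition (dummy-coded here)
X = np.zeros((n_trials, n_conditions))
X[: n_trials // 2, 0] = 1.0  # first half of trials: condition 1
X[n_trials // 2 :, 1] = 1.0  # second half: condition 2

# Number of samples per trial, derived from the timing information
# (assuming tmin and tmax are both included)
n_t = int(round((tmax - tmin) * sfreq)) + 1

# Channel-by-channel covariance (identity as a simple placeholder)
ch_cov = np.eye(n_channels)

noise_std = 1.0    # sigma
effect_size = 0.5  # Delta
```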
Based on these inputs, time series data with multivariate effects are simulated using a multivariate general linear model (GLM). For each subject \(s\), the simulated data matrix \(\boldsymbol{Y}_s\in\mathbb{R}^{n_{samples} \times n_{channels}}\) is generated as:

\[\mathbf{Y}_s = \mathbf{X}_{full}\,\mathbf{B}_s + \boldsymbol{\beta}_{0,s} + \boldsymbol{\varepsilon}_s\]
where \(\mathbf{X}_{full}\) is the full design matrix, \(\mathbf{B}_s\) is the subject-specific matrix of regression coefficients (\(\beta\)-weights), \(\boldsymbol{\beta}_{0,s}\) is the channel-specific intercept for each subject, and \(\boldsymbol{\varepsilon}_s\) is the subject-specific multivariate additive noise matrix.
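The generative model can be sketched end-to-end in numpy. The toy dimensions are assumed for illustration and this is not multisim's actual implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
n_trials, n_conditions, n_t, n_channels = 20, 2, 5, 4
noise_std = 1.0

# Trial-wise design and full (trial-time by condition-time) design matrix
X = rng.integers(0, 2, size=(n_trials, n_conditions)).astype(float)
X_full = np.kron(X, np.eye(n_t))

# Subject-specific coefficients, channel-wise intercept, and additive noise
B = rng.standard_normal((n_conditions * n_t, n_channels))
beta0 = rng.standard_normal(n_channels)
eps = rng.multivariate_normal(
    np.zeros(n_channels), noise_std**2 * np.eye(n_channels), size=n_trials * n_t
)

# Simulated data: one row per trial-time pair, one column per channel
Y = X_full @ B + beta0 + eps
```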
The full design matrix \(\mathbf{X}_{full}\) is obtained by taking the Kronecker product between the user-specified trial-wise design matrix \(\mathbf{X} \in \mathbb{R}^{n_{trials}\times n_{conditions}}\) (i.e. `X`) and the identity matrix \(\mathbf{I}_t \in \mathbb{R}^{n_t \times n_t}\), where \(n_t\) is the number of samples per trial (derived from `tmin`, `tmax`, `sfreq`):

\[\mathbf{X}_{full} = \mathbf{X} \otimes \mathbf{I}_t\]
yielding an \(\mathbf{X}_{full}\) with \(n_{samples}=n_{trials}\times n_t\) rows (one for every trial–time pair) and \(n_{conditions} \times n_t\) columns (one for every condition–time pair).
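This row/column layout can be checked numerically. In the small sketch below (dimensions assumed), row \(i \cdot n_t + t\) of \(\mathbf{X}_{full}\) carries trial \(i\) at sample \(t\), and column \(c \cdot n_t + t\) corresponds to condition \(c\) at sample \(t\):

```python
import numpy as np

n_trials, n_conditions, n_t = 3, 2, 4
X = np.arange(n_trials * n_conditions, dtype=float).reshape(n_trials, n_conditions)

# Kronecker product with the n_t x n_t identity
X_full = np.kron(X, np.eye(n_t))

# One row per trial-time pair, one column per condition-time pair
assert X_full.shape == (n_trials * n_t, n_conditions * n_t)

# Entry (i*n_t + t, c*n_t + t) recovers the trial-wise design value X[i, c]
i, t, c = 1, 2, 0
assert X_full[i * n_t + t, c * n_t + t] == X[i, c]
```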
The intercept is sampled independently for every channel, and the subject-specific noise \(\boldsymbol{\varepsilon}_s\) is drawn from a multivariate normal distribution with spatial covariance \(\mathbf{\Sigma}\) scaled by the noise variance:

\[\boldsymbol{\varepsilon}_s \sim \mathcal{N}(\mathbf{0},\, \sigma^2\mathbf{\Sigma})\]
The matrix \(\mathbf{B}_s\in\mathbb{R}^{(n_{conditions} \times n_t)\times n_{channels}}\) stacks, row-wise, the time-resolved regression coefficients (\(\beta\)-weights) for every condition. Each row therefore corresponds to a specific condition–time pair, and each column to a sensor or feature. To embed a multivariate effect at selected time points, we draw a random spatial pattern across channels from a standard normal distribution, \(v \sim \mathcal{N}(0, \mathbf{I})\), and rescale it by a constant \(a\) so that its Mahalanobis length equals the user-requested effect size (`effect_size`, see below). Rows of \(\mathbf{B}_s\) that correspond to the chosen condition–time windows are then set to \(av^\top\).
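This embedding step can be sketched as follows, assuming toy dimensions and an identity channel covariance; the scaling \(a = \Delta\sigma\) is the one derived in the Effect size section:

```python
import numpy as np

rng = np.random.default_rng(0)
n_conditions, n_t, n_channels = 2, 10, 16
noise_std, effect_size = 1.0, 0.5
Sigma = np.eye(n_channels)
Sigma_inv = np.linalg.inv(Sigma)

# Random spatial pattern across channels, normalized to Mahalanobis length 1
v_tilde = rng.standard_normal(n_channels)
v = v_tilde / np.sqrt(v_tilde @ Sigma_inv @ v_tilde)

# Scale so the pattern has the requested effect size
a = effect_size * noise_std

# B stacks time-resolved coefficients: rows are condition-time pairs
B = np.zeros((n_conditions * n_t, n_channels))
effect_times = np.arange(3, 7)  # time points carrying the effect
condition = 0                   # condition in which the effect is present
B[condition * n_t + effect_times] = a * v  # set the chosen rows to a*v^T
```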
Effect size#
Effect sizes are simulated based on the Mahalanobis distance ([1], [2]), which is a multivariate generalization of the standard z-score, taking into account the covariance structure of the data. For two classes, the Mahalanobis distance is defined as:

\[\Delta = \sqrt{(\mu_{1}-\mu_{2})^{T}\,\Sigma^{-1}\,(\mu_{1}-\mu_{2})}\]

where:
- \(\mu_{1}\) is the mean of the first condition
- \(\mu_{2}\) is the mean of the second condition
- \(\Sigma\) is the covariance matrix
- \(T\) denotes the matrix transpose
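The definition translates directly into code; a small sketch with illustrative values:

```python
import numpy as np

def mahalanobis(mu1, mu2, Sigma):
    """Mahalanobis distance between two condition means."""
    diff = mu1 - mu2
    return float(np.sqrt(diff @ np.linalg.inv(Sigma) @ diff))

mu1 = np.array([1.0, 0.0])
mu2 = np.array([0.0, 0.0])
Sigma = np.eye(2)
print(mahalanobis(mu1, mu2, Sigma))  # 1.0 (reduces to Euclidean distance for identity covariance)
```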
In our specific case, as the additive noise is multiplied by the covariance matrix to generate the final data, the effect size is equal to:

\[\Delta = \sqrt{(\mu_{1}-\mu_{2})^{T}\,(\sigma^{2}\Sigma)^{-1}\,(\mu_{1}-\mu_{2})}\]

which simplifies to:

\[\Delta = \frac{1}{\sigma}\sqrt{(\mu_{1}-\mu_{2})^{T}\,\Sigma^{-1}\,(\mu_{1}-\mu_{2})}\]
To simulate data with the required effect size \(\Delta\), we generate a random vector \(\tilde v\) by drawing a random number from a standard normal distribution for each channel (these values are used as the \(\mathbf{B}\) of our generative GLM). We then normalize that vector to have a Mahalanobis length of 1:

\[v = \frac{\tilde v}{\sqrt{\tilde v^{T}\Sigma^{-1}\tilde v}}\]
We can scale that vector up or down by a constant \(a\) and place the centroid of each class on either side of it to generate a pattern of the desired effect size. The Mahalanobis distance of our effect is therefore:

\[\Delta = \sqrt{(av)^{T}\,(\sigma^{2}\Sigma)^{-1}\,(av)}\]

which simplifies to:

\[\Delta = \frac{a}{\sigma}\sqrt{v^{T}\Sigma^{-1}v}\]
where:

- \(v\) is a random vector of Mahalanobis length 1 (i.e. a Mahalanobis unit-length vector)
- \(a\) is a constant that scales the effect up or down to achieve the desired effect size.
As \(v\) is of unit Mahalanobis length, the term \(\sqrt{v^{T}\Sigma^{-1}v}\) is equal to 1. Accordingly, the equation simplifies to:

\[\Delta = \frac{a}{\sigma}\]
Accordingly, to generate a multivariate pattern of the desired effect size, we solve for \(a\) given \(\Delta\) and \(\sigma\), which gives:

\[a = \Delta\,\sigma\]

where:

- \(\Delta\) = `effect_size`
- \(\sigma\) = `noise_std`
By multiplying our vector \(v\) by the constant \(a\), we ensure that the distance between the two classes matches the desired effect size.
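This scaling can be verified numerically: for an arbitrary (assumed) positive-definite covariance, the scaled pattern \(av\) has Mahalanobis length \(\Delta\) under \(\sigma^{2}\Sigma\):

```python
import numpy as np

rng = np.random.default_rng(1)
n_channels, noise_std, effect_size = 6, 2.0, 0.75

# An arbitrary symmetric positive-definite covariance (for illustration only)
A = rng.standard_normal((n_channels, n_channels))
Sigma = A @ A.T + n_channels * np.eye(n_channels)
Sigma_inv = np.linalg.inv(Sigma)

# Normalize a random vector to Mahalanobis length 1, then scale by a = Delta * sigma
v_tilde = rng.standard_normal(n_channels)
v = v_tilde / np.sqrt(v_tilde @ Sigma_inv @ v_tilde)
a = effect_size * noise_std
pattern = a * v

# The Mahalanobis length of the pattern under sigma^2 * Sigma equals Delta
delta = float(np.sqrt(pattern @ np.linalg.inv(noise_std**2 * Sigma) @ pattern))
print(round(delta, 10))  # 0.75
```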
Effect size and decoding accuracy#
With equal class covariances, a Bayes-optimal linear classifier achieves an accuracy of:

\[\text{accuracy} = \Phi\left(\frac{d'}{2}\right)\]

where \(\Phi\) is the cumulative distribution function of the standard normal distribution and \(d'\) is the Mahalanobis distance between the two class means ([1], [3]). As the class centroids are placed on either side of the pattern \(av\), their separation is \(d' = 2\Delta\), so the maximal theoretical decoding accuracy is equal to:

\[\text{accuracy} = \Phi(\Delta)\]
Thus an effect size of \(\Delta=0.5\) implies a theoretical ceiling of ≈69% accuracy, \(\Delta=1\) gives ≈84%, and so on. By scaling the injected pattern according to the formula above, multisim ensures that simulated data respect this relationship irrespective of the number of channels or their covariance.
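The ceiling can be evaluated with the standard-normal CDF, written here via the error function so only the standard library is needed; the \(\Phi(\Delta)\) relationship follows the text above:

```python
from math import erf, sqrt

def max_accuracy(effect_size):
    """Theoretical decoding ceiling Phi(Delta) for a Bayes-optimal linear classifier."""
    # Standard-normal CDF expressed through the error function
    return 0.5 * (1.0 + erf(effect_size / sqrt(2.0)))

print(round(max_accuracy(0.5), 3))  # 0.691
print(round(max_accuracy(1.0), 3))  # 0.841
```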