{ "cells": [ { "cell_type": "markdown", "id": "5e9cf430", "metadata": {}, "source": [ "# Mathematical details\n", "\n", "To simulate time series with multivariate effects, the user has to provide at least:\n", "\n", "- ``X``: design matrix ($n_{trials} \\times n_{conditions}$)\n", "- ``effects``: specifies the ``condition`` for which the effect should be present, its time ``windows`` and the ``effect_size`` ($\\Delta$)\n", "- ``noise_std``: the standard deviation of the additive noise within each subject ($\\sigma$)\n", "- ``n_channels``: the number of channels/sensors\n", "- ``tmin, tmax, sfreq``: the timing information of each trial, required to determine the number of samples ($n_t$) per trial\n", "- ``ch_cov``: channel-by-channel covariance ($\\Sigma$)\n", "\n", "Based on these inputs, time series data with multivariate effects are simulated using a multivariate general linear model (GLM). For each subject $s$, the simulated data matrix $\\mathbf{Y}_s\\in\\mathbb{R}^{N_{samples} \\times n_{channels}}$ is generated as:\n", "\n", "$$\n", "\\mathbf{Y}_s = \\mathbf{X}_{full} \\mathbf{B}_s + \\mathbf{1} \\boldsymbol{\\beta}_{0,s}^\\top + \\boldsymbol{\\varepsilon}_s,\n", "$$\n", "\n", "where $\\mathbf{X}_{full}$ is the full design matrix, $\\mathbf{B}_s$ is the subject-specific matrix of regression coefficients ($\\beta$), $\\boldsymbol{\\beta}_{0,s}$ is the channel-specific intercept for each subject, and $\\boldsymbol{\\varepsilon}_s$ is the subject-specific multivariate additive noise matrix.\n", "\n", "The full design matrix $\\mathbf{X}_{full}$ is obtained by taking the Kronecker product between the user-specified trial-wise design matrix $\\mathbf{X} \\in \\mathbb{R}^{n_{trials}\\times n_{conditions}}$ (i.e. 
``X``) and the identity matrix $\\mathbf{I}_t \\in \\mathbb{R}^{n_t \\times n_t}$, where $n_t$ is the number of samples per trial (derived from ``tmin, tmax, sfreq``):\n", "\n", "$$\\mathbf{X}_{full} = \\mathbf{X}\\otimes \\mathbf{I}_t$$\n", "\n", "yielding an $\\mathbf{X}_{full}$ with $N_{samples}=n_{trials}\\times n_t$ rows (one for every trial–time pair) and $n_{conditions} \\times n_t$ columns (one for every condition–time pair).\n", "\n", "The intercept is sampled independently for every channel,\n", "\n", "$$\\boldsymbol{\\beta}_{0,s} \\in \\mathbb{R}^{n_{channels}}, \\qquad \\boldsymbol{\\beta}_{0,s} \\sim \\mathcal{N}(\\mathbf{0}, \\sigma^2 \\mathbf{I})$$\n", "\n", "and the subject-specific noise $\\boldsymbol{\\varepsilon}_s$ is drawn from a multivariate normal distribution with spatial covariance $\\mathbf{\\Sigma}$ scaled by $\\sigma^2$:\n", "\n", "$$\n", "\\boldsymbol{\\varepsilon}_s \\sim \\mathcal{N}(0, \\sigma^2 \\mathbf{\\Sigma})\n", "$$\n", "\n", "The matrix $\\mathbf{B}_s\\in\\mathbb{R}^{(n_{conditions} \\times n_t)\\times n_{channels}}$ stacks, row-wise, the time-resolved regression coefficients (β-weights) for every condition. Each row therefore corresponds to a specific condition–time pair, and each column to a sensor or feature. To embed a multivariate effect at selected time points, we draw a random spatial pattern across channels from a standard normal distribution, $\\tilde v \\sim \\mathcal{N}(0, \\mathbf{I})$, normalize it to a vector $v$ of unit Mahalanobis length, and rescale it by a constant $a$ so that its Mahalanobis length with respect to the noise covariance $\\sigma^2 \\mathbf{\\Sigma}$ equals the user-requested effect size (``effect_size``, see below). Rows of $\\mathbf{B}_s$ that correspond to the chosen condition–time windows are then set to $av^\\top$.\n", "\n", "## Effect size\n", "\n", "Effect sizes are simulated based on the Mahalanobis distance ({cite}`mclachlan1999mahalanobis`, {cite}`mahalanobis1930tests`), which is a multivariate generalization of the standard z-score that takes into account the covariance structure of the data. 
For two classes, the Mahalanobis distance is defined as:\n", "\n", "$$\\Delta = \\sqrt{(\\mu_{1} - \\mu_{2})^{T}\\Sigma^{-1}(\\mu_{1} - \\mu_{2})}$$\n", "\n", "where:\n", "- $\\mu_{1}$ is the mean of the first condition\n", "- $\\mu_{2}$ is the mean of the second condition\n", "- $\\Sigma$ is the covariance matrix\n", "- $T$ denotes matrix transpose\n", "\n", "In our specific case, the additive noise has covariance $\\sigma^{2}\\Sigma$ (the channel covariance scaled by the noise variance), so the effect size is equal to:\n", "\n", "$$\\Delta = \\sqrt{(\\mu_{1} - \\mu_{2})^{T}(\\sigma^{2}\\Sigma)^{-1}(\\mu_{1} - \\mu_{2})}$$\n", "\n", "which simplifies to:\n", "\n", "$$\\Delta = \\frac{1}{\\sigma}\\sqrt{(\\mu_{1} - \\mu_{2})^{T}\\Sigma^{-1}(\\mu_{1} - \\mu_{2})}$$\n", "\n", "To simulate data with the required effect size $\\Delta$, we generate a random vector $\\tilde v$ by drawing one value per channel from a standard normal distribution (once scaled, this vector provides the entries of $\\mathbf{B}$ in our generative GLM). We then normalize that vector to have a Mahalanobis length of 1:\n", "\n", "$$ v = \\frac{\\tilde v}{\\sqrt{\\tilde v^{T}\\Sigma^{-1}\\tilde v}}$$\n", "\n", "We can scale that vector up or down by a constant $a$ and separate the two class centroids by it (i.e. $\\mu_{1} - \\mu_{2} = av$) to generate a pattern of the desired effect size. Therefore, the Mahalanobis distance of our effect is:\n", "\n", "$$\\Delta = \\frac{1}{\\sigma} \\sqrt{(av)^{T}\\Sigma^{-1}(av)}$$\n", "\n", "which simplifies to:\n", "\n", "$$\\Delta = \\frac{a}{\\sigma} \\sqrt{v^{T}\\Sigma^{-1}v}$$\n", "\n", "where:\n", "- $v$ is a random vector with a Mahalanobis length of 1 (i.e. a Mahalanobis unit-length vector)\n", "- $a$ is a constant that scales the effect up or down to achieve the desired effect size.\n", "\n", "As $v$ has unit Mahalanobis length, the term $\\sqrt{v^{T}\\Sigma^{-1}v}$ is equal to 1. 
Accordingly, the equation simplifies to:\n", "\n", "$$\\Delta = \\frac{a}{\\sigma}$$\n", "\n", "To generate a multivariate pattern of the desired effect size, we therefore solve for $a$ given $\\Delta$ and $\\sigma$, which gives:\n", "\n", "$$a = \\Delta \\cdot \\sigma$$\n", "\n", "where:\n", "- $\\Delta$ = ``effect_size``\n", "- $\\sigma$ = ``noise_std``\n", "\n", "By multiplying our vector $v$ by the constant $a$, we ensure that the distance between the two classes matches the desired effect size.\n", "\n", "### Effect size and decoding accuracy\n", "\n", "With equal class covariances and balanced classes, a Bayes-optimal linear classifier achieves an error rate of:\n", "\n", "$$\\Phi\\left(-\\frac{1}{2}d'\\right)$$\n", "\n", "where $d'$ is the Mahalanobis distance between the two classes (i.e. the effect size $\\Delta$) and $\\Phi$ is the cumulative distribution function of the standard normal distribution ({cite}`mclachlan1999mahalanobis`, {cite}`mclachlan2005discriminant`). Accordingly, the maximal theoretical decoding accuracy is equal to:\n", "\n", "$$1 - \\Phi\\left(-\\frac{1}{2}d'\\right) = \\Phi\\left(\\frac{1}{2}d'\\right)$$\n", "\n", "Thus an effect size of $d'=0.5$ implies a theoretical ceiling of ≈ 60 % accuracy, $d'=1$ gives ≈ 69 %, $d'=2$ gives ≈ 84 %, and so on. By scaling the injected pattern according to the formula above, **multisim** ensures that simulated data respect this relationship irrespective of the number of channels or their covariance.\n", "\n", "\n", "## References\n", "```{bibliography}\n", ":style: unsrt\n", ":filter: docname in docnames\n", "```" ] } ], "metadata": { "kernelspec": { "display_name": ".venv", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.13.0" } }, "nbformat": 4, "nbformat_minor": 5 }