Functional Alignment#

Group-wise function alignment using SRSF framework and Dynamic Programming

moduleauthor:: J. Derek Tucker <jdtuck@sandia.gov>

time_warping.align_fPCA(f, time, num_comp=3, showplot=True, smoothdata=False, cores=-1)[source]#

aligns a collection of functions while extracting principal components. The functions are aligned to the principal components

Parameters:

f (np.ndarray) – numpy ndarray of shape (M,N) of N functions with M samples
time (np.ndarray) – vector of size M describing the sample points
num_comp – number of fPCA components
showplot – Shows plots of results using matplotlib (default = T)
smooth_data (bool) – Smooth the data using a box filter (default = F)
cores – number of cores for parallel (default = -1 (all))

Return type:

tuple of numpy array

Return fn:

aligned functions - numpy ndarray of shape (M,N) of N functions with M samples

Return qn:

aligned srvfs - similar structure to fn

Return q0:

original srvf - similar structure to fn

Return mqn:

srvf mean or median - vector of length M

Return gam:

warping functions - similar structure to fn

Return q_pca:

srsf principal directions

Return f_pca:

functional principal directions

Return latent:

latent values

Return coef:

coefficients

Return U:

eigenvectors

Return orig_var:

Original Variance of Functions

Return amp_var:

Amplitude Variance

Return phase_var:

Phase Variance

time_warping.align_fPLS(f, g, time, comps=3, showplot=True, smoothdata=False, delta=0.01, max_itr=100)[source]#

This function aligns a collection of functions while performing principal least squares

Parameters:

f (np.ndarray) – numpy ndarray of shape (M,N) of N functions with M samples
g (np.ndarray) – numpy ndarray of shape (M,N) of N functions with M samples
time (np.ndarray) – vector of size M describing the sample points
comps – number of fPLS components
showplot – Shows plots of results using matplotlib (default = T)
smooth_data (bool) – Smooth the data using a box filter (default = F)
delta – gradient step size
max_itr – maximum number of iterations

Return type:

tuple of numpy array

Return fn:

aligned functions - numpy ndarray of shape (M,N) of N functions with M samples

Return gn:

aligned functions - numpy ndarray of shape (M,N) of N functions with M samples

Return qfn:

aligned srvfs - similar structure to fn

Return qgn:

aligned srvfs - similar structure to fn

Return qf0:

original srvf - similar structure to fn

Return qg0:

original srvf - similar structure to fn

Return gam:

warping functions - similar structure to fn

Return wqf:

srsf principal weight functions

Return wqg:

srsf principal weight functions

Return wf:

srsf principal weight functions

Return wg:

srsf principal weight functions

Return cost:

cost function value

class time_warping.fdawarp(f, time)[source]#

This class provides alignment methods for functional data using the SRVF framework

Usage: obj = fdawarp(f,t)

Parameters:

f – (M,N): matrix defining N functions of M samples
time – time vector of length M
fn – aligned functions
qn – aligned srvfs
q0 – initial srvfs
fmean – Karcher mean
mqn – mean srvf
gam – warping functions
psi – srvf of warping functions
stats – alignment statistics
qun – cost function
lambda – lambda
method – optimization method
gamI – inverse warping function
rsamps – random samples
fs – random aligned functions
gams – random warping functions
ft – random warped functions
qs – random aligned srvfs
type – alignment type
mcmc – mcmc output if bayesian

Author : J. D. Tucker (JDT) <jdtuck AT sandia.gov> Date : 15-Mar-2018

gauss_model(n=1, sort_samples=False)[source]#

This function models the functional data using a Gaussian model extracted from the principal components of the srvfs

Parameters:

n (integer) – number of random samples
sort_samples (bool) – sort samples (default = T)

joint_gauss_model(n=1, no=3)[source]#

This function models the functional data using a joint Gaussian model extracted from the principal components of the srsfs

Parameters:

n (integer) – number of random samples
no (integer) – number of principal components (default = 3)

multiple_align_functions(mu, omethod='DP2', smoothdata=False, parallel=False, lam=0.0, pen='roughness', cores=-1, grid_dim=7)[source]#

This function aligns a collection of functions using the elastic square-root slope (srsf) framework.

Usage: obj.multiple_align_functions(mu)

Parameters:

mu – vector of function to align to
omethod – optimization method (DP, DP2, RBFGS, cRBFGS) (default = DP2)
smoothdata (bool) – Smooth the data using a box filter (default = F)
parallel – run in parallel (default = F)
lam (double) – controls the elasticity (default = 0)
penalty – penalty type (default=”roughness”) options are “roughness”, “l2gam”, “l2psi”, “geodesic”. Only roughness implemented in all methods. To use others method needs to be “RBFGS” or “cRBFGS”
cores – number of cores for parallel (default = -1 (all))
grid_dim – size of the grid, for the DP2 method only (default = 7)

plot()[source]#

plot functional alignment results

Usage: obj.plot()

srsf_align(method='mean', omethod='DP2', center=True, smoothdata=False, MaxItr=20, parallel=False, lam=0.0, pen='roughness', cores=-1, grid_dim=7, verbose=True)[source]#

This function aligns a collection of functions using the elastic square-root slope (srsf) framework.

Parameters:

method – (string) warp calculate Karcher Mean or Median (options = “mean” or “median”) (default=”mean”)
omethod – optimization method (DP, DP2, RBFGS, cRBFGS) (default = DP2)
center – center warping functions (default = T)
smoothdata (bool) – Smooth the data using a box filter (default = F)
MaxItr – Maximum number of iterations (default = 20)
parallel – run in parallel (default = F)
lam (double) – controls the elasticity (default = 0)
penalty – penalty type (default=”roughness”) options are “roughness”, “l2gam”, “l2psi”, “geodesic”. Only roughness implemented in all methods. To use others method needs to be “RBFGS” or “cRBFGS”
cores – number of cores for parallel (default = -1 (all))
grid_dim – size of the grid, for the DP2 method only (default = 7)
verbose – print status output (default = T)

Examples >>> import tables >>> fun=tables.open_file(“../Data/simu_data.h5”) >>> f = fun.root.f[:] >>> f = f.transpose() >>> time = fun.root.time[:] >>> obj = fs.fdawarp(f,time) >>> obj.srsf_align()

time_warping.normal(loc=0.0, scale=1.0, size=None)#

Draw random samples from a normal (Gaussian) distribution.

The probability density function of the normal distribution, first derived by De Moivre and 200 years later by both Gauss and Laplace independently [2], is often called the bell curve because of its characteristic shape (see the example below).

The normal distributions occurs often in nature. For example, it describes the commonly occurring distribution of samples influenced by a large number of tiny, random disturbances, each with its own unique distribution [2].

Note

New code should use the ~numpy.random.Generator.normal method of a ~numpy.random.Generator instance instead; please see the random-quick-start.

Parameters:

loc (float or array_like of floats) – Mean (“centre”) of the distribution.
scale (float or array_like of floats) – Standard deviation (spread or “width”) of the distribution. Must be non-negative.
size (int or tuple of ints, optional) – Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if loc and scale are both scalars. Otherwise, np.broadcast(loc, scale).size samples are drawn.

Returns:

out – Drawn samples from the parameterized normal distribution.

Return type:

ndarray or scalar

See also

scipy.stats.norm: probability density function, distribution or cumulative density function, etc.
random.Generator.normal: which should be used for new code.

Notes

The probability density for the Gaussian distribution is

\[p(x) = \frac{1}{\sqrt{ 2 \pi \sigma^2 }} e^{ - \frac{ (x - \mu)^2 } {2 \sigma^2} },\]

where $\mu$ is the mean and $\sigma$ the standard deviation. The square of the standard deviation, $\sigma^2$, is called the variance.

The function has its peak at the mean, and its “spread” increases with the standard deviation (the function reaches 0.607 times its maximum at $x + \sigma$ and $x - \sigma$ [2]). This implies that normal is more likely to return samples lying close to the mean, rather than those far away.

References

Examples

Draw samples from the distribution:

>>> mu, sigma = 0, 0.1 # mean and standard deviation
>>> s = np.random.normal(mu, sigma, 1000)

Verify the mean and the variance:

>>> abs(mu - np.mean(s))
0.0  # may vary

>>> abs(sigma - np.std(s, ddof=1))
0.1  # may vary

Display the histogram of the samples, along with the probability density function:

>>> import matplotlib.pyplot as plt
>>> count, bins, ignored = plt.hist(s, 30, density=True)
>>> plt.plot(bins, 1/(sigma * np.sqrt(2 * np.pi)) *
...                np.exp( - (bins - mu)**2 / (2 * sigma**2) ),
...          linewidth=2, color='r')
>>> plt.show()

Two-by-four array of samples from the normal distribution with mean 3 and standard deviation 2.5:

>>> np.random.normal(3, 2.5, size=(2, 4))
array([[-4.49401501,  4.00950034, -1.81814867,  7.29718677],   # random
       [ 0.39924804,  4.68456316,  4.99394529,  4.84057254]])  # random

time_warping.pairwise_align_bayes(f1i, f2i, time, mcmcopts=None)[source]#

This function aligns two functions using Bayesian framework. It will align f2 to f1. It is based on mapping warping functions to a hypersphere, and a subsequent exponential mapping to a tangent space. In the tangent space, the Z-mixture pCN algorithm is used to explore both local and global structure in the posterior distribution.

The Z-mixture pCN algorithm uses a mixture distribution for the proposal distribution, controlled by input parameter zpcn. The zpcn$betas must be between 0 and 1, and are the coefficients of the mixture components, with larger coefficients corresponding to larger shifts in parameter space. The zpcn[“probs”] give the probability of each shift size.

Usage: out = pairwise_align_bayes(f1i, f2i, time)

Parameters:

f1i – vector defining M samples of function 1
f2i – vector defining M samples of function 2
time – time vector of length M
mcmopts – dict of mcmc parameters

default mcmc options: tmp = {“betas”:np.array([0.5,0.5,0.005,0.0001]),”probs”:np.array([0.1,0.1,0.7,0.1])} mcmcopts = {“iter”:2*(10**4) ,”burnin”:np.minimum(5*(10**3),2*(10**4)//2), “alpha0”:0.1, “beta0”:0.1,”zpcn”:tmp,”propvar”:1, “initcoef”:np.repeat(0,20), “npoints”:200, “extrainfo”:True}

:rtype collection containing :return f2_warped: aligned f2 :return gamma: warping function :return g_coef: final g_coef :return psi: final psi :return sigma1: final sigma

if extrainfo :return accept: accept of psi samples :return betas_ind :return logl: log likelihood :return gamma_mat: posterior gammas :return gamma_stats: posterior gamma stats :return xdist: phase distance posterior :return ydist: amplitude distance posterior)

time_warping.pairwise_align_bayes_infHMC(y1i, y2i, time, mcmcopts=None)[source]#

This function aligns two functions using Bayesian framework. It uses a hierarchical Bayesian framework assuming mearsurement error error It will align f2 to f1. It is based on mapping warping functions to a hypersphere, and a subsequent exponential mapping to a tangent space. In the tangent space, the infty-HMC algorithm is used to explore both local and global structure in the posterior distribution.

Usage: out = pairwise_align_bayes_infHMC(f1i, f2i, time)

Parameters:

y1i – vector defining M samples of function 1
y2i – vector defining M samples of function 2
time – time vector of length M
mcmopts – dict of mcmc parameters

default mcmc options: mcmcopts = {“iter”:1*(10**4), “nchains”:4, “vpriorvar”:1, “burnin”:np.minimum(5*(10**3),2*(10**4)//2), “alpha0”:0.1, “beta0”:0.1, “alpha”:1, “beta”:1, “h”:0.01, “L”:4, “f1propvar”:0.0001, “f2propvar”:0.0001, “L1propvar”:0.3, “L2propvar”:0.3, “npoints”:200, “thin”:1, “sampfreq”:1, “initcoef”:np.repeat(0,20), “nbasis”:10, “basis”:’fourier’, “extrainfo”:True}

Basis can be ‘fourier’ or ‘legendre’

:rtype collection containing :return f2_warped: aligned f2 :return gamma: warping function :return v_coef: final v_coef :return psi: final psi :return sigma1: final sigma

if extrainfo :return theta_accept: accept of psi samples :return f2_accept: accept of f2 samples :return SSE: SSE :return gamma_mat: posterior gammas :return gamma_stats: posterior gamma stats :return xdist: phase distance posterior :return ydist: amplitude distance posterior)

1. Tucker, L. Shand, and K. Chowdhary. “Multimodal Bayesian Registration of Noisy Functions using Hamiltonian Monte Carlo”, Computational Statistics and Data Analysis, accepted, 2021.

time_warping.pairwise_align_functions(f1, f2, time, omethod='DP2', lam=0, pen='roughness', grid_dim=7)[source]#

This function aligns f2 to f1 using the elastic square-root

slope (srsf) framework.

Usage: out = pairwise_align_functions(f1, f2, time)

out = pairwise_align_functions(f1, f2, time, omethod, lam,: grid_dim)

Parameters:

f1 – vector defining M samples of function 1
f2 – vector defining M samples of function 2
time – time vector of length M
omethod – optimization method (DP, DP2, RBFGS, cRBFGS) (default = DP)
lam – controls the elasticity (default = 0)
penalty – penalty type (default=”roughness”) options are “roughness”, “l2gam”, “l2psi”, “geodesic”. Only roughness implemented in all methods. To use others method needs to be “RBFGS” or “cRBFGS”
grid_dim – size of the grid, for the DP2 method only (default = 7)

:rtype list containing :return f2n: aligned f2 :return gam: warping function :return q2n: aligned q2 (srsf)

time_warping.rand(d0, d1, ..., dn)#

Random values in a given shape.

Note

This is a convenience function for users porting code from Matlab, and wraps random_sample. That function takes a tuple to specify the size of the output, which is consistent with other NumPy functions like numpy.zeros and numpy.ones.

Create an array of the given shape and populate it with random samples from a uniform distribution over [0, 1).

Parameters:

d0 (int, optional) – The dimensions of the returned array, must be non-negative. If no argument is given a single Python float is returned.
d1 (int, optional) – The dimensions of the returned array, must be non-negative. If no argument is given a single Python float is returned.
... (int, optional) – The dimensions of the returned array, must be non-negative. If no argument is given a single Python float is returned.
dn (int, optional) – The dimensions of the returned array, must be non-negative. If no argument is given a single Python float is returned.

Returns:

out – Random values.

Return type:

ndarray, shape (d0, d1, ..., dn)