GaussianMixtureModel
lumicks.pylake.GaussianMixtureModel
- class GaussianMixtureModel(data, n_states, init_method, n_init, tol, max_iter)
A wrapper around sklearn.mixture.GaussianMixture.
This model accepts a 1D array as training data. The state parameters are sorted by state mean to facilitate comparison of models with different numbers of states or trained on different datasets. Because the current implementation is designed specifically for 1D data, model parameters are also returned as 1D arrays (numpy.squeeze() is applied to the results), so users do not have to be concerned with the shape of the output.
Warning
This is early access alpha functionality. While usable, it has not yet been tested in a large number of different scenarios. The API can still be subject to change without any prior deprecation notice! If you use this functionality, keep a close eye on the changelog for any changes that may affect your analysis.
- Parameters
data (numpy.ndarray) – Data array used for model training.
n_states (int) – The number of Gaussian components in the model.
init_method ({'kmeans', 'random'}) –
“kmeans”: parameters are initialized via the k-means algorithm
“random”: parameters are initialized randomly
n_init (int) – The number of initializations to perform.
tol (float) – The tolerance for training convergence.
max_iter (int) – The maximum number of iterations to perform.
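The sorting and squeezing described above can be sketched directly with scikit-learn. This is a minimal illustration of the idea, not the wrapper's actual implementation: the synthetic data, the `random_state` seed, and the variable names are assumptions made for the example.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Synthetic 1D data with two well-separated states (illustrative values only).
rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(10.0, 1.0, 500), rng.normal(0.0, 1.0, 500)])

# scikit-learn expects a 2D array of shape (n_samples, n_features).
gmm = GaussianMixture(
    n_components=2,
    init_params="kmeans",
    n_init=1,
    tol=1e-3,
    max_iter=100,
    random_state=0,  # assumption: fixed seed so the sketch is reproducible
).fit(data.reshape(-1, 1))

# Sort the states by mean and squeeze every parameter array down to 1D.
order = np.argsort(gmm.means_.squeeze())
means = gmm.means_.squeeze()[order]
variances = gmm.covariances_.squeeze()[order]
weights = gmm.weights_[order]
std = np.sqrt(variances)
```

After sorting, index 0 always refers to the state with the lowest mean, which is what makes models with different numbers of states directly comparable.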
- extract_dwell_times(trace, *, exclude_ambiguous_dwells=True)
Calculate lists of dwell times for each state in a time-ordered state path array.
- Parameters
- Returns
Dictionary of all dwell times (in seconds) for each state. Keys are state labels.
- Return type
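As a rough illustration of how dwell times can be read off a time-ordered state path, the sketch below uses only NumPy. The helper name `dwell_times_from_state_path`, the sample interval `dt`, and the reading of "ambiguous" as the first and last dwells (whose true start or end falls outside the trace) are assumptions for this example and are not taken from the library.

```python
import numpy as np

def dwell_times_from_state_path(state_path, dt, exclude_ambiguous_dwells=True):
    """Hypothetical helper: dwell times (in seconds) per state label."""
    state_path = np.asarray(state_path)
    if state_path.size == 0:
        return {}

    # Indices at which a new dwell begins (the state differs from the previous sample).
    change_points = np.flatnonzero(np.diff(state_path)) + 1
    starts = np.concatenate(([0], change_points))
    ends = np.concatenate((change_points, [state_path.size]))

    states = state_path[starts]
    durations = (ends - starts) * dt

    if exclude_ambiguous_dwells:
        # Assumption: the first and last dwells are cut off by the trace edges.
        states, durations = states[1:-1], durations[1:-1]

    return {int(s): durations[states == s] for s in np.unique(state_path)}

path = [0, 0, 1, 1, 1, 0, 2, 2, 0, 0]
dwells = dwell_times_from_state_path(path, dt=0.5)
all_dwells = dwell_times_from_state_path(path, dt=0.5, exclude_ambiguous_dwells=False)
```

With the edge dwells excluded, state 0 keeps only its single interior dwell; with them included, the two truncated dwells at the start and end of the trace are counted as well.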
- classmethod from_channel(slc, n_states, init_method='kmeans', n_init=1, tol=0.001, max_iter=100)
Initialize a model from channel data.
- Parameters
slc (Slice) – Channel data used for model training.
n_states (int) – The number of Gaussian components in the model.
init_method ({'kmeans', 'random'}) –
“kmeans”: parameters are initialized via the k-means algorithm
“random”: parameters are initialized randomly
n_init (int) – The number of initializations to perform.
tol (float) – The tolerance for training convergence.
max_iter (int) – The maximum number of iterations to perform.
- hist(trace, n_bins=100, plot_kwargs=None, hist_kwargs=None)
Plot a histogram of the data overlaid with the model PDF.
- pdf(x)
Calculate the probability density function (PDF) at the values in the independent data array x.
- Parameters
x (numpy.ndarray) – Array of independent variable values at which to calculate the PDF.
- Returns
PDF array split into components for each state with shape (n_states, x.size). The full normalized PDF can be calculated by summing over the state axis (axis 0).
- Return type
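The shape convention of the returned array can be illustrated with a small NumPy sketch. The function `component_pdf` and the parameter values are hypothetical; the sketch assumes each row is a Gaussian density scaled by its state weight, which is what makes the sum over states a normalized PDF.

```python
import numpy as np

def component_pdf(x, weights, means, std):
    """Hypothetical helper: weighted Gaussian PDF per state, shape (n_states, x.size)."""
    x = np.asarray(x, dtype=float)
    # Turn each parameter array into a column so it broadcasts against x.
    w, m, s = (np.asarray(a, dtype=float)[:, np.newaxis] for a in (weights, means, std))
    return w * np.exp(-0.5 * ((x - m) / s) ** 2) / (s * np.sqrt(2 * np.pi))

x = np.linspace(-10.0, 20.0, 3001)
components = component_pdf(x, weights=[0.3, 0.7], means=[0.0, 10.0], std=[1.0, 1.5])

# Summing over the state axis gives the full mixture PDF.
full_pdf = components.sum(axis=0)

# Simple Riemann sum as a sanity check that the mixture integrates to about 1.
area = full_pdf.sum() * (x[1] - x[0])
```

Keeping the components separate is convenient for plotting each state's contribution on top of a histogram before adding them up.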
- plot(trace, trace_kwargs=None, label_kwargs=None)
Plot a time trace with each data point labeled with the state assignment.
- property aic: float
Calculates the Akaike Information Criterion:
\[AIC = 2 k - 2 \ln{(L)}\]
Where k refers to the number of parameters and L to the maximized value of the likelihood function.
- property bic: float
Calculates the Bayesian Information Criterion:
\[BIC = k \ln{(n)} - 2 \ln{(L)}\]
Where k refers to the number of parameters, n to the number of observations (or data points) and L to the maximized value of the likelihood function.
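Both criteria follow directly from the formulas once the log-likelihood is known. The sketch below is a plain-Python illustration with hypothetical helper names and made-up input values. It assumes that a 1D mixture with K states has k = 3K - 1 free parameters (K means, K variances and K - 1 independent weights); whether the library counts parameters exactly this way is not stated in this reference.

```python
import math

def n_free_parameters(n_states):
    # Assumption: K means + K variances + (K - 1) independent weights.
    return 3 * n_states - 1

def information_criteria(log_likelihood, n_params, n_obs):
    """Hypothetical helper: return (AIC, BIC) for a fitted model."""
    aic = 2 * n_params - 2 * log_likelihood
    bic = n_params * math.log(n_obs) - 2 * log_likelihood
    return aic, bic

k = n_free_parameters(n_states=2)
aic, bic = information_criteria(log_likelihood=-100.0, n_params=k, n_obs=1000)
```

Lower values indicate a better trade-off between fit quality and model complexity. BIC penalizes additional parameters more strongly than AIC once ln(n) exceeds 2, that is, for more than about 7 data points.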
- property means: numpy.ndarray
Model state means.
- property std: numpy.ndarray
Model state standard deviations.
- property variances: numpy.ndarray
Model state variances.
- property weights: numpy.ndarray
Model state weights.