GaussianMixtureModel

lumicks.pylake.GaussianMixtureModel

class GaussianMixtureModel(data, n_states, init_method='kmeans', n_init=1, tol=0.001, max_iter=100)

A wrapper around sklearn.mixture.GaussianMixture.

This model accepts a 1D array as training data. The state parameters are sorted by state mean to make it easier to compare models with different numbers of states or models trained on different datasets. Because the current implementation is designed specifically for 1D data, model parameters are also returned as 1D arrays (numpy.squeeze() is applied to the results), so users do not need to handle the shape of the output.
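The sorting and squeezing described above can be sketched with plain NumPy. This is only an illustration, not the wrapper's actual code: the parameter values and the variable names (`order`, `sorted_means`, and so on) are hypothetical, and the input shapes are assumed to be those a 1D mixture fit with full covariances typically produces.

```python
import numpy as np

# Hypothetical fitted parameters, in the shapes assumed for a 1D mixture fit
means = np.array([[12.0], [8.0], [10.0]])          # (n_states, 1)
variances = np.array([[[0.4]], [[0.1]], [[0.2]]])  # (n_states, 1, 1)
weights = np.array([0.5, 0.2, 0.3])                # (n_states,)

# Sort every parameter by ascending state mean and drop the singleton axes
order = np.argsort(means.squeeze())
sorted_means = means.squeeze()[order]
sorted_variances = variances.squeeze()[order]
sorted_weights = weights[order]

print(sorted_means, sorted_variances, sorted_weights)
```

After sorting, state 0 always has the lowest mean, so state indices can be compared across models.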

Warning

This is early access alpha functionality. While usable, this has not yet been tested in a large number of different scenarios. The API can still be subject to change without any prior deprecation notice! If you use this functionality keep a close eye on the changelog for any changes that may affect your analysis.

Parameters:
  • data (numpy.ndarray | Slice) – Data array used for model training.

  • n_states (int) – The number of Gaussian components in the model.

  • init_method ({'kmeans', 'random'}) –

    • “kmeans”: parameters are initialized via the k-means algorithm

    • “random”: parameters are initialized randomly

  • n_init (int) – The number of initializations to perform.

  • tol (float) – The tolerance for training convergence.

  • max_iter (int) – The maximum number of iterations to perform.

emission_path(trace)

Calculate the emission path for a given data trace.

Parameters:

trace (Slice) – Channel data to determine path.

Returns:

emission_path – Estimated emission path

Return type:

Slice

extract_dwell_times(trace, *, exclude_ambiguous_dwells=True)

Calculate lists of dwell times for each state in a time-ordered state path array.

Parameters:
  • trace (Slice) – Channel data to be analyzed.

  • exclude_ambiguous_dwells (bool) – Determines whether to exclude dwell times that are not exactly determined. If True, the first and last dwells are not used in the analysis, since the exact start/stop times of these events are not definitively known.

Returns:

Dictionary of all dwell times (in seconds) for each state. Keys are state labels.

Return type:

dict
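As a rough illustration of the idea behind dwell time extraction, the sketch below run-length encodes a state path and converts run lengths to seconds. It is not the library's implementation: the helper `dwell_times` and its `dt` argument (sampling interval in seconds) are hypothetical, and the input is assumed to be a plain sequence of integer state labels rather than a Slice.

```python
from itertools import groupby


def dwell_times(state_path, dt, exclude_ambiguous_dwells=True):
    """Return {state: [dwell durations in seconds]} for a time-ordered state path."""
    # Collapse consecutive identical labels into (state, run length) pairs
    runs = [(state, sum(1 for _ in group)) for state, group in groupby(state_path)]
    if exclude_ambiguous_dwells:
        # The first and last runs have an unknown start or stop time
        runs = runs[1:-1]
    dwells = {}
    for state, length in runs:
        dwells.setdefault(state, []).append(length * dt)
    return dwells


path = [0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 1]
print(dwell_times(path, dt=0.5))
print(dwell_times(path, dt=0.5, exclude_ambiguous_dwells=False))
```

With the ambiguous dwells excluded, only the two interior runs contribute; with the flag set to False, all four runs are counted.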

classmethod from_channel(slc, n_states, init_method='kmeans', n_init=1, tol=0.001, max_iter=100)

Initialize a model from channel data.

Deprecated since version 1.4.0: This method has been deprecated and will be removed in a future version. You can now use Slice instances to construct this class directly.

hist(trace, n_bins=100, plot_kwargs=None, hist_kwargs=None)

Plot a histogram of the trace data overlaid with the model state path.

Parameters:
  • trace (Slice) – Data object to histogram.

  • n_bins (int) – Number of histogram bins.

  • plot_kwargs (Optional[dict]) – Plotting keyword arguments passed to the state path line plot.

  • hist_kwargs (Optional[dict]) – Plotting keyword arguments passed to the histogram plot.

label(trace)

Label channel data as states.

Parameters:

trace (Slice) – Channel data to label.

Deprecated since version 1.4.0: This method has been replaced with GaussianMixtureModel.state_path() and will be removed in a future release.

pdf(x)

Calculate the Probability Distribution Function (PDF) given the independent data array x.

Parameters:

x (numpy.ndarray) – Array of independent variable values at which to calculate the PDF.

Returns:

PDF array split into components for each state with shape (n_states, x.size). The full normalized PDF can be calculated by summing across rows.

Return type:

numpy.ndarray
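The shape convention of the returned array can be illustrated with a small NumPy sketch. The helper `component_pdf` and the parameter values are hypothetical, and each row is assumed to be a Gaussian density scaled by its state weight, which is what makes the sum across rows a normalized PDF.

```python
import numpy as np


def component_pdf(x, means, std, weights):
    """Weighted Gaussian density per state, shape (n_states, x.size)."""
    x = np.asarray(x, dtype=float)[np.newaxis, :]         # (1, x.size)
    mu = np.asarray(means, dtype=float)[:, np.newaxis]    # (n_states, 1)
    sigma = np.asarray(std, dtype=float)[:, np.newaxis]   # (n_states, 1)
    w = np.asarray(weights, dtype=float)[:, np.newaxis]   # (n_states, 1)
    gauss = np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
    return w * gauss


x = np.linspace(0.0, 20.0, 2001)
components = component_pdf(x, means=[8.0, 12.0], std=[0.5, 1.0], weights=[0.3, 0.7])

# Summing across rows (axis 0) gives the full mixture PDF
total = components.sum(axis=0)

# Riemann-sum check that the mixture integrates to approximately one
area = float(np.sum(total) * (x[1] - x[0]))
print(components.shape, area)
```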

plot(trace, *, trace_kwargs=None, label_kwargs=None)

Plot a histogram of the trace data with data points classified in states.

Parameters:
  • trace (Slice) – Data object to histogram.

  • trace_kwargs (Optional[dict]) – Plotting keyword arguments passed to the data line plot.

  • label_kwargs (Optional[dict]) – Plotting keyword arguments passed to the state labels plot.

plot_path(trace, *, trace_kwargs=None, path_kwargs=None)

Plot a histogram of the trace data overlaid with the model path.

Parameters:
  • trace (Slice) – Data object to histogram.

  • trace_kwargs (Optional[dict]) – Plotting keyword arguments passed to the data line plot.

  • path_kwargs (Optional[dict]) – Plotting keyword arguments passed to the path line plot.

state_path(trace)

Calculate the state path for a given data trace.

Parameters:

trace (Slice) – Channel data to determine path.

Returns:

state_path – Estimated state path

Return type:

Slice
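A mixture model usually assigns each data point to the state with the highest weighted likelihood. The sketch below shows that rule on raw arrays; it makes no claim about the wrapper's internals. The helper `classify` and all values are hypothetical, and the final line assumes that an emission path is the mean of the assigned state at each point.

```python
import numpy as np


def classify(data, means, std, weights):
    """Return the index of the most probable state for each data point."""
    data = np.asarray(data, dtype=float)[np.newaxis, :]   # (1, n_points)
    mu = np.asarray(means, dtype=float)[:, np.newaxis]    # (n_states, 1)
    sigma = np.asarray(std, dtype=float)[:, np.newaxis]
    w = np.asarray(weights, dtype=float)[:, np.newaxis]
    # Weighted Gaussian log-density; the constant term is the same for all states
    log_prob = np.log(w) - np.log(sigma) - 0.5 * ((data - mu) / sigma) ** 2
    return np.argmax(log_prob, axis=0)


means = np.array([8.0, 12.0])
labels = classify([7.9, 8.3, 11.5, 12.4, 8.1], means, std=[0.5, 0.5], weights=[0.5, 0.5])

# Assumption: the emission path maps each label to its state mean
emission = means[labels]
print(labels, emission)
```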

property aic: float

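Calculates the Akaike Information Criterion, AIC = 2k - 2 ln(L), where k is the number of model parameters and L is the maximized value of the likelihood function.

Deprecated since version 1.4.0: This property has been deprecated and will be removed in a future version. Use GaussianMixtureModel.fit_info.aic instead.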

property bic: float

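Calculates the Bayesian Information Criterion, BIC = k ln(n) - 2 ln(L), where k is the number of model parameters, n is the number of observations, and L is the maximized value of the likelihood function.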

Deprecated since version 1.4.0: This property has been deprecated and will be removed in a future version. Use GaussianMixtureModel.fit_info.bic instead.
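For reference, both criteria can be computed by hand from a log-likelihood using the standard definitions AIC = 2k - 2 ln(L) and BIC = k ln(n) - 2 ln(L). The helper below is hypothetical, and the parameter count k = 3 * n_states - 1 (one mean and one variance per state, plus n_states - 1 free weights) is an assumption for a 1D mixture; the log-likelihood value is made up for illustration.

```python
import math


def information_criteria(log_likelihood, n_states, n_samples):
    """Return (AIC, BIC) for a 1D Gaussian mixture with n_states components."""
    # Assumed free parameters: n means + n variances + (n - 1) weights
    k = 3 * n_states - 1
    aic = 2 * k - 2 * log_likelihood
    bic = k * math.log(n_samples) - 2 * log_likelihood
    return aic, bic


aic, bic = information_criteria(log_likelihood=-1200.0, n_states=2, n_samples=1000)
print(aic, bic)
```

Lower values indicate a better trade-off between goodness of fit and model complexity. BIC penalizes extra states more strongly than AIC once n exceeds about 7 observations, since ln(n) is then greater than 2.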

property exit_flag: dict

Model optimization information.

Deprecated since version 1.4.0: This property has been replaced with GaussianMixtureModel.fit_info and will be removed in a future release.

property fit_info: PopulationFitInfo

Information about the model training exit conditions.

property means: ndarray

Model state means.

property std: ndarray

Model state standard deviations.

property variances: ndarray

Model state variances.

property weights

Model state weights.