Choose parameters

Some details on parameters for each step.

Preprocessor

  • highpass_freq (float): cutoff frequency of the high-pass filter, typically 250~500Hz. This removes the LFP component from the signal. In theory a low value (250Hz) is better, but if the signal contains high-frequency oscillations (during sleep, for instance) they must be removed, so 400Hz should be OK.
  • lowpass_freq (float): cutoff frequency of the low-pass filter (typically 3000~10000Hz). This removes high-frequency noise and helps
    to smooth the spikes for peak alignment. It must not exceed the Nyquist frequency (sample_rate/2).
  • smooth_size (int): another way to smooth the signal. This applies a (roughly triangular) kernel of smooth_size
    width in samples, acting like a low-pass filter. If in doubt, put 0.
  • common_ref_removal (bool): subtracts, sample by sample, the median across channels.
    When strong noise appears on all channels (sometimes due to the reference) you can subtract it; this is as if all channels were numerically re-referenced to their median.
  • chunksize (int): the whole processing chain is applied chunk by chunk; this is the chunk size in samples, typically 1024.
    Smaller sizes use less memory but more CPU in the Peeler. For online use, this is more or less the latency.
  • lostfront_chunksize (int): size in samples of the margin at the front edge of each chunk, to avoid border effects in the backward filter.
    If in doubt, put None; lostfront_chunksize will then be int(sample_rate/highpass_freq)*3, which is quite robust (<5% error) compared to a true offline filtfilt.
  • engine (str): ‘numpy’ or ‘opencl’. The signal preprocessor has two implementations: a numpy/scipy flavor (CPU) and an opencl flavor with home-made CL kernels (GPU). If you have a big GPU and can install an OpenCL driver (ICD) for your platform, the opencl flavor should speed up the Peeler, because preprocessing the signal takes a fairly large share of the time. (A parameter-set example follows this list.)
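
As an illustration, the preprocessor parameters above can be gathered in a plain Python dict (the values are example settings drawn from this section, not universal recommendations):

    preprocessor_params = dict(
        highpass_freq=400.,        # Hz; removes the LFP component, typically 250~500 Hz
        lowpass_freq=5000.,        # Hz; must stay below the Nyquist frequency (sample_rate / 2)
        smooth_size=0,             # 0 disables the extra (triangular) smoothing kernel
        common_ref_removal=False,  # True subtracts the per-sample median across channels
        chunksize=1024,            # chunk size in samples for the processing chain
        lostfront_chunksize=None,  # None -> int(sample_rate / highpass_freq) * 3
        engine='numpy',            # or 'opencl' to run the preprocessing on GPU
    )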

Peak detector

  • peakdetector_engine (str): ‘numpy’ or ‘opencl’. See signal_preprocessor_engine. Here the speedup is small.
  • peak_sign (str): sign of the peak (‘+’ or ‘-’). Double detection (‘+-’) is intentionally NOT implemented in tridesclous because it leads to many mistakes for users on multi-electrode arrays, where the same cluster is seen both on the negative peak and on the positive rebound.
  • relative_threshold (float): the threshold, without sign, in MAD units (a robust standard deviation). See Important details and the sketch after this list.
  • peak_span_ms (float): avoids double detection of the same peak within a short span. The unit is milliseconds.
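
To make the MAD units concrete, here is a minimal sketch of how a threshold in robust standard deviations is commonly computed on one filtered channel (the exact estimator inside tridesclous may differ):

    import numpy as np

    def mad_threshold(filtered_channel, relative_threshold=5.):
        # robust standard deviation from the median absolute deviation;
        # 0.6745 rescales the MAD to a std under Gaussian noise
        med = np.median(filtered_channel)
        mad = np.median(np.abs(filtered_channel - med)) / 0.6745
        return relative_threshold * mad  # unsigned detection threshold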

Waveform extraction

  • wf_left_ms (float): size in ms of the left sweep from the peak index; this number is negative.
  • wf_right_ms (float): size in ms of the right sweep from the peak index; this number is positive. (A sample-conversion example follows this list.)
  • mode (str): ‘rand’ or ‘all’. With ‘all’, all detected peaks are extracted; with ‘rand’, only a randomized subset is taken.
    Note that if you use tridesclous with the script/notebook method you can also choose yourself which peaks are used for waveform extraction. This can be useful to avoid electrical/optical stimulation periods or to force peaks around stimulus periods.
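
For example, converting the sweep window from milliseconds to samples (plain arithmetic with an illustrative sample rate, shown only to fix the sign conventions):

    sample_rate = 20000.                 # Hz, illustrative
    wf_left_ms, wf_right_ms = -1.5, 2.5  # left is negative, right is positive
    n_left = int(wf_left_ms / 1000. * sample_rate)    # samples before the peak (negative)
    n_right = int(wf_right_ms / 1000. * sample_rate)  # samples after the peak (positive)
    width = n_right - n_left             # total waveform width in samples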

Clean peaks

  • alien_value_threshold (float): units = one MAD. Above this threshold (default 100) a waveform is tagged as ‘alien’ and is not used for features and clustering. (A sketch of this tagging follows this list.)
  • mode (str): ‘extremum_amplitude’ or ‘full_waveform’. Use only the peak value (fast) or the whole waveform (slower).

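A sketch of the ‘alien’ tagging idea, assuming waveforms already expressed in MAD units (this mirrors the description above, not the exact internal code):

    import numpy as np

    def tag_alien_waveforms(waveforms, alien_value_threshold=100.):
        # waveforms: (n_spikes, n_samples, n_channels), in MAD units
        # a waveform is 'alien' when any sample exceeds the threshold
        return np.abs(waveforms).max(axis=(1, 2)) > alien_value_threshold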

Peak sampler

This step must be carefully inspected. It selects some peaks among all detected peaks for features and clustering. The right choice depends strongly on: the duration over which the catalogue constructor is run, the number of channels (and so the number of cells), and the density (firing rate) of each cluster. Since this cannot be known in advance, the user must explore the clusters and extract again while changing these parameters, until the clusters are dense enough. This has a strong impact on CPU and RAM, so do not choose numbers that are too big; but if there are too few spikes, some clusters may not be detected.

  • mode (str): ‘rand’, ‘rand_by_channel’ or ‘all’. Strategy for selecting the peaks used for features and clustering.
    ‘rand’ takes a global number of spikes, independently of the detection channel; ‘rand_by_channel’ takes a random number of spikes per channel; ‘all’ takes all spikes. Internally, with scripts, ‘force’ allows selecting manually which spikes to keep, for instance around a stimulus.
  • nb_max (int): the total number of spikes kept when mode=‘rand’.
  • nb_max_by_channel (int): the number of spikes kept per channel when mode=‘rand_by_channel’. (An example parameter set follows.)
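
For instance, a peak-sampler parameter set could look like this (the numbers are illustrative; adjust them after inspecting the resulting clusters):

    peak_sampler_params = dict(
        mode='rand_by_channel',   # 'rand', 'rand_by_channel' or 'all'
        nb_max_by_channel=1000,   # spikes kept per channel in this mode
        # nb_max=20000,           # total spikes kept when mode='rand'
    )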

Noise snippet extraction

  • nb_snippet (int): the number of noise snippets taken from the signal in between peaks. (A sketch of the idea follows.)
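
A minimal sketch of the idea, under the assumption that snippets have the same width as spike waveforms and are drawn at random positions far from detected peaks (the helper and its signature are hypothetical):

    import numpy as np

    def extract_noise_snippets(signals, peak_indexes, width, nb_snippet=300, seed=None):
        # signals: (n_samples, n_channels) preprocessed signals
        rng = np.random.default_rng(seed)
        snippets = []
        while len(snippets) < nb_snippet:
            i = int(rng.integers(width, signals.shape[0] - 2 * width))
            if np.min(np.abs(peak_indexes - i)) > width:  # stay away from spikes
                snippets.append(signals[i:i + width, :])
        return np.array(snippets)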

Features extraction

Several methods possible. See Important details.

  • global_pca: good option for tetrode.
    • n_components (int): number of components of the PCA over all channels.
  • peak_max: a good option when the clustering method is sawchaincut.
  • pca_by_channel: good option for high channel counts (a sklearn sketch follows this list).
    • n_components_by_channel (int): number of components for each channel.
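
A sketch of the ‘pca_by_channel’ idea using sklearn: fit a small PCA independently on each channel and concatenate the projections into one feature vector (tridesclous wraps this internally, so the real code differs):

    import numpy as np
    from sklearn.decomposition import PCA

    def pca_by_channel(waveforms, n_components_by_channel=3):
        # waveforms: (n_spikes, n_samples, n_channels)
        feats = []
        for chan in range(waveforms.shape[2]):
            pca = PCA(n_components=n_components_by_channel)
            feats.append(pca.fit_transform(waveforms[:, :, chan]))
        return np.concatenate(feats, axis=1)  # (n_spikes, n_channels * n_components)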

Cluster

Several methods possible. See Important details. (A small sklearn example follows the list.)

  • kmeans: k-means implemented in sklearn
    • n_clusters (int): number of clusters
  • onecluster: no clustering; all labels are set to 0.
  • gmm: Gaussian mixture model implemented in sklearn
    • n_clusters (int): number of clusters
    • covariance_type (str): ‘full’, ‘tied’, ‘diag’, ‘spherical’
    • n_init (int): the number of initializations to perform.
  • agglomerative: AgglomerativeClustering implemented in sklearn
    • n_clusters (int): number of clusters
  • dbscan: DBSCAN implemented in sklearn
    • eps (float): The maximum distance between two samples for them to be considered as in the same neighborhood.
  • hdbscan: HDBSCAN, density-based clustering without the eps problem
  • isosplit: ISOSPLIT5, developed for MountainSort (another sorter)
  • optics: OPTICS implemented in sklearn
    • min_samples (int): The number of samples in a neighborhood for a point to be considered as a core point.
  • sawchaincut: home-made automatic clustering, useful for dense arrays. It auto-detects well-isolated clusters and sends ambiguous events to trash.
  • pruningshears: another home-made automatic clustering. It internally uses hdbscan and performs better than sawchaincut, but it is slower.
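
To show how a couple of these methods map onto sklearn estimators (tridesclous wraps them internally; random features stand in for real ones here):

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.mixture import GaussianMixture

    features = np.random.randn(1000, 15)  # placeholder for extracted features
    labels_kmeans = KMeans(n_clusters=8).fit_predict(features)
    # for 'gmm', n_clusters corresponds to sklearn's n_components
    gmm = GaussianMixture(n_components=8, covariance_type='full', n_init=1)
    labels_gmm = gmm.fit_predict(features)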