Choose parameters

Some details on parameters for each step.

Preprocessor

  • highpass_freq (float): cutoff frequency of the high-pass filter, typically 250~500Hz. This removes the LFP component of the signal. In theory a low value (250Hz) is better, but if the signal contains oscillations at high frequencies (during sleep, for instance) they must be removed, so 400Hz should be OK.
  • lowpass_freq (float): cutoff frequency of the low-pass filter (typically 3000~10000Hz). This removes high-frequency noise, which helps
    to smooth the spikes for peak alignment. It must not exceed the Nyquist frequency (sample_rate/2).
  • smooth_size (int): another way to smooth the signal. This applies a (more or less triangular) kernel whose width is smooth_size
    samples. It acts like a low-pass filter. If you don’t know, put 0.
  • common_ref_removal (bool): subtracts, sample by sample, the median across channels.
    When strong noise appears on all channels (sometimes due to the reference) you can subtract it. This is as if all channels were re-referenced numerically to their median.
  • chunksize (int): the whole processing chain is applied chunk by chunk; this is the chunk size in samples. Typically 1024.
    A smaller size leads to lower memory use but more CPU consumption in the Peeler. For online use, this is more or less the latency.
  • lostfront_chunksize (int): size in samples of the margin at the front edge of each chunk, used to avoid border effects in the backward filter.
    If you don’t know, put None; lostfront_chunksize will then be int(sample_rate/highpass_freq)*3, which is quite robust (<5% error) compared to a true offline filtfilt.
  • signalpreprocessor_engine (str): ‘numpy’ or ‘opencl’. The signal preprocessor has two implementations: a numpy/scipy flavor (CPU) and an opencl flavor with home-made CL kernels (GPU computing). If you have a powerful GPU and are able to install an “opencl driver” (ICD) for your platform, the opencl flavor should speed up the Peeler, because signal preprocessing takes a significant amount of time.
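
The arithmetic behind a few of these parameters can be sketched in plain numpy. This is only an illustration, not the tridesclous implementation: the helper names below are hypothetical, and the chunk layout (n_samples, n_channels) is an assumption.

```python
import numpy as np

def default_lostfront_chunksize(sample_rate, highpass_freq):
    # Formula quoted above for lostfront_chunksize=None.
    return int(sample_rate / highpass_freq) * 3

def lowpass_is_valid(sample_rate, lowpass_freq):
    # lowpass_freq must not exceed the Nyquist frequency (sample_rate/2).
    return lowpass_freq <= sample_rate / 2

def remove_common_reference(chunk):
    # chunk is assumed to have shape (n_samples, n_channels).
    # For each sample, subtract the median taken across channels.
    return chunk - np.median(chunk, axis=1, keepdims=True)

sample_rate = 20000.0
margin = default_lostfront_chunksize(sample_rate, 400.0)  # int(50.0) * 3 = 150
ok = lowpass_is_valid(sample_rate, 5000.0)                # 5000 <= 10000
```

With a 20kHz recording and highpass_freq=400, the default margin is 150 samples, which is small compared to a chunksize of 1024.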

Peak detector

  • peakdetector_engine (str): ‘numpy’ or ‘opencl’. See signalpreprocessor_engine. Here the speedup is small.
  • peak_sign (str): sign of the peak (‘+’ or ‘-‘). Double detection (‘+-‘) is intentionally NOT implemented in tridesclous because it leads users to many mistakes on multi-electrode arrays, where the same cluster is seen both on the negative peak and on the positive rebound.
  • relative_threshold (float): the threshold, without sign, in MAD units (robust standard deviation). See Important details.
  • peak_span_ms (float): this avoids double detection of the same peak within a short span. The unit is milliseconds.
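
How relative_threshold, peak_sign and peak_span_ms interact can be sketched as follows. This is a simplified single-channel illustration, not the tridesclous detector: the function name detect_peaks, the 1.4826 scaling of the MAD, and the rule of keeping the largest peak within a span are all assumptions made for the example.

```python
import numpy as np

def detect_peaks(signal, sample_rate, relative_threshold=5.0,
                 peak_sign='-', peak_span_ms=0.5):
    if peak_sign not in ('+', '-'):
        raise ValueError("peak_sign must be '+' or '-'")
    # Normalise to MAD units (scaled so it estimates a standard deviation).
    med = np.median(signal)
    mad = np.median(np.abs(signal - med)) * 1.4826
    normed = (signal - med) / mad
    if peak_sign == '-':
        normed = -normed  # negative peaks become positive
    # Local maxima that cross the unsigned threshold.
    mid = normed[1:-1]
    is_peak = (mid > relative_threshold) & (mid >= normed[:-2]) & (mid > normed[2:])
    candidates = np.flatnonzero(is_peak) + 1
    # Within peak_span_ms, keep only the largest candidate.
    span = int(peak_span_ms * sample_rate / 1000.0)
    kept = []
    for idx in candidates:
        if kept and idx - kept[-1] <= span:
            if normed[idx] > normed[kept[-1]]:
                kept[-1] = idx
        else:
            kept.append(idx)
    return np.array(kept, dtype=int)

# Synthetic signal: a low-amplitude sine with three large negative deflections.
signal = np.sin(np.arange(2000) * 0.3)
signal[500] = -30.0
signal[505] = -20.0   # 5 samples after the first one: inside the span
signal[1500] = -25.0
peaks = detect_peaks(signal, sample_rate=10000.0, peak_span_ms=1.0)
```

In this example the deflection at sample 505 falls inside the 1 ms span (10 samples) of the larger one at sample 500, so only one of the two is kept.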

Waveform extraction

  • wf_left_ms (float): size in ms of the left sweep from the peak index. This number is negative.
  • wf_right_ms (float): size in ms of the right sweep from the peak index. This number is positive.
  • mode (str): ‘rand’ or ‘all’. With ‘all’, all detected peaks are extracted. With ‘rand’, only a randomized subset is taken.
    Note that if you use tridesclous with the script/notebook method you can also choose yourself which peaks are used for waveform extraction. This can be useful to avoid electrical/optical stimulation periods or to force peaks around stimulus periods.
  • nb_max (int): for ‘rand’ mode, this is the number of peaks extracted. This number must be chosen carefully. It depends strongly on: the duration over which the catalogue constructor is run + the number of channels (and so the number of cells) + the density (firing rate) of each cluster. Since this can’t be known in advance, the user must explore the clusters and extract again, changing this number until the clusters are dense enough. This has a strong impact on CPU and RAM, so do not choose too big a number.
  • align_waveform (bool): experimental. Oversamples the signal before extracting waveforms. This slows down the process a lot
    and brings little improvement to the centroids. Note that this is NOT the true inter-sample peak estimation done by the Peeler (which is very important).
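
The conversion of wf_left_ms/wf_right_ms to samples and the effect of mode and nb_max can be sketched as follows. This is a minimal illustration under stated assumptions, not the tridesclous code: the function name extract_waveforms, the seed argument, the (n_samples, n_channels) signal layout and the choice to drop peaks too close to the borders are all hypothetical.

```python
import numpy as np

def extract_waveforms(signals, peak_indexes, sample_rate,
                      wf_left_ms=-1.0, wf_right_ms=1.5,
                      mode='rand', nb_max=1000, seed=0):
    if mode not in ('rand', 'all'):
        raise ValueError("mode must be 'rand' or 'all'")
    # Convert the sweep from milliseconds to samples (n_left is negative).
    n_left = int(wf_left_ms * sample_rate / 1000.0)
    n_right = int(wf_right_ms * sample_rate / 1000.0)
    peak_indexes = np.asarray(peak_indexes, dtype=int)
    # Drop peaks whose sweep would fall outside the signal.
    inside = (peak_indexes + n_left >= 0) & (peak_indexes + n_right <= signals.shape[0])
    peak_indexes = peak_indexes[inside]
    # 'rand': keep a random subset of at most nb_max peaks.
    if mode == 'rand' and peak_indexes.size > nb_max:
        rng = np.random.default_rng(seed)
        peak_indexes = np.sort(rng.choice(peak_indexes, size=nb_max, replace=False))
    # Output shape: (n_peaks, n_right - n_left, n_channels).
    waveforms = np.zeros((peak_indexes.size, n_right - n_left, signals.shape[1]),
                         dtype=signals.dtype)
    for i, ind in enumerate(peak_indexes):
        waveforms[i] = signals[ind + n_left:ind + n_right]
    return peak_indexes, waveforms

signals = np.arange(400, dtype=float).reshape(200, 2)
kept, wfs = extract_waveforms(signals, [5, 50, 100, 195], sample_rate=10000.0,
                              wf_left_ms=-1.0, wf_right_ms=2.0, mode='all')
```

Memory grows linearly with the number of extracted peaks, the sweep width and the number of channels, which is why nb_max has a direct impact on RAM.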

Waveform clean

  • alien_value_threshold (float): in MAD units. Above this threshold a waveform is tagged as “Alien” and is not used for features and clustering.
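
One possible reading of this test is sketched below. It assumes that the waveforms are already expressed in MAD units and that a waveform is tagged as soon as its largest absolute value, on any channel, exceeds the threshold; the function name tag_aliens is hypothetical and the actual criterion in tridesclous may differ.

```python
import numpy as np

def tag_aliens(waveforms, alien_value_threshold):
    # waveforms is assumed to have shape (n_spikes, n_samples, n_channels),
    # already normalised to MAD units.
    return np.max(np.abs(waveforms), axis=(1, 2)) > alien_value_threshold

waveforms = np.zeros((3, 4, 2))
waveforms[1, 2, 0] = -150.0   # artefact-like value on one channel
is_alien = tag_aliens(waveforms, alien_value_threshold=100.0)
clean_waveforms = waveforms[~is_alien]
```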

Noise snippet extraction

  • nb_snippet (int): the number of noise snippets taken from the signal in between peaks.

Features extraction

Several methods are possible. See Important details.

  • global_pca:
    • n_components (int): number of PCA components, computed over all channels together.
  • peak_max: no parameters
  • pca_by_channel:
    • n_components_by_channel (int): number of components for each channel.
  • neighborhood_pca:
    • n_components_by_neighborhood (int): number of components for each channel and its neighborhood.
    • radius_um (float): radius around the channel in micrometers.
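
The difference between global_pca and pca_by_channel shows up in the size of the feature matrix. The sketch below uses a plain SVD-based PCA in numpy; it is an illustration only, with hypothetical function names and an assumed waveform layout (n_spikes, n_samples, n_channels), and it does not cover neighborhood_pca or peak_max.

```python
import numpy as np

def pca_project(x, n_components):
    # Centre the data, then project onto the first principal axes.
    x = x - x.mean(axis=0)
    _, _, vt = np.linalg.svd(x, full_matrices=False)
    return x @ vt[:n_components].T

def global_pca(waveforms, n_components):
    # Flatten samples and channels together: one PCA for all channels.
    flat = waveforms.reshape(waveforms.shape[0], -1)
    return pca_project(flat, n_components)

def pca_by_channel(waveforms, n_components_by_channel):
    # One PCA per channel, then concatenate the projections.
    n_channels = waveforms.shape[2]
    parts = [pca_project(waveforms[:, :, c], n_components_by_channel)
             for c in range(n_channels)]
    return np.concatenate(parts, axis=1)

rng = np.random.default_rng(0)
waveforms = rng.normal(size=(50, 30, 4))
features_global = global_pca(waveforms, n_components=5)
features_by_chan = pca_by_channel(waveforms, n_components_by_channel=3)
```

Here global_pca gives 5 features per spike whatever the channel count, while pca_by_channel gives 3 x 4 = 12, so its feature space grows with the number of channels.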

Cluster

Several methods are possible. See Important details.

  • kmeans: KMeans as implemented in sklearn
    • n_clusters (int): number of clusters
  • onecluster: no clustering. All labels are set to 0.
  • gmm: Gaussian mixture model as implemented in sklearn
    • n_clusters (int): number of clusters
    • covariance_type (str): ‘full’, ‘tied’, ‘diag’, ‘spherical’
    • n_init (int): the number of initializations to perform.
  • agglomerative: AgglomerativeClustering as implemented in sklearn
    • n_clusters (int): number of clusters
  • dbscan: DBSCAN as implemented in sklearn
    • eps (float): the maximum distance between two samples for them to be considered as in the same neighborhood.
  • optics: OPTICS as implemented in sklearn
    • min_samples (int): the number of samples in a neighborhood for a point to be considered as a core point.
  • sawchaincut: home-made automatic clustering, useful for dense arrays. It auto-detects well-isolated clusters and puts ambiguous things in the trash.
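
For the sklearn-based methods, the parameters above correspond to arguments of the sklearn estimators. The sketch below shows one plausible mapping on a feature matrix; it is not how tridesclous calls sklearn internally. The dispatcher name cluster_features is hypothetical, fixing random_state and n_init=10 for KMeans are choices made for the example, and mapping n_clusters to GaussianMixture's n_components is an assumption. sawchaincut is home-made and is not covered.

```python
import numpy as np
from sklearn.cluster import KMeans, AgglomerativeClustering, DBSCAN, OPTICS
from sklearn.mixture import GaussianMixture

def cluster_features(features, method, **kargs):
    if method == 'kmeans':
        model = KMeans(n_clusters=kargs['n_clusters'], n_init=10, random_state=0)
    elif method == 'onecluster':
        # No clustering: every spike gets label 0.
        return np.zeros(features.shape[0], dtype=int)
    elif method == 'gmm':
        # sklearn names the number of clusters n_components.
        model = GaussianMixture(n_components=kargs['n_clusters'],
                                covariance_type=kargs.get('covariance_type', 'full'),
                                n_init=kargs.get('n_init', 1), random_state=0)
    elif method == 'agglomerative':
        model = AgglomerativeClustering(n_clusters=kargs['n_clusters'])
    elif method == 'dbscan':
        model = DBSCAN(eps=kargs['eps'])
    elif method == 'optics':
        model = OPTICS(min_samples=kargs['min_samples'])
    else:
        raise ValueError('unknown method: ' + method)
    return model.fit_predict(features)

# Two well-separated synthetic blobs standing in for two units.
rng = np.random.default_rng(0)
features = np.concatenate([rng.normal(0.0, 0.1, size=(50, 2)),
                           rng.normal(10.0, 0.1, size=(50, 2))])
labels_kmeans = cluster_features(features, 'kmeans', n_clusters=2)
labels_dbscan = cluster_features(features, 'dbscan', eps=1.0)
```

Note that kmeans, gmm and agglomerative need the number of clusters in advance, whereas dbscan and optics infer it from the density parameters and may label some points as noise (-1).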