The options for the SCE algorithm are explained below; other options are explained in Input options, Creating animations and Parallelisation.
The progress bar shows the percentage complete, the current learning rate \(\eta\), the divergence \(Eq\) and the clash rate (see Parallelisation):
Optimizing Progress: 99.9%, eta=0.0010, Eq=0.0547636151, clashes=0.1%
Optimizing done in 20s
The algorithm should be run until \(Eq\) stabilises.
--perplexity PERPLEXITY  Perplexity for distance to similarity conversion [default = 15]
--no-preprocessing       Turn off entropy pre-processing of distances
--weight-file WEIGHT_FILE  Weights for samples
--maxIter MAXITER        Maximum SCE iterations [default = 100000]
--nRepuSamp NREPUSAMP    Number of neighbours for calculating repulsion (1 or 5) [default = 5]
--eta0 ETA0              Learning rate [default = 1]
--bInit BINIT            1 for over-exaggeration in early stage [default = 0]
--no-clustering          Turn off HDBSCAN clustering after SCE
--perplexity roughly sets the balance between global and local structure: smaller values make fewer neighbours matter and emphasise local structure. Typical values are between 5 and 50, but making plots at multiple values is often useful.
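As a rough illustration of why smaller perplexities make fewer neighbours matter, the sketch below computes perplexity using its standard definition, 2 raised to the Shannon entropy of a sample's neighbour probabilities. It is not code from this tool; the function name `perplexity` and the example distributions are made up for illustration.

```python
import math

def perplexity(probs):
    """Perplexity of a discrete distribution: 2 ** (Shannon entropy in bits)."""
    entropy = -sum(p * math.log2(p) for p in probs if p > 0)
    return 2 ** entropy

# Hypothetical sample whose similarity is spread evenly over 15 neighbours
uniform = [1 / 15] * 15
# Hypothetical sample dominated by a single close neighbour
peaked = [0.9] + [0.1 / 14] * 14

print(perplexity(uniform))  # close to 15: all 15 neighbours count equally
print(perplexity(peaked))   # much smaller: effectively very few neighbours
```

Perplexity can be read as an effective number of neighbours, so a target of 15 asks each sample to spread its similarity over roughly 15 others.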
--no-preprocessing uses raw similarities as input, rather than a probability distribution derived from the desired perplexity. Do not use this option unless you are having issues setting the perplexity.
--weight-file allows you to give different samples different weights in being picked to attract or repel. The default is to weight samples equally, but you could, for example, weight by cluster size or its inverse. This requires a tab-separated file with no header: the first column contains sample names, the second column their weights.
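A minimal sketch of producing such a file, assuming you already have a cluster label for each sample and want to weight by inverse cluster size. The helper `write_weight_file`, the sample names, the cluster labels and the output filename `weights.tsv` are all hypothetical.

```python
import csv
from collections import Counter

def write_weight_file(path, sample_clusters):
    """Write 'sample<TAB>weight' rows with no header, weighting by inverse cluster size."""
    sizes = Counter(sample_clusters.values())
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        for sample, cluster in sample_clusters.items():
            writer.writerow([sample, 1 / sizes[cluster]])

# Hypothetical sample names and cluster labels
example = {"sample_1": "A", "sample_2": "A", "sample_3": "B"}
write_weight_file("weights.tsv", example)
```

The resulting file could then be passed as `--weight-file weights.tsv`. The sample names in the first column presumably need to match those in your input data.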
--maxIter sets how long the algorithm will run for. Larger datasets will need more iterations. We recommend running until \(Eq\) stabilises.
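What counts as "stabilised" is a judgment call. One possible check, sketched below, is to note the \(Eq\) values reported by the progress bar and test whether the most recent ones vary by less than a small relative tolerance. The function `has_stabilised`, the window of 5 values and the tolerance of 1e-3 are illustrative assumptions, not part of the tool.

```python
def has_stabilised(eq_values, window=5, rel_tol=1e-3):
    """True if the last `window` values span less than rel_tol relative to their mean."""
    if len(eq_values) < window:
        return False  # not enough observations to judge
    recent = eq_values[-window:]
    mean = sum(recent) / window
    if mean == 0:
        return max(recent) == min(recent)
    return (max(recent) - min(recent)) / abs(mean) < rel_tol

# Hypothetical Eq readings noted down during a run
readings = [0.2, 0.1, 0.0548, 0.05477, 0.05476, 0.05476, 0.05476]
print(has_stabilised(readings))
```

If the check fails at the end of a run, increasing `--maxIter` is the natural next step.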
--nRepuSamp is the number of neighbours used for calculating the repulsion at each iteration. This can be one or five.
--eta0 sets the scale of the learning rate. A higher value will move points around more at each iteration.
--bInit turns on over-exaggeration, which sets the strength of attraction four times higher in the first 10% of the iterations.
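The over-exaggeration schedule described above can be sketched as a simple multiplier on the attractive force. This is an illustration of the description only, not the tool's implementation; the function name `attraction_scale`, the 0-based iteration counter and the exact handling of the 10% boundary are assumptions.

```python
def attraction_scale(iteration, max_iter, b_init):
    """Multiplier applied to attraction at a given 0-based iteration."""
    # Integer comparison: iteration is within the first 10% of max_iter
    in_early_stage = iteration * 10 < max_iter
    if b_init and in_early_stage:
        return 4.0
    return 1.0

# With the default --maxIter of 100000, the early stage covers iterations 0 to 9999
print(attraction_scale(0, 100000, 1))      # 4.0
print(attraction_scale(10000, 100000, 1))  # 1.0
print(attraction_scale(0, 100000, 0))      # 1.0 (over-exaggeration off)
```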
--no-clustering turns off the HDBSCAN clustering if you don’t want spatial clustering run on the embedding (which may stall when no structure has been found). This is implied if --labels have been provided.