Parameter options
1. Specificity metric:
There are several options for the specificity metric. The first is a simple fold enrichment
score calculated for each prey and the bait it was detected with, relative to the entire dataset:

where x_{i,j} is the spectral count for prey j detected with bait i and N is the number of baits.
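A minimal Python sketch of a fold-enrichment calculation follows. It assumes the score is the prey's spectral count with the bait divided by its average count across the other N - 1 baits. That reading, the function name and the toy counts are illustrative assumptions, not the tool's implementation.

```python
def fold_enrichment(counts, bait, prey):
    """Fold enrichment of `prey` with `bait` relative to the other baits.

    `counts` maps bait -> {prey: spectral count}; undetected preys count as 0.
    Assumed formula: x[i][j] / (sum of x[k][j] for k != i, divided by N - 1).
    """
    n_baits = len(counts)
    if n_baits < 2:
        raise ValueError("need at least two baits to compute a background")
    x_ij = counts[bait].get(prey, 0.0)
    others = sum(counts[b].get(prey, 0.0) for b in counts if b != bait)
    background = others / (n_baits - 1)
    if background == 0:
        # Prey seen only with this bait: treated as infinitely enriched (assumption).
        return float("inf") if x_ij > 0 else 0.0
    return x_ij / background


# Hypothetical toy dataset: three baits, two preys.
counts = {
    "baitA": {"prey1": 20, "prey2": 5},
    "baitB": {"prey1": 2, "prey2": 5},
    "baitC": {"prey1": 2, "prey2": 5},
}
```

Under these assumptions prey1 is enriched with baitA, while prey2, detected equally with every bait, scores 1.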
The other scores are implemented as described by the Comparative Proteomic Analysis Software Suite
(CompPASS). We refer the user to the CompPASS tutorial page for detailed descriptions.
- Z-score: a prey's Z-score indicates the number of standard deviations its abundance lies from the mean.
- S-score: the S-score reflects the abundance of a prey adjusted by the frequency with which
it is found across baits (lower frequency = higher score). Unlike with the fold-enrichment and Z-scores,
prey abundance will affect comparisons between preys. For example, if two preys are equally
frequent, the one with the higher abundance will receive a higher score.
- D-score: the D-score is calculated in the same way as the S-score, except that reproducibility is incorporated
into it, i.e. a reproducibly found prey will score higher than one that isn't. This score should only be
selected when abundance information is available for two or more replicates.
The abundance column must contain the replicate values as a pipe-separated list. See the "Spec" column from
the example SAINT file to see how this should be formatted.
- WD-score: the WD-score is a weighted D-score that attempts to adjust the D-score to better recover/score
frequently found proteins that show behavior typical of true interactors. Like the D- and S-scores, prey
abundance affects comparisons between preys.
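The following Python sketch illustrates how the Z-, S- and D-scores relate to each other. It assumes the commonly cited CompPASS forms, Z = (X - mean) / sd, S = sqrt((k / f) * X) and D = sqrt((k / f)^p * X), where k is the number of baits, f the number of baits the prey was found with, p the number of replicates in which it was detected and X its abundance. The function names, the use of the replicate mean for X and the population standard deviation are assumptions. The WD-score is not covered here; see the CompPASS tutorial for the authoritative definitions.

```python
import math
import statistics


def parse_replicates(spec):
    """Split a pipe-separated abundance string such as "12|9" into floats."""
    return [float(value) for value in spec.split("|")]


def z_score(value, values_across_baits):
    """Standard deviations between `value` and the prey's mean across baits."""
    mean = statistics.mean(values_across_baits)
    sd = statistics.pstdev(values_across_baits)  # population SD (assumption)
    return 0.0 if sd == 0 else (value - mean) / sd


def s_score(abundance, n_baits, n_baits_with_prey):
    """Abundance weighted by inverse frequency across baits."""
    return math.sqrt(n_baits / n_baits_with_prey * abundance)


def d_score(replicates, n_baits, n_baits_with_prey):
    """Like the S-score, with the frequency term raised to the number of
    replicates in which the prey was detected."""
    p = sum(1 for value in replicates if value > 0)
    mean_abundance = statistics.mean(replicates)
    return math.sqrt((n_baits / n_baits_with_prey) ** p * mean_abundance)
```

With four baits and a prey found with only one of them, `d_score(parse_replicates("4|4"), 4, 1)` is larger than `d_score(parse_replicates("8|0"), 4, 1)` even though the mean abundance is the same, which is the reproducibility effect described for the D-score.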
2. Score filter:
All preys that satisfy this score cutoff will be displayed in the scatter plot. Note:
specificity scores will be calculated for all preys; this cutoff is only for display purposes.
3. Points to label (default 10):
The number of points to label on the plot beginning with the highest specificity score and moving
downwards.
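The interaction between the score filter and the label count can be sketched in Python as follows. The function name is hypothetical, and treating "satisfy" as "score greater than or equal to the cutoff" is an assumption.

```python
def select_for_plot(scores, cutoff, n_labels=10):
    """Return (displayed, labelled) prey names.

    `scores` maps prey name -> specificity score. Scores are assumed to be
    computed for every prey already; the cutoff only limits what is drawn.
    """
    displayed = [prey for prey, score in scores.items() if score >= cutoff]
    # Highest specificity first, so labels go to the top-scoring points.
    displayed.sort(key=lambda prey: scores[prey], reverse=True)
    return displayed, displayed[:n_labels]


# Hypothetical scores for four preys.
scores = {"A": 12.0, "B": 3.5, "C": 0.8, "D": 7.1}
displayed, labelled = select_for_plot(scores, cutoff=1.0, n_labels=2)
```

Here three preys pass the cutoff and are displayed, but only the two with the highest scores are labelled.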
4. Control subtraction:
The average value of a prey across control samples will be subtracted from the detected value for the bait
if this is set to "yes". The quantitative value for the prey becomes the value above and beyond what is seen
across the control samples. Specify the column to use for controls in the adjacent "Control column" field.
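A minimal Python sketch of control subtraction follows. The function name is hypothetical, and flooring the result at zero when the control average exceeds the bait value is an assumption that the description above does not confirm.

```python
def subtract_controls(bait_value, control_values):
    """Subtract the average control abundance of a prey from its bait value."""
    if not control_values:
        return float(bait_value)
    control_mean = sum(control_values) / len(control_values)
    # Floor at zero so a prey never gets a negative abundance (assumption).
    return max(bait_value - control_mean, 0.0)
```

For example, a prey with a value of 20 for the bait and control values of 2 and 4 would keep an abundance of 17.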
5. Adjust abundance to protein length (no by default):
The spectral count/abundance value of each prey can be normalized to its protein length if a
column with protein length is available in the input file. This normalization will not affect specificity
scores. It can be used to adjust the x (abundance) dimension on the output scatterplots so that it is weighted
relative to protein length. The multiplication factor used to normalize a prey's abundance is calculated as the
median of the length of all significant preys (those passing the cutoff) divided by the prey's length.
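The length adjustment can be sketched in Python as follows. The function name and the example lengths are hypothetical; the factor itself follows the description above.

```python
import statistics


def length_adjusted_abundance(abundance, prey_length, significant_lengths):
    """Scale abundance by (median length of significant preys) / (prey length)."""
    factor = statistics.median(significant_lengths) / prey_length
    return abundance * factor


# Hypothetical lengths (in residues) of the preys passing the score cutoff.
significant_lengths = [200, 400, 600]
```

A prey twice as long as the median has its abundance halved, while a prey half the median length has its abundance doubled.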
6. Normalization between samples (none by default):
No normalization across baits is applied by default, but when baits in the same dataset have been run on
instruments with varying sensitivity or dynamic range, normalization should be applied. The
options for normalization are to divide by the total abundance for all proteins identified
in the run or normalize based on a specific prey.
Normalization will be applied after control subtraction if both are specified.
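The two normalization options can be sketched in Python as follows. The function names are hypothetical, and whether the tool rescales the values afterwards (for example, to a common total) is not stated above, so this sketch simply divides.

```python
def normalize_by_total(bait_counts):
    """Divide each prey's abundance by the total abundance for the bait."""
    total = sum(bait_counts.values())
    if total == 0:
        return dict(bait_counts)
    return {prey: value / total for prey, value in bait_counts.items()}


def normalize_by_prey(bait_counts, reference_prey):
    """Divide each prey's abundance by that of a chosen reference prey."""
    reference = bait_counts.get(reference_prey, 0)
    if reference == 0:
        raise ValueError(f"{reference_prey!r} was not detected with this bait")
    return {prey: value / reference for prey, value in bait_counts.items()}
```

Either function would be applied to one bait at a time, using values from which controls have already been subtracted when both options are selected.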
7. Log transform (default no):
If desired, data can be log-transformed using base 2, base 10 or the natural logarithm. Log transformation will be performed
after control subtraction and/or normalization if these are also specified.
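A minimal Python sketch of the log-transformation step follows. The function name and base labels are hypothetical, and returning `None` for non-positive values is an assumption, since the handling of zeros is not described above.

```python
import math

# Supported bases mapped to the corresponding standard-library functions.
LOG_FUNCTIONS = {"2": math.log2, "10": math.log10, "e": math.log}


def log_transform(value, base="2"):
    """Log-transform an abundance; non-positive values yield None (assumption)."""
    if value <= 0:
        return None
    return LOG_FUNCTIONS[base](value)


# Order of operations: subtract the control average first, then take the log.
transformed = log_transform(20 - 4, base="2")
```

Applying the transformation last means that subtraction and normalization operate on the raw abundance scale.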
8. Mark expression level on node (default no):
The RNA expression level of a gene can be drawn on nodes by selecting this option. You must specify a cap
for high expression in transcripts per million (TPM, default 50) and specify the cell line. Expression
information is taken from The Human Protein Atlas. Expression level is indicated on a node by the length
of its edge. Nodes with expression ≥ the specified cap will have an edge length equal to the complete node
circumference, while nodes with lower expression will be shown with an edge length proportional to
their expression divided by the cap, as shown in this example:
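As a numeric illustration of the same rule, the edge length can be computed as in the Python sketch below. The function name and the use of a node radius are assumptions made for illustration.

```python
import math


def expression_edge_length(tpm, radius, cap=50.0):
    """Edge (arc) length drawn on a circular node for a given expression level."""
    # Expression at or above the cap fills the whole circumference.
    fraction = min(max(tpm, 0.0) / cap, 1.0)
    return fraction * 2 * math.pi * radius
```

With the default cap of 50 TPM, a gene expressed at 25 TPM gets an edge covering half of the node's circumference, and a gene at 80 TPM gets the full circumference.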
9. Remove contaminants (default no):
If you wish to omit preys considered to be contaminants from the plot (or to exclude preys for other
reasons), you can select this box and specify a list of them in the text area to the right. Gene names
must be entered one per line and are case sensitive.
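The effect of the one-per-line, case-sensitive list can be sketched in Python as follows. The function names and gene names are illustrative only.

```python
def parse_contaminants(text):
    """Read one gene name per line, ignoring blank lines and outer whitespace."""
    return {line.strip() for line in text.splitlines() if line.strip()}


def remove_contaminants(preys, contaminants):
    """Drop preys whose name exactly (case-sensitively) matches a contaminant."""
    return [prey for prey in preys if prey not in contaminants]


contaminants = parse_contaminants("KRT1\n\nTUBB\n")
kept = remove_contaminants(["KRT1", "krt1", "ACTB"], contaminants)
```

Because matching is case sensitive, "krt1" is kept even though "KRT1" is removed.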